sorrison | Hi lxkong, I'm trying to figure out where to go next with trove. My main priority is making trove be upgradable from ussuri which isn't possible. | 00:06 |
---|---|---|
sorrison | I think the first step is to make the "upgrade instance" method work how it does in ussuri with a nova rebuild, thoughts? | 00:07 |
lxkong | sorrison: "making trove be upgradable from ussuri which isn't possible", you mean trove upgrade itself or the trove instance (guest agent)? | 00:18 |
lxkong | could you please remind again what's the problem? | 00:19 |
lxkong | i have been so busy these days, not sure if we have talked about that before | 00:19 |
lxkong | sorrison: in case i am distracted again, can you create a story to describe your problem? | 00:20 |
lxkong | i am happy to help if that really needs to be fixed | 00:21 |
sorrison | I have https://storyboard.openstack.org/#!/story/2008315 | 00:24 |
sorrison | the issue is I have ussuri guest agents running and I want to upgrade them which is now not possible | 00:24 |
sorrison | I also am wondering about the future of how it works with containers currently. I wonder if there are any thoughts on changing it back to how it used to work so we can get it working in the gate | 00:25 |
sorrison | I was wondering if the whole container thing could be done in a way that doesn't impact how it currently works in ussuri. The changes are so fundamental that it really breaks for all trove users | 00:26 |
sorrison | eg. it could be done as some kind of experimental thing that is enabled with a config flag or something | 00:27 |
lxkong | thanks sorrison , i will take a look and get back to you asap. | 00:46 |
lxkong | as you can see, i've been working on some datastore version improvement, lots of places need to change. | 00:49 |
lxkong | for this issue "Make guestagent api use RPC versions", i think it's already been fixed by https://review.opendev.org/#/c/762098/, right? | 00:49 |
lxkong | For "Make trove upgrade command work from ussuri guest agents to newer" | 00:49 |
lxkong | i think you mean trove-guestagent upgrade (from ussuri to victoria), is it acceptable for your users to back up their instances first, then create new ones (new datastore image) from the backup? | 00:50 |
lxkong | If you are not interested in using container, to make the implementation simple, what i would suggest is: | 00:50 |
lxkong | v | 00:50 |
lxkong | 1. create separate DIB elements (please don't just simply restore things, you can refer to the old ones but only submit necessary changes) | 00:50 |
lxkong | 2. create a separate datastore category, e.g. trove/guestagent/datastore/vm and put all the supported databases there, e.g. trove/guestagent/datastore/vm/mysql. | 00:51 |
openstackgerrit | Nguyen Thanh Cong proposed openstack/trove master: No validate when perform eject replica source https://review.opendev.org/762496 | 01:24 |
*** rcernin has quit IRC | 01:43 | |
congnt95 | sorrison: | 01:54 |
sorrison | No it's not really acceptable to do backup restore etc. the way it used to work was the best so I think trove should move back to that way | 02:20 |
*** rcernin has joined #openstack-trove | 02:20 | |
*** rcernin has quit IRC | 02:24 | |
*** rcernin has joined #openstack-trove | 02:28 | |
*** rcernin has quit IRC | 02:31 | |
*** rcernin has joined #openstack-trove | 02:32 | |
*** spatel has joined #openstack-trove | 06:49 | |
*** icey has quit IRC | 06:52 | |
*** spatel has quit IRC | 06:53 | |
*** icey has joined #openstack-trove | 06:54 | |
*** sapd1 has joined #openstack-trove | 07:26 | |
*** tosky has joined #openstack-trove | 08:45 | |
*** rcernin has quit IRC | 08:52 | |
openstackgerrit | Lingxian Kong proposed openstack/trove master: Support datastore version number for creating instance https://review.opendev.org/763139 | 09:08 |
*** rcernin has joined #openstack-trove | 09:15 | |
openstackgerrit | Lingxian Kong proposed openstack/trove master: Update datastore version name https://review.opendev.org/762948 | 09:33 |
openstackgerrit | Lingxian Kong proposed openstack/trove master: Support datastore version number for creating instance https://review.opendev.org/763139 | 09:33 |
*** e0ne has joined #openstack-trove | 09:38 | |
openstackgerrit | Nguyen Thanh Cong proposed openstack/trove master: convert to type str to compare https://review.opendev.org/762823 | 09:44 |
*** rcernin has quit IRC | 09:45 | |
lxkong | sorrison: may i know why backup/restore doesn't work for you? If rebuild is the thing you are pursuing, there is an admin rebuild API in Victoria. | 09:49 |
*** rcernin has joined #openstack-trove | 09:55 | |
*** sapd1 has quit IRC | 10:18 | |
*** icey has quit IRC | 10:31 | |
*** icey has joined #openstack-trove | 10:32 | |
*** sapd1 has joined #openstack-trove | 10:38 | |
*** sapd1 has quit IRC | 10:47 | |
*** sapd1 has joined #openstack-trove | 10:53 | |
*** sapd1 has quit IRC | 10:57 | |
*** icey has quit IRC | 11:05 | |
*** rcernin has quit IRC | 11:18 | |
*** rcernin has joined #openstack-trove | 11:19 | |
*** icey has joined #openstack-trove | 11:26 | |
*** rcernin has quit IRC | 11:27 | |
*** spatel has joined #openstack-trove | 11:31 | |
*** rcernin has joined #openstack-trove | 11:33 | |
*** spatel has quit IRC | 11:36 | |
*** icey has quit IRC | 11:41 | |
*** icey has joined #openstack-trove | 11:43 | |
*** e0ne has quit IRC | 11:44 | |
*** rcernin has quit IRC | 11:46 | |
*** icey has quit IRC | 11:49 | |
*** rcernin has joined #openstack-trove | 11:52 | |
*** rcernin has quit IRC | 11:55 | |
*** icey has joined #openstack-trove | 11:55 | |
*** e0ne has joined #openstack-trove | 12:18 | |
*** icey has quit IRC | 12:23 | |
*** icey has joined #openstack-trove | 12:24 | |
*** icey has quit IRC | 12:50 | |
*** e0ne has quit IRC | 13:51 | |
*** sapd1 has joined #openstack-trove | 13:51 | |
*** e0ne has joined #openstack-trove | 14:01 | |
*** __ministry1 has joined #openstack-trove | 14:04 | |
*** __ministry1 has quit IRC | 14:05 | |
*** e0ne has quit IRC | 17:21 | |
*** e0ne has joined #openstack-trove | 17:22 | |
*** e0ne has quit IRC | 17:33 | |
*** e0ne has joined #openstack-trove | 17:57 | |
*** e0ne has quit IRC | 18:36 | |
-openstackstatus- NOTICE: The Gerrit service at review.opendev.org is being restarted quickly as a pre-upgrade sanity check, estimated downtime is less than 5 minutes. | 18:37 | |
*** openstackgerrit has quit IRC | 19:02 | |
*** openstackgerrit has joined #openstack-trove | 20:36 | |
openstackgerrit | Lingxian Kong proposed openstack/trove master: Support datastore version number for creating configuration https://review.opendev.org/763259 | 20:36 |
*** dkehn has joined #openstack-trove | 20:40 | |
*** sapd1 has quit IRC | 21:04 | |
*** rcernin has joined #openstack-trove | 21:13 | |
*** sapd1 has joined #openstack-trove | 21:17 | |
sorrison | Hi lxkong: I've described in https://storyboard.openstack.org/#!/story/2008373 about the need to reinstate trove upgrade command like ussuri | 21:58 |
sorrison | Let me know what you think, I think this is now the main blocker for anyone wanting to upgrade | 21:58 |
*** jmlowe has quit IRC | 22:26 | |
lxkong | 22:49 <lxkong> sorrison: may i know why backup/restore doesn't work for you? If rebuild is the thing you are pursuing, there is an admin rebuild API in Victoria. | 23:27 |
lxkong | because from victoria, the guest image doesn't contain the database software, so simply calling rebuild nova instance won't affect the database version | 23:29 |
lxkong | that's why i added a rebuild api for trove admin for upgrading the OS and trove guest agent. | 23:30 |
lxkong | so we need to split the upgrade process into two parts, one is for the end user to upgrade the database version, the other is for the trove admin to upgrade trove-guestagent | 23:31 |
sorrison | Yeah I understand how it's been changed, I just don't think that is a good idea, the previous way it was is better from my point of view | 23:32 |
lxkong | if datastore version upgrade API for the end user is still there | 23:32 |
lxkong | but is not doing the same thing under the hood | 23:32 |
sorrison | we don't want to have to upgrade all the users' DB instances ourselves. We want users to do it | 23:32 |
lxkong | no, the end user shouldn't know anything about trove-guestagent, should they? | 23:33 |
sorrison | correct | 23:33 |
lxkong | from user's perspective, they only need to upgrade datastore version | 23:34 |
sorrison | correct | 23:34 |
lxkong | so update instance API can do that | 23:34 |
sorrison | but with that datastore upgrade we also take the opportunity to upgrade the operating system and trove guest agent | 23:34 |
lxkong | upgrading os and trove-guestagent is the trove admin's task | 23:34 |
sorrison | It's not with Ussuri and isn't a good model. | 23:35 |
sorrison | it means extra work for operators | 23:35 |
lxkong | sorrison: that's not extra work, that's operators' job | 23:35 |
sorrison | why? | 23:35 |
sorrison | with ussuri it worked so well | 23:35 |
lxkong | because the operator should decide when and how to upgrade trove guest agent | 23:36 |
lxkong | again, ussuri works just because the implementation allows it | 23:36 |
lxkong | that's not the intention | 23:36 |
sorrison | yes it was the intention | 23:36 |
sorrison | that is why the guest api versioning existed in the first place | 23:37 |
lxkong | from the point of API design, why should the trove guest agent be upgraded when the end user triggers a datastore version upgrade? | 23:37 |
sorrison | because it allows us to do seamless upgrades | 23:37 |
lxkong | The api tells the user you can upgrade mysql 5.7.29 to mysql 5.7.30, why should trove touch os and guest agent? | 23:38 |
sorrison | all user cares about is db version, but operators want to keep the images fresh | 23:38 |
sorrison | users don't need to know or care about operating system | 23:38 |
lxkong | yes, so the api is doing the right thing for the end user | 23:38 |
sorrison | yes but not the right thing by the operator | 23:38 |
sorrison | it now means more work | 23:38 |
lxkong | as operator, you just want to use that chance to do your own work, it used to work because of the original implementation of building guest agent image | 23:39 |
lxkong | now the guest agent image build mechanism is changing | 23:39 |
sorrison | yes exactly | 23:39 |
lxkong | if we really want the old way to upgrade, we need to support the old style guest image first | 23:40 |
lxkong | like we discussed yesterday | 23:40 |
sorrison | I don't think so | 23:40 |
sorrison | the existing images could be made to work in the same way | 23:40 |
lxkong | as i said, the guest image doesn't contain database any more | 23:41 |
sorrison | yeah that's fine | 23:41 |
lxkong | doing rebuild is actually doing nothing | 23:41 |
sorrison | just need to do a nova rebuild and also do the switch in containers | 23:41 |
sorrison | the rebuild would then use the new image | 23:41 |
lxkong | what you are saying is actually the trove admin rebuild api | 23:42 |
sorrison | yeah and that functionality needs to be put back in the upgrade step | 23:42 |
sorrison | as an example in our install we've never had to take down a user's trove DB for an outage and we've been running trove for a long time | 23:44 |
sorrison | this is awesome | 23:44 |
sorrison | the only time a user's db instance goes down is when they initiate the upgrade | 23:44 |
lxkong | no, trove already provides the capability for what you want, if you insist on doing the same thing as before, you can wrap the apis | 23:44 |
lxkong | rebuild will stop the db | 23:45 |
lxkong | that's outage | 23:45 |
sorrison | yeah but it's a user initiated outage | 23:45 |
sorrison | they get to choose when to do it | 23:45 |
sorrison | that is the awesome feature of trove | 23:45 |
sorrison | Is that making sense now? | 23:46 |
lxkong | no, i don't think so, allowing the user to choose when to upgrade os or trove guest agent is not reasonable from the cloud provider's perspective. | 23:49 |
lxkong | like us, we said clearly to customers it's our responsibility to upgrade the OS and software inside when there is a security patch available | 23:50 |
sorrison | how come? I'm a cloud provider :-) | 23:50 |
lxkong | leaving it to customers will affect our own SLA, and can lead to security issues for us | 23:51 |
sorrison | That is still possible for the old way | 23:51 |
sorrison | we have done this in the past where it is a security release so we give them say 24 hours to do the trove upgrade and if they haven't done it we do it for them | 23:52 |
lxkong | i have an idea | 23:53 |
lxkong | when the user is upgrading datastore version, trove does rebuild if the image changes. Otherwise, only upgrade the docker image tag. | 23:53 |
sorrison | yeah that would be great | 23:54 |
sorrison | nice idea | 23:54 |
lxkong | because there was an assumption that we only support patch version upgrade (5.7.29 to 5.7.30), so the image won't change | 23:55 |
lxkong | but if that's something you need, we can add the ability to rebuild when the image is intentionally changed | 23:55 |
sorrison | sounds good | 23:55 |
lxkong | that won't affect us | 23:55 |
sorrison | eg. we also support upgrading from a 5.7.x release to a 8.0.x release which has been handy too | 23:56 |
lxkong | sorrison: i haven't tested that yet, not sure if we should change some code | 23:56 |
lxkong | i can start implementing that once i finish my current tasks | 23:56 |
lxkong | supporting datastore version number | 23:57 |
sorrison | awesome, I can help of course | 23:57 |
lxkong | sorrison: yep, feel free to submit patch | 23:57 |
sorrison | I've been trying to support upgrading between major pgsql versions but it's a bit more involved | 23:57 |
lxkong | i will get back to you once finish the version number stuff | 23:57 |
sorrison | ok thanks | 23:58 |
lxkong | if you haven't started, i can pick up | 23:58 |
lxkong | sorrison: great discussion with you, lunch time now | 23:58 |
sorrison | haven't started yet, but might have some time tomorrow to start | 23:58 |
sorrison | enjoy! | 23:58 |
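
The idea lxkong proposes at 23:53 (rebuild only when the guest image changes, otherwise just move the docker image tag) can be sketched roughly as below. This is a minimal illustration, not Trove code: `DatastoreVersion`, `plan_upgrade`, and the step names are hypothetical, and the assumption that the container tag is also switched after a rebuild comes from sorrison's 23:41 remark rather than from any implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatastoreVersion:
    """Hypothetical stand-in for a datastore version record (not a Trove class)."""
    name: str       # database version shown to the end user, e.g. "5.7.30"
    image_id: str   # guest VM image associated with this datastore version


def plan_upgrade(current: DatastoreVersion, target: DatastoreVersion) -> list[str]:
    """Return the ordered steps for a user-initiated datastore version upgrade.

    Assumed rule, following the discussion:
      - same version record: nothing to do
      - guest image differs: rebuild the guest VM, then switch the container tag
      - guest image unchanged: only switch the database container tag
    """
    if current == target:
        return []
    steps = []
    if current.image_id != target.image_id:
        # The operator published a fresher guest image (OS plus guest agent),
        # so the user-initiated upgrade also rebuilds the VM.
        steps.append("rebuild_guest")
    # The database itself runs in a container, so the version change is a tag switch.
    steps.append("switch_container_tag")
    return steps


if __name__ == "__main__":
    old = DatastoreVersion("5.7.29", "image-a")
    print(plan_upgrade(old, DatastoreVersion("5.7.30", "image-a")))
    print(plan_upgrade(old, DatastoreVersion("5.7.30", "image-b")))
```

Under this rule the patch-only case lxkong describes (5.7.29 to 5.7.30 on the same image) stays a cheap tag switch, while an operator who intentionally changes the image gets the Ussuri-style rebuild as part of the user's own upgrade.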