09:03:03 <jakeyip> #startmeeting magnum
09:03:03 <opendevmeet> Meeting started Wed Aug 28 09:03:03 2024 UTC and is due to finish in 60 minutes. The chair is jakeyip. Information about MeetBot at http://wiki.debian.org/MeetBot.
09:03:03 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
09:03:03 <opendevmeet> The meeting name has been set to 'magnum'
09:03:45 <jakeyip> #link https://etherpad.opendev.org/p/magnum-weekly-meeting
09:03:50 <jakeyip> Please put your topics into to Agenda
09:03:54 <jakeyip> #topic Roll Call
09:03:58 <jakeyip> o/
09:04:04 <jakeyip> mnasiadka / dalees if you are around
09:04:16 <mnasiadka> o/
09:04:17 <mnasiadka> I'm here
09:04:23 <dalees> o/
09:04:31 <dalees> I'm around, sort of.
09:06:28 <jakeyip> cool let's get on with it :)
09:07:01 <jakeyip> #topic QueuePool limit bug
09:07:06 <jakeyip> #link https://bugs.launchpad.net/magnum/+bug/2067345
09:07:20 <jakeyip> andrewbonney: I believe this is from you
09:07:36 <andrewbonney> Yeah, I just wanted to raise it again as we've seen the same as others since upgrading to C
09:07:54 <andrewbonney> It's pretty major as we have to restart Magnum services frequently to keep things working
09:08:01 <jakeyip> did the patches fix things?
09:08:59 <andrewbonney> I haven't applied them personally yet as patching oslo.db is a little involved, but given other services are using oslo.db without issue I was a little surprised that might be required
09:11:28 <jakeyip> yeah I am not sure where the bug is, as I haven't encountered it in prod (we are still at B).
09:12:25 <jakeyip> I was planning to get to C then I can debug, but unfortunately I had to chase down a few bugs in other places affecting our deployment of Magnum, so C upgrade got delayed
09:12:50 <jakeyip> how about mnasiadka or dalees ?
09:14:06 <dalees> Likewise, not running C yet; CAPI driver has got most of my attention for now and Magnum version isn't the limitation anymore.
09:16:36 <jakeyip> if I was to guess, it may have been something introduced by us trying to bring sqlalchemy up to date
09:17:18 <andrewbonney> I did have a look at the code around those changes but nothing jumped out unfortunately
09:17:26 <jakeyip> andrewbonney: are you able to help us test by rolling back those commits?
09:17:35 <jakeyip> #link https://review.opendev.org/c/openstack/magnum/+/910722
09:17:46 <jakeyip> #link https://review.opendev.org/c/openstack/magnum/+/910512
09:18:08 <mnasiadka> we are going to work on upgrades to C - so sooner or later this year we'll probably stumble on the same issue
09:18:42 <jakeyip> andrewbonney: which driver are you using?
09:19:52 <andrewbonney> We're running the vexxhost CAPI integration
09:21:25 <andrewbonney> I'm happy to try rolling stuff back, but that will also involve pinning oslo.db back to ensure compatibility with the autocommit changes
09:25:03 <jakeyip> will reverting just the autocommit change https://review.opendev.org/c/openstack/magnum/+/910722 fail ?
09:25:39 <andrewbonney> If we stick with oslo.db 15 from upper-constraints I believe so yes
09:26:44 <jakeyip> what are the magnum / oslo.db versions you are running now?
09:27:14 <andrewbonney> Magnum 18.0.1, oslo.db 15.0.0
09:28:03 <jakeyip> sqlalchemy?
09:28:30 <andrewbonney> 1.4.51
09:29:58 <dalees> andrewbonney: what is the pattern you see with db connections, how quickly do they rise with approx how many clusters? similar to https://bugs.launchpad.net/magnum/+bug/2067345/comments/12 ?
09:31:29 <andrewbonney> I can go away and collect some data. Looking at our logs it takes maybe 3 days from service restart to start seeing errors, but this is with 1-3 clusters present at any one time
09:31:38 <andrewbonney> We're running all this in a staging environment at present
09:34:36 <dalees> I'll try an upgrade in development env soon, and see if I can reproduce the issues.
09:35:01 <jrosser> what andrewbonney is describing is an environment where we do man create/delete of a small number of clusters
09:35:13 <jrosser> rather than having a large number of clusters that is long lived
09:35:22 <jrosser> *many create/delete
09:35:49 <jakeyip> andrewbonney: another thing you can try is try this patch https://review.opendev.org/c/openstack/magnum/+/926626
09:36:09 <andrewbonney> Will do, ta
09:36:50 <jakeyip> this switches over the code from the legacy facade to the new one introduced in 2024.1, possibly fixing the issue
09:37:42 <jakeyip> no sorry, not introduced in 2024.1, introduced many years ago
09:40:38 <jakeyip> I think rolling forward to https://review.opendev.org/c/openstack/magnum/+/926626 is prob the best choice
09:44:46 <andrewbonney> I'll give that a go and feed back in the issue after I've got some data on connections
09:45:24 <jakeyip> thanks!
09:49:08 <jakeyip> anything else?
09:54:05 <jakeyip> ok thanks everyone for coming
09:54:07 <jakeyip> #endmeeting