14:00:07 #startmeeting neutron_drivers
14:00:08 Meeting started Fri Nov 13 14:00:07 2020 UTC and is due to finish in 60 minutes. The chair is slaweq. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:09 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:12 The meeting name has been set to 'neutron_drivers'
14:00:15 o/
14:00:20 Hi
14:00:20 welcome everyone!
14:00:26 hi
14:01:15 let's wait a few more minutes for other people to join
14:01:32 hi
14:02:44 ping haleyb: njohnston amotoki yamamoto
14:02:54 maybe they will join us soon
14:04:05 don't pay them today
14:04:17 :D
14:05:13 LOL
14:05:35 hi, sorry i'm late
14:07:35 ok, let's start
14:07:42 we have quorum already
14:07:57 and we have only one topic in the on demand agenda for today
14:08:01 no new RFEs to discuss
14:08:05 #topi On Demand
14:08:08 #topic On Demand
14:08:23 obondarev: topic was added by You so please go on :)
14:09:07 yep, it's about https://bugs.launchpad.net/neutron/+bug/1887523
14:09:08 Launchpad bug 1887523 in neutron "Deadlock detection code can be stale" [High,In progress]
14:09:10 so let me briefly describe my point
14:09:42 I think the bug with its current title, description and suggested approach is not quite correct
14:10:06 "Deadlock detection code can be stale" - don't think so
14:10:24 "neutron has it's own implementation of it which is missing a bunch of deadlocks" - not true
14:11:11 so indeed neutron has a few decorators on top of the oslo.db one
14:11:13 yeah perhaps "neutron has its own extra decorators around oslo decorator"
14:11:30 but they were added for a reason
14:11:36 for many reasons in fact
14:12:05 neutron had a long history of fighting against DB errors in concurrent scenarios
14:12:29 git blame may shed some light on it: https://github.com/openstack/neutron/blame/stable/rocky/neutron/db/api.py
14:12:36 just a few examples
14:12:45 https://bugs.launchpad.net/neutron/+bug/1596075
14:12:47 Launchpad bug 1596075 in neutron "Neutron confused about overlapping subnet creation" [High,Fix released] - Assigned to Kevin Benton (kevinbenton)
14:12:52 https://bugs.launchpad.net/neutron/+bug/1612798
14:12:53 Launchpad bug 1612798 in neutron "Move db retry logic closer to where DB error occur" [Critical,Fix released] - Assigned to Kevin Benton (kevinbenton)
14:13:13 https://github.com/openstack/neutron/commit/948461c8b2fbeb30e4fa3a43cc523cff76327d4e
14:13:31 most were quite tricky ones
14:14:00 so I don't think moving back to oslo.db decorator is the right thing to do
14:14:24 there was not much work done in oslo.db: https://github.com/openstack/oslo.db/commits/master/oslo_db/api.py
14:14:36 you mean lately?
14:14:43 right
14:14:45 thanks
14:15:05 in fact oslo.db retries were started by neutron folks)
14:15:11 LOL
14:15:11 IIRC
14:15:29 and in fact neutron retry logic now handles more cases
14:15:48 but from what is written in the bug description we are missing some deadlocks and not handling them properly
14:15:50 some more info could be found here: https://github.com/openstack/neutron/blob/master/doc/source/contributor/internals/retries.rst
14:16:20 slaweq, correct, so I think the bug should be about quota deadlocks
14:16:29 maybe we should change this LP to something like "compare our implementation with oslo db and update ours where it's needed"
14:16:29 not about bad neutron retry logic
14:17:05 mhhhhh....
14:17:06 at least the bug should clearly show where neutron retries are bad
14:17:33 yeah that's perhaps missing
14:17:43 as if we just start replacing retry_if_session_inactive - we may get regressions
14:17:53 in principle I might agree with you obondarev. However, I want to point out that Mohammed is the CEO of a big operator
14:17:54 which might be hard to spot
14:18:00 if I understand well from summit generally there are scaling issues with neutron and one thing was that db issues can be behind that
14:18:09 even with Loki service plugin
14:18:48 so at the very least I would like to get more input from mnaser and hear more about his point of view
14:18:51 so in the bug comment and proposed patch I agree that quota retries have issues
14:19:25 let me remind you that we are still migrating to the new DB engine facade
14:19:35 that will remove subtransactions
14:19:37 that is also true
14:19:42 but please let's not just replace retry_if_session_inactive with oslo-db.wrap_db_retries all over neutron
14:19:47 and other issues related to mixing both facades
14:20:18 so I recommend we start investigating this after completing the migration
14:20:37 ralonsoh, so there is no evidence of what exactly is wrong with the quota deadlock issue, right?
14:20:53 I can't tell, sorry
14:20:59 but
14:21:26 with the new facade at least we know each thread has one single transaction without subtransactions
14:21:33 and the context is unique
14:21:36 (per thread)
14:21:45 e.g.: https://review.opendev.org/#/c/715315/20/neutron/tests/functional/services/portforwarding/test_port_forwarding.py
14:21:45 patch 715315 - neutron - Finish the new DB engine facade migration - 20 patch sets
14:22:24 great, so I'd suggest we put this bug on hold for now, and after new facade is there - try to reproduce and investigate
14:22:36 +1 to this
14:22:41 I propose three steps:
14:23:09 1) finish the migration to the new db engine facade, as indicated by ralonsoh and obondarev
14:23:27 2) Limit this bug, for the time being, to the quota issue
14:23:58 3) Seek more input from mnaser. He may give us some good insights, given his operational experience
14:24:06 +1
14:24:16 mlavalle++
14:24:22 sounds good
14:24:24 +1 to this proposal
14:24:47 and also if there are other similar issues to what we have now with quota, let's treat them separately as regular bugs
14:24:50 * mnaser is happy to chime in -- just ping me via email/ml (cc'd directly) or any other way :)
14:25:05 we can always increase the scope of 2) to other code places
14:25:14 thanks mnaser !
14:25:18 makes sense to me
14:26:04 ok, so I think we have agreement on that for now
14:26:15 I will sum it up in the LP comment after the meeting
14:26:29 thanks slaweq
14:26:30 thanks slaweq
14:27:50 so I think this topic is done for today, right?
14:27:57 +
14:27:57 I think so
14:28:14 do You have anything else You want to discuss today? if not, I will give You 30 minutes back :)
14:28:34 yaaay, weekend is now a bit closer!
14:28:41 LOL
14:28:50 for me it's almost there :)
14:28:53 it's time :-)
14:28:56 same here)
14:28:58 ok, thx for attending the meeting
14:29:01 bye! enjoy the weekend
14:29:06 and have a great weekend!
14:29:06 o/
14:29:07 bye!
14:29:08 Bye
14:29:09 o/
14:29:09 #endmeeting
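
Note on the retry decorators discussed above: the general pattern under debate (wrap a DB operation in a decorator that retries it when a deadlock is reported) can be sketched roughly as below. This is only an illustration under stated assumptions, not neutron's or oslo.db's actual code: DeadlockError, retry_on_deadlock and update_quota are hypothetical names invented for this sketch, and the real decorators handle more cases (for example, whether a session is already active).

```python
import functools
import time


class DeadlockError(Exception):
    """Hypothetical stand-in for a database deadlock exception."""


def retry_on_deadlock(max_retries=3, retry_interval=0.0):
    """Retry the wrapped callable when it raises DeadlockError.

    Illustrative only: gives up and re-raises after max_retries retries.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            attempt = 0
            while True:
                try:
                    return func(*args, **kwargs)
                except DeadlockError:
                    attempt += 1
                    if attempt > max_retries:
                        raise
                    # Linearly growing pause between attempts.
                    time.sleep(retry_interval * attempt)
        return wrapper
    return decorator


# Hypothetical operation that deadlocks twice before succeeding.
calls = {"count": 0}


@retry_on_deadlock(max_retries=3)
def update_quota():
    calls["count"] += 1
    if calls["count"] < 3:
        raise DeadlockError("simulated deadlock")
    return "committed"


result = update_quota()
```

In this sketch the third attempt succeeds, so the caller never sees the two simulated deadlocks; a deadlock that persists beyond max_retries is re-raised to the caller.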