14:00:04 <slaweq> #startmeeting networking
14:00:04 <opendevmeet> Meeting started Tue Jun 29 14:00:04 2021 UTC and is due to finish in 60 minutes.  The chair is slaweq. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:04 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:04 <opendevmeet> The meeting name has been set to 'networking'
14:00:09 <mlavalle> o/
14:00:18 <gibi> \o
14:00:20 <amotoki> hi
14:00:22 <obondarev> hi
14:00:25 <slaweq> hi
14:00:55 <lajoskatona> o/
14:02:17 <slaweq> let's start
14:02:17 <slaweq> #topic Announcements
14:02:21 <slaweq> Xena cycle calendar https://releases.openstack.org/xena/schedule.html
14:02:32 <slaweq> Xena-2 milestone is in a few weeks - Jul 12th
14:02:33 <ralonsoh> hi
14:02:41 <manub> hi
14:03:00 <slaweq> we are going well with specs - thanks for all reviews You did
14:03:25 <slaweq> and we still have few of them opened so there is more work to do there :)
14:03:43 <rubasov> hi
14:03:43 <mlavalle> Today I'll review https://review.opendev.org/c/openstack/neutron-specs/+/783791
14:03:48 <ivc_> o/
14:04:01 <slaweq> thx mlavalle
14:04:13 <slaweq> except that one, there is also https://review.opendev.org/c/openstack/neutron-specs/+/770540 opened
14:04:27 <mlavalle> yeah, that would be the next one for me
14:04:32 <slaweq> thx
14:05:00 <mlavalle> most likely tomorrow
14:05:07 <slaweq> ok, next one
14:05:10 <slaweq> Drivers team members are now ops in the neutron channel: https://review.opendev.org/c/openstack/project-config/+/796521
14:05:30 <slaweq> just a heads-up that some of You have now new "super powers" :)
14:05:33 <mlavalle> so we can wreck it?
14:05:42 <ralonsoh> I was going to ask this
14:05:56 <slaweq> mlavalle You can try :P
14:06:05 <mlavalle> LOL
14:06:38 <slaweq> but remember - https://www.youtube.com/watch?v=kb4jEHmH_kU :D
14:07:29 <manub> :)
14:07:40 <slaweq> and that's all announcements from me for today
14:07:49 <slaweq> do You have anything else You want to share with the team now?
14:09:34 <slaweq> if not I think we can move on
14:09:38 <slaweq> #topic Blueprints
14:09:50 <slaweq> Neutron Xena-2 https://bugs.launchpad.net/neutron/+milestone/xena-2
14:10:26 <slaweq> I marked as implemented https://blueprints.launchpad.net/neutron/+spec/distributed-dhcp-for-ml2-ovs and https://blueprints.launchpad.net/neutron/+spec/default-dns-zone-per-tenant
14:10:34 <slaweq> as all related patches are now merged
14:11:20 <slaweq> for https://blueprints.launchpad.net/neutron/+spec/bfd-support-for-neutron and https://blueprints.launchpad.net/neutron/+spec/explicit-management-of-default-routes we have already merged specs
14:11:25 <mlavalle> Nice!
14:12:03 <slaweq> ohh, so there is one more "prio" spec to review: https://review.opendev.org/c/openstack/neutron-specs/+/779511
14:12:11 <slaweq> it's related to https://blueprints.launchpad.net/neutron/+spec/multiple-external-gateways
14:13:05 <mlavalle> added to my pile of this week. Could you please add it to the high priority reviews dashboard?
14:13:17 <mlavalle> that way I don't forget
14:13:18 <slaweq> mlavalle I just did
14:13:22 <slaweq> thank You
14:13:23 <mlavalle> Thanks!
14:13:28 <rubasov> mlavalle, slaweq: thanks
14:13:36 <slaweq> and those are all the updates from me regarding Blueprints
14:13:46 <slaweq> any other updates regarding Blueprints?
14:13:51 <obondarev> Node Local IP spec is published: https://review.opendev.org/c/openstack/neutron-specs/+/797798
14:13:57 <obondarev> early comments are welcome :)
14:14:15 <mlavalle> do you want to implement it this cycle?
14:14:33 <obondarev> I guess start this cycle
14:15:00 <obondarev> Y is more realistic
14:15:03 <mlavalle> ahh, ok. Just to have an idea of priorities
14:15:27 <mlavalle> I'll look at it soon, anyways
14:15:41 <slaweq> obondarev ok, thx
14:15:49 <obondarev> thanks
14:15:50 <slaweq> so I set it to "neutron next" to not forget :)
14:17:59 <slaweq> if there are no other updates, I think we can move on to the next topic
14:18:04 <slaweq> #topic Bugs
14:18:23 <slaweq> amotoki was bug deputy last week
14:18:41 * mlavalle likes amotoki's bug report format
14:18:41 <amotoki> my report is found at http://lists.openstack.org/pipermail/openstack-discuss/2021-June/023362.html  sorry for being late
14:19:46 <amotoki> there are three bugs which need attention: one OVN related and two DVR related.
14:20:17 <amotoki> I haven't managed to repro the OVN one related to security group rule operation
14:20:29 <amotoki> mlavalle: thanks
14:20:35 <slaweq> I will take a look at the OVN one
14:20:46 <ralonsoh> me neither, I don't know what is failing there
14:21:55 <haleyb> slaweq: i was going to add a comment to the OVN one as it affects octavia, not sure how to address it
14:22:05 <slaweq> Regarding https://bugs.launchpad.net/neutron/+bug/1933234 I took a look today. I have some initial thoughts on what is wrong there but I will need more logs in the L3 agent to confirm that
14:22:50 <slaweq> haleyb sure, feel free to take it if You want or update with Your ideas. I will take a look at it too, maybe I will find something there :)
14:23:30 <haleyb> slaweq: ack, i'll ping you after meeting since we need to work around it in Octavia
14:24:14 <amotoki> regarding the OVN one, 409 conflict will be returned when an exception happens during the BEFORE_DELETE hook.
14:24:57 <amotoki> I am not sure whether we want to control the exception more granularly or whether it is just a bug in the OVN driver.
14:26:03 <haleyb> amotoki: the rule might not exist, so returning a 409 seems wrong.  But i'm not sure we can change to SecurityGroupRuleNotFound without breaking something ?
14:26:27 <haleyb> without any callback registered it would return SecurityGroupRuleNotFound
14:26:48 <haleyb> i.e. without OVN
14:27:23 <amotoki> haleyb: yeah, I am not sure what is wrong. it happens in BEFORE_DELETE so I wonder why not found is raised....
14:27:31 <opendevreview> Ilya Chukhnakov proposed openstack/neutron-specs master: [WIP] Add Node-Local Virtual IP Spec  https://review.opendev.org/c/openstack/neutron-specs/+/797798
14:28:21 <haleyb> amotoki: the _registry_notify() call specifies SecurityGroupRuleInUse, which i'm assuming is the default when it fails?
14:28:37 <haleyb> the callee can't say why it failed
14:29:05 <amotoki> haleyb: yes, my understanding is same, but the error message from OVN driver says the rule is not found. that confuses me.
14:29:22 <haleyb> a few lines down it calls "sgr = self._get_security_group_rule(context, id)" which would also have returned Not Found, right?
14:29:31 <slaweq> amotoki maybe it wasn't found in ovn db?
14:29:53 <amotoki> slaweq: it might be.
14:30:04 <amotoki> it may happen
14:30:17 <haleyb> slaweq: i wonder if it's in the neutron DB?  would be easy enough to verify with two calls
14:30:27 <haleyb> the bug report says there were multiple calls to remove the same rule
14:30:35 <slaweq> haleyb is it failing everytime in Octavia CI?
14:30:39 <slaweq> or sometimes only?
14:30:52 <haleyb> gthiemonge would know
14:31:06 <amotoki> I wonder if it happens in a race condition.
14:31:13 <gthiemonge> hi
14:31:41 <haleyb> gthiemonge: talking about the SG issue with OVN
14:31:47 <amotoki> gthiemonge: we are talking about https://bugs.launchpad.net/bugs/1933638 you filed.
14:32:02 <haleyb> and i saw https://review.opendev.org/c/openstack/octavia/+/798676 to try and work around it
14:32:32 <gthiemonge> so, it happened many times in CI, I didn't reproduce it in my env
14:33:43 <gthiemonge> In CI, we are deleting 2 or 3 times the same SG rule, the first one is ok, the 2nd one gets a 409 and sometimes there's 3rd one that gets a 404
14:34:19 <amotoki> thanks, so a race condition seems to happen.
14:34:24 <slaweq> yes
14:34:31 <haleyb> oh, so it's not always a 409 on second and greater
14:34:43 <slaweq> and 409 is definitely wrong as response to DELETE request
14:34:54 <amotoki> slaweq: totally agree
14:35:19 <ralonsoh> folks, if the SG rule does not exist, the BEFORE_DELETE will raise always this exception in OVN
14:35:41 <ralonsoh> just reading the code
14:35:49 <haleyb> ralonsoh: that's what it looks like to me too
14:36:03 <ralonsoh> so we need to "hide" this exception in the event
14:36:04 <amotoki> perhaps we can improve the notification logic so that we can control an exception raised.
14:36:17 <ralonsoh> and let the DB transaction fail correctly
14:36:35 <amotoki> +1
14:36:53 <ralonsoh> (1 bug less)
14:37:12 <gthiemonge> gthiemonge: I have this log file: https://9d5339e09bfaa3ab0b67-c3fbbb652718002c010964532c238f5b.ssl.cf5.rackcdn.com/798676/1/check/octavia-v2-dsvm-scenario/de9eccb/controller/logs/screen-q-svc.txt
14:37:26 <gthiemonge> starting at 12:36:00.226969
14:37:38 <slaweq> ralonsoh do You want to send patch for that?
14:37:40 <gthiemonge> 3 DELETEs, response codes: 204, 404, 409
14:37:44 <ralonsoh> sure, right now
14:37:49 <slaweq> ++
14:37:51 <slaweq> thank You
14:38:11 <gthiemonge> ralonsoh: thanks ;-)
14:38:22 <haleyb> ralonsoh: yes, catching that and just returning might help, but it's odd we get multiple responses, anyways probably time to move on
14:38:25 <slaweq> ralonsoh assigned that bug to You
14:40:15 <slaweq> ok, thx amotoki for great summary of the bug deputy week
14:40:22 <amotoki> one last thing I would like to raise is https://bugs.launchpad.net/bugs/1930866  It was filed as a normal bug but it leads to an API change to handle this, so I marked it as RFE. feel free to share your thoughts.
14:40:24 <slaweq> any other bugs to discuss today?
14:40:33 <amotoki> we don't need to discuss it here though.
14:41:32 <amotoki> that's all from me.
14:41:37 <slaweq> thx amotoki
14:41:48 <slaweq> regarding that RFE I will check it later this week
14:42:12 <slaweq> but from quick look I think we discussed something like that with Nova team on one of the PTGs already
14:43:40 <slaweq> ok, so we should have something like "locked" port and then forbid to delete it probably
14:43:52 <slaweq> I will check it and we will discuss it in the drivers meeting later
14:44:30 <slaweq> so I think we can move on if there are no other bugs to discuss
14:44:38 <slaweq> mlavalle is our bug deputy this week
14:44:42 <mlavalle> o/
14:44:52 <slaweq> and next week will be rubasov's turn
14:44:57 <rubasov> ack
14:45:04 <slaweq> thx
14:45:07 * mlavalle would appreciate if ralonsoh takes another look at https://bugs.launchpad.net/neutron/+bug/1933813. Submitter added some more info
14:45:13 <ralonsoh> mlavalle, sure
14:45:35 <slaweq> ok, so last topic for today
14:45:41 <slaweq> #topic On Demand Agenda
14:45:48 <slaweq> ralonsoh You added topic there
14:45:56 <slaweq> so all channel is Yours :)
14:46:04 <ralonsoh> sorry, link?
14:46:10 <ralonsoh> to the agenda
14:46:12 <slaweq> (ralonsoh): https://bugs.launchpad.net/neutron/+bug/1933517. Proposed solution: https://bugs.launchpad.net/neutron/+bug/1933517/comments/1
14:46:16 <slaweq> https://wiki.openstack.org/wiki/Network/Meetings
14:46:23 <ralonsoh> ahhh yes, sorry
14:46:36 <ralonsoh> we have a serious problem with OVN live migrations
14:46:45 <ralonsoh> in OVS with hybrid plug we are ok
14:46:59 <ralonsoh> because os-vif creates a bridge between the VM and OVS
14:47:10 <ralonsoh> so Neutron is aware of this new port and creates the needed rules
14:47:20 <ralonsoh> when the VM is unpaused, the backend (OVS) is ready
14:47:34 <ralonsoh> in OVN and OVS native, this is not happening
14:47:46 <ralonsoh> because libvirt creates the port when the VM is unpaused
14:48:09 <ralonsoh> what Sean Mooney is proposing is something similar to OVS hybrid
14:48:25 <ralonsoh> but with OVS bridges, created and deleted by os-vif
14:48:39 <ralonsoh> 1) this is 100% compatible with DPDK
14:48:54 <ralonsoh> 2) that won't affect performance: the bridge will collapse in the dataplane
14:49:08 <ralonsoh> of course, we'll have one extra bridge per VM port
14:49:31 <ralonsoh> so, this is, in a nutshell, the problem and the proposed solution
14:49:39 <ralonsoh> do you need a spec? a BP?
14:49:53 <ralonsoh> or this is a no-go feature
14:50:32 <ralonsoh> (btw, in os-vif we don't differentiate OVS native or OVN ports, that will be applicable to OVS too)
14:50:40 <ralonsoh> and will be configurable
14:50:48 <ralonsoh> that's all, feedback welcome
14:50:54 <slaweq> IMO it's "go" feature for sure as it will solve valid issue
14:51:05 <slaweq> what are the cons of it?
14:51:07 <amotoki> a spec sounds reasonable to me as we need to capture the issue and understand the solution correctly. it also helps users understand the change.
14:51:13 <ralonsoh> amotoki, perfect
14:51:36 <mlavalle> yeah, let's follow the RFE/spec process
14:51:37 <ralonsoh> cons: more bridges in OVS
14:51:44 <ralonsoh> mlavalle, I'll propose it
14:51:59 <mlavalle> that way we can thoroughly discuss it
14:52:04 <amotoki> I am not sure how the number of OVS bridges affects the performance. does it work with too many bridges?
14:52:19 <rubasov> maybe it could be converged to how trunk ports have an extra bridge
14:52:37 <rubasov> and then we would kill a few limitations of converting between normal and trunk ports
14:52:51 <ralonsoh> this is not planning to use linux bridges, but OVS ones
14:53:02 <ralonsoh> I wouldn't mix features
14:53:34 <ralonsoh> and the goal is to make neutron agnostic to this (with some exceptions related to QoS)
14:53:47 <ralonsoh> do everything in os-vif
14:53:54 <ralonsoh> anyway, I'll present a spec
14:54:06 <slaweq> ++ for spec
14:54:31 <amotoki> so do we convert https://bugs.launchpad.net/neutron/+bug/1933517 into RFE?
14:55:31 <ralonsoh> yes, should be a RFE
14:55:33 <slaweq> strictly speaking it should be like that IMO
14:55:51 <amotoki> :)
14:56:35 <amotoki> I added rfe-triaged tag to it
14:56:58 <slaweq> thx amotoki
14:57:13 <slaweq> we are almost on top of the hour now
14:57:31 <slaweq> and I think we can finish it now
14:57:37 <ralonsoh> bye
14:57:39 <slaweq> thx for attending the meeting today
14:57:43 <slaweq> o/
14:57:45 <rubasov> o/
14:57:46 <slaweq> #endmeeting