Tuesday, 2021-04-06

slaweq#startmeeting networking14:01
*** lajoskatona has joined #openstack-meeting-314:02
slaweq#topic announcements14:02
*** openstack changes topic to "announcements (Meeting topic: networking)"14:02
slaweqRelease calendar https://releases.openstack.org/wallaby/schedule.html14:02
slaweqaccording to it, this week is the final deadline for last Wallaby RC14:03
slaweqRC2 for Neutron is proposed https://review.opendev.org/c/openstack/releases/+/78488214:03
slaweqfor stadium projects we don't need RC214:03
slaweqnext one14:04
slaweqDeprecation of networking-midonet is DONE now14:04
slaweqit was super fast deprecation process14:04
slaweqso we will not have any new networking-midonet releases14:04
slaweqit's out of neutron stadium now14:04
slaweqsorrioson will maybe revive it in the x/ namespace14:05
bcafarelthat was fast indeed14:05
amotokiit is unfortunate, but it is the reality14:05
amotokiI will support the migration to x/ namespace.14:05
slaweqthx amotoki14:06
slaweqok, next one14:06
slaweqvirtual PTG is in less than 2 weeks from now14:06
slaweqetherpad https://etherpad.opendev.org/p/neutron-xena-ptg14:06
slaweqplease add Your topics to it14:06
slaweqalso gibi asked me today if we have any topics to discuss with Nova team14:07
slaweqso far I don't see anything on our etherpad14:07
slaweqbut if You have anything, please let me know asap so I can sync with gibi and schedule some cross project session14:07
bcafarelralonsoh: I vaguely recall something related to nova for the rootwrap removal? ^^14:07
ralonsohbcafarel, not really, just a question about how it was done in Nova14:08
bcafarelack probably not worth a cross-project topic then14:08
slaweqand that14:09
slaweqthat's all announcements from me today14:09
slaweqanything else You want to add to it?14:09
amotokislaweq:  did you have a conversation with gmann on the timeslot on the policy stuff?14:09
slaweqamotoki: nope14:09
amotokiIIRC he coordinates slots on the policy stuff14:10
amotokiokay, let's follow it up this week14:10
slaweqamotoki: you mean new policy rules?14:10
slaweqand personas14:10
amotokislaweq: yes14:10
slaweqI will talk with him, thx for the heads up14:10
bcafarelalso a quick note (stable hat on), a new setuptools release breaks most py2 jobs - don't recheck on stable/train and older backports for now14:11
bcafarelwell even stable/ussuri as grenade will fail :/14:11
slaweqbcafarel: thx14:12
slaweqok, let's move on14:15
slaweq#topic Blueprints14:15
*** openstack changes topic to "Blueprints (Meeting topic: networking)"14:15
slaweqdo You want to talk about any specific BP today?14:15
mlavallenot me14:16
ralonsohno thanks14:16
slaweqI also have nothing for today14:16
slaweqjust one thing14:17
slaweqthere are a lot of BPs on that list14:17
slaweqI will start looking into them in next days to check what we should maybe close or work on14:17
slaweqbut if You would have some time, please take a look at that list too14:17
slaweqmany of them are pretty old things so please also be ready for questions from me regarding them :)14:18
slaweqand that's all from me regarding BPs14:19
slaweqok, let's move on then14:20
mlavalleI'll take a look with an eye to adopting one14:20
slaweqmlavalle: thx a lot14:20
slaweq#topic Community Goals14:20
*** openstack changes topic to "Community Goals (Meeting topic: networking)"14:20
slaweqon that list we have only that rootwrap->privsep migration14:21
slaweqwhich AFAIK is (almost) finished on our side14:21
slaweqexcept the long running processes about which ralonsoh sent email to ML14:21
ralonsoha couple of things:14:21
slaweqralonsoh: should we still keep track of it in that section? or can we remove it already?14:21
ralonsoh1) long running process: ok to continue using rootwrap14:22
ralonsoh2) as Sean commented, we need to segregate the privsep contexts14:22
ralonsohto make them more granular14:22
ralonsohbut this is an optimization14:22
ralonsoh(no rush)14:22
ralonsohthat's all14:22
ralonsohslaweq, yes we can remove this section14:23
slaweqralonsoh: ok, thx14:24
slaweqI will remove it from that section then14:24
amotokiI think that long running processes were not considered when it was proposed as a community goal and we can tackle it separately.14:25
amotokiperhaps along with nova(?)14:25
ralonsohamotoki, sure and this could be discussed in the PTG14:25
slaweqI think we can move on to the next topic then14:27
slaweq#topic Bugs14:27
*** openstack changes topic to "Bugs (Meeting topic: networking)"14:27
slaweqamotoki was our bug deputy last week14:27
slaweqamotoki: any bugs You want to highlight now?14:28
amotokiI was a bug deputy last week and just sent a report before the meeting http://lists.openstack.org/pipermail/openstack-discuss/2021-April/021613.html14:28
amotokiLast week was relatively quiet. Please check my report.14:29
amotokihttps://bugs.launchpad.net/neutron/+bug/1921809 is the only unassigned bug and it is related to OVN doc14:29
openstackLaunchpad bug 1921809 in neutron "OpenStack Metadata API and OVN in Neutron" [Low,Confirmed]14:29
amotokithere was a response from the bug author. it would be nice if ovn folks can check the response.14:30
slaweqotherwiseguy: hi, can You take a look at ^^ when You will have some time?14:31
amotokiin addition, it would be nice if you can look https://bugs.launchpad.net/neutron/+bug/1922222  I requested more info on the background.14:33
openstackLaunchpad bug 1922222 in neutron "allow using tap device on netdev enabled host" [Undecided,Opinion]14:33
amotokithat's all from me14:34
slaweqralonsoh: can You take a look at https://bugs.launchpad.net/neutron/+bug/1922222 ?14:34
slaweqthx a lot14:34
slaweqand thx amotoki for the report14:34
slaweqsounds like easy week indeed :)14:34
slaweqthis week our bug deputy is mlavalle14:35
slaweqand next week it will be rubasov's turn14:35
rubasovthanks for the reminder14:35
slaweqany other bugs You want to discuss today?14:35
slaweqrubasov: yw :)14:35
slaweqif there are no other bugs, then it is all from me for today14:37
slaweqso I can give You few minutes back :)14:37
slaweqthx for attending the meeting and have a great week14:37
slaweq#startmeeting neutron_ci15:00
slaweqok, let's start15:02
slaweqGrafana dashboard: http://grafana.openstack.org/dashboard/db/neutron-failure-rate15:02
slaweqPlease open now :)15:02
slaweq#topic Actions from previous meetings15:04
*** openstack changes topic to "Actions from previous meetings (Meeting topic: neutron_ci)"15:04
slaweqralonsoh to check failed qos scenario test15:04
ralonsohno, sorry, I just started. I was busy with the py38/FTs timeouts15:05
slaweqralonsoh: sure15:05
slaweqcan I assign it to You for next week too?15:05
slaweq#action ralonsoh to check failed qos scenario test15:05
slaweqnext one15:05
slaweq    ralonsoh to check https://bugs.launchpad.net/neutron/+bug/192186615:05
openstackLaunchpad bug 1917793 in neutron "duplicate for #1921866 [HA] keepalived_state_change does not finish "handle_initial_state"execution" [Critical,Confirmed] - Assigned to Rodolfo Alonso (rodolfo-alonso-hernandez)15:05
ralonsohI pushed a patch to mitigate it15:05
ralonsohone sec15:05
*** lpetrut has quit IRC15:06
ralonsoh#link https://review.opendev.org/c/openstack/neutron/+/77902415:06
ralonsoh(already merged)15:06
slaweqthx, so we should be good with that one :)15:06
slaweqnext one then15:07
slaweqslaweq to check failed start metadata proxy issue15:07
slaweq    Bug https://bugs.launchpad.net/neutron/+bug/192268415:07
openstackLaunchpad bug 1922684 in neutron "Functional dhcp agent tests fails to spawn metadata proxy" [High,Confirmed] - Assigned to Slawek Kaplonski (slaweq)15:07
slaweqand proposed fix https://review.opendev.org/c/openstack/neutron/+/78490315:07
slaweqralonsoh: I saw You had some questions about it15:07
ralonsohwe can discuss it in the patch15:08
slaweqlet me try to quickly explain it here15:08
slaweqfirst of all, You can easily reproduce it if You raise exceptions.ProcessExecutionError somewhere in fill_dhcp_udp_checksums() method in https://github.com/openstack/neutron/blob/58c9912be0ce5d9bf9eb9e1c44b87cdf90aab452/neutron/agent/linux/dhcp.py#L176215:09
slaweqthis is what happens really in those failed tests15:09
slaweqso during iptables-restore command there is exception raised15:10
slaweqand this is handled properly by dhcp driver15:10
slaweqbut when it tries call setup() method again https://github.com/openstack/neutron/blob/58c9912be0ce5d9bf9eb9e1c44b87cdf90aab452/neutron/agent/linux/dhcp.py#L1664 it fails on ensure_device_is_ready: https://github.com/openstack/neutron/blob/58c9912be0ce5d9bf9eb9e1c44b87cdf90aab452/neutron/agent/linux/dhcp.py#L169215:11
slaweqit happens like that because in the test we prepare network object with port prepared15:11
slaweqand that "fake" port is used in the first call of setup() method15:12
slaweqexactly here https://github.com/openstack/neutron/blob/58c9912be0ce5d9bf9eb9e1c44b87cdf90aab452/neutron/agent/linux/dhcp.py#L166715:12
slaweqso "port" is exactly what test expects that it will be15:12
slaweqbut we also mock get_dhcp_port() from the plugin rpc api class in that test15:13
slaweqso in first call of setup() method it will:15:13
slaweq1. get correct port in https://github.com/openstack/neutron/blob/58c9912be0ce5d9bf9eb9e1c44b87cdf90aab452/neutron/agent/linux/dhcp.py#L166715:13
slaweq2. update network.ports[0] to be mock instead of port which was returned in 1)15:14
slaweq3. fail on iptables call15:14
slaweqand now second call of setup()15:14
slaweq1. get wrong (mock) port in https://github.com/openstack/neutron/blob/58c9912be0ce5d9bf9eb9e1c44b87cdf90aab452/neutron/agent/linux/dhcp.py#L166715:14
slaweq2. fails at https://github.com/openstack/neutron/blob/58c9912be0ce5d9bf9eb9e1c44b87cdf90aab452/neutron/agent/linux/dhcp.py#L169215:14
slaweqI'm not sure if that's clear for You now15:15
ralonsohok, I'll check it locally, I still don't get it15:15
slaweqwe can continue in the review later15:15
slaweqthat's all regarding actions from last week15:16
slaweqlet's move on15:16
slaweq#topic Stadium projects15:16
*** openstack changes topic to "Stadium projects (Meeting topic: neutron_ci)"15:16
slaweqlajoskatona: any updates?15:16
slaweqexcept midonet as it's not stadium project anymore ;)15:16
lajoskatonanothing to tell the truth15:16
lajoskatonaas I saw this morning things are going in, so no issue at least as far as I checked15:17
slaweqok, thx for taking care of it15:17
slaweq#topic Stable branches15:17
*** openstack changes topic to "Stable branches (Meeting topic: neutron_ci)"15:17
slaweqbcafarel: any updates?15:17
slaweqexcept the issue with py2 (again) in older branches15:17
bcafarelmain issue I spoiled in previous meeting is py2 bug indeed15:18
bcafarelas it breaks up to ussuri included the list of ok branches got short :)15:18
slaweqis there any LP for that bug already?15:18
bcafarelI had opened one for neutron, but closed it as dup (gmann opened one for devstack)15:18
openstackLaunchpad bug 1922736 in devstack "Stable stein|train py2 devstack based jobs are broken on py2 interpreter" [Critical,Confirmed]15:18
bcafarelas it is rather generic issue not just for us15:18
slaweqthx bcafarel15:19
* slaweq wonders when we will need to stop testing all py2 branches in u/s15:20
bcafarelwell, train had still both IIRC15:20
bcafarelso expect a few other "oh yes we should cap this one too"15:20
slaweqsomething else, easier regarding the stable branches15:21
slaweqwe need to update our grafana dashboards to include stable/wallaby15:21
slaweqbcafarel: will You take care of it?15:21
bcafarelsigh sorry I pushed doc update to note this as release step and then forgot about actually doing it15:22
bcafarelslaweq: let's add it as topic for next week so I do not keep forgetting :)15:22
slaweq#action bcafarel to update grafana dashboards with stable/wallaby15:23
slaweqok, next topic15:23
slaweq#topic Grafana15:23
*** openstack changes topic to "Grafana (Meeting topic: neutron_ci)"15:23
slaweqhere things look pretty ok this week IMO15:23
slaweqI don't see any major issues15:24
ralonsohwell, py38 and FTs were a bit unstable, too many timeouts15:24
slaweqralonsoh: true15:24
slaweqbut You proposed some patches to address, at least py38 issues, right?15:25
ralonsohand for FTs15:25
ralonsohsorry: https://review.opendev.org/c/openstack/neutron/+/78488915:25
slaweqok, both are approved already15:26
slaweqlet's see if it will be better with those patches merged15:26
bcafarelseeing the times for the offline_migration tests it should help15:27
ralonsohmysql tests take around 10 mins, all of them15:27
ralonsohI'm trying to merge in one single test, to avoid executing the migration again and again15:27
slaweqok, let's talk about some specific issues15:29
slaweq#topic functional15:29
*** openstack changes topic to "functional (Meeting topic: neutron_ci)"15:29
slaweqI found one new issue for today15:29
slaweqit's failed test_get_egress_min_bw_for_port15:30
ralonsoh fail15:30
ralonsohft1.22: neutron.tests.functional.agent.common.test_ovs_lib.BaseOVSTestCase.test_get_egress_min_bw_for_port testtools.testresult.real._StringException: Traceback (most recent call last):15:30
ralonsoh  File "/home/zuul/src/opendev.org/openstack/neutron/neutron/common/utils.py", line 708, in wait_until_true15:30
ralonsoh    eventlet.sleep(sleep)15:30
ralonsoh  File "/home/zuul/src/opendev.org/openstack/neutron/.tox/dsvm-functional/lib/python3.8/site-packages/eventlet/greenthread.py", line 36, in sleep15:30
ralonsoh    hub.switch()15:30
ralonsoh  File "/home/zuul/src/opendev.org/openstack/neutron/.tox/dsvm-functional/lib/python3.8/site-packages/eventlet/hubs/hub.py", line 313, in switch15:30
ralonsoh    return self.greenlet.switch()15:30
ralonsoheventlet.timeout.Timeout: 5 seconds15:30
ralonsohDuring handling of the above exception, another exception occurred:15:30
ralonsohTraceback (most recent call last):15:30
ralonsoh  File "/home/zuul/src/opendev.org/openstack/neutron/neutron/tests/functional/agent/common/test_ovs_lib.py", line 158, in _check_value15:30
ralonsoh    common_utils.wait_until_true(part_check_value, timeout=5, sleep=1)15:30
ralonsoh  File "/home/zuul/src/opendev.org/openstack/neutron/neutron/common/utils.py", line 713, in wait_until_true15:30
ralonsoh    raise WaitTimeout(_("Timed out after %d seconds") % timeout)15:30
ralonsohneutron.common.utils.WaitTimeout: Timed out after 5 seconds15:30
ralonsohDuring handling of the above exception, another exception occurred:15:31
ralonsohTraceback (most recent call last):15:31
ralonsoh  File "/home/zuul/src/opendev.org/openstack/neutron/neutron/tests/base.py", line 183, in func15:31
ralonsoh    return f(self, *args, **kwargs)15:31
ralonsoh  File "/home/zuul/src/opendev.org/openstack/neutron/neutron/tests/functional/agent/common/test_ovs_lib.py", line 452, in test_get_egress_min_bw_for_port15:31
ralonsoh    self._check_value(2800, self.ovs.get_egress_min_bw_for_port,15:31
ralonsoh  File "/home/zuul/src/opendev.org/openstack/neutron/neutron/tests/functional/agent/common/test_ovs_lib.py", line 160, in _check_value15:31
ralonsoh    self.fail('Expected value: %s, retrieved value: %s' %15:31
ralonsoh  File "/home/zuul/src/opendev.org/openstack/neutron/.tox/dsvm-functional/lib/python3.8/site-packages/unittest2/case.py", line 690, in fail15:31
ralonsoh    raise self.failureException(msg)15:31
ralonsohAssertionError: Expected value: 2800, retrieved value: 170015:31
ralonsohwhat I wanted to point out is the retrieved value, 170015:31
ralonsohthis could be due to an overloaded host15:31
slaweqwrong copy paste ;P15:31
slaweqralonsoh: but how can an overloaded host impact that?15:33
ralonsohbecause it cannot transmit at the requested speed15:33
slaweqbut it's not checking actual bandwidth15:33
slaweqit's just checking what is set in ovs IMO15:33
ralonsohsorry! you are right15:34
ralonsohok, indeed this is an error15:34
slaweqit failed in that line15:34
slaweqso just "update_minimum_bandwidth_queue()"15:35
slaweqand then wait 5 seconds until it will be really set15:35
ralonsohthis is the most trivial check15:35
slaweqbut maybe we should use different ports in each test15:35
slaweqas now it seems that 1700 was set in different test: https://github.com/openstack/neutron/blob/58c9912be0ce5d9bf9eb9e1c44b87cdf90aab452/neutron/tests/functional/agent/common/test_ovs_lib.py#L37415:36
ralonsohwe do, we are generating a new port uuid per test15:36
slaweqso where did 1700 come from?15:36
ralonsohups, the queue number15:38
ralonsohmaybe we need to make the queue number random15:38
ralonsohI'll check it15:38
slaweqqueue number is always 115:38
slaweqit may be that there is race between those tests15:39
slaweq#action ralonsoh to check failed test_get_egress_min_bw_for_port functional test15:40
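[Editor's sketch] The failing check polls a getter until it returns the expected value or a timeout expires, then reports what it last saw. Below is a minimal sketch of that pattern. It assumes plain time.sleep instead of eventlet, and wait_until_true() and check_value() here are simplified stand-ins named after the helpers in the traceback, not the Neutron implementations.

```python
import time


class WaitTimeout(Exception):
    """Raised when the predicate never becomes true within the timeout."""


def wait_until_true(predicate, timeout=5, sleep=1):
    # Simplified polling loop; the real helper is eventlet-based.
    deadline = time.monotonic() + timeout
    while not predicate():
        if time.monotonic() >= deadline:
            raise WaitTimeout("Timed out after %s seconds" % timeout)
        time.sleep(sleep)


def check_value(expected, retrieve, timeout=5, sleep=1):
    """Return None if retrieve() reaches expected, else an error message."""
    last = {}

    def matches():
        # Remember the last retrieved value so it can be reported on failure.
        last["value"] = retrieve()
        return last["value"] == expected

    try:
        wait_until_true(matches, timeout=timeout, sleep=sleep)
    except WaitTimeout:
        return "Expected value: %s, retrieved value: %s" % (
            expected, last["value"])
    return None


# A getter that keeps returning a stale value, as if another test had
# written 1700 to the same shared resource (an assumption for illustration).
print(check_value(2800, lambda: 1700, timeout=0.05, sleep=0.01))
```

The check never measures throughput. It only compares the configured value, which is why a stale value written elsewhere, rather than host load, is the more plausible cause discussed above.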
slaweqok, that's basically all what I had for today15:40
slaweqI really didn't find many new issues in our jobs this week15:40
bcafarelnot complaining that you did not :)15:40
slaweqone last thing from me for today15:41
slaweqplease review those patches15:41
slaweqI'm pushing new UT for API policies15:41
bcafarelslaweq++ nice15:41
slaweq(and finding new bugs all the time :/)15:41
slaweqso those tests are useful IMO15:41
slaweqI know that those patches are huge but please review them :)15:42
slaweqand that's all what I have for today15:42
slaweqdo You have anything else You want to talk about today?15:43
openstackLaunchpad bug 1915341 in neutron "neutron-linuxbridge-agent not starting due to nf_tables rules" [Critical,New]15:43
ralonsohbut this could be discussed in the PTG15:43
ralonsohin a nutshell: this problem is related to nft API15:43
ralonsohif they use legacy ebtables (same as in our CI), the problem is gone15:44
ralonsohI'm trying to fix it for legacy and ebtables-nft (new API)15:44
lajoskatonaso this is why I can't reproduce it?15:44
ralonsohyou can force the new api15:44
ralonsohone sec15:44
ralonsohthis is the patch I'm using to test it15:45
ralonsohbut this is just a heads-up, we'll talk about the future of linux bridge and nft in the PTG15:45
lajoskatonathanks, I check it15:45
ralonsohI'll add a topic15:45
ralonsoh(that's all)15:45
slaweqthx for topic proposal15:45
slaweqI already added something about linuxbridge agent to the etherpad15:46
slaweqbut please add Your notes to it too :)15:46
slaweqralonsoh: regarding bug https://bugs.launchpad.net/neutron/+bug/1915341 do You think we should have note about it somewhere in our docs?15:47
openstackLaunchpad bug 1915341 in neutron "neutron-linuxbridge-agent not starting due to nf_tables rules" [Critical,New]15:47
ralonsohslaweq, yes, we should add this in the documentation15:47
ralonsohI'll do it15:47
slaweqralonsoh++ thx a lot15:48
slaweq#action ralonsoh to update LB installation guide with info about legacy ebtables15:48
slaweqwith that I think we can finish today's meeting15:49
slaweqthx for attending15:49
