Tuesday, 2022-09-06

*** dkehn_ is now known as dkehn01:06
opendevreviewZhouHeng proposed openstack/neutron-lib master: [ovn]Floating IP adds distributed attributes  https://review.opendev.org/c/openstack/neutron-lib/+/85505302:34
opendevreviewliujinxin proposed openstack/neutron master: For DvrEdgeRouter, snat namespace should not be created in initialize.  https://review.opendev.org/c/openstack/neutron/+/85599502:40
opendevreviewyangjianfeng proposed openstack/neutron-tempest-plugin master: Create extra external network with address scope for `ndp proxy` tests  https://review.opendev.org/c/openstack/neutron-tempest-plugin/+/85599702:53
opendevreviewZhouHeng proposed openstack/neutron-tempest-plugin master: skip some port_forwarding test  https://review.opendev.org/c/openstack/neutron-tempest-plugin/+/84058402:53
opendevreviewyangjianfeng proposed openstack/neutron master: Forbid enable ndp proxy when external netwrok has no IPv6 address scope  https://review.opendev.org/c/openstack/neutron/+/85585003:17
opendevreviewyangjianfeng proposed openstack/neutron master: Forbid enable ndp proxy when external netwrok has no IPv6 address scope  https://review.opendev.org/c/openstack/neutron/+/85585003:36
opendevreviewLajos Katona proposed openstack/networking-sfc master: Adopt to latest VlanManager changes  https://review.opendev.org/c/openstack/networking-sfc/+/85588708:30
opendevreviewLajos Katona proposed openstack/neutron-fwaas master: Adopt to latest VlanManager changes  https://review.opendev.org/c/openstack/neutron-fwaas/+/85589109:16
opendevreviewSzymon Wróblewski proposed openstack/neutron master: Fix test_nova_send_events_* tests  https://review.opendev.org/c/openstack/neutron/+/85603409:41
opendevreviewyangjianfeng proposed openstack/neutron master: Forbid enable ndp proxy when external netwrok has no IPv6 address scope  https://review.opendev.org/c/openstack/neutron/+/85585010:14
opendevreviewyangjianfeng proposed openstack/neutron master: Forbid enable ndp proxy when external netwrok has no IPv6 address scope  https://review.opendev.org/c/openstack/neutron/+/85585010:52
opendevreviewSlawek Kaplonski proposed openstack/neutron master: Add new role "prepare_functional_tests_logs"  https://review.opendev.org/c/openstack/neutron/+/85586810:57
opendevreviewSlawek Kaplonski proposed openstack/neutron master: DNM Just run small subset of the functional jobs to test new role  https://review.opendev.org/c/openstack/neutron/+/85603910:57
opendevreviewSlawek Kaplonski proposed openstack/neutron master: Add new role "prepare_functional_tests_logs"  https://review.opendev.org/c/openstack/neutron/+/85586812:04
opendevreviewSlawek Kaplonski proposed openstack/neutron master: Add new role "prepare_functional_tests_logs"  https://review.opendev.org/c/openstack/neutron/+/85586813:07
*** kleini- is now known as kleini13:17
opendevreviewLajos Katona proposed openstack/networking-bagpipe stable/ussuri: [stable-only] Cap virtualenv for py37  https://review.opendev.org/c/openstack/networking-bagpipe/+/85588313:23
*** dasm is now known as Guest211513:31
*** Guest2115 is now known as dasm14:02
slaweq#startmeeting neutron_ci15:00
opendevmeetMeeting started Tue Sep  6 15:00:12 2022 UTC and is due to finish in 60 minutes.  The chair is slaweq. Information about MeetBot at http://wiki.debian.org/MeetBot.15:00
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.15:00
opendevmeetThe meeting name has been set to 'neutron_ci'15:00
slaweqhi15:00
mlavalleo/15:00
ykarelo/15:00
slaweqralonsoh_ is on PTO, bcafarel too15:02
slaweqI don't think lajoskatona will be able to join today15:03
slaweqso I guess we can start15:03
mlavalleprobably not15:03
slaweqGrafana dashboard: https://grafana.opendev.org/d/f913631585/neutron-failure-rate?orgId=115:03
lajoskatonaHi, I can, but only on IRC15:03
slaweqlet's go with the first topic15:03
slaweqlajoskatona: hi, yeah, we have it on irc today15:03
slaweq#topic Actions from previous meetings15:03
lajoskatonaand I am on mobile net, so it's possible that I will disappear from time to time....15:03
slaweq    slaweq to fix functional/fullstack failures on centos 9 stream: https://bugs.launchpad.net/neutron/+bug/197632315:03
slaweqlajoskatona: sure, thx for the heads up15:03
slaweqregarding that action item, I didn't make any progress really15:04
slaweqso I will add it for myself for next week too15:05
slaweq#action slaweq to fix functional/fullstack failures on centos 9 stream: https://bugs.launchpad.net/neutron/+bug/197632315:05
slaweqnext one15:05
slaweqslaweq to check POST_FAILURE reasons15:05
slaweqI checked it with the infra team and it seems that it is timing out while uploading logs to swift15:05
slaweqand we have a lot of small log files in the "dsvm-functional-logs" directory, so uploading all those files to Swift may be slow15:06
lajoskatonaok, so it is not that our tests are taking longer again15:06
slaweqso I prepared patch https://review.opendev.org/c/openstack/neutron/+/85586815:06
slaweqlajoskatona: nope15:06
slaweqwith that patch we will upload a .tar.gz archive with those logs to swift, which should be faster (I hope)15:06
slaweqI also did an additional patch https://review.opendev.org/c/openstack/neutron/+/855867/ which stops storing journal.log in the logs of the job15:07
slaweqit's not needed as devstack is already doing that and storing it in the devstack.journal.gz file15:07
slaweqso it can save some disk space and a few seconds during the job execution :)15:08
slaweqplease review both those patches when You have a minute or two15:08
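[Editor's sketch] The idea discussed above, replacing many small uploads with one archive, could look roughly like the following. This is only an illustration in Python: the actual patch adds a job role, and the function name `archive_logs` and the paths used here are assumptions, not taken from that patch.

```python
import tarfile
from pathlib import Path


def archive_logs(log_dir, archive_path):
    """Pack every regular file under log_dir into one .tar.gz archive.

    Hypothetical helper for illustration only. archive_path should live
    outside log_dir. Returns the number of files stored.
    """
    log_dir = Path(log_dir)
    count = 0
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in sorted(log_dir.rglob("*")):
            if path.is_file():
                # Keep the directory name as a prefix inside the archive.
                tar.add(path, arcname=str(path.relative_to(log_dir.parent)))
                count += 1
    return count
```

A single archive means one object to upload instead of one per test log, which is why it may help when the job times out during log upload.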
ykarelack15:08
slaweqnext one15:09
slaweqykarel to check interface not found issues in the periodic functional jobs15:09
ykarelyes i checked all the three failures linked15:09
ykarelAll the failures share common symptoms where the interface gets deleted/added quickly; in two of those failures neutron fails with device missing in namespace during that period15:09
ykarelin those two: deleted at 02:45:35.681, readded at 02:45:35.778, fails at 02:45:35.70515:09
ykareldeleted at 02:55:12.157, readded at 02:55:13.608, fails at 02:55:13.52715:10
ykarel    One failure shares the same observations as made by slawek in https://bugs.launchpad.net/neutron/+bug/1961740/comments/1715:10
ykarelfrom opensearch i see some more occurrences in non periodic jobs too in master and stable/yoga15:10
ykarelhttps://opensearch.logs.openstack.org/_dashboards/app/discover/?security_tenant=global#/?_g=(filters:!(),refreshInterval:(pause:!t,value:0),time:(from:now-30d,to:now))&_a=(columns:!(_source),filters:!(),index:'94869730-aea8-11ec-9e6a-83741af3fdcd',interval:auto,query:(language:kuery,query:'message:%22not%20found%20in%20namespace%20snat%22'),sort:!())15:11
slaweqykarel: yes, I also saw that "readd" of the interfaces some time ago when I was investigating that15:11
slaweqbut I have no idea why it happens like that15:11
ykarelslaweq, yes i didn't get the root cause for that either15:12
slaweqsome time ago I even did a patch which I hoped would work around it15:12
slaweqlet me find it15:12
ykarelthe retry one?15:12
slaweqyes15:12
ykarelyeap that's not helping, at least not avoiding this issue completely15:13
slaweqI know :/15:13
ykarelas in two of them i see that device added to namespace without retry15:13
slaweqand that's strange as interface is added/removed/added in short period of time15:13
ykarelbut then removed15:13
ykarelyes15:14
ykarelalso noticed there was cpu load > 10 around the failure, but i see similar with success jobs15:15
ykarelalso ram was not fully utilized during failures15:15
ykarelalso observed many failures were seen in test patch https://review.opendev.org/c/openstack/neutron/+/854191/ as per opensearch15:16
ykarelbut that's just to trigger jobs, a lot of jobs15:17
ykareli recall some time back it was discussed to not use rootwrap in functional tests, you think that's related here?15:18
slaweqmaybe I have some theory15:18
slaweqit is failing with error like "Interface not found in namespace snat..." or something like that15:19
ykarelyes15:19
slaweqso maybe as device is re-added, it's not in the snat-XXX namespace but in the global namespace15:20
slaweqand that's why it cannot find it15:20
slaweqlook at https://github.com/openstack/neutron/blob/master/neutron/agent/linux/ip_lib.py#L46315:20
slaweqit's where it is failing15:20
slaweqand here "self._parent.namespace" is namespace in which interface is looked for?15:20
slaweqand "net_ns_fd=namespace" is attribute to set for the interface15:21
slaweqso it is expected to be in snat-XXX namespace but it's not there15:21
slaweqas it was deleted/added again15:21
slaweqdoes it make sense?15:21
ykareldidn't get why it's in global namespace15:21
slaweqwhen You are adding new port it's always first in global namespace15:22
slaweqright?15:22
ykarelyes i think so 15:22
ykareland to add it to namespace it needs some explicit calss15:23
ykarels/calss/calls15:23
slaweqok, I know why15:25
slaweqit's bug in my retry15:25
ykarelahh15:25
slaweqwhen it calls add_device_to_namespace the first time15:25
slaweqit sets parent namespace to namespace in https://github.com/openstack/neutron/blob/master/neutron/agent/linux/ip_lib.py#L46415:26
slaweqand when it's deleted and added again, it's in global namespace15:26
slaweqbut _parent.namespace is already set15:26
slaweqso that's why it's failing as it's looking for it in wrong namespace15:26
slaweq:)15:26
slaweqin this except block https://github.com/openstack/neutron/blob/master/neutron/agent/linux/interface.py#L36015:27
slaweqwe should do something like:15:27
slaweq    device._parent.namespace = None before retrying15:27
slaweqand that should make it work fine IMO15:27
ykarelso iiuc this will fix the case where it's failing even after multiple retries, right?15:28
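[Editor's sketch] The retry problem described above can be reproduced with a toy model. This is not neutron's code: `ToyHost`, `ToyDevice` and `retry_after_readd` are hypothetical names, and the model only assumes what the discussion states, namely that the device object remembers the namespace it was last moved to and looks the device up there on the next attempt.

```python
class DeviceNotFound(Exception):
    """Raised when a device is not in the namespace we look it up in."""


class ToyHost:
    """Tracks which namespace each device lives in (None is the global one)."""

    def __init__(self):
        self.devices = {}

    def move(self, name, lookup_ns, target_ns):
        if name not in self.devices or self.devices[name] != lookup_ns:
            raise DeviceNotFound(f"{name} not found in namespace {lookup_ns}")
        self.devices[name] = target_ns


class ToyDevice:
    def __init__(self, host, name):
        self.host = host
        self.name = name
        self.namespace = None  # where we believe the device currently is

    def set_netns(self, target_ns):
        # Look the device up in the remembered namespace, then remember
        # the new one (mirrors the behaviour described in the discussion).
        self.host.move(self.name, self.namespace, target_ns)
        self.namespace = target_ns


def retry_after_readd(reset_namespace):
    """Return True if the second set_netns succeeds after a re-add."""
    host = ToyHost()
    host.devices["sg-1"] = None
    dev = ToyDevice(host, "sg-1")
    dev.set_netns("snat-x")      # first attempt succeeds
    host.devices["sg-1"] = None  # device deleted and re-added: global again
    if reset_namespace:
        dev.namespace = None     # the reset proposed before retrying
    try:
        dev.set_netns("snat-x")
        return True
    except DeviceNotFound:
        return False
```

In this model `retry_after_readd(False)` fails because the lookup happens in the stale namespace, while `retry_after_readd(True)` succeeds once the remembered namespace is cleared before the retry.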
slaweqyes15:28
ykarelnot the other two cases 15:28
ykarelokk15:28
slaweqI will propose patch for that15:28
ykarelk Thanks15:29
slaweqI think it may fix all cases where interface is "re-added"15:30
slaweqas currently retry mechanism is broken15:30
slaweq#action slaweq to fix add_device_to_namespace retry mechanism15:32
ykarelif add_interface_to_namespace is called every time a port is added to ovs-bridge, then yes it should fix it15:32
ykareli still have to check complete flow15:32
slaweqk15:32
slaweqI will propose patch to fix that issue which we found15:32
slaweqbut if You find anything else, please propose fixes too :)15:33
slaweqok, let's move on15:33
slaweqmlavalle to check failing quota test in openstack-tox-py39-with-oslo-master periodic job15:33
mlavalleIt is failing sometimes15:33
mlavalleI filed this bug: https://bugs.launchpad.net/neutron/+bug/198860415:34
mlavalleand proposed this fix: https://review.opendev.org/c/openstack/neutron/+/85570315:34
lajoskatonaquick question: can it be related to the sqlalchemy2.0 vs oslo.db release thread?15:34
mlavalleit might15:35
lajoskatonaok, thanks, it is interesting to have an opinion on the debate15:35
lajoskatonathis morning I said let's wait with it, but if we are on the safe side with our best understanding let's have a release 15:36
slaweqlajoskatona: oslo.db version which has this "issue" is 12.1.0, right?15:36
lajoskatonayes I think15:36
slaweqok15:36
slaweqin "normal" unit test jobs we are still using 12.0.015:36
lajoskatonaIt is not so much an issue; rather, some projects have not adopted it, and we have this flapping job15:36
slaweqso that's why those jobs are working fine15:36
lajoskatonayes this is how I understand15:37
slaweqmlavalle: I just ran experimental jobs on Your patch15:37
slaweqI think we can run it few times to check if that oslo-master job will be stable with it15:37
lajoskatonabut if mlavalle's patch fixes the job, I would say let's have this oslo.db out15:37
slaweq++15:37
lajoskatonaslaweq: good idea15:37
lajoskatonai forgot that we have experimental for this15:37
slaweqmlavalle: and also, I would really like ralonsoh_ to look at Your patch too :)15:38
mlavalleslaweq: we actually discussed it before he went on vacation15:38
lajoskatona+115:38
mlavalleit is in this channel's log a week ago15:38
slaweqmlavalle: ahh, ok15:38
slaweqso if he was fine with it, I'm good too :)15:39
slaweqI trust You ;)15:39
mlavalleyes, he was15:39
lajoskatonaok, then let's see the experimental jobs results and go back to the thread15:39
slaweq++15:39
slaweqthx mlavalle 15:39
slaweqnext topic then15:39
slaweq#topic Stable branches15:39
slaweqanything new regarding stable branches?15:40
lajoskatonaI just checked (https://review.opendev.org/c/openstack/requirements/+/855973 ) and cinder seems to be failing but I can't check the logs on mobile net :P15:40
lajoskatonaelodilles proposed a series for capping virtualenv: https://review.opendev.org/q/topic:cap-virtualenv-py3715:40
lajoskatonait affects some networking projects also, I started to check (bagpipe perhaps); if you have time please keep an eye on these15:41
lajoskatonait is for ussuri only as I see15:41
slaweqthx lajoskatona 15:42
slaweqI will take a look15:42
lajoskatonathanks15:42
slaweqok, next topic15:42
slaweq#topic Stadium projects15:42
* slaweq will be back in 2 minutes15:43
lajoskatonaOne topic, with the segments patches we let in a change in vlanmanager that breaks some stadium projects15:43
lajoskatonaI added the patches to the etherpad15:43
lajoskatonahttps://review.opendev.org/c/openstack/networking-bagpipe/+/85588615:43
lajoskatonahttps://review.opendev.org/c/openstack/networking-sfc/+/85588715:43
lajoskatonahttps://review.opendev.org/c/openstack/neutron-fwaas/+/85589115:43
lajoskatonait was too late when I switched to FF mode and stopped merging of this feature sorry for it :-(15:44
* slaweq is back15:45
slaweqno worries15:45
slaweqgood that we found it before final release of Zed15:45
slaweqso we still have time to fix those15:45
lajoskatonaAnd i have to drop (low battery, and have to fetch my sons from English lesson)15:46
lajoskatonayeah good that we have periodic jobs :-)15:46
slaweqlajoskatona: thx, see You15:46
lajoskatonao/15:46
slaweqok, let's move on to the next topic15:46
slaweq#topic Grafana15:46
slaweqdashboards look pretty good IMO15:47
mlavalleyeap15:47
slaweqI don't see anything very bad there15:47
slaweqdo You see anything worth discussing there?15:47
slaweqif not, I think we can quickly move on15:48
slaweq#topic Rechecks15:48
slaweqrechecks stats are in the meeting agenda etherpad https://etherpad.opendev.org/p/neutron-ci-meetings#L5215:49
slaweqbasically it looks good still15:49
slaweqlast week we had 0.17 rechecks on average to get a patch merged15:49
slaweqthis week it's 1.5 but it's just the beginning of the week15:49
slaweqso hopefully it will be better15:49
slaweqregarding bare rechecks it's also much better this week15:50
slaweq+---------+---------------+--------------+-------------------+... (full message at https://matrix.org/_matrix/media/r0/download/matrix.org/WAYKKcqlXmMdgZjqrbwNZgul)15:50
slaweqthx a lot to all of You who are checking failures before rechecking :)15:50
mlavalle+115:50
slaweqanything else You want to add/ask regarding rechecks?15:51
mlavallenope15:51
slaweqok, so next topic15:52
slaweq#topic fullstack/functional15:52
slaweqhere I found one "new" error15:52
slaweqhttps://zuul.openstack.org/build/ad0801f20bc143cebf5692440b331df415:52
slaweqmetadata proxy didn't start15:52
slaweqbut I didn't have time to look into it deeper15:53
slaweqanyone wants to check it?15:54
mlavalleI'll look15:54
slaweqfrom log https://6338bbe59b3242bd04ef-84c9f5cd8c2b87d7cd3ff61e3f0a2559.ssl.cf2.rackcdn.com/periodic/opendev.org/openstack/neutron/master/neutron-functional-with-uwsgi-fips/ad0801f/controller/logs/dsvm-functional-logs/neutron.tests.functional.agent.test_dhcp_agent.DHCPAgentOVSTestCase.test_metadata_proxy_respawned.txt it seems that it was respawned15:54
mlavallewe don't know if it's an issue yet, right?15:54
slaweqso maybe that's some issue in test15:54
slaweqmlavalle: nope15:54
slaweqthx for volunteering15:54
slaweq#action mlavalle to check metadata proxy not respawned error15:55
slaweqmlavalle: but please don't treat it as high priority (for now) as it happened only once15:55
mlavalleyeap, that's why I asked15:55
slaweq++15:55
slaweqany other issues/questions related to the functional or fullstack jobs?15:56
slaweqor can we move on?15:56
slaweqok, let's move on15:56
slaweq#topic Tempest/Scenario15:56
slaweqhere I just wanted to share with You one failure15:57
slaweqhttps://3525f1c73d59ef5d5b98-485374e596f765d9f96c9ac94e680c34.ssl.cf2.rackcdn.com/840421/34/check/neutron-tempest-plugin-ovn/b503178/testr_results.html15:57
slaweqit seems like some segfault in the guest ubuntu image15:57
slaweqI saw it only once and it's not neutron related issue15:57
slaweqbut just wanted to make You aware of things like that15:57
slaweqand that's all15:57
slaweqregarding periodic jobs, it looks good this week15:58
slaweqit was even all green for 3 or 4 days so it's great15:58
slaweqthat's all from me for today15:58
slaweqany last minute topics for the CI meeting for today?15:58
ykarelnone from me15:58
mlavallenone from me either15:59
slaweqok, if not, then thx for attending the meeting and have a great week :)15:59
slaweq#endmeeting15:59
opendevmeetMeeting ended Tue Sep  6 15:59:13 2022 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)15:59
opendevmeetMinutes:        https://meetings.opendev.org/meetings/neutron_ci/2022/neutron_ci.2022-09-06-15.00.html15:59
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/neutron_ci/2022/neutron_ci.2022-09-06-15.00.txt15:59
opendevmeetLog:            https://meetings.opendev.org/meetings/neutron_ci/2022/neutron_ci.2022-09-06-15.00.log.html15:59
mlavalleo/15:59
opendevreviewMerged openstack/neutron-tempest-plugin master: skip some port_forwarding test  https://review.opendev.org/c/openstack/neutron-tempest-plugin/+/84058416:30
opendevreviewMerged openstack/neutron master: [api]adds port_forwarding id when list floatingip  https://review.opendev.org/c/openstack/neutron/+/84056517:14
opendevreviewMerged openstack/neutron stable/train: Bump revision number of objects when description is changed  https://review.opendev.org/c/openstack/neutron/+/85499017:14
*** dasm is now known as dasm|off22:56