| *** dkehn_ is now known as dkehn | 01:06 | |
| opendevreview | ZhouHeng proposed openstack/neutron-lib master: [ovn]Floating IP adds distributed attributes https://review.opendev.org/c/openstack/neutron-lib/+/855053 | 02:34 |
|---|---|---|
| opendevreview | liujinxin proposed openstack/neutron master: For DvrEdgeRouter, snat namespace should not be created in initialize. https://review.opendev.org/c/openstack/neutron/+/855995 | 02:40 |
| opendevreview | yangjianfeng proposed openstack/neutron-tempest-plugin master: Create extra external network with address scope for `ndp proxy` tests https://review.opendev.org/c/openstack/neutron-tempest-plugin/+/855997 | 02:53 |
| opendevreview | ZhouHeng proposed openstack/neutron-tempest-plugin master: skip some port_forwarding test https://review.opendev.org/c/openstack/neutron-tempest-plugin/+/840584 | 02:53 |
| opendevreview | yangjianfeng proposed openstack/neutron master: Forbid enable ndp proxy when external netwrok has no IPv6 address scope https://review.opendev.org/c/openstack/neutron/+/855850 | 03:17 |
| opendevreview | yangjianfeng proposed openstack/neutron master: Forbid enable ndp proxy when external netwrok has no IPv6 address scope https://review.opendev.org/c/openstack/neutron/+/855850 | 03:36 |
| opendevreview | Lajos Katona proposed openstack/networking-sfc master: Adopt to latest VlanManager changes https://review.opendev.org/c/openstack/networking-sfc/+/855887 | 08:30 |
| opendevreview | Lajos Katona proposed openstack/neutron-fwaas master: Adopt to latest VlanManager changes https://review.opendev.org/c/openstack/neutron-fwaas/+/855891 | 09:16 |
| opendevreview | Szymon Wróblewski proposed openstack/neutron master: Fix test_nova_send_events_* tests https://review.opendev.org/c/openstack/neutron/+/856034 | 09:41 |
| opendevreview | yangjianfeng proposed openstack/neutron master: Forbid enable ndp proxy when external netwrok has no IPv6 address scope https://review.opendev.org/c/openstack/neutron/+/855850 | 10:14 |
| opendevreview | yangjianfeng proposed openstack/neutron master: Forbid enable ndp proxy when external netwrok has no IPv6 address scope https://review.opendev.org/c/openstack/neutron/+/855850 | 10:52 |
| opendevreview | Slawek Kaplonski proposed openstack/neutron master: Add new role "prepare_functional_tests_logs" https://review.opendev.org/c/openstack/neutron/+/855868 | 10:57 |
| opendevreview | Slawek Kaplonski proposed openstack/neutron master: DNM Just run small subset of the functional jobs to test new role https://review.opendev.org/c/openstack/neutron/+/856039 | 10:57 |
| opendevreview | Slawek Kaplonski proposed openstack/neutron master: Add new role "prepare_functional_tests_logs" https://review.opendev.org/c/openstack/neutron/+/855868 | 12:04 |
| opendevreview | Slawek Kaplonski proposed openstack/neutron master: Add new role "prepare_functional_tests_logs" https://review.opendev.org/c/openstack/neutron/+/855868 | 13:07 |
| *** kleini- is now known as kleini | 13:17 | |
| opendevreview | Lajos Katona proposed openstack/networking-bagpipe stable/ussuri: [stable-only] Cap virtualenv for py37 https://review.opendev.org/c/openstack/networking-bagpipe/+/855883 | 13:23 |
| *** dasm is now known as Guest2115 | 13:31 | |
| *** Guest2115 is now known as dasm | 14:02 | |
| slaweq | #startmeeting neutron_ci | 15:00 |
| opendevmeet | Meeting started Tue Sep 6 15:00:12 2022 UTC and is due to finish in 60 minutes. The chair is slaweq. Information about MeetBot at http://wiki.debian.org/MeetBot. | 15:00 |
| opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 15:00 |
| opendevmeet | The meeting name has been set to 'neutron_ci' | 15:00 |
| slaweq | hi | 15:00 |
| mlavalle | o/ | 15:00 |
| ykarel | o/ | 15:00 |
| slaweq | ralonsoh_ is on PTO, bcafarel too | 15:02 |
| slaweq | I don't think lajoskatona will be able to join today | 15:03 |
| slaweq | so I guess we can start | 15:03 |
| mlavalle | probably not | 15:03 |
| slaweq | Grafana dashboard: https://grafana.opendev.org/d/f913631585/neutron-failure-rate?orgId=1 | 15:03 |
| lajoskatona | Hi, I can, but only on IRC | 15:03 |
| slaweq | lets go with first topic | 15:03 |
| slaweq | lajoskatona: hi, yeah, we have it on irc today | 15:03 |
| slaweq | #topic Actions from previous meetings | 15:03 |
| lajoskatona | and I am on mobile net, so it's possible that I will disappear from time to time.... | 15:03 |
| slaweq | slaweq to fix functional/fullstack failures on centos 9 stream: https://bugs.launchpad.net/neutron/+bug/1976323 | 15:03 |
| slaweq | lajoskatona: sure, thx for the heads up | 15:03 |
| slaweq | regarding that action item, I didn't make any progress really | 15:04 |
| slaweq | so I will add it for myself for next week too | 15:05 |
| slaweq | #action slaweq to fix functional/fullstack failures on centos 9 stream: https://bugs.launchpad.net/neutron/+bug/1976323 | 15:05 |
| slaweq | next one | 15:05 |
| slaweq | slaweq to check POST_FAILURE reasons | 15:05 |
| slaweq | I checked it with infra team and it seems that it is timing out while uploading logs to swift | 15:05 |
| slaweq | and we have a lot of small log files in the "dsvm-functional-logs" directory, and uploading all those files to Swift may be slow | 15:06 |
| lajoskatona | ok, so it is not that our tests are taking longer again | 15:06 |
| slaweq | so I prepared patch https://review.opendev.org/c/openstack/neutron/+/855868 | 15:06 |
| slaweq | lajoskatona: nope | 15:06 |
| slaweq | with that patch we will upload a .tar.gz archive with those logs to swift, which should be faster (I hope) | 15:06 |
| slaweq | I also did an additional patch https://review.opendev.org/c/openstack/neutron/+/855867/ which removes storing of the journal.log in the logs of the job | 15:07 |
| slaweq | it's not needed as devstack is already doing that and storing in the devstack.journal.gz file | 15:07 |
| slaweq | so it can save some disk space and a few seconds during the job execution :) | 15:08 |
| slaweq | please review both those patches when You have a minute or two | 15:08 |
| ykarel | ack | 15:08 |
| slaweq | next one | 15:09 |
| slaweq | ykarel to check interface not found issues in the periodic functional jobs | 15:09 |
| ykarel | yes i checked all the three failures linked | 15:09 |
| ykarel | All the failures share common symptoms where the interface gets deleted/added quickly, and in two of those failures neutron fails in that period with the device missing in the namespace | 15:09 |
| ykarel | like two of them, deleted at 02:45:35.681, readded at 02:45:35.778, fails at 02:45:35.705 | 15:09 |
| ykarel | deleted at 02:55:12.157, readded at 02:55:13.608, fails at 02:55:13.527 | 15:10 |
| ykarel | One failure shares the same observations as made by slawek in https://bugs.launchpad.net/neutron/+bug/1961740/comments/17 | 15:10 |
| ykarel | from opensearch i see some more occurrences in non periodic jobs too in master and stable/yoga | 15:10 |
| ykarel | https://opensearch.logs.openstack.org/_dashboards/app/discover/?security_tenant=global#/?_g=(filters:!(),refreshInterval:(pause:!t,value:0),time:(from:now-30d,to:now))&_a=(columns:!(_source),filters:!(),index:'94869730-aea8-11ec-9e6a-83741af3fdcd',interval:auto,query:(language:kuery,query:'message:%22not%20found%20in%20namespace%20snat%22'),sort:!()) | 15:11 |
| slaweq | ykarel: yes, I also saw that "readd" of the interfaces some time ago when I was investigating that | 15:11 |
| slaweq | but I have no idea why it happens like that | 15:11 |
| ykarel | slaweq, yes i too didn't get the root cause for that | 15:12 |
| slaweq | some time ago I even did a patch which I hoped would work around it | 15:12 |
| slaweq | let me find it | 15:12 |
| ykarel | the retry one? | 15:12 |
| slaweq | yes | 15:12 |
| ykarel | yeap that's not helping, at least not avoiding this issue completely | 15:13 |
| slaweq | I know :/ | 15:13 |
| ykarel | as in two of them i see that device added to namespace without retry | 15:13 |
| slaweq | and that's strange as interface is added/removed/added in short period of time | 15:13 |
| ykarel | but then removed | 15:13 |
| ykarel | yes | 15:14 |
| ykarel | also noticed there was cpu load > 10 around the failure, but i see similar with successful jobs | 15:15 |
| ykarel | also ram was not fully utilized during failures | 15:15 |
| ykarel | also observed many failures were seen in test patch https://review.opendev.org/c/openstack/neutron/+/854191/ as per opensearch | 15:16 |
| ykarel | but that's just to trigger jobs, a lot of jobs | 15:17 |
| ykarel | i recall some time back it was discussed to not use rootwrap in functional tests, you think that's related here? | 15:18 |
| slaweq | maybe I have some theory | 15:18 |
| slaweq | it is failing with error like "Interface not found in namespace snat..." or something like that | 15:19 |
| ykarel | yes | 15:19 |
| slaweq | so maybe as device is re-added, it's not in the snat-XXX namespace but in the global namespace | 15:20 |
| slaweq | and that's why it cannot find it | 15:20 |
| slaweq | look at https://github.com/openstack/neutron/blob/master/neutron/agent/linux/ip_lib.py#L463 | 15:20 |
| slaweq | it's where it is failing | 15:20 |
| slaweq | and here "self._parent.namespace" is namespace in which interface is looked for? | 15:20 |
| slaweq | and "net_ns_fd=namespace" is attribute to set for the interface | 15:21 |
| slaweq | so it is expected to be in snat-XXX namespace but it's not there | 15:21 |
| slaweq | as it was deleted/added again | 15:21 |
| slaweq | does it make sense? | 15:21 |
| ykarel | didn't get why it's in global namespace | 15:21 |
| slaweq | when You are adding new port it's always first in global namespace | 15:22 |
| slaweq | right? | 15:22 |
| ykarel | yes i think so | 15:22 |
| ykarel | and to add it to namespace it needs some explicit calss | 15:23 |
| ykarel | s/calss/calls | 15:23 |
| slaweq | ok, I know why | 15:25 |
| slaweq | it's bug in my retry | 15:25 |
| ykarel | ahh | 15:25 |
| slaweq | when it calls first time add_device_to_namespace | 15:25 |
| slaweq | it sets parent namespace to namespace in https://github.com/openstack/neutron/blob/master/neutron/agent/linux/ip_lib.py#L464 | 15:26 |
| slaweq | and when it's deleted and added again, it's in global namespace | 15:26 |
| slaweq | but _parent.namespace is already set | 15:26 |
| slaweq | so that's why it's failing as it's looking for it in wrong namespace | 15:26 |
| slaweq | :) | 15:26 |
| slaweq | in this except block https://github.com/openstack/neutron/blob/master/neutron/agent/linux/interface.py#L360 | 15:27 |
| slaweq | we should do something like: | 15:27 |
| slaweq | device._parent.namespace = None before retrying | 15:27 |
| slaweq | and that should make it work fine IMO | 15:27 |
| ykarel | so iiuc this will fix the case where it's failing even after multiple retry, right? | 15:28 |
| slaweq | yes | 15:28 |
| ykarel | not the other two cases | 15:28 |
| ykarel | okk | 15:28 |
| slaweq | I will propose patch for that | 15:28 |
| ykarel | k Thanks | 15:29 |
| slaweq | I think it may fix all cases where interface is "re-added" | 15:30 |
| slaweq | as currently retry mechanism is broken | 15:30 |
| slaweq | #action slaweq to fix add_device_to_namespace retry mechanism | 15:32 |
| ykarel | if add_interface_to_namespace is called every time a port is added to ovs-bridge, then yes it should fix it | 15:32 |
| ykarel | i still have to check complete flow | 15:32 |
| slaweq | k | 15:32 |
| slaweq | I will propose patch to fix that issue which we found | 15:32 |
| slaweq | but if You will find anything else, please propose fixes too :) | 15:33 |
| slaweq | ok, lets move on | 15:33 |
| slaweq | mlavalle to check failing quota test in openstack-tox-py39-with-oslo-master periodic job | 15:33 |
| mlavalle | It is failing sometimes | 15:33 |
| mlavalle | I filed this bug: https://bugs.launchpad.net/neutron/+bug/1988604 | 15:34 |
| mlavalle | and proposed this fix: https://review.opendev.org/c/openstack/neutron/+/855703 | 15:34 |
| lajoskatona | quick question: can it be related to the sqlalchemy2.0 vs oslo.db release thread? | 15:34 |
| mlavalle | it might | 15:35 |
| lajoskatona | ok, thanks, it is interesting to have an opinion on the debate | 15:35 |
| lajoskatona | this morning I said let's wait with it, but if we are on the safe side with our best understanding let's have a release | 15:36 |
| slaweq | lajoskatona: oslo.db version which has this "issue" is 12.1.0, right? | 15:36 |
| lajoskatona | yes I think | 15:36 |
| slaweq | ok | 15:36 |
| slaweq | in "normal" unit test jobs we are still using 12.0.0 | 15:36 |
| lajoskatona | It is not so much an issue, more that some projects have not adopted it, and we have this flapping job | 15:36 |
| slaweq | so that's why those jobs are working fine | 15:36 |
| lajoskatona | yes this is how I understand | 15:37 |
| slaweq | mlavalle: I just ran experimental jobs on Your patch | 15:37 |
| slaweq | I think we can run it few times to check if that oslo-master job will be stable with it | 15:37 |
| lajoskatona | but if mlavalle's patch fixes the job, I would say let's have this oslo.db out | 15:37 |
| slaweq | ++ | 15:37 |
| lajoskatona | slaweq: good idea | 15:37 |
| lajoskatona | i forgot that we have experimental for this | 15:37 |
| slaweq | mlavalle: and also, I would really like ralonsoh_ to look at Your patch too :) | 15:38 |
| mlavalle | slaweq: we actually discussed it before he went on vacation | 15:38 |
| lajoskatona | +1 | 15:38 |
| mlavalle | it is in this channel's log a week ago | 15:38 |
| slaweq | mlavalle: ahh, ok | 15:38 |
| slaweq | so if he was fine with it, I'm good too :) | 15:39 |
| slaweq | I trust You ;) | 15:39 |
| mlavalle | yes, he was | 15:39 |
| lajoskatona | ok, then let's see the experimental jobs results and go back to the thread | 15:39 |
| slaweq | ++ | 15:39 |
| slaweq | thx mlavalle | 15:39 |
| slaweq | next topic then | 15:39 |
| slaweq | #topic Stable branches | 15:39 |
| slaweq | anything new regarding stable branches? | 15:40 |
| lajoskatona | I just checked (https://review.opendev.org/c/openstack/requirements/+/855973 ) and cinder seems to be failing but I can't check the logs on mobile net :P | 15:40 |
| lajoskatona | elodilles proposed a series for capping virtualenv: https://review.opendev.org/q/topic:cap-virtualenv-py37 | 15:40 |
| lajoskatona | it affects some networking projects also, I started to check (bagpipe perhaps); if you have time please keep an eye on these | 15:41 |
| lajoskatona | it is for ussuri only as I see | 15:41 |
| slaweq | thx lajoskatona | 15:42 |
| slaweq | I will take a look | 15:42 |
| lajoskatona | thanks | 15:42 |
| slaweq | ok, next topic | 15:42 |
| slaweq | #topic Stadium projects | 15:42 |
| * slaweq will be back in 2 minutes | 15:43 | |
| lajoskatona | One topic, with the segments patches we let in a change in vlanmanager that breaks some stadium projects | 15:43 |
| lajoskatona | I added the patches to the etherpad | 15:43 |
| lajoskatona | https://review.opendev.org/c/openstack/networking-bagpipe/+/855886 | 15:43 |
| lajoskatona | https://review.opendev.org/c/openstack/networking-sfc/+/855887 | 15:43 |
| lajoskatona | https://review.opendev.org/c/openstack/neutron-fwaas/+/855891 | 15:43 |
| lajoskatona | it was too late when I switched to FF mode and stopped merging of this feature sorry for it :-( | 15:44 |
| * slaweq is back | 15:45 | |
| slaweq | no worries | 15:45 |
| slaweq | good that we found it before final release of Zed | 15:45 |
| slaweq | so we still have time to fix those | 15:45 |
| lajoskatona | And i have to drop (low battery, and have to fetch my sons from English lesson) | 15:46 |
| lajoskatona | yeah good that we have periodic jobs :-) | 15:46 |
| slaweq | lajoskatona: thx, see You | 15:46 |
| lajoskatona | o/ | 15:46 |
| slaweq | ok, lets move on to the next topic | 15:46 |
| slaweq | #topic Grafana | 15:46 |
| slaweq | dashboards look pretty good IMO | 15:47 |
| mlavalle | yeap | 15:47 |
| slaweq | I don't see anything very bad there | 15:47 |
| slaweq | do You see anything worth discussion there? | 15:47 |
| slaweq | if not, I think we can quickly move on | 15:48 |
| slaweq | #topic Rechecks | 15:48 |
| slaweq | rechecks stats are in the meeting agenda etherpad https://etherpad.opendev.org/p/neutron-ci-meetings#L52 | 15:49 |
| slaweq | basically it looks good still | 15:49 |
| slaweq | last week we had 0.17 rechecks on average to get a patch merged | 15:49 |
| slaweq | this week it's 1.5 but it's just the beginning of the week | 15:49 |
| slaweq | so hopefully it will be better | 15:49 |
| slaweq | regarding bare rechecks it's also much better this week | 15:50 |
| slaweq | +---------+---------------+--------------+-------------------+... (full message at https://matrix.org/_matrix/media/r0/download/matrix.org/WAYKKcqlXmMdgZjqrbwNZgul) | 15:50 |
| slaweq | thx a lot to all of You who are checking failures before rechecking :) | 15:50 |
| mlavalle | +1 | 15:50 |
| slaweq | anything else You want to add/ask regarding rechecks? | 15:51 |
| mlavalle | nope | 15:51 |
| slaweq | ok, so next topic | 15:52 |
| slaweq | #topic fullstack/functional | 15:52 |
| slaweq | here I found one "new" error | 15:52 |
| slaweq | https://zuul.openstack.org/build/ad0801f20bc143cebf5692440b331df4 | 15:52 |
| slaweq | metadata proxy didn't start | 15:52 |
| slaweq | but I didn't have time to look into it deeper | 15:53 |
| slaweq | anyone wants to check it? | 15:54 |
| mlavalle | I'll look | 15:54 |
| slaweq | from log https://6338bbe59b3242bd04ef-84c9f5cd8c2b87d7cd3ff61e3f0a2559.ssl.cf2.rackcdn.com/periodic/opendev.org/openstack/neutron/master/neutron-functional-with-uwsgi-fips/ad0801f/controller/logs/dsvm-functional-logs/neutron.tests.functional.agent.test_dhcp_agent.DHCPAgentOVSTestCase.test_metadata_proxy_respawned.txt it seems that it was respawned | 15:54 |
| mlavalle | we don't know if it's an issue yet, right? | 15:54 |
| slaweq | so maybe that's some issue in test | 15:54 |
| slaweq | mlavalle: nope | 15:54 |
| slaweq | thx for volunteering | 15:54 |
| slaweq | #action mlavalle to check metadata proxy not respawned error | 15:55 |
| slaweq | mlavalle: but please don't treat it as high priority (for now) as it happened only once | 15:55 |
| mlavalle | yeap, that's why I asked | 15:55 |
| slaweq | ++ | 15:55 |
| slaweq | any other issues/questions related to the functional or fullstack jobs? | 15:56 |
| slaweq | or can we move on? | 15:56 |
| slaweq | ok, lets move on | 15:56 |
| slaweq | #topic Tempest/Scenario | 15:56 |
| slaweq | here I just wanted to share with You one failure | 15:57 |
| slaweq | https://3525f1c73d59ef5d5b98-485374e596f765d9f96c9ac94e680c34.ssl.cf2.rackcdn.com/840421/34/check/neutron-tempest-plugin-ovn/b503178/testr_results.html | 15:57 |
| slaweq | it seems like some segfault in the guest ubuntu image | 15:57 |
| slaweq | I saw it only once and it's not neutron related issue | 15:57 |
| slaweq | but just wanted to make You aware of things like that | 15:57 |
| slaweq | and that's all | 15:57 |
| slaweq | regarding periodic jobs, it looks good this week | 15:58 |
| slaweq | it was even all green for 3 or 4 days so it's great | 15:58 |
| slaweq | that's all from me for today | 15:58 |
| slaweq | any last minute topics for the CI meeting for today? | 15:58 |
| ykarel | none from me | 15:58 |
| mlavalle | none from me either | 15:59 |
| slaweq | ok, if not, then thx for attending the meeting and have a great week :) | 15:59 |
| slaweq | #endmeeting | 15:59 |
| opendevmeet | Meeting ended Tue Sep 6 15:59:13 2022 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 15:59 |
| opendevmeet | Minutes: https://meetings.opendev.org/meetings/neutron_ci/2022/neutron_ci.2022-09-06-15.00.html | 15:59 |
| opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/neutron_ci/2022/neutron_ci.2022-09-06-15.00.txt | 15:59 |
| opendevmeet | Log: https://meetings.opendev.org/meetings/neutron_ci/2022/neutron_ci.2022-09-06-15.00.log.html | 15:59 |
| mlavalle | o/ | 15:59 |
| opendevreview | Merged openstack/neutron-tempest-plugin master: skip some port_forwarding test https://review.opendev.org/c/openstack/neutron-tempest-plugin/+/840584 | 16:30 |
| opendevreview | Merged openstack/neutron master: [api]adds port_forwarding id when list floatingip https://review.opendev.org/c/openstack/neutron/+/840565 | 17:14 |
| opendevreview | Merged openstack/neutron stable/train: Bump revision number of objects when description is changed https://review.opendev.org/c/openstack/neutron/+/854990 | 17:14 |
| *** dasm is now known as dasm|off | 22:56 | |
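
Editor's sketch of the retry bug discussed between 15:25 and 15:30: the toy model below is not neutron code. `FakeHost`, `FakeDevice`, the `reset_on_retry` flag and the `after_first_move` hook are all hypothetical names, assumed here only to illustrate the idea that a device object caches the namespace after the first successful move, so a retry after a delete/re-add searches the wrong namespace unless the cached value is reset to the global namespace (`None`).

```python
class DeviceNotFound(Exception):
    """Raised when an interface is not in the namespace that was searched."""


_ABSENT = object()  # sentinel: interface does not exist at all


class FakeHost:
    """Toy stand-in for the kernel: interface name -> namespace (None = global)."""

    def __init__(self):
        self.location = {}

    def add_port(self, name):
        # A newly added (or re-added) port always starts in the global namespace.
        self.location[name] = None

    def move(self, name, src, dst):
        if self.location.get(name, _ABSENT) != src:
            raise DeviceNotFound(f"{name} not found in namespace {src}")
        self.location[name] = dst


class FakeDevice:
    """Remembers the namespace it believes the interface lives in."""

    def __init__(self, host, name):
        self.host = host
        self.name = name
        self.namespace = None

    def set_netns(self, namespace):
        # Search in the cached namespace, then cache the new one on success.
        self.host.move(self.name, self.namespace, namespace)
        self.namespace = namespace

    def exists(self):
        return self.host.location.get(self.name, _ABSENT) == self.namespace


def add_device_to_namespace(device, namespace, reset_on_retry,
                            after_first_move=None, attempts=3):
    """Return the attempt number that succeeded, or raise DeviceNotFound."""
    for attempt in range(1, attempts + 1):
        try:
            device.set_netns(namespace)
            if attempt == 1 and after_first_move is not None:
                after_first_move()  # simulate the delete/re-add race
            if device.exists():
                return attempt
            raise DeviceNotFound(f"{device.name} vanished from {namespace}")
        except DeviceNotFound:
            if reset_on_retry:
                # The proposed fix: forget the cached namespace before retrying.
                device.namespace = None
    raise DeviceNotFound(f"gave up after {attempts} attempts")


def run(reset_on_retry):
    host = FakeHost()
    host.add_port("sg-1234")
    device = FakeDevice(host, "sg-1234")
    try:
        return add_device_to_namespace(
            device, "snat-XXX", reset_on_retry,
            after_first_move=lambda: host.add_port("sg-1234"))
    except DeviceNotFound:
        return None
```

In this model `run(False)` exhausts every retry because each one looks for the interface in `snat-XXX`, while `run(True)` succeeds on the second attempt because the retry searches the global namespace again. Whether the real fix covers every "re-added" case still depends on the complete flow that ykarel said remained to be checked.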