15:00:06 <slaweq> #startmeeting neutron_ci
15:00:06 <opendevmeet> Meeting started Tue Jun 28 15:00:06 2022 UTC and is due to finish in 60 minutes. The chair is slaweq. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:06 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:06 <opendevmeet> The meeting name has been set to 'neutron_ci'
15:00:32 <ralonsoh> hi
15:00:41 <slaweq> hi
15:00:51 <mlavalle> hi
15:01:03 <bcafarel> o/
15:01:04 <mlavalle> is this a video or irc meeting?
15:01:27 <slaweq> mlavalle today on irc
15:01:32 <mlavalle> ack
15:01:55 <slaweq> I think we can start as lajoskatona is not available today
15:02:02 <slaweq> Grafana dashboard: https://grafana.opendev.org/d/f913631585/neutron-failure-rate?orgId=1
15:02:02 <slaweq> Please open now :)
15:02:07 <slaweq> #topic Actions from previous meetings
15:02:17 <slaweq> slaweq to fix functional/fullstack failures on centos 9 stream: https://bugs.launchpad.net/neutron/+bug/1976323
15:02:39 <slaweq> I didn't make any progress on that one this week
15:02:50 <slaweq> I will assign it to me for next week again
15:02:55 <slaweq> #action slaweq to fix functional/fullstack failures on centos 9 stream: https://bugs.launchpad.net/neutron/+bug/1976323
15:03:03 <slaweq> next one
15:03:05 <slaweq> ykarel to update Neutron-tempest-plugin jobs graphs in Grafana
15:03:47 <ykarel> hi
15:04:17 <ykarel> done
15:04:18 <ykarel> https://review.opendev.org/c/openstack/project-config/+/845975
15:04:18 <ykarel> https://review.opendev.org/c/openstack/project-config/+/845978
15:04:49 <slaweq> both merged already
15:04:52 <slaweq> thx ykarel
15:05:07 <slaweq> next one
15:05:09 <slaweq> ykarel to increase swap size in the neutron-tempest-plugin jobs
15:05:32 <ykarel> Done with https://review.opendev.org/c/openstack/neutron-tempest-plugin/+/845888
15:05:38 <slaweq> thx a lot
15:05:49 <slaweq> I didn't see similar issues in the last days
15:05:59 <slaweq> next one
15:06:01 <slaweq> slaweq to move fedora periodic job to centos9 stream
15:06:16 <slaweq> I'm slowly progressing with this one in https://review.opendev.org/c/openstack/neutron/+/844335
15:06:33 <slaweq> but for some reason ovs-vswitchd is crashing there
15:06:43 <slaweq> I will need to check why it's like that
15:07:19 <slaweq> #action slaweq to move fedora periodic job to centos9 stream
15:07:31 <slaweq> next one
15:07:33 <slaweq> ykarel to propose fix/workaround for the fips jobs and missing rabbitmq-server package
15:07:51 <ykarel> Done with https://review.opendev.org/c/openstack/neutron/+/846001
15:08:09 <slaweq> thx a lot
15:08:19 <slaweq> and the last one
15:08:21 <slaweq> ykarel to fix propose-translation-update periodic job
15:08:58 <ykarel> done, that needed a couple of iterations https://review.opendev.org/q/topic:fix-propose-updates
15:09:38 <slaweq> but it seems that it's fixed already as periodic jobs were green for a few days this last week
15:09:40 <slaweq> thx a lot for that
15:10:00 <opendevreview> yatin proposed openstack/neutron stable/yoga: Set nslookup_target in FIPS jobs https://review.opendev.org/c/openstack/neutron/+/847995
15:10:03 <slaweq> any questions/comments regarding those action items?
15:10:04 <mlavalle> one action item missing in this list is https://review.opendev.org/c/openstack/neutron/+/845181
15:10:20 <slaweq> ups, sorry mlavalle
15:10:21 <ykarel> for fips job needs backport in yoga too
15:10:27 <ykarel> just pushed
15:10:32 <mlavalle> I just addressed ralonsoh's suggestions. It should be good to go
15:10:37 <ralonsoh> +2
15:10:50 <mlavalle> along with https://review.opendev.org/c/openstack/neutron-tempest-plugin/+/845646
15:11:21 <mlavalle> the first patch depends on the second
15:11:35 <slaweq> I added it to my review list for tomorrow
15:11:42 <mlavalle> thanks :-)
15:12:17 <slaweq> thank You for working on that important stuff
15:12:44 <slaweq> ykarel I will also review Your backport :)
15:12:53 <ykarel> thx
15:13:59 <slaweq> ok, I think we can move on
15:14:01 <slaweq> #topic Stable branches
15:14:07 <slaweq> bcafarel any updates?
15:14:13 <slaweq> or new issues
15:14:41 <bcafarel> none that I spotted, though there are not many backports in the queue right now (which is good also!)
15:14:59 <bcafarel> openstack as a whole is moving to EOL pike but we already did for networking not so long ago, so nothing on us
15:15:37 <slaweq> ok, thx for the updates
15:15:53 <slaweq> as there is no Lajos today, I think we can skip the stadium projects topic
15:16:01 <slaweq> and move directly to the next one
15:16:07 <slaweq> #topic Grafana
15:16:39 <slaweq> it looks pretty ok, except the rally jobs which are totally broken
15:17:04 <slaweq> other jobs are pretty ok IMO
15:17:04 <mlavalle> well, there is Lajos, just not here today :-)
15:17:32 <slaweq> mlavalle :D true, sorry
15:18:23 <slaweq> anything regarding grafana You want to discuss today?
15:18:56 <ykarel> for rally i pushed a patch, but rally CI is in bad shape
15:19:04 <ykarel> https://review.opendev.org/c/openstack/rally-openstack/+/847879
15:19:06 <bcafarel> ykarel++
15:20:06 <slaweq> yeah, I wanted to talk about it later in the meeting :)
15:20:12 <slaweq> but we can talk about it now
15:20:33 <slaweq> what if we would make those jobs non-voting temporarily?
15:20:43 <slaweq> or do You expect that your fix will be merged soon in rally?
15:20:59 <slaweq> I don't want to block our gate for too long
15:22:09 <ykarel> +1 to unblock, ovn one is already non voting, temporarily ok to make ovs too non voting
15:22:30 <ykarel> there are multiple issues in rally, so i doubt it will get merged soon
15:22:31 <slaweq> ykarel will You do it?
15:22:35 <ykarel> yes sure
15:22:38 <slaweq> thx a lot
15:22:47 <slaweq> ok, next topic then
15:22:51 <slaweq> #topic Rechecks
15:23:10 <slaweq> we are still below 1 recheck on average to get patches merged
15:23:23 <slaweq> even below 0.5 rechecks last week
15:23:33 <slaweq> so good job :)
15:23:35 <bcafarel> that is really nice
15:24:16 <slaweq> #topic fullstack/functional
15:24:29 <slaweq> functional tests are in better shape recently
15:24:35 <slaweq> but still we have some failures there
15:25:02 <slaweq> test_agent_updated_at_use_nb_cfg_timestamp - AssertionError: Chassis timestamp: 1655824139000, agent updated_at: 2022-06-21 15:08:58+00:00
15:25:08 <slaweq> https://a725fc0d7f8b52d360c4-66ce5f117c645ca152390f12473225b2.ssl.cf5.rackcdn.com/797120/19/gate/neutron-functional-with-uwsgi/4b9fd54/testr_results.html
15:25:19 <slaweq> ykarel reopened https://bugs.launchpad.net/neutron/+bug/1974149
15:25:55 <ykarel> seen this only once
15:25:59 <slaweq> and there is a fix proposed https://review.opendev.org/c/openstack/neutron/+/847349
15:26:37 <slaweq> another one
15:26:45 <slaweq> test_virtual_port_host_update - AssertionError: Expected 'update_virtual_port_host' to be called once. Called 0 times.
15:26:51 <slaweq> https://cb16041cfb3c54cedd2e-24bc61d83ed5aece64ab40b405cf025c.ssl.cf5.rackcdn.com/797121/16/check/neutron-functional-with-uwsgi/12ca1dc/testr_results.html
15:27:00 <slaweq> ykarel reopened https://bugs.launchpad.net/neutron/+bug/1971672
15:27:36 <slaweq> ralonsoh it seems that You were working on it in the past
15:27:42 <ralonsoh> yeah, not now
15:28:07 <slaweq> will You be able to check it again?
15:28:16 <slaweq> I don't think it's urgent this week
15:28:21 <ralonsoh> sure
15:28:28 <slaweq> thx a lot
15:28:39 <slaweq> next one
15:28:40 <slaweq> neutron.tests.functional.services.trunk.drivers.openvswitch.agent.test_trunk_manager.TrunkManagerTestCase.test_connectivity
15:28:45 <slaweq> https://cb16041cfb3c54cedd2e-24bc61d83ed5aece64ab40b405cf025c.ssl.cf5.rackcdn.com/797121/16/check/neutron-functional-with-uwsgi/12ca1dc/testr_results.html
15:30:55 <slaweq> has anyone seen something like that already?
15:31:10 <ralonsoh> no sorry
15:32:09 <mlavalle> no I haven't
15:32:26 <slaweq> but the error message there is strange
15:32:28 <slaweq> RuntimeError: Process ['ping', '-W', '1', '-c', '3', '192.168.0.1'] hasn't been spawned in 20 seconds. Return code: 0, stdout: PING 192.168.0.1 (192.168.0.1) 56(84) bytes of data.... (full message at https://matrix.org/_matrix/media/r0/download/matrix.org/HKNHjohjKrArohSCNEOkafNH)
15:32:41 <slaweq> "ping wasn't spawned" but there is a result from that ping command
15:32:58 <slaweq> and it seems to be working fine
15:33:53 <slaweq> ok, I will try to take a deeper look into it early next week
15:34:11 <slaweq> #action slaweq to check trunk connectivity test failure https://cb16041cfb3c54cedd2e-24bc61d83ed5aece64ab40b405cf025c.ssl.cf5.rackcdn.com/797121/16/check/neutron-functional-with-uwsgi/12ca1dc/testr_results.html
15:34:23 <slaweq> next one
15:34:26 <slaweq> test_metadata_proxy_respawned
15:34:33 <slaweq> https://fd50651997fbb0337883-282d0b18354725863279cd3ebda4ab44.ssl.cf5.rackcdn.com/846960/1/gate/neutron-functional-with-uwsgi/baf4db6/testr_results.html
15:34:33 <slaweq> https://628f2b7919091567c7a1-482044f534933477a9da6fbd27b4ad69.ssl.cf1.rackcdn.com/periodic/opendev.org/openstack/neutron/master/neutron-functional/154a050/testr_results.html
15:34:42 <slaweq> this one failed at least twice this week
15:35:39 <ralonsoh> do you have a LP bug? just to document it
15:36:18 <slaweq> nope
15:36:25 <ralonsoh> I'll open it
15:36:28 <slaweq> thx a lot
15:36:35 <slaweq> and someone will need to check it
15:36:48 <ralonsoh> I'll try
15:36:55 <slaweq> thx a lot
15:36:58 <slaweq> ok, next one
15:37:01 <slaweq> https://fd50651997fbb0337883-282d0b18354725863279cd3ebda4ab44.ssl.cf5.rackcdn.com/846960/1/gate/neutron-functional-with-uwsgi/baf4db6/testr_results.html
15:37:11 <opendevreview> yatin proposed openstack/neutron master: Temporary make rally job non voting https://review.opendev.org/c/openstack/neutron/+/847989
15:37:14 <slaweq> problem with db migration tests (again)
15:37:33 <opendevreview> Arnau Verdaguer proposed openstack/neutron master: Migration revert plan https://review.opendev.org/c/openstack/neutron/+/835638
15:37:36 <ralonsoh> with Psql?
15:37:37 <slaweq> ralonsoh can You check that one when You will have some time?
15:37:49 <slaweq> yes, it's a Psql test
15:37:54 <ralonsoh> yes, I think I only modified the mysql ones
15:38:01 <slaweq> ok
15:38:27 <slaweq> #action ralonsoh to check test_walk_versions failure https://fd50651997fbb0337883-282d0b18354725863279cd3ebda4ab44.ssl.cf5.rackcdn.com/846960/1/gate/neutron-functional-with-uwsgi/baf4db6/testr_results.html
15:38:49 <slaweq> next one
15:39:08 <slaweq> failure in test_ha_router_lifecycle
15:39:14 <slaweq> https://d2e721b9a6905a827b60-69bfa1706af4af4c0b48b8bfd809f2ca.ssl.cf2.rackcdn.com/835638/15/check/neutron-functional-with-uwsgi/56a2c78/testr_results.html
15:39:14 <slaweq> https://fb75ecc35c58d9fe2410-512bee2f5825275b34720067f00890fc.ssl.cf2.rackcdn.com/840419/8/check/neutron-functional-with-uwsgi/e15f93a/testr_results.html
15:39:39 <slaweq> I thought that those tests should be skipped when they fail like that
15:39:51 <slaweq> maybe that one is going through some other path and I missed it somehow
15:39:54 <slaweq> I will check it
15:40:16 <slaweq> #action slaweq to check why test_ha_router_lifecycle test wasn't skipped as it should be in case of failure
15:40:52 <slaweq> and the last one on that list
15:40:53 <slaweq> test_dvr_router_lifecycle_ha_with_snat_with_fips
15:40:53 <slaweq> https://7892a49fff80be41bd93-7937b4b8835d06e87bcc77aa86f44280.ssl.cf5.rackcdn.com/periodic/opendev.org/openstack/neutron/master/neutron-functional-with-uwsgi-fips/99f97f6/testr_results.html
15:41:13 <slaweq> again a problem with a device not found in the namespace
15:41:24 <slaweq> but I don't really know why it's like that
15:41:29 <slaweq> maybe someone wants to check it
15:41:45 <ralonsoh> I'll try this week
15:42:39 <slaweq> the common pattern in those cases is that the interface is added, deleted, added in the ovs-vswitchd log
15:42:43 <slaweq> see https://7892a49fff80be41bd93-7937b4b8835d06e87bcc77aa86f44280.ssl.cf5.rackcdn.com/periodic/opendev.org/openstack/neutron/master/neutron-functional-with-uwsgi-fips/99f97f6/controller/logs/openvswitch/ovs-vswitchd_log.txt
15:42:55 <slaweq> You can grep for qr-81815200-dd
15:43:03 <slaweq> 2022-06-28T02:43:01.305Z|03101|bridge|INFO|bridge test-br32e76b53: added interface qr-81815200-dd on port 2
15:43:08 <slaweq> 2022-06-28T02:43:01.465Z|03105|bridge|INFO|bridge test-br32e76b53: deleted interface qr-81815200-dd on port 2
15:43:13 <slaweq> 2022-06-28T02:43:01.656Z|03107|bridge|INFO|bridge test-br32e76b53: added interface qr-81815200-dd on port 2
15:43:18 <slaweq> 2022-06-28T02:43:01.749Z|03117|bridge|INFO|bridge test-br32e76b53: deleted interface qr-81815200-dd on port 2
15:43:41 <slaweq> but I have no idea what the problem there really can be
15:43:46 <ralonsoh> so maybe we are processing the router events in the wrong order
15:43:58 <slaweq> ralonsoh maybe
15:44:35 <slaweq> if You can check it with a fresh look, that would be great
15:44:50 <ralonsoh> sure
15:45:40 <slaweq> thx a lot
15:46:01 <slaweq> #action ralonsoh to check missing qr- device in the namespace
15:46:22 <slaweq> ok, that's all the issues with functional tests for today
15:46:25 <slaweq> any questions/comments?
15:47:52 <slaweq> ok, let's move on
15:47:57 <slaweq> #topic Tempest/Scenario
15:48:05 <slaweq> here we have 2 issues for today
15:48:13 <slaweq> first one with live migration and trunk ports:
15:48:19 <slaweq> https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_8c3/797121/16/check/neutron-ovs-tempest-multinode-full/8c3b8eb/testr_results.html
15:48:26 <slaweq> but it seems to me like it's a nova issue
15:48:35 <slaweq> because the migration wasn't done properly
15:49:13 <slaweq> if that happens more often I think we can ask the nova team to take a look, but for now I wouldn't bother with that too much
15:49:16 <mlavalle> we talked about something similar downstream yesterday
15:50:33 <slaweq> mlavalle do You want to check it together with the d/s issue?
15:50:37 <slaweq> maybe it's the same one
15:51:07 <mlavalle> slaweq: yeah, I would need a pointer to the downstream bugzilla
15:51:18 <mlavalle> was someone assigned to it?
15:51:50 <slaweq> are You talking about https://bugzilla.redhat.com/show_bug.cgi?id=2097160 ?
15:52:09 <slaweq> if so, it's for sure a different issue
15:52:12 <mlavalle> that's it
15:52:24 <ralonsoh> I don't think both errors are related
15:52:32 <ralonsoh> the CI error has a live migration problem
15:52:32 <mlavalle> ok
15:52:38 <mlavalle> it just rang a bell
15:52:49 <ralonsoh> but the BZ happens once the migration failed
15:53:16 <slaweq> yeah, I also don't think those are the same issues
15:53:25 <slaweq> anyway, let's not bother with that too much for now
15:53:30 <slaweq> :)
15:53:41 <mlavalle> ok
15:54:03 <slaweq> the last issue on my list for today is a qos scenario test failure
15:54:04 <slaweq> https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_4cd/847832/2/check/neutron-tempest-plugin-linuxbridge/4cd9322/testr_results.html
15:54:10 <slaweq> in the linuxbridge job
15:54:25 <slaweq> I don't think any of us will have any cycles to check it
15:55:32 <slaweq> and that's basically all I had for today
15:55:43 <slaweq> do You have any other ci related topics to discuss today?
15:55:53 <ralonsoh> no thanks
15:55:56 <bcafarel> none from me
15:56:15 <ykarel> just please check the non voting patch https://review.opendev.org/c/openstack/neutron/+/847989
15:56:16 <slaweq> ok, so I will give You 4 minutes back
15:56:18 <mlavalle> neither do I
15:56:24 <mlavalle> thanks!
15:56:35 <slaweq> thx for attending the meeting
15:56:42 <slaweq> ykarel I already +2 it
15:56:48 <ykarel> thx
15:56:53 <slaweq> have a great week and see You online!
15:56:56 <slaweq> #endmeeting