15:00:12 #startmeeting neutron_ci
15:00:12 Meeting started Tue Jan 31 15:00:12 2023 UTC and is due to finish in 60 minutes. The chair is slaweq. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:12 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:12 The meeting name has been set to 'neutron_ci'
15:00:16 ping bcafarel, lajoskatona, mlavalle, mtomaska, ralonsoh, ykarel, jlibosva
15:00:35 o/
15:00:42 o/
15:00:55 o/
15:01:03 Grafana dashboard: https://grafana.opendev.org/d/f913631585/neutron-failure-rate?orgId=1
15:01:10 thanks
15:01:10 o/
15:01:26 * mlavalle opening grafana
15:01:29 hi
15:02:39 ok, let's start
15:02:41 #topic Actions from previous meetings
15:03:05 first one is on lajoskatona but he's not here today
15:03:12 so I will assign it to him for next week to not forget
15:03:18 #action lajoskatona to check fullstack failure neutron.tests.fullstack.test_agent_bandwidth_report.TestPlacementBandwidthReport.test_configurations_are_synced_towards_placement(Open vSwitch agent)
15:03:30 next one
15:03:35 ralonsoh to check neutron.tests.functional.agent.test_ovs_flows.ARPSpoofTestCase.test_arp_spoof_doesnt_block_ipv6 https://404fa55dc27f44e0606d-9f131354b122204fb24a7b43973ed8e6.ssl.cf5.rackcdn.com/860639/22/check/neutron-functional-with-uwsgi/6a57e16/testr_results.html
15:03:44 yes
15:03:45 #link https://review.opendev.org/c/openstack/neutron/+/871101
15:03:56 this is not solving the issue but improving the logs
15:04:22 (well, adding a retry on the ping)
15:04:32 ok
15:04:42 I didn't see the same issue this week
15:05:31 next one is also on lajoskatona so I will assign it for next week
15:05:36 #action lajoskatona to check in https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_ae9/870024/4/check/neutron-functional-with-uwsgi/ae93cab/testr_results.html if there are additional logs available there
15:05:45 next one
15:05:51 mlavalle to check failed neutron.tests.fullstack.test_l3_agent.TestHAL3Agent.test_ha_router - https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_dd2/867769/11/gate/neutron-fullstack-with-uwsgi/dd2aa74/testr_results.html
15:06:04 I've been checking https://zuul.opendev.org/t/openstack/builds?job_name=neutron-fullstack-with-uwsgi&branch=master&skip=0
15:06:10 for the past two weeks
15:06:39 The only instance of that test case failing was the one that slaweq reported here
15:06:48 so I think it was a one-off
15:06:54 ok, so let's forget about it for now
15:06:58 yeap
15:07:05 also fullstack jobs seem to be pretty stable recently
15:07:13 thx mlavalle for taking care of it
15:07:21 next one
15:07:25 ralonsoh to check new occurrence of https://bugs.launchpad.net/neutron/+bug/1940425
15:07:51 I commented on https://bugs.launchpad.net/neutron/+bug/1940425/comments/22
15:08:23 in this case, Nova didn't update the port binding
15:09:01 ok, should we change the status of this bug in Nova?
15:09:03 now it's set as "Invalid"
15:09:08 maybe it should be reopened on Nova's side now?
15:09:42 if that happens again, yes
15:09:54 ok
15:10:04 and the last one
15:10:05 slaweq to check tempest-slow jobs failures
15:10:18 patches https://review.opendev.org/q/topic:bug%252F2003063 are all merged
15:10:23 and those jobs are fixed now
15:10:33 it was indeed due to the update of the Cirros image
15:10:42 so some patches to tempest were required
15:10:44 Merged openstack/neutron stable/yoga: Use common wait_until_ha_router_has_state method everywhere https://review.opendev.org/c/openstack/neutron/+/872008
15:11:02 with that I think we can move on to the next topic
15:11:04 #topic Stable branches
15:11:11 bcafarel any updates?
15:12:06 overall it looks good from my checks today, yatin backported some functional fixes (and marked a test unstable), wallaby looked fine
15:12:37 ussuri is broken, per the comment in https://review.opendev.org/c/openstack/neutron/+/871989 - the grenade gate is not passing because of the train paramiko issue
15:13:03 ussuri is EM already, right?
15:14:15 yep this could be reduced (major changes to test from train to ussuri should be quite rare now!)
15:14:58 I will check with Yatin (he is off today) if train is easily fixable for that - this issue may break a few jobs there
15:15:11 ++
15:15:12 thx
15:15:41 #action bcafarel to check if we should maybe drop broken grenade jobs from the Ussuri branch
15:16:02 anything else or can we move on?
15:16:09 all good for the rest :)
15:16:12 thx
15:16:15 so let's move on
15:16:27 I will skip stadium projects as lajoskatona is not here
15:16:37 but all periodic jobs are green this week
15:16:43 #topic Grafana
15:16:51 #link https://grafana.opendev.org/d/f913631585/neutron-failure-rate
15:17:17 all looks good to me in grafana
15:17:51 there is a rise in UT failures today, but I saw a couple of patches where UT were failing due to the proposed change
15:17:58 so nothing really critical for us here
15:18:24 ++
15:18:30 any other comments related to grafana?
15:18:40 none from me
15:19:20 ok, so let's move on
15:19:21 #topic Rechecks
15:19:34 with rechecks we are back below 1 on average :)
15:19:48 ++
15:20:32 +---------+----------+... (full message at )
15:20:35 so it's better \o/
15:20:46 actually, when I checked patches merged in the last 7 days, all of them were merged without any failed build in the last PS:
15:21:09 the list is in the agenda document https://etherpad.opendev.org/p/neutron-ci-meetings#L74
15:21:21 regarding bare rechecks it's also good:
15:21:27 +---------+---------------+--------------+-------------------+... (full message at )
15:21:43 just 1 out of 23 rechecks was "bare" in the last 7 days
15:21:51 good numbers :)
15:21:51 so thx a lot for that
15:21:58 bcafarel indeed :)
15:22:36 any questions/comments?
15:22:49 or are we moving on to the next topic(s)?
15:24:00 ok, so let's move on
15:24:06 #topic Unit tests
15:24:22 there is one issue for today
15:24:31 sqlite3.OperationalError: no such table: (seen once in the tox-cover job and many occurrences in arm64 unit test jobs)
15:24:41 https://6203638836ab77538319-23561effe696d4d724b67bcb38a7b69d.ssl.cf5.rackcdn.com/871991/1/check/openstack-tox-cover/f9e8ceb/testr_results.html
15:25:23 I think that "no such table" is just an effect of some other issue, not the root cause of the failure
15:26:08 is that always happening?
15:26:21 it wasn't added to the agenda by me
15:26:27 and quite a few tables are missing
15:26:35 but according to the note there, it happens often in arm64 jobs
15:28:09 there are not many logs in the UT jobs :/
15:28:22 the tox-cover one was https://review.opendev.org/c/openstack/neutron/+/871991/ - I see Yatin commenting on "recheck unit test sqlite failure"
15:29:31 ralonsoh can it be that sqlite3 crashed/was killed by OOM/something like that during the tests?
15:29:53 I don't think so in this case
15:29:58 maybe we should add the journal log to the UT job logs?
15:29:59 the error is happening during the cleanup
15:30:21 so maybe (maybe) we have already deleted the table or the DB
15:30:32 cleanup?
15:30:51 check any error log in https://6203638836ab77538319-23561effe696d4d724b67bcb38a7b69d.ssl.cf5.rackcdn.com/871991/1/check/openstack-tox-cover/f9e8ceb/testr_results.html
15:30:55 File "/home/zuul/src/opendev.org/openstack/neutron/.tox/shared/lib/python3.8/site-packages/fixtures/fixture.py", line 125, in cleanUp
15:31:01 this is the first line
15:31:45 but it's not like that in all tests
15:31:49 yeah
15:31:52 I see it in setUp e.g. in the neutron.tests.unit.services.trunk.test_rules.TrunkPortValidatorTestCase.test_can_be_trunked_or_untrunked_unbound_port test
15:32:25 and this is during the initialization...
15:32:51 but there is "traceback-2" in that test result and this is in cleanUp
15:33:12 so it seems that it fails in setUp and then it also causes an issue in the cleanUp phase
15:33:20 yes
15:33:20 but that's just the result of the previous issue
15:34:51 ok, I think that we need to:
15:34:52 1. report a bug for this
15:35:08 2. start collecting the journal log in UT jobs
15:35:12 *2
15:35:29 and then maybe we will be able to learn more about what happened there
15:35:31 wdyt?
15:35:34 agree
15:35:51 +1
15:36:00 anyone want to do that?
15:36:43 I'll do it
15:36:57 thx ralonsoh
15:37:27 #action ralonsoh to report UT issue with missing tables in db and propose patch to collect journal log in UT jobs
15:37:51 next topic
15:37:56 #topic fullstack/functional
15:39:09 here I just have issues from the previous week, as this week I didn't find any new issues
15:39:30 so I think we can skip them, especially as functional and fullstack jobs are pretty stable this week
15:39:52 there is one issue, mentioned by ykarel probably:
15:39:53 test_arp_correct_protection.neutron.tests.functional.agent.linux.test_linuxbridge_arp_protect.LinuxBridgeARPSpoofTestCase (seen once in victoria)
15:39:53 https://7b7e23438828cd69d6b5-4d3d00830dc84883c962194d8b2d6bed.ssl.cf2.rackcdn.com/871988/1/check/neutron-functional-with-uwsgi/0272bfd/testr_results.html
15:42:28 I wouldn't spend much time on it as it's related to the Linuxbridge agent
15:42:51 and it was seen just once so far
15:43:03 thoughts?
15:43:20 yes, let's not waste time on it
15:43:21 agree with not spending time on this one
15:44:01 ok, next topic then
15:44:03 #topic Tempest/Scenario
15:44:11 here I wanted to ask about one issue
15:44:20 which I saw a few times IIRC
15:44:34 devstack failed with an error that there was no tenant network for allocation: https://zuul.opendev.org/t/openstack/build/b75030f968944508a3c3bbe6b4851584
15:44:42 did you see such an issue already?
15:45:29 The last time I built a devstack on my local system was 2 days ago
15:45:33 it worked fine
15:46:06 ok, I think I know now
15:46:28 there are errors related to the DB connection in the neutron log in that job https://b84531a976ef476331e1-baa8eceea3205baf832239057c78a658.ssl.cf1.rackcdn.com/869196/11/check/neutron-ovs-grenade-dvr-multinode/b75030f/controller/logs/screen-q-svc.txt
15:46:44 and it's during the creation of the subnetpool
15:47:05 you were faster than me :) yep there's oslo_db.exception.DBConnectionError just before that 503
15:47:06 so maybe since the subnetpool wasn't created, the network couldn't be created later either, as there wasn't any pool to use
15:47:29 so, it's not a neutron issue then
15:48:25 with that we got to the last topic for today
15:48:27 #topic Periodic
15:48:33 generally all looks good there
15:48:49 but I saw that the neutron-ovn-tripleo-ci-centos-9-containers-multinode job is failing pretty often
15:49:09 the last two times it failed (which I checked), it failed on the undercloud deploy
15:49:14 https://zuul.openstack.org/build/772a941b25bd4dbbb66ddbd6544a3b63
15:49:46 I don't think it's really a neutron issue, but maybe someone will have some cycles to dig deeper into this and try to find the root cause of it?
15:50:21 or wait for tripleo folks
15:50:50 ralonsoh yep, or ping them :)
15:51:51 ok, I will try to ask some tripleo folks for help with this
15:52:10 #action slaweq to check with tripleo experts failing neutron-ovn-tripleo-ci-centos-9-containers-multinode job
15:52:22 and with that we got to the end of the agenda for today
15:52:32 anything else you want to discuss today?
15:52:43 nothing from me
15:52:48 nothing, thanks
15:52:51 nothing either
15:53:02 if nothing, then let's get back a few minutes
15:53:10 thx for attending the meeting and have a great week
15:53:12 o/
15:53:15 #endmeeting
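
For reference, a minimal self-contained sketch of the setUp/cleanUp interaction discussed under #topic Unit tests above: when setUp fails after a cleanup has already been registered, unittest still runs that cleanup against the half-initialized database, and the cleanup then raises its own "no such table" error (the "traceback-2" pattern), which is why the cleanup error is a symptom rather than the root cause. This is plain unittest plus sqlite3, not Neutron code; the table name and the simulated failure are invented for illustration.

```python
import sqlite3
import unittest


class CleanupAfterSetupFailure(unittest.TestCase):
    """Shows a cleanup failure that is only a symptom of a setUp failure."""

    def setUp(self):
        super().setUp()
        self.conn = sqlite3.connect(":memory:")
        # The cleanup is registered before the failure below, so unittest
        # still runs it even though setUp never finishes.
        self.addCleanup(self.conn.execute, "DROP TABLE ports")
        # Simulate setUp dying before the schema is created (in the real
        # jobs this would be whatever breaks the DB fixture first).
        raise RuntimeError("simulated failure before CREATE TABLE ran")

    def test_anything(self):
        # Never reached: setUp fails first.
        pass


if __name__ == "__main__":
    # The run reports the RuntimeError from setUp and, as a secondary
    # traceback, sqlite3.OperationalError: no such table: ports from the
    # cleanup -- the second error is noise, the first is the real cause.
    unittest.main()
```

That matches the meeting's conclusion: file a bug for the underlying setUp failure and collect more logs (e.g. the journal) to see what actually breaks the DB fixture in the arm64 and tox-cover runs.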