15:02:42 <ykarel> #startmeeting neutron_ci
15:02:42 <opendevmeet> Meeting started Tue Mar 26 15:02:42 2024 UTC and is due to finish in 60 minutes. The chair is ykarel. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:02:42 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:02:42 <opendevmeet> The meeting name has been set to 'neutron_ci'
15:02:46 <mlavalle> \o
15:02:57 <ykarel> ping bcafarel, lajoskatona, mlavalle, mtomaska, ralonsoh, ykarel, jlibosva, elvira
15:03:04 <lajoskatona> o/
15:03:05 <ralonsoh> hey, hello!
15:03:06 <haleyb> o/
15:04:21 <mtomaska> o/
15:05:32 <ykarel> Slawomir and Bernard said they will not be joining today
15:05:51 <ykarel> Lets start with
15:05:52 <ykarel> #topic Actions from previous meetings
15:06:00 <ykarel> lajoskatona to check fwaas job failure
15:06:10 <opendevreview> Merged openstack/neutron-tempest-plugin master: Add Active Active L3 GW API test cases https://review.opendev.org/c/openstack/neutron-tempest-plugin/+/897823
15:06:24 <lajoskatona> it's green, it was a glitch
15:06:36 <lajoskatona> I pushed dnm patch, but since the periodic also green
15:07:12 <ykarel> k thx for checking, let's move to next
15:07:21 <ykarel> ajoskatona to push patch to drop grenade jobs from unmaintained branches
15:07:29 <ykarel> lajoskatona*
15:07:56 <lajoskatona> https://review.opendev.org/c/openstack/neutron/+/913700
15:08:11 <lajoskatona> it is merged and backports are green
15:08:44 <ykarel> k great, so we may need similar in other unmaintained/branches too, right?
15:09:02 <ykarel> with no much activity seems we not hit it already in others
15:09:24 <lajoskatona> ykarel: yes, I suppose it can be backported
15:09:56 <lajoskatona> just FYI I pushed similar for Heat also (https://review.opendev.org/c/openstack/heat/+/914096 ) so I suppose for other projects it can be useful also
15:10:27 <ykarel> k thx, let's do it if we see the issue in other branches
15:10:40 <lajoskatona> +1
15:11:25 <ykarel> ykarel to check functional failure in test test_floatingip_mac_bindings
15:11:36 <ykarel> https://c2deb3ebe4d3800fb471-ee896de7b34caf47b7848064119af8f8.ssl.cf2.rackcdn.com/periodic/opendev.org/openstack/neutron/master/neutron-functional-with-oslo-master/c97a63e/testr_results.html
15:13:02 <ykarel> looks same issue as https://bugs.launchpad.net/neutron/+bug/1955008 where some timeout was added to 3 seconds
15:13:25 <ykarel> I checked the logs
15:13:28 <ykarel> From logs can see sb db had Mac_Binding created at 03:20:17.642Z and select query failed seen in sb db log at 03:20:25.739Z
15:13:36 <ykarel> There are no logs in dstat after Mar 15 03:06:09.983539 np0037071870 dstat.sh[38660]: dstat: Timeout waiting for a response from PMCD
15:13:53 <ykarel> same is with memory tracker
15:13:59 <ykarel> it Seems related to slow node
15:14:06 <ralonsoh> so you can't see the created register in the DB?
15:14:20 <ykarel> i see it in the db
15:14:24 <ralonsoh> ah ok
15:14:46 <ralonsoh> so the IDL didn't received this update
15:15:25 <ykarel> yes seems so within the given timeout
15:16:13 <ykarel> so increasing timeout may help but considering it's seen quite rarely may be we shouldn't touch it now?
15:16:29 <ralonsoh> but we are using ovsdb-client
15:16:41 <ralonsoh> so we are directly requesting this value to the DB
15:16:52 <ralonsoh> we are not using the IDL, am I wrong?
15:17:10 <ralonsoh> so increasing the timeout I don't know if will help
15:17:45 <ykarel> hmm you seems to be right
15:17:56 <ykarel> the failing command is ovsdb-client
15:19:03 <ykarel> not sure then where the request stuck as db server logs shows it at 03:20:25.739Z
15:21:06 <ykarel> 2024-03-15T03:20:25.739Z|00174|jsonrpc|DBG|unix#4: received request, method="transact", params=["OVN_Southbound",{"where":[["_uuid","==",["uuid","5c47f93b-afdf-405b-8025-487d0d0034ee"]]],"table":"MAC_Binding","op":"select"}], id=0
15:21:06 <ykarel> 2024-03-15T03:20:25.740Z|00175|jsonrpc|DBG|unix#4: send reply, result=[{"rows":[{"ip":"100.0.0.21","_version":["uuid","f51f0a01-bef2-4b81-a47f-128e43e29389"],"_uuid":["uuid","5c47f93b-afdf-405b-8025-487d0d0034ee"],"logical_port":"","mac":"","datapath":["uuid","b0405f5a-e378-4e8c-92d1-e078f9f11d8b"],"timestamp":0}]}], id=0
15:21:25 <ralonsoh> pfffff 3 seconds later
15:21:33 <ralonsoh> so yes, increasing the timeout could help here
15:22:03 <ralonsoh> not 3 seconds, 8 seconds!
15:23:13 <ykarel> yes 8 seconds after create, but there were other operations in between
15:23:29 <ykarel> in good cases i see the diff was around 5 second b/w create and fetch
15:29:27 <ralonsoh> we can handle this issue later offline
15:29:55 <ykarel> yes sure ,
15:30:13 <ykarel> slaweq to look at lp 2058378
15:30:26 <ykarel> it's now fixed with https://review.opendev.org/c/x/devstack-plugin-tobiko/+/913746
15:30:34 <ykarel> #topic Stable branches
15:31:09 <ykarel> Bernard is not around, but /me not seen any issue against stable apart from known intermittent failures
15:31:29 <ykarel> stable/2023.1 have some tobiko jobs failing, will raise it later
15:31:57 <ykarel> anything you notice in stable in last week?
15:32:08 <ralonsoh> no that I'm aware
15:32:15 <ykarel> there is not much activity in stable branches recently
15:32:16 <lajoskatona> nothing from me
15:32:28 <ykarel> k
15:32:29 <ykarel> #topic Stadium projects
15:32:36 <ykarel> all was green in periodic here
15:33:01 <ykarel> lajoskatona, anything to add here?
15:33:36 <lajoskatona> nothing from me, I plan to propose similar patch for them to have py312 job
15:33:45 <ykarel> k thx
15:33:48 <ykarel> #topic Rechecks
15:34:11 <ykarel> all good here, not many rechecks/bare-rechecks this week
15:34:25 <ykarel> #topic fullstack/functional
15:34:48 <ykarel> we still seeing couple of NetworkInterfaceNotFound error in functional
15:34:49 <ralonsoh> sorry I didn't push the patch to catch the interface errors
15:35:01 <ykarel> https://0b5127941ec9aa3887c8-33179f0f89ff01c931f1a595d9a195a6.ssl.cf2.rackcdn.com/periodic/opendev.org/openstack/neutron/master/neutron-functional-with-pyroute2-master/845d7ca/testr_results.html
15:35:01 <ykarel> https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_13d/periodic/opendev.org/openstack/neutron/master/neutron-functional-with-sqlalchemy-master/13deaeb/testr_results.html
15:35:01 <ykarel> https://1a04c94a7aaad24b1ac4-b9af9dcc921d9a146a71ab81c762059f.ssl.cf1.rackcdn.com/periodic/opendev.org/openstack/neutron/master/neutron-functional-with-uwsgi-fips/884c8aa/testr_results.html
15:35:01 <ykarel> https://1cc594fa7dea4a43967d-b4ffb63fd72a873cfc8fbd2b6e893a02.ssl.cf5.rackcdn.com/periodic/opendev.org/openstack/neutron/master/neutron-functional-with-pyroute2-master/432edee/testr_results.html
15:35:01 <ralonsoh> my bad, I had other priorities
15:35:04 <ykarel> https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_6f7/908695/3/gate/neutron-functional-with-uwsgi/6f77ba5/testr_results.html
15:35:09 <ykarel> https://68194d424d2293a2da9f-7bc79e13153424291afbe0a68842b9b3.ssl.cf2.rackcdn.com/913979/2/check/neutron-functional-with-uwsgi/41894c9/testr_results.html
15:35:20 <ykarel> thx ralonsoh for tackling it
15:35:25 <ykarel> np you can take your time
15:35:37 <ralonsoh> but this is affecting a lot the FT job...
15:35:56 <ykarel> yes :(, most of ft failures have this issue only
15:37:30 <ykarel> next ones are
15:37:38 <ykarel> test_direct_route_for_address_scope and test_fip_connection_for_address_scope
15:38:02 <ykarel> seen once against a backported patch in 2024.1 in different runs
15:38:03 <ykarel> https://e3038c0311308f214540-ebb0bf58265a2173bda790006658dd60.ssl.cf1.rackcdn.com/913809/1/gate/neutron-functional-with-uwsgi/9e7a70e/testr_results.html
15:38:09 <ykarel> https://5a9894cfd54f063c7126-217ff60ed5d3b708136ee1404afca04a.ssl.cf2.rackcdn.com/913809/1/gate/neutron-functional-with-uwsgi/88a9e7b/testr_results.html
15:38:23 <ykarel> https://review.opendev.org/c/openstack/neutron/+/913809
15:39:10 <ralonsoh> I don't think this is related
15:39:27 <ralonsoh> but I've seen these errors in other patches too
15:39:40 <ralonsoh> the address scope tests are not stable
15:40:01 <ykarel> ohkk i couldn't trace in opensearch, and seen only against that patch
15:40:19 <ykarel> but if seen in other patches/branches then can isolate it further
15:41:35 <ykarel> one of those test was unskipped approx 6 month back https://review.opendev.org/c/openstack/neutron/+/896728
15:43:30 <ykarel> ralonsoh, please do share it later if you find other similar failures across patches/branches
15:43:38 <ralonsoh> for sure
15:43:39 <ykarel> we can handle it with a bug
15:43:59 <ykarel> #topic Periodic
15:44:27 <ykarel> apart from that functional failure in periodic rest all good
15:44:43 <ykarel> in stable/2023.1 we have consistent failure
15:44:45 <ykarel> https://zuul.openstack.org/builds?job_name=devstack-tobiko-neutron&project=openstack%2Fneutron&branch=stable%2F2023.1&skip=0
15:45:13 <ykarel> failing since a week
15:46:09 <ykarel> Will reach out to Eduardo may be it's a known issue
15:46:18 <ykarel> as seeing some test patches https://review.opendev.org/c/openstack/neutron/+/913763
15:47:00 <ralonsoh> seems to be an error in the FW
15:47:07 <ralonsoh> https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_010/periodic/opendev.org/openstack/neutron/stable/2023.1/devstack-tobiko-neutron/01081a2/tobiko_results_02_create_neutron_resources_neutron.html?sort=result
15:47:21 <ykarel> yes, i saw different errors across runs
15:47:36 <ykarel> for now will report a bug and check if it's something known already
15:47:39 <ykarel> #action ykarel to check and report lp for tobiko job failures
15:47:54 <ykarel> #topic Grafana
15:48:00 <ykarel> https://grafana.opendev.org/d/f913631585/neutron-failure-rate
15:48:18 <ykarel> let's have quick look here too, if we see anything abnormal
15:49:16 <ykarel> all good in gate, check have some spike and all likely related to patches itself and known intermittent failures
15:49:20 <ykarel> anything to add?
15:49:35 <ralonsoh> nothing from me
15:50:09 <ykarel> k let's move
15:50:10 <ykarel> #topic On Demand
15:50:34 <ykarel> anything else you would like to raise?
15:50:39 <ralonsoh> no thanks
15:50:45 <lajoskatona> nothing from me
15:52:43 <ykarel> k thx everyone for joining
15:52:48 <ykarel> #endmeeting