15:00:48 #startmeeting neutron_ci
15:00:48 Meeting started Tue Oct 17 15:00:48 2023 UTC and is due to finish in 60 minutes. The chair is ykarel. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:48 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:48 The meeting name has been set to 'neutron_ci'
15:00:57 ping bcafarel, lajoskatona, mlavalle, mtomaska, ralonsoh, ykarel, jlibosva, elvira
15:00:58 Grafana dashboard: https://grafana.opendev.org/d/f913631585/neutron-failure-rate?orgId=1
15:00:59 hello
15:00:59 o/
15:01:12 o/
15:02:01 o/
15:02:22 o/
15:02:39 Let's start, bernard notified he will not be joining today
15:02:42 #topic Actions from previous meetings
15:03:00 ralonsoh to backport https://review.opendev.org/c/openstack/neutron/+/878549
15:03:09 Done https://review.opendev.org/q/I07fa45401788da6b963830e72a7b3a3cd54662e1
15:03:40 it's merged
15:03:42 ralonsoh to check tempest failures related to metadata in openvswitch job
15:03:57 I'm still investigating this issue
15:04:05 no results so far
15:04:19 Ok thanks, you opened a bug already for it?
15:04:34 I don't remember now, I'll check it after this meeting
15:04:41 sure, thanks
15:04:49 ralonsoh to check failures with nftables job in stable periodics
15:04:57 https://bugs.launchpad.net/neutron/+bug/2039027
15:04:59 yeah, this is more complicated
15:05:11 I realized that we were not testing that since 2 releases ago
15:05:25 we have merged this patch, fixing this issue
15:05:45 now the problem is to know why the linux bridge job is failing when using nftables
15:06:16 ralonsoh, it should be the same issue we saw a few weeks back and worked around using ebtables legacy
15:06:46 if I'm not wrong, this is applied in these jobs
15:06:51 but I'll check it of course
15:07:04 i had checked some logs and they show the same ebtables errors
15:07:05 to tell the truth we should have a session for this ebtables thing next week
15:07:33 I remember we discussed it but I can't tell where we are with it, and if we have to do any extra testing or migration
15:07:39 ralonsoh, yes that is applied but these jobs specifically switch to ebtables-nft as part of nftables setup
15:07:57 but I don't think this is related, let me find the newer job executions
15:08:02 https://zuul.opendev.org/t/openstack/builds?job_name=neutron-linuxbridge-tempest-plugin-nftables&skip=0
15:08:11 https://github.com/openstack/neutron/blob/169deb24500caa43ce3f6c8cd837d2286d219d5f/roles/nftables/tasks/main.yaml#L16
15:08:43 right
15:08:50 https://d233c9d65731f58c26a3-15368fe3522896553ebd232c1020ea9b.ssl.cf1.rackcdn.com/periodic/opendev.org/openstack/neutron/master/neutron-linuxbridge-tempest-plugin-nftables/6da8e44/testr_results.html
15:08:56 this is the last execution
15:09:31 and in one of those failures i saw:- DEBUG oslo.privsep.daemon [-] privsep: reply[731663a7-3c8c-4a4a-b8f6-ac6318cdb904]: (4, ('', 'ebtables v1.8.7 (nf_tables): RULE_DELETE failed (Invalid argument): rule in chain neutronMAC-tap17686dc1-fe\n', 4)) {{(pid=58939) _call_back /opt/stack/data/venv/lib/python3.10/site-packages/oslo_privsep/daemon.py:499}}
15:09:33 I'll check after this meeting if there is any evident error
15:09:39 ^^^ like this one
15:09:48 k Thanks
15:10:15 ykarel to open bug for functional failures TestMaintenance
15:10:25 i reported https://bugs.launchpad.net/neutron/+bug/2039417
15:11:12 it's a random issue, any volunteer to check that?
15:11:22 not this week, sorry
15:11:50 k then i can check later this week
15:11:54 i can take a look but later in the week
15:11:57 Friday or Monday
15:12:03 #action ykarel to check https://bugs.launchpad.net/neutron/+bug/2039417
15:12:22 thanks mtomaska, i will have a look
15:12:33 ykarel to check failures with openstacksdk and postgres job
15:12:44 https://bugs.launchpad.net/neutron/+bug/2039066
15:13:06 this required two patches merged in openstacksdk, so all good now
15:13:19 for train one, neutron is marked as eol https://review.opendev.org/c/openstack/releases/+/895196
15:13:34 cool
15:13:36 so once branches are dropped these jobs will stop running
15:13:54 #topic Stable branches
15:14:36 all looks good in stable, Bernard communicated offline
15:14:47 #topic Stadium projects
15:14:54 periodic weekly all green
15:15:00 lajoskatona, anything to add here?
15:15:03 yes,
15:15:06 nothing more
15:15:15 Thanks
15:15:27 this week the fwaas patches were merged, so things are ok
15:15:37 nice
15:15:39 #topic Grafana
15:15:46 https://grafana.opendev.org/d/f913631585/neutron-failure-rate
15:16:46 let's see if all good here
15:16:57 IMO it looks good this week
15:17:23 yes, there are some failures but those are known and being worked upon
15:17:43 btw wdyt if we move this topic to the end, after discussing known issues/failures?
15:17:44 pretty subdued
15:18:03 one thing i noticed in the dashboard is it doesn't have every job (i think)
15:18:09 for example py39
15:18:29 yes it's possibly outdated as it has not been updated for some time
15:18:33 or py311 (non-voting)
15:18:45 we might have renamed some other things recently
15:19:07 good catch, we also have some stable dashboards somewhere
15:19:52 Ok i will check the missing ones and add them to the dashboard
15:19:55 https://opendev.org/openstack/project-config/src/branch/master/grafana/neutron-stable-minusone.yaml & https://opendev.org/openstack/project-config/src/branch/master/grafana/neutron-stable-minustwo.yaml
15:20:38 #action ykarel to update neutron grafana dashboard
15:21:43 for the question
15:21:46 wdyt if we move this topic to the end, after discussing known issues/failures?
15:22:02 ok from my side
15:22:02 you mean to be the last topic in the agenda?
15:22:02 as we will have a clearer picture of what's missing once those are discussed
15:22:26 yes before the on demand one
15:22:35 let's give it a try
15:22:42 see what happens
15:23:11 k Thanks, i will move it
15:23:35 #topic Rechecks
15:24:05 we have more rechecks last week
15:24:26 and the intermittent failures in a couple of jobs leading to it
15:24:42 and there were 3/14 bare rechecks
15:25:14 and good that many patches merged last week
15:25:58 let's keep doing fewer bare rechecks
15:26:11 Now let's talk about failures
15:26:12 #topic fullstack/functional
15:26:23 test_floatingip_mac_bindings (seen once)
15:26:28 https://4bb852441388777a25ca-c1d45ef7eb040e2d7e76303651507120.ssl.cf1.rackcdn.com/897045/2/gate/neutron-functional-with-uwsgi/8954870/testr_results.html
15:26:53 anyone recall seeing similar in the past?
15:28:13 https://bugs.launchpad.net/neutron/+bug/1884986
15:28:24 and I pushed https://review.opendev.org/c/openstack/neutron/+/769880
15:29:43 ohkk that's long back
15:30:51 let's see if we see more occurrences for it and investigate.
15:31:03 for now seen just once
15:31:26 next is
15:31:29 test_restart_rpc_on_sighup_multiple_workers
15:31:38 it fails as RuntimeError: Expected buffer size: 10, current size: 25
15:31:55 seen 4 occurrences
15:31:58 https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_9cd/periodic/opendev.org/openstack/neutron/master/neutron-functional-with-uwsgi-fips/9cde7d1/testr_results.html
15:32:17 http://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_263/893659/12/check/neutron-functional-with-uwsgi/263727d/testr_results.html
15:32:17 https://20c2c42dd35909dbd48c-6b0cadbc155764a70d2c1954f64b7ab0.ssl.cf2.rackcdn.com/897472/3/check/neutron-functional-with-uwsgi/20b1e5e/testr_results.html
15:32:17 https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_e50/897045/2/check/neutron-functional-with-uwsgi/e50c4f6/testr_results.html
15:32:27 this sounds like a known bug?
15:32:34 this is maybe because the writing queue was not deleted properly
15:32:47 and is reading the restart + the start messages
15:33:16 dealt properly?
15:33:57 when the queue is retrieved, it is deleted at the same time
15:34:03 ahhh
15:34:13 but this is just pure speculation
15:35:38 I'll check it
15:35:40 ohkk any volunteer to look into this
15:35:45 k thanks
15:36:00 #action ralonsoh to check failures for test_restart_rpc_on_sighup_multiple_workers
15:36:12 #topic Tempest/Scenario
15:36:32 Failures downloading cirros image like in https://zuul.openstack.org/build/97f5b27439ca469a9cd110087a386c62
15:36:56 https://review.opendev.org/c/openstack/project-config/+/873735 dropped older cirros images from cache
15:37:05 so images are downloaded during the jobs in stable branches where these images are used, and that can fail with network issues
15:37:18 We can switch to a recent image like 0.5.2 on those branches where 0.5.1 is used to avoid such issues
15:37:31 Takashi Kajinami proposed openstack/neutron master: Fix python shebang https://review.opendev.org/c/openstack/neutron/+/898591
15:37:40 on 10/11th oct seen a couple of failures due to it
15:37:43 qq: was it necessary to remove these images from the cache?
15:37:54 if these images are still being used in the CI
15:38:01 as per the comment those older images are not used much
15:38:07 but yes, we can bump to 0.5.2
15:38:31 +1
15:38:42 any volunteer to update it?
15:39:28 I'll push a patch today, it shouldn't be difficult
15:39:34 yeap
15:39:36 Ok thanks
15:39:53 #action ralonsoh to switch stable jobs to use cirros 0.5.2
15:40:06 test_resize_volume_backed_server_confirm fails randomly in tempest-integrated-networking job, fails with kernel panic
15:40:19 /sbin/init: can't load library 'libtirpc.so.3' Kernel panic - not syncing: Attempted to kill init! exitcode=0x00001000
15:40:26 https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_bbd/884898/7/check/tempest-integrated-networking/bbdf62f/testr_results.html
15:40:49 i have seen similar failures in other jobs too, running across different projects
15:41:10 pinged #openstack-qa to know if this is a known bug but haven't got any response yet
15:41:51 Will check again tomorrow, else will report a bug for it
15:42:28 #action ykarel to check and report bug for test_resize_volume_backed_server_confirm
15:42:37 #topic Periodic
15:42:48 https://zuul.openstack.org/builds?job_name=neutron-linuxbridge-tempest-plugin-nftables&project=openstack/neutron
15:43:02 this we already discussed, ralonsoh will be looking into it
15:43:12 These jobs were enabled in master periodic recently with https://review.opendev.org/c/openstack/neutron/+/897427
15:43:19 Stable periodic
15:43:25 https://zuul.openstack.org/builds?job_name=neutron-functional-with-uwsgi-fips&project=openstack%252Fneutron&branch=stable%252Fyoga
15:43:47 this job is failing in stable/yoga since https://review.opendev.org/c/openstack/neutron/+/897044 merged
15:44:00 upsss
15:44:31 I'll check it
15:44:34 this is my fault
15:44:51 but it was green in that patch in the check queue
15:45:06 slaweq, it's the fips jobs failing, non-fips passing
15:45:22 ahh, ok
15:45:22 that is still in experimental?
15:45:33 yes should be
15:46:02 k thanks ralonsoh
15:46:10 I wonder what's the difference between fips and non-fips which is causing this failure
15:46:27 I wasn't expecting anything that would make such a difference there really
15:46:34 yeah, the port binding
15:46:37 curious
15:46:41 yes looks strange
15:46:44 #action ralonsoh to check failures in neutron-functional-with-uwsgi-fips in stable/yoga
15:48:19 https://zuul.openstack.org/builds?job_name=neutron-functional&project=openstack%252Fneutron&branch=stable%252Fvictoria
15:48:35 it was missing backport https://review.opendev.org/q/I2f8130dc3cf3244be2a44a4ecbdbaa9c7f865731
15:48:39 i proposed it
15:49:06 FT job is passing, +2
15:49:17 thx
15:49:22 https://zuul.openstack.org/builds?job_name=neutron-ovn-tempest-ovs-release-fips&job_name=neutron-ovs-tempest-fips&project=openstack%2Fneutron&branch=stable%2F2023.2&skip=0
15:49:40 these are running on centos 9 stream and need the RDO stable/2023.2 release
15:49:59 I will check for the status of it
15:50:18 #action to check for failures in 9-stream jobs stable/2023.2
15:50:31 https://zuul.openstack.org/builds?job_name=neutron-ovn-grenade-multinode-skip-level&branch=stable%252F2023.2
15:50:47 Takashi Kajinami proposed openstack/neutron-lib master: Fix python shebang https://review.opendev.org/c/openstack/neutron-lib/+/898604
15:50:47 This failed in today's run
15:50:58 But this seems to be a one-time issue, as the https://review.opendev.org/q/I09a37ca451d44607b7dde344c93ace060c7bda01 stable/zed patch had merged but the stable/2023.2 one had not when the job ran; the next run will confirm
15:51:12 Takashi Kajinami proposed openstack/os-ken master: Fix python shebang https://review.opendev.org/c/openstack/os-ken/+/898605
15:51:27 So just need to monitor this
15:51:43 That's it for the jobs
15:51:51 #topic On Demand
15:52:23 CI meeting next two weeks
15:52:25 just a reminder the slurp job change is up for review, https://review.opendev.org/c/openstack/neutron/+/895515
15:52:39 thanks ralonsoh and lajoskatona for reviews
15:52:51 ykarel: when that merges it will also need a dashboard update
15:53:01 we should merge it now
15:53:03 haleyb, sure will take care of it
15:53:42 Next week is vPTG and on Tuesday i will be out on holiday, so i propose meeting cancellation next week
15:53:49 ykarel: did you mean the CI meeting will not take place the next two weeks?
15:54:02 and next to next week i will be out the full week
15:54:04 ahhh ok
15:54:09 have fun next week!
15:54:12 ykarel and what about the meeting in 2 weeks? We will be in Brno
15:54:13 understood now
15:54:25 so maybe cancel it too?
15:54:26 slaweq, yes that's the next to next week
15:54:32 ok
15:54:34 ++
15:54:49 ok to cancel for 2 weeks?
15:54:58 yes
15:55:06 +1
15:55:10 +1
15:55:28 k Thanks i will send a mail
15:55:41 Anything else to discuss?
15:56:26 nothing from me
15:56:56 Ok thanks everyone, see you again after two weeks \o/
15:56:59 #endmeeting