15:00:48 <ykarel> #startmeeting neutron_ci
15:00:48 <opendevmeet> Meeting started Tue Oct 17 15:00:48 2023 UTC and is due to finish in 60 minutes. The chair is ykarel. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:48 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:48 <opendevmeet> The meeting name has been set to 'neutron_ci'
15:00:57 <ykarel> ping bcafarel, lajoskatona, mlavalle, mtomaska, ralonsoh, ykarel, jlibosva, elvira
15:00:58 <ykarel> Grafana dashboard: https://grafana.opendev.org/d/f913631585/neutron-failure-rate?orgId=1
15:00:59 <ralonsoh> hello
15:00:59 <slaweq> o/
15:01:12 <haleyb> o/
15:02:01 <mlavalle> o/
15:02:22 <lajoskatona> o/
15:02:39 <ykarel> Let's start, bernard notified he will not be joining today
15:02:42 <ykarel> #topic Actions from previous meetings
15:03:00 <ykarel> ralonsoh to backport https://review.opendev.org/c/openstack/neutron/+/878549
15:03:09 <ykarel> Done https://review.opendev.org/q/I07fa45401788da6b963830e72a7b3a3cd54662e1
15:03:40 <ykarel> it's merged
15:03:42 <ykarel> ralonsoh to check tempest failures related to metadata in openvswitch job
15:03:57 <ralonsoh> I'm still investigating this issue
15:04:05 <ralonsoh> no results so far
15:04:19 <ykarel> Ok thanks, you opened a bug already for it?
15:04:34 <ralonsoh> I don't remember now, I'll check it after this meeting
15:04:41 <ykarel> sure, thanks
15:04:49 <ykarel> ralonsoh to check failures with nftables job in stable periodics
15:04:57 <ykarel> https://bugs.launchpad.net/neutron/+bug/2039027
15:04:59 <ralonsoh> yeah, this is more complicated
15:05:11 <ralonsoh> I realized that we were not testing that since 2 releases ago
15:05:25 <ralonsoh> we have merged this patch, fixing this issue
15:05:45 <ralonsoh> now the problem is to know why the linux bridge job is failing when using nftables
15:06:16 <ykarel> ralonsoh, it should be the same issue we seen few weeks back and worked around using ebtables legacy
15:06:46 <ralonsoh> if I'm not wrong, this is applied in these jobs
15:06:51 <ralonsoh> but I'll check it of course
15:07:04 <ykarel> i had checked some logs and show same ebtables errors
15:07:05 <lajoskatona> to tell the truth we should have a session for this ebtables thing next week
15:07:33 <lajoskatona> I remember we discussed it but I can't tell where we are with it, and if we have to do any extra testing or migration
15:07:39 <ykarel> ralonsoh, yes that is applied but these jobs specifically switch to ebtables-nft as part of nftables setup
15:07:57 <ralonsoh> but I don't think this is related, let me find the newer job executions
15:08:02 <ralonsoh> https://zuul.opendev.org/t/openstack/builds?job_name=neutron-linuxbridge-tempest-plugin-nftables&skip=0
15:08:11 <ykarel> https://github.com/openstack/neutron/blob/169deb24500caa43ce3f6c8cd837d2286d219d5f/roles/nftables/tasks/main.yaml#L16
15:08:43 <ralonsoh> right
15:08:50 <ralonsoh> https://d233c9d65731f58c26a3-15368fe3522896553ebd232c1020ea9b.ssl.cf1.rackcdn.com/periodic/opendev.org/openstack/neutron/master/neutron-linuxbridge-tempest-plugin-nftables/6da8e44/testr_results.html
15:08:56 <ralonsoh> this is the last execution
15:09:31 <ykarel> and in one of those failures i saw:- DEBUG oslo.privsep.daemon [-] privsep: reply[731663a7-3c8c-4a4a-b8f6-ac6318cdb904]: (4, ('', 'ebtables v1.8.7 (nf_tables): RULE_DELETE failed (Invalid argument): rule in chain neutronMAC-tap17686dc1-fe\n', 4)) {{(pid=58939) _call_back /opt/stack/data/venv/lib/python3.10/site-packages/oslo_privsep/daemon.py:499}}
15:09:33 <ralonsoh> I'll check after this meeting if there is any evident error
15:09:39 <ralonsoh> ^^^ like this one
15:09:48 <ykarel> k Thanks
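For context on the role ykarel links above: a minimal sketch, assuming the role relies on update-alternatives, of the kind of Ansible task that points ebtables at the nft backend in these jobs. The task name and exact wording are illustrative, not copied from roles/nftables/tasks/main.yaml.

```yaml
# Illustrative sketch only (not the contents of roles/nftables/tasks/main.yaml):
# the nftables jobs point the ebtables alternative at the nft-backed binary,
# which is why the generic ebtables-legacy workaround does not apply to them.
- name: Switch ebtables to the nft backend for the nftables jobs
  become: true
  command: update-alternatives --set ebtables /usr/sbin/ebtables-nft
```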
15:10:15 <ykarel> ykarel to open bug for functional failures TestMaintenance
15:10:25 <ykarel> i reported https://bugs.launchpad.net/neutron/+bug/2039417
15:11:12 <ykarel> it's a random issue, any volunteer to check that?
15:11:22 <ralonsoh> not this week, sorry
15:11:50 <ykarel> k then i can check later this week
15:11:54 <mtomaska> i can take a look but later in the week
15:11:57 <mtomaska> Friday or Monday
15:12:03 <ykarel> #action ykarel to check https://bugs.launchpad.net/neutron/+bug/2039417
15:12:22 <ykarel> thanks mtomaska, i will have a look
15:12:33 <ykarel> ykarel to check failures with openstacksdk and postgres job
15:12:44 <ykarel> https://bugs.launchpad.net/neutron/+bug/2039066
15:13:06 <ykarel> required two patches merged in openstacksdk for this, so all good now
15:13:19 <ykarel> for train one, neutron is marked as eol https://review.opendev.org/c/openstack/releases/+/895196
15:13:34 <ralonsoh> cool
15:13:36 <ykarel> so once branches are dropped these jobs will stop running
15:13:54 <ykarel> #topic Stable branches
15:14:36 <ykarel> all looks good in stable, Bernard communicated offline
15:14:47 <ykarel> #topic Stadium projects
15:14:54 <ykarel> periodic weekly all green
15:15:00 <ykarel> lajoskatona, anything to add here?
15:15:03 <lajoskatona> yes,
15:15:06 <lajoskatona> nothing more
15:15:15 <ykarel> Thanks
15:15:27 <lajoskatona> this week the fwaas patches were merged, so things are ok
15:15:37 <ykarel> nice
15:15:39 <ykarel> #topic Grafana
15:15:46 <ykarel> https://grafana.opendev.org/d/f913631585/neutron-failure-rate
15:16:46 <ykarel> let's see if all good here
15:16:57 <slaweq> IMO it looks good this week
15:17:23 <ykarel> yes, there are some failures but those are known and being worked upon
15:17:43 <ykarel> btw wdyt if we move this topic at last after discussing known issues/failures?
15:17:44 <mlavalle> pretty subdued
15:18:03 <haleyb> one thing i noticed in the dashboard is it doesn't have every job (i think)
15:18:09 <haleyb> for example py39
15:18:29 <ykarel> yes it's possibly outdated as not updated from some time
15:18:33 <haleyb> or py311 (non-voting)
15:18:45 <haleyb> we might have renamed some other things recently
15:19:07 <lajoskatona> good catch, we also have some stable dashboards somewhere
15:19:52 <ykarel> Ok i will check the missings and add to the dashboard
15:19:55 <lajoskatona> https://opendev.org/openstack/project-config/src/branch/master/grafana/neutron-stable-minusone.yaml & https://opendev.org/openstack/project-config/src/branch/master/grafana/neutron-stable-minustwo.yaml
15:20:38 <ykarel> #action ykarel to update neutron grafana dashboard
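The dashboard update taken as an action above lives in the grafyaml definitions in openstack/project-config. Below is a hypothetical fragment of what adding a missing job such as openstack-tox-py311 could look like; the row layout and the graphite target string are assumptions and should be copied from the neighbouring entries in grafana/neutron.yaml rather than from here.

```yaml
# Hypothetical fragment; copy the real target expression from an existing panel
# in openstack/project-config grafana/neutron.yaml and swap in the missing job name.
- dashboard:
    title: Neutron Failure Rate
    rows:
      - title: Unit Tests Failure Rates
        panels:
          - title: Failure Rates (Gate queue)
            type: graph
            targets:
              # one target per job; the metric path below is illustrative only
              - target: alias(stats_counts.zuul.tenant.openstack.pipeline.gate.project.opendev_org.openstack_neutron.master.job.openstack-tox-py311.FAILURE, 'openstack-tox-py311')
```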
15:21:43 <ykarel> for the question
15:21:46 <ykarel> wdyt if we move this topic at last after discussing known issues/failures?
15:22:02 <lajoskatona> ok from my side
15:22:02 <mlavalle> you mean to be the last topic in the agenda?
15:22:02 <ykarel> as we will have more clear picture once those are discussed on what's missing
15:22:26 <ykarel> yes before on demand one
15:22:35 <mlavalle> let's give it a try
15:22:42 <mlavalle> see what happens
15:23:11 <ykarel> k Thanks, i will move it
15:23:35 <ykarel> #topic Rechecks
15:24:05 <ykarel> we have more rechecks last week
15:24:26 <ykarel> and the intermittent failures in couple of jobs leading to it
15:24:42 <ykarel> and there were 3/14 bare rechecks
15:25:14 <ykarel> and good that many patches merged last week
15:25:58 <ykarel> let's keep doing less bare rechecks
15:26:11 <ykarel> Now let's talk about failures
15:26:12 <ykarel> #topic fullstack/functional
15:26:23 <ykarel> test_floatingip_mac_bindings (seen once)
15:26:28 <ykarel> https://4bb852441388777a25ca-c1d45ef7eb040e2d7e76303651507120.ssl.cf1.rackcdn.com/897045/2/gate/neutron-functional-with-uwsgi/8954870/testr_results.html
15:26:53 <ykarel> anyone recall seeing similar in past?
15:28:13 <ralonsoh> https://bugs.launchpad.net/neutron/+bug/1884986
15:28:24 <ralonsoh> and I pushed https://review.opendev.org/c/openstack/neutron/+/769880
15:29:43 <ykarel> ohkk that's long back
15:30:51 <ykarel> let's see if we see more occurrences for it and investigate.
15:31:03 <ykarel> for now seen just once
15:31:26 <ykarel> next is
15:31:29 <ykarel> test_restart_rpc_on_sighup_multiple_workers
15:31:38 <ykarel> it fails as RuntimeError: Expected buffer size: 10, current size: 25
15:31:55 <ykarel> seen 4 occurrences
15:31:58 <ykarel> https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_9cd/periodic/opendev.org/openstack/neutron/master/neutron-functional-with-uwsgi-fips/9cde7d1/testr_results.html
15:32:17 <ykarel> http://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_263/893659/12/check/neutron-functional-with-uwsgi/263727d/testr_results.html
15:32:17 <ykarel> https://20c2c42dd35909dbd48c-6b0cadbc155764a70d2c1954f64b7ab0.ssl.cf2.rackcdn.com/897472/3/check/neutron-functional-with-uwsgi/20b1e5e/testr_results.html
15:32:17 <ykarel> https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_e50/897045/2/check/neutron-functional-with-uwsgi/e50c4f6/testr_results.html
15:32:27 <ykarel> this sounds a known bit?
15:32:34 <ralonsoh> this is maybe because the writing queue was not delted properly
15:32:47 <ralonsoh> and is reading the restart + the start messages
15:33:16 <mlavalle> dealt properly?
15:33:57 <ralonsoh> when the queue is retrieved, is deleted at the same time
15:34:03 <mlavalle> ahhh
15:34:13 <ralonsoh> but this is just pure speculation
15:35:38 <ralonsoh> I'll check it
15:35:40 <ykarel> ohkk any volunteer to look into this
15:35:45 <ykarel> k thanks
15:36:00 <ykarel> #action ralonsoh to check failures for test_restart_rpc_on_sighup_multiple_workers
15:36:12 <ykarel> #topic Tempest/Scenario
15:36:32 <ykarel> Failures downloading cirros image like in https://zuul.openstack.org/build/97f5b27439ca469a9cd110087a386c62
15:36:56 <ykarel> https://review.opendev.org/c/openstack/project-config/+/873735 dropped older cirros images from cache
15:37:05 <ykarel> so images are downloaded during the jobs in stable branches where these images are used and can fail with network issues
15:37:18 <ykarel> We can switch to recent image like 0.5.2 on those branches where 0.5.1 is used to avoid such issues
15:37:31 <opendevreview> Takashi Kajinami proposed openstack/neutron master: Fix python shebang https://review.opendev.org/c/openstack/neutron/+/898591
15:37:40 <ykarel> on 10/11th oct seen couple of failures due to it
15:37:43 <ralonsoh> qq: was it necessary to remove these images from the cache?
15:37:54 <ralonsoh> if these images are still being used in the CI
15:37:56 <opendevreview> Takashi Kajinami proposed openstack/neutron master: Fix python shebang https://review.opendev.org/c/openstack/neutron/+/898591
15:38:01 <ykarel> as per comment those older images are not used much
15:38:07 <ralonsoh> but yes, we can bump to 0.5.2
15:38:31 <ykarel> +1
15:38:42 <ykarel> any volunteer to update it?
15:39:28 <ralonsoh> I'll push a patch today, it shouldn't be difficult
15:39:34 <ykarel> yeap
15:39:36 <ykarel> Ok thanks
15:39:53 <ykarel> #action ralonsoh to switch stable jobs to use cirros 0.5.2
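The cirros bump agreed above is usually just a devstack variable override in the affected stable job definitions; a minimal sketch follows, with a placeholder job name. CIRROS_VERSION and the DEFAULT_IMAGE_* variables are standard devstack settings, but exactly where the real patch sets them (job, parent job, or devstack itself) is an assumption.

```yaml
# Sketch only: placeholder job name, and the real change may land in a parent job.
- job:
    name: neutron-tempest-plugin-scenario-openvswitch
    vars:
      devstack_localrc:
        CIRROS_VERSION: 0.5.2
        DEFAULT_IMAGE_NAME: cirros-0.5.2-x86_64-disk
        DEFAULT_IMAGE_FILE_NAME: cirros-0.5.2-x86_64-disk.img
```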
15:40:06 <ykarel> test_resize_volume_backed_server_confirm fails randomly in tempest-integrated-networking job, fails with kernel panic
15:40:19 <ykarel> /sbin/init: can't load library 'libtirpc.so.3' Kernel panic - not syncing: Attempted to kill init! exitcode=0x00001000
15:40:26 <ykarel> https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_bbd/884898/7/check/tempest-integrated-networking/bbdf62f/testr_results.html
15:40:49 <ykarel> i seen similar failure in other jobs too running across different projects
15:41:10 <ykarel> ping #openstack-qa to know if this is known bit but haven't got any response yet
15:41:51 <ykarel> Will check again tomorrow else will report a bug for it
15:42:28 <ykarel> #action ykarel to check and report bug for test_resize_volume_backed_server_confirm
15:42:37 <ykarel> #topic Periodic
15:42:48 <ykarel> https://zuul.openstack.org/builds?job_name=neutron-linuxbridge-tempest-plugin-nftables&project=openstack/neutron
15:43:02 <ykarel> this we already discussed, ralonsoh will be looking into it
15:43:12 <ykarel> These jobs are enabled in master periodic recently with https://review.opendev.org/c/openstack/neutron/+/897427
15:43:19 <ykarel> Stable periodic
15:43:25 <ykarel> https://zuul.openstack.org/builds?job_name=neutron-functional-with-uwsgi-fips&project=openstack%252Fneutron&branch=stable%252Fyoga
15:43:47 <ykarel> this job is failing in stable/yoga since https://review.opendev.org/c/openstack/neutron/+/897044 merged
15:44:00 <ralonsoh> upsss
15:44:31 <ralonsoh> I'll check it
15:44:34 <ralonsoh> this is my fault
15:44:51 <slaweq> but it was green in that patch in check queue
15:45:06 <ykarel> slaweq, it's fips jobs failing, non fips passing
15:45:22 <slaweq> ahh, ok
15:45:22 <lajoskatona> that is still in experimental?
15:45:33 <ykarel> yes should be
15:46:02 <ykarel> k thanks ralonsoh
15:46:10 <slaweq> I wonder what's the different between fips and non-fips which is causing this failure
15:46:27 <slaweq> I wasn't expecting anything what would make there such difference really
15:46:34 <ralonsoh> yeah, the port binding
15:46:37 <ralonsoh> curious
15:46:41 <ykarel> yes looks strange
15:46:44 <ykarel> #action ralonsoh to check failures in neutron-functional-with-uwsgi-fips in stable/yoga
15:48:19 <ykarel> https://zuul.openstack.org/builds?job_name=neutron-functional&project=openstack%252Fneutron&branch=stable%252Fvictoria
15:48:35 <ykarel> it was missing backport https://review.opendev.org/q/I2f8130dc3cf3244be2a44a4ecbdbaa9c7f865731
15:48:39 <ykarel> i proposed
15:49:06 <ralonsoh> FT job is passing, +2
15:49:17 <ykarel> thx
15:49:22 <ykarel> https://zuul.openstack.org/builds?job_name=neutron-ovn-tempest-ovs-release-fips&job_name=neutron-ovs-tempest-fips&project=openstack%2Fneutron&branch=stable%2F2023.2&skip=0
15:49:40 <ykarel> these are running on centos 9 stream and needs RDO stable/2023.2 release
15:49:59 <ykarel> I will check for the status of it
15:50:18 <ykarel> #action to check for failures in 9-stream jobs stable/2023.2
15:50:31 <ykarel> https://zuul.openstack.org/builds?job_name=neutron-ovn-grenade-multinode-skip-level&branch=stable%252F2023.2
15:50:47 <opendevreview> Takashi Kajinami proposed openstack/neutron-lib master: Fix python shebang https://review.opendev.org/c/openstack/neutron-lib/+/898604
15:50:47 <ykarel> This failed in today's run
15:50:58 <ykarel> But This seems one time as https://review.opendev.org/q/I09a37ca451d44607b7dde344c93ace060c7bda01 stable/zed patch merged but not stable/2023.2 when job ran, next run will confirm
15:51:12 <opendevreview> Takashi Kajinami proposed openstack/os-ken master: Fix python shebang https://review.opendev.org/c/openstack/os-ken/+/898605
15:51:27 <ykarel> So just need to monitor this
15:51:43 <ykarel> That's it for the jobs
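For reference on the periodic coverage discussed above (the nftables jobs added to the master periodic runs by https://review.opendev.org/c/openstack/neutron/+/897427): an illustrative Zuul project stanza showing how a job lands in neutron's periodic pipeline. This is a sketch of the mechanism, not the content of that patch.

```yaml
# Illustrative only; the actual job list added by that change may differ.
- project:
    periodic:
      jobs:
        - neutron-linuxbridge-tempest-plugin-nftables
```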
15:51:51 <ykarel> #topic On Demand
15:52:23 <ykarel> CI meeting next two weeks
15:52:25 <haleyb> just a reminder the slurp job change is up for review, https://review.opendev.org/c/openstack/neutron/+/895515
15:52:39 <haleyb> thanks ralonsoh and lajoskatona for reviews
15:52:51 <haleyb> ykarel: when that merges will also need dashboard update
15:53:01 <ralonsoh> we should merge it now
15:53:03 <ykarel> haleyb, sure will take care of it
15:53:42 <ykarel> Next week is vPTG and on Tuesday i will be out as holiday so i propose meeting cancellation next week
15:53:49 <mlavalle> ykarel: did you mean CI meeting wil not take place the next two weeks?
15:54:02 <ykarel> and next to next week i will be out full week
15:54:04 <mlavalle> ahhh ok
15:54:09 <ralonsoh> have fun next week!
15:54:12 <slaweq> ykarel and what about meeting in 2 weeks? We will be in Brno
15:54:13 <mlavalle> understood now
15:54:25 <slaweq> so maybe cancel it too?
15:54:26 <ykarel> slaweq, yes that's the next to next week
15:54:32 <slaweq> ok
15:54:34 <slaweq> ++
15:54:49 <ykarel> ok to cancel for 2 weeks?
15:54:58 <ralonsoh> yes
15:55:06 <lajoskatona> +1
15:55:10 <mlavalle> +1
15:55:28 <ykarel> k Thanks i will send a mail
15:55:41 <ykarel> ANy thing else to discuss?
15:56:26 <slaweq> nothing from me
15:56:56 <ykarel> Ok thanks everyone, see you again after two weeks \o/
15:56:59 <ykarel> #endmeeting