15:00:34 <ykarel> #startmeeting neutron_ci
15:00:34 <opendevmeet> Meeting started Tue Nov 28 15:00:34 2023 UTC and is due to finish in 60 minutes.  The chair is ykarel. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:34 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:34 <opendevmeet> The meeting name has been set to 'neutron_ci'
15:00:42 <ykarel> ping bcafarel, lajoskatona, mlavalle, mtomaska, ralonsoh, ykarel, jlibosva, elvira
15:00:48 <ykarel> Grafana dashboard: https://grafana.opendev.org/d/f913631585/neutron-failure-rate?orgId=1
15:00:48 <ykarel> Please open now :)
15:01:05 <mtomaska> o/
15:01:15 <slaweq> o/
15:01:16 <bcafarel> I barely had time to go make a coffee :)
15:01:18 <bcafarel> o/
15:01:21 <mlavalle> \o
15:01:33 <lajoskatona> o/
15:02:06 <haleyb> o/
15:02:10 <ykarel> k let's start with the topics
15:02:15 <ykarel> #topic Actions from previous meetings
15:02:22 <ykarel> bhaley to refresh grafana dashboard for neutron-lib/neutron-tempest-plugin in project-config
15:02:51 <haleyb> i have the neutron-lib one almost done, just working on the periodic section
15:03:00 <haleyb> short week last week
15:03:13 <ykarel> thx, not an urgent one so should be fine
15:03:23 <ykarel> #topic Stable branches
15:03:38 <ykarel> bcafarel, anything to share for stable?
15:04:06 <bcafarel> I just finished catching up on my reviews, all looks good on CI status on active branches
15:04:22 <ykarel> k that's good
15:04:34 <ykarel> #topic Stadium projects
15:04:58 <ykarel> periodic lines are green here, and the UI issue related to the job links is also fixed
15:05:05 <lajoskatona> projects are green, so no big problem
15:05:41 <ykarel> k Thanks
15:05:50 <ykarel> #topic Rechecks
15:06:01 <lajoskatona> my usual request: if you have a few mins please check the stadium reviews also :-)
15:06:48 <ykarel> we are seeing some random failures across jobs and that led to rechecks on patches this week
15:07:09 <ykarel> there were 2 bare rechecks too, we should try to avoid that
15:07:39 <ykarel> Let's check the failures
15:07:40 <ykarel> #topic fullstack/functional
15:07:49 <ykarel> test_dvr_router_lifecycle_without_ha_without_snat_with_fips
15:07:57 <ykarel> AssertionError: Device name: fg-863e72ab-79, expected MAC: fa:16:3e:80:8d:89
15:08:10 <ykarel> #link https://ef6fea84da73ed48af61-f850a1c88b63080f1e34c13fe4924008.ssl.cf2.rackcdn.com/901476/2/gate/neutron-functional-with-uwsgi/782e426/testr_results.html
15:09:01 <ykarel> anyone recall this issue?
15:09:29 <mlavalle> I don't
15:09:42 <ykarel> i have seen this before but it's not a frequent one
15:09:53 <haleyb> vaguely but don't remember the context
15:10:32 <ykarel> maybe https://bugs.launchpad.net/neutron/+bug/2000164
15:11:51 <haleyb> Co-Authored-By: Brian Haley <haleyb.dev@gmail.com>
15:12:36 <slaweq> To confirm that you can check ovs logs to see if port was created, deleted and created again quickly
15:12:44 <haleyb> ykarel: was this failure in all branches?
15:12:56 <ykarel> seen once as of now in master
15:13:20 <opendevreview> Merged openstack/ovn-bgp-agent stable/2023.2: Ensure withdrawn events are only processed in relevant nodes  https://review.opendev.org/c/openstack/ovn-bgp-agent/+/902065
15:15:36 <ykarel> slaweq, ack will check the logs and confirm if it's the same
15:15:57 <ykarel> #action ykarel to check if functional failure is same as bug 2000164
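For reference, a minimal sketch of the check slaweq suggests above: scan an ovs-vswitchd log for the port being added, deleted and re-added in quick succession. The log path and message pattern here are assumptions (the "bridge ...: added/deleted interface" lines vary between OVS versions), so verify against a real log before trusting the output.

```python
import re
import sys

# Assumed pattern for ovs-vswitchd "bridge" log lines such as
# "...|00042|bridge|INFO|bridge br-int: added interface fg-863e72ab-79 on port 5";
# adjust to the actual format of the log being inspected.
EVENT_RE = re.compile(
    r'^(?P<ts>\S+)\|\d+\|bridge\|INFO\|.*?'
    r'(?P<op>added|deleted) interface (?P<dev>\S+)')


def port_events(log_path, device):
    """Return the ordered (timestamp, added/deleted) events for one interface."""
    events = []
    with open(log_path) as f:
        for line in f:
            m = EVENT_RE.match(line)
            if m and m.group('dev') == device:
                events.append((m.group('ts'), m.group('op')))
    return events


if __name__ == '__main__':
    # e.g. python check_port.py ovs-vswitchd.log fg-863e72ab-79
    for ts, op in port_events(sys.argv[1], sys.argv[2]):
        print(ts, op)
```

A delete immediately followed by a re-add for the fg- device would support the "created, deleted and created again quickly" theory from the discussion.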
15:16:16 <ykarel> Next one
15:16:17 <ykarel> test_port_binding_chassis_create_event
15:16:24 <ykarel> ovsdbapp.backend.ovs_idl.idlutils.RowNotFound: Cannot find Port_Binding with logical_port=1bbcd693-2b3b-4e72-8143-5a67359d56b2
15:16:30 <ykarel> https://3b2ee5c5f48b31d3cd88-77332c10a4b1568b93e2e5229e79a685.ssl.cf5.rackcdn.com/901507/2/gate/neutron-functional-with-uwsgi/ffae97d/testr_results.html
15:16:40 <ykarel> seen this one also once
15:17:27 <ykarel> for this too we need to dig into the logs, any volunteer to check this?
15:17:41 <mlavalle> I can do it
15:17:46 <ykarel> thx mlavalle
15:17:55 <ykarel> #action mlavalle to check failure with test_port_binding_chassis_create_event
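For context on the RowNotFound in the traceback above: ovsdbapp's idlutils raises it when a lookup runs before the row has replicated into the local IDL copy, which in a functional test usually means the assertion raced the Port_Binding creation. A minimal sketch of a tolerant lookup, assuming an already-connected Southbound IDL object (the helper name and timeout are illustrative, not the actual test code):

```python
import time

from ovsdbapp.backend.ovs_idl import idlutils


def wait_for_port_binding(sb_idl, logical_port, timeout=10):
    """Poll the OVN SB Port_Binding table until the row shows up.

    sb_idl is assumed to be a connected ovsdbapp IDL for the
    Southbound DB; it and the timeout are illustrative only.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # row_by_value raises idlutils.RowNotFound until the row
            # is present in the local replica of the database.
            return idlutils.row_by_value(
                sb_idl, 'Port_Binding', 'logical_port', logical_port)
        except idlutils.RowNotFound:
            if time.monotonic() > deadline:
                raise
            time.sleep(0.5)
```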
15:18:24 <ykarel> For fullstack we are still hitting TIMEOUTs sometimes even after dropping linuxbridge tests
15:18:43 <ykarel> so need to check those, but for that we already have a known issue
15:19:31 <ykarel> #topic Tempest/Scenario
15:20:01 <ykarel> we have an infra-specific problem where sometimes jobs fail randomly with Unable to establish SSL connection to cloud-images.ubuntu.com
15:20:25 <ykarel> you may have noticed it already
15:21:01 <haleyb> yup
15:21:11 <lajoskatona> yesterday there was some zuul issue also without logs
15:21:47 <ykarel> yeap the zuul bit is fixed now; for the cloud-images one i am not sure if it's a known issue or not
15:22:14 <ykarel> let's see if we still see this frequently, if so we need to reach out to infra for it
15:22:44 <ykarel> also in some ovs jobs we are seeing failures like
15:22:44 <ykarel> https://ca6bf2baf6d89f5ed910-37ccdaa5e24fdd183c62eb6fdd1f8a71.ssl.cf5.rackcdn.com/901827/1/gate/neutron-tempest-plugin-openvswitch-enforce-scope-old-defaults/3e27967/testr_results.html
15:22:45 <ykarel> https://4cd02d83e42f664c3d38-4289c2655d9b20a523cb41ee99eef7c9.ssl.cf5.rackcdn.com/901474/3/check/neutron-tempest-plugin-openvswitch-enforce-scope-old-defaults/fb28885/testr_results.html
15:22:55 <ykarel> where test fails to ssh guest vm
15:23:33 <ykarel> in these examples, the guest vms don't have an ip assigned by dhcp
15:24:40 <lajoskatona> the 2 examples are test_qos, does it happen for other tests also?
15:25:00 <ykarel> i recall rodolfo was checking something similar before his PTOs
15:25:24 <ykarel> i think it's not specific to that test/job
15:25:39 <ykarel> i saw some more occurrences in opensearch
15:26:01 <slaweq> from quick look I see many ERRORs logged in the q-dhcp log file: https://4cd02d83e42f664c3d38-4289c2655d9b20a523cb41ee99eef7c9.ssl.cf5.rackcdn.com/901474/3/check/neutron-tempest-plugin-openvswitch-enforce-scope-old-defaults/fb28885/controller/logs/screen-q-dhcp.txt
15:26:02 <ykarel> looking for: "probing for an IPv4LL address"
15:26:19 <slaweq> maybe one of those errors is related to the failed test
15:27:44 <ykarel> in one of those i noticed there were no DHCPREQUEST/DHCPACK messages, and continuous DHCPOFFERs
15:28:17 <ykarel> maybe the DHCPOFFER is not reaching the vms for some reason?
15:30:40 <ykarel> and in one i saw the dhcp agent took time to configure the port (~1 minute) and in that time the client side configured an IPv4LL address
15:32:05 <ykarel> iirc in older cirros versions where udhcpc was used it would try thrice, every 60 seconds, before giving up, but that doesn't seem to be the case with the recent cirros/dhcpcd versions
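A rough sketch of how the DHCPOFFER-without-DHCPACK pattern described above could be pulled out of a q-dhcp log mechanically. It assumes the usual dnsmasq-dhcp line shape ("DHCPOFFER(tap...) <ip> <mac>"); the helper itself is hypothetical and should be checked against a real log.

```python
import collections
import re

# Assumes dnsmasq-dhcp style lines, e.g.
# "DHCPOFFER(tap1234) 10.1.0.5 fa:16:3e:80:8d:89"; verify against an
# actual q-dhcp/dnsmasq log before trusting the result.
LINE_RE = re.compile(
    r'(DHCPDISCOVER|DHCPOFFER|DHCPREQUEST|DHCPACK)\S*\s.*?'
    r'([0-9a-f]{2}(?::[0-9a-f]{2}){5})')


def stuck_clients(log_path):
    """Return MACs that got offers but never completed the handshake."""
    seen = collections.defaultdict(set)
    with open(log_path) as f:
        for line in f:
            m = LINE_RE.search(line)
            if m:
                msg, mac = m.groups()
                seen[mac].add(msg)
    # An offer with no matching ACK is the pattern discussed above:
    # the DHCPOFFER apparently never reaches the guest.
    return [mac for mac, msgs in seen.items()
            if 'DHCPOFFER' in msgs and 'DHCPACK' not in msgs]
```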
15:32:17 <ykarel> anyway we need to look into these failures
15:32:24 <ykarel> any volunteer to log a bug for this?
15:33:04 <ykarel> ok i will log a bug for this and then we will see
15:33:30 <ykarel> #action ykarel to log a bug for dhcp/metadata issue in ovs jobs
15:33:32 <lajoskatona> +1
15:33:46 <ykarel> #topic Periodic
15:33:55 <ykarel> master line is all good
15:34:01 <ykarel> for stable we have some broken jobs
15:34:16 <ykarel> #link https://zuul.openstack.org/builds?job_name=neutron-tempest-mariadb-full&project=openstack%2Fneutron&branch=stable%2Fussuri&skip=0
15:34:25 <ykarel> This job runs on bionic and sets up mariadb 10.3 repos which are no longer available
15:34:31 <opendevreview> Merged openstack/ovn-bgp-agent stable/2023.2: Avoid race when deleting VM with FIP  https://review.opendev.org/c/openstack/ovn-bgp-agent/+/902067
15:35:09 <ykarel> wdyt, should we just drop that job from ussuri as it's already in extended maintenance?
15:35:30 <slaweq> ++
15:35:37 <lajoskatona> +1
15:35:42 <haleyb> +1
15:35:43 <bcafarel> +1, and ussuri is also headed for EOL no?
15:35:55 <bcafarel> so cleaning the job earlier will not hurt
15:36:43 <ykarel> ok any volunteer to push the patch to drop?
15:37:15 <bcafarel> I already have the zuul file open in vim :)
15:37:22 <ykarel> ++ thx
15:37:40 <ykarel> #action bcafarel to drop neutron-tempest-mariadb-full in ussuri
15:37:50 <ykarel> #link https://zuul.openstack.org/builds?job_name=neutron-linuxbridge-tempest-plugin-scenario-nftables&job_name=neutron-ovs-tempest-plugin-scenario-iptables_hybrid-nftables&project=openstack%2Fneutron&branch=stable%2Fxena&skip=0
15:38:25 <ykarel> these are failing with RETRY_LIMIT and there are no logs to check the reason, so they need to be dug into
15:38:42 <ykarel> and these are for xena
15:39:18 <ykarel> any volunteer to check these?
15:40:34 <ykarel> k i will check these and report bug
15:40:49 <ykarel> #action ykarel to check issue with nftables xena periodic job
15:40:58 <ykarel> #link https://zuul.openstack.org/builds?job_name=neutron-ovs-tempest-fips&job_name=neutron-ovn-tempest-ovs-release-fips&project=openstack%2Fneutron&branch=stable%2Fyoga&skip=0
15:41:18 <ykarel> this is specific to the centos 9-stream fips jobs
15:41:29 <ykarel> in stable/yoga
15:42:05 <ykarel> in yoga we have pyroute2 0.6.6 and due to that it's hitting https://bugs.launchpad.net/tripleo/+bug/2023764
15:43:17 <lajoskatona> so this means a pyroute2 req update on yoga?
15:43:45 <ykarel> yes, if we want to get that fixed; otherwise drop the jobs from running in yoga
15:45:54 <ykarel> wdyt? should we just drop these two jobs
15:46:23 <ykarel> bumping pyroute2 in stable might be trickier
15:46:29 <lajoskatona> agree
15:46:57 <slaweq> drop jobs IMO
15:47:07 <opendevreview> Bernard Cafarelli proposed openstack/neutron stable/ussuri: [stable-only] Drop neutron-tempest-mariadb-full periodic job  https://review.opendev.org/c/openstack/neutron/+/902090
15:47:08 <lajoskatona> perhaps the thing should be documented somewhere to help operators, but let's drop the jobs
15:47:31 <ykarel> Ok, on the RDO side it's already patched
15:47:44 <ykarel> any volunteer to drop these jobs in yoga?
15:49:02 <ykarel> k i will send the patch to drop these
15:49:15 <lajoskatona> thanks
15:49:18 <ykarel> #action ykarel to send patch to drop ovs/ovn fips job in stable/yoga
15:49:33 <ykarel> Last one
15:49:37 <ykarel> #link https://zuul.openstack.org/builds?job_name=devstack-tobiko-neutron&project=openstack%2Fneutron&branch=stable%2Fxena&branch=stable%2Fyoga&branch=stable%2Fzed&skip=0
15:49:56 <ykarel> This job is running on focal and running tests which shouldn't run on focal as per https://review.opendev.org/c/openstack/neutron/+/871982
15:50:48 <ykarel> and that fix is included in antelope+ only, so for older branches we need to skip those tests in some other way
15:51:24 <ykarel> slaweq, eolivare maybe you know ^
15:52:34 <ykarel> k i will report a bug for this and then will see
15:52:45 <bcafarel> do we have something like tempest_exclude_regex for tobiko?
15:52:50 <ykarel> #action ykarel to report bug for tobiko stable jobs
15:52:54 <ykarel> me no idea
15:53:08 <bcafarel> yes let's wait for tobiko experts here :)
15:53:25 <slaweq> bcafarel no, there is nothing like that
15:53:35 <ykarel> k last topic
15:53:39 <ykarel> #topic Grafana
15:53:41 <ykarel> https://grafana.opendev.org/d/f913631585/neutron-failure-rate?orgId=1
15:53:51 <slaweq> I will check that tobiko issue
15:53:57 <ykarel> Thanks slaweq
15:54:04 <ykarel> let's have a quick look at grafana too
15:55:19 <ykarel> it looks overall good to me, there are some failures in check but those i think are patch specific
15:56:07 <lajoskatona> or related to recent zuul issues
15:56:17 <ykarel> yeap
15:56:32 <ykarel> k let's move to on demand
15:56:33 <ykarel> #topic On Demand
15:56:41 <ykarel> anyone wants to bring something here?
15:57:32 <lajoskatona> nothing from me
15:58:31 <mlavalle> nothing from me either
15:58:51 <ykarel> K Thanks everyone, it's stretched longer today :)
15:58:56 <ykarel> #endmeeting