15:00:38 <slaweq> #startmeeting neutron_ci
15:00:38 <opendevmeet> Meeting started Tue Aug 31 15:00:38 2021 UTC and is due to finish in 60 minutes.  The chair is slaweq. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:38 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:38 <opendevmeet> The meeting name has been set to 'neutron_ci'
15:00:43 <slaweq> hi again
15:00:48 <bcafarel> o/ again
15:01:01 <slaweq> first of all
15:01:03 <slaweq> Grafana dashboard: http://grafana.openstack.org/dashboard/db/neutron-failure-rate
15:01:31 <lajoskatona> Hi
15:01:43 <ralonsoh> hi
15:01:54 <slaweq> ok, let's start
15:01:56 <slaweq> #topic Actions from previous meetings
15:02:00 <slaweq> slaweq to update neutron-lib grafana dashboard
15:02:05 <slaweq> Patch https://review.opendev.org/c/openstack/project-config/+/806138
15:02:35 <ralonsoh> sorry, I forgot to review that patch
15:03:05 <slaweq> ralonsoh: np
15:03:18 <slaweq> that was the only action from last week
15:03:27 <slaweq> so we can move on to the next topics quickly
15:03:32 <slaweq> #topic Stadium projects
15:03:37 <slaweq> any updates/issues there?
15:04:00 <lajoskatona> all green (I had a few days when I thought it was all red, but it was some glitch)
15:04:16 <slaweq> ++
15:04:20 <slaweq> that's good news
15:04:21 <lajoskatona> except n-d-r: https://review.opendev.org/c/openstack/neutron-dynamic-routing/+/806543
15:04:22 <slaweq> thx lajos
15:04:35 <bcafarel> and https://review.opendev.org/c/openstack/networking-sfc/+/799838 if you have a minute for networking-sfc
15:04:55 <lajoskatona> for n-d-r it is related to the recent payload changes, but I can't figure out what is missing from my patch
15:05:02 <lajoskatona> bcafarel: ok
15:05:56 <slaweq> lajoskatona: I can take a look at those failing UTs if You want
15:06:48 <lajoskatona> slaweq: thanks, I see binding failures for the failing test, but I'm not sure if that is related to the failure, or if it happened earlier as well and is unrelated to the test failure
15:07:18 <slaweq> I bet it's a red herring
15:07:22 <slaweq> but maybe I'm wrong
15:07:23 <lajoskatona> like here: https://review.opendev.org/c/openstack/neutron-dynamic-routing/+/806543
15:07:35 <lajoskatona> slaweq: possible
15:07:51 <lajoskatona> I mean that red herring
15:07:56 <lajoskatona> :P
15:08:24 <slaweq> if I find something, I will write a comment in gerrit
15:08:58 <lajoskatona> ok, thanks
15:10:15 <slaweq> if that's all for stadium's CI, I think we can move on
15:10:18 <slaweq> #topic Stable branches
15:10:23 <lajoskatona> that's it
15:10:26 <slaweq> bcafarel: any updates there?
15:10:51 <bcafarel> overall quite good, https://review.opendev.org/c/openstack/neutron-tempest-plugin/+/806119 just merged so EM branches should be back to green soon too
15:11:24 <ralonsoh> I hope so
15:12:12 <slaweq> ++
15:12:16 <slaweq> great
15:13:50 <slaweq> so I think we can move on quickly
15:14:09 <slaweq> #topic Grafana
15:14:12 <slaweq> http://grafana.openstack.org/dashboard/db/neutron-failure-rate
15:14:43 <ralonsoh> functional testing is killing the CI
15:14:59 <ralonsoh> the problem with the OVN UUID
15:16:14 <slaweq> yes, and I know that jlibosva was working on it
15:16:26 <ralonsoh> there is a patch to check the ovn-controller status
15:16:28 <slaweq> and now otherwiseguy should continue that
15:16:59 <slaweq> main problem is that it can happen in (almost) any OVN-related functional test
15:17:07 <ralonsoh> yes
15:17:09 <slaweq> so it's hard to blacklist something
15:17:14 <slaweq> or mark as unstable
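For context on "mark as unstable": neutron's test base provides an `unstable_test` decorator that turns a known-flaky failure into a skip while keeping the test running. Below is a minimal, self-contained sketch of how such a decorator can work — this is illustrative only, not neutron's actual implementation:

```python
import functools
import unittest


def unstable_test(reason):
    """Report a failure of a known-unstable test as a skip.

    A passing run still counts as a pass, so the test keeps providing
    signal without breaking the gate on its known intermittent failure.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(self, *args, **kwargs):
            try:
                return func(self, *args, **kwargs)
            except unittest.SkipTest:
                # already skipped for another reason; don't mask it
                raise
            except Exception as exc:
                raise unittest.SkipTest(
                    "Test %s is unstable (%s): %s"
                    % (func.__name__, reason, exc))
        return wrapper
    return decorator


class ExampleTest(unittest.TestCase):
    @unstable_test("bug 1942190")
    def test_sometimes_flaky(self):
        # deliberate failure, to show it is reported as a skip
        self.assertEqual(1, 2)
```

The trade-off is the one discussed above: this only helps when the flakiness is confined to specific tests, which is not the case for the OVN UUID issue.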
15:18:31 <slaweq> I hope otherwiseguy will find some solution quickly :)
15:19:09 <slaweq> I think it's the cause of more than 90% of the functional job's failures currently
15:19:59 <lajoskatona> wasn't it the one for which jlibosva proposed to use an older OVS?
15:21:22 <slaweq> lajoskatona: yes, but apparently the issue is more complicated than that
15:21:31 <lajoskatona> slaweq: ok
15:21:34 <slaweq> at least that's last info from jlibosva which I have
15:23:04 <slaweq> ok, let's move on to the specific job's issues
15:23:11 <slaweq> #topic Fullstack/functional
15:23:20 <slaweq> regarding functional we already discussed
15:23:35 <slaweq> for fullstack job, I found today 3 very similar failures
15:23:39 <slaweq> https://bugs.launchpad.net/neutron/+bug/1942190
15:23:44 <slaweq> https://40502112e1d4c65f94dd-005095c9da7f9886ddbc5e1cb2d2328c.ssl.cf5.rackcdn.com/806325/1/check/neutron-fullstack-with-uwsgi/09ff3ee/testr_results.html
15:23:46 <slaweq> https://838e8809c9c087f1d2df-d66d94e8460be82c507ecb0f70cc3225.ssl.cf2.rackcdn.com/798009/9/check/neutron-fullstack-with-uwsgi/73910c7/testr_results.html
15:23:48 <slaweq> https://a08f18652b533db4c020-23d868cf2a4e074c6e4235ff4566c9f8.ssl.cf5.rackcdn.com/806277/1/check/neutron-fullstack-with-uwsgi/78a9d70/testr_results.html
15:24:11 <slaweq> does anyone want to check it?
15:24:34 <ralonsoh> maybe I'll be able to at the end of the week
15:24:50 <slaweq> thx ralonsoh
15:25:04 <slaweq> #action ralonsoh to check fullstack issue https://bugs.launchpad.net/neutron/+bug/1942190
15:25:12 <slaweq> next topic
15:25:18 <slaweq> #topic Tempest/Scenario
15:25:27 <slaweq> here, I again saw some oom-killer issues :/
15:25:32 <slaweq> https://84a83418fe08abe99649-be6253c0e82f1539fed391a5717e06a0.ssl.cf2.rackcdn.com/804832/1/check/neutron-tempest-plugin-scenario-linuxbridge/f663af5/testr_results.html
15:25:33 <slaweq> https://c7ef6c09f34b9ed727cc-08136eee394ca142f86118487824fe1a.ssl.cf5.rackcdn.com/800059/3/check/neutron-tempest-plugin-scenario-openvswitch-iptables_hybrid/e9ce1a1/testr_results.html
15:25:35 <slaweq> https://d683bf64be57725b07bf-9eecc5f5b2306eceabd25b057aacac8a.ssl.cf5.rackcdn.com/804218/4/check/neutron-tempest-plugin-scenario-openvswitch-iptables_hybrid/a237a57/testr_results.html
15:25:37 <slaweq> https://ecd9673e000fd4e988a6-dcb12bf222f1c96a489b3ed464d5a157.ssl.cf5.rackcdn.com/804218/4/gate/neutron-tempest-plugin-scenario-openvswitch/dfea836/testr_results.html
15:25:39 <slaweq> https://2b735aae18d0591220ca-ba27e931a99f05bd6f205438b3cd6a3a.ssl.cf1.rackcdn.com/805031/3/check/neutron-tempest-plugin-scenario-linuxbridge/e958ebe/testr_results.html
15:25:55 <ralonsoh> with the new image?
15:26:08 <slaweq> for example that https://d683bf64be57725b07bf-9eecc5f5b2306eceabd25b057aacac8a.ssl.cf5.rackcdn.com/804218/4/check/neutron-tempest-plugin-scenario-openvswitch-iptables_hybrid/a237a57/testr_results.html is from today
15:26:47 <ralonsoh> ADVANCED_INSTANCE_TYPE=ds512M
15:26:50 <ralonsoh> I don't understand
15:26:54 <ralonsoh> it is using the default one
15:27:41 <slaweq> I see ADVANCED_INSTANCE_TYPE=ntp_image_384M
15:27:45 <slaweq> in that job
15:28:35 <ralonsoh> not in
15:28:37 <ralonsoh> https://84a83418fe08abe99649-be6253c0e82f1539fed391a5717e06a0.ssl.cf2.rackcdn.com/804832/1/check/neutron-tempest-plugin-scenario-linuxbridge/f663af5/job-output.txt
15:28:54 <slaweq> but that one was somehow older
15:29:02 <ralonsoh> yeah
15:29:03 <slaweq> I may not have checked the date properly
15:29:06 <slaweq> sorry for that link
15:29:19 <ralonsoh> anyway, the new image is not helping
15:29:34 <ralonsoh> maybe we have reduced the number of oom
15:29:35 <slaweq> but still, in https://d683bf64be57725b07bf-9eecc5f5b2306eceabd25b057aacac8a.ssl.cf5.rackcdn.com/804218/4/check/neutron-tempest-plugin-scenario-openvswitch-iptables_hybrid/a237a57/job-output.txt the proper flavor is used and it still failed with the same issue
15:29:42 <ralonsoh> but we didn't remove all
15:29:42 <slaweq> at least it looks like the same issue
15:29:58 <slaweq> yes, there are fewer such failures for sure
15:30:47 <ralonsoh> so, what now?
15:30:53 <ralonsoh> can we have bigger VMs?
15:31:05 <slaweq> maybe we should run the tests which require the advanced image serially?
15:31:08 <ralonsoh> or do we need to scrounge up more RAM in the current envs?
15:31:09 <slaweq> one by one
15:31:31 <ralonsoh> the problem is the regex for this
15:31:43 <ralonsoh> this is not an easy one
15:31:44 <slaweq> I will try to figure out something
15:31:50 <ralonsoh> in any case, we can try
15:32:08 <slaweq> #action slaweq to check how to run advanced image tests in serial
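As background for that action item: one way to serialize a subset of tests with stestr is the `group_regex` option, which schedules all tests whose matched group is identical onto the same worker, so they run one by one relative to each other. A hypothetical `.stestr.conf` sketch — the `test_path` and the regex below are illustrative, not the actual advanced-image test selection (which, as noted above, is the hard part):

```ini
[DEFAULT]
test_path=./neutron_tempest_plugin
# All tests whose names match the same captured group land on one
# worker and therefore run serially with respect to each other.
# A real regex would have to capture a single common group for every
# test that boots the advanced image.
group_regex=(neutron_tempest_plugin\.scenario)
```

The caveat ralonsoh raises applies directly: this only works if a regex can cleanly capture exactly the advanced-image tests, which is not easy here.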
15:33:14 <slaweq> and that's basically all from me for today :)
15:33:25 <slaweq> do You have anything else You want to talk about, regarding our CI?
15:34:05 <ralonsoh> no
15:34:12 <bcafarel> all good here
15:34:17 <lajoskatona> nothing from me
15:34:41 <slaweq> ok, so thx for attending the meeting
15:34:46 <slaweq> and see You online
15:34:48 <slaweq> o/
15:34:51 <ralonsoh> bye
15:34:51 <slaweq> #endmeeting