15:00:28 <slaweq> #startmeeting neutron_ci
15:00:29 <openstack> Meeting started Wed Apr 22 15:00:28 2020 UTC and is due to finish in 60 minutes.  The chair is slaweq. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:30 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:30 <slaweq> hi
15:00:32 <openstack> The meeting name has been set to 'neutron_ci'
15:01:49 <bcafarel> o/
15:01:51 <ralonsoh> hi
15:01:54 <maciejjozefczyk> hey
15:02:05 <njohnston> o/
15:02:44 <slaweq> Grafana dashboard: http://grafana.openstack.org/dashboard/db/neutron-failure-rate
15:02:51 <slaweq> ok, let's start
15:02:55 <slaweq> #topic Actions from previous meetings
15:03:01 <slaweq> ralonsoh: check ovn jobs failure
15:03:17 <slaweq> fix is already merged: https://review.opendev.org/#/c/720248/
15:03:19 <slaweq> thx ralonsoh :)
15:03:24 <ralonsoh> yw
15:03:36 <slaweq> second one
15:03:38 <slaweq> slaweq: ping yamamoto about midonet gate problems
15:03:47 <slaweq> I still don't have any reply from yamamoto
15:04:09 <slaweq> I will keep trying and maybe take a look at those broken UTs if I have a few minutes
15:04:22 <slaweq> and that's all for actions from last week
15:04:40 <slaweq> #topic Stadium projects
15:04:48 <slaweq> standardize on zuul v3
15:04:50 <slaweq> Etherpad: https://etherpad.openstack.org/p/neutron-train-zuulv3-py27drop
15:05:05 <slaweq> I don't think there was any update on this last week
15:05:20 <njohnston> https://review.opendev.org/#/c/672925 is failing on the networking-odl-functional jobs. I am looking into it
15:05:57 <slaweq> njohnston: I just wanted to suggest trying https://review.opendev.org/#/c/715439/3 but I think You already did
15:06:44 <slaweq> so with this patch only midonet will still be left to do, right?
15:06:47 <njohnston> Lajos did that in PS 31 I think
15:06:53 <njohnston> slaweq: Yes I believe so
15:07:03 <slaweq> that's great
15:07:18 <slaweq> thx njohnston for update
15:07:53 <slaweq> regarding CI issues in stadium projects, we still have this midonet bug
15:08:18 <slaweq> other than that I think that stadium projects are running pretty well
15:08:46 <slaweq> do You have anything to add/ask regarding stadium?
15:08:53 <njohnston> nothing here
15:09:31 <bcafarel> small change in neutron-tempest-plugin: some jobs that were still using the all-plugin tox target now use the standard "all"
15:09:55 <bcafarel> (fixing this deprecated one was needed to make https://review.opendev.org/#/c/721277/ pass)
15:09:59 <slaweq> bcafarel: and that will help us avoid issues like the ones we had with the Stein jobs recently, right?
15:10:28 <bcafarel> yes, and hopefully it should not have any side effects :)
15:10:42 <slaweq> bcafarel++ thx
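For context on the change discussed above: the switch is a one-variable update in the neutron-tempest-plugin job definitions. The job name below is only a placeholder; the relevant pieces are the tox_envlist variable and the deprecated all-plugin tempest target — a minimal sketch of such a job definition:

```yaml
# Illustrative zuul job definition; the job name is a placeholder,
# but the change is switching tox_envlist from the deprecated
# "all-plugin" tempest target to the standard "all" one.
- job:
    name: neutron-tempest-plugin-scenario-example
    parent: devstack-tempest
    vars:
      # was: tox_envlist: all-plugin  (deprecated in tempest)
      tox_envlist: all
```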
15:11:26 <slaweq> ok, and with that I think we can move to the next topic which is
15:11:27 <slaweq> #topic Stable branches
15:11:29 <slaweq> :)
15:11:35 <slaweq> Train dashboard: http://grafana.openstack.org/d/pM54U-Kiz/neutron-failure-rate-previous-stable-release?orgId=1
15:11:37 <slaweq> Stein dashboard: http://grafana.openstack.org/d/dCFVU-Kik/neutron-failure-rate-older-stable-release?orgId=1
15:12:42 <slaweq> except for this issue with neutron-tempest-plugin jobs on Stein, it looks ok to me
15:12:55 <bcafarel> yep and that stein fix is now in
15:13:54 <slaweq> yes, I saw :) thx once again for taking care of this
15:14:56 <slaweq> ok, I think we can move on to next topic, right?
15:15:28 <njohnston> +1
15:15:58 <bcafarel> yes
15:16:03 <slaweq> #topic Grafana
15:17:13 <slaweq> overall I think that it looks ok here too
15:17:18 <njohnston> BTW what is with the points appearing in the "Number of integrated Tempest jobs runs (Gate queue)" for the "24 hours" line?  Seems like an error in the grafana config.
15:18:27 <slaweq> njohnston: are You asking about lack of data there?
15:19:07 <njohnston> "24 hours" doesn't sound like a zuul job name
15:19:33 <njohnston> so I don't know why it would be in the data set
15:19:49 <bcafarel> hmm I don't have it, refreshing
15:20:00 <slaweq> me neither :)
15:20:06 <ralonsoh> nope
15:20:12 <njohnston> weird, I just refreshed it and it went away
15:20:14 <njohnston> that is super weird
15:20:21 <njohnston> ok, never mind :-)
15:20:22 <bcafarel> :) you scared it away
15:20:27 <slaweq> LOL
15:20:30 <ralonsoh> zuul! the source of problems and solutions
15:20:49 <slaweq> ok, so we at least solved one problem today ;P
15:21:24 <njohnston> LOL
15:21:44 <slaweq> among other things, I saw today that we have a lot of non-voting jobs in the check queue
15:22:00 <slaweq> maybe we should think about promoting some of them to be voting?
15:22:17 <slaweq> I'm not suggesting doing it now, but after we cut stable/ussuri
15:22:20 <njohnston> I was going to suggest that openstacksdk-functional-devstack-networking might be a good candidate
15:22:31 <ralonsoh> right
15:22:43 <ralonsoh> can we wait until V?
15:22:52 <slaweq> ralonsoh: yes :)
15:23:13 <njohnston> and openstack-tox-py38 is another good candidate.  No reason to move quickly though, I agree it would be good to wait until V
15:23:13 <ralonsoh> +1 to the idea
15:23:16 <slaweq> in the next weeks I will prepare some data and proposals about what we can promote
15:23:23 <ralonsoh> agree
15:23:31 <bcafarel> sounds good some of these look interesting
15:23:33 <slaweq> thx
15:23:45 <bcafarel> like neutron-tempest-with-uwsgi (just a quick pick from the list)
15:24:35 <njohnston> neutron-ovn-tempest-slow has had <10% failures for some time it looks like
15:25:01 <slaweq> bcafarel: this one is actually already voting IIRC, we just still haven't merged https://review.opendev.org/#/c/718392/
15:25:36 <bcafarel> oh I thought I had seen it go in, nvm then
15:25:56 <bcafarel> ok I just need to get my eyesight checked, it was just Andreas' +2
15:26:04 <slaweq> :)
15:26:24 <slaweq> I added frickler to it today so hopefully he will check it soon
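As a rough illustration of what promoting a job involves (openstack-tox-py38 is used only because it was named as a candidate above, not because this is the agreed change): a non-voting job carries voting: false in the check queue, and making it voting means dropping that attribute and normally also adding the job to the gate queue:

```yaml
# Sketch of a zuul project pipeline config; the job shown is just one
# of the candidates mentioned above.
- project:
    check:
      jobs:
        - openstack-tox-py38:
            voting: false     # remove this attribute to make the job voting
    gate:
      jobs:
        - openstack-tox-py38  # voting jobs normally also run in the gate queue
```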
15:27:22 <slaweq> ok, I think we can continue with other topics now
15:27:24 <slaweq> #topic fullstack/functional
15:27:44 <slaweq> today I found one new example of a timeout in neutron.tests.functional.agent.linux.test_keepalived.KeepalivedManagerTestCase
15:27:49 <slaweq> https://a1c2986e7388db3a1401-541b7a48fdccc7de277eccd1d7d5bce5.ssl.cf5.rackcdn.com/718690/2/check/neutron-functional/9f51d60/testr_results.html
15:27:56 <slaweq> but this time it's during the creation of an interface
15:28:06 <slaweq> ralonsoh: does it ring a bell for You?
15:28:34 <ralonsoh> this is during the interface creation, not the namespace
15:28:37 <slaweq> isn't that the GIL-related issue which You were fixing recently?
15:28:39 <ralonsoh> but I can take a look at it
15:28:42 <ralonsoh> nope
15:29:28 <slaweq> ok, I thought that maybe it has the same/similar root cause
15:29:36 <slaweq> thx for taking care of it :)
15:29:54 <slaweq> #action ralonsoh to check timeout during interface creation in functional tests
15:30:33 <slaweq> and I also found 2 ovn related issues in functional tests:
15:30:35 <slaweq> https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_abb/717851/2/gate/neutron-functional/abb91cb/testr_results.html
15:30:38 <slaweq> https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_e2c/717083/6/check/neutron-functional/e2c63e4/testr_results.html
15:30:58 <slaweq> maciejjozefczyk: does it ring a bell for You maybe?
15:31:26 <maciejjozefczyk> slaweq, looks like the same story that the timeout change should solve
15:31:40 <slaweq> maciejjozefczyk: do You have a link to the patch?
15:32:13 <maciejjozefczyk> #link https://review.opendev.org/#/c/717704/
15:32:37 <maciejjozefczyk> but it looks like it is still the case, or something similar
15:33:24 <slaweq> maciejjozefczyk: yes, at least one of those failures is from this week
15:33:35 <slaweq> so both probably have this timeout patch already
15:34:28 <maciejjozefczyk> I'm gonna take a look at those two
15:34:45 <slaweq> thx maciejjozefczyk
15:34:54 <maciejjozefczyk> and reopen bug: https://bugs.launchpad.net/neutron/+bug/1868110
15:34:54 <openstack> Launchpad bug 1868110 in neutron "[OVN] neutron.tests.functional.plugins.ml2.drivers.ovn.mech_driver.ovsdb.test_ovn_db_sync.TestOvnNbSyncOverTcp.test_ovn_nb_sync_log randomly fails" [High,Fix released] - Assigned to Maciej Jozefczyk (maciej.jozefczyk)
15:34:59 <slaweq> #action maciejjozefczyk to take a look at ovn related functional test failures
15:35:25 <slaweq> and that's all from me regarding functional/fullstack tests
15:35:32 <slaweq> anything else You want to add?
15:35:54 <ralonsoh> no
15:37:10 <slaweq> so let's move on
15:37:12 <slaweq> #topic Tempest/Scenario
15:37:19 <slaweq> first of all
15:37:24 <slaweq> I proposed patch https://review.opendev.org/#/c/721805/ to enable l3_ha in scenario jobs
15:37:38 <slaweq> IMO we lack L3 HA coverage in our CI
15:37:48 <slaweq> and that would be an easy way to have it covered somehow
15:38:07 <slaweq> even though those are singlenode jobs, it will spawn keepalived and all that stuff for each router
15:38:44 <ralonsoh> we'll catch "functional" problems with keepalived
15:38:50 <ralonsoh> +1
15:38:54 <slaweq> yep
15:39:15 <njohnston> very good
15:39:16 <bcafarel> sounds good, does that reduce coverage on non-l3 ha?
15:39:17 <slaweq> in fact I got this idea when I did a similar patch for the tripleo standalone job yesterday
15:39:31 <slaweq> and I found a new bug with keepalived 2.x with it immediately :)
15:39:35 <bcafarel> (not the code part I know the best)
15:39:53 <slaweq> bcafarel: we still have tempest jobs for legacy, non-ha routers
15:40:13 <bcafarel> ok then no objection at all!
15:40:22 <slaweq> but IMO we should focus more on testing L3 HA, as it's more widely used than legacy routers
15:40:42 <njohnston> agreed, definitely
15:41:01 <slaweq> thx for supporting this :)
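For reference, a minimal sketch of one way a devstack-based scenario job can enable HA routers, assuming the usual devstack_local_conf mechanism (this is illustrative, not necessarily the exact content of https://review.opendev.org/#/c/721805/):

```yaml
# Hedged sketch: enabling HA routers in a devstack-based zuul job.
# post-config entries under $NEUTRON_CONF end up in neutron.conf,
# so every router created by the tests is scheduled as HA.
vars:
  devstack_local_conf:
    post-config:
      $NEUTRON_CONF:
        DEFAULT:
          l3_ha: True
```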
15:41:27 <slaweq> from other things related to scenario jobs, I found one issue recently
15:41:32 <slaweq> in neutron_tempest_plugin.scenario.test_trunk.TrunkTest.test_trunk_subport_lifecycle - timeout while waiting for port to be ACTIVE
15:41:37 <slaweq> https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_d8c/717851/2/check/neutron-ovn-tempest-ovs-release/d8c0282/testr_results.html
15:42:13 <slaweq> does anyone want to take a look at this?
15:42:37 <slaweq> it's in ovn job
15:42:43 <maciejjozefczyk> slaweq, I think Jakub was checking it in d/s
15:43:21 <maciejjozefczyk> slaweq, I'll create a lp and check with Jakub
15:43:27 <slaweq> maciejjozefczyk: thx a lot
15:44:01 <slaweq> #action maciejjozefczyk to report LP regarding failing neutron_tempest_plugin.scenario.test_trunk.TrunkTest.test_trunk_subport_lifecycle in neutron-ovn-tempest-ovs-release job
15:44:21 <slaweq> ok, and that's all I have for today
15:44:29 <slaweq> periodic jobs are working fine
15:44:41 <slaweq> anything else You want to discuss today?
15:46:02 <slaweq> if not, I think we can finish a bit earlier today
15:46:07 <slaweq> thx for attending
15:46:09 <slaweq> o/
15:46:11 <bcafarel> o/
15:46:13 <slaweq> #endmeeting