16:00:17 <slaweq> #startmeeting neutron_ci
16:00:18 <openstack> Meeting started Tue Jan  7 16:00:17 2020 UTC and is due to finish in 60 minutes.  The chair is slaweq. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:19 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:20 <slaweq> hi
16:00:21 <openstack> The meeting name has been set to 'neutron_ci'
16:00:24 <njohnston> o/
16:01:13 <ralonsoh> hi
16:01:30 <bcafarel> o/
16:02:28 <slaweq> ok, let's start as we have our regular attendees already
16:02:35 <slaweq> Grafana dashboard: http://grafana.openstack.org/dashboard/db/neutron-failure-rate
16:02:51 <slaweq> #topic Actions from previous meetings
16:03:13 <slaweq> actually this time those are "actions from previous year" even :)
16:03:22 <slaweq> and the first one is
16:03:24 <slaweq> njohnston to check failing NetworkMigrationFromHA in multinode dvr job
16:04:55 <slaweq> I think we lost njohnston now
16:05:41 <njohnston> sorry, network hiccup.  I'll work on this later today
16:06:16 <slaweq> ok :)
16:06:24 <slaweq> so I will set it as reminder for next week too
16:06:28 <slaweq> #action njohnston to check failing NetworkMigrationFromHA in multinode dvr job
16:06:45 <bcafarel> as long as it is not for next year :)
16:07:33 <slaweq> bcafarel: LOL, this time no
16:07:36 <slaweq> only for next week
16:07:45 <slaweq> next one
16:07:47 <slaweq> slaweq to talk with gmann about finish of tempest plugins migration
16:07:54 <slaweq> I talked with gmann about it today
16:08:05 <slaweq> so he is aware of that already
16:08:35 <slaweq> and that's all actions from last meeting on my list
16:08:41 <slaweq> let's move on
16:08:43 <slaweq> #topic Stadium projects
16:08:53 <slaweq> njohnston: any updates about dropping py2 support?
16:09:03 <njohnston> I took a look at the outstanding work
16:09:31 <njohnston> We have work proceeding on networking-odl, networking-bagpipe, networking-midonet, and neutron-fwaas
16:09:41 <njohnston> although I am not sure the author of the fwaas change is engaged anymore
16:09:51 <njohnston> midonet in particular is close to a merge
16:10:02 <njohnston> so the stadium is in a good position
16:10:12 <slaweq> that's good
16:10:13 <njohnston> the only remaining ones are neutron-lib and neutron proper
16:10:31 <njohnston> I will file changes for those after the meeting
16:10:33 <slaweq> I can take care of those
16:10:36 <slaweq> ahh, ok
16:10:38 <slaweq> thx
16:10:44 <bcafarel> so good progress overall then?
16:10:58 <njohnston> there is one existing for neutron-lib but it includes things like removing six and other 2.7-hostile changes, which I want to separate out
16:11:16 <slaweq> yes, I remember that one
16:11:47 <njohnston> so I think I will just edit that change
16:11:58 <slaweq> +1 for that
16:11:58 <njohnston> bcafarel: Yes, definitely good progress
16:12:23 <slaweq> thx njohnston for the update
16:12:28 <slaweq> that's really great work
16:13:07 <njohnston> as a team we are getting good at pulling together for these community-wide efforts :-)
16:13:58 <slaweq> anything else related to the stadium projects You want to discuss today?
16:14:38 <njohnston> For the zuulv3 migration, the standouts are networking-bgpvpn, midonet, neutron-dynamic-routing, and neutron-vpnaas
16:14:52 <njohnston> those are the ones that still need work, otherwise everything is covered
16:15:45 <slaweq> ok, I will try to finally find some time to work on at least some of them
16:16:11 <slaweq> ok, next topic
16:16:21 <slaweq> #topic Grafana
16:16:30 <slaweq> #link http://grafana.openstack.org/dashboard/db/neutron-failure-rate
16:17:35 <slaweq> we don't have much data from the last few days, especially from the gate queue
16:17:54 <ralonsoh> yeah, just waiting for https://review.opendev.org/#/c/701018/
16:18:05 <slaweq> in the check queue, the biggest problem IMO is these neutron-tempest-plugin failures
16:18:28 <slaweq> ralonsoh: exactly
16:18:33 <njohnston> yes, definitely, those are killer
16:19:14 <ralonsoh> I would delay any CI errors analysis until we have this patch
16:19:27 <ralonsoh> (but I'm very lazy too)
16:19:40 <bcafarel> laziness can be good :)
16:19:46 <slaweq> ralonsoh: this bug affects "only" neutron-tempest-plugin jobs
16:20:09 <slaweq> and in fact other jobs are in pretty good shape (at least not worse than usual)
16:20:22 <njohnston> ralonsoh: Her, laziness is #1 on Larry Wall's three virtues of a great programmer
16:20:26 <njohnston> *hey
16:20:35 <ralonsoh> hahah
16:20:35 <slaweq> :)
16:21:34 <slaweq> anything else related to the grafana?
16:21:41 <slaweq> or can we move on to the next topic?
16:22:25 <ralonsoh> ok for me
16:22:32 <njohnston> +1 to move on
16:22:37 <slaweq> ok
16:22:39 <slaweq> #topic fullstack/functional
16:22:48 <slaweq> I was looking today at recent failures
16:23:02 <slaweq> and I found 3 different failed tests which we may check now
16:23:05 <slaweq> first one:
16:23:11 <slaweq> neutron.tests.functional.agent.test_firewall.FirewallTestCase.test_ingress_udp_rule
16:23:17 <slaweq> https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_f37/697317/8/check/neutron-functional/f37a608/testr_results.html.gz
16:25:38 <slaweq> did You ever see such a failure before?
16:25:56 <ralonsoh> not very usual, do you want to implement some kind of retrial context?
16:26:26 <njohnston> that is new and different to me
16:26:37 <ralonsoh> but, IMO, the connection handler under the hood should be enough
16:28:31 <slaweq> ok, let's just be aware of it for now
16:28:55 <slaweq> maybe it will not happen often (or even will not happen at all anymore :))
16:30:29 <slaweq> ok, next one
16:30:54 <slaweq> those are 2 failures of different tests, but error looks kind of similar IMO
16:30:56 <slaweq> https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_ea8/677092/11/check/neutron-functional/ea8fe46/testr_results.html.gz
16:31:02 <slaweq> and https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_19d/701003/1/check/neutron-functional/19d8a0a/testr_results.html.gz
16:31:31 <slaweq> one is timeout during namespace creation (again)
16:31:33 <slaweq> and second is timeout during bridge creation
16:31:47 <slaweq> but it's linuxbridge
16:32:04 <slaweq> so IMHO netlink is common part of both of those issues
16:33:38 <slaweq> did You see similar issues before?
16:33:45 <ralonsoh> yes, many times
16:33:51 <ralonsoh> at least with the ns creation
16:34:07 <ralonsoh> not so many, but some of them, related to the br creation
16:35:02 <slaweq> ok, so I think we should open bug for that, right?
16:35:24 <ralonsoh> yes, I'll do it
16:35:34 <ralonsoh> at least the one related to the bridge
16:35:53 <ralonsoh> the problem is I have spent days with those privsep/netlink errors
16:36:05 <ralonsoh> and I don't see the cause
16:36:18 <slaweq> ralonsoh: I know You were debugging and trying to fix it
16:36:30 <slaweq> it's not so frequent now but still happens from time to time
16:37:50 <slaweq> ralonsoh: thx for reporting this on LP
16:38:05 <slaweq> #action ralonsoh to report bug for timeout related to bridge creation
16:38:35 <slaweq> ok, that's all from my side about functional/fullstack jobs
16:38:41 <slaweq> anything else You want to add here?
16:38:46 <ralonsoh> no
16:39:59 <bcafarel> me neither
16:40:02 <slaweq> ok, let's move on
16:40:12 <slaweq> #topic Tempest/Scenario
16:40:31 <slaweq> we already talked about issue with neutron-tempest-plugin jobs
16:40:43 <slaweq> I also reported today new bug for dvr jobs: https://bugs.launchpad.net/neutron/+bug/1858642
16:40:43 <openstack> Launchpad bug 1858642 in neutron "paramiko.ssh_exception.NoValidConnectionsError error cause dvr scenario jobs failing" [High,Confirmed]
16:40:58 <slaweq> it happens quite often that tests fail due to such an error
16:41:18 <slaweq> and I saw it only in dvr jobs (which are non-voting fortunately)
16:41:32 <slaweq> but if someone would have some time, it would be good to take a look
16:42:37 <slaweq> next problem is that neutron-tempest-plugin-fwaas job is unstable
16:42:48 <slaweq> I reported it here: https://bugs.launchpad.net/neutron/+bug/1858645
16:42:48 <openstack> Launchpad bug 1858645 in neutron "Neutron-fwaas tempest job is not stable" [Critical,Confirmed]
16:43:06 <slaweq> and I proposed a patch to make those jobs non-voting and remove them from the gate queue for now
16:43:12 <slaweq> https://review.opendev.org/#/c/701371/
16:43:30 <slaweq> if someone interested in neutron-fwaas could take a look, that would be great
16:44:03 <njohnston> will do
16:44:17 <slaweq> thx njohnston
16:44:18 <njohnston> despite my lack of interest in neutron-fwaas :-)
16:44:32 <slaweq> :)
16:44:49 <slaweq> and that's all I have for scenario jobs for today
16:46:08 <slaweq> anything else related to scenario jobs You have or can we move on?
16:47:06 <njohnston> +1 to move on
16:47:57 <slaweq> ok, last topic for today from me
16:47:58 <slaweq> #topic Periodic
16:48:25 <slaweq> our mariadb job is still failing 100% of the time due to an issue in MariaDB 10.1
16:48:45 <slaweq> is there anyone who has some cycles to check how to upgrade mariadb in this job?
16:48:54 <ralonsoh> https://review.opendev.org/#/c/698980/1/tools/fixup_stuff.sh
16:49:01 <ralonsoh> I think this patch won't fix it
16:49:12 <ralonsoh> because fixup_ubuntu is not being called
16:49:34 <ralonsoh> I'll try again this week
16:49:48 <slaweq> yes, I think it's called only in functional/fullstack jobs
16:50:24 <slaweq> ok, thx ralonsoh
16:50:32 <slaweq> if You would need any help with this, please ping me
16:50:45 <slaweq> #action ralonsoh to take a look at how to use a newer MariaDB in the periodic job
16:50:57 <ralonsoh> thanks
16:51:00 <slaweq> ok, that's all from my side for today
16:51:15 <slaweq> anything else You want to discuss today?
16:51:20 <ralonsoh> no thanks
16:51:34 <njohnston> nothing more from me
16:51:35 <bcafarel> nothing here (except happy new ci year)
16:51:53 <slaweq> happy new CI year :)
16:51:56 <slaweq> ahh, one thing
16:52:18 <slaweq> I just pushed my blog post to be online: http://kaplonski.pl/blog/failed_builds_per_patch/
16:52:28 <slaweq> it's about number of "rechecks" in our CI
16:52:37 <njohnston> +1
16:52:54 * bcafarel adds to the "to read" tabs
16:53:02 <slaweq> if You have a few minutes, please take a look and tell me if that makes any sense to You and if such a metric may be useful for us somehow
16:53:02 <ralonsoh> +1
16:53:13 <slaweq> and thx njohnston for help with english in this post :)
16:54:19 <njohnston> my pleasure, your English is already very good :-)
16:54:26 <slaweq> thx njohnston :)
16:54:37 <slaweq> ok, I think we can finish our meeting now
16:54:41 <slaweq> thx for attending
16:54:45 <slaweq> and see You all online
16:54:47 <slaweq> o/
16:54:50 <slaweq> #endmeeting