15:00:42 <slaweq> #startmeeting neutron_ci
15:00:44 <openstack> Meeting started Tue Feb 23 15:00:42 2021 UTC and is due to finish in 60 minutes.  The chair is slaweq. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:45 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:46 <slaweq> hi
15:00:47 <openstack> The meeting name has been set to 'neutron_ci'
15:00:55 <lajoskatona> o/
15:01:29 <ralonsoh> hi
15:01:56 <slaweq> I think we can start quickly
15:02:01 <slaweq> Grafana dashboard: http://grafana.openstack.org/dashboard/db/neutron-failure-rate
15:02:06 <slaweq> please open now and we will move on
15:02:12 <slaweq> #topic Actions from previous meetings
15:02:23 <slaweq> ralonsoh to check problem with (not) deleted pid files in functional tests
15:02:37 <ralonsoh> I pushed a patch to have more info
15:02:50 <ralonsoh> but I didn't find the problem
15:03:07 <slaweq> ok, so You are still waiting for a new occurrence of that issue, right?
15:03:13 <ralonsoh> it's still there, I think because the process that created the file is still running
15:03:22 <ralonsoh> yes, kind of
15:03:29 <slaweq> ok
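For context on the failure mode above, a minimal self-contained sketch (not Neutron's real test code; the class and helper names here are made up) of the cleanup pattern such tests need: stop the helper process first and only then remove its pid file. If only the file removal runs while the process keeps running, you end up with the leftover described above.

    import os
    import signal
    import subprocess

    import fixtures
    import testtools


    class PidFileCleanupExample(testtools.TestCase):
        """Illustrative only: stop the helper before removing its pid file."""

        def _spawn_helper(self):
            pid_dir = self.useFixture(fixtures.TempDir()).path
            proc = subprocess.Popen(['sleep', '300'])
            pid_file = os.path.join(pid_dir, 'helper.pid')
            with open(pid_file, 'w') as f:
                f.write(str(proc.pid))
            self.addCleanup(self._stop_helper, proc, pid_file)
            return proc, pid_file

        def _stop_helper(self, proc, pid_file):
            # Stop the process first; removing only the pid file would leave
            # a live PID behind, which is the situation described above.
            if proc.poll() is None:
                proc.send_signal(signal.SIGTERM)
                proc.wait(timeout=10)
            if os.path.exists(pid_file):
                os.unlink(pid_file)

        def test_helper_is_cleaned_up(self):
            proc, pid_file = self._spawn_helper()
            self.assertIsNone(proc.poll())
            self.assertTrue(os.path.exists(pid_file))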
15:03:55 <slaweq> ok, next one
15:03:57 <slaweq> ralonsoh to check failing periodic task in functional test
15:04:16 <ralonsoh> no sorry, I didn't start this one
15:04:26 <slaweq> can I assign it to You for next week?
15:04:32 <slaweq> or do You want me to take it?
15:04:37 <ralonsoh> I don't think I'll have time
15:04:43 <slaweq> ok
15:04:54 <slaweq> #action slaweq to check failing periodic task in functional test
15:05:02 <slaweq> next one
15:05:03 <slaweq> bcafarel to check why functional tests logs files are empty
15:05:11 <slaweq> IIRC he sent a patch for that
15:06:10 <slaweq> https://review.opendev.org/c/openstack/neutron/+/774865
15:06:12 <slaweq> :)
15:06:14 <slaweq> it's even merged
15:07:08 <slaweq> next one
15:07:10 <slaweq> slaweq to check ip allocation failures in fullstack tests
15:07:24 <slaweq> I checked it and it was "just" the oom-killer which killed the mysql server
15:07:45 <slaweq> we have this from time to time
15:08:07 <slaweq> and I don't know how we can improve that job so we don't hit such issues anymore
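As an aside, a quick generic way to confirm an OOM kill from the logs a job collects (a sketch only; the log path and the match pattern are assumptions, not the exact files the fullstack job archives):

    import re
    import sys

    # Usage: python find_oom.py <path-to-collected-syslog>
    OOM_PATTERN = re.compile(r'out of memory|oom-kill|killed process',
                             re.IGNORECASE)

    with open(sys.argv[1], errors='replace') as syslog:
        for line in syslog:
            if OOM_PATTERN.search(line):
                print(line.rstrip())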
15:09:52 <slaweq> next one
15:09:54 <slaweq> lajoskatona to check subnet in use errors in scenario jobs
15:10:07 <lajoskatona> Yeah I checked
15:10:38 <lajoskatona> In the case that I checked deeply, it seems that the cleanup fails because nova can't even finish the plug (with os-vif)
15:11:02 <lajoskatona> as I remember, the plug in that case took more than 15 minutes, and that caused the failure
15:11:13 <slaweq> ouch
15:11:18 <slaweq> pretty long
15:11:28 <slaweq> especially to plug vif
15:11:44 <lajoskatona> I don't know what can cause such a thing
15:12:31 <slaweq> ok, we will need to investigate, maybe with the nova people, when it happens again
15:12:51 <lajoskatona> yeah sure
15:13:01 <slaweq> thx lajoskatona
15:13:14 <slaweq> that was the last action item on today's list
15:13:18 <slaweq> so we can move on
15:13:20 <slaweq> #topic Stadium projects
15:13:31 <slaweq> any issues with stadium ci?
15:14:27 <lajoskatona> nothing with master, at least for the ones I check
15:14:42 <lajoskatona> I struggle with the stable branches for odl/bagpipe/bgpvpn
15:15:05 <lajoskatona> elod is helping me with his experience, and thanks everybody for the reviews
15:15:34 <lajoskatona> to tell the truth it's a low-activity thing for me, so I don't always work on these with full attention :P
15:16:11 <slaweq> lajoskatona: but also those projects don't have many open patches, so we are not blocking anyone (too much) :)
15:17:05 <lajoskatona> yeah that's true :-)
15:17:46 <slaweq> ok, please ping me if You need any review there
15:17:54 <slaweq> next topic
15:17:56 <slaweq> #topic Stable branches
15:17:57 <lajoskatona> ok thanks
15:18:02 <slaweq> Victoria dashboard: https://grafana.opendev.org/d/HUCHup2Gz/neutron-failure-rate-previous-stable-release?orgId=1
15:18:03 <slaweq> Ussuri dashboard: https://grafana.opendev.org/d/smqHXphMk/neutron-failure-rate-older-stable-release?orgId=1
15:19:11 <slaweq> bcafarel is not here today but FWIW I think that, except for stable/stein, things are going pretty ok on the stable branches
15:19:20 <slaweq> I didn't see any major issues recently
15:20:23 <slaweq> for stable/stein we have this bug https://bugs.launchpad.net/neutron/+bug/1916041 which I will try to check this week
15:20:25 <openstack> Launchpad bug 1916041 in neutron "tempest-slow-py3 ipv6 gate job fails on stable/stein" [Critical,Confirmed] - Assigned to Slawek Kaplonski (slaweq)
15:21:13 <slaweq> I think we can move on
15:21:15 <slaweq> #topic Grafana
15:21:20 <slaweq> http://grafana.openstack.org/dashboard/db/neutron-failure-rate
15:23:41 <slaweq> I don't see any major issues there
15:23:47 <slaweq> things have been working pretty ok recently
15:24:29 <ralonsoh> fingers crossed
15:24:55 <slaweq> so let's go quickly through some failures which I found
15:25:03 <slaweq> #topic fullstack/functional
15:25:18 <slaweq> we are still seeing some timed-out jobs, e.g.:
15:25:22 <slaweq> https://0102a3b5a62ee8d841df-584a141890d4c711c7adef07d7bdf32d.ssl.cf5.rackcdn.com/773281/5/check/neutron-fullstack-with-uwsgi/3294c2d/job-output.txt
15:25:23 <slaweq> https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_f90/777015/1/check/neutron-fullstack-with-uwsgi/f907071/job-output.txt
15:25:46 <slaweq> the problem is always similar: the job basically hangs at some point and e.g. an hour later it is killed by zuul
15:25:59 <slaweq> it happens for both fullstack and functional jobs
15:26:16 <slaweq> I think it is similar to what we had in the past with UT jobs when we moved to stestr
15:26:33 <slaweq> and it was caused by too many logs produced by our tests
15:27:00 <slaweq> if anyone would have some time to look at it, that would be great :)
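To make the hypothesis concrete, a minimal sketch of one way to cut the amount of log output a test captures, assuming the "too many captured logs" cause holds: raise the capture level in a shared base class so the per-test log attachment in the subunit stream stays small. The class names below are illustrative, not Neutron's real base classes.

    import logging

    import fixtures
    import testtools


    class QuietLogTestCase(testtools.TestCase):
        """Illustrative base class, not Neutron's real one."""

        def setUp(self):
            super().setUp()
            # fixtures.FakeLogger swaps out the root logger's handlers for the
            # duration of the test and sets its level, so DEBUG noise from the
            # code under test never reaches the captured output.
            self.useFixture(fixtures.FakeLogger(level=logging.WARNING))


    class ExampleTest(QuietLogTestCase):
        def test_debug_noise_is_dropped(self):
            log = logging.getLogger(__name__)
            log.debug('this line is not captured')
            log.warning('this line is captured')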
15:27:51 <slaweq> and basically that's all I have for today
15:28:03 <slaweq> I didn't see any other failures worth discussing here
15:28:14 <slaweq> one last thing from me for today
15:28:34 <slaweq> ralonsoh: please check https://review.opendev.org/c/openstack/neutron/+/774626 - it will hopefully improve our ci as not all jobs will run on every patch
15:28:44 <slaweq> so there's less chance of random failures :)
15:28:49 <ralonsoh> sure
15:29:01 <slaweq> thx
15:29:40 <slaweq> that's all from me for today
15:30:20 <slaweq> do You have anything else to talk about today?
15:30:30 <ralonsoh> no thanks
15:31:14 <slaweq> ok, so I will give You 30 minutes back :)
15:31:18 <slaweq> thx for attending the meeting
15:31:19 <slaweq> o/
15:31:22 <lajoskatona> Bye
15:31:24 <slaweq> #endmeeting