15:00:42 #startmeeting neutron_ci
15:00:44 Meeting started Tue Feb 23 15:00:42 2021 UTC and is due to finish in 60 minutes. The chair is slaweq. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:45 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:46 hi
15:00:47 The meeting name has been set to 'neutron_ci'
15:00:55 o/
15:01:29 hi
15:01:56 I think we can start quickly
15:02:01 Grafana dashboard: http://grafana.openstack.org/dashboard/db/neutron-failure-rate
15:02:06 please open it now and we will move on
15:02:12 #topic Actions from previous meetings
15:02:23 ralonsoh to check problem with (not) deleted pid files in functional tests
15:02:37 I pushed a patch to have more info
15:02:50 but I didn't find the problem
15:03:07 ok, so You are still waiting for a new occurrence of that issue, right?
15:03:13 it is still there, I think because the PID that created the file is still running
15:03:22 yes, kind of
15:03:29 ok
15:03:55 ok, next one
15:03:57 ralonsoh to check failing periodic task in functional tests
15:04:16 no, sorry, I didn't start this one
15:04:26 can I assign it to You for next week?
15:04:32 or do You want me to take it?
15:04:37 I don't think I'll have time
15:04:43 ok
15:04:54 #action slaweq to check failing periodic task in functional tests
15:05:02 next one
15:05:03 bcafarel to check why functional tests log files are empty
15:05:11 IIRC he sent a patch for that
15:06:10 https://review.opendev.org/c/openstack/neutron/+/774865
15:06:12 :)
15:06:14 it's even merged
15:07:08 next one
15:07:10 slaweq to check IP allocation failures in fullstack tests
15:07:24 I checked it and it was "just" the oom-killer which killed the mysql server
15:07:45 we have this from time to time
15:08:07 and I don't know how we can improve that job so it doesn't hit such issues anymore
15:09:52 next one
15:09:54 lajoskatona to check subnet-in-use errors in scenario jobs
15:10:07 Yeah, I checked
15:10:38 In the case I checked more deeply, it seems that the cleanup fails because nova can't even finish the plug (with os-vif)
15:11:02 as I remember, the plug in that case took more than 15 minutes, and that caused the problem
15:11:13 ouch
15:11:18 pretty long
15:11:28 especially to plug a vif
15:11:44 I don't know what can cause such a thing
15:12:31 ok, we will need to investigate, maybe with the nova people, when it happens again
15:12:51 yeah, sure
15:13:01 thx lajoskatona
15:13:14 that was the last action item on today's list
15:13:18 so we can move on
15:13:20 #topic Stadium projects
15:13:31 any issues with stadium ci?
15:14:27 with master, nothing, at least for the ones I check
15:14:42 I struggle with the stable branches for odl/bagpipe/bgpvpn
15:15:05 elod helps me with his experience, and thanks everybody for the reviews
15:15:34 to tell the truth it's a low-activity thing for me, so I don't always work on these with full attention :P
15:16:11 lajoskatona: but also those projects don't have many patches open, so we are not blocking anyone (too much) :)
15:17:05 yeah, that's true :-)
15:17:46 ok, please ping me if You need any review there
15:17:54 next topic
15:17:56 #topic Stable branches
15:17:57 ok, thanks
15:18:02 Victoria dashboard: https://grafana.opendev.org/d/HUCHup2Gz/neutron-failure-rate-previous-stable-release?orgId=1
15:18:03 Ussuri dashboard: https://grafana.opendev.org/d/smqHXphMk/neutron-failure-rate-older-stable-release?orgId=1
15:19:11 bcafarel is not here today, but FWIW I think for all branches except stable/stein things are going pretty ok
15:19:20 I didn't see any major issues recently
15:20:23 for stable/stein we have this bug https://bugs.launchpad.net/neutron/+bug/1916041 which I will try to check this week
15:20:25 Launchpad bug 1916041 in neutron "tempest-slow-py3 ipv6 gate job fails on stable/stein" [Critical,Confirmed] - Assigned to Slawek Kaplonski (slaweq)
15:21:13 I think we can move on
15:21:15 #topic Grafana
15:21:20 http://grafana.openstack.org/dashboard/db/neutron-failure-rate
15:23:41 I don't see any major issues there
15:23:47 things have been working pretty ok recently
15:24:29 fingers crossed
15:24:55 so let's go quickly through some failures which I found
15:25:03 #topic fullstack/functional
15:25:18 we are still seeing some timed-out jobs, e.g.:
15:25:22 https://0102a3b5a62ee8d841df-584a141890d4c711c7adef07d7bdf32d.ssl.cf5.rackcdn.com/773281/5/check/neutron-fullstack-with-uwsgi/3294c2d/job-output.txt
15:25:23 https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_f90/777015/1/check/neutron-fullstack-with-uwsgi/f907071/job-output.txt
15:25:46 the problem is always similar: the job basically hangs at some point and e.g. 1h later it is killed by zuul
15:25:59 it happens for both fullstack and functional jobs
15:26:16 I think it is similar to what we had in the past with the UT jobs when we moved to stestr
15:26:33 and it was caused by too much log output produced by our tests
15:27:00 if anyone has some time to look at it, that would be great :)
15:27:51 and basically that's all I have for today
15:28:03 I didn't see any other failures worth discussing here
15:28:14 one last thing from me for today
15:28:34 ralonsoh: please check https://review.opendev.org/c/openstack/neutron/+/774626 - it will hopefully improve our ci as not all jobs will run on every patch
15:28:44 so less chance of random failures :)
15:28:49 sure
15:29:01 thx
15:29:40 that's all from me for today
15:30:20 do You have anything else to talk about today?
15:30:30 no, thanks
15:31:14 ok, so I will give You 30 minutes back :)
15:31:18 thx for attending the meeting
15:31:19 o/
15:31:22 Bye
15:31:24 #endmeeting
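A minimal sketch of the kind of check behind the oom-killer diagnosis mentioned at 15:07:24: it scans a syslog file downloaded from the job logs for OOM-kill markers. The file path, script name, and exact kernel message wording are assumptions (the phrasing varies between kernel versions), so treat the patterns as a best-effort illustration rather than the actual triage procedure used.

```python
#!/usr/bin/env python3
"""Sketch: look for oom-killer activity in a downloaded job syslog.

Assumes the syslog was saved locally (default name "syslog.txt" is
hypothetical); the regexes cover common kernel OOM messages but are
not exhaustive.
"""
import re
import sys

# Hypothetical local copy of the job's syslog; pass the real path as argv[1].
SYSLOG_PATH = sys.argv[1] if len(sys.argv) > 1 else "syslog.txt"

# Common markers of an OOM kill; wording differs between kernel versions.
PATTERNS = [
    re.compile(r"invoked oom-killer", re.IGNORECASE),
    re.compile(r"out of memory: kill(ed)? process \d+ \(\S+\)", re.IGNORECASE),
]


def main() -> None:
    hits = []
    with open(SYSLOG_PATH, errors="replace") as log:
        for lineno, line in enumerate(log, start=1):
            if any(p.search(line) for p in PATTERNS):
                hits.append((lineno, line.rstrip()))

    if not hits:
        print("no oom-killer activity found")
        return
    for lineno, line in hits:
        print(f"{lineno}: {line}")


if __name__ == "__main__":
    main()
```

If the output names mysqld, that matches the fullstack failure mode described above, where the database is killed mid-run and subsequent IP allocations fail.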