15:00:37 <slaweq> #startmeeting neutron_ci
15:00:38 <openstack> Meeting started Tue Mar 9 15:00:37 2021 UTC and is due to finish in 60 minutes. The chair is slaweq. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:39 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:41 <openstack> The meeting name has been set to 'neutron_ci'
15:01:12 <lajoskatona> Hi
15:01:52 <ralonsoh> hi
15:01:56 <bcafarel> hey again
15:02:43 <slaweq> ok, let's start
15:02:50 <slaweq> Grafana dashboard: http://grafana.openstack.org/dashboard/db/neutron-failure-rate
15:03:15 <slaweq> #topic Actions from previous meetings
15:03:21 <slaweq> slaweq to check failing qos migration tests in train neutron-tempest-dvr-ha-multinode-full job
15:03:26 <slaweq> Bug reported for nova for now https://bugs.launchpad.net/nova/+bug/1917610
15:03:27 <openstack> Launchpad bug 1917610 in neutron "Migration and resize tests from tempest.scenario.test_minbw_allocation_placement.MinBwAllocationPlacementTest failing in neutron-tempest-dvr-ha-multinode-full" [Critical,Fix released]
15:03:28 <slaweq> Fixed in tempest https://review.opendev.org/c/openstack/tempest/+/778451
15:03:31 <slaweq> thx gibi for help with it :)
15:03:44 <slaweq> next one
15:03:50 <slaweq> ralonsoh to try to check how to limit number of logged lines in FT output
15:04:13 <ralonsoh> still checking this one, no progress yet
15:04:15 <ralonsoh> sorry
15:04:20 <slaweq> sure, np
15:04:32 <slaweq> can I assign it to You for next week?
15:05:13 <ralonsoh> sure
15:05:18 <slaweq> #action ralonsoh to try to check how to limit number of logged lines in FT output
15:05:20 <slaweq> thx
15:05:27 <slaweq> next one
15:05:29 <slaweq> ralonsoh to report bug with ip operations timeout in FT
15:05:36 <ralonsoh> one sec...
15:05:54 <ralonsoh> one patch: https://review.opendev.org/c/openstack/neutron/+/778735
15:06:01 <ralonsoh> LP: https://launchpad.net/bugs/1917487
15:06:02 <openstack> Launchpad bug 1917487 in neutron "[FT] "IpNetnsCommand.add" command fails frequently " [Critical,New] - Assigned to Rodolfo Alonso (rodolfo-alonso-hernandez)
15:06:32 <ralonsoh> still working on 2) timeouts during the sysctl command execution
15:07:18 <slaweq> thx for that
15:07:31 <slaweq> I hope that with https://review.opendev.org/c/openstack/neutron/+/778735 functional tests will be a bit more stable
15:07:41 <bcafarel> that would be nice
15:08:53 <slaweq> next one
15:08:54 <slaweq> bcafarel to check failing fedora based periodic job
15:09:17 <bcafarel> so we had in fact a LP for that https://bugs.launchpad.net/neutron/+bug/1911128
15:09:18 <openstack> Launchpad bug 1911128 in neutron "Neutron with ovn driver failed to start on Fedora" [Critical,In progress] - Assigned to Bernard Cafarelli (bcafarel)
15:09:47 <bcafarel> I think the main issue is that ovs daemons do not run as root in Fedora, and so can not read the TLS certs (owned by stack)
15:10:09 <bcafarel> I am testing this in https://review.opendev.org/c/openstack/neutron/+/779494 (could have had results if I had modified the correct job on first try...)
15:10:35 <bcafarel> if it passes, it sounds like a good fix, we can have fedora+tls support added later, what do you think?
15:10:52 <bcafarel> oh actually it passed zuul
15:11:00 <slaweq> that would be ok as workaround at least IMO
15:11:05 <slaweq> yes, it's green now
15:11:40 <bcafarel> yes, for proper support I am not sure how it would go in devstack, as "chmod 777" on the certs is not really a nice fix :)
15:12:05 <slaweq> yes, but I think that it's perfectly valid to test it without ssl in that job
15:12:21 <slaweq> we don't really want to test ovs on fedora in that job
15:12:23 <slaweq> but neutron
15:12:27 <slaweq> :)
15:12:29 <bcafarel> +1
15:12:45 <slaweq> if ralonsoh and lajoskatona are ok with that, I'm ok too
15:12:49 <ralonsoh> +1
15:12:58 <lajoskatona> +1
15:13:03 <bcafarel> ok I will remove "wip" flag and then periodic can go back to green
15:13:29 <lajoskatona> 1 less periodic failure mail then?
15:13:34 <slaweq> ++
15:13:36 <bcafarel> cross fingers :)
15:13:38 <slaweq> thx a lot
15:13:56 <slaweq> lajoskatona: do You get emails about periodic jobs results?
15:15:07 <lajoskatona> yes, but recently too many
15:15:15 <slaweq> how to configure that?
15:15:21 <slaweq> I don't get such emails :/
15:15:28 <lajoskatona> if there are only a few networking related ones I check them
15:15:36 <lajoskatona> I will check it for you
15:15:42 <slaweq> thx
15:16:14 <slaweq> ok, let's move on
15:16:21 <slaweq> #topic Stadium projects
15:16:30 <slaweq> anything related to stadium CI?
15:17:00 <lajoskatona> I think this is where you can subscribe: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-stable-maint
15:17:45 <lajoskatona> For stadiums: some patches are moving to the stable branches, nothing serious
15:17:55 <slaweq> thx lajoskatona
15:20:06 <slaweq> ok, thx
15:20:13 <slaweq> #topic Stable branches
15:20:27 <slaweq> bcafarel: except the recent pip issue, anything else worth mentioning?
15:21:02 <bcafarel> rest is mostly OK, I saw more grenade failures/timeouts than usual recently but not too bad (yet)
15:21:11 <slaweq> k
15:21:16 <bcafarel> and thanks slaweq for all the CI improvement backports, they should help stable branches too!
15:21:33 <slaweq> yes, I made some but only up to train
15:21:40 <slaweq> in older branches we have many legacy jobs
15:21:50 <slaweq> and it would be too much to backport those things
15:22:55 <bcafarel> sounds good, for older EM branches if jobs get problematic, we can limit them
15:23:12 <slaweq> yeah
15:23:13 <bcafarel> and though stein needs some rechecks from time to time, rocky and queens are quite stable these days
15:24:33 <slaweq> let's move on
15:24:35 <slaweq> #topic Grafana
15:24:40 <slaweq> grafana.openstack.org/dashboard/db/neutron-failure-rate
15:25:15 <slaweq> overall I think that things are pretty ok now
15:25:25 <slaweq> still functional/fullstack jobs are failing most
15:25:37 <slaweq> but they also went down a bit since last week
15:25:57 <slaweq> maybe it's due to mock of the ovn maintenance task there
15:26:22 <slaweq> do You have anything regarding grafana dashboards for today?
15:28:08 <slaweq> ok, so let's talk about functional jobs then
15:28:10 <slaweq> #topic fullstack/functional
15:28:17 <slaweq> I have a few things there
15:28:33 <slaweq> first one is an interesting (for me) issue
15:28:50 <slaweq> I proposed some time ago a patch to limit the number of test workers in the functional job
15:28:59 <slaweq> https://review.opendev.org/c/openstack/neutron/+/778151
15:29:09 <slaweq> and now I see that this job is failing
15:29:23 <slaweq> and many tests are failing due to "too many opened files" error
15:29:29 <slaweq> https://0bf054d7c7210f57ced8-38841c8dd9732a175234859ce574a8ea.ssl.cf5.rackcdn.com/778151/3/check/neutron-functional-with-uwsgi/6757358/testr_results.html
15:29:38 <slaweq> I have no idea why it is like that
15:29:45 <ralonsoh> should be the opposite...
15:29:48 <slaweq> do You maybe have any clues?
15:29:51 <slaweq> ralonsoh: exactly :)
15:30:09 <slaweq> but it's repeatable
15:30:17 <slaweq> I rechecked a few times and had the same problem
15:30:37 <lajoskatona> but only with zuul?
15:30:45 <lajoskatona> I have never seen it locally
15:31:02 <slaweq> I didn't try to run all functional tests locally
15:32:48 <ralonsoh> I need to review that, I have no idea why this is happening
15:32:55 <slaweq> so, any help with that is more than welcome :)
15:33:02 <slaweq> thx ralonsoh
15:34:13 <slaweq> I also reported 2 new bugs
15:34:15 <slaweq> https://bugs.launchpad.net/neutron/+bug/1917487
15:34:16 <openstack> Launchpad bug 1917487 in neutron "[FT] "IpNetnsCommand.add" command fails frequently " [Critical,New] - Assigned to Rodolfo Alonso (rodolfo-alonso-hernandez)
15:34:18 <slaweq> sorry
15:34:24 <slaweq> this one was reported by ralonsoh :)
15:34:35 <slaweq> I just found a new occurrence of that issue this week
15:34:37 <slaweq> :)
15:34:50 <slaweq> but I opened a new bug https://bugs.launchpad.net/neutron/+bug/1918266
15:34:50 <openstack> Launchpad bug 1918266 in neutron "Functional test test_gateway_chassis_rebalance failing due to "failed to bind logical router"" [High,Confirmed]
15:35:03 <slaweq> any volunteer to check that?
15:35:20 <slaweq> if not, I will ask jlibosva or otherwiseguy if they have some cycles to look
15:35:28 <jlibosva> o/
15:35:33 <ralonsoh> sorry, not this week, I have 6 bugs today
15:35:35 * jlibosva looks
15:35:47 <slaweq> ralonsoh: no need to be sorry, I know You are busy :)
15:36:57 <jlibosva> slaweq: I can have a look, tho I see we still don't collect OVN logs :-/
15:37:07 <slaweq> we don't?
15:37:23 <slaweq> I thought we merged Your patch already
15:37:33 <jlibosva> yeah, we did but the logs are not there
15:37:43 <jlibosva> ah, sorry
15:37:46 <jlibosva> the patch is not yet merged
15:38:01 <jlibosva> wait :)
15:38:12 <slaweq> jlibosva: ok, so let's merge that patch first and then if the problem happens again, I will ping You :)
15:38:17 <slaweq> fine for You?
15:38:36 <jlibosva> yes, I'll check if perhaps the patch fixed some jobs only or if functional was included too
15:39:18 <slaweq> jlibosva: are we talking about https://review.opendev.org/c/openstack/neutron/+/771658 ?
15:39:24 <slaweq> if so, it's just for tempest jobs
15:40:24 <jlibosva> slaweq: that's right
15:40:36 <jlibosva> maybe I'm looking at the wrong place
15:40:47 <slaweq> jlibosva: can You do the same for the functional job?
15:40:58 <jlibosva> slaweq: yes
15:41:08 <slaweq> You can make it "related to" that LP mentioned above
15:41:24 <slaweq> #action jlibosva to fix collecting ovn logs in functional jobs
15:41:27 <slaweq> thx jlibosva
15:41:47 <slaweq> ok, generally that's all I have for today
15:41:58 <slaweq> do You have anything else related to our CI to discuss?
15:43:08 <bcafarel> nothing else from me
15:43:12 <slaweq> if not, I think we can finish the meeting earlier today
15:43:13 <ralonsoh> nope
15:43:19 <slaweq> thx for attending the meeting
15:43:24 <ralonsoh> bye
15:43:26 <slaweq> and have a great week o/
15:43:30 <slaweq> #endmeeting