15:00:11 #startmeeting neutron_ci
15:00:11 Meeting started Wed Mar 18 15:00:11 2020 UTC and is due to finish in 60 minutes. The chair is slaweq. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:12 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:15 The meeting name has been set to 'neutron_ci'
15:00:16 o/
15:00:32 o/
15:00:37 hi
15:00:51 ok, let's go :)
15:00:55 Grafana dashboard: http://grafana.openstack.org/dashboard/db/neutron-failure-rate
15:00:56 Please open it now :)
15:01:29 #topic Actions from previous meetings
15:01:41 maciejjozefczyk to take a look at https://bugs.launchpad.net/neutron/+bug/1865453
15:01:42 Launchpad bug 1865453 in neutron "neutron.tests.functional.plugins.ml2.drivers.ovn.mech_driver.test_mech_driver.TestVirtualPorts.test_virtual_port_created_before fails randomly" [High,In progress] - Assigned to Maciej Jozefczyk (maciej.jozefczyk)
15:03:14 I just pinged maciej to join here
15:03:19 maybe he has some update
15:04:32 ok, let's move on for now
15:04:36 maybe he will join later
15:04:47 ralonsoh to check "neutron_tempest_plugin.scenario.test_multicast.MulticastTestIPv4 failing often on deletion of server"
15:04:56 I tried but I didn't find anything
15:05:05 sorry, I can't find the cause of this error
15:05:07 sorry, I'm here
15:05:18 (let's go back to the previous one)
15:05:19 forgot about the '-3' suffix :)
15:05:48 #link https://bugs.launchpad.net/neutron/+bug/1865453
15:05:49 Launchpad bug 1865453 in neutron "neutron.tests.functional.plugins.ml2.drivers.ovn.mech_driver.test_mech_driver.TestVirtualPorts.test_virtual_port_created_before fails randomly" [High,In progress] - Assigned to Maciej Jozefczyk (maciej.jozefczyk)
15:05:53 maciejjozefczyk, ^^
15:06:00 thx ralonsoh
15:07:00 ok, so. I tried to debug what's going on there
15:07:26 proposed a change to retry on failed asserts
15:07:28 #link https://review.opendev.org/#/c/712888/
15:08:03 but anyway, even with those, we still have random failures of tests in that class, like 1 test per 20 runs of functional tests (based on the experiment from the link)
15:08:16 and actually I don't really know what causes that
15:08:17 maciejjozefczyk++ for the hack with multiple functional jobs :)
15:08:53 Does it hurt us very much? I mean on the gates?
15:09:02 if so, I vote for setting those tests as unstable for now
15:09:14 because we could spend hours on debugging it...
15:09:42 and a better solution would be to do the virtual port association in some more clever way, like setting a proper device_type
15:09:49 in the neutron API by octavia
15:09:57 that will solve those failures instantly
15:10:20 that's what I discussed with Lucas
15:10:28 maciejjozefczyk: so I'm fine with marking those tests as unstable for now, let's make others' lives easier :)
15:10:40 maciejjozefczyk: That is pretty darn cool what you did in that change
15:11:02 +1 to make them unstable, for now
15:11:10 mark*
15:11:21 njohnston, about multiplexing functionals?
15:11:23 maciejjozefczyk: will You propose a patch to mark them as unstable?
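[editor's note] A minimal standalone sketch of the "mark as unstable" pattern agreed on above: a decorator that turns a test failure into a skip while recording the tracked bug, so a known-flaky test stops blocking the gate. Neutron's own test base carries a helper along these lines (unstable_test); the version below is illustrative only, and the test class/body here is a hypothetical stand-in for the real OVN functional test.

```python
import functools
import random
import unittest


def unstable_test(reason):
    """Report a failing test as skipped, citing the bug tracking the flake."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(self, *args, **kwargs):
            try:
                return func(self, *args, **kwargs)
            except unittest.SkipTest:
                # Real skips pass through untouched.
                raise
            except Exception as exc:
                self.skipTest("%s is marked unstable (%s); failure was: %s"
                              % (self.id(), reason, exc))
        return wrapper
    return decorator


class TestVirtualPortsExample(unittest.TestCase):
    """Hypothetical stand-in for the flaky TestVirtualPorts functional tests."""

    @unstable_test("bug 1865453")
    def test_virtual_port_created_before(self):
        # Simulate an intermittent failure; with the decorator it shows up
        # as a skip (with the bug reference) instead of a gate failure.
        self.assertEqual(0, random.randint(0, 1))


if __name__ == "__main__":
    unittest.main()
```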
15:11:26 slaweq, yes
15:11:30 thx a lot
15:11:35 maciejjozefczyk: yes
15:11:41 njohnston, credit to Jakub Libosvar :)
15:11:53 #action maciejjozefczyk to mark TestVirtualPorts functional tests as unstable for now
15:12:31 ok, so let's go back to the second action item
15:12:33 ralonsoh to check "neutron_tempest_plugin.scenario.test_multicast.MulticastTestIPv4 failing often on deletion of server"
15:12:49 You said that You couldn't find the root cause of this
15:12:51 again, sorry for not finding anything
15:12:53 but is it still happening?
15:12:59 I didn't see it this week
15:13:18 maybe (maybe) this is caused by another test executed in parallel
15:13:43 yes, I didn't see this test failing in the last 10 days
15:14:02 so let's hope it will not hit us again :D
15:14:08 and forget about it for now
15:14:13 do You agree with that?
15:14:15 :)
15:14:19 I'll freeze it for now
15:14:24 ++
15:15:48 ok, last one from last week
15:15:50 slaweq to prepare etherpad to track progress with ipv6-only testing goal
15:15:53 Etherpad https://etherpad.openstack.org/p/neutron-stadium-ipv6-testing
15:16:09 but I didn't have time to work on any of those patches so far
15:16:16 if anyone wants to help, that would be great
15:16:31 and also I will add this etherpad to the stadium projects topic for the next weeks
15:16:41 to not forget about it and to track what is still to do
15:16:45 are You ok with that?
15:17:22 yes
15:17:43 sounds good
15:17:59 +1
15:18:22 ok
15:18:41 I'll try to work on it as well
15:18:45 thx njohnston
15:18:50 ok, let's move on
15:18:53 #topic Stadium projects
15:19:03 standardize on zuul v3
15:19:05 Etherpad: https://etherpad.openstack.org/p/neutron-train-zuulv3-py27drop
15:19:10 I don't think there was any progress on that
15:19:37 but maybe I missed something
15:20:02 I did not see any progress either
15:20:23 ok, anything else regarding stadium projects and ci for today?
15:21:17 yes
15:21:35 In the OVN octavia provider driver we're setting up the devstack plugin and CI (for now non-voting)
15:21:59 #link https://review.opendev.org/#/c/708870/
15:22:14 If you have a minute to check if that makes sense, it would be great :)
15:23:35 maciejjozefczyk: noted, I will check it tomorrow morning
15:24:30 ok, let's move on
15:24:31 #topic Grafana
15:24:42 #link http://grafana.openstack.org/dashboard/db/neutron-failure-rate
15:25:20 slaweq, thanks
15:26:19 I checked today the average number of rechecks from the last days
15:26:20 Average number of rechecks in the last weeks:
15:26:22 week 11 of 2020: 5.4
15:26:24 week 12 of 2020: 0.67
15:26:26 week 12 is the current one
15:26:30 wow nice
15:26:35 but it looks generally better than it was
15:26:57 That means it's better, or we don't produce code :)?
15:27:04 and if we fix some functional test failures, it should be even better
15:27:43 maciejjozefczyk: my script doesn't report how many patches were merged during the week, but I think that we do still produce code
15:28:19 slaweq, ok :)
15:28:56 looking at grafana, tempest and neutron-tempest-plugin jobs look pretty good
15:28:57 I hope we still do :)
15:29:03 especially the voting ones
15:29:24 the highest failure rates are still on the functional job and the grenade jobs
15:30:31 but I know that people from nova are working on the failure which causes many grenade failures, so hopefully it will be better soon
15:31:05 and our functional job failures are mostly related to those ovn tests which maciejjozefczyk will mark as unstable for now
15:31:25 anything else You want to add regarding grafana?
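[editor's note] As an aside, here is a hypothetical sketch of how a "rechecks per merged change" number like the one quoted above could be computed from Gerrit review comments. This is not slaweq's actual script; the query (project, one-week window) and the simple "message contains 'recheck'" heuristic are assumptions made for illustration.

```python
# Hypothetical sketch: average number of "recheck" comments per merged
# openstack/neutron change in the last week, via the public Gerrit REST API.
import json

import requests

GERRIT = "https://review.opendev.org"
QUERY = "project:openstack/neutron status:merged -age:7d"


def merged_changes_with_messages():
    # o=MESSAGES asks Gerrit to embed review comments in each change record.
    resp = requests.get(
        GERRIT + "/changes/",
        params={"q": QUERY, "o": "MESSAGES"},
        timeout=30,
    )
    resp.raise_for_status()
    # Gerrit prefixes its JSON with ")]}'" to prevent XSSI; strip that line.
    return json.loads(resp.text.split("\n", 1)[1])


def average_rechecks():
    changes = merged_changes_with_messages()
    if not changes:
        return 0.0
    rechecks = sum(
        sum("recheck" in msg.get("message", "").lower()
            for msg in change.get("messages", []))
        for change in changes
    )
    return rechecks / len(changes)


if __name__ == "__main__":
    print("average rechecks per merged change: %.2f" % average_rechecks())
```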
15:32:31 nothing here
15:32:48 ok, let's move on
15:32:56 #topic fullstack/functional
15:33:08 first I wanted to ask njohnston how it's going with marking security group fullstack tests as stable - https://review.opendev.org/710782
15:33:21 should we try to merge this?
15:34:15 We have had 8 successes and 2 failures due to 'broken pipe' on ssh
15:34:32 and 1 failure due to an apt cache issue that happened before the tests were even started
15:34:49 hmm, I'm afraid about these broken pipe failures
15:35:02 do You have a link to such a failure?
15:35:06 broken pipe example: https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_f5c/710782/3/check/neutron-fullstack/f5c09cb/testr_results.html
15:36:02 broken pipe 2: https://82f990e55ca70e41f871-36409e2f060733ae498eef8cd27f40f4.ssl.cf5.rackcdn.com/710782/3/check/neutron-fullstack/126cd0c/testr_results.html
15:36:16 is it broken pipe, or the _StringException before it?
15:36:49 (_StringException reminds me of the py3 fixes some time ago)
15:36:50 IMHO it's a broken pipe raised in netcat.test_connectivity()
15:37:01 ok
15:37:06 so it looks to me like we still have some issue there
15:37:33 let's not merge it yet, I will try to take a look at that
15:37:39 ok?
15:37:47 +1
15:37:48 I am of the same mind
15:38:13 #action slaweq to investigate fullstack SG test broken pipe failures
15:39:18 ok
15:39:21 now functional tests
15:39:36 I found this week 2 ovn-related failures which seem new to me
15:39:42 neutron.tests.functional.plugins.ml2.drivers.ovn.mech_driver.test_mech_driver.TestVirtualPorts.test_virtual_port_delete_parents
15:39:43 https://1ad6f2e4d61e2bf5ff0b-20b98b64cfa6ea87451df6eaddafb782.ssl.cf5.rackcdn.com/712640/1/check/neutron-functional/17dbbf8/testr_results.html
15:39:45 neutron.tests.functional.plugins.ml2.drivers.ovn.mech_driver.ovsdb.test_ovn_db_sync.TestOvnNbSyncOverTcp.test_ovn_nb_sync_log
15:39:47 https://4232a5777fe8521e933a-68bc071a5cbea1ed39a41226592204b6.ssl.cf5.rackcdn.com/712474/3/check/neutron-functional/3e06670/testr_results.html
15:39:57 can You take a look and check if that is something You already know?
15:40:22 the first one may be related to the same issue we discussed earlier with maciejjozefczyk
15:40:36 but the assertion failure seems different here
15:40:48 the test_ovn_db_sync one is a different problem
15:41:04 I can take a look
15:41:15 maciejjozefczyk: thx a lot
15:41:30 please open an LP bug for it too, ok?
15:41:43 slaweq, ok
15:42:31 ok
15:42:48 #action maciejjozefczyk to take a look and report LP for failures in neutron.tests.functional.plugins.ml2.drivers.ovn.mech_driver.ovsdb.test_ovn_db_sync.TestOvnNbSyncOverTcp.test_ovn_nb_sync_log
15:42:51 thx maciejjozefczyk
15:43:19 and one last thing regarding this topic from me
15:43:20 slaweq, sure
15:44:00 I was looking at the grafana dashboards and I think that the fullstack-with-uwsgi job is as stable as the fullstack job
15:44:11 should we maybe promote it to be voting?
15:44:47 I don't want to do the same with the functional job for now as it's failing much more often (the same as the non-uwsgi functional job, but both are failing too often)
15:45:06 +1 for making fullstack-uwsgi voting
15:45:18 dashboard is here: http://grafana.openstack.org/d/Hj5IHcSmz/neutron-failure-rate?orgId=1&fullscreen&panelId=22
15:45:20 to check
15:45:50 yes it looks good compared to fullstack
15:46:49 ok, I will propose a patch for that
15:47:08 #action slaweq to change fullstack-with-uwsgi job to be voting
15:47:23 what about gating? should we add it to the gate queue also, or not yet?
15:47:33 I think that we can, but I want to know Your opinion :)
15:48:51 Let's give it a couple of weeks voting and make sure we don't have buyer's remorse
15:49:06 no reason to hit the accelerator
15:49:13 ok
15:49:19 sounds reasonable
15:49:33 also checking: we do not have any uwsgi jobs in the gate for now?
15:49:39 bcafarel: nope
15:49:48 we have only non-voting ones in the check queue
15:50:52 ok, let's move on
15:50:54 #topic Tempest/Scenario
15:51:05 here I don't have any new issues
15:51:09 only one question
15:51:16 similar to the previous one
15:51:44 what do You think about making the neutron-tempest-with-uwsgi job voting?
15:51:50 http://grafana.openstack.org/d/Hj5IHcSmz/neutron-failure-rate?orgId=1&fullscreen&panelId=16
15:52:14 it also looks pretty stable
15:52:38 in the last 30 days it was basically following the other jobs, and most of the time was below 10% failures
15:52:45 +1 for this too
15:52:54 more voting uwsgi jobs ++
15:52:59 ++
15:52:59 :)
15:53:11 #action slaweq to make neutron-tempest-with-uwsgi voting too
15:53:23 ok, that was fast
15:53:31 so one last topic for today
15:53:33 #topic Periodic
15:53:53 I noticed that for a few days now our neutron-ovn-tempest-ovs-master-fedora job has been failing with RETRY_LIMIT
15:54:10 it may be something related to fedora/devstack/infra
15:54:14 but we should check that
15:54:19 is there any volunteer?
15:54:47 I can take a look (and grab the ovn folks if help is needed)
15:54:54 bcafarel: thx a lot
15:55:08 #action bcafarel to check neutron-ovn-tempest-ovs-master-fedora RETRY_LIMIT failures
15:55:22 ok, that's all from my side for today
15:55:34 is there anything else You want to talk about?
15:56:35 bcafarel, ping me if you need any help with ovn
15:57:11 :) maciejjozefczyk thanks, will do (if it is not a generic fedora/infra issue)
15:57:50 thx maciejjozefczyk :)
15:57:57 ok, I think we are done for today
15:58:00 o/
15:58:01 thx for attending
15:58:06 #endmeeting