15:00:57 <slaweq> #startmeeting neutron_ci
15:00:58 <openstack> Meeting started Wed Mar 25 15:00:57 2020 UTC and is due to finish in 60 minutes. The chair is slaweq. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:59 <slaweq> hi
15:01:00 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:01:02 <openstack> The meeting name has been set to 'neutron_ci'
15:01:02 <ralonsoh> hi
15:01:20 <bcafarel> hello
15:01:24 <maciejjozefczyk> hey
15:01:51 <slaweq> ok, let's start
15:01:53 <slaweq> #topic Actions from previous meetings
15:02:00 <slaweq> first one
15:02:01 <slaweq> maciejjozefczyk to mark TestVirtualPorts functional tests as unstable for now
15:02:09 <njohnston> o/
15:02:40 <maciejjozefczyk> slaweq, that's done
15:02:55 <slaweq> thx maciejjozefczyk :)
15:02:57 <maciejjozefczyk> #link https://review.opendev.org/#/c/713860/
15:03:18 <slaweq> ok, next one
15:03:20 <slaweq> slaweq to investigate fullstack SG test broken pipe failures
15:03:33 <slaweq> and unfortunately I didn't have time to check it
15:03:35 <slaweq> sorry
15:03:38 <slaweq> #action slaweq to investigate fullstack SG test broken pipe failures
15:03:43 <slaweq> I will try this week
15:04:06 <slaweq> next one
15:04:08 <slaweq> maciejjozefczyk to take a look and report LP for failures in neutron.tests.functional.plugins.ml2.drivers.ovn.mech_driver.ovsdb.test_ovn_db_sync.TestOvnNbSyncOverTcp.test_ovn_nb_sync_log
15:05:07 <maciejjozefczyk> #link https://bugs.launchpad.net/neutron/+bug/1868110
15:05:08 <openstack> Launchpad bug 1868110 in neutron "[OVN] neutron.tests.functional.plugins.ml2.drivers.ovn.mech_driver.ovsdb.test_ovn_db_sync.TestOvnNbSyncOverTcp.test_ovn_nb_sync_log randomly fails" [High,Confirmed] - Assigned to Maciej Jozefczyk (maciej.jozefczyk)
15:05:31 <maciejjozefczyk> I tried to figure it out, but so far I haven't been able to reproduce it; I also didn't spend too much time on it.
15:05:45 <maciejjozefczyk> This week I'm going to spend more time on it.
15:06:04 <slaweq> today I saw yet another similar issue https://ab00d9261534a206496f-e4dcf05f554cd2b2192f6b35230c9943.ssl.cf1.rackcdn.com/708985/8/gate/neutron-functional/2132176/testr_results.html
15:07:36 <maciejjozefczyk> slaweq, that looks like a different test in the same class
15:07:58 <maciejjozefczyk> thanks for the link
15:09:30 <slaweq> #action maciejjozefczyk to take a look and report LP for failures in neutron.tests.functional.plugins.ml2.drivers.ovn.mech_driver.ovsdb.test_ovn_db_sync.TestOvnNbSyncOverTcp.test_ovn_nb_sync_log
15:09:42 <slaweq> next one
15:09:48 <slaweq> actually, the next 2 :)
15:09:50 <slaweq> slaweq to change fullstack-with-uwsgi job to be voting
15:09:51 <slaweq> and
15:09:55 <slaweq> slaweq to make neutron-tempest-with-uwsgi voting too
15:10:00 <slaweq> both done in the same patch
15:10:05 <slaweq> Patch https://review.opendev.org/714917
15:10:37 <bcafarel> and both passing (luckily)
15:11:16 <slaweq> yes :)
15:11:33 <slaweq> so please review it :)
15:11:43 <slaweq> and the last one from the previous week
15:11:45 <slaweq> bcafarel to check neutron-ovn-tempest-ovs-master-fedora RETRY_LIMIT failures
15:12:01 * bcafarel looks for LP link
15:12:24 <bcafarel> the root cause was that the new fedora image did not have the cache directory created; the infra++ folks quickly fixed it
15:12:32 <bcafarel> #link https://bugs.launchpad.net/devstack/+bug/1868076
15:12:33 <openstack> Launchpad bug 1868076 in devstack "Fedora jobs fail in setup-devstack-cache: find: ‘/opt/cache/files’: No such file or directory" [Undecided,Fix released]
15:12:50 <slaweq> thx bcafarel and infra-root :)
15:13:56 <slaweq> ok, let's move on
15:13:58 <slaweq> #topic Stadium projects
15:14:14 <slaweq> njohnston: any updates about the migration to zuulv3?
15:14:47 <njohnston> I think it's the same as yesterday's team meeting - midonet, odl, and one change in bagpipe
15:15:04 <njohnston> but I have been in meetings continuously up to this point so I have not had a chance to check today, sorry
15:15:18 <slaweq> ok
15:15:21 <slaweq> no problem
15:15:29 <slaweq> that isn't urgent for now :)
15:15:47 <slaweq> about IPv6-only deployments, I don't have any updates either
15:16:05 <slaweq> do You have anything else regarding stadium projects and CI for today?
15:16:31 <njohnston> nope! Not aware of any pernicious issues.
15:17:25 <slaweq> ok, so let's move on
15:17:33 <slaweq> #topic Grafana
15:17:39 <slaweq> #link http://grafana.openstack.org/dashboard/db/neutron-failure-rate
15:17:47 <slaweq> (sorry that I forgot it at the beginning)
15:18:22 <slaweq> Average number of rechecks in the last weeks:
15:18:24 <slaweq> week 12 of 2020: 0.74
15:18:26 <slaweq> week 13 of 2020: 0.67
15:18:36 <slaweq> those numbers from this and the previous week look really good :)
15:19:26 <slaweq> and looking at grafana, it seems to me that we are in pretty good shape currently
15:19:38 <maciejjozefczyk> \o/
15:19:50 <slaweq> still the same issues which we had before, but fortunately mostly in non-voting jobs :)
15:21:50 <bcafarel> one step at a time, more stable voting jobs is already great!
15:22:12 <slaweq> bcafarel: indeed :)
15:22:49 <njohnston> +1
15:23:30 <slaweq> and, as our gate is working pretty well, I don't have many examples of new failures to discuss today
15:23:53 <slaweq> #topic fullstack/functional
15:24:01 <slaweq> I saw again an issue in neutron.tests.functional.agent.linux.test_keepalived.KeepalivedManagerTestCase
15:24:06 <slaweq> https://116f9c4d7ae86f7f062e-eb80c0e4aee14229f3b21e03b2a44f1c.ssl.cf2.rackcdn.com/712446/1/check/neutron-functional-with-uwsgi/cdb2795/testr_results.html
15:24:16 <slaweq> ralonsoh: maybe You will want to take a look ^^ :)
15:24:20 <ralonsoh> yes
15:24:42 <ralonsoh> but this problem is, again, in the ns deletion
15:24:43 <ralonsoh> pfffff
15:24:54 <slaweq> yep :/
15:25:06 <slaweq> IIRC it's always in ns deletion
15:25:16 <ralonsoh> ok, I'll take care of it
15:25:36 <slaweq> maybe we should add some "safe_cleanup" method in the test
15:25:49 <slaweq> catch the timeout there and retry
15:26:25 <ralonsoh> the problem is in the privsep method and the eventlet library
15:26:43 <ralonsoh> if, for any reason, we give up the GIL, the timeout will happen
15:26:56 <ralonsoh> (I'll check this later)
15:27:35 <slaweq> ok, thx
15:27:47 <slaweq> may I add an action with this for You?
15:28:45 <ralonsoh> sure
15:29:09 <slaweq> #action ralonsoh to check (again) issue with ns deletion in neutron.tests.functional.agent.linux.test_keepalived.KeepalivedManagerTestCase
15:29:40 <slaweq> ok, and that's all from me regarding functional/fullstack jobs
15:29:53 <slaweq> do You have anything else You want to discuss, related to those jobs?
15:30:08 <njohnston> the patch marking the fullstack security group tests as stable - https://review.opendev.org/#/c/710782/
15:30:53 <njohnston> we have rechecked it 5 times since I removed the iptables-hybrid scenario and the fullstack job has not failed
15:31:04 <ralonsoh> so we drop LB testing
15:31:37 <njohnston> sorry, I said iptables-hybrid but I meant LB
15:31:52 <slaweq> but should we really drop this LB scenario from it?
15:31:56 <njohnston> yes, I think it's better to test the two other scenarios than to not test any
15:32:27 <slaweq> can You maybe just comment out this scenario and add a TODO to bring it back when it is fixed?
15:32:33 <njohnston> sure thing
15:32:36 <ralonsoh> +1
15:32:54 <slaweq> thx
15:33:01 <slaweq> and I will +2 it :)
15:34:34 <slaweq> anything else related to fullstack/functional, or can we move on?
15:36:08 <njohnston> go ahead
15:36:10 <slaweq> ok, let's move on
15:36:13 <slaweq> #topic Tempest/Scenario
15:36:25 <slaweq> here I also have only 1 issue to mention
15:36:47 <slaweq> I spotted again the issue with server termination in the multicast test
15:36:49 <slaweq> https://103900b4a03cdd60217d-625a0eb0440aa527fbdb216e8991f5a6.ssl.cf5.rackcdn.com/714726/1/check/neutron-ovn-tempest-ovs-release/2b52c20/testr_results.html
15:37:54 <slaweq> ralonsoh: do You want to take a look at it or do You want me to check it this week?
15:38:18 <ralonsoh> I don't know if I'll have time this week
15:38:23 <ralonsoh> sorry
15:38:25 <bcafarel> I think I saw it pop up on a few stable backports too (at least I remember typing "recheck test_multicast_between_vms_on_same_network")
15:38:41 <slaweq> ok, I will check that one
15:38:55 <slaweq> #action slaweq to check server termination on multicast test
15:39:06 <slaweq> bcafarel: good to know that it's not only on the master branch :)
15:39:35 <slaweq> and that's all from me regarding tempest jobs
15:39:41 <slaweq> do You have anything else to discuss?
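[editor's note] For reference, below is a minimal sketch of the "safe_cleanup" idea slaweq floated under the fullstack/functional topic above: wrap the namespace cleanup call, catch the timeout, and retry a few times before failing the test. All names here (the helper, the exception class, the retry parameters, and the delete_namespace placeholder) are hypothetical and not existing Neutron code; in the real tests the timeout would come from the privsep call ralonsoh mentioned.

    import time


    class CleanupTimeout(Exception):
        """Stand-in for the timeout raised by the real cleanup call."""


    def safe_cleanup(cleanup_func, retries=3, delay=1.0):
        """Call cleanup_func, retrying on timeout before giving up."""
        for attempt in range(1, retries + 1):
            try:
                cleanup_func()
                return
            except CleanupTimeout:
                # Only re-raise once all retries are exhausted.
                if attempt == retries:
                    raise
                time.sleep(delay)


    # Hypothetical usage in a test's cleanup phase (delete_namespace is a
    # placeholder for whatever actually removes the namespace):
    # self.addCleanup(safe_cleanup, lambda: delete_namespace(ns_name))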
15:41:16 <slaweq> ok, so lets move on 15:41:26 <slaweq> one last thing from me or today 15:41:28 <slaweq> #topic Periodic 15:41:34 <slaweq> neutron-tempest-postgres-full - failing since 18.03 constantly 15:41:35 <slaweq> Errors like http://zuul.openstack.org/build/95cb0e8c4c664727a2235bf5ee8d76ca/log/controller/logs/screen-q-svc.txt?severity=4 every day 15:42:52 <slaweq> it's failing only on postgresql 15:43:02 <ralonsoh> another postgresql incompatibility? in our ORM definition 15:43:09 <slaweq> and IMO some db expert should take a look 15:43:11 <ralonsoh> that happened before 15:44:15 <slaweq> it's either one of our patches merged around 17.03 or some pgsql update 15:44:22 <slaweq> I will open LP for that 15:44:40 <slaweq> and I will check if PG versions didn't changed recently in jobs 15:45:01 <slaweq> but I don't think I will have time to check something more with this issue 15:46:17 <slaweq> #action slaweq to report LP about PGSQL periodic job failures 15:46:41 <slaweq> if You have any experience with postgresql, and some cycles, You are more than welcome to check it :) 15:47:09 <slaweq> ok, and that's all on my side for today 15:47:16 <slaweq> #topic Open discussion 15:47:27 <slaweq> do You have anything else related to our ci to discuss today? 15:47:32 <ralonsoh> no 15:48:26 <bcafarel> all good here 15:49:45 <slaweq> ok, so I will give You few minutes back :) 15:49:49 <slaweq> thx for attending 15:49:49 <bcafarel> yay 15:49:55 <bcafarel> o/ 15:49:56 <slaweq> and have a great week 15:49:58 <slaweq> o/ 15:50:01 <slaweq> #endmeeting
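[editor's note] As a side note on the "mark TestVirtualPorts functional tests as unstable" action item from the top of the meeting (https://review.opendev.org/#/c/713860/): the general pattern is a decorator that converts a failure of a known-flaky test into a skip, so the test keeps running and stays visible without blocking the gate. The sketch below is a generic illustration of that pattern only; the decorator name, the example test name, and the bug reference are placeholders and not necessarily Neutron's actual helper.

    import functools
    import unittest


    def unstable_test(reason):
        """Turn a failure of a known-flaky test into a skip, keeping the
        bug reference visible until the test is fixed for real."""
        def decorator(test_method):
            @functools.wraps(test_method)
            def wrapper(self, *args, **kwargs):
                try:
                    return test_method(self, *args, **kwargs)
                except unittest.SkipTest:
                    raise
                except Exception:
                    raise unittest.SkipTest(
                        "Test marked as unstable, skipped on failure: %s"
                        % reason)
            return wrapper
        return decorator


    # Hypothetical usage (test name and bug number are placeholders):
    # @unstable_test("bug <LP number>")
    # def test_virtual_port_something(self):
    #     ...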