15:01:54 #startmeeting neutron_ci
15:01:55 Meeting started Wed Mar 11 15:01:54 2020 UTC and is due to finish in 60 minutes. The chair is slaweq. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:01:56 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:01:58 The meeting name has been set to 'neutron_ci'
15:02:01 o/
15:02:06 njohnston: ralonsoh bcafarel: here it should be :)
15:02:07 hello again
15:02:11 o/
15:02:25 Grafana dashboard: http://grafana.openstack.org/dashboard/db/neutron-failure-rate
15:02:28 ah, you were in another chan? that's why I felt lonely here :)
15:02:38 bcafarel: sorry :)
15:02:43 it was my mistake
15:02:55 ok, let's start
15:02:57 #topic Actions from previous meetings
15:03:06 first one
15:03:08 slaweq to remove neutron-tempest-dvr job from grafana
15:03:13 patch: https://review.opendev.org/712048
15:03:44 I don't think there is much to say about it, so let's move on to the next one
15:03:47 maciejjozefczyk to take a look at https://bugs.launchpad.net/neutron/+bug/1865453
15:03:48 Launchpad bug 1865453 in neutron "neutron.tests.functional.plugins.ml2.drivers.ovn.mech_driver.test_mech_driver.TestVirtualPorts.test_virtual_port_created_before fails randomly" [High,Confirmed] - Assigned to Maciej Jozefczyk (maciej.jozefczyk)
15:04:20 slaweq, yes, I'm going to do it asap, I was on PTO
15:04:45 ok maciejjozefczyk, I will assign it to You for next week, ok?
15:05:11 slaweq, I'm already assigned to this one
15:05:19 maciejjozefczyk: ok
15:05:28 #action maciejjozefczyk to take a look at https://bugs.launchpad.net/neutron/+bug/1865453
15:05:29 Launchpad bug 1865453 in neutron "neutron.tests.functional.plugins.ml2.drivers.ovn.mech_driver.test_mech_driver.TestVirtualPorts.test_virtual_port_created_before fails randomly" [High,Confirmed] - Assigned to Maciej Jozefczyk (maciej.jozefczyk)
15:05:41 maciejjozefczyk: do You think we should mark those tests as unstable temporarily?
15:06:32 slaweq, let me try to take a look and spend an hour on it; if I don't find anything at first shot I'll send a patch to mark them unstable, aight?
15:06:44 maciejjozefczyk: sure, sounds good
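
A quick note on the "mark as unstable" option discussed above: in Neutron this is usually done with the unstable_test decorator from neutron.tests.base, which turns an unexpected failure into a skip that points at the tracking bug, so the test keeps running without blocking the gate. A minimal, self-contained sketch, not the actual patch (the real test lives in the OVN functional test module named in the bug):

    # Minimal sketch: any unexpected failure in the decorated test is
    # reported as a skip referencing bug 1865453 instead of a failure.
    from neutron.tests import base


    class TestVirtualPortsSketch(base.BaseTestCase):

        @base.unstable_test("bug 1865453")
        def test_virtual_port_created_before(self):
            # the real assertions would go here
            self.assertTrue(True)
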
15:07:05 ok, next one
15:07:07 ralonsoh to check "neutron_tempest_plugin.scenario.test_multicast.MulticastTestIPv4 failing often on deletion of server"
15:07:19 slaweq, sorry, I didn't have time for this one
15:07:58 ok
15:08:09 will You try to check it this week?
15:08:29 yes, sure, I think I'll have time
15:08:36 #action ralonsoh to check "neutron_tempest_plugin.scenario.test_multicast.MulticastTestIPv4 failing often on deletion of server"
15:08:38 ralonsoh: thx
15:08:52 and the last one
15:08:54 slaweq to check problem with console output in scenario test
15:09:00 Patch: https://review.opendev.org/712054
15:09:17 I think this should solve the issue
15:09:49 please review when You have some time
15:10:04 will do
15:10:06 and that's all on the list of actions from last week
15:10:08 thx njohnston
15:10:11 "for server in servers: server = server.get("server") or server" - that makes a funny line
15:10:38 bcafarel: yes, but it works :)
15:10:42 slaweq, ;)
15:10:47 true :)
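
About the one-liner bcafarel quoted from the console output patch above: it simply normalizes the two shapes the servers list can arrive in, entries wrapped under a "server" key and plain server dicts. A standalone illustration of the pattern (the sample data is made up for the example):

    # Each entry may or may not be wrapped under a "server" key,
    # so unwrap it when needed before reading its fields.
    servers = [
        {"server": {"id": "a1", "status": "ACTIVE"}},  # wrapped entry
        {"id": "b2", "status": "ERROR"},               # already unwrapped
    ]

    for server in servers:
        server = server.get("server") or server
        print(server["id"], server["status"])
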
15:11:19 #topic Stadium projects
15:11:27 standardize on zuul v3
15:11:29 Etherpad: https://etherpad.openstack.org/p/neutron-train-zuulv3-py27drop
15:11:58 for the zuul v3 migration, we have only 3 stadium projects left on the list
15:12:04 networking-bagpipe
15:12:06 networking-midonet
15:12:08 networking-odl
15:12:22 for bagpipe there is a patch proposed to convert the fullstack job
15:12:29 but it's very red currently
15:12:38 for midonet and odl there is nothing proposed yet
15:13:37 overall we are making good progress on this
15:13:56 anything else related to stadium projects for today?
15:14:27 should we add some ipv6-only goal section here?
15:14:44 bcafarel: good point
15:14:52 https://review.opendev.org/#/q/status:open+topic:ipv6-only-deployment-and-testing+(status:open+OR+status:merged)+(project:%255Eopenstack/neutron.*+OR+project:%255Eopenstack/networking-.*) which you mentioned on the neutron meeting mostly shows remaining networking-* patches
15:15:04 I will prepare an etherpad with a summary of what is wrong there and will add it to the agenda
15:15:39 #action slaweq to prepare etherpad to track progress with ipv6-only testing goal
15:15:47 thanks! I tried to rebase the networking-generic-switch patch but apparently there are other failures around (and I did not check the others yet)
15:16:02 thx bcafarel
15:17:46 ok, let's move on
15:17:48 #topic Grafana
15:17:55 #link http://grafana.openstack.org/dashboard/db/neutron-failure-rate
15:19:19 overall we still have the same problems as last week(s)
15:19:42 high failure rate of the dvr scenario jobs (but those are non-voting)
15:19:52 and a pretty high failure rate of the functional and grenade jobs
15:20:48 do you have a list of the grenade and FT tests failing?
15:21:10 ralonsoh: for the grenade jobs, nope
15:21:27 and for functional tests, it's mostly those ovn mech driver tests
15:21:41 which maciejjozefczyk will check
15:21:50 you are right
15:22:04 I sent quite a few "recheck grenade" on stable/train backports recently too, older branches are not affected
15:22:16 for the grenade job, last time I checked it was mostly an issue with some timeout on placement and the instance going to error state because of that
15:22:36 but maybe there is some different issue now
15:22:59 https://zuul.opendev.org/t/openstack/build/e820bdd66eb040fa9db010de1fe821b3
15:23:09 here is the grenade job failure from today :)
15:24:40 failures with "No valid host was found"
15:24:41 https://5e801b59e3f992e71df8-ab7619989a2ab5c37e2a2a5061e93a1d.ssl.cf5.rackcdn.com/711887/1/check/neutron-grenade-multinode/e820bdd/logs/grenade.sh.txt
15:25:54 No valid host was found. There are not enough hosts available
15:26:00 same error always
15:26:07 yep
15:26:22 and I believe it would be some issue in placement
15:27:09 ralonsoh: some failures in http://paste.openstack.org/show/790544/ (I did not check them, maybe "no valid host" everywhere)
15:27:32 I'll check it
15:28:43 there is such an error there: https://zuul.opendev.org/t/openstack/build/e820bdd66eb040fa9db010de1fe821b3/log/logs/screen-n-sch.txt#1031
15:28:48 I think it may be related
15:28:58 it's not from placement but nova
15:29:25 if anyone has got any cycles and can take a look, that would be great :)
15:29:54 ok, anything else related to grafana for today?
15:30:56 if not, I think we can move on to the next topic
15:31:17 I don't have anything about functional/fullstack jobs for today
15:31:24 so let's talk about scenario jobs
15:31:31 #topic Tempest/Scenario jobs
15:31:58 as I said before, our biggest issues are with the dvr related multinode jobs
15:32:14 so I spent some time today to prepare a summary of what tests are failing there
15:32:23 and I have: https://etherpad.openstack.org/p/neutron-dvr-jobs-issues
15:32:36 most of the failures are in neutron-tempest-plugin-dvr-multinode-scenario
15:32:44 that is a long list :(
15:33:06 bcafarel: some links are there a couple of times, as in one job more than one test failed
15:33:26 and I put each one in its section as I wanted to check how often each of those tests is failing
15:34:00 so our biggest issues are:
15:34:14 router migration from something (HA or legacy) to dvr
15:34:31 and I believe that this is the same issue in both cases
15:34:38 and neutron_tempest_plugin.scenario.test_connectivity.NetworkConnectivityTest.test_connectivity_through_2_routers
15:34:51 and also the security_groups tests
15:35:17 but those will hopefully be fixed when we use the new cirros 0.5.0 in our gate
15:35:39 as maciejjozefczyk did some improvements in the cirros image to address issues with metadata timeouts
15:35:58 I think there were some problems with cirros 0.5.0 and OVN
15:36:11 but maciejjozefczyk knows this better
15:36:33 ralonsoh, I remember an issue with ip link, but it should be fixed in 0.5.1
15:36:44 ahhh ok, perfect
15:36:51 maciejjozefczyk: is 0.5.1 already released?
15:37:40 slaweq, yes: http://download.cirros-cloud.net/0.5.1/
15:38:00 maciejjozefczyk: is this new release for the failures you saw trying 0.5.0?
15:38:04 maciejjozefczyk: great, You already have a patch to use it in neutron, right?
15:38:05 I'm gonna verify this week if all is fine, I saw some comments from Radoslaw here: https://review.opendev.org/#/c/711492/
15:38:36 maciejjozefczyk: can You also send a patch to neutron-tempest-plugin to use it in our jobs?
15:38:51 we can then check how many (if any) more tests will be green
15:38:53 :)
15:39:21 slaweq, yes sure, but I think that setting it in global devstack is the right way? I didn't find any setting for this in our tempest configuration, afair
15:39:46 or I missed something, anyways, I'm gonna take a look at it this week
15:40:03 we retrieve that from devstack
15:40:08 maciejjozefczyk: setting it in devstack is the best place, but we can do it for our jobs even just as a DNM patch, as I'm curious whether that will really help
15:40:43 slaweq, I already did it in a similar way in https://review.opendev.org/#/c/711425/
15:40:51 with depends-on
15:41:20 anyways, lemme check this and I'll be back with results :)
15:41:27 ok, thx
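
For the "DNM patch" idea mentioned above, one way to try the new image in a single devstack-based job is to override devstack's CIRROS_VERSION through the job's devstack_localrc variables. This is only a hedged sketch, not the actual patch: the "-cirros051" job name is made up, the parent is the existing neutron-tempest-plugin-dvr-multinode-scenario job, and on some branches DEFAULT_IMAGE_NAME may also need to be set explicitly.

    # Illustrative zuul job variant; assumes devstack derives the default
    # guest image from CIRROS_VERSION.
    - job:
        name: neutron-tempest-plugin-dvr-multinode-scenario-cirros051
        parent: neutron-tempest-plugin-dvr-multinode-scenario
        vars:
          devstack_localrc:
            CIRROS_VERSION: 0.5.1
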
15:41:34 so my plan for that is:
15:41:49 1. we will check with the new cirros whether some of the tests become more stable,
15:42:08 2. for the router migration tests and connectivity tests I will open a new LP
15:42:17 and we will need to check them in the next weeks
15:42:38 is it ok for You?
15:42:55 slaweq, you have my sword
15:43:07 hahahaha
15:43:14 :)
15:44:14 anything else You have related to scenario/tempest jobs?
15:44:32 I'm working on some improvements for the QoS tests
15:44:55 We spotted some issues while running the QoS tests with OVN as a backend
15:45:05 #link https://bugs.launchpad.net/neutron/+bug/1866039
15:45:06 Launchpad bug 1866039 in neutron "[OVN] QoS gives different bandwidth limit measures than ml2/ovs" [High,In progress] - Assigned to Maciej Jozefczyk (maciej.jozefczyk)
15:45:15 It should be addressed by:
15:45:18 #link https://review.opendev.org/#/c/711048/
15:45:27 If you have a minute please take a look :) thanks!
15:45:45 thx maciejjozefczyk, I will review
15:46:14 all from me, thanks!
15:46:44 anything else? or should we move on?
15:47:41 ok, so let's move on
15:47:52 I see that njohnston added something related to fullstack tests
15:47:57 so let's get back to that topic now
15:48:03 #topic fullstack/functional
15:48:11 njohnston: You're up :)
15:48:24 Yes - I have a change to mark the fullstack security group tests as stable
15:48:41 https://review.opendev.org/710782
15:48:49 It has not failed in a bit
15:49:11 So I wanted to let you all know, and see how many passing rechecks you think are needed before calling it stable
15:49:37 so disabling concurrency on the security group tests seems to be enough?
15:49:48 it seems to have done the trick
15:50:03 I don't know for sure but IMO it may be worth a try
15:50:12 worst case we will mark it as unstable again
15:50:17 +1
15:51:01 thx njohnston for bringing this up :)
15:51:01 that's it for me
15:51:05 I almost forgot about it
15:51:10 :-)
15:51:48 I'd feel safer with 2 or 3 additional rounds of test rechecks - will see how it runs locally
15:52:05 bcafarel: I can definitely do that
15:52:09 bcafarel: sure :)
15:52:44 so far it has passed the security group tests 5 times, I will do 3 more
15:52:45 apart from that, nice to see it was "only" this secgroup issue :)
15:53:28 great, so I will also keep an eye on it
15:53:42 do You have anything else to talk about today?
15:53:53 nope
15:53:59 if not, I think I can give You a few minutes back
15:54:02 yay
15:54:15 \o/
15:54:16 bye
15:54:17 ;)
15:54:19 ok, so thx for attending
15:54:21 o/
15:54:23 #endmeeting