16:01:41 #startmeeting neutron_ci
16:01:42 Meeting started Tue Mar 12 16:01:41 2019 UTC and is due to finish in 60 minutes. The chair is slaweq. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:01:43 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:01:43 hi
16:01:45 The meeting name has been set to 'neutron_ci'
16:02:06 o/
16:02:24 hi njohnston
16:02:39 hello!
16:02:49 lets wait a bit more for others like haleyb, hongbin, bcafarel
16:02:58 I know that mlavalle will not be here today
16:03:05 bot is back in order?
16:03:10 (also o/ )
16:03:23 and also I need to finish this meeting in about 45 minutes
16:03:32 bcafarel: seems so
16:03:56 ok, lets go then
16:03:59 first of all
16:04:01 Grafana dashboard: http://grafana.openstack.org/dashboard/db/neutron-failure-rate
16:04:06 please open it now :)
16:04:17 :)
16:04:20 it will be ready for later :)
16:04:44 #topic Actions from previous meetings
16:04:55 lets go then
16:04:58 first action
16:05:00 * njohnston Debug fullstack DSCP issue
16:05:30 I know that njohnston didn't look into it
16:05:33 because I did :)
16:05:40 and I found an issue
16:05:44 fix is here: https://review.openstack.org/#/c/642186/
16:06:18 slaweq++
16:06:21 basically it looks that sometimes tcpdump was starting too slow, we send one icmp packet which wasn't captured by tcpdump and test failed
16:06:31 so I proposed to send at least 10 packets always
16:06:43 good catch!
16:06:50 then some should be always captured by tcpdump
16:07:06 thx ralonsoh :)
16:07:09 ok, next one
16:07:11 mlavalle to take a look at fullstack dhcp rescheduling issue https://bugs.launchpad.net/neutron/+bug/1799555
16:07:12 Launchpad bug 1799555 in neutron "Fullstack test neutron.tests.fullstack.test_dhcp_agent.TestDhcpAgentHA.test_reschedule_network_on_new_agent timeout" [High,Confirmed]
16:07:36 I don't think that mlavalle was looking into this one
16:08:16 slaweq, I can try it
16:08:22 thx ralonsoh :)
16:08:33 #action ralonsoh to take a look at fullstack dhcp rescheduling issue https://bugs.launchpad.net/neutron/+bug/1799555
16:08:34 Launchpad bug 1799555 in neutron "Fullstack test neutron.tests.fullstack.test_dhcp_agent.TestDhcpAgentHA.test_reschedule_network_on_new_agent timeout" [High,Confirmed]
16:08:52 and the last one from last week was:
16:08:54 slaweq to talk with tmorin about networking-bagpipe
16:09:03 and I totally forgot about this
16:09:05 sorry for that
16:09:13 I will add it to myself for this week
16:09:19 #action slaweq to talk with tmorin about networking-bagpipe
16:10:35 anything else You want to add/ask?
16:10:44 nope
16:11:17 ok, lets move on then
16:11:24 #topic Python 3
16:11:35 njohnston: any updates about stadium projects?
16:11:47 etherpad is here: https://etherpad.openstack.org/p/neutron_stadium_python3_status
16:12:17 Nothing yet, but I will note that as of now we have satisfied the basic requirements for the community goal
16:12:49 The community goal doesn't specify all the kinds of testing we're handling at this point, it establishes a baseline
16:13:14 great then
16:13:17 We've met that baseline across the stadium AFAICT
16:13:30 that is very good news njohnston2 :)
16:13:46 I'll continue and I think we should keep this topic, but this is more forward-looking, getting us ready for the python2 removal in the U cycle
16:13:56 but I will keep this topic in agenda for now, to track our (slow) progress on that in next cycle too
16:14:02 :)
16:14:07 perfect
16:14:13 thx njohnston2
16:14:28 ok, lets move on then
16:14:30 next topic
16:14:32 #topic Ubuntu Bionic in CI jobs
16:16:09 We are almost good with it
16:16:19 we have for now one issue with fullstack tests in neutron
16:16:37 because on Bionic node we shouldn't compile ovs kernel module anymore, it's not necessary
16:17:16 I will try to push patch for that tomorrow morning, unless there is someone else who wants to take care of it today
16:17:50 I tried today with https://review.openstack.org/#/c/642461/ and it looks good
16:18:08 but it should be done in the way which will work on both, xenial and bionic
16:18:40 I'll take a look
16:18:42 for networking-dynamic-routing we should be good to go with patch https://review.openstack.org/#/c/642433/2
16:18:46 njohnston: thx :)
16:19:06 if some of You have +2 power in networking-dynamic-routing, please check this patch :)
16:19:17 it isn't complicated :)
16:19:38 and works fine for Bionic too: https://review.openstack.org/#/c/639675/2
16:19:45 +2
16:19:55 thx njohnston2 :)
16:20:28 and third problem (but not very urgent) is with networking-bagpipe: https://review.openstack.org/#/c/642456/1 fullstack job is running but there is some fail still: http://logs.openstack.org/87/639987/3/check/legacy-networking-bagpipe-dsvm-fullstack/8d5af6c/job-output.txt.gz
16:20:54 here we should for sure have the same patch as for neutron to not compile the ovs kernel module on Bionic
16:21:02 but even with this tests are failing
16:21:22 "ImportError: cannot import name async_process" perhaps a dependency changed names?
16:21:27 I'm not familiar with bagpipe so I'm not sure if that is somehow related to switch to Bionic
16:21:31 njohnston2: maybe
16:21:43 but problem isn't very urgent as it is non-voting job there :)
16:22:26 and last one with some problems is networking-midonet, but here yamamoto is aware of those issues so I hope they will fix them
16:23:23 ok, any questions/something to add regarding bionic?
16:23:51 looks good, great work
16:24:01 thx njohnston :)
16:24:12 ok, so moving to next topic
16:24:16 #topic tempest-plugins migration
16:24:26 njohnston_: how many of You are in the meeting? :D
16:24:47 Etherpad: https://etherpad.openstack.org/p/neutron_stadium_move_to_tempest_plugin_repo
16:24:54 any updates here?
16:24:59 I don't have anything
16:25:03 Hey, I've been saying that cloning myself was the only way to get ahead of these trello cards...
16:25:13 nothing from me
16:25:15 njohnston: LOL
16:25:20 great idea
16:25:33 but I don't know if my wife will handle that when I will clone myself :P
16:25:41 * haleyb wanders in late
16:25:46 you're not allowed to clone yourself, because an army of Hulks would be unstoppable
16:25:46 any bugs for me?
:)
16:25:56 LOL
16:25:58 haleyb: all of them
16:26:10 haleyb: hi, sure, if You want some :P
16:26:18 ok, lets move on
16:26:20 #topic Grafana
16:26:31 reminder link: http://grafana.openstack.org/dashboard/db/neutron-failure-rate
16:27:07 that reminds me to update the ovn dashboard
16:27:50 is it weird to anyone else how for the check queue the number of runs is on a steady downward slope
16:28:39 example "Number of Unit Tests jobs runs (Check queue)" has a high around 78 on 3/6 steadily down to 45 for 3/11
16:29:05 njohnston: yes, looks strange but lets see how it will be in next days maybe
16:29:34 IMHO last week we had some "spike" there
16:30:24 njohnston: if You will get data from 30 days, it doesn't look odd
16:30:31 oh good
16:30:46 I set it to the 30 day view, that should render sometime on Thursday
16:31:49 actually that was very fast. That really helps show what is normal skitter and what isn't.
16:32:02 from other things it generally looks much better than in last week
16:32:19 I see that unit tests are rising, and that isn't good IMO
16:32:30 a few fixes got in yeah
16:32:47 and in last couple of days I saw issue like http://logs.openstack.org/79/633979/26/check/openstack-tox-py37/e7878ff/testr_results.html.gz in various unit tests
16:32:54 did You see it too?
16:33:32 test_port_ip_update_revises? I don't think so
16:33:38 i saw this update_revises somewhere
16:33:54 looking at logstash: http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%20%5C%22line%20170%2C%20in%20test_port_ip_update_revises%5C%22
16:34:06 it happened at least a couple of times in the last week
16:34:47 I managed to miss that one
16:34:51 anyone wants to look at this?
16:35:11 me
16:35:17 ralonsoh: thx a lot
16:35:30 please report a bug for it so we can track it there
16:35:37 dure
16:35:39 sure
16:35:52 #action ralonsoh to take a look at update_revises unit test failures
16:36:15 ok, lets move on then
16:36:34 or You want to talk about something else related to grafana?
16:36:45 While we have grafana up... What do people think about marking the neutron-tempest-iptables_hybrid-fedora job voting? It looks very stable as far as error rate, and will be helpful as the next version of RHEL comes down the pike.
16:37:35 njohnston: that is very good idea IMO
16:38:11 OK, I'll push a change for it
16:38:25 it's under 25% in last 30 days at least
16:38:36 there wasn't any real problem with it before IIRC
16:38:55 it only deviates from neutron-tempest-iptables_hybrid by a percent or two
16:39:01 so I think it's very good idea to make it voting (and maybe gating in few weeks too)
16:39:04 is the fedora one on that page?
16:39:07 which seems to be within the margin of error
16:39:25 haleyb: Yes, look in "Integrated Tempest Failure Rates (Check queue)"
16:39:32 which has a lot of lines but that is one of them
16:39:38 http://grafana.openstack.org/d/Hj5IHcSmz/neutron-failure-rate?panelId=16&fullscreen&orgId=1&from=now-30d&to=now
16:39:49 ah, a search didn't show it initially
16:40:08 latest datapoint: neutron-tempest-iptables_hybrid 8% fails, neutron-tempest-iptables_hybrid-fedora 6% fails
16:40:53 yes, there was one spike up to 100% about 2 weeks ago but it was the same for neutron-tempest-iptables_hybrid job and was related to os-vif issue IIRC
16:41:16 other than that it's good and I think it can be voting IMO
16:41:26 haleyb: any thoughts?
16:41:45 +1 for voting 16:41:57 ok, njohnston please propose patch for that 16:42:01 slaweq: https://review.openstack.org/642818 16:42:07 that was fast 16:42:11 :) 16:42:12 as long as we remember we're at the end of stein :) 16:42:48 yes, we can maybe wait with this patch couple of weeks to release stein and then merge this one 16:43:10 but IMO it will be good to have patch there and ask others what they think about it :) 16:43:16 I can -W until the branch 16:43:30 +1 16:43:34 sounds good 16:43:59 ok, lets move on as I will need to go soon 16:44:04 next topic 16:44:14 #topic functional/fullstack 16:44:32 we merged some fixes for recent failures and we are in better shape now 16:44:38 I wanted to share with You one bug 16:44:40 https://bugs.launchpad.net/neutron/+bug/1818614 16:44:41 Launchpad bug 1818614 in neutron "Various L3HA functional tests fails often" [Critical,In progress] - Assigned to Slawek Kaplonski (slaweq) 16:44:47 I was investigating it during last week 16:44:54 and I found 2 different issues there 16:45:22 one with neutron-keepalived-state-change for which patch is https://review.openstack.org/#/c/642295/ 16:45:30 but I will need to check it once again 16:45:36 so it's still -W 16:45:52 but I also opened second bug https://bugs.launchpad.net/neutron/+bug/1819160 16:45:54 Launchpad bug 1819160 in neutron "Functional tests for dvr ha routers are broken" [High,In progress] - Assigned to Slawek Kaplonski (slaweq) 16:46:07 and I would like to ask haleyb and other L3 experts to take a look on it 16:46:31 basically it looks for me that in test_dvr_router module in functional tests we are not testing dvr ha routers 16:46:43 or differently 16:46:54 we should test one dvr ha router spawned on 2 agents 16:47:09 but we are testing 2 independent dvr routers spawned on 2 different agents 16:47:27 and both of them can be set to master in the same time 16:47:48 IMO that is wrong and should be fixed but please check it if You will have time :) 16:48:02 will look 16:48:08 thx 16:48:49 ok, anything else You want to discuss today, because I need to leave now :) 16:49:02 that's it for me 16:49:21 ok, thx for attending and sorry for a nit shorter meeting today :) 16:49:22 nothing worth keeping you around! 16:49:25 #endmeeting
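
Editor's note on the DSCP fullstack race discussed at 16:06:21: the failure mode was that tcpdump is started asynchronously, so a single ICMP packet sent immediately afterwards can be transmitted before the capture is actually attached, and the test then sees an empty capture. Below is a minimal Python sketch of the idea behind the fix (sending at least 10 packets so that some are captured even if the first few race the capture startup). It is not the actual neutron fullstack code from https://review.openstack.org/#/c/642186/; start_capture and send_icmp are hypothetical stand-ins, and running it requires tcpdump, ping, and usually root privileges.

    import subprocess
    import time

    def start_capture(interface, pcap_file):
        # tcpdump is started asynchronously; there is a short window before it
        # is attached to the interface, during which packets are not captured.
        return subprocess.Popen(
            ["tcpdump", "-i", interface, "-w", pcap_file, "icmp"],
            stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)

    def send_icmp(destination, count):
        # Sending several packets instead of one means that even if the first
        # few are sent before tcpdump is ready, later ones are still captured,
        # which is enough for the test's assertion on the capture contents.
        subprocess.run(
            ["ping", "-c", str(count), "-i", "0.2", destination], check=False)

    if __name__ == "__main__":
        capture = start_capture("lo", "/tmp/dscp_check.pcap")
        send_icmp("127.0.0.1", count=10)  # >= 10 packets, per the proposed fix
        time.sleep(1)
        capture.terminate()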