16:00:04 #startmeeting neutron_ci
16:00:05 Meeting started Tue Jun 5 16:00:04 2018 UTC and is due to finish in 60 minutes. The chair is slaweq. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:07 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:09 The meeting name has been set to 'neutron_ci'
16:00:18 hello at yet another meeting today :)
16:00:46 hi
16:01:31 hi there
16:01:34 hi haleyb
16:01:38 and hi mlavalle again :)
16:01:55 I think we can start
16:02:01 #topic Actions from previous meetings
16:02:13 slaweq continue debugging fullstack security groups issue: https://bugs.launchpad.net/neutron/+bug/1767829
16:02:14 Launchpad bug 1767829 in neutron "Fullstack test_securitygroup.TestSecurityGroupsSameNetwork fails often after SG rule delete" [High,In progress] - Assigned to Slawek Kaplonski (slaweq)
16:02:48 I couldn't reproduce this one locally, but from the logs it looks like it might be the same issue as the other bug, and the patch https://review.openstack.org/#/c/572295/ might help with it
16:03:01 and it is related to the next action:
16:03:06 slaweq will check fullstack multiple sg test failure: https://bugs.launchpad.net/neutron/+bug/1774006
16:03:07 Launchpad bug 1774006 in neutron "Fullstack security group test fails on _test_using_multiple_security_groups" [High,In progress] - Assigned to Slawek Kaplonski (slaweq)
16:03:28 it looks to me like both have the same root cause
16:03:39 a patch for it is proposed: https://review.openstack.org/#/c/572295/
16:03:48 this last one was mentioned by boden in his deputy email, right?
16:04:13 yes
16:04:18 that's this one :)
16:04:23 cool
16:04:31 please review it soon, as fullstack has a high failure rate because of it
16:05:00 ack
16:05:04 the last action from last week was:
16:05:04 slaweq to propose scenario job with iptables fw driver
16:06:18 sorry, I was disconnected
16:06:50 so the last action from the previous week was
16:06:54 slaweq to propose scenario job with iptables fw driver
16:07:03 I did a patch for neutron:
16:07:04 Patch for neutron: https://review.openstack.org/#/c/571692/
16:07:09 and one to update grafana:
16:07:20 Patch for grafana: https://review.openstack.org/572386
16:07:51 I +2ed the Neutron patch yesterday
16:08:12 I assumed haleyb might want to look at it. That's why I didn't W+ it
16:08:18 thx
16:08:31 mlavalle: i will look now/right after the meeting
16:08:35 it is still waiting for a patch in the devstack repo
16:08:43 so no rush with this one :)
16:09:13 ok, next topic then
16:09:14 #topic Grafana
16:09:20 http://grafana.openstack.org/dashboard/db/neutron-failure-rate
16:10:57 so looking at grafana, it looks like fullstack has been in bad condition recently
16:11:06 yeah
16:11:14 I saw a timeout earlier today
16:11:25 what timeout?
16:11:52 http://logs.openstack.org/59/572159/2/check/neutron-fullstack/630967b/job-output.txt.gz
16:13:23 and also, over the weekend, several of my multiple port binding patches failed fullstack with https://github.com/openstack/neutron/blob/master/neutron/tests/fullstack/test_l3_agent.py#L325
16:14:05 this one which You sent might be related to the patch on which it was running: http://logs.openstack.org/59/572159/2/check/neutron-fullstack/630967b/logs/testr_results.html.gz
16:14:24 so let's talk about fullstack now :)
16:14:29 ok
16:14:31 #topic Fullstack
16:14:47 btw, did you notice the Microsoft logo in github?
16:14:47 basically, what I found today is that we have two main reasons for failures
16:15:13 Security group failures - the two bugs mentioned before - so they should be fixed with my patch
16:15:21 * slaweq looking at github
16:15:42 mlavalle: we will be assimilated and all be writing win10 drivers soon :)
16:15:51 LOL
16:16:18 oh joy, Visual Basic... yaay!
16:16:46 and we will have to rewrite openstack to C# :P
16:17:09 ok, getting back to fullstack now :)
16:17:31 there is also a second issue that happens often:
16:17:31 Issue with test_ha_router_restart_agents_no_packet_lost
16:17:41 The bug is reported at https://bugs.launchpad.net/neutron/+bug/1775183 and I will investigate it this week
16:17:42 Launchpad bug 1775183 in neutron "Fullstack test neutron.tests.fullstack.test_l3_agent.TestHAL3Agent.test_ha_router_restart_agents_no_packet_lost fails often" [Critical,Confirmed] - Assigned to Slawek Kaplonski (slaweq)
16:17:44 cool
16:17:50 example of failure: https://review.openstack.org/#/c/569083/
16:18:08 sorry, ^^ is a link to the patch which introduced this failing test
16:18:22 an example of the failure is in the bug report
16:18:50 I don't know if my patch which disabled ipv6 forwarding is not working properly or if the test is broken
16:18:58 I will debug it this week
16:19:16 if I don't find anything in 1 or 2 days, I will send a patch to mark this test as unstable
16:19:18 ok?
16:19:21 ah ok, so no need to worry about it from the pov of my patches
16:20:18 btw, looking at that failure prompted me to propose this change: https://review.openstack.org/#/c/572159/
16:20:53 thx mlavalle :)
16:21:12 because now we are restarting other agents, not only the L2 agent
16:22:02 but as You can see in http://logs.openstack.org/59/572159/2/check/neutron-fullstack/630967b/logs/testr_results.html.gz there is no process_name attribute there :)
16:22:22 so this failure is because of You :P
16:23:05 hang on, let me look at something
16:23:16 #action slaweq to debug failing test_ha_router_restart_agents_no_packet_lost fullstack test
16:23:23 mlavalle: sure
16:25:12 slaweq: here https://github.com/openstack/neutron/blob/master/neutron/tests/fullstack/base.py#L91 we call the agent restart method, right?
16:25:45 right
16:25:55 which I think is this one: https://github.com/openstack/neutron/blob/master/neutron/tests/fullstack/resources/process.py#L84
16:26:22 am I wrong?
16:27:10 not exactly
16:27:18 it's OVSAgentFixture
16:27:26 which inherits from ServiceFixture
16:27:36 and this one does not inherit from ProcessFixture
16:27:42 (don't ask me why)
16:27:44 ahhh, ok
16:28:00 I have no idea if that is intentional or maybe a simple bug
16:28:10 so it is not going to have the process_name attribute
16:28:29 yes, it's intentional
16:28:34 so You should do something like
16:28:43 agent.process_fixture.process_name
16:28:47 and it should be good
16:28:56 ok, thanks
16:28:59 probably :)
16:29:00 will try that
16:29:07 let's give it a try
16:29:16 ok
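A minimal sketch of the attribute path suggested above, assuming agent is an OVSAgentFixture-style fullstack fixture that keeps its ProcessFixture in a process_fixture attribute; the helper name is illustrative only and not part of neutron or the patch under review:

    # Sketch only: OVSAgentFixture inherits from ServiceFixture rather than
    # ProcessFixture, so reading agent.process_name raises AttributeError.
    # The process name is reached through the nested process fixture instead.
    def get_agent_process_name(agent):
        return agent.process_fixture.process_name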
16:29:40 so that's all I have about fullstack tests for today
16:29:46 cool
16:29:47 anything else to add?
16:30:00 not from me
16:30:08 ok
16:30:37 so weird to see the Microsoft logo all the time
16:30:49 I think that the scenario jobs and rally are in (surprisingly) good shape now
16:30:53 mlavalle: LOL :)
16:31:06 but where do You see this logo?
16:31:21 I have the normal github logo at the top of the page
16:31:23 go to the top of https://github.com/openstack/neutron/blob/master/neutron/tests/fullstack/resources/process.py#L84
16:31:35 every github page has it now
16:31:42 I still have the github logo
16:32:06 so maybe it hasn't propagated to all the CDN nodes they use
16:32:18 are you signed in?
16:32:24 yes
16:32:36 then it is the CDNs
16:33:09 maybe :)
16:33:20 ok, going back to the meeting ;)
16:33:39 the rally job has been at a 0% failure rate recently
16:33:55 I don't know what happened to make it like that, but I'm happy with it :)
16:34:05 don't jinx it
16:34:11 don't even mention it
16:34:18 LOL
16:34:40 ok, I will not do it anymore
16:34:47 so what I want to talk about is:
16:34:52 #topic Periodic
16:35:08 I found that many periodic jobs failed during the last week
16:35:21 examples:
16:35:22 * openstack-tox-py27-with-oslo-master - issue http://logs.openstack.org/periodic/git.openstack.org/openstack/neutron/master/openstack-tox-py27-with-oslo-master/031dc64/testr_results.html.gz
16:35:30 * openstack-tox-py35-with-neutron-lib-master - failure for the same reason as openstack-tox-py27-with-oslo-master - http://logs.openstack.org/periodic/git.openstack.org/openstack/neutron/master/openstack-tox-py35-with-neutron-lib-master/4f4b599/testr_results.html.gz
16:35:36 * openstack-tox-py35-with-oslo-master - today's failure for the same reason: http://logs.openstack.org/periodic/git.openstack.org/openstack/neutron/master/openstack-tox-py35-with-oslo-master/348faa8/testr_results.html.gz
16:35:45 all those jobs were failing for the same reason
16:36:02 I think that I also saw it somewhere in the unit tests
16:36:17 and I think that we should report a bug and someone should take a look at it
16:36:25 did You see it before?
16:36:44 Yes, I've seen a couple of cases in the check queue
16:36:54 one of them with py35
16:37:47 so that would explain why UT on grafana is also spiking to quite high values from time to time - I bet it's probably mostly this issue
16:37:52 if you file the bug, I'll bring it up on Thursday during the OVO meeting
16:38:18 ok, I will file a bug just after the meeting and will send it to You
16:38:19 thx
16:38:24 yeap
16:38:53 #action mlavalle to talk about the unit tests issue at the OVO meeting
16:39:46 other periodic jobs' failures are not related to neutron but to some problems with volumes
16:40:01 so that was all I had for today in my notes
16:40:07 #topic Open discussion
16:40:13 anything else to add/ask?
16:40:22 not from me
16:40:45 haleyb: do You have anything You want to talk about?
16:41:07 if not, then we can finish earlier today
16:41:08 no, don't think i have any ci issues
16:41:21 * slaweq is really fast today ;)
16:41:30 ok, so thx for attending
16:41:33 bye
16:41:36 #endmeeting