16:00:36 #startmeeting neutron_ci
16:00:37 Meeting started Tue Oct 23 16:00:36 2018 UTC and is due to finish in 60 minutes. The chair is slaweq. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:38 hi
16:00:39 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:42 The meeting name has been set to 'neutron_ci'
16:00:44 o/
16:00:57 o/
16:01:42 #topic Actions from previous meetings
16:01:48 lets start
16:01:56 slaweq to continue checking how jobs will run on Bionic nodes
16:02:08 I didn't have time to work on this during last week
16:02:15 sorry for that
16:02:30 I will assign it to myself for next week also
16:02:35 #action slaweq to continue checking how jobs will run on Bionic nodes
16:02:51 next one:
16:02:53 mlavalle to continue debugging issue with not reachable FIP in scenario jobs
16:03:07 I went over a couple more failures
16:03:14 and left notes in the bug
16:03:32 I created an environment to test it locally
16:03:37 o/
16:03:46 so I am running it
16:03:54 in my local environment
16:04:05 and were You able to reproduce it locally?
16:04:12 not yet
16:04:20 I'll try today and tomorrow
16:04:43 if I don't succeed then I will try to debug in zuul
16:04:53 following slaweq's recipe
16:05:10 ok, thx for the update and for working on this mlavalle
16:05:24 #action mlavalle to continue debugging issue with not reachable FIP in scenario jobs
16:05:34 ok, next one is
16:05:35 * mlavalle to send an email about moving tempest plugins from stadium to separate repo
16:05:42 I did earlier today
16:06:05 http://lists.openstack.org/pipermail/openstack-dev/2018-October/135977.html
16:06:40 thx mlavalle
16:07:07 lets wait for responses from the stadium projects now
16:07:16 next one was:
16:07:18 * slaweq to report a bug about issue with process ending in fullstack tests
16:07:24 Done: https://bugs.launchpad.net/neutron/+bug/1798472
16:07:24 Launchpad bug 1798472 in neutron "Fullstack tests fails because process is not killed properly" [High,Confirmed]
16:07:45 I only reported it but I didn't have a chance to work on this
16:08:00 so it's free to take if someone wants :)
16:08:22 it's maybe not "low-hanging-fruit" but may be interesting to debug ;)
16:08:30 I might take it after I finish debugging the previous to-do that I have
16:08:47 thx mlavalle
16:09:01 ok, next one:
16:09:03 * haleyb to check issue with failing FIP transition to down state
16:09:14 I think haleyb is not around today
16:09:19 i just got here
16:09:26 ohh, hi haleyb :)
16:09:59 i looked into it, the test still seems good, don't know why the associated port gets into BUILD state, still looking
16:10:52 so the port to which the FIP is associated is in BUILD state?
16:11:19 right, and the fip stays in ACTIVE (from my memory)
16:12:22 so i'll continue the fight
16:13:58 can it maybe be related to https://review.openstack.org/#/c/606827/ somehow?
16:14:43 there is info there that a port may get stuck in DOWN state but I don't know, maybe it's something similar?
16:15:10 i'll look, but in the original issue there is no live migration, it just removes the device from the VM
16:15:21 ahh, ok
16:15:28 so maybe that other issue then
16:15:58 from what I know, a port stays in BUILD if the L2 agent doesn't send info that the port is ACTIVE (or DOWN)
16:16:12 so You should check in the L2 agent's logs if the port was wired properly
16:16:23 and dig from there IMHO
16:16:36 sure, will look there
16:16:42 thanks!
16:17:02 ok, move on then
16:17:06 next was:
16:17:09 slaweq to add e-r query for known fullstack issue (when bug will be reported)
16:17:14 Done: https://review.openstack.org/#/c/611529/
16:17:20 patch is merged already
16:18:03 and that's all for actions from previous week
16:18:15 anyone wants to add anything?
16:18:29 not me
16:18:50 ok, lets move on then
16:18:57 #topic Grafana
16:19:04 #link http://grafana.openstack.org/dashboard/db/neutron-failure-rate
16:20:58 there was some spike during the weekend on some jobs, but as it was similar on many different jobs I suspect it's maybe related to some infra issue
16:21:14 I don't know about anything on our side which could cause such an issue
16:21:57 anything specific You want to discuss now?
16:22:06 no, I'm good
16:22:17 or can we go talk about the different jobs as usual?
16:22:28 yes, let's do that
16:22:55 #topic fullstack/functional
16:23:13 speaking about fullstack, I found 2 issues recently
16:23:27 one was mentioned before: * https://bugs.launchpad.net/neutron/+bug/1798472
16:23:28 Launchpad bug 1798472 in neutron "Fullstack tests fails because process is not killed properly" [High,Confirmed]
16:23:35 and second is * https://bugs.launchpad.net/neutron/+bug/1798475
16:23:36 Launchpad bug 1798475 in neutron "Fullstack test test_ha_router_restart_agents_no_packet_lost failing" [High,Confirmed]
16:24:09 where it looks like restart of L3 agents with HA routers again causes some packet loss sometimes
16:24:12 They are in haleyb's bug report
16:24:26 mlavalle: yes, both are there
16:26:17 currently it's not hitting us very often but if this second issue becomes very annoying we can mark this test as unstable
16:26:29 but I would like to first try to fix it maybe :)
16:26:57 and speaking about functional tests, we also still have some issues there
16:27:36 I still see sometimes that db migration tests are failing, even with the timeout set to 300 seconds
16:28:01 I checked in the logs that those tests can take even up to 400 seconds (I found such numbers at least)
16:28:18 so I proposed another patch: https://review.openstack.org/#/c/612505/
16:28:42 I know that it's not the best "solution" but I don't think we can do anything else with it :/
16:28:57 so please check that patch
16:29:09 I just +2ed it
16:29:13 thx mlavalle
16:29:24 and I will push it when zuul returns green
16:29:28 and the second issue which I see from time to time is the failing test: neutron.tests.functional.agent.l3.test_ha_router.LinuxBridgeL3HATestCase.test_ha_router_namespace_has_ipv6_forwarding_disabled
16:29:54 like e.g. in http://logs.openstack.org/37/610737/2/gate/neutron-functional/286402b/logs/testr_results.html.gz
16:30:27 I think that it may be related to the failing fullstack test: https://bugs.launchpad.net/neutron/+bug/1798475 but that should be checked in the logs first
16:30:27 Launchpad bug 1798475 in neutron "Fullstack test test_ha_router_restart_agents_no_packet_lost failing" [High,Confirmed]
16:30:44 and I will check that this week
16:31:12 #action slaweq to check if failing test_ha_router_namespace_has_ipv6_forwarding_disabled is related to bug https://bugs.launchpad.net/neutron/+bug/1798475
16:31:50 and that's all about fullstack/functional for today on my side
16:31:55 do You want to add something?
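
For context on the e-r query action noted above (https://review.openstack.org/#/c/611529/): elastic-recheck queries are small per-bug YAML files kept in the elastic-recheck repo's queries/ directory, each holding a Logstash query that matches the failure signature. The sketch below is only illustrative, assuming the usual queries/<bug-number>.yaml naming and a hypothetical message filter; it is not the content of the merged patch.

    # queries/1798472.yaml -- illustrative sketch, not the actual merged query
    # The message string is a made-up example of a failure signature; a real
    # query would quote a line that actually appears in the failing job logs.
    query: >-
      message:"Timed out waiting for process to die"
      AND build_name:"neutron-fullstack"
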
16:31:58 +1
16:32:11 no
16:32:46 ok, so next topic
16:32:48 #topic Tempest/Scenario
16:32:58 I was looking at tempest jobs this week
16:33:30 and I found out that we hit timeouts on the neutron-tempest-xxx jobs from time to time
16:33:40 it hits jobs like:
16:33:48 neutron-tempest-iptables_hybrid
16:33:54 neutron-tempest-linuxbridge
16:34:02 and
16:34:04 neutron-tempest-dvr
16:34:19 all of them have a timeout configured of 7800 seconds
16:34:32 maybe we should increase it?
16:34:53 we can give it a try
16:35:08 I can imagine that sometimes a job can land on a cloud node which is under heavy load and tests run slower there
16:35:22 so I think it's better to wait e.g. 3h for results than to recheck
16:35:51 and from what I was checking, neutron related tests aren't at the top of the longest ones in those jobs :)
16:36:34 so mlavalle, would it be good to add 1h to those timeouts?
16:36:44 yes
16:36:50 ok, I will do that
16:36:52 +1
16:37:02 #action slaweq to increase neutron-tempest jobs timeouts
16:37:16 I hope it will make the jobs at least a bit more stable
16:37:40 that's all from me about scenario/tempest jobs for this week
16:37:46 do You have anything to add?
16:37:49 nope
16:38:07 so last topic for today is
16:38:09 #topic Open discussion
16:38:22 and I wanted to ask here about a few things
16:38:42 first of all: we currently have 2 quite big topics related to CI jobs:
16:38:57 running jobs with python 3
16:38:59 and second
16:39:08 use of Bionic instead of Xenial
16:39:30 indeed
16:39:32 what do You think about adding topics about those two things to the agenda of this meeting?
16:39:42 to track progress on both of them weekly?
16:39:50 sure, I think that is a good idea
16:39:54 that's a good proposal
16:40:35 ok, so I will include it in the next meeting then, thx :)
16:40:42 and also I have one more question
16:40:48 related to patch https://review.openstack.org/#/c/573933/
16:40:59 it has been waiting for review for a long time
16:41:17 but I'm not sure if we really want to maintain something like that in our repo
16:41:26 so I wanted to ask for Your opinions about it
16:41:57 if You didn't see it yet, please add it to Your review list and maybe check it this week
16:41:58 I'll look at it slowly and leave my comments
16:42:05 thx mlavalle
16:42:26 Since it's meant for developers to run by hand, it seems to me it's more of a dev tool than a CI tool, at least at this point
16:42:38 I'll have to run it and see what he's trying to do
16:42:56 njohnston: yes, for me it's some kind of UT framework for bash
16:44:01 and I'm not sure it is something that should land in our repo but I would like to know others' opinions about it :)
16:44:29 ok, and that's all from my side for today
16:44:39 cool
16:44:39 anything else You want to discuss?
16:44:43 not from me
16:44:51 * mlavalle is going to rest for a while
16:44:59 get well soon
16:45:13 ok, thx for attending guys
16:45:17 thanks!
16:45:20 bye
16:45:26 and feel better soon mlavalle :)
16:45:32 thanks!
16:45:33 #endmeeting
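
As a footnote to the timeout action item above: a Zuul job's run time limit is set by the job's timeout attribute (in seconds), so adding the agreed hour to the current 7800 seconds would give roughly 11400 seconds. A minimal sketch of such a change, assuming the job definitions live in neutron's .zuul.yaml; the actual patch may instead touch a parent job or other attributes.

    # .zuul.yaml -- illustrative sketch of bumping one job's timeout
    - job:
        name: neutron-tempest-dvr
        # was 7800; +1h as discussed in the meeting
        timeout: 11400
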