16:00:31 <slaweq> #startmeeting neutron_ci
16:00:32 <openstack> Meeting started Tue Feb 19 16:00:31 2019 UTC and is due to finish in 60 minutes. The chair is slaweq. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:33 <slaweq> hi
16:00:34 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:36 <openstack> The meeting name has been set to 'neutron_ci'
16:00:40 <mlavalle> o/
16:00:49 <njohnston_> o/
16:01:26 <slaweq> lets wait a few minutes for others
16:01:44 <slaweq> maybe bcafarel and hongbin will join too
16:01:52 <slaweq> I know that haleyb is on PTO today
16:01:58 <bcafarel> o/
16:02:33 <bcafarel> thanks for the ping slaweq, I was writing some doc (easy to forget the clock then)
16:02:45 <slaweq> bcafarel: :)
16:03:00 <slaweq> ok, let's start then
16:03:06 <slaweq> #topic Actions from previous meetings
16:03:14 <slaweq> mlavalle to check if adding KillFilter for neutron-keepalived-state-change will solve issues with L3 agent in dvr jobs
16:03:39 <mlavalle> I proposed patch https://review.openstack.org/#/c/636710/
16:03:51 <hongbin> o/
16:03:59 <slaweq> hi hongbin :)
16:03:59 <mlavalle> and actually dougwig made an interesting comment
16:04:10 <mlavalle> which I intend to explore
16:04:36 <njohnston_> yeah, that is interesting
16:05:03 <slaweq> but the patch basically looks like it is helping to solve some of the failing tests in the multinode job, right?
16:05:25 <mlavalle> I agree in that we should keep the door as narrowly open as possible
16:05:48 <bcafarel> +1
16:05:54 <slaweq> ++
16:06:01 <mlavalle> slaweq: I haven't had time to check on that since yesterday. Did you look?
16:06:55 <slaweq> I looked at the test results just now: http://logs.openstack.org/10/636710/2/check/neutron-tempest-plugin-dvr-multinode-scenario/f44d655/testr_results.html.gz
16:07:10 <slaweq> it looks like "only" 2 tests failed, which is much better than it was
16:07:47 <mlavalle> oh yeah, in fact the trunk lifecycle test is passing
16:07:56 <mlavalle> which I haven't seen in a long time
16:08:08 <mlavalle> so it looks like we are moving in the right direction
16:08:10 <slaweq> also from a quick look at the l3-agent's log, it looks much better and IMO the agent was working properly the whole time
16:08:22 <bcafarel> so there really was a rootwrap filter missing all along?
16:08:31 <mlavalle> it seems so
16:08:32 <slaweq> bcafarel: for some time at least
16:08:45 <mlavalle> it was only for some months
16:08:46 <slaweq> bcafarel: I removed them when I removed the old metadata proxy code
16:09:05 <bcafarel> slaweq: ok, missing for some months makes more sense :)
16:09:28 <slaweq> :)
16:09:36 <mlavalle> bcafarel: when I started suspecting the filters were the cause, I had the exact same question in my mind
16:09:54 <mlavalle> how come this worked before?
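
(An aside on the KillFilter being discussed: oslo.rootwrap filters are ini-style entries shipped in neutron's rootwrap.d/ files, and a KillFilter line names the user to run as, the command whose processes may be signalled, and the allowed signals. A minimal sketch is below; the filter name, matched command and signal list are illustrative assumptions, not the exact content of https://review.openstack.org/#/c/636710/. Keeping the door "as narrowly open as possible" here means matching a specific executable and a minimal signal list rather than, for example, any python process.)

    [Filters]
    # Illustrative entry only: allow killing neutron-keepalived-state-change
    # processes with SIGTERM (-15) or SIGKILL (-9).
    kill_keepalived_state_change: KillFilter, root, neutron-keepalived-state-change, -15, -9
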
16:10:10 <mlavalle> but we really introduced the bug recently
16:10:43 <mlavalle> ok, so I'll play with some more rechecks of the patch
16:10:51 <slaweq> I think that this bug was exposed by the switch to python3
16:11:01 <slaweq> on python2.7 it is "working" somehow
16:11:19 <mlavalle> and I'll explore dougwig's suggestion, which is very sensible
16:11:21 <slaweq> at least the agent is not dying when this issue occurs
16:11:46 <mlavalle> yeah, I think that in python2.7 we still miss the filter
16:11:56 <mlavalle> but for some reason we don't kill the agent
16:12:12 <mlavalle> and therefore we don't have a chain reaction in the tests
16:12:21 <slaweq> yep
16:13:16 <slaweq> ok, moving on to the next one then
16:13:18 <slaweq> bcafarel to continue work on grafana jobs switch to python 3
16:13:37 <bcafarel> s/grafana/grenade/
16:13:53 <slaweq> bcafarel: right :)
16:14:39 <bcafarel> I commented on #openstack-neutron yesterday on it, I think it will be better to have the grenade job definition in the grenade repo once it is full zuul v3 (which will allow devstack_localrc vars to work)
16:14:50 <bcafarel> so I updated https://review.openstack.org/#/c/636356/ to simply switch to py3 in the meantime
16:15:13 <bcafarel> just got zuul +1 :)
16:15:43 <slaweq> and now it looks like it is running on python 3 indeed :)
16:15:54 <slaweq> for example here: http://logs.openstack.org/56/636356/7/check/neutron-grenade-multinode/0ea452a/logs/screen-q-l3.txt.gz#_Feb_19_14_14_16_092790
16:16:24 <bcafarel> yes, wrong variable in my first try sorry, now it looks good
16:16:41 <slaweq> +2 it already
16:16:56 <slaweq> IMO it will be good if we just have the single node grenade on python 2.7
16:17:01 <slaweq> and the others on python3
16:17:31 <njohnston_> and the nature of the grenade job is that it upgrades from an old version running py27 to a new version running py3
16:18:28 <njohnston_> so regardless of what code is running the grenade harness we get some testing on both versions of the neutron code
16:18:29 <slaweq> njohnston_: I think that in those jobs it runs on python3 for both old and new
16:18:43 <slaweq> see e.g. here: http://logs.openstack.org/56/636356/7/check/neutron-grenade-multinode/0ea452a/logs/old/local_conf.txt.gz
16:19:24 <slaweq> if it would be possible to do an upgrade from py27 to py3 that would be great IMO
16:19:37 <slaweq> but maybe that should be discussed with the QA team?
16:20:00 <njohnston_> I checked the grenade output, for example
16:20:09 <slaweq> njohnston_: in the grenade-py3 job (defined in the grenade repo) it is also like that, all on py3: http://logs.openstack.org/56/636356/7/check/grenade-py3/b075864/logs/old/local_conf.txt.gz
16:20:43 <njohnston_> old: http://logs.openstack.org/56/636356/7/check/grenade-py3/b075864/logs/grenade.sh.txt.gz#_2019-02-19_13_45_48_138 "New python executable in /opt/stack/old/requirements/.venv/bin/python2"
16:20:56 <njohnston_> new: http://logs.openstack.org/56/636356/7/check/grenade-py3/b075864/logs/grenade.sh.txt.gz#_2019-02-19_14_19_45_871 "New python executable in /opt/stack/new/requirements/.venv/bin/python3.5"
16:21:11 <hongbin> if the upgrade from py3 to py3 succeeds, is there any scenario where an upgrade from py2 to py3 breaks?
16:22:28 <slaweq> njohnston_: but is devstack deployed in a venv in the grenade job? I really don't think so
16:22:41 <slaweq> but I'm not a grenade expert so I may be wrong here :)
16:22:57 <slaweq> hongbin: we don't have any other jobs which test upgrades
16:23:55 <njohnston_> slaweq: I can ask the QA team to be sure
16:24:10 <slaweq> njohnston_: ok, thx
16:24:21 <hongbin> ok, from me, I think testing the py3 to py3 upgrade is enough
16:24:39 <slaweq> njohnston_: but still, the patch from bcafarel is "consistent" with this single node grenade job so we can IMO go forward with it :)
16:24:56 <njohnston_> slaweq: Absolutely, no disagreement there
16:25:36 <slaweq> njohnston_: great :)
16:25:58 <slaweq> ok, can we move on?
16:27:08 <njohnston_> yep
16:27:17 <slaweq> ok, next one then
16:27:23 <slaweq> slaweq to propose patch with new decorator skip_if_timeout in functional tests
16:27:33 <slaweq> I did patch https://review.openstack.org/#/c/636892/
16:27:40 <slaweq> please review it if You can
16:28:40 <njohnston_> looks like functional tests failed on the last run
16:29:22 <njohnston_> I guess that would be the run still running
16:29:31 <slaweq> njohnston_: where?
16:29:40 <bcafarel> http://logs.openstack.org/92/636892/1/gate/neutron-functional/85da30c/logs/testr_results.html.gz ?
16:30:13 <slaweq> ahh, in the gate
16:30:21 <slaweq> ok, so maybe I will need to investigate it more :)
16:30:23 <slaweq> thx
16:30:25 <njohnston_> http://logs.openstack.org/92/636892/1/gate/neutron-functional/85da30c/job-output.txt.gz#_2019-02-19_15_32_02_801108
16:31:17 <slaweq> so it looks like it maybe doesn't work properly, I will investigate that tomorrow morning then
16:31:36 <slaweq> #action slaweq to fix patch with skip_if_timeout decorator
16:31:45 <bcafarel> you did get your test failure in the end at least :)
16:31:56 <slaweq> LOL, indeed
16:32:20 <slaweq> ok, that's all for actions from last week
16:32:29 <slaweq> next topic is:
16:32:31 <slaweq> #topic Python 3
16:33:02 <slaweq> as we already talked about today, we have (now in the gate even) a patch for the grenade multinode jobs to switch to py3
16:33:06 <slaweq> thx bcafarel :)
16:33:30 <slaweq> there is also patch https://review.openstack.org/633979 for the neutron-tempest-dvr-ha-multinode-full job
16:33:39 <slaweq> but this one is still failing with some tests
16:34:19 <slaweq> I'm not sure if that is an issue with the tests or maybe a similar problem to what we have in the neutron-tempest-plugin-scenario-dvr-multinode job
16:34:46 <slaweq> so I will probably split it into 2 pieces: migration to zuulv3 and then a second patch with the switch to py3
16:35:00 <slaweq> do You agree?
16:35:04 <mlavalle> yes
16:35:10 <bcafarel> sounds good yes
16:35:12 <slaweq> thanks :)
16:35:19 <njohnston_> +1
16:35:33 <slaweq> #action slaweq to split patch https://review.openstack.org/633979 into two: zuulv3 and py3 parts
16:36:00 <slaweq> and I think that this will be all for the switch to py3
16:36:25 <slaweq> we will still have some experimental jobs to switch but that can be done slowly later I think
16:37:31 <bcafarel> yeah it would be a bad sign if important tests we needed for proper python3 support were hidden in experimental jobs
16:37:42 <bcafarel> so we can do these "leisurely"
16:37:58 <mlavalle> I like leisurely
16:38:04 <slaweq> LOL
16:38:08 <slaweq> me too
16:38:49 <slaweq> ok, any other questions/something to add about python3?
16:38:54 <slaweq> or can we move on?
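
(Returning to the skip_if_timeout decorator mentioned earlier: the general idea is to turn a timeout-induced failure in a functional test into a skip rather than a failure. Below is a minimal sketch of that approach, assuming the condition is detected by catching a timeout exception; the exception type, decorator signature and any logging in https://review.openstack.org/#/c/636892/ may well differ.)

    import functools
    import unittest


    def skip_if_timeout(reason):
        """Skip the decorated test, instead of failing it, when it hits a timeout."""
        def decorator(f):
            @functools.wraps(f)
            def wrapper(self, *args, **kwargs):
                try:
                    return f(self, *args, **kwargs)
                except TimeoutError:  # assumed exception type, for illustration only
                    raise unittest.SkipTest(reason)
            return wrapper
        return decorator
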
16:38:57 <mlavalle> not from me
16:39:12 <njohnston_> I need to send an email to openstack-discuss to ask the stadium projects about their py3 status
16:39:12 <bcafarel> all good here
16:39:14 * mlavalle will have to drop off at 45 minutes after the hour
16:39:26 <slaweq> njohnston_++
16:39:45 <slaweq> #action njohnston to ask stadium projects about python3 status
16:39:49 <njohnston_> I can help with some, but I wouldn't want to butt in to projects like midonet that I know little about
16:40:14 <slaweq> njohnston_: yes, same for me
16:40:24 <njohnston_> I believe I sent an email a while back and got no requests for help, but let's see if anyone is so motivated now
16:40:36 <slaweq> thx njohnston_
16:40:48 <slaweq> ok, lets move on quickly
16:40:50 <slaweq> #topic Grafana
16:40:56 <slaweq> #link http://grafana.openstack.org/dashboard/db/neutron-failure-rate
16:42:05 <slaweq> I was looking at grafana today and TBH I don't see anything very bad on it in the last few days
16:42:26 <slaweq> but maybe You noticed something that You want to discuss
16:43:32 <slaweq> even all the tempest jobs are in quite good shape this week
16:43:35 <mlavalle> looks pretty good to me
16:44:18 <mlavalle> the only thing I see is functional py27 in gate
16:44:35 <slaweq> mlavalle: yes, functional tests are not very good currently
16:44:48 <slaweq> and I have the culprits for it
16:44:55 <slaweq> #topic fullstack/functional
16:45:16 <slaweq> we recently noticed at least 3 bugs in functional tests:
16:45:18 <slaweq> - https://bugs.launchpad.net/neutron/+bug/1816239 - patch proposed https://review.openstack.org/#/c/637544/
16:45:19 <openstack> Launchpad bug 1816239 in neutron "Functional test test_router_processing_pool_size failing" [High,In progress] - Assigned to Brian Haley (brian-haley)
16:45:20 <slaweq> - https://bugs.launchpad.net/neutron/+bug/1815585 - if there is no one else to look at this, I will try to debug it
16:45:21 <openstack> Launchpad bug 1815585 in neutron "Floating IP status failed to transition to DOWN in neutron-tempest-plugin-scenario-linuxbridge" [High,Confirmed]
16:45:22 <slaweq> - https://bugs.launchpad.net/neutron/+bug/1816489 - same here, we need a volunteer for this one
16:45:23 <openstack> Launchpad bug 1816489 in neutron "Functional test neutron.tests.functional.agent.l3.test_ha_router.LinuxBridgeL3HATestCase.test_ha_router_lifecycle failing" [High,Confirmed]
16:45:53 <slaweq> the first one is hitting the most often I think and it already has a fix proposed
16:46:06 <slaweq> so we should be better after this is merged
16:46:15 <slaweq> the other 2 need volunteers to debug :)
16:46:27 <mlavalle> oh, oh, do we depend on haleyb for this? we are in trouble :-)
16:46:57 <slaweq> I can take a look at https://bugs.launchpad.net/neutron/+bug/1815585 but someone else could look at the last one maybe :)
16:46:58 <openstack> Launchpad bug 1815585 in neutron "Floating IP status failed to transition to DOWN in neutron-tempest-plugin-scenario-linuxbridge" [High,Confirmed]
16:47:17 <mlavalle> ok, I'll try to look at the last one
16:47:20 <slaweq> mlavalle: no, fortunately the fix was done by liuyulong :)
16:47:24 <slaweq> thx mlavalle
16:47:35 <mlavalle> LOL
16:47:35 <slaweq> #action mlavalle to check bug https://bugs.launchpad.net/neutron/+bug/1816489
16:47:36 <openstack> Launchpad bug 1816489 in neutron "Functional test neutron.tests.functional.agent.l3.test_ha_router.LinuxBridgeL3HATestCase.test_ha_router_lifecycle failing" [High,Confirmed]
16:47:47 <slaweq> #action slaweq to check bug https://bugs.launchpad.net/neutron/+bug/1815585
16:47:48 <mlavalle> ok guys I got to leave
16:47:53 <slaweq> ok, thx mlavalle
16:47:55 <slaweq> see You later
16:47:59 <mlavalle> o/
16:48:06 <slaweq> basically that is all I have for today
16:48:07 <bcafarel> o/ mlavalle
16:48:17 <slaweq> other jobs are in pretty good shape
16:48:28 <slaweq> so do You have something else You want to talk about today?
16:49:13 <bcafarel> nothing from me
16:49:26 <slaweq> ok, so lets have 10 minutes back today :)
16:49:36 <slaweq> thx for attending and see You all around
16:49:41 <slaweq> o/
16:49:43 <bcafarel> \o/
16:49:45 <slaweq> #endmeeting