16:00:31 <slaweq> #startmeeting neutron_ci
16:00:32 <openstack> Meeting started Tue Feb 19 16:00:31 2019 UTC and is due to finish in 60 minutes.  The chair is slaweq. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:33 <slaweq> hi
16:00:34 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:36 <openstack> The meeting name has been set to 'neutron_ci'
16:00:40 <mlavalle> o/
16:00:49 <njohnston_> o/
16:01:26 <slaweq> let's wait a few minutes for others
16:01:44 <slaweq> maybe bcafarel and hongbin will join too
16:01:52 <slaweq> I know that haleyb is on PTO today
16:01:58 <bcafarel> o/
16:02:33 <bcafarel> thanks for the ping slaweq, I was writing some doc (easy to forget the clock then)
16:02:45 <slaweq> bcafarel: :)
16:03:00 <slaweq> ok, let's start then
16:03:06 <slaweq> #topic Actions from previous meetings
16:03:14 <slaweq> mlavalle to check if adding KillFilter for neutron-keepalived-state-change will solve issues with L3 agent in dvr jobs
16:03:39 <mlavalle> I proposed patch https://review.openstack.org/#/c/636710/
16:03:51 <hongbin> o/
16:03:59 <slaweq> hi hongbin :)
16:03:59 <mlavalle> and actually dougwig made an interesting comment
16:04:10 <mlavalle> which I intend to explore
16:04:36 <njohnston_> yeah, that is interesting
16:05:03 <slaweq> but the patch basically looks like it's helping to solve some of the failing tests in the multinode job, right?
16:05:25 <mlavalle> I agree that we should keep the door open as narrowly as possible
16:05:48 <bcafarel> +1
16:05:54 <slaweq> ++
16:06:01 <mlavalle> slaweq: I haven't had time to check on that since yesterday. Did you look?
16:06:55 <slaweq> I looked at the test results just now: http://logs.openstack.org/10/636710/2/check/neutron-tempest-plugin-dvr-multinode-scenario/f44d655/testr_results.html.gz
16:07:10 <slaweq> it looks like "only" 2 tests failed, which is much better than it was before
16:07:47 <mlavalle> oh yeah, in fact the trunk lifecycle test is passing
16:07:56 <mlavalle> which I haven't seen in a long time
16:08:08 <mlavalle> so it looks we are moving in the right direction
16:08:10 <slaweq> also from a quick look at the l3-agent's log, it looks much better and IMO the agent was working properly the whole time
16:08:22 <bcafarel> so there really was a rootwrap filter missing all along?
16:08:31 <mlavalle> it seems so
16:08:32 <slaweq> bcafarel: for some time at least
16:08:45 <mlavalle> it was only for some months
16:08:46 <slaweq> bcafarel: I removed them when I removed the old metadata proxy code
16:09:05 <bcafarel> slaweq: ok, missing for some months makes more sense :)
16:09:28 <slaweq> :)
16:09:36 <mlavalle> bcafarel: when I started suspecting the filters were the cause, I had the exact same question in my mind
16:09:54 <mlavalle> how come this worked before?
16:10:10 <mlavalle> but we really introduced the bug recently
16:10:43 <mlavalle> ok, so I'll play with some more rechecks of the patch
16:10:51 <slaweq> I think that this bug was exposed by the switch to python3
16:11:01 <slaweq> on python2.7 it is "working" somehow
16:11:19 <mlavalle> and I'll explore dougwig's suggestion, which is very sensible
16:11:21 <slaweq> at least the agent is not dying when this issue occurs
16:11:46 <mlavalle> yeah, I think that in python2.7 we still miss the filter
16:11:56 <mlavalle> but for some reason we don't kill the agent
16:12:12 <mlavalle> and therefore we don't have a chain reaction in the tests
16:12:21 <slaweq> yep
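
(For context on the fix discussed above: rootwrap kill permissions for the L3 agent live in neutron's l3.filters file, and a KillFilter entry lists the user, the executable of the target process, and the allowed signals. Below is a rough sketch of what an entry covering the python-based neutron-keepalived-state-change monitor could look like; the entry names and signal list are illustrative only, not the exact content of https://review.openstack.org/#/c/636710/.)

    # etc/neutron/rootwrap.d/l3.filters -- illustrative excerpt only
    [Filters]
    # allow the L3 agent to send SIGTERM/SIGKILL to the
    # neutron-keepalived-state-change monitor, which runs as a python/python3 process
    kill_keepalived_monitor_py: KillFilter, root, python, -15, -9
    kill_keepalived_monitor_py3: KillFilter, root, python3, -15, -9

(Keeping the executable and signal list as specific as possible is the "door open as narrowly as possible" point raised earlier in the discussion.)
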
16:13:16 <slaweq> ok, moving on to the next one then
16:13:18 <slaweq> bcafarel to continue work on grafana jobs switch to python 3
16:13:37 <bcafarel> s/grafana/grenade/
16:13:53 <slaweq> bcafarel: right :)
16:14:39 <bcafarel> I commented on it in #openstack-neutron yesterday; I think it will be better to have the grenade job definition in the grenade repo once it is fully zuul v3 (which will allow devstack_localrc vars to work)
16:14:50 <bcafarel> so I updated https://review.openstack.org/#/c/636356/ to simply switch to py3 in the meantime
16:15:13 <bcafarel> just got zuul +1 :)
16:15:43 <slaweq> and now it looks like it is indeed running on python 3 :)
16:15:54 <slaweq> for example here: http://logs.openstack.org/56/636356/7/check/neutron-grenade-multinode/0ea452a/logs/screen-q-l3.txt.gz#_Feb_19_14_14_16_092790
16:16:24 <bcafarel> yes, sorry, wrong variable in my first try; now it looks good
16:16:41 <slaweq> I +2'd it already
16:16:56 <slaweq> IMO it will be good if we just have the single-node grenade job on python 2.7
16:17:01 <slaweq> and the others on python3
16:17:31 <njohnston_> and the nature of the grenade job is that it upgrades from an old version running py27 to a new version running py3
16:18:28 <njohnston_> so regardless of what python version runs the grenade harness, we get some testing of the neutron code on both versions
16:18:29 <slaweq> njohnston_: I think that in those jobs it runs on python3 for both old and new
16:18:43 <slaweq> see e.g. here: http://logs.openstack.org/56/636356/7/check/neutron-grenade-multinode/0ea452a/logs/old/local_conf.txt.gz
16:19:24 <slaweq> if it were possible to do an upgrade from py27 to py3 that would be great IMO
16:19:37 <slaweq> but maybe that should be discussed with QA team?
16:20:00 <njohnston_> I checked the grenade output, for example
16:20:09 <slaweq> njohnston_: in grenade-py3 job (defined in grenade repo) it is also like that, all on py3: http://logs.openstack.org/56/636356/7/check/grenade-py3/b075864/logs/old/local_conf.txt.gz
16:20:43 <njohnston_> old: http://logs.openstack.org/56/636356/7/check/grenade-py3/b075864/logs/grenade.sh.txt.gz#_2019-02-19_13_45_48_138 "New python executable in /opt/stack/old/requirements/.venv/bin/python2"
16:20:56 <njohnston_> new: http://logs.openstack.org/56/636356/7/check/grenade-py3/b075864/logs/grenade.sh.txt.gz#_2019-02-19_14_19_45_871 "New python executable in /opt/stack/new/requirements/.venv/bin/python3.5"
16:21:11 <hongbin> if the upgrade from py3 to py3 succeeds, is there any scenario where an upgrade from py2 to py3 would break?
16:22:28 <slaweq> njohnston_: but is devstack deployed in a venv in the grenade job? I really don't think so
16:22:41 <slaweq> but I'm not a grenade expert so I may be wrong here :)
16:22:57 <slaweq> hongbin: we don't have any other jobs which test upgrades
16:23:55 <njohnston_> slaweq: I can ask the QA team to be sure
16:24:10 <slaweq> njohnston_: ok, thx
16:24:21 <hongbin> ok, from my side, I think testing the py3 to py3 upgrade is enough
16:24:39 <slaweq> njohnston_: but still, bcafarel's patch is "consistent" with this single-node grenade job so IMO we can go forward with it :)
16:24:56 <njohnston_> slaweq: Absolutely, no disagreement there
16:25:36 <slaweq> njohnston_: great :)
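
(To illustrate the zuul v3 point bcafarel raised earlier: once the grenade job definitions are native zuul v3, switching a job to python 3 could be expressed directly through devstack_localrc in the job definition rather than in legacy run scripts. This is a minimal sketch only; the job and parent names below are assumptions, not the real definitions from the grenade or neutron repos.)

    # illustrative zuul v3 job definition; names are assumptions
    - job:
        name: neutron-grenade-multinode-py3
        parent: grenade-multinode        # assumed zuulv3-native grenade base job
        required-projects:
          - openstack/neutron
        vars:
          devstack_localrc:
            USE_PYTHON3: true            # devstack then runs services under python 3
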
16:25:58 <slaweq> ok, can we move on?
16:27:08 <njohnston_> yep
16:27:17 <slaweq> ok, next one then
16:27:23 <slaweq> slaweq to propose patch with new decorator skip_if_timeout in functional tests
16:27:33 <slaweq> I proposed patch https://review.openstack.org/#/c/636892/
16:27:40 <slaweq> please review it if You can
16:28:40 <njohnston_> looks like functional tests failed on the last run
16:29:22 <njohnston_> I guess that would be the run still running
16:29:31 <slaweq> njohnston_: where?
16:29:40 <bcafarel> http://logs.openstack.org/92/636892/1/gate/neutron-functional/85da30c/logs/testr_results.html.gz ?
16:30:13 <slaweq> ahh, in the gate
16:30:21 <slaweq> ok, so maybe I will need to investigate it more :)
16:30:23 <slaweq> thx
16:30:25 <njohnston_> http://logs.openstack.org/92/636892/1/gate/neutron-functional/85da30c/job-output.txt.gz#_2019-02-19_15_32_02_801108
16:31:17 <slaweq> so it looks like it maybe doesn't work properly; I will investigate that tomorrow morning then
16:31:36 <slaweq> #action slaweq to fix patch with skip_if_timeout decorator
16:31:45 <bcafarel> you did get your test failure in the end at least :)
16:31:56 <slaweq> LOL, indeed
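
(A minimal sketch of the skip_if_timeout idea discussed above, assuming the helper wraps a functional test and turns a fixtures timeout into a skip; the actual name, location and behaviour in https://review.openstack.org/#/c/636892/ may differ.)

    import functools

    import fixtures


    def skip_if_timeout(reason):
        """Skip a functional test instead of failing it when it times out."""
        def decorator(f):
            @functools.wraps(f)
            def wrapper(self, *args, **kwargs):
                try:
                    return f(self, *args, **kwargs)
                except fixtures.TimeoutException:
                    # turn the timeout into a skip so the whole run is not failed
                    self.skipTest('Test timed out, skipping: %s' % reason)
            return wrapper
        return decorator

(A test would then be decorated with e.g. @skip_if_timeout('<bug reference>') above the affected test method; the argument here is just a placeholder.)
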
16:32:20 <slaweq> ok, that's all for actions from last week
16:32:29 <slaweq> next topic is:
16:32:31 <slaweq> #topic Python 3
16:33:02 <slaweq> as we already discussed today, we have the patch switching the grenade multinode jobs to py3 (now even in the gate)
16:33:06 <slaweq> thx bcafarel :)
16:33:30 <slaweq> there is also patch https://review.openstack.org/633979 for neutron-tempest-dvr-ha-multinode-full job
16:33:39 <slaweq> but this one is still failing some tests
16:34:19 <slaweq> I'm not sure if that is an issue with the tests or maybe a similar problem to what we have in the neutron-tempest-plugin-dvr-multinode-scenario job
16:34:46 <slaweq> so I will probably split it into 2 pieces: migration to zuulv3 and then a second patch with the switch to py3
16:35:00 <slaweq> do You agree?
16:35:04 <mlavalle> yes
16:35:10 <bcafarel> sounds good yes
16:35:12 <slaweq> thanks :)
16:35:19 <njohnston_> +1
16:35:33 <slaweq> #action slaweq to split patch https://review.openstack.org/633979 into two: zuulv3 and py3 parts
16:36:00 <slaweq> and I think that this will be all for switch to py3
16:36:25 <slaweq> we will still have some experimental jobs to switch but that can be done slowly later I think
16:37:31 <bcafarel> yeah, it would be a bad sign if important tests needed for proper python3 support were hidden in experimental jobs
16:37:42 <bcafarel> so we can do these "leisurely"
16:37:58 <mlavalle> I like leisurely
16:38:04 <slaweq> LOL
16:38:08 <slaweq> me too
16:38:49 <slaweq> ok, any other questions/something to add about python3?
16:38:54 <slaweq> or can we move on?
16:38:57 <mlavalle> not from me
16:39:12 <njohnston_> I need to send an email to openstack-discuss to ask the stadium projects about their py3 status
16:39:12 <bcafarel> all good here
16:39:14 * mlavalle will have to drop off at 45 minutes after the hour
16:39:26 <slaweq> njohnston_++
16:39:45 <slaweq> #action njohnston to ask stadium projects about python3 status
16:39:49 <njohnston_> I can help with some, but I wouldn't want to butt in to projects like midonet that I know little about
16:40:14 <slaweq> njohnston_: yes, same for me
16:40:24 <njohnston_> I believe I sent an email a while back and got no requests for help, but let's see if anyone is so motivated now
16:40:36 <slaweq> thx njohnston_
16:40:48 <slaweq> ok, lets move on quickly
16:40:50 <slaweq> #topic Grafana
16:40:56 <slaweq> #link http://grafana.openstack.org/dashboard/db/neutron-failure-rate
16:42:05 <slaweq> I was looking at grafana today and TBH I don't see anything very bad on it in the last few days
16:42:26 <slaweq> but maybe You noticed something You want to discuss
16:43:32 <slaweq> even all tempest jobs are in quite good shape this week
16:43:35 <mlavalle> looks pretty good to me
16:44:18 <mlavalle> the only thing I see is functional py27 in the gate
16:44:35 <slaweq> mlavalle: yes, functional tests are not in very good shape currently
16:44:48 <slaweq> and I have the culprits for it
16:44:55 <slaweq> #topic fullstack/functional
16:45:16 <slaweq> we recently noticed at least 3 bugs in functional tests:
16:45:18 <slaweq> - https://bugs.launchpad.net/neutron/+bug/1816239 - patch proposed https://review.openstack.org/#/c/637544/
16:45:19 <openstack> Launchpad bug 1816239 in neutron "Functional test test_router_processing_pool_size failing" [High,In progress] - Assigned to Brian Haley (brian-haley)
16:45:20 <slaweq> - https://bugs.launchpad.net/neutron/+bug/1815585 - if there is no one else to look at this, I will try to debug it
16:45:21 <openstack> Launchpad bug 1815585 in neutron "Floating IP status failed to transition to DOWN in neutron-tempest-plugin-scenario-linuxbridge" [High,Confirmed]
16:45:22 <slaweq> - https://bugs.launchpad.net/neutron/+bug/1816489 - same here, we need a volunteer for this one
16:45:23 <openstack> Launchpad bug 1816489 in neutron "Functional test neutron.tests.functional.agent.l3.test_ha_router.LinuxBridgeL3HATestCase. test_ha_router_lifecycle failing" [High,Confirmed]
16:45:53 <slaweq> the first one is hitting most often I think, and it already has a fix proposed
16:46:06 <slaweq> so we should be in better shape after this is merged
16:46:15 <slaweq> the other 2 need volunteers to debug :)
16:46:27 <mlavalle> oh, oh, do we depend on haleyb for this? we are in trouble  :-)
16:46:57 <slaweq> I can take a look at https://bugs.launchpad.net/neutron/+bug/1815585 but someone else could look at the last one maybe :)
16:46:58 <openstack> Launchpad bug 1815585 in neutron "Floating IP status failed to transition to DOWN in neutron-tempest-plugin-scenario-linuxbridge" [High,Confirmed]
16:47:17 <mlavalle> ok, I'll try to look at the last one
16:47:20 <slaweq> mlavalle: no, fortunately the fix was done by liuyulong :)
16:47:24 <slaweq> thx mlavalle
16:47:35 <mlavalle> LOL
16:47:35 <slaweq> #action mlavalle to check bug https://bugs.launchpad.net/neutron/+bug/1816489
16:47:36 <openstack> Launchpad bug 1816489 in neutron "Functional test neutron.tests.functional.agent.l3.test_ha_router.LinuxBridgeL3HATestCase. test_ha_router_lifecycle failing" [High,Confirmed]
16:47:47 <slaweq> #action slaweq to check bug https://bugs.launchpad.net/neutron/+bug/1815585
16:47:48 <mlavalle> ok guys, I've got to leave
16:47:53 <slaweq> ok, thx mlavalle
16:47:55 <slaweq> see You later
16:47:59 <mlavalle> o/
16:48:06 <slaweq> basically that is all I have for today
16:48:07 <bcafarel> o/ mlavalle
16:48:17 <slaweq> other jobs are in pretty good shape
16:48:28 <slaweq> so do You have something else You want to talk about today?
16:49:13 <bcafarel> nothing from me
16:49:26 <slaweq> ok, so let's get 10 minutes back today :)
16:49:36 <slaweq> thx for attending and see You all around
16:49:41 <slaweq> o/
16:49:43 <bcafarel> \o/
16:49:45 <slaweq> #endmeeting