14:03:02 <ihrachys> #startmeeting neutron_upgrades
14:03:02 <openstack> Meeting started Thu Mar  8 14:03:02 2018 UTC and is due to finish in 60 minutes.  The chair is ihrachys. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:03:03 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:03:06 <openstack> The meeting name has been set to 'neutron_upgrades'
14:03:21 <lujinluo> o/
14:03:32 <ihrachys> #topic ptg
14:03:43 <ihrachys> lujinluo, you've been to ptg right?
14:03:47 <lujinluo> yes
14:04:00 <ihrachys> lujinluo, first things first, how's snow? :)
14:04:28 <lujinluo> haha, pretty bad. but i was lucky to get back home \o/
14:04:42 <lujinluo> they cancelled about 2 days' flights
14:06:02 <ihrachys> yeah. my last teammate reached India on Tuesday. the absolute "winner". :(
14:06:17 <ihrachys> I mean, he was the last one who got home :)
14:06:59 <ihrachys> lujinluo, so, since you were at ptg, would you mind giving an update of anything relevant to this group?
14:07:09 <lujinluo> sure
14:07:35 <lujinluo> there was one session about what we achieved in Queens, which mentioned our OVO work
14:07:45 <lujinluo> then a session about boden's 2 specs
14:07:54 <lujinluo> there was not much discussion
14:08:08 <lujinluo> Miguel asked all the attendees to review the specs and comment
14:08:43 <lujinluo> some developers from networking-* left their comments about how they might want to use ovo from neutron-lib
14:08:58 <ihrachys> and by specs you mean https://review.openstack.org/473531 and https://review.openstack.org/509564
14:09:19 <lujinluo> yes, exactly
14:09:22 <ihrachys> the etherpad is https://etherpad.openstack.org/p/neutron-ptg-rocky
14:09:52 <lujinluo> and the general conclusion is that we will implement these two, but some details will need to be addressed
14:10:32 <ihrachys> lujinluo, you mentioned comments from networking-* folks. do you have a line number for the etherpad?
14:10:56 <lujinluo> no, they did not comment on etherpad. they left their comments on the specs
14:11:27 <lujinluo> https://review.openstack.org/#/c/473531/ one comment from networking-odl guy
14:12:08 <lujinluo> also Ajo gave +1, which i guess means he is good with the design?
14:12:37 <lujinluo> however, nobody commented on https://review.openstack.org/#/c/509564/ :(
14:13:19 <ihrachys> right. it seems the same usual participants are commenting on both. no one new expressed interest.
14:13:25 <ihrachys> I mean, from networking-*
14:13:41 <lujinluo> yeah
14:14:12 <lujinluo> there were not many folks there though
14:14:19 <lujinluo> around 30
14:17:19 <lujinluo> hello? did i lose connection?
14:17:37 <ihrachys> eh no sorry that's me distracted
14:17:46 <ihrachys> thanks for the update lujinluo
14:17:53 <lujinluo> oh, ok. no problem
14:18:25 <ihrachys> #topic OVO patches
14:18:40 <ihrachys> https://review.openstack.org/#/q/status:open+project:openstack/neutron+branch:master+topic:bp/adopt-oslo-versioned-objects-for-db
14:19:07 <ihrachys> https://review.openstack.org/#/c/507772/ "Use Network OVO in db_base_plugin"
14:19:30 <TuanVu> I’ve already updated the solution for the 3 failed unit tests (queries constant)
14:19:49 <TuanVu> just wondering if this solution is OK?
14:21:07 <ihrachys> well, I would need to read the patch through again since it's not tiny
14:21:13 <ihrachys> but several things pop up
14:21:23 <ihrachys> first, there are still failures in tempest now
14:21:26 <ihrachys> http://logs.openstack.org/72/507772/39/check/neutron-tempest-plugin-api/6e0bf34/logs/testr_results.html.gz
14:21:59 <ihrachys> and considering that NotFound is raised there, it seems that maybe the network is not persisted properly
14:22:02 <TuanVu> yes, I'm trying to fix remaining failed tempest tests
14:22:25 <ihrachys> TuanVu, do you have a local tempest setup to debug it outside the gate?
14:22:34 <TuanVu> yes, I do
14:23:16 <ihrachys> ok great
14:23:38 <ihrachys> another thing to mention is that InvalidRequestError handlers in https://review.openstack.org/#/c/507772/39/neutron/objects/base.py should probably go.
14:23:58 <ihrachys> first, we had a fix lately that should get rid of those exceptions in most if not all cases.
14:24:20 <ihrachys> and second, if we for some reason still get those errors we should fix their root cause and not silently ignore them
14:24:47 <TuanVu> thanks for your suggestion, I'll check it again
14:25:13 <ihrachys> I will do a more detailed review after the meeting
14:25:26 <TuanVu> thank you in advance, Ihar :)
14:26:41 <ihrachys> https://review.openstack.org/#/c/549168/ "Use Router OVO in l3_db.py"
14:27:06 <ihrachys> that afaik depends on https://review.openstack.org/#/c/521797/
14:27:34 <ihrachys> yeah and the only thing that blocked the latter was waiting for stable/queens, which has been there for a while already
14:27:45 <ihrachys> so +W on that dependency
14:27:55 <ihrachys> as for the router ovo patch itself,
14:28:19 <hungpv> thank you
14:28:42 <ihrachys> hungpv, the patch looks entirely red in CI. I guess it's work in progress?
14:29:28 <hungpv> yes, it is
14:29:30 <ihrachys> "_precommit_router_create() takes exactly 7 arguments (6 given)" in unit tests
14:29:38 <ihrachys> seems rather obvious how to fix it
14:29:44 <ihrachys> so I guess no need to dive in :)
14:29:59 <ihrachys> https://review.openstack.org/#/c/537325/ "Use Meter Label OVO in neutron/db/metering/metering_db.py"
14:30:04 <ihrachys> this one already has one +2
14:31:13 <ihrachys> and it seems fine so I +2d it just now
14:31:26 <lujinluo> \o/
14:31:30 <ihrachys> I mean +Wd too
14:31:40 <ihrachys> https://review.openstack.org/#/c/537320/ "Use Port OVO in neutron/db/external_net_db.py"
14:32:12 <lujinluo> we talked about this one in last meeting
14:32:39 <ihrachys> ah right. it's because of a bug in new_facade that makes it not pass unit tests.
14:32:42 <lujinluo> setting port new_facade to True affects all the places we use the port ovo obj
14:32:57 <lujinluo> not only the bug
14:32:58 <ihrachys> do we have a fix for that unit tests bug?
14:33:08 <lujinluo> i have an update for that
14:33:18 <ihrachys> yeah I get it; it will be harder to solve the issue of global change of facade
14:33:29 <ihrachys> lujinluo, go on
14:33:33 <lujinluo> yes
14:33:45 <lujinluo> for the update, last time you suggested that i add a new fake obj
14:34:01 <lujinluo> but the thing is once we set any ovo's new_facade to True
14:34:04 <lujinluo> we hit the bug
14:34:19 <lujinluo> the root cause is that new engine facade creates new session
14:34:26 <lujinluo> which makes these two lines useless
14:34:33 <lujinluo> https://github.com/openstack/neutron/blob/master/neutron/tests/unit/objects/test_base.py#L695-L696
14:35:15 <lujinluo> so if i check context.session before and after entering new engine facade
14:35:20 <lujinluo> the session is different
14:35:53 <lujinluo> see the detailed pdb check here
14:35:55 <lujinluo> https://bugs.launchpad.net/neutron/+bug/1750735
14:35:56 <openstack> Launchpad bug 1750735 in neutron "[OVO] UT fails when setting new_facade to True" [Undecided,New] - Assigned to Lujin Luo (luo-lujin)
14:36:33 <lujinluo> after "Update on 06/03/2018"
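The root cause described above can be illustrated with a minimal, self-contained sketch. It is a simplified model, not neutron code: `FakeSession`, `FakeContext`, and `fake_writer_using` are hypothetical stand-ins, and the only assumption taken from the discussion is that entering the new engine facade leaves `context.session` pointing at a different session than before.

```python
import contextlib
from unittest import mock


class FakeSession:
    """Hypothetical stand-in for a database session."""

    def refresh(self, obj):
        return "real refresh"


class FakeContext:
    """Hypothetical stand-in for a request context holding a session."""

    def __init__(self):
        self.session = FakeSession()


@contextlib.contextmanager
def fake_writer_using(context):
    # Assumption from the discussion: entering the facade swaps in a
    # brand-new session for the duration of the block.
    outer = context.session
    context.session = FakeSession()
    try:
        yield context.session
    finally:
        context.session = outer


def refresh_inside_facade(context):
    # Code under test: uses whatever session is active inside the facade.
    with fake_writer_using(context):
        return context.session.refresh(object())


def demo():
    context = FakeContext()
    outer_session = context.session
    # Patch refresh on the session that exists *before* entering the facade.
    with mock.patch.object(outer_session, "refresh",
                           return_value="mocked") as refresh:
        result = refresh_inside_facade(context)
        return result, refresh.called
```

In this model `demo()` reports that the real `refresh` ran and the mock was never called, because the patch was applied to a session object that the code inside the facade never sees.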
14:36:44 <ihrachys> can we mock e.g. .using() so that it 1) calls original and 2) then mock refresh / expunge on the new session?
14:37:09 <ihrachys> the mock library allows replacing implementations, not just completely disabling them
14:37:36 <lujinluo> do you mean mock the method in base.py?
14:38:08 <lujinluo> we can, but then we are not actually testing if new_facade is true or false, which is my concern
14:38:40 <ihrachys> lujinluo, no I mean mock the db_api.context_manager.writer.using
14:39:06 <ihrachys> and we would leave it in place, just extend it so that it mocks sessions it creates right before exiting __enter__
14:39:28 <lujinluo> yes, we can do that
14:40:02 <lujinluo> i will push a ps tmr
14:40:26 <ihrachys> and if that seems too hard, we could even do that for db_context_[reader|writer] themselves, though we would need to test whether it resolves all issues, since then only OVO usage would be mocked in scope of the test class
14:40:51 <ihrachys> lujinluo, great, thanks for looking into it
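The suggestion above (keep `.using()` in place, call the original, then mock `refresh` / `expunge` on the session it creates) might look roughly like the sketch below. It is only an illustration under stated assumptions: `FakeWriter` is a hypothetical stand-in for `db_api.context_manager.writer`, `FakeSession` stands in for the real session, and `using_with_mocked_session` is an invented helper name; the actual patch may differ.

```python
import contextlib
from unittest import mock


class FakeSession:
    """Hypothetical stand-in for a database session."""

    def refresh(self, obj):
        raise RuntimeError("would hit the database")

    def expunge(self, obj):
        raise RuntimeError("would hit the database")


class FakeContext:
    session = None


class FakeWriter:
    """Hypothetical stand-in for db_api.context_manager.writer."""

    @contextlib.contextmanager
    def using(self, context):
        # Like the new engine facade, create a fresh session on entry.
        previous = context.session
        context.session = FakeSession()
        try:
            yield context.session
        finally:
            context.session = previous


writer = FakeWriter()
original_using = writer.using  # bound method captured before patching


@contextlib.contextmanager
def using_with_mocked_session(context):
    # 1) Call the original implementation so the facade logic still runs.
    with original_using(context) as session:
        # 2) Mock refresh/expunge on the session it just created.
        with mock.patch.object(session, "refresh"), \
                mock.patch.object(session, "expunge"):
            yield session


def run_demo():
    context = FakeContext()
    with mock.patch.object(writer, "using",
                           side_effect=using_with_mocked_session) as patched:
        with writer.using(context) as session:
            # Neither call raises: both are mocks on the *new* session.
            session.refresh(object())
            session.expunge(object())
            return isinstance(session, FakeSession), patched.call_count
```

Using `side_effect` keeps the original behaviour reachable through the wrapper instead of disabling it, so the test still exercises the facade path while the session methods are neutralised.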
14:40:59 <ihrachys> so that's for unit tests
14:41:05 <ihrachys> as for new_engine facade being a global flag
14:41:48 <ihrachys> I remember I was rambling about the idea of detecting the session type somehow and automatically using the right facade in OVO. have you considered it?
14:41:57 <ihrachys> I may take a look at that one if you are busy.
14:42:18 <lujinluo> no, i have not had time to do much investigation yet
14:42:29 <lujinluo> if you have bandwidth, please help with that
14:43:09 <ihrachys> ok I will put it on my plate;
14:43:45 <lujinluo> thanks
14:44:27 <ihrachys> ok other patches are in conflicts
14:44:32 <ihrachys> (or in gate)
14:45:07 <ihrachys> https://review.openstack.org/#/c/530182/ - "Use Router OVO in l3_db"
14:45:15 <ihrachys> no response from the original author so we should take it over
14:45:40 <ihrachys> hungpv, afaiu you were planning to take a look if the author is not responsive.
14:45:54 <ihrachys> or lujinluo I am not sure :)
14:45:59 <lujinluo> hungpv: do you think you will have time for that? i remember you said you were also working on it?
14:46:30 <lujinluo> ihrachys: i have enough on my list now :P
14:47:09 <ihrachys> hungpv, is the patch an older version of https://review.openstack.org/#/c/549168/ ?
14:48:48 <hungpv> yes, i'm working on it now
14:49:15 <hungpv> i'll look after it
14:49:24 <lujinluo> cool!
14:49:26 <ihrachys> hungpv, ok but is it the same?
14:49:35 <ihrachys> hungpv, because if it is then we can abandon the older one
14:50:07 <hungpv> yeah, i'm planning to replace the old one
14:50:43 <ihrachys> ok I will abandon the old one then
14:51:35 <ihrachys> #topic Neutron under containers
14:52:05 <ihrachys> in case people forgot, it's upgrades meeting ;) so I have this topic to cover.
14:52:06 <ihrachys> https://bugs.launchpad.net/neutron/+bug/1738768
14:52:07 <openstack> Launchpad bug 1738768 in tripleo "Dataplane downtime when containers are stopped/restarted" [High,In progress] - Assigned to Brent Eagles (beagles)
14:52:49 <ihrachys> tl;dr if you run neutron agents under containers and restart a container, then not only is the agent code restarted, but the daemons that agents start (keepalived, radvd, dnsmasq) are killed / restarted too
14:52:54 <ihrachys> which makes the network blip
14:53:22 <ihrachys> and agent restart is obviously part of upgrades procedures
14:53:35 <ihrachys> when you run on bare metal, daemons stay up and no blip happens
14:53:45 <ihrachys> we are looking into the issue on tripleo side right now
14:54:10 <ihrachys> but has anyone stumbled on the issue in any other context, and if so, any ideas on how to handle it correctly?
14:54:22 <ihrachys> slaweq, I wonder if you have thoughts on that one
14:54:33 <lujinluo> ihrachys: do you happen to know if this happens in other containerization tools? kolla?
14:54:55 <slaweq> ihrachys: I didn't play with containers and neutron agents yet
14:55:34 <ihrachys> lujinluo, it's kolla containers that tripleo uses
14:56:08 <slaweq> in that case, are all such daemons also running in containers, or outside the container?
14:56:16 <ihrachys> there is a sort-of solution where we make netns all shared: https://review.openstack.org/#/c/542858/
14:56:18 <lujinluo> i see. so maybe we should report it to kolla as well?
14:56:32 <ihrachys> but afaik it still makes the daemons die. it's just that existing wiring is intact.
14:57:50 <ihrachys> lujinluo, yeah I think we are following up with them. but I was wondering if folks looked into it in other environments like openstack-ansible that afaik has its own containers
14:58:22 <ihrachys> if not, that's ok :)
14:58:26 <ihrachys> #topic Open floor
14:58:46 <ihrachys> we have 2 minutes, quick, solve world hunger!
14:58:55 <ihrachys> anything to raise before we wrap up?
14:59:00 <lujinluo> nothing from me ;)
15:00:07 <ihrachys> ok fine! thanks e1 for joining
15:00:16 <ihrachys> #endmeeting