14:03:02 <ihrachys> #startmeeting neutron_upgrades
14:03:02 <openstack> Meeting started Thu Mar 8 14:03:02 2018 UTC and is due to finish in 60 minutes. The chair is ihrachys. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:03:03 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:03:06 <openstack> The meeting name has been set to 'neutron_upgrades'
14:03:21 <lujinluo> o/
14:03:32 <ihrachys> #action ptg
14:03:43 <ihrachys> lujinluo, you've been to ptg right?
14:03:47 <lujinluo> yes
14:04:00 <ihrachys> lujinluo, first thing first, how's snow? :)
14:04:28 <lujinluo> haha, pretty bad. but i was lucky to get back home \o/
14:04:42 <lujinluo> they cancelled about 2 days' flights
14:06:02 <ihrachys> yeah. my last teammate reached India on Tuesday. the absolute "winner". :(
14:06:17 <ihrachys> I mean, he was the last one who got home :)
14:06:59 <ihrachys> lujinluo, so, since you were at ptg, would you mind giving an update of anything relevant to this group?
14:07:09 <lujinluo> sure
14:07:35 <lujinluo> there was one session about what we achieved in Queens, which mentioned our OVO work
14:07:45 <lujinluo> then a session about boden's 2 specs
14:07:54 <lujinluo> there were not much of discussions
14:08:08 <lujinluo> Miguel asked all the attendees to review the specs and comment
14:08:43 <lujinluo> some developers from networking-* left their comments about how they might want to use ovo from neutron-lib
14:08:58 <ihrachys> and by specs you mean https://review.openstack.org/473531 and https://review.openstack.org/509564
14:09:19 <lujinluo> yes, exactly
14:09:22 <ihrachys> the etherpad is https://etherpad.openstack.org/p/neutron-ptg-rocky
14:09:52 <lujinluo> and the general conclusion is that we will implement these two, but some details will need to be addressed
14:10:32 <ihrachys> lujinluo, you mentioned comments from networking-* folks. do you have a line number for the etherpad?
14:10:56 <lujinluo> no, they did not comment on etherpad. they left their comments on the specs
14:11:27 <lujinluo> https://review.openstack.org/#/c/473531/ one comment from networking-odl guy
14:12:08 <lujinluo> also Ajo gave +1, i guess which means he is good with the design?
14:12:37 <lujinluo> however, no body commented on https://review.openstack.org/#/c/509564/ :(
14:13:19 <ihrachys> right. it seems the same usual participants commenting on both. no one new expressed the interest.
14:13:25 <ihrachys> I mean, from networking-*
14:13:41 <lujinluo> yeah
14:14:12 <lujinluo> there were not many folks there though
14:14:19 <lujinluo> around 30
14:17:19 <lujinluo> hello? did i lose connect?
14:17:37 <ihrachys> eh no sorry that's me distracted
14:17:46 <ihrachys> thanks for the update lujinluo
14:17:53 <lujinluo> oh, ok. no problem
14:18:25 <ihrachys> #topic OVO patches
14:18:40 <ihrachys> https://review.openstack.org/#/q/status:open+project:openstack/neutron+branch:master+topic:bp/adopt-oslo-versioned-objects-for-db
14:19:07 <ihrachys> https://review.openstack.org/#/c/507772/ "Use Network OVO in db_base_plugin"
14:19:30 <TuanVu> I’ve already updated solution for 3 failed unit test (queries constant)
14:19:49 <TuanVu> just wondering if this solution is OK?
14:21:07 <ihrachys> well, I would need to read the patch through again since it's not tiny
14:21:13 <ihrachys> but several things pop up
14:21:23 <ihrachys> first, there are still failures in tempest now
14:21:26 <ihrachys> http://logs.openstack.org/72/507772/39/check/neutron-tempest-plugin-api/6e0bf34/logs/testr_results.html.gz
14:21:59 <ihrachys> and considering that NotFound is raised there, it seems that maybe the network is not persisted properly
14:22:02 <TuanVu> yes, I'm trying to fix remaining failed tempest tests
14:22:25 <ihrachys> TuanVu, do you have a local tempest setup to debug it not in gate?
14:22:34 <TuanVu> yes, I do
14:23:16 <ihrachys> ok great
14:23:38 <ihrachys> another thing to mention is that InvalidRequestError handlers in https://review.openstack.org/#/c/507772/39/neutron/objects/base.py should probably go.
14:23:58 <ihrachys> first, we had a fix lately that should get rid of those exceptions in most if not all cases.
14:24:20 <ihrachys> and second, if we for some reason still get those errors we should fix their root cause and not silently ignore them
14:24:47 <TuanVu> thanks for your suggestion, I'll check it again
14:25:13 <ihrachys> I will do a more detailed review after the meeting
14:25:26 <TuanVu> thank you in advance, Ihar :)
14:26:41 <ihrachys> https://review.openstack.org/#/c/549168/ "Use Router OVO in l3_db.py"
14:27:06 <ihrachys> that afaik depends on https://review.openstack.org/#/c/521797/
14:27:34 <ihrachys> yeah and the only thing that blocked the latter was waiting for stable/queens which is there already for a while
14:27:45 <ihrachys> so +W on that dependency
14:27:55 <ihrachys> as for the router ovo patch itself,
14:28:19 <hungpv> thank you
14:28:42 <ihrachys> hungpv, the patch looks entirely red in CI. I guess it's work in progress?
14:29:28 <hungpv> yes, it is
14:29:30 <ihrachys> "_precommit_router_create() takes exactly 7 arguments (6 given)" in unit tests
14:29:38 <ihrachys> seems rather obvious on how to fix it
14:29:44 <ihrachys> so I guess no need to dive in :)
14:29:59 <ihrachys> https://review.openstack.org/#/c/537325/ "Use Meter Label OVO in neutron/db/metering/metering_db.py"
14:30:04 <ihrachys> this one already has one +2
14:31:13 <ihrachys> and it seems fine so I +2d it just now
14:31:26 <lujinluo> \o/
14:31:30 <ihrachys> I mean +Wd too
14:31:40 <ihrachys> https://review.openstack.org/#/c/537320/ "Use Port OVO in neutron/db/external_net_db.py"
14:32:12 <lujinluo> we talked about this one in last meeting
14:32:39 <ihrachys> ah right. it's because of a bug in new_facade that makes it not pass unit tests.
14:32:42 <lujinluo> set port new_facade to true, affecting all the places we use port ovo obj
14:32:57 <lujinluo> not only the bug
14:32:58 <ihrachys> do we have a fix for that unit tests bug?
14:33:08 <lujinluo> i have an update for that
14:33:18 <ihrachys> yeah I get it; it will be harder to solve the issue of global change of facade
14:33:29 <ihrachys> lujinluo, go on
14:33:33 <lujinluo> yes
14:33:45 <lujinluo> for the update, last time you suggested me to add a new fake obj
14:34:01 <lujinluo> but the thing is once we set any ovo's new_facade to new
14:34:04 <lujinluo> we hit the bug
14:34:19 <lujinluo> the root cause is that new engine facade creates new session
14:34:26 <lujinluo> which makes these two lines useless
14:34:33 <lujinluo> https://github.com/openstack/neutron/blob/master/neutron/tests/unit/objects/test_base.py#L695-L696
14:35:15 <lujinluo> so if i check context.session before and after entering new engine facade
14:35:20 <lujinluo> the session is different
14:35:53 <lujinluo> see the detailed pdb check here
14:35:55 <lujinluo> https://bugs.launchpad.net/neutron/+bug/1750735
14:35:56 <openstack> Launchpad bug 1750735 in neutron "[OVO] UT fails when setting new_facade to True" [Undecided,New] - Assigned to Lujin Luo (luo-lujin)
14:36:33 <lujinluo> after "Update on 06/03/2018"
14:36:44 <ihrachys> can we mock e.g. .using() so that it 1) calls original and 2) then mock refresh / expunge on the new session?
14:37:09 <ihrachys> mock library allows to replace implementations, not just completely disable them
14:37:36 <lujinluo> do you mean mock the method in base.py?
14:38:08 <lujinluo> we can, but then we are not actually testing if new_facade is true or false, which is my concern
14:38:40 <ihrachys> lujinluo, no I mean mock the db_api.context_manager.writer.using
14:39:06 <ihrachys> and we would leave it in place, just extend it so that it mocks sessions it creates right before exiting __enter__
14:39:28 <lujinluo> yes, we can do that
14:40:02 <lujinluo> i will push a ps tmr
14:40:26 <ihrachys> and if that seems to hard we could even do that for db_context_[reader|writer] themselves though we would need to test it if it resolves all issues since then only OVO usage would be mocked in scope of the test class
14:40:51 <ihrachys> lujinluo, great, thanks for looking into it
14:40:59 <ihrachys> so that's for unit tests
14:41:05 <ihrachys> as for new_engine facade being a global flag
14:41:48 <ihrachys> I remember I was rambling about the idea of detecting the session type somehow and automatically using the right facade in OVO. have you considered it?
14:41:57 <ihrachys> I may take a look at that one if you are busy.
14:42:18 <lujinluo> no, i have not had time to do much investigation yet
14:42:29 <lujinluo> if you have bandwidth, please help with that
14:43:09 <ihrachys> ok I will put it on my plate;
14:43:45 <lujinluo> thanks
14:44:27 <ihrachys> ok other patches are in conflicts
14:44:32 <ihrachys> (or in gate)
14:45:07 <ihrachys> https://review.openstack.org/#/c/530182/ - "Use Router OVO in l3_db"
14:45:15 <ihrachys> no response from the original author so we should take it over
14:45:40 <ihrachys> hungpv, afaiu you were planning to take a look if the author is not responsive.
14:45:54 <ihrachys> or lujinluo I am not sure :)
14:45:59 <lujinluo> hungpv: do you think you will have time for that? i remember you said you were also working on it?
14:46:30 <lujinluo> ihrachys: i have enough on my list now :P
14:47:09 <ihrachys> hungpv, is the patch an older version of https://review.openstack.org/#/c/549168/ ?
14:48:48 <hungpv> yes, i'm working on it now
14:49:15 <hungpv> i'll look after it
14:49:24 <lujinluo> cool!
14:49:26 <ihrachys> hungpv, ok but is it the same?
14:49:35 <ihrachys> hungpv, because if it is then we can abandon the older one
14:50:07 <hungpv> yeah, i'm planing to replace the old on
14:50:11 <hungpv> *one
14:50:43 <ihrachys> ok I will abandon the old one then
14:51:35 <ihrachys> #topic Neutron under containers
14:52:05 <ihrachys> in case people forgot, it's upgrades meeting ;) so I have this topic to cover.
14:52:06 <ihrachys> https://bugs.launchpad.net/neutron/+bug/1738768
14:52:07 <openstack> Launchpad bug 1738768 in tripleo "Dataplane downtime when containers are stopped/restarted" [High,In progress] - Assigned to Brent Eagles (beagles)
14:52:49 <ihrachys> tl;dr if you run neutron agents under containers, and restart a container, then not only agent code is restarted, but daemons agents start - keepalived, radvd, dnsmasq - are killed / restarted
14:52:54 <ihrachys> which makes the network blip
14:53:22 <ihrachys> and agent restart is obviously part of upgrades procedures
14:53:35 <ihrachys> when you run on bare metal, daemons stay up and no blip happens
14:53:45 <ihrachys> we are looking into the issue on tripleo side right now
14:54:10 <ihrachys> but has anyone stumbled on the issue in any other context, and if so, any ideas on how to handle it correctly?
14:54:22 <ihrachys> slaweq, I wonder if you have thoughts on that one
14:54:33 <lujinluo> ihrachys: do you happen to know if this happens in other containerization tool? kolla?
14:54:55 <slaweq> ihrachys: I didn't play with containers and neutron agents yet
14:55:34 <ihrachys> lujinluo, it's kolla containers that tripleo uses
14:56:08 <slaweq> in such case all such daemons are running also in containers or outside container?
14:56:16 <ihrachys> there is a sort-of solution where we make netns all shared: https://review.openstack.org/#/c/542858/
14:56:18 <lujinluo> i see. so maybe we should report it to kolla as well?
14:56:32 <ihrachys> but afaik it still makes the daemons die. it's just that existing wiring is intact.
14:57:50 <ihrachys> lujinluo, yeah I think we are following up with them. but I was wondering if folks looked into it in other environments like openstack-ansible that afaik has its own containers
14:58:22 <ihrachys> if not, that's ok :)
14:58:26 <ihrachys> #topic Open floor
14:58:46 <ihrachys> we have 2 minutes, quick, solve the world hunger!
14:58:55 <ihrachys> anything to raise before we wrap up?
14:59:00 <lujinluo> nothing from me ;)
15:00:07 <ihrachys> ok fine! thanks e1 for joining
15:00:16 <ihrachys> #endmeeting
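
Editor's note on the 14:36-14:39 exchange: the idea raised there (mock `.using()` so that it first calls the original, then stubs `refresh` / `expunge` on the session it just created) can be sketched with `unittest.mock` roughly as follows. This is only an illustration and not the patch that was pushed: `Session`, `Writer`, `Context` and `using_with_mocked_session` are hypothetical stand-ins invented for this note. The real target named in the log is `db_api.context_manager.writer.using`, and its actual behaviour is not reproduced here.

```python
import contextlib
from unittest import mock


class Session:
    """Hypothetical stand-in for a DB session (not the real SQLAlchemy one)."""

    def refresh(self, obj):
        raise RuntimeError("would hit the database")

    def expunge(self, obj):
        raise RuntimeError("would hit the database")


class Writer:
    """Hypothetical stand-in for a facade that opens a fresh session per use."""

    @contextlib.contextmanager
    def using(self, context):
        # A new session on every entry: this is why stubbing the old
        # context.session before entering has no effect.
        context.session = Session()
        yield context.session


class Context:
    """Hypothetical request context that carries the current session."""

    session = None


writer = Writer()
original_using = writer.using  # keep a handle on the real implementation


@contextlib.contextmanager
def using_with_mocked_session(context):
    # 1) call the original so the facade logic still runs
    with original_using(context) as session:
        # 2) stub refresh / expunge on the session it just created
        session.refresh = mock.Mock()
        session.expunge = mock.Mock()
        yield session


ctx = Context()
with mock.patch.object(
        writer, "using", side_effect=using_with_mocked_session) as patched:
    with writer.using(ctx) as session:
        session.refresh("obj")   # no RuntimeError: the method is a Mock now
        session.expunge("obj")
        refreshed = session.refresh.call_count
```

Using `side_effect` keeps the replacement a `Mock`, so the test can still assert how `using` was called, while the wrapped original keeps creating the session. That addresses the concern voiced at 14:38:08 that mocking too much would stop exercising the facade path at all.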