14:02:11 <ihrachys_> #startmeeting neutron_upgrades 14:02:12 <openstack> Meeting started Thu Jan 25 14:02:11 2018 UTC and is due to finish in 60 minutes. The chair is ihrachys_. Information about MeetBot at http://wiki.debian.org/MeetBot. 14:02:13 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. 14:02:16 <openstack> The meeting name has been set to 'neutron_upgrades' 14:02:21 <ihrachys_> hi! 14:02:23 <lujinluo> o/ 14:02:27 <tuan_vu> Hi everyone 14:03:14 <ihrachys_> before we dive in, some background on the current state of affairs in relation to release 14:03:31 <ihrachys_> as you may probably know, a new major release (queens) is getting closer 14:03:42 <lujinluo> \o/ 14:04:10 <ihrachys_> https://releases.openstack.org/queens/schedule.html 14:04:31 <ihrachys_> this week is feature freeze 14:05:01 <ihrachys_> we'll have a RC1 release (and hence stable/queens) in several weeks from now 14:05:31 <ihrachys_> till that moment, we should be cautious landing patches that may affect stability and don't fix clear bugs 14:05:41 <ihrachys_> which most OVO patches are 14:06:07 <ihrachys_> we can of course continue reshaping patches in review to get them ready for when master is open 14:06:32 <lujinluo> understood. 14:06:52 <ihrachys_> one sad development because of release final coming close is that we were forced to revert port binding patch that took a while to shape 14:07:06 <ihrachys_> because of several issues that merging it revealed 14:07:23 <ihrachys_> one is that it was not counting with mixed old/new engine facade usage 14:07:31 <ihrachys_> another is postgresql installations busted 14:07:57 <lujinluo> yeah, speaking of that. can i have a (stupid) question? 14:08:06 <ihrachys_> we will need to get back to the white board, so to speak, repropose the patch, fix those issues, and land some time in Rocky 14:08:14 <ihrachys_> lujinluo, yes! 14:08:24 <lujinluo> i was trying to reproduce the postgresql issues 14:08:40 <lujinluo> so i built a devstack with postgresql as backend 14:08:56 <lujinluo> then i have no idea how i can reproduce the errors.. 14:09:25 <lujinluo> could you direct me to the conf/settings of that specific tempest tests? 14:09:42 <ihrachys_> the periodic job that fails executes full suite of tempest I believe 14:09:57 <ihrachys_> but let me check 14:10:20 <lujinluo> sure, thanks 14:11:43 <ihrachys_> so this is what failed there: http://logs.openstack.org/periodic/git.openstack.org/openstack/neutron/master/legacy-periodic-tempest-dsvm-neutron-pg-full/71f9bd8/logs/testr_results.html.gz 14:12:21 <lujinluo> got it! will start from there 14:12:29 <ihrachys_> I would imagine running those tests in a loop could get you to failure 14:12:53 <ihrachys_> specifically check those that raise ServerFault 14:12:54 <lujinluo> nice suggestion. will follow it ;) 14:13:00 <ihrachys_> since that's a sign of neutron-server internal error 14:13:09 <ihrachys_> ok 14:13:31 <ihrachys_> btw note that revert patch hasn't landed still: https://review.openstack.org/536913 14:14:05 <lujinluo> ok 14:14:20 <ihrachys_> in other news, gate is quite unstable lately, with multiple issues lingering 14:14:34 <ihrachys_> we are trying to fix them one by one but we are not there yet 14:14:51 <ihrachys_> things like job timeouts is one, functional is also unstable (ovsdb commands timing out) 14:15:10 <ihrachys_> so please be patient if you get those errors over and over, that's sadly expected 14:15:29 <hungpv_> Okay :( 14:16:05 <ihrachys_> now, back to usual business 14:16:06 <slaweq> hello, sorry for late 14:16:21 <ihrachys_> slaweq, hi! 14:16:22 <lujinluo> hey slaweq ! 14:16:24 <ihrachys_> https://review.openstack.org/#/q/status:open+project:openstack/neutron+branch:master+topic:bp/adopt-oslo-versioned-objects-for-db 14:16:41 <ihrachys_> https://review.openstack.org/#/c/521797/ "Use Router OVO in external_net_db" 14:17:17 <ihrachys_> so this patch is fine now but it lacks test coverage for the module / method it touches 14:17:29 <ihrachys_> so we can't really safely land it 14:18:39 <ihrachys_> hungpv_, do you think you could work on adding some unit tests for the method before merging this OVO patch? 14:18:56 <hungpv_> I thought you said it's called by update_rbac_policy and delete_rbac_policy ? 14:19:17 <ihrachys_> I just said that I checked there is no mixed old/new engine facade issue there 14:20:18 <hungpv_> Oh, if it's the case 14:20:29 <ihrachys_> remember in the past, in one of older patchsets, there were some mistakes that absolutely broke the code and gate failed to catch it? that's what I am concerned about, that existing coverage won't catch anything if there are still some bugs there. 14:20:52 <hungpv_> I'll be working more this issue 14:21:02 <ihrachys_> so adding some unit tests would help us to sleep well :) 14:21:09 <ihrachys_> ok thanks hungpv_ ! 14:21:32 <ihrachys_> next "https://review.openstack.org/507772 Use Network OVO in db_base_plugin" 14:21:46 <tuan_vu> I’ve left a comment about current blocking point 14:22:16 <tuan_vu> We’re still working on it 14:22:20 <ihrachys_> the current list of failures is: http://logs.openstack.org/72/507772/29/check/openstack-tox-py27/8b97a84/testr_results.html.gz 14:23:23 <ihrachys_> for foreign key failure, I see it's just in object test class itself 14:24:14 <tuan_vu> Yes, but we haven’t figured out how to fix it 14:24:16 <slaweq> tuan_vu: I didn't have time to check it yet 14:24:25 <slaweq> but I will try to help 14:24:37 <tuan_vu> Hi slaweq, thank you in advance 14:24:43 <ihrachys_> I think it's one of those cases that just need self.update_obj_fields in setUp() for whatever field is violated 14:24:49 <tuan_vu> We really appreciate your help 14:25:43 <ihrachys_> slaweq, tuan_vu, ping me if you need me to look at the foreign key issue 14:25:53 <tuan_vu> Thank you, ihar, we’ll try your suggestion 14:26:03 <ihrachys_> another more important failure there is test_*_queries_constant failures 14:26:41 <tuan_vu> Yes, I’m just wondering if we can change the number of queries? 14:27:04 <tuan_vu> Because in ovo, the way it works is different a little bit 14:27:12 <ihrachys_> which signals that somewhere in the patch, we fetch some models from database in an ineffective way (probably fetching a relationship for networks using a separate query) 14:27:33 <ihrachys_> tuan_vu, we can't, we should find a way to avoid it 14:27:46 <ihrachys_> tuan_vu, do you know which relationship triggers additional queries? 14:28:15 <tuan_vu> We need to reload “shared” attribute 14:28:31 <tuan_vu> When update network 14:29:40 <ihrachys_> why is it? can't we calculate new value from what is already passed by user / fetched from db? 14:30:04 <lujinluo> is it because "shared" is a synthetic field now? 14:30:23 <tuan_vu> Hi Luo, yes 14:31:17 <ihrachys_> besides, test_network_list_queries_constant also fails, which doesn't update network 14:31:28 <lujinluo> i think i have met this issue while working on Port integration. when we have aynthetic fields, the # of queries does not stay constant 14:31:44 <ihrachys_> it lists networks, and probably we trigger a set of queries per network matched 14:32:14 <lujinluo> yes, because get_objects() will try to fetch all their synthetic fields too 14:33:06 <tuan_vu> Yes, thank you Luo, your experience does help a lot 14:33:40 <lujinluo> but i do not recall i found a suitable solution for that :( 14:33:42 <ihrachys_> right. so one of relationships is probably of type that doesn't fetch data when parent network model is fetched. that would be a problem, no? 14:34:01 <ihrachys_> I mean, type of sqlalchemy model attribute for relationship 14:34:49 <ihrachys_> ok, let's try from another direction. how did it work before the patch? 14:35:14 <tuan_vu> Hmm 14:35:42 <tuan_vu> The main problem is “shared” attribute 14:36:09 <tuan_vu> Before the patch, we use rbac for “shared” 14:36:48 <ihrachys_> right. can't we calculate it for OVO the same way? we have .db_obj.rbac_entries already, right? 14:37:21 <ihrachys_> if we then make rbac metaclass to use this info directly instead of refetching it (I guess that's what happens?) then we would avoid the problem? 14:38:38 <tuan_vu> Hmm, let me think more about your suggestion, is that ok? I’ll send you an email if that’s needed 14:39:01 <tuan_vu> Or contact you via IRC 14:39:13 <ihrachys_> yeah sure, we don't need to dig the code right away, but the main point, we can't disable / change this unit test. 14:39:13 <lujinluo> that should work, by changing how the # of queries are calculated. besides, "shared" is not a synthetic field, please check which attribute is the one that causes this issue 14:39:56 <tuan_vu> Thank you, Ihar and Luo 14:39:57 <ihrachys_> moving on 14:40:02 <ihrachys_> https://review.openstack.org/#/c/537320/ "Use Port OVO in neutron/db/external_net_db.py" 14:40:33 <lujinluo> i did recheck about the timeouts.. but i guess zuul hung somewhere...? 14:40:37 <ihrachys_> lujinluo, I was to look at some port patches you have but then we stumbled on port binding revert. I was thinking, are we blocked by that one now? 14:40:59 <ihrachys_> lujinluo, yeah, zuul is sick these days. I rechecked. 14:41:30 <lujinluo> kind of.. it's better to wait for port binding to land first, as those two patches conflicts a lot 14:42:15 <ihrachys_> lujinluo, right. speaking of which... do you go to Dublin? I think mlavalle was planning a session on ovo/enginefacade issue. 14:42:25 <lujinluo> it would be easier to get them in serially, not in parallel 14:42:45 <lujinluo> my employer has not decided yet, but i guess probably.. 14:42:54 <lujinluo> i would go 14:43:22 <ihrachys_> great. btw I don't. but having someone from this group there would help. I may try to join remotely if there will be such an option 14:43:44 <lujinluo> oh, it is sad that you are not going :( 14:44:49 <ihrachys_> yeah. I kinda lagged on trips lately for family reasons. 14:45:06 <lujinluo> well, family always goes first 14:45:20 <ihrachys_> anyway, moving on to the next which is 14:45:25 <ihrachys_> https://review.openstack.org/#/c/537325/ "Use Meter Label OVO in neutron/db/metering/metering_db.py" 14:45:32 <ihrachys_> this looks green :) 14:46:30 <lujinluo> i used label.db_obj (which we should not do normally), but since we cannot use router as a synthetic field, i think this is fine. what do you think? ihrachys_ 14:46:41 <ihrachys_> so there, you pass db models in one case and objects in another case 14:47:11 <lujinluo> hmm right 14:47:38 <lujinluo> i need to avoid that 14:48:04 <ihrachys_> question is, how 14:48:44 <ihrachys_> we could of course add routers field to labels but that's a bit silly from API perspective (label is not a container for routers) 14:49:08 <lujinluo> the first solution came to my mind was _load_object(), but we should avoid that too 14:49:11 <lujinluo> yeah 14:49:58 <ihrachys_> at least for get_sync_data_for_rule, you could avoid fetching labels at all and instead have a method that returns the needed routers. 14:50:01 <lujinluo> how about using get_objects() 14:50:08 <ihrachys_> since you don't use anything but id from the label, and it's already known 14:50:40 <ihrachys_> we could have this method embedded into Router.get_objects, so that if metering_label_id filter is passed, it does the right tihng 14:51:56 <lujinluo> that sounds good. will choose that approach 14:52:11 <ihrachys_> ah wait not really. you rely on shared there 14:53:01 <lujinluo> yes, but instead i can pass the router_ids i fetch from label.db_obj.routers, then use get_objects() 14:54:05 <ihrachys_> true but wouldn't it trigger another fetch? 14:54:26 <ihrachys_> but wait, we already trigger it for shared case... 14:54:35 <lujinluo> yes 14:54:59 <ihrachys_> maybe that's a mistake actually that slipped in 14:55:46 <ihrachys_> it was doing get_collection_query before OVO though 14:55:53 <lujinluo> it got in here https://review.openstack.org/#/c/529551/5/neutron/db/metering/metering_db.py 14:55:54 <ihrachys_> so I guess no one bothered about it being optimial 14:57:01 <lujinluo> anyway, let's dig more in gerrit cause we only have 5 min left now! 14:57:29 <ihrachys_> but actually, for that other case with all routers fetched, it's different 14:57:40 <ihrachys_> because then label.db_obj.routers won't give same result 14:57:47 <ihrachys_> ok let's follow up there 14:58:18 <ihrachys_> slaweq, I am sorry; you joined today. did you have smth? 14:58:47 <slaweq> ihrachys_: no 14:58:54 <slaweq> I just wanted to join :) 14:58:59 <ihrachys_> ok, just checking :) thanks for joining :) 14:59:20 <lujinluo> slaweq: we are happy that you are here! 14:59:29 <slaweq> I will try to be here every week 14:59:34 <slaweq> if I will not forget :) 14:59:42 <ihrachys_> we don't really have much time left. if you have questions about some patches that were not discussed, please ping me in irc or email after this meeting, I will try to help. 14:59:57 <tuan_vu> Awesome, Slawek 15:00:01 <ihrachys_> slaweq, great. your help would be of great value. 15:00:10 <ihrachys_> ok we are officially out of time 15:00:16 <ihrachys_> ciao 15:00:18 <ihrachys_> #endmeeting