21:00:00 <dansmith> #startmeeting nova_cells
21:00:01 <openstack> Meeting started Wed May 17 21:00:00 2017 UTC and is due to finish in 60 minutes. The chair is dansmith. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:00:02 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:00:04 <openstack> The meeting name has been set to 'nova_cells'
21:00:13 <melwitt> o/
21:00:36 <dansmith> man, crickets
21:00:43 <mriedem> o/
21:00:46 <mriedem> dtp is on vacation
21:01:12 <dansmith> aight
21:01:19 <dansmith> #topic cells testing / bugs
21:01:30 <dansmith> so, we have a bug, which I think goes back to newton,
21:01:38 <dansmith> although more severe the closer you get to ocata
21:01:43 <dansmith> which is a sort of db connection leak
21:01:49 <dansmith> which melwitt has a patch up for
21:01:52 <melwitt> yes, I opened dis https://bugs.launchpad.net/nova/+bug/1691545
21:01:54 <openstack> Launchpad bug 1691545 in OpenStack Compute (nova) "Significant increase in DB connections with cells" [High,In progress] - Assigned to melanie witt (melwitt)
21:02:09 <dansmith> since we don't need to worry about purging the list in older releases, I'd vote for making the change as tiny as possible
21:02:21 <dansmith> agreed and/or did you already change it?
21:02:27 <melwitt> already changed yes
21:02:40 <dansmith> cool, I have outed myself as "has not looked yet"
21:02:43 <mriedem> what do you mean by purging the list?
21:03:01 <dansmith> mriedem: the cache invalidation todo
21:03:03 <melwitt> like clearing the cache upon SIGHUP, is what I had before
21:03:16 <melwitt> and yeah, I did summarize in the TODO
21:03:22 <mriedem> why does that not matter in older releases?
21:03:31 <dansmith> because we never dump any of the other lists of cells we have now
21:03:34 <dansmith> like in compute/api
21:03:43 <dansmith> making this one dump will just be confusing and is more to backport
21:04:09 <mriedem> ok
21:04:27 <mriedem> so like with nova.compute.api.CELLS,
21:04:29 <dansmith> so, I'll go look at that when we're done here and hopefully we can get that landed soonly
21:04:34 <mriedem> if you need to refresh it, you have to restart the services
21:04:38 <dansmith> yeah, for now
21:04:51 <mriedem> ok, release note with that?
21:05:01 <dansmith> release note for no change?
21:05:07 <dansmith> or for the bug fix?
21:05:26 <mriedem> release note for the fix plus the fact that if you need to refresh the cache, you have to restart the nova-api service(s)
21:05:39 <mriedem> also, this is probably something to send to the operators list as a heads up,
21:05:43 <mriedem> since they are rolling to newton,
21:05:56 <dansmith> the restart thing is no different from the rest of the code, so calling it out seems confusing to me
21:06:02 <dansmith> but a note for the fix makes sense
21:06:03 <mriedem> and we've had at least two people say they are having perf issues in newton, and zzzeek made it sound like this was pretty bad
21:06:31 <melwitt> I guess we could surface info about the hidden caches in this rel note
21:06:35 <dansmith> I don't think he understood what we were talking about when he said that
21:06:37 <mriedem> we could put the restart thing in a cells FAQs in the devref
21:06:43 <mriedem> which i'd like to start anyway, but havne't yet
21:07:14 <dansmith> sure, documenting the general "new cells get picked up by a restart" in devref makes sense
21:08:07 <mriedem> ok i've got another bug if you're done with this one
21:08:07 <dansmith> anything else on this?
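[Editor's note: the leak and cache discussion above can be sketched roughly as below. This is an illustrative stand-in only, under the assumption that per-cell DB contexts are cached in a module-level dict keyed by cell UUID and cleared on restart/SIGHUP; the names `CACHE`, `CellContext`, `get_cell_context`, and `clear_cache` are hypothetical, not Nova's actual API.]

```python
# Sketch of a per-cell DB context cache, mirroring the idea in the
# discussion: reuse one connection context per cell instead of building
# a fresh one (and a fresh connection pool) on every request, which is
# the "DB connection leak" the bug describes.

import threading

CACHE = {}                    # cell uuid -> CellContext
CACHE_LOCK = threading.Lock()


class CellContext:
    """Stand-in for a per-cell DB connection context."""
    def __init__(self, cell_uuid, db_url):
        self.cell_uuid = cell_uuid
        self.db_url = db_url  # in reality: an engine/session facade


def get_cell_context(cell_uuid, db_url):
    """Return the cached context for a cell, creating it on first use."""
    with CACHE_LOCK:
        ctx = CACHE.get(cell_uuid)
        if ctx is None:
            ctx = CellContext(cell_uuid, db_url)
            CACHE[cell_uuid] = ctx
        return ctx


def clear_cache():
    """Drop all cached contexts (the 'purge on SIGHUP' idea that was
    deliberately left out of the minimal backportable fix)."""
    with CACHE_LOCK:
        CACHE.clear()
```

As discussed above, the deliberate trade-off is that a cache like this is only refreshed by restarting the service, matching how the other cell lists (e.g. nova.compute.api.CELLS) already behave.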
21:08:11 <dansmith> sure
21:08:13 <mriedem> https://review.openstack.org/#/c/464088/
21:08:29 <mriedem> is the fix for the upgrade thing with special characeters in the db connection url
21:08:38 <mriedem> and simple_cell_setup effing those up for cell0
21:09:06 <dansmith> okay
21:09:55 <dansmith> anything specific on that or just highlighting?
21:10:05 <mriedem> just pointing it out, it's going to need to go back to newton also
21:10:10 <dansmith> okay
21:11:01 <dansmith> on the testing front, I've gotten my devstack multicell patch past the cellsv1 job now and tomorrow will start on the migration job
21:11:12 <dansmith> aside from the migration job and a libvirt crash, I've got a good run on it against the nova tree
21:11:26 <melwitt> noice
21:11:29 <dansmith> so, soonish on that I think
21:11:34 <dansmith> anything else on bugs/testing?
21:12:14 <dansmith> #topic open reviews
21:12:26 <dansmith> melwitt: last I looked at your set, it had a bunch of jenkins -1s on ti
21:12:27 <dansmith> *it
21:12:33 <dansmith> your quotas set that is
21:13:06 <melwitt> that's all spurious gate failures. I do need to incorporate the feedback from the short summit discussion, to change all counts to dicts anyway though
21:13:18 <dansmith> okay
21:13:28 <melwitt> been busy with this db connection thing but next up is refreshing that set
21:13:34 <dansmith> okay
21:13:48 <dansmith> other than that, we landed the discover hosts fasterer thing due to help from mriedem
21:13:50 <dansmith> so that's cool
21:14:02 <melwitt> cool
21:14:14 <dansmith> also my set for cellsv2 target fixes is in need of a booty call
21:14:24 <dansmith> https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:multi-cell-testing
21:14:40 <mriedem> https://review.openstack.org/#/c/458634/ scared me
21:15:01 <dansmith> why?
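[Editor's note: the special-characters bug mentioned at 21:08:29 can be illustrated as follows. The cell0 connection URL is derived from the main DB URL, and credentials containing reserved characters must stay percent-encoded through that rewrite. `derive_cell0_url` below is a hypothetical sketch, not the actual simple_cell_setup code; it assumes the cell0 URL is formed by appending `_cell0` to the database name.]

```python
# Sketch of deriving a cell0 URL from the main connection URL without
# mangling percent-encoded credentials. Splitting the URL on characters
# like '@' or '/' by hand corrupts passwords that contain them; using
# the stdlib URL parser rewrites only the database-name path component
# and leaves the (still percent-encoded) userinfo in the netloc alone.

from urllib.parse import urlsplit, urlunsplit


def derive_cell0_url(db_url):
    """Append '_cell0' to the database name, leaving credentials intact."""
    parts = urlsplit(db_url)
    # parts.path is '/<database>'; only this component is rewritten.
    new_path = parts.path + '_cell0'
    return urlunsplit((parts.scheme, parts.netloc, new_path,
                       parts.query, parts.fragment))
```

For example, a URL whose password contains an encoded '@' (`p%40ss`) keeps that encoding in the derived cell0 URL rather than being split at the wrong '@'.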
21:15:10 <mriedem> the turducken of context managers in https://review.openstack.org/#/c/458634/10/nova/compute/api.py
21:15:18 <dansmith> it's just a bunch of calling convention changes so the next patch can make it work
21:15:37 <melwitt> all of that gets simpler after the quotas series also
21:15:43 <mriedem> i haven't gone through it in detail yet
21:15:50 <dansmith> right,
21:15:57 <dansmith> it gets easier after quotas,
21:16:01 <dansmith> and after the instance delete cleanup
21:16:06 <dansmith> but it fixes real issues
21:16:11 <dansmith> once you get to the context.py change
21:16:21 <melwitt> k
21:16:35 <dansmith> anyway
21:16:42 <dansmith> anything else in open reviews?
21:16:51 <mriedem> https://review.openstack.org/#/c/461519/ and the one after it have a +2
21:16:56 <mriedem> alex was on them, but he's smashing bugs this week
21:17:12 <dansmith> okay I'll look at that too, sheesh
21:17:17 <mriedem> not a huge rush on those
21:17:24 <mriedem> they are mostly plumbing
21:17:33 <dansmith> what else?
21:17:54 <mriedem> that's all i know about
21:18:02 <dansmith> cool
21:18:14 <dansmith> #topic open discussion
21:18:20 <dansmith> anything here?
21:18:24 <mriedem> yes!
21:18:28 <dansmith> dammit.
21:18:52 <melwitt> lol
21:18:53 <mriedem> at some point here i have to do recaps of the summit sessions i ran, were you going to do that for the cells v2 session? if not, i can
21:19:12 <dansmith> I wasn't, but I can if you want
21:19:22 <mriedem> up to you, i don't think there was a ton to recap,
21:19:29 <mriedem> compromise on passing hosts to cells for retries
21:19:31 <dansmith> only two topics really
21:19:32 <dansmith> yeah
21:19:38 <mriedem> and you add that auto-disable thing
21:20:24 <dansmith> anyway, I'll plan to do that on friday or something
21:20:26 <melwitt> sanity check, the retry behavior is no different than in cells v1 right? retries stay in a cell I thought
21:20:37 <dansmith> melwitt: definitely
21:20:47 <melwitt> k. just making sure since I told someone that today
21:21:01 <melwitt> cool
21:21:02 <dansmith> heh
21:21:04 <dansmith> and if you don't have separated conductors then you can reschedule across cells in v2
21:21:13 <dansmith> we just need to work when we don't
21:21:30 <dansmith> okay, anything else?
21:21:49 <mriedem> nope
21:21:54 <dansmith> we're about seven minutes over time, so if there's nothing else...
21:22:24 <dansmith> #endmeeting with extreme prejudice