17:00:05 #startmeeting nova_cells
17:00:10 Meeting started Wed Mar 2 17:00:05 2016 UTC and is due to finish in 60 minutes. The chair is alaski. Information about MeetBot at http://wiki.debian.org/MeetBot.
17:00:12 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
17:00:14 The meeting name has been set to 'nova_cells'
17:00:16 o/
17:00:18 \o/
17:00:25 bauzas: ping
17:00:33 pong
17:00:39 alaski: thanks
17:00:41 \o/
17:00:47 np
17:00:54 #topic Cells testing
17:00:57 o/
17:01:09 * johnthetubaguy kinda lurks
17:01:15 I haven't heard about anything in this area
17:01:24 if nobody is aware of anything we'll move on
17:01:37 haven't heard of anything
17:01:41 #topic Open Reviews
17:01:43 cool
17:01:48 https://etherpad.openstack.org/p/mitaka-nova-priorities-tracking as always
17:02:01 I think today is the last day for M
17:02:08 so what's critical for Mitaka?
17:02:13 so if things aren't ready to go they'll be deferred
17:02:15 so I am wondering what we should try to get into mitaka...
17:02:16 I nearly loosed focus
17:02:22 lost*
17:02:24 (oh man)
17:02:30 I am thinking it would be good to include this one: https://review.openstack.org/#/c/263925
17:02:35 although its not critical
17:02:50 I would love to get to that point
17:02:52 johnthetubaguy: that's far in the queue
17:03:01 but nothing is really critical from an upgrade PoV
17:03:03 up to this one?
17:03:19 I think that gets us to the point that the two new objects are always in place, which makes it simpler next cycle
17:03:25 although yeah, its not required
17:03:34 agreed
17:03:57 yeah, the more we get in the better spot we'll be in
17:04:04 I mean always having build requests and instance mappings would be nicer for next cycle, but yeah, thats a stretch
17:04:10 so there is the cell mappings stuff
17:04:11 I can try to sprint on those by tonight
17:04:19 I mean, switch
17:04:42 oh, so thats +Wed
17:04:53 I think everything else has a -2 on it till newton at this point
17:05:04 today is Wednesday, right?
17:05:10 I thought we were freezing tomorrow
17:05:40 mriedem: that's the case, not all of the patches are -2d now
17:05:53 yes, we freeze tomorrow
17:05:56 johnthetubaguy: There is this: https://review.openstack.org/#/c/270565/
17:05:56 so we're trying to identify which ones we can ship today
17:05:58 but the idea is to focus the review effort now
17:05:59 ok, yeah, I feel we're close on the build request object one at the bottom
17:06:29 so, agreeing on focusing on https://review.openstack.org/#/c/278124/ and above ideally?
17:06:38 what about the cell0 things?
17:06:50 They are -2W already.
17:07:09 doffm: yeah, I would really like to see that one go in https://review.openstack.org/#/c/270565/. But I know melwitt has been busy recently
17:07:20 doffm: oh right
17:07:44 so we need this one: https://review.openstack.org/#/c/270565 for the cell mapping stuff to work?
17:07:52 I guess thats not really useful though, yet?
17:08:07 johnthetubaguy: instance mappings are only created if that command has been run
17:08:08 I can handle a new PS if melwitt is busy tonight
17:08:26 if that's just a matter of filing a relnonte
17:08:28 relnote
17:08:30 that said
17:08:32 ah, right, the instance mapping just raises
17:08:45 I feel that if that's the only blocker, we can +W and put the reno file in a later change
17:08:54 we have a reno note in that patch right now, right?
17:09:04 bauzas: it's mostly switching feature to upgrade
17:09:21 alaski: okay, that's something I can definitely handle :)
17:09:41 and relnotes can be amended later on
17:09:49 lemme do that now
17:10:03 so we could ship that one
17:10:09 great
17:10:09 so do we want to give our deployers that extra step so soon?
17:10:17 just checking here
17:10:36 that's the instance mapping table
17:10:46 it's not required at all yet
17:10:47 sorry, the host mapping table
17:11:15 but it allows us to start testing, and getting it in this cycle allows for grenade testing
17:11:23 I was thinking of grenade actually
17:11:28 not of all users :)
17:11:39 but we can put some EXPERIMENTAL thing around the command
17:11:50 to make it clear it doesn't really mean anything atm
17:12:11 and remove that EXPERIMENTAL log in Newton
17:12:29 so I guess we need it for this line to make sense:
17:12:29 https://review.openstack.org/#/c/263925/18/nova/conductor/manager.py@426
17:12:56 johnthetubaguy: right
17:13:07 OK, in which case, I am sold, lets add it
17:13:10 in the hope we get there
17:13:31 That was easy. :)
17:13:40 you fine with me adding some notice that it's experimental?
17:13:54 so, for the avoidance of doubt, I am looking for a +W when we cut the branch, to have met FF, but a merge would be way better
17:14:10 johnthetubaguy: cool, good to know
17:14:27 ack
17:14:31 bauzas: making it clear its experimental, or not required, makes sense I think
17:14:37 well, we work 24/7, right? :)
17:14:57 tomorrow morning UK time is waaaay too far for me
17:15:01 please don't break everyone, we have a few weeks of bug fixing to finish off yet
17:15:25 heh
17:15:36 so far we've been very careful to keep all of this code from interfering with anything. it's all gated behind different checks
17:15:37 so, I think that clears up the ordering of things for mitaka, from where I stand, so thank you for indulging me on that
17:15:52 alaski: +1 top work with all this, its looking good
17:16:06 thanks
17:16:16 and bauzas doffm melwitt and others
17:16:32 I'm a ghost
17:16:43 I got some complaints at the ops summit about there being a new API database thats now being used, but I told them its all good news, and they seemed OK with that
17:17:10 Just wait till they find out there is a cell0 database... and then a scheduler database. (One day)
17:17:18 cool. and it's not even directly cells related yet :) it's really so they can schedule their live-migrations the same way as boot
17:17:23 alaski: good point, good work from a good group of folks, lets keep this moving into newton
17:17:35 alaski: heh, true
17:17:58 johnthetubaguy: I guess we probably need to make clear to ops that we now have a better communication process with reno files
17:18:22 bauzas: that came up, not totally sure they got the idea yet
17:18:36 doffm: yeah :) but then hopefully they realize they can optimize for different usage patterns
17:18:37 https://review.openstack.org/#/c/277543/3 is about which DB tables go where
17:18:44 its docs, so its not FF blocked
17:18:46 fwiw, the api db thing did get into the mitaka install guide
17:18:48 BUT
17:18:59 mriedem: right, that would require the docs to be read
17:19:00 TBH, that matches the audience discussion from dhellmann, I'm mostly focused on getting those for operators that don't touch a bit of code
17:19:00 if the order starts to matter for the cell0 population, then the install guide will need to be updated
17:19:44 mriedem: There will be an ordering... but thats a problem for Newton now.
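As a rough illustration of the host mapping table discussed above (the API level uses it to find which cell database owns a given compute host), here is a minimal, self-contained sketch. The table and column names are assumptions for illustration only, not Nova's actual schema or object API.

```python
import sqlalchemy as sa

api_md = sa.MetaData()

# Hypothetical API-database tables: a cell mapping with its DB connection,
# and a host mapping that records which cell each compute host belongs to.
cell_mappings = sa.Table(
    'cell_mappings', api_md,
    sa.Column('id', sa.Integer, primary_key=True),
    sa.Column('name', sa.String(255)),
    sa.Column('database_connection', sa.String(255)),
)

host_mappings = sa.Table(
    'host_mappings', api_md,
    sa.Column('id', sa.Integer, primary_key=True),
    sa.Column('host', sa.String(255), unique=True, nullable=False),
    sa.Column('cell_id', sa.Integer, sa.ForeignKey('cell_mappings.id')),
)


def cell_db_for_host(api_engine, host):
    """Look up the cell DB connection for a compute host via its host mapping."""
    query = (
        sa.select(cell_mappings.c.database_connection)
        .select_from(host_mappings.join(cell_mappings))
        .where(host_mappings.c.host == host)
    )
    with api_engine.connect() as conn:
        return conn.execute(query).scalar()


if __name__ == '__main__':
    # In-memory stand-in for the API database, just to exercise the lookup.
    engine = sa.create_engine('sqlite://')
    api_md.create_all(engine)
    with engine.begin() as conn:
        conn.execute(cell_mappings.insert().values(
            id=1, name='cell1',
            database_connection='mysql+pymysql://nova:***@cell1/nova'))
        conn.execute(host_mappings.insert().values(
            id=1, host='compute-01', cell_id=1))
    print(cell_db_for_host(engine, 'compute-01'))
```

This is also why the mapping only matters once the nova-manage command in https://review.openstack.org/#/c/270565 has been run: until the rows exist, nothing can be resolved through them.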
17:19:51 yeah
17:19:54 ok
17:20:00 last I heard it was still in the air
17:20:03 right now we're just getting pieces in place, then we define an ordering
17:20:04 but that was last week
17:20:16 the database split, I had questions around things like aggregate_host_mappings and servergroup_instance mappings, but not sure where that fits in the agenda
17:20:30 we can move on to the next topic
17:20:34 #topic Open Discussion
17:20:53 johnthetubaguy: the floor is yours
17:21:15 cool, so I am wondering if aggregate_host_mappings and servergroup_instance stuff should go in the cell DB, not API db
17:21:23 I have been working on the spec, and code for the aggregate move.
17:21:27 referencing this change: https://review.openstack.org/#/c/277543/3
17:21:36 johnthetubaguy: I agree the aggregate host mappings could go in the cell db.
17:22:00 I was thinking, if they grow at the same rate as instances and hosts, they should live in the cell db
17:22:02 I'm going to look at how they will be used in the scheduler (post resource provider)
17:22:17 And see what makes sense, but if you think celldb - thats good to know.
17:22:23 johnthetubaguy: doffm: that's my main concern
17:22:36 now that means the API will have to aggregate all those, but I think thats OK
17:22:48 if we say it's a relationship table for aggregates, used by the scheduler, it's somehow a global object
17:22:57 so aggregate_hosts I could see in the cell
17:23:13 yeah, the other bits of aggregate are in the API db
17:23:22 Yep, thats what I have been thinking.
17:23:36 its just like flavor and instance, well, sort of
17:23:49 well, flavor is an api db thing
17:24:15 actually thats a horrific example
17:24:40 so I think in scheduler land there is no change
17:24:44 the host reports its aggregate
17:24:54 the API reports changes to aggregate metadata
17:24:59 I think...
17:25:21 johnthetubaguy: the scheduler doesn't really care about the DB model
17:25:40 johnthetubaguy: it just uses the Aggregates object facade to get that info in memory at startup
17:25:46 doffm made the good point that it will be easier to determine after the resource provider work shakes out
17:25:48 plus some RPC hooks when it changes
17:26:18 johnthetubaguy: what just concerns me is the move operation, but since it's an API thing, that could be doable I guess
17:26:21 so I think its likely that some call will need to touch all cell DBs to get the list of hosts, but thats the same as API list
17:26:31 bauzas: move operation?
17:26:36 johnthetubaguy: agreed, it's a top-down request
17:26:59 johnthetubaguy: ie. moving one host from one agg to another, for example
17:27:02 or adding a new host
17:27:09 anyways, it feels like this will end up being a case by case look at each table
17:27:15 yeah probably
17:27:19 so you're right
17:27:22 I suspect there are a few simple groupings and patterns we follow, right
17:27:31 yeah, agreed
17:27:37 yeah
17:27:43 if it references instance, or references host, then it stays in cell DB, or something like that
17:27:58 fair point
17:28:08 but yeah, the fixes each one needs are kinda complicated
17:28:08 johnthetubaguy: I tried to follow those patterns to make up the list.
17:28:14 for the most part. but we do have instance and host mappings that can be referenced in the api db
17:28:22 But we will need to look closer at each one as we make specs for them.
17:28:37 so as long as it just needs an instance or host uuid it can live in the api db.
assuming it makes sense for other reasons
17:28:38 alaski: thats true, I just don't like the DB growth
17:28:56 alaski: so I feel the host mapping is fine, it just tells you which cell your host is in
17:29:03 johnthetubaguy: totally agree. we just need to balance it against duplication
17:29:29 alaski: if we have aggregate uuids, we can still update the aggregate_hosts table by referencing something in another DB
17:29:49 alaski: that just requires denormalizing by removing the FK
17:30:09 which I guess doffm was following to keep both tables in the same DB, right?
17:30:23 bauzas: Correct, I was following FK relationships.
17:30:31 I think we are trying to say adding child cells is largely independent of the api cells DB size, roughly, its more related to API load, I just like trying to keep that as much as possible
17:30:39 But they can be broken for many reasons, including performance.
17:30:49 for the 'scale out your deployment by adding cells' use case
17:31:00 yeah, I think this is basically sharding
17:31:04 johnthetubaguy: that's a reasonable concern
17:31:26 thats my main motivation for saying we keep the mappings in the child db
17:31:36 the metadata just lives in the API db
17:31:40 johnthetubaguy: Thats a very good point, and makes host_aggregate mapping seem more cells than api.
17:31:42 and the uuid for the aggregate
17:31:43 agreed, we just need to make sure our consistency is not impacted by the denormalization and putting things in different places
17:31:51 doffm: yeah, same thinking
17:32:13 bauzas: right. that's where finding a balance is important
17:32:15 now the API can write into the child cell DB, thats just fine, and the host mapping helps us find it, so the system seems to work OK
17:32:24 I'll look again at that list with johnthetubaguy's scaling point in mind.
17:33:07 alaski: johnthetubaguy: if we agree to keep the top-down pattern as the rule, and bottom-up calls (from cell to API) as minimal as possible, that does sound reasonable, since updating those objects is mostly API-driven
17:33:15 But I was already thinking about host_aggregate mapping as possibly cells, just because it has 'HOST' in it and we have discussed keeping some of the other 'mapping' tables in the cell.
17:33:34 I'm just concerned about any nova-manage command that would be playing with the DB directly
17:34:04 that's where we need to make sure we propagate our changes correctly
17:34:37 bauzas: not sure I understand your concern here
17:34:43 I think it's going to be important to understand the access pattern of the tables, and that will help determine where they should live
17:34:54 what alaski said
17:35:34 if some concepts like aggregates are managed thru the API (by creating those or updating those), that's easy because that's a top-down call
17:36:27 if we allow modifying our objects using nova-manage, we somehow need to make sure that we update both the API DB and the cell DB, and that could be a bottom-up call (ie. from one cell) which worries me a bit
17:36:53 but maybe I'm digressing
17:37:07 possibly
17:37:32 bauzas: it's a good point. I think it should be captured somewhere outside of the meeting, like on the review we're discussion
17:37:37 *discussing
17:37:56 bauzas: yeah, I guess its part of the access pattern? one we are likely to miss actually
17:38:08 I'll try and finish the Aggregate API move spec. We can discuss there in a few weeks time.
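A rough sketch of the split being discussed here, assuming illustrative table names rather than Nova's actual schema: the aggregate and its metadata live in the API database, while aggregate_hosts membership rows stay in each cell database and refer to the aggregate by UUID only, with no cross-database foreign key (the "denormalize by removing the FK" point above). The helper and engine names are hypothetical.

```python
import uuid

import sqlalchemy as sa

api_md = sa.MetaData()
cell_md = sa.MetaData()

# API database: the aggregate itself and its key/value metadata.
aggregates = sa.Table(
    'aggregates', api_md,
    sa.Column('id', sa.Integer, primary_key=True),
    sa.Column('uuid', sa.String(36), unique=True, nullable=False),
    sa.Column('name', sa.String(255)),
)
aggregate_metadata = sa.Table(
    'aggregate_metadata', api_md,
    sa.Column('id', sa.Integer, primary_key=True),
    sa.Column('aggregate_id', sa.Integer, sa.ForeignKey('aggregates.id')),
    sa.Column('key', sa.String(255)),
    sa.Column('value', sa.String(255)),
)

# Cell database: host membership, keyed by the aggregate's UUID only.
aggregate_hosts = sa.Table(
    'aggregate_hosts', cell_md,
    sa.Column('id', sa.Integer, primary_key=True),
    # No ForeignKey here: the aggregate row lives in a different database.
    sa.Column('aggregate_uuid', sa.String(36), nullable=False),
    sa.Column('host', sa.String(255), nullable=False),
)


def add_host_to_aggregate(api_engine, cell_engine, aggregate_name, host):
    """Top-down flow: resolve the aggregate in the API DB, then record the
    membership in the cell DB that owns the host."""
    with api_engine.connect() as conn:
        agg_uuid = conn.execute(
            sa.select(aggregates.c.uuid).where(
                aggregates.c.name == aggregate_name)).scalar_one()
    with cell_engine.begin() as conn:
        conn.execute(aggregate_hosts.insert().values(
            aggregate_uuid=agg_uuid, host=host))


if __name__ == '__main__':
    # Two in-memory databases standing in for the API DB and one cell DB.
    api_engine = sa.create_engine('sqlite://')
    cell_engine = sa.create_engine('sqlite://')
    api_md.create_all(api_engine)
    cell_md.create_all(cell_engine)
    with api_engine.begin() as conn:
        conn.execute(aggregates.insert().values(
            id=1, uuid=str(uuid.uuid4()), name='rack-1'))
    add_host_to_aggregate(api_engine, cell_engine, 'rack-1', 'compute-01')
```

The design point it illustrates: the aggregate definition is not duplicated per cell, each cell only stores rows about its own hosts, and the cost of that sharding is that a global view of membership requires fanning out across cell databases.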
17:38:10 johnthetubaguy: yeah that's an access pattern
17:38:10 access pattern is important, I am just worried we lose the sharding we need
17:38:38 but let's put that concern aside for the moment
17:38:59 for me, the stuff that needs to move into the API DB is what we don't want duplicated, like aggregate and aggregate_metadata; the other bits should stay, as they are local to the cell, from a sharding point of view
17:39:09 that, I agree
17:39:16 Agree.
17:39:20 but agreed we should look at relaxing that, if the performance or locking turns out to suck
17:39:47 I agree for the most part, what gets fuzzy for me is scheduling related bits that we might pull out into a global db later anyways
17:40:03 so I just keep them with the host for now, the host related bits
17:40:11 the scheduler can aggregate stuff
17:40:20 alaski: well, that's not yet the point, the scheduler is mostly using memory for that
17:40:41 alaski: I agree, that includes aggregate, but its mostly the Host state and resource provider stuff.
17:40:42 it can, but it leads to duplication if we need to define the same thing in multiple cell dbs
17:40:47 and we still have an open question on how the scheduler should work distributely
17:40:51 erm
17:40:54 doffm: yeah
17:40:58 in a distributed way, rather
17:41:25 bauzas: lets not go there on FF day. :)
17:41:26 alaski: which duplication are you thinking of?
17:41:31 the aggregate
17:41:46 I think that lives at the api level to remove the duplication
17:42:00 oh, wait, resource pools
17:42:00 for resource providers we decided to put them into the cells, which means multiple cells might have the same one defined
17:42:13 yeah
17:42:25 yeah, so I think we have both there, which might be confusing
17:42:45 longer term, you are right, that all moves to a scheduler DB
17:42:52 that is sharded separately
17:42:53 The resource pool trick to map them to cells using aggregates is... tricky, but will probably work.
17:43:20 well, the aggregate being in the api cell makes that all work, I think
17:43:40 anyways, that feels like we got a bit deep for right this second
17:43:45 johnthetubaguy: right. and the sharding in a scheduler db may not match the cell sharding being done.
17:43:50 its a 'write it down and make sure we all agree' kinda thing
17:44:01 agreed
17:44:06 alaski: yeah, I think it has to be independent, in the end
17:44:07 agreed
17:44:25 anyways, it was the scale concern that worried me
17:44:50 it was good to get that out there, not that it is a novel thought, it just hit me again yesterday
17:44:55 what I'm taking from this is we could use more words written down explaining the priorities/concerns and motivations for things. rules of thumb we're using and so on
17:45:06 alaski: yeah, I think so
17:45:19 that way we capture the scaling concern and keep it in mind
17:45:22 alaski: a motivations / concerns thing
17:45:36 yeah, like the API review guidelines, but not in there
17:46:08 cool
17:46:12 any more topics today?
17:46:23 I guess the aim is to bring the plan to the summit, and discuss the tricky corners of the DB moves?
17:46:36 I guess that's your spec doffm?
17:46:59 I mean if we merge it before the summit, then awesome
17:47:28 that's even just a devref
17:47:32 not a spec
17:47:49 bauzas: I'm working now on an aggregate table move spec.
17:47:55 oh ack
17:49:03 johnthetubaguy: yes, it would be good to discuss db migrations as much as possible.
I'm not sure if we'll have time to deep dive on all of them though
17:49:35 There are probably a good few that don't need a deep dive. We can lump those together in one 'move the easy ones' spec.
17:49:49 Maybe.
17:50:10 Then spend lots of deep time on the others.
17:50:17 hopefully. but Nova is full of hidden pitfalls :)
17:50:30 Ha! True. :)
17:50:54 heh
17:51:02 anyways, thanks all, nothing more from me
17:51:05 johnthetubaguy: when are summit session proposals open?
17:51:27 alaski: good question, whenever we like I guess... I should create the etherpad
17:51:38 probably next week sometime, in that case
17:51:50 okay
17:52:08 FWIW https://review.openstack.org/270565 is updated
17:52:09 do people want to come together next week to discuss session topics, or in two weeks?
17:52:41 in other words, do people want to skip next week's meeting?
17:52:49 why not
17:52:52 That would be OK with me.
17:53:09 * alaski slams a gavel
17:53:11 so noted
17:53:13 that said, I'd love to see if we could maybe try to find another timeslot for our weekly :D
17:53:26 that can wait for Newton-ish
17:53:34 but that one is terrible for me :p
17:53:54 bauzas: okay. we should definitely look at the timeslots again
17:54:04 that's not super urgent
17:54:07 I can handle it
17:54:24 okay. I'll dig up the ical thing to see what's open these days
17:54:59 fair, thanks
17:55:06 I'll send a note to the ML about skipping next week
17:55:12 see you all in two weeks
17:55:15 thanks
17:55:23 thanks all
17:55:26 #endmeeting