17:13:15 #startmeeting nova_cells
17:13:17 Meeting started Wed Mar 30 17:13:15 2016 UTC and is due to finish in 60 minutes. The chair is alaski. Information about MeetBot at http://wiki.debian.org/MeetBot.
17:13:18 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
17:13:20 The meeting name has been set to 'nova_cells'
17:13:24 welcome
17:13:37 * bauzas finally waves
17:14:04 alaski: so in the non-cells case, we don't need to worry about cell assignment, because it's clear which one to assign right?
17:14:10 dansmith: yes, looks like it https://review.openstack.org/#/c/270565/12/nova/cmd/manage.py
17:14:12 * bauzas reads the scrollback
17:14:15 the instance mapping one is https://github.com/openstack/nova/blob/master/nova/cmd/manage.py#L1319 IIUC
17:14:28 I guess in the cellsv2 migration case I'd expect we run that in each existing cell which would make it clear which one instances should be assigned to
17:14:37 dansmith: right for non-cells
17:15:00 that instance one takes a marker, but it should just take a limit and operate on any non-mapped instances, right?
17:15:08 taking a marker is just terrible UX I think
17:16:49 that map_instances command is pretty old
17:16:50 you'd have to compare two dbs to find the unmapped instances
17:17:05 which is fine, just a bit more work
17:17:22 it just seems terrible to me to make them retain the marker..
much harder to script
17:17:23 and
17:17:37 I would think it'd be susceptible to drift as the process could take a while
17:17:50 so anyway, that aside,
17:17:56 the sort order and key was picked to avoid that
17:18:21 I guess it seems to me like creating the cell records and probably the host mapping can/should happen synchronously and separately,
17:18:22 but we could definitely make that better
17:18:37 for testing, we don't need to set the marker, so I guess it's okay
17:18:42 but the instance mapping thing seems like it can very well be an online migration that does small batches at a time, along with everything else
17:19:03 Online migration would need a 'default' cell to be operating on.
17:19:12 Not sure how it would work for the cellsv1 upgrade case.
17:19:17 alaski: the marker works for a proper sort order, but assuming we have soft-deleted instances
17:19:26 which we do, so I guess we're okay, it just seems odd
17:19:31 the ansible/puppet people will hate that
17:19:52 that's fair
17:20:18 so I think for the instancemapping migration there's some info needed in each case
17:20:31 you need to know the api db, and which cell you're in
17:20:33 and yes, to doffm's point, we'd need some way to know what cell we need to put things in
17:20:36 yeah
17:20:48 if the hosts are already mapped, we could figure it out right?
17:20:56 pick our first host, figure out what cell it claims to be in in the api db,
17:20:59 and then roll with that?
17:21:12 Ummm yeah. I think that works.
17:21:14 and/or make sure all your hosts are in the same cell and if so, use that cell, else error
17:21:19 yeah, that should work
17:21:20 Migration might be a pain.
17:22:35 it's worth a shot.
I want to think on it a bit more, and it would help to see it
17:22:49 for the non-cells case,
17:23:01 I think it's important that we build as much into the generic command as possible
17:23:10 because it's what people are going to be (or are) encoding into scripts and such
17:23:20 and it's in devstack now, which means we get it for free
17:24:20 it does impose an ordering requirement
17:24:42 meaning you have to have created the cell records first, yes
17:24:44 you have to create a cell before doing online migrations
17:24:46 yeah
17:24:55 but you have to unless you do it all in one command anyway
17:24:57 and if you don't you need to rerun the data migrations
17:25:55 okay, seems like a good approach
17:26:04 given all the commands, I guess writing a scenario for the non-cells case could be the best opt ?
17:26:23 or one single command to rule'em all, either
17:26:37 Writing some scenarios with example commands would be a good idea.
17:26:56 So we can agree that the command sequence is acceptable.
17:27:04 I'm a bit concerned by the learning curve, given the yells I already had for the api db
17:27:37 the cell0 thing could be also something scaring people ;p
17:27:39 well, this is not going to be something you don't have to worry about, no matter what we do
17:27:40 but..
17:27:46 we should make it as integrated as possible I think
17:27:57 so, my take is, either we do that silently and directly, or we comment that
17:28:01 and it helps to avoid people that don't care about cells thinking they're doing a bunch of cells steps they shouldn't have to do
17:28:10 if it's integrated with the other processes, it looks like just nova stuff
17:28:23 yeah that's my approach saying "if you don't care with cells, okay, just issue this magical command"
17:28:49 Cell0 should be a non worrying implementation detail if we get it right.
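Note on the 17:15:00 and 17:20:48 to 17:21:19 exchanges: two ideas come up there. One is to take a limit and operate on whatever instances are still unmapped, rather than making the operator retain a marker. The other is to derive the target cell from hosts that are already mapped, and to error out unless they all agree. A minimal Python sketch of both follows. It uses plain dicts as stand-ins for the API database, and `resolve_cell`, `map_batch`, and their parameters are hypothetical names chosen for illustration, not nova-manage or Nova code.

```python
from typing import Dict, Iterable, List


def resolve_cell(local_hosts: Iterable[str], host_to_cell: Dict[str, str]) -> str:
    """Return the one cell every local host is mapped to, else raise."""
    hosts = list(local_hosts)
    unmapped = sorted(h for h in hosts if h not in host_to_cell)
    if unmapped:
        raise LookupError(f"hosts not mapped to any cell: {unmapped}")
    cells = {host_to_cell[h] for h in hosts}
    if len(cells) != 1:
        # Covers both "no hosts at all" and "hosts disagree about the cell".
        raise ValueError(f"expected exactly one cell, found: {sorted(cells)}")
    return cells.pop()


def map_batch(instance_uuids: List[str], mapped: Dict[str, str],
              cell: str, limit: int) -> int:
    """Map up to `limit` still-unmapped instances to `cell`.

    Unmapped instances are re-selected on every call, so the caller
    does not have to carry a marker between runs. Returns the number
    of instances mapped by this call.
    """
    todo = [u for u in instance_uuids if u not in mapped][:limit]
    for uuid in todo:
        mapped[uuid] = cell
    return len(todo)
```

Under these assumptions a script can rerun `map_batch` until it returns 0, which is the scripting property asked for at 17:15:00. The ordering requirement raised at 17:24:20 shows up here too: `resolve_cell` only succeeds once the cell and host mappings exist.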
17:29:38 bauzas: well, we still have to have them create the initial cell record stuff somehow
17:29:42 does someone want to write this all up somewhere?
17:29:52 some example scenarios and commands
17:29:56 I could
17:30:00 I can write up a couple of options.
17:30:03 Or bauzas :)
17:30:10 given I'm the one who asks for :p
17:30:37 #action bauzas write up some cell migration scenarios and commands used
17:30:59 that said, I can just upload a placeholder doc saying "WRITE YOUR SCENARIO THERE ===>" and leave others using the nice Gerrit UI that I like (and that I'm conscious being in minority :p )
17:31:30 bauzas: I think throwing up a review would be great
17:31:39 I like the gerrit UI as well, now that it doesn't jump on me
17:32:02 we already have one? https://review.openstack.org/#/c/267153/
17:32:19 once we have an agreed upon set of commands we can work to get devstack/grenade using and testing that
17:32:55 when do you online migrate the instance? when we first pull it from the db?
17:32:59 mriedem: sure, we can throw it in there if that's what works
17:33:26 assuming the instance's host is already mapped to a cell
17:33:31 so far we've just talked about doing it with the nova-manage run-online-migrations thing
17:34:20 we can look at doing it on instance access as well
17:34:29 mriedem: that's something I can rebase
17:34:40 (talking of https://review.openstack.org/#/c/267153/1/doc/source/cells.rst)
17:35:05 are cell mappings created manually then?
17:35:13 mriedem: yes
17:35:24 which is the map_cell_and_hosts cli right?
17:35:29 yeah
17:35:29 right
17:35:42 ok
17:35:57 doing the migrations is one thing, mapping the hosts is another thing
17:36:51 was just thinking if we could do something on n-cpu startup, but we don't have the transport_url
17:37:14 anyway, i will shut up now
17:37:52 we'd have to put cell info somewhere for that to work
17:38:11 which cell it should be in I mean
17:38:32 yeah
17:38:47 was just thinking of the single cell case
17:38:53 or migrating from no cells
17:39:54 I guess we could get fancy, but at some point something has to make an entry saying there's now a cell with name 'foo' and db x and mq y
17:39:56 we could do crazy pants and use other things for describing how to shard your cloud, like aggregates
17:40:03 but that's a foolish idea
17:40:20 regions!
17:40:29 * alaski facepalms
17:40:43 moving on, we have a next step here
17:40:44 * bauzas fullbodypalms
17:40:50 #topic open reviews
17:40:55 new link https://etherpad.openstack.org/p/newton-nova-priorities-tracking
17:41:05 I added some patches there
17:41:11 please add your patches as they go up
17:41:16 and please review patches in there
17:41:19 I added the mq switching spec
17:41:24 awesome
17:41:28 I guess since it was cleared I didn't put the cell0 patches up. Will now.
17:41:30 flavors....where are we on the create one?
17:41:31 alaski: ohay, awesome
17:41:44 yeah, there is a -2 that confused me
17:41:57 Are we reviewing past the -2? for flavors.
17:41:58 I thought we were having this because postgre ?
17:42:02 bauzas: the -2 is because we want to merge the series as one step
17:42:10 so we approve everything behind it, then that one
17:42:18 Ahh. Ok.
17:42:32 bauzas: the postgres issue seems to be fixed
17:42:32 alaski: okay, because we fear to miss some changes left in the road ?
17:42:40 alaski: you're -WIP on https://review.openstack.org/#/c/298455/ also
17:42:52 i'm avoiding WIP things from the etherpad really
17:42:58 alaski: that's a point I asked dansmith in the review, I feel he found the pony
17:43:12 mriedem: yeah, I'm afraid I might need to add more attributes for scheduling. I'm working ahead to determine that
17:43:32 it's reviewable, but I may want to add more later
17:44:41 alaski: that's next in my pipe, could I still review https://review.openstack.org/#/c/298455/ ?
17:44:46 bauzas: there's a block on flavor.create in that series until some work from a later patch happens. so we don't want to merge the block without a way to unblock
17:45:16 alaski: the t.n.m thing ?
17:45:22 doffm: yeah, please do review
17:45:35 doffm: just holding so we have the full stack +Ad before we let it go
17:45:36 so we should really review https://review.openstack.org/#/c/295310/ for the flavors series then right?
17:45:39 oops, I meant the n.t.unit.db thing
17:45:51 mriedem: yes, everything in that set
17:46:19 dansmith: https://review.openstack.org/#/c/296106/10 could be left off the series
17:46:28 bauzas: basically we want https://review.openstack.org/#/c/295310/ at the same time as the bottom one in that series
17:46:35 dansmith: it's not requiring something in the series right?
17:46:43 bauzas: it's trivial cleanup, seems fine to leave it in
17:46:46 and it's already approved
17:46:46 dansmith: gotcha
17:46:54 bauzas: I know, I saw your comment, I just didn't want to rearrange things and risk breakage
17:47:00 I think it's fine, but it shouldn't matter
17:47:17 dansmith: mriedem: okay, I'm fine with that, given the -2 that would unblock all the things
17:47:19 if I have to respin again i can check, but...
17:47:37 we just need to make sure the gate is in a nice shape :)
17:47:43 before unblocking the series :p
17:47:53 even if it's not, the risk window is very small
17:47:57 I think it's fine
17:48:30 yeah i don't think anyone is going to CD nova in between when the bottom and top are merged
17:48:39 +1
17:48:41 and if they do, kudos to them
17:48:50 hehe
17:48:57 even if they do, it's not the end of the world
17:49:15 they'd basically have to be CDing a completely stock build with all the default flavors
17:49:18 probably not very likely :)
17:49:18 but but but my telco customer really needs this fancy new flavor today
17:49:27 and even then, there is cleanup they could do easily
17:50:04 alright, i'll focus on that one this afternoon then
17:50:10 I'd like to get an opinion on https://review.openstack.org/#/c/298455
17:50:35 I WIPed because though that satisfies the API reqs of Instance, it may not satisfy the scheduler/compute reqas
17:50:39 *reqs
17:50:47 so further updates may be needed
17:51:01 do people care about getting everything at once, or just update again if necessary?
17:51:31 or just go whole hog and copy much of instance into buildreq and be done with it?
17:51:47 is it just multiple updates or might something break?
17:51:52 I thought it would be fine to update again if needed. but that's just me
17:51:55 multiple updates
17:52:37 I'll look after
17:52:45 alaski: well, it sounds boiling the ocean, right?
17:52:56 okay. I'll un WIP and just proceed
17:52:58 sucks that we're duplicating,
17:53:13 would be nice if we could just copy the instances schema and whitelist any columns that need special care
17:53:25 but it's kind of too late for that now it seems
17:53:35 so
17:53:42 it seems weird that we're adding those things as columns
17:53:45 mriedem: the BuildRequest object is aimed to be transitory
17:53:48 we don't need to query based on them right?
17:54:02 could we take an instance, cut out the crap we don't care about and store it serialized?
17:54:08 dansmith: it's only ever queried by uuid, or project_id
17:54:33 alaski: yeah, so feels strange to need these to be all columns, and then more columns when you find more stuff you want
17:54:35 I dunno
17:54:45 Do we create an instance object before the build request to serialize it?
17:54:46 i like the serialized instance idea
17:54:59 doffm: yup, IIRC
17:55:01 doffm: we don't, but we could
17:55:08 doffm: you can just create an instance, put shit in it, and then serialize, you don't need to create in the db
17:55:14 OK.
17:55:21 like, you don't need to call Instance.create() on it first
17:55:38 alaski: aren't we ? I thought we were creating the BuildReq object at the same time than the ReqSpec obj
17:55:45 yeah... https://review.openstack.org/#/c/298455/2/nova/compute/api.py@1021
17:55:51 bauzas: req_spec->build_req->instance
17:55:57 but all in the same place right now
17:56:04 hmm
17:56:13 oh yeah I see
17:56:22 all three with overlapping data right?
17:56:23 L1016 we have the instance
17:56:30 that's really just an implem detail for me
17:56:41 dansmith: build_req and req_spec don't overlap
17:56:44 I mean, it seems to be fine to modify this if needed
17:56:47 build_req contains a req_spec
17:56:58 alaski: okay
17:57:13 build_req extends req_spec
17:57:18 I must be missing why we need to do this
17:57:29 surely seems like we could just create the instance object as we go, with things we know,
17:57:42 serialize it in the request, and then later call .create() on it when we know where it goes
17:58:30 that will work for now
17:58:50 but not later? or ...?
17:59:06 I ultimately wanted to move more towards breaking out better units of things, and not rely on the current instance model
17:59:16 okay
17:59:20 like properly model things the scheduler needs, things compute needs, etc...
17:59:33 okay, well,
17:59:36 but, that doesn't need to get tangled up here
17:59:52 we have 1 minute...
17:59:57 I guess I'm just worried we'll end up with basically the same thing as the instances table here if we store these per column and have to add something every time we get a new thing
18:00:18 dansmith: yeah
18:00:24 let's hop back to -nova
18:00:26 thanks all!
18:00:30 #endmeeting
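
Note on the serialized-instance idea discussed from 17:54:02 onward: the proposal is to keep only the lookup keys (uuid and project_id, per 17:54:08) as real columns and store the rest of the not-yet-created instance as one serialized value, so a new attribute does not need a new column. A rough Python sketch of that shape follows. Plain dicts and `json` stand in for Nova's versioned objects here, and `BuildRequestRow`, `KEEP_FIELDS`, `to_build_request`, and `to_instance` are hypothetical names, not the code under review in 298455.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict

# Hypothetical whitelist: the instance fields worth keeping in the blob.
KEEP_FIELDS = ("uuid", "project_id", "display_name", "flavor_id", "image_ref")


@dataclass
class BuildRequestRow:
    """One stored build request: two queryable keys plus an opaque blob."""
    instance_uuid: str
    project_id: str
    instance_blob: str  # JSON text holding the trimmed instance


def to_build_request(instance: Dict[str, Any]) -> BuildRequestRow:
    """Trim the instance to the whitelisted fields and serialize it."""
    trimmed = {k: instance[k] for k in KEEP_FIELDS if k in instance}
    return BuildRequestRow(
        instance_uuid=instance["uuid"],
        project_id=instance["project_id"],
        instance_blob=json.dumps(trimmed, sort_keys=True),
    )


def to_instance(row: BuildRequestRow) -> Dict[str, Any]:
    """Rebuild the trimmed instance once its destination cell is known."""
    return json.loads(row.instance_blob)
```

In this sketch, adding a field means extending `KEEP_FIELDS` rather than changing the table, which is the concern voiced at 17:59:57. It does not address the longer-term goal at 17:59:06 of modelling scheduler and compute needs separately.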