17:13:15 <alaski> #startmeeting nova_cells
17:13:17 <openstack> Meeting started Wed Mar 30 17:13:15 2016 UTC and is due to finish in 60 minutes.  The chair is alaski. Information about MeetBot at http://wiki.debian.org/MeetBot.
17:13:18 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
17:13:20 <openstack> The meeting name has been set to 'nova_cells'
17:13:24 <anteaya> welcome
17:13:37 * bauzas finally waves
17:14:04 <dansmith> alaski: so in the non-cells case, we don't need to worry about cell assignment, because it's clear which one to assign right?
17:14:10 <alaski> dansmith: yes, looks like it https://review.openstack.org/#/c/270565/12/nova/cmd/manage.py
17:14:12 * bauzas reads the scrollback
17:14:15 <melwitt> the instance mapping one is https://github.com/openstack/nova/blob/master/nova/cmd/manage.py#L1319 IIUC
17:14:28 <dansmith> I guess in the cellsv2 migration case I'd expect we run that in each existing cell which would make it clear which one instances should be assigned to
17:14:37 <alaski> dansmith: right for non-cells
17:15:00 <dansmith> that instance one takes a marker, but it should just take a limit and operate on any non-mapped instances, right?
17:15:08 <dansmith> taking a marker is just terrible UX I think
17:16:49 <mriedem> that map_instances command is pretty old
17:16:50 <alaski> you'd have to compare two dbs to find the unmapped instances
17:17:05 <alaski> which is fine, just a bit more work
17:17:22 <dansmith> it just seems terrible to me to make them retain the marker.. much harder to script
17:17:23 <dansmith> and
17:17:37 <dansmith> I would think it'd be susceptible to drift as the process could take a while
17:17:50 <dansmith> so anyway, that aside,
17:17:56 <alaski> the sort order and key was picked to avoid that
17:18:21 <dansmith> I guess it seems to me like creating the cell records and probably the host mapping can/should happen synchronously and separately,
17:18:22 <alaski> but we could definitely make that better
17:18:37 <bauzas> for testing, we don't need to set the marker, so I guess it's okay
17:18:42 <dansmith> but the instance mapping thing seems like it can very well be an online migration that does small batches at a time, along with everything else
17:19:03 <doffm> Online migration would need a 'default' cell to be operating on.
17:19:12 <doffm> Not sure how it would work for the cellsv1 upgrade case.
17:19:17 <dansmith> alaski: the marker works for a proper sort order, but assuming we have soft-deleted instances
17:19:26 <dansmith> which we do, so I guess we're okay, it just seems odd
17:19:31 <dansmith> the ansible/puppet people will hate that
17:19:52 <alaski> that's fair
17:20:18 <alaski> so I think for the instancemapping migration there's some info needed in each case
17:20:31 <alaski> you need to know the api db, and which cell you're in
17:20:33 <dansmith> and yes, to doffm's point, we'd need some way to know what cell we need to put things in
17:20:36 <dansmith> yeah
17:20:48 <dansmith> if the hosts are already mapped, we could figure it out right?
17:20:56 <dansmith> pick our first host, figure out what cell it claims to be in in the api db,
17:20:59 <dansmith> and then roll with that?
17:21:12 <doffm> Ummm yeah. I think that works.
17:21:14 <dansmith> and/or make sure all your hosts are in the same cell and if so, use that cell, else error
17:21:19 <alaski> yeah, that should work
17:21:20 <doffm> Migration might be a pain.
17:22:35 <alaski> it's worth a shot. I want to think on it a bit more, and it would help to see it
17:22:49 <dansmith> for the non-cells case,
17:23:01 <dansmith> I think it's important that we build as much into the generic command as possible
17:23:10 <dansmith> because it's what people are going to be (or are) encoding into scripts and such
17:23:20 <dansmith> and it's in devstack now, which means we get it for free
17:24:20 <alaski> it does impose an ordering requirement
17:24:42 <dansmith> meaning you have to have created the cell records first, yes
17:24:44 <alaski> you have to create a cell before doing online migrations
17:24:46 <alaski> yeah
17:24:55 <dansmith> but you have to unless you do it all in one command anyway
17:24:57 <alaski> and if you don't you need to rerun the data migrations
17:25:55 <alaski> okay, seems like a good approach
17:26:04 <bauzas> given all the commands, I guess writing a scenario for the non-cells case could be the best opt ?
17:26:23 <bauzas> or one single command to rule'em all, either
17:26:37 <doffm> Writing some scenarios with example commands would be a good idea.
17:26:56 <doffm> So we can agree that the command sequence is acceptable.
17:27:04 <bauzas> I'm a bit concerned by the learning curve, given the yells I already had for the api db
17:27:37 <bauzas> the cell0 thing could be also something scaring people ;p
17:27:39 <dansmith> well, this is not going to be something you don't have to worry about, no matter what we do
17:27:40 <dansmith> but..
17:27:46 <dansmith> we should make it as integrated as possible I think
17:27:57 <bauzas> so, my take is, either we do that silently and directly, or we comment that
17:28:01 <dansmith> and it helps to avoid people that don't care about cells thinking they're doing a bunch of cells steps they shouldn't have to do
17:28:10 <dansmith> if it's integrated with the other processes, it looks like just nova stuff
17:28:23 <bauzas> yeah that's my approach saying "if you don't care with cells, okay, just issue this magical command"
17:28:49 <doffm> Cell0 should be a non worrying implementation detail if we get it right.
17:29:38 <dansmith> bauzas: well, we still have to have them create the initial cell record stuff somehow
17:29:42 <alaski> does someone want to write this all up somewhere?
17:29:52 <alaski> some example scenarios and commands
17:29:56 <bauzas> I could
17:30:00 <doffm> I can write up a couple of options.
17:30:03 <doffm> Or bauzas :)
17:30:10 <bauzas> given I'm the one who asks for :p
17:30:37 <alaski> #action bauzas write up some cell migration scenarios and commands used
17:30:59 <bauzas> that said, I can just upload a placeholder doc saying "WRITE YOUR SCENARIO THERE ===>" and leave others using the nice Gerrit UI that I like (and that I'm conscious being in minority :p )
17:31:30 <alaski> bauzas: I think throwing up a review would be great
17:31:39 <alaski> I like the gerrit UI as well, now that it doesn't jump on me
17:32:02 <mriedem> we already have one? https://review.openstack.org/#/c/267153/
17:32:19 <alaski> once we have an agreed upon set of commands we can work to get devstack/grenade using and testing that
17:32:55 <mriedem> when do you online migrate the instance? when we first pull it from the db?
17:32:59 <alaski> mriedem: sure, we can throw it in there if that's what works
17:33:26 <mriedem> assuming the instance's host is already mapped to a cell
17:33:31 <alaski> so far we've just talked about doing it with the nova-manage run-online-migrations thing
17:34:20 <alaski> we can look at doing it on instance access as well
17:34:29 <bauzas> mriedem: that's something I can rebase
17:34:40 <bauzas> (talking of https://review.openstack.org/#/c/267153/1/doc/source/cells.rst)
17:35:05 <mriedem> are cell mappings created manually then?
17:35:13 <alaski> mriedem: yes
17:35:24 <mriedem> which is the map_cell_and_hosts cli right?
17:35:29 <bauzas> yeah
17:35:29 <alaski> right
17:35:42 <mriedem> ok
17:35:57 <bauzas> doing the migrations is one thing, mapping the hosts is another thing
17:36:51 <mriedem> was just thinking if we could do something on n-cpu startup, but we don't have the transport_url
17:37:14 <mriedem> anyway, i will shut up now
17:37:52 <alaski> we'd have to put cell info somewhere for that to work
17:38:11 <alaski> which cell it should be in I mean
17:38:32 <mriedem> yeah
17:38:47 <mriedem> was just thinking of the single cell case
17:38:53 <mriedem> or migrating from no cells
17:39:54 <alaski> I guess we could get fancy, but at some point something has to make an entry saying there's now a cell with name 'foo' and db x and mq y
17:39:56 <bauzas> we could do crazy pants and use other things for describing how to shard your cloud, like aggregates
17:40:03 <bauzas> but that's a foolish idea
17:40:20 <mriedem> regions!
17:40:29 * alaski facepalms
17:40:43 <alaski> moving on, we have a next step here
17:40:44 * bauzas fullbodypalms
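A minimal sketch of the approach discussed in this topic: infer the target cell from hosts that are already mapped (erroring unless every host maps to the same cell), then map unmapped instances in limit-sized batches so that no marker has to be retained between runs. Every name here (`Instance`, `infer_cell`, `map_instances_batch`, the dict-based mappings) is a hypothetical stand-in for illustration, not nova code or the nova-manage interface.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    # Hypothetical stand-in for an instance record in a cell database.
    uuid: str
    host: str


def infer_cell(hosts, host_mappings):
    """Return the one cell that all hosts map to, else raise ValueError.

    host_mappings is a plain dict (host name -> cell name) standing in
    for the host mappings kept in the api db.
    """
    cells = {host_mappings.get(host) for host in hosts}
    # Reject no hosts, unmapped hosts (None), or hosts spread over cells.
    if len(cells) != 1 or None in cells:
        raise ValueError(f"cannot infer a single cell from {cells!r}")
    return cells.pop()


def map_instances_batch(instances, instance_mappings, cell, limit):
    """Map up to `limit` not-yet-mapped instances to `cell`.

    Returns how many instances were mapped, so a caller can rerun the
    function until it returns 0 instead of tracking a marker.
    """
    done = 0
    for inst in instances:
        if done >= limit:
            break
        if inst.uuid not in instance_mappings:
            instance_mappings[inst.uuid] = cell
            done += 1
    return done
```

A script would call `infer_cell` once, then loop on `map_instances_batch` until it returns 0. Because each run only looks for instances that lack a mapping, rerunning it after new instances appear is safe.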
17:40:50 <alaski> #topic open reviews
17:40:55 <alaski> new link https://etherpad.openstack.org/p/newton-nova-priorities-tracking
17:41:05 <alaski> I added some patches there
17:41:11 <alaski> please add your patches as they go up
17:41:16 <alaski> and please review patches in there
17:41:19 <melwitt> I added the mq switching spec
17:41:24 <alaski> awesome
17:41:28 <doffm> I guess since it was cleared I didn't put the cell0 patches up. Will now.
17:41:30 <mriedem> flavors....where are we on the create one?
17:41:31 <bauzas> alaski: ohay, awesome
17:41:44 <bauzas> yeah, there is a -2 that confused me
17:41:57 <doffm> Are we reviewing past the -2? for flavors.
17:41:58 <bauzas> I thought we were having this because of postgres ?
17:42:02 <alaski> bauzas: the -2 is because we want to merge the series as one step
17:42:10 <alaski> so we approve everything behind it, then that one
17:42:18 <doffm> Ahh. Ok.
17:42:32 <alaski> bauzas: the postgres issue seems to be fixed
17:42:32 <bauzas> alaski: okay, because we fear to miss some changes left in the road ?
17:42:40 <mriedem> alaski: you're -WIP on https://review.openstack.org/#/c/298455/ also
17:42:52 <mriedem> i'm avoiding WIP things from the etherpad really
17:42:58 <bauzas> alaski: that's a point I asked dansmith in the review, I feel he found the pony
17:43:12 <alaski> mriedem: yeah, I'm afraid I might need to add more attributes for scheduling. I'm working ahead to determine that
17:43:32 <alaski> it's reviewable, but I may want to add more later
17:44:41 <bauzas> alaski: that's next in my pipe, could I still review https://review.openstack.org/#/c/298455/ ?
17:44:46 <alaski> bauzas: there's a block on flavor.create in that series until some work from a later patch happens. so we don't want to merge the block without a way to unblock
17:45:16 <bauzas> alaski: the t.n.m thing ?
17:45:22 <dansmith> doffm: yeah, please do review
17:45:35 <dansmith> doffm: just holding so we have the full stack +Ad before we let it go
17:45:36 <mriedem> so we should really review https://review.openstack.org/#/c/295310/ for the flavors series then right?
17:45:39 <bauzas> oops, I meant the n.t.unit.db thing
17:45:51 <dansmith> mriedem: yes, everything in that set
17:46:19 <bauzas> dansmith: https://review.openstack.org/#/c/296106/10 could be left off the series
17:46:28 <alaski> bauzas: basically we want https://review.openstack.org/#/c/295310/ at the same time as the bottom one in that series
17:46:35 <bauzas> dansmith: it's not requiring something in the series right?
17:46:43 <mriedem> bauzas: it's trivial cleanup, seems fine to leave it in
17:46:46 <mriedem> and it's already approved
17:46:46 <bauzas> dansmith: gotcha
17:46:54 <dansmith> bauzas: I know, I saw your comment, I just didn't want to rearrange things and risk breakage
17:47:00 <dansmith> I think it's fine, but it shouldn't matter
17:47:17 <bauzas> dansmith: mriedem: okay, I'm fine with that, given the -2 that would unblock all the things
17:47:19 <dansmith> if I have to respin again i can check, but...
17:47:37 <bauzas> we just need to make sure the gate is in a nice shape :)
17:47:43 <bauzas> before unblocking the series :p
17:47:53 <dansmith> even if it's not, the risk window is very small
17:47:57 <dansmith> I think it's fine
17:48:30 <mriedem> yeah i don't think anyone is going to CD nova in between when the bottom and top are merged
17:48:39 <alaski> +1
17:48:41 <mriedem> and if they do, kudos to them
17:48:50 <melwitt> hehe
17:48:57 <dansmith> even if they do, it's not the end of the world
17:49:15 <dansmith> they'd basically have to be CDing a completely stock build with all the default flavors
17:49:18 <dansmith> probably not very likely :)
17:49:18 <mriedem> but but but my telco customer really needs this fancy new flavor today
17:49:27 <dansmith> and even then, there is cleanup they could do easily
17:50:04 <mriedem> alright, i'll focus on that one this afternoon then
17:50:10 <alaski> I'd like to get an opinion on https://review.openstack.org/#/c/298455
17:50:35 <alaski> I WIPed because though that satisfies the API reqs of Instance, it may not satisfy the scheduler/compute reqas
17:50:39 <alaski> *reqs
17:50:47 <alaski> so further updates may be needed
17:51:01 <alaski> do people care about getting everything at once, or just update again if necessary?
17:51:31 <alaski> or just go whole hog and copy much of instance into buildreq and be done with it?
17:51:47 <dansmith> is it just multiple updates or might something break?
17:51:52 <melwitt> I thought it would be fine to update again if needed. but that's just me
17:51:55 <alaski> multiple updates
17:52:37 <dansmith> I'll look after
17:52:45 <bauzas> alaski: well, it sounds boiling the ocean, right?
17:52:56 <alaski> okay. I'll un WIP and just proceed
17:52:58 <mriedem> sucks that we're duplicating,
17:53:13 <mriedem> would be nice if we could just copy the instances schema and whitelist any columns that need special care
17:53:25 <mriedem> but it's kind of too late for that now it seems
17:53:35 <dansmith> so
17:53:42 <dansmith> it seems weird that we're adding those things as columns
17:53:45 <bauzas> mriedem: the BuildRequest object is aimed to be transitory
17:53:48 <dansmith> we don't need to query based on them right?
17:54:02 <dansmith> could we take an instance, cut out the crap we don't care about and store it serialized?
17:54:08 <alaski> dansmith: it's only ever queried by uuid, or project_id
17:54:33 <dansmith> alaski: yeah, so feels strange to need these to be all columns, and then more columns when you find more stuff you want
17:54:35 <dansmith> I dunno
17:54:45 <doffm> Do we create an instance object before the build request to serialize it?
17:54:46 <mriedem> i like the serialized instance idea
17:54:59 <bauzas> doffm: yup, IIRC
17:55:01 <alaski> doffm: we don't, but we could
17:55:08 <dansmith> doffm: you can just create an instance, put shit in it, and then serialize, you don't need to create in the db
17:55:14 <doffm> OK.
17:55:21 <dansmith> like, you don't need to call Instance.create() on it first
17:55:38 <bauzas> alaski: aren't we ? I thought we were creating the BuildReq object at the same time as the ReqSpec obj
17:55:45 <mriedem> yeah... https://review.openstack.org/#/c/298455/2/nova/compute/api.py@1021
17:55:51 <alaski> bauzas: req_spec->build_req->instance
17:55:57 <alaski> but all in the same place right now
17:56:04 <dansmith> hmm
17:56:13 <bauzas> oh yeah I see
17:56:22 <dansmith> all three with overlapping data right?
17:56:23 <mriedem> L1016 we have the instance
17:56:30 <bauzas> that's really just an implem detail for me
17:56:41 <alaski> dansmith: build_req and req_spec don't overlap
17:56:44 <bauzas> I mean, it seems to be fine to modify this if needed
17:56:47 <alaski> build_req contains a req_spec
17:56:58 <dansmith> alaski: okay
17:57:13 <bauzas> build_req extends req_spec
17:57:18 <dansmith> I must be missing why we need to do this
17:57:29 <dansmith> surely seems like we could just create the instance object as we go, with things we know,
17:57:42 <dansmith> serialize it in the request, and then later call .create() on it when we know where it goes
17:58:30 <alaski> that will work for now
17:58:50 <dansmith> but not later? or ...?
17:59:06 <alaski> I ultimately wanted to move more towards breaking out better units of things, and not rely on the current instance model
17:59:16 <dansmith> okay
17:59:20 <alaski> like properly model things the scheduler needs, things compute needs, etc...
17:59:33 <dansmith> okay, well,
17:59:36 <alaski> but, that doesn't need to get tangled up here
17:59:52 <mriedem> we have 1 minute...
17:59:57 <dansmith> I guess I'm just worried we'll end up with basically the same thing as the instances table here if we store these per column and have to add something every time we get a new thing
18:00:18 <alaski> dansmith: yeah
18:00:24 <alaski> let's hop back to -nova
18:00:26 <alaski> thanks all!
18:00:30 <alaski> #endmeeting
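A rough sketch of the serialized-instance idea raised near the end of the meeting: keep only the fields the build request is queried by (uuid, project_id) as columns and store the rest of the in-memory instance as one serialized blob, so a new attribute does not need a new column. The classes and helpers below are hypothetical, and JSON is used only as a stand-in serialization format. None of this is nova's actual BuildRequest or Instance code.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Instance:
    # Hypothetical in-memory instance; it has not been created in any db.
    uuid: str
    project_id: str
    display_name: str = ""
    flavor: str = ""


@dataclass
class BuildRequest:
    # Only the fields used for lookups are kept as separate "columns".
    uuid: str
    project_id: str
    # Everything else travels as a single serialized blob.
    instance_blob: str


def make_build_request(instance):
    """Serialize the whole instance into the build request."""
    return BuildRequest(
        uuid=instance.uuid,
        project_id=instance.project_id,
        instance_blob=json.dumps(asdict(instance)),
    )


def instance_from_build_request(request):
    """Rebuild the instance later, once its destination is known."""
    return Instance(**json.loads(request.instance_blob))
```

In this sketch, adding a field to `Instance` changes only what goes into the blob. The `BuildRequest` columns stay the same.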