17:00:26 <alaski> #startmeeting nova_cells
17:00:27 <openstack> Meeting started Wed Dec 17 17:00:26 2014 UTC and is due to finish in 60 minutes.  The chair is alaski. Information about MeetBot at http://wiki.debian.org/MeetBot.
17:00:28 <bauzas> my bell rang :)
17:00:28 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
17:00:30 <openstack> The meeting name has been set to 'nova_cells'
17:00:34 <alaski> bauzas: :)
17:00:36 <alaski> Hi
17:00:36 <gilliard> Hello :)
17:00:37 <mriedem> o/
17:00:42 <bauzas> \o
17:00:57 <melwitt> o/
17:00:57 <edleafe> o/
17:01:07 <alaski> #topic Cells manifesto
17:01:12 <alaski> https://review.openstack.org/#/c/139191/
17:01:28 <alaski> I just updated that based on some good feedback
17:01:33 <bauzas> alaski: eh :)
17:01:36 <dansmith> o/
17:01:58 <alaski> more feedback is appreciated as always, but I think it's looking pretty good
17:02:21 <dansmith> I'll try to hit that here in a bit
17:02:33 <alaski> cool
17:02:37 <johnthetubaguy> me too, interested in reviewing that
17:02:49 <bauzas> alaski: the change looks pretty small
17:03:09 <bauzas> alaski: so if you agree to add more mentions of the networking stuff later, I'm +1 on it
17:03:34 <alaski> bauzas: great
17:03:37 <bauzas> alaski: because I was thinking it was necessary to also discuss what would be the standard way for networks in cells
17:04:03 <alaski> I agree.  But I don't have anything together for that right now
17:04:11 <bauzas> right
17:04:18 <alaski> but it's open for proposals
17:04:25 <bauzas> so, let's move on, and discuss it later on
17:04:34 <alaski> +1
17:04:40 <alaski> #topic Testing
17:04:41 <johnthetubaguy> alaski: I think we should go all in on neutron, assuming nova-network is dead by then, but let's move on
17:04:55 <alaski> same stuff at https://etherpad.openstack.org/p/nova-cells-testing
17:05:10 <alaski> newest failures at the bottom
17:05:29 <alaski> I tackled the test_host_negative tests and have a review up, which needs some unit tests
17:05:43 <alaski> but there are plenty more to look at
17:06:05 <dansmith> alaski: should we queue up a patch to remove the flavor query from the libvirt driver on top of my flavor patch to see if cells is happy with it?
17:06:15 <bauzas> alaski: could you please provide the devstack logs also?
17:06:34 <alaski> dansmith: yeah, that would be a great test
17:06:38 <bauzas> alaski: because if not, one needs to run a cells devstack
17:07:01 <alaski> bauzas: I can include a link to the job I pulled the failures from
17:07:22 <bauzas> nevermind, I'm bad at reading
17:07:23 <alaski> and a 'check experimental' can run the tests for logs as well
17:07:39 <bauzas> agreed
17:07:58 <melwitt> dansmith: I thought garyk had a patch up to do something like that -- pass the flavor to driver instead of driver lookup?
17:08:00 <bauzas> https://review.openstack.org/#/c/141905/ seems to be the right change to look at
17:08:12 <mriedem> melwitt: i think those are short-term
17:08:22 <melwitt> Okay
17:08:23 <dansmith> melwitt: different anyway I think
17:08:26 <alaski> bauzas: yes, that's a good one
17:08:35 <dansmith> melwitt: but I'll look
17:08:37 <alaski> melwitt: that change merged
17:08:49 <melwitt> oh, sorry
17:08:50 <alaski> that's what brought the test failures as low as they are
17:08:56 <dansmith> alaski: link?
17:09:18 <alaski> dansmith: https://review.openstack.org/#/c/135285/
17:09:31 <dansmith> I see
17:09:41 <alaski> dansmith: so that's actually what should be removed
17:09:55 <dansmith> so,
17:10:01 <dansmith> doesn't this undo things?
17:10:12 <dansmith> meaning, I'd think it causes a bunch of the driver to look at potentially different flavor bits
17:10:23 <dansmith> instead of just the extra_specs bit
17:11:09 <alaski> I'd have to look at the preceding patch again, but I think the passed in flavor should match what would have been queried in the driver
17:11:26 <dansmith> okay
17:11:35 <dansmith> maybe the driver was already being too aggressive there actually
17:11:55 <dansmith> anyway, that's fine, sorry for the distraction
17:12:08 <alaski> it probably was
17:12:11 <alaski> no worries
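For context, the change being discussed replaces a flavor lookup inside the driver with a flavor passed in by the caller. The sketch below only illustrates the shape of that change, with made-up function names rather than the actual libvirt driver code.

```python
# Illustrative sketch only -- not the actual libvirt driver code.

# Before: the driver looks the flavor up itself, which means a DB
# (and, with cells, possibly cross-cell) round trip inside the driver.
def get_extra_specs_via_lookup(instance, flavor_get):
    flavor = flavor_get(instance.instance_type_id)
    return flavor.extra_specs


# After: the caller passes in the flavor object it already has, so
# the driver never queries for it and only uses what it was handed.
def get_extra_specs_passed_in(flavor):
    return flavor.extra_specs
```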
17:12:20 <alaski> anything else on testing?
17:12:44 <alaski> #topic Table analysis
17:12:52 <alaski> https://etherpad.openstack.org/p/nova-cells-table-analysis
17:13:15 <alaski> Unless there are objections I think we should move the uncontroversial tables into the devref for now
17:13:36 <alaski> as a follow on to the cells page started with the manifesto
17:14:15 <dansmith> sounds like a good incremental step
17:14:29 <alaski> cool
17:14:43 <johnthetubaguy> yeah +1 on that
17:14:48 <bauzas> alaski: I vote for a conservative way of explaining this using lots of conditionals: "it should", "we may" etc. :D
17:14:50 <alaski> that will give some concrete examples to reference while discussing things or writing specs
17:15:34 <alaski> bauzas: sure.  There can be a disclaimer that this is the current breakdown, but nothing is final until it's coded, and not even then
17:15:44 <bauzas> alaski: I like this :)
17:16:23 <alaski> anyone want to make the devref review?
17:17:11 <alaski> okay, I'll add it as a followup to the manifesto review
17:17:36 <alaski> #topic Cells scheduling
17:17:45 <alaski> The big topic for today
17:17:49 <alaski> https://review.openstack.org/#/c/141486/
17:18:21 <alaski> there's a lot of stuff in there
17:18:52 <dansmith> I haven't looked at this yet, but I will after the meeting
17:18:56 * dansmith sucks
17:19:07 <bauzas> \o/
17:19:11 <alaski> I made a proposal to start the discussion, but am not necessarily advocating what's currently up
17:19:25 <bauzas> alaski: yeah, I think the discussion is very large
17:19:49 <dansmith> the other thing we can do,
17:19:52 <bauzas> alaski: see all our current discussions in the table analysis that are also related to the scheduler placement decision
17:20:06 <dansmith> is expect that the current scheduling approach will apply to cells as if it was a single deployment,
17:20:17 <bauzas> dansmith: +1000
17:20:25 <dansmith> with the note that we can't do anything other than prove that the rest of the stuff works until we address the scale problem
17:20:51 <bauzas> dansmith: I'm seeing the nova-scheduler code as something unique, but which can be deployed in many ways
17:20:56 <dansmith> so, we can't dump cellsv1 until the scheduler scales, but at least we're able to move forward on the rest of the bits which are more structural
17:21:09 * dansmith wonders what johnthetubaguy thinks of that
17:21:18 <bauzas> dansmith: i.e. if it's a scalability problem, then we can consider having one nova-scheduler just for picking cells and one nova-scheduler process per cell
17:21:27 <johnthetubaguy> dansmith: I think thats what I was thinking too, if I understand you correctly
17:21:54 <johnthetubaguy> basically, we need *something* to give us (cell, host, node), let's make that a single RPC call to the "scheduler"
17:22:06 <bauzas> at the moment, the scheduler is unable to pick a cell, so I would propose to add a new RPC method for this
17:22:07 <johnthetubaguy> what happens next, needs fixing before we drop v1
17:22:10 <dansmith> right, and then we can fix what is behind it
17:22:14 <dansmith> johnthetubaguy: yeah, agreed
17:22:20 <dansmith> alaski: thoughts?
17:23:21 <alaski> I'm okay with that approach
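A minimal sketch of the single-RPC-call idea discussed above: one call that hands back a (cell, host, node) triple and hides whatever happens behind it. The class and method names here are hypothetical, not the actual Nova scheduler RPC API.

```python
# Hypothetical sketch of a single scheduler RPC that returns a
# (cell, host, node) triple; names are illustrative only.
from collections import namedtuple

Destination = namedtuple('Destination', ['cell', 'host', 'node'])


class SchedulerAPI(object):
    """Client-side proxy the API/conductor would call."""

    def __init__(self, rpc_client):
        self.rpc_client = rpc_client

    def select_destination(self, context, request_spec):
        # One RPC round trip; whatever happens behind this call
        # (cell-then-host, or a flat host pick) is an implementation
        # detail that can be fixed later without changing callers.
        result = self.rpc_client.call(context, 'select_destination',
                                      request_spec=request_spec)
        return Destination(cell=result['cell'],
                           host=result['host'],
                           node=result['node'])
```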
17:23:47 <bauzas> do we all agree to extend the current nova-scheduler ?
17:23:51 <johnthetubaguy> alaski: I wonder if tasks could help your worries in the spec?
17:24:02 <dansmith> bauzas: I don't agree to that
17:24:14 <johnthetubaguy> alaski: the global cell records the user request as a task
17:24:14 <dansmith> bauzas: I agree to not hinge cellsv2 on the non-scaling scheduler problem :P
17:24:27 <johnthetubaguy> alaski: once the scheduler is done, we create the instance record?
17:24:29 <alaski> what I like is the interface between the API and cells being flexible enough to handle a few scheduler deployment scenarios
17:24:45 <alaski> and look at a single scheduler right now
17:25:15 <alaski> johnthetubaguy: tasks would help, yes
17:25:20 <johnthetubaguy> alaski: right, the above interface we describe could actually be a service that first queries a cell, then uses that to pick the next scheduler to ask, then returns, etc
17:25:32 <alaski> johnthetubaguy: I proposed something that starts to look like a task in the spec
17:25:43 <bauzas> johnthetubaguy: are we considering picking a cell first and then a host?
17:25:46 <johnthetubaguy> alaski: I should read further down, sorry!
17:26:21 <johnthetubaguy> dansmith: that's a good point though, the question is do we want a DB record by the time the API returns, or does the API wait for the scheduler to finish before returning
17:26:46 <johnthetubaguy> dansmith: I skipped over that before, and having a "task" recorded in the global DB fixes that a little
17:26:53 <dansmith> johnthetubaguy: gotta have something before the API returns I'd think, but we can just shove it into the mapping at that point, right?
17:27:19 <alaski> We need to store a little more than the mapping can hold
17:27:38 <dansmith> like what?
17:27:41 <dansmith> more than just a uuid?
17:27:54 <bauzas> alaski: you mean the spec ?
17:27:59 <dansmith> I guess we have to look like we've recorded all the information they gave us...
17:28:02 <alaski> bauzas: not the spec
17:28:05 <alaski> dansmith: yes
17:28:18 <alaski> we need to fulfill the current api contract if they show the instance right away
17:28:26 <dansmith> yeah
17:28:50 <bauzas> alaski: gilliard and I left a comment asking why you're considering the whole story as async
17:28:50 <dansmith> alaski: we could go ahead and add an instance json cache table :)
17:29:04 <bauzas> alaski: that's still unclear for me
17:29:06 * johnthetubaguy shudders
17:29:23 <alaski> dansmith: heh, that would probably work
17:29:30 <dansmith> johnthetubaguy: are you shuddering at me?
17:29:32 <alaski> I was thinking of storing it almost as a task
17:29:46 <johnthetubaguy> dansmith: yeah, it feels nasty, but it's annoying because it works...
17:30:01 <dansmith> johnthetubaguy: I thought there was a goal of doing that anyway?
17:30:07 <alaski> bauzas: because the API should be snappy, and waiting for the scheduler won't allow that
17:30:10 <dansmith> johnthetubaguy: even for running instances
17:30:31 <johnthetubaguy> dansmith: yeah, we do need a cache eventually, that's true, I mean we might need to index on other things, but yeah
17:31:02 <bauzas> alaski: but how are you sure that the contract you give back to the user is valid?
17:31:34 <bauzas> sorry guys, I know about the tasks proposal but I just want to make sure we all agree with the idea of returning a uuid in a building state?
17:31:48 <bauzas> and then just giving the user a failed status if it doesn't work?
17:32:05 <johnthetubaguy> bauzas: think about listing all your instances, etc, it's more that bit I worry about
17:32:17 <bauzas> johnthetubaguy: hell yeah
17:32:32 <alaski> dansmith: johnthetubaguy we need somewhere to store some info.  task or instance cache should work now, we should probably get that on the review and debate/think on it there
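To make the two storage options being weighed concrete (a task-style build record versus an instance JSON cache in the API-level database), here is a hedged SQLAlchemy sketch; all table and column names are invented for illustration and are not Nova's actual schema.

```python
# Hypothetical sketch of the API-level storage being discussed:
# either a task-like build request or a JSON cache of the instance,
# keyed by uuid so the API can answer "show" before scheduling finishes.
from sqlalchemy import Column, DateTime, Integer, String, Text
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()


class BuildRequest(Base):
    """Task-style record of the user's boot request."""
    __tablename__ = 'build_requests'

    id = Column(Integer, primary_key=True)
    instance_uuid = Column(String(36), nullable=False, unique=True)
    project_id = Column(String(255))
    # Everything the user gave us at boot time, enough to satisfy
    # the existing API contract if the instance is shown right away.
    request_spec = Column(Text)


class InstanceCache(Base):
    """Alternative: a JSON blob cache of the instance record."""
    __tablename__ = 'instance_cache'

    id = Column(Integer, primary_key=True)
    instance_uuid = Column(String(36), nullable=False, unique=True)
    updated_at = Column(DateTime)
    instance_json = Column(Text)
```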
17:32:55 <dansmith> yeah, I have a hard time thinking these things through without actually doing them,
17:32:59 <johnthetubaguy> alaski: yeah, sounds like it's hit all the major issues, I should just review the written detail
17:33:02 <dansmith> 'cause I'm weak-minded
17:33:22 <alaski> dansmith: hah, I find that hard to believe
17:33:31 <johnthetubaguy> likewise.
17:33:52 <alaski> bauzas: I don't think returning before scheduling changes the contract we have now, because we don't currently wait on scheduling
17:34:09 <bauzas> alaski: right
17:34:47 <alaski> so I'm not sure why we would start to wait on scheduling
17:34:54 <bauzas> alaski: gotcha
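Pulling the pieces together, a rough sketch of the asynchronous flow being agreed on: return a uuid in BUILDING state immediately and schedule afterwards. The helper objects and method names are assumptions for illustration, not existing Nova code.

```python
# Hypothetical flow sketch: the API returns right away with a uuid in
# BUILDING state and lets scheduling happen asynchronously, matching
# the behaviour discussed above. Names are illustrative only.
import uuid


def boot_instance(context, request_spec, db, scheduler_rpc):
    instance_uuid = str(uuid.uuid4())

    # Record the request so "show" works immediately, keeping the
    # existing API contract even though nothing is scheduled yet.
    db.create_build_request(context, instance_uuid, request_spec)

    # Fire-and-forget: the API does not wait on the scheduler.
    scheduler_rpc.cast(context, 'schedule_and_build',
                       instance_uuid=instance_uuid,
                       request_spec=request_spec)

    return {'id': instance_uuid, 'status': 'BUILDING'}
```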
17:35:49 <alaski> anything else on scheduling for now?
17:35:59 <dansmith> please no.
17:36:00 <dansmith> :)
17:36:03 <alaski> :)
17:36:04 <johnthetubaguy> +1
17:36:18 <alaski> alright, please add thoughts to the review
17:36:27 <alaski> #topic Open Discussion
17:36:47 <alaski> General announcement, the next meeting will be Jan 7th
17:37:16 <alaski> anything else people would like to discuss?
17:37:53 <alaski> going once...
17:38:10 <dansmith> sold!
17:38:18 <alaski> alright, early marks all around!
17:38:25 <johnthetubaguy> :)
17:38:26 <bauzas> awesome
17:38:26 <dansmith> heh
17:38:31 <gilliard> :)
17:38:31 <alaski> #endmeeting