17:00:26 <alaski> #startmeeting nova_cells
17:00:27 <openstack> Meeting started Wed Dec 17 17:00:26 2014 UTC and is due to finish in 60 minutes. The chair is alaski. Information about MeetBot at http://wiki.debian.org/MeetBot.
17:00:28 <bauzas> my bell ringed :)
17:00:28 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
17:00:30 <openstack> The meeting name has been set to 'nova_cells'
17:00:34 <alaski> bauzas: :)
17:00:36 <alaski> Hi
17:00:36 <gilliard> Hello :)
17:00:37 <mriedem> o/
17:00:42 <bauzas> \o
17:00:57 <melwitt> o/
17:00:57 <edleafe> o/
17:01:07 <alaski> #topic Cells manifesto
17:01:12 <alaski> https://review.openstack.org/#/c/139191/
17:01:28 <alaski> I just updated that based on some good feedback
17:01:33 <bauzas> alaski: eh :)
17:01:36 <dansmith> o/
17:01:58 <alaski> more feedback is appreciated as always, but I think it's looking pretty good
17:02:21 <dansmith> I'll try to hit that here in a bit
17:02:33 <alaski> cool
17:02:37 <johnthetubaguy> me too, interested in reviewing that
17:02:49 <bauzas> alaski: the change looks pretty small
17:03:09 <bauzas> alaski: so if you agree to add more mentions to the networking stuff later, I'm +1 with it
17:03:34 <alaski> bauzas: great
17:03:37 <bauzas> alaski: because I was thinking it was necessary to also discuss what would be the standard way for networks in cells
17:04:03 <alaski> I agree. But I don't have anything together for that right now
17:04:11 <bauzas> right
17:04:18 <alaski> but it's open for proposals
17:04:25 <bauzas> so, let's move on, and discuss on it later on
17:04:34 <alaski> +1
17:04:40 <alaski> #topic Testing
17:04:41 <johnthetubaguy> alaski: I think we should go all in on neutron, assuming nova-network is dead by then, but lets move on
17:04:55 <alaski> same stuff at https://etherpad.openstack.org/p/nova-cells-testing
17:05:10 <alaski> newest failures at the bottom
17:05:29 <alaski> I tackled the test_host_negative tests and have a review up, which needs some unit tests
17:05:43 <alaski> but there are plenty more to look at
17:06:05 <dansmith> alaski: should we queue up a patch to remove the flavor query from the libvirt driver on top of my flavor patch to see if cells is happy with it?
17:06:15 <bauzas> alaski: could you please provide the devstack logs also ?
17:06:34 <alaski> dansmith: yeah, that would be a great test
17:06:38 <bauzas> alaski: because if not, it needs to run a cells devstack
17:07:01 <alaski> bauzas: I can include a link to the job I pulled the failures from
17:07:22 <bauzas> nevermind, I'm bad at reading
17:07:23 <alaski> and a 'check experimental' can run the tests for logs as well
17:07:39 <bauzas> agreed
17:07:58 <melwitt> dansmith: I thought garyk had a patch up to do something like that -- pass the flavor to driver instead of driver lookup?
17:08:00 <bauzas> https://review.openstack.org/#/c/141905/ seems to be right change to look at
17:08:12 <mriedem> melwitt: i think those are short-term
17:08:22 <melwitt> Okay
17:08:23 <dansmith> melwitt: different anyway I think
17:08:26 <alaski> bauzas: yes, that's a good one
17:08:35 <dansmith> melwitt: but I'll look
17:08:37 <alaski> melwitt: that change merged
17:08:49 <melwitt> oh, sorry
17:08:50 <alaski> that's what brought the test failures as low as they are
17:08:56 <dansmith> alaski: link?
17:09:18 <alaski> dansmith: https://review.openstack.org/#/c/135285/
17:09:31 <dansmith> I see
17:09:41 <alaski> dansmith: so that's actually what should be removed
17:09:55 <dansmith> so,
17:10:01 <dansmith> doesn't this undo things?
17:10:12 <dansmith> meaning, I'd think it causes a bunch of the driver to look at potentially different flavor bits
17:10:23 <dansmith> instead of just the extra_specs bit
17:11:09 <alaski> I'd have to look at the preceding patch again, but I think the passed in flavor should match what would have been queried in the driver
17:11:26 <dansmith> okay
17:11:35 <dansmith> maybe the driver was already being too aggressive there actually
17:11:55 <dansmith> anyway, that's fine, sorry for the distraction
17:12:08 <alaski> it probably was
17:12:11 <alaski> no worries
17:12:20 <alaski> anything else on testing?
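
[Editor's note: a minimal sketch of the pattern discussed above -- the caller passing the flavor into the virt driver instead of the driver querying the database itself. The function names are illustrative and this is not the actual code from the Nova patches referenced in the meeting.]

    # Illustrative only; assumes nova.objects is importable.
    from nova import objects

    # Before: the driver looks the flavor up itself, an extra DB round
    # trip that is awkward once instance data lives in per-cell databases.
    def get_extra_specs_lookup(context, instance):
        flavor = objects.Flavor.get_by_id(context, instance.instance_type_id)
        return flavor.extra_specs

    # After: the flavor travels with the instance (or is passed in by the
    # caller), so the driver never reaches back to the database.
    def get_extra_specs_passed(instance):
        return instance.flavor.extra_specs
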
17:12:44 <alaski> #topic Table analysis
17:12:52 <alaski> https://etherpad.openstack.org/p/nova-cells-table-analysis
17:13:15 <alaski> Unless there are objections I think we should move the uncontroversial tables into the devref for now
17:13:36 <alaski> as a follow on to the cells page started with the manifesto
17:14:15 <dansmith> sounds like a good incremental step
17:14:29 <alaski> cool
17:14:43 <johnthetubaguy> yeah +1 on that
17:14:48 <bauzas> alaski: I vote for a conservative way of explaining this using lots of conditionals : "it should", "we may" etc. :D
17:14:50 <alaski> that will give some concrete examples to reference while discussing things or writing specs
17:15:34 <alaski> bauzas: sure. There can be a disclaimer that this is the current breakdown, but nothing is final until it's coded, and not even then
17:15:44 <bauzas> alaski: I like this :)
17:16:23 <alaski> anyone want to make the devref review?
17:17:11 <alaski> okay, I'll add it as a followup to the manifesto review
17:17:36 <alaski> #topic Cells scheduling
17:17:45 <alaski> The big topic for today
17:17:49 <alaski> https://review.openstack.org/#/c/141486/
17:18:21 <alaski> there's a lot of stuff in there
17:18:52 <dansmith> I haven't looked at this yet, but I will after the meeting
17:18:56 * dansmith sucks
17:19:07 <bauzas> \o/
17:19:11 <alaski> I made a proposal to start the discussion, but am not necessarily advocating what's currently up
17:19:25 <bauzas> alaski: yeah, I think the discussion is very large
17:19:49 <dansmith> the other thing we can do,
17:19:52 <bauzas> alaski: see all our current discussions in the table analysis that are also related to the scheduler placement decision
17:20:06 <dansmith> is expect that the current scheduling approach will apply to cells as if it was a single deployment,
17:20:17 <bauzas> dansmith: +1000
17:20:25 <dansmith> with the note that we can't do anything other than prove that the rest of the stuff works until we address the scale problem
17:20:51 <bauzas> dansmith: I'm seeing the nova-scheduler code as something unique, but which can be deployed in many ways
17:20:56 <dansmith> so, we can't dump cellsv1 until the scheduler scales, but at least we're able to move forward on the rest of the bits which are more structural
17:21:09 * dansmith wonders what johnthetubaguy thinks of that
17:21:18 <bauzas> dansmith: ie. if it's a scalability problem, then we can consider having a nova-scheduler for only cells picking and one nova-scheduler process per cell
17:21:27 <johnthetubaguy> dansmith: I think thats what I was thinking too, if I understand you correctly
17:21:54 <johnthetubaguy> basically, we need *something* to give us (cell, host, node), lets make that a single RPC call to the "scheduler"
17:22:06 <bauzas> at the moment, the scheduler is unable to pick a cell, so I would propose to add a new RPC method for this
17:22:07 <johnthetubaguy> what happens next, needs fixing before we drop v1
17:22:10 <dansmith> right, and then we can fix what is behind it
17:22:14 <dansmith> johnthetubaguy: yeah, agreed
17:22:20 <dansmith> alaski: thoughts?
17:23:21 <alaski> I'm okay with that approach
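
[Editor's note: a hypothetical sketch of the single "give us (cell, host, node)" call agreed above. All names and structure are invented for illustration; the point is that one entry point can hide whether a single global scheduler answers, or a cell is picked first and a per-cell scheduler then picks the host, as bauzas suggests.]

    from collections import namedtuple

    Destination = namedtuple('Destination', ['cell', 'host', 'node'])

    class SchedulerClient(object):
        """Hypothetical facade: one call in, one (cell, host, node) out."""

        def __init__(self, cell_picker, host_pickers):
            self.cell_picker = cell_picker    # global: request -> cell
            self.host_pickers = host_pickers  # per-cell: request -> (host, node)

        def select_destination(self, context, request_spec):
            # Deployment detail hidden from the caller: here a cell is
            # picked first and that cell's scheduler picks the host, but
            # a single flat scheduler could answer the same call.
            cell = self.cell_picker.pick_cell(context, request_spec)
            host, node = self.host_pickers[cell].pick_host(context, request_spec)
            return Destination(cell, host, node)
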
17:23:47 <bauzas> do we all agree to extend the current nova-scheduler ?
17:23:51 <johnthetubaguy> alaski: I wonder if tasks could help your worries in the spec?
17:24:02 <dansmith> bauzas: I don't agree to that
17:24:14 <johnthetubaguy> alaski: the global cell records the user request as a task
17:24:14 <dansmith> bauzas: I agree to not hinge cellsv2 on the non-scaling scheduler problem :P
17:24:27 <johnthetubaguy> alaski: once the scheduler is done, we create the instance record?
17:24:29 <alaski> what I like is the interface between the API and cells being flexible enough to handle a few scheduler deployment scenarios
17:24:45 <alaski> and look at a single scheduler right now
17:25:15 <alaski> johnthetubaguy: tasks would help, yes
17:25:20 <johnthetubaguy> alaski: right, the above interface we describe could actually be a service that first queries a cell, then uses that to pick the next scheduler to ask, then returns, etc
17:25:32 <alaski> johnthetubaguy: I proposed something that starts to look like a task in the spec
17:25:43 <bauzas> johnthetubaguy: do we consider to elect a cell first and then a host ?
17:25:46 <johnthetubaguy> alaski: I should read further down, sorry!
17:26:21 <johnthetubaguy> dansmith: thats a good point though, the question is do we want a DB record by the time the API returns, or does the API return wait for the scheduler to finish
17:26:46 <johnthetubaguy> dansmith: I skipped over that before, and having a "task" recorded in the global DB fixes that a little
17:26:53 <dansmith> johnthetubaguy: gotta have something before the API returns I'd think, but we can just shove it into the mapping at that point, right?
17:27:19 <alaski> We need to store a little more than the mapping can hold
17:27:38 <dansmith> like what?
17:27:41 <dansmith> more than just a uuid?
17:27:54 <bauzas> alaski: you mean the spec ?
17:27:59 <dansmith> I guess we have to look like we've recorded all the information they gave us...
17:28:02 <alaski> bauzas: not the spec
17:28:05 <alaski> dansmith: yes
17:28:18 <alaski> we need to fulfill the current api contract if they show the instance right away
17:28:26 <dansmith> yeah
17:28:50 <bauzas> alaski: gilliard and I left a comment on why you're considering the whole story as async
17:28:50 <dansmith> alaski: we could go ahead and add an instance json cache table :)
17:29:04 <bauzas> alaski: that's still unclear for me
17:29:06 * johnthetubaguy shudders
17:29:23 <alaski> dansmith: heh, that would probably work
17:29:30 <dansmith> johnthetubaguy: are you shuddering at me?
17:29:32 <alaski> I was thinking of storing it almost as a task
17:29:46 <johnthetubaguy> dansmith: yeah, it feels nasty, but its annoying because it works...
17:30:01 <dansmith> johnthetubaguy: I thought there was a goal of doing that anyway?
17:30:07 <alaski> bauzas: because the API should be snappy, and waiting for the scheduler won't allow that
17:30:10 <dansmith> johnthetubaguy: even for running instances
17:30:31 <johnthetubaguy> dansmith: yeah, we do need a cache eventually, thats true, I mean we might need to index on other things, but yeah
17:31:02 <bauzas> alaski: but how are you sure that the contract you give back to the user is valid ?
17:31:34 <bauzas> sorry guys, I know about the tasks proposal but I just want to make sure we all agree with the idea to return an uuid with a building state ?
17:31:48 <bauzas> and then just provide to the user a fail status if not ?
17:32:05 <johnthetubaguy> bauzas: think about listing all your instances, etc, its more that bit I worry about
17:32:17 <bauzas> johnthetubaguy: hell yeah
17:32:32 <alaski> dansmith: johnthetubaguy we need somewhere to store some info. task or instance cache should work now, we should probably get that on the review and debate/think on it there
17:32:55 <dansmith> yeah, I have a hard time thinking these things through without actually doing them,
17:32:59 <johnthetubaguy> alaski: yeah, sounds like its hit all the major issues, I should just review the written detail
17:33:02 <dansmith> 'cause I'm weak-minded
17:33:22 <alaski> dansmith: hah, I find that hard to believe
17:33:31 <johnthetubaguy> likewise.
17:33:52 <alaski> bauzas: I don't think returning before scheduling changes the contract we have now, because we dont currently wait on scheduling
17:34:09 <bauzas> alaski: right
17:34:47 <alaski> so I'm not sure why we would start to wait on scheduling
17:34:54 <bauzas> alaski: gotcha
17:35:49 <alaski> anything else on scheduling for now?
17:35:59 <dansmith> please no.
17:36:00 <dansmith> :)
17:36:03 <alaski> :)
17:36:04 <johnthetubaguy> +1
17:36:18 <alaski> alright, please add thoughts to the review
17:36:27 <alaski> #topic Open Discussion
17:36:47 <alaski> General announcement, the next meeting will be Jan 7th
17:37:16 <alaski> anything else people would like to discuss?
17:37:53 <alaski> going once...
17:38:10 <dansmith> sold!
17:38:18 <alaski> alright, early marks all around!
17:38:25 <johnthetubaguy> :)
17:38:26 <bauzas> awesome
17:38:26 <dansmith> heh
17:38:31 <gilliard> :)
17:38:31 <alaski> #endmeeting
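
[Editor's note, post-meeting: to make the storage discussion under "Cells scheduling" concrete, a minimal sketch of the "task or instance cache" idea -- a hypothetical build-request record in the global database holding the full user request, so the API can return (and "show instance" can be answered) before the scheduler picks a cell. Nothing here is agreed code; all names are invented.]

    import json
    import uuid

    class BuildRequest(object):
        """Hypothetical task-like record in the global (API-level) DB."""

        def __init__(self, instance_properties):
            self.instance_uuid = str(uuid.uuid4())
            # More than the instance mapping can hold: everything the
            # user gave us, so the current API contract is met even
            # though no cell database has the instance yet.
            self.instance_json = json.dumps(instance_properties)
            self.state = 'building'  # flipped to an error state on failure

    def create_server(api_db, instance_properties):
        # API path: record the request and return immediately rather
        # than waiting on scheduling, keeping the API snappy.
        request = BuildRequest(instance_properties)
        api_db.save(request)
        return {'id': request.instance_uuid, 'status': 'BUILD'}
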