17:00:32 <alaski> #startmeeting nova_cells
17:00:33 <openstack> Meeting started Wed Apr 8 17:00:32 2015 UTC and is due to finish in 60 minutes. The chair is alaski. Information about MeetBot at http://wiki.debian.org/MeetBot.
17:00:34 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
17:00:36 <openstack> The meeting name has been set to 'nova_cells'
17:00:51 <alaski> anyone here for the cells meeting?
17:00:54 <vineetmenon> o/
17:00:58 <melwitt> o/
17:01:03 <dheeraj-gupta-4> o/
17:01:22 <alaski> bauzas: ?
17:01:28 <bauzas> oh
17:01:29 <bauzas> \o
17:01:36 <alaski> #topic Cellsv1 Tempest Testing
17:01:37 <bauzas> alaski: thanks for pinging me
17:01:57 <bauzas> soooo
17:01:59 <alaski> so cells is green right now
17:02:01 <dansmith> o/
17:02:05 <alaski> but.. https://review.openstack.org/#/c/171414/
17:02:47 <alaski> once that gets in we'll be failing hypervisor tests again
17:02:49 <bauzas> who can we hassle for +W'ing it?
17:02:55 <bauzas> :)
17:02:59 <alaski> and then there's https://review.openstack.org/#/c/171306 and https://review.openstack.org/#/c/160506/ for those
17:03:39 <alaski> bauzas: I'm not sure, could probably ask in -infra about it
17:03:56 <bauzas> dheeraj-gupta-4: you asked me a question this morning about why the cells job only runs a few tests, see https://review.openstack.org/#/c/171414/ for the explanation
17:04:12 <bauzas> alaski: yeah, it's good that we have at least one +2
17:04:12 <dheeraj-gupta-4> bauzas: Yes, got it.
17:04:25 <alaski> I'd like to see that change in before the service objects changes are merged
17:04:32 <alaski> so we can see a good test run on them
17:04:37 <bauzas> alaski: agreed
17:04:51 <bauzas> alaski: but that would require a recheck for both
17:05:14 <alaski> bauzas: yep
17:05:28 <alaski> fortunately jenkins seems in good shape atm
17:05:29 <bauzas> anyway, let's see what we can do soon
17:05:45 <bauzas> RC1 is tomorrow :/
17:06:35 <alaski> yep. we're close though
17:06:43 <alaski> anything more on testing?
17:06:54 <melwitt> I'll ask in infra and see if we can get +W today
17:07:04 <alaski> melwitt: thanks
17:07:04 <bauzas> melwitt: cool
17:07:13 <alaski> #topic Specs
17:07:27 <alaski> this is mainly a reminder that there are some specs now
17:07:49 <alaski> and scheduling in particular is going to get a lot of feedback I think
17:07:58 <alaski> so the earlier that happens the better
17:08:23 <alaski> https://review.openstack.org/#/c/141486/ https://review.openstack.org/#/c/136490/ https://review.openstack.org/#/c/169901/
17:08:25 <bauzas> alaski: nit: you should amend https://wiki.openstack.org/wiki/Nova-Cells-v2 by modifying the Gerrit search URL
17:08:54 <alaski> bauzas: ok
17:09:15 <alaski> #action alaski update https://wiki.openstack.org/wiki/Nova-Cells-v2 with new specs/reviews/info
17:10:00 <alaski> anything to discuss on specs today?
17:10:08 <vineetmenon> alaski: since we are in scheduling..
17:10:13 <bauzas> alaski: I saw jaypipes's point and I will put some notes
17:10:30 <alaski> bauzas: thanks
17:10:38 <vineetmenon> is it finalized that the top level will schedule a new instance down to the node level?
17:10:49 <bauzas> alaski: because I think what cells v2 is trying to do with scheduling can directly benefit the scheduler
17:11:05 <bauzas> vineetmenon: that's what's discussed in https://review.openstack.org/#/c/141486/
17:11:06 <vineetmenon> i mean, will the scheduling decision (cell, host) be made entirely by the top level?
17:11:07 <alaski> vineetmenon: nothing is finalized at this point
17:11:25 <alaski> vineetmenon: but the idea is that asking to be scheduled will return that
17:11:46 <alaski> vineetmenon: and the scheduler might be broken into two levels behind the scenes. that's up in the air right now though
17:11:47 <bauzas> alaski: I was thinking of an approach like Google Omega, i.e. multiple optimistic schedulers sharing a global view
17:12:05 <vineetmenon> alaski: kk
17:12:21 <bauzas> alaski: but that's a scheduler approach - the biggest problem is what jaypipes said, i.e. the current scheduler doesn't scale well
17:12:26 <bauzas> s/well/at all even
17:13:05 <alaski> bauzas: I saw you post http://eurosys2013.tudos.org/wp-content/uploads/2013/paper/Schwarzkopf.pdf a while ago and have that in a tab to read
17:13:22 <bauzas> alaski: that's a long long journey
17:13:32 <bauzas> alaski: but I'm trying to move in that direction
17:13:33 <alaski> bauzas: but if you could distill that into comments on the spec that would be very helpful
17:13:49 <bauzas> alaski: honestly, it's all about deliverables
17:14:12 <vineetmenon> :)
17:14:17 <bauzas> alaski: we can't just assume that the nova scheduler will magically cope with thousands of nodes
17:14:24 <bauzas> at least not for L
17:14:30 <alaski> agreed
17:14:46 <bauzas> alaski: so I think we should keep a divide-and-conquer approach as a short-term solution
17:15:05 <alaski> what I'm looking at is: can we get cells into the scheduler and have it cope with the load of a current non-cells deployment
17:15:19 <alaski> I don't expect it to handle a Rackspace or CERN load yet
17:15:20 <vineetmenon> bauzas: I was thinking the same +1
17:15:21 <bauzas> with the magic Scheduler (with a big S) as a mid-term goal
17:15:46 <bauzas> alaski: well, I think the idea behind cells is duplication
17:16:47 <bauzas> i.e. call it mitosis
17:17:05 <bauzas> i.e. doesn't scale? duplicate it
17:17:22 <alaski> bauzas: I'm on board with keeping a two-level approach, assuming that's what you mean by divide and conquer, if we abstract it properly
17:17:43 <alaski> i.e. hide it behind a client API
17:17:56 <vineetmenon> +1; the nice thing about two levels is that it can scale up to n levels, if properly thought out
17:18:01 <bauzas> alaski: shard it, if you prefer
17:18:23 <alaski> bauzas: gotcha. I'm fine with that too, but that's a big change
17:18:42 <alaski> and it has to answer many of the same questions as cells, like how do we handle affinity/anti-affinity
17:18:49 <bauzas> alaski: I think that's a smaller change than a fully scalable scheduler
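
The two-level, hidden-behind-a-client-API approach discussed in this exchange might look roughly like the sketch below. This is a speculative illustration, not Nova code: CellScheduler, select_destination, and the request/host structures are all invented names, and the in-cell pass is reduced to a trivial free-RAM check.

    # A minimal sketch of the two-level "divide and conquer" scheduling
    # discussed above: a top level picks a cell, a per-cell pass picks a
    # host, and callers see only a single call that returns the
    # (cell, host) pair. All names here are hypothetical, not Nova APIs.
    import random


    class CellScheduler:
        """Top level: choose a cell, then delegate host selection to it."""

        def __init__(self, cells):
            # cells maps a cell name to the hosts (and their free RAM in
            # MB) that the cell's own scheduler knows about.
            self.cells = cells

        def select_destination(self, request):
            # Level 1: filter cells on coarse criteria (capacity here;
            # capabilities in a fuller version).
            candidates = [name for name, hosts in self.cells.items()
                          if any(free >= request['ram_mb']
                                 for free in hosts.values())]
            if not candidates:
                raise RuntimeError('no valid cell found')
            cell = random.choice(candidates)
            # Level 2: ask the chosen cell's scheduler for a host.
            return cell, self._select_host_in_cell(cell, request)

        def _select_host_in_cell(self, cell, request):
            # Stand-in for the in-cell filter/weigh pass: pick the host
            # with the most free RAM that still fits the request.
            fitting = {h: free for h, free in self.cells[cell].items()
                       if free >= request['ram_mb']}
            return max(fitting, key=fitting.get)


    # The client API hides the two levels entirely: callers just ask to
    # be scheduled and get back the placement, as alaski describes above.
    scheduler = CellScheduler({
        'cell1': {'hostA': 4096, 'hostB': 1024},
        'cell2': {'hostC': 8192},
    })
    print(scheduler.select_destination({'ram_mb': 2048}))  # e.g. ('cell2', 'hostC')
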
17:19:11 <bauzas> alaski: IIUC, RAX is using one scheduler per cell?
17:19:19 <alaski> bauzas: yes
17:19:29 <bauzas> okay, plus the cells scheduler for parenting this
17:19:41 <bauzas> plus the CacheScheduler, got it
17:20:27 <bauzas> okay, sounds like a big big thing anyway - either we scale up or we scale out
17:20:47 <alaski> I think we need to have an understanding of what requirements need to be met for scheduling, and then we can look at how to do it
17:21:23 <bauzas> alaski: well, the scheduler itself is fairly easy to understand
17:21:34 <bauzas> alaski: the main problem is that it's racy
17:21:54 <alaski> because while I agree that sharding is easier than scaling the scheduler as-is, it then needs to handle affinity, which is trickier
17:22:01 <bauzas> alaski: so the bigger your deployment is, the more retries you get
17:22:11 <bauzas> alaski: agreed
17:22:54 <bauzas> alaski: in particular if we consider that aggregates and server groups are colocated with the cloud itself, not a child cell
17:23:28 <bauzas> but that question hasn't been answered AFAIK
17:23:56 <bauzas> alaski: do you plan inter-cell migrations with cells?
17:23:56 <alaski> it hasn't, but I agree
17:24:37 <alaski> bauzas: that's tricky too. I think Nova should support it, but you may want to turn it off.
17:24:48 <bauzas> agreed
17:25:12 <vineetmenon> bauzas: what does that mean, 'inter-cell migrations'?
17:25:59 <bauzas> vineetmenon: do you want to live/cold migrate or evacuate VMs from cell1@hostA to cell2@hostB
17:26:01 <bauzas> ?
17:26:34 <alaski> this actually gets into the next topic so let me switch real quick
17:26:41 <alaski> #topic Cells Scheduling
17:26:43 <alaski> :)
17:26:51 <bauzas> alaski: that reminds me that I probably missed the biggest question: do we hide the complexity of the deployment from users and just show instances - not cells?
17:27:07 <bauzas> alaski: or is it a segregation thing?
17:27:33 <alaski> I don't want to hide it, not fully
17:27:44 <bauzas> alaski: speaking of end users, not admins of course
17:27:51 <alaski> it doesn't have to be exposed as cells though
17:28:02 <bauzas> alaski: that's my thought
17:28:24 <bauzas> alaski: but if it's hidden, then it's just like aggregates, an op thing
17:29:06 <alaski> I think it's useful to know if two instances are in the same or different cells
17:29:11 <melwitt> if I think about the use case of scheduling to a specific cell using a scheduler hint, then I think it would be nice to expose something about the cells
17:30:09 <alaski> melwitt: agreed
17:30:27 <bauzas> melwitt: that's the difference between AZs and aggregates
17:31:10 <bauzas> anyway, I'm not asking these questions to point out all the difficulties, just to say that we'll need to make decisions and approximations
17:31:16 <alaski> this is a good question to get some operator feedback on
17:31:23 <bauzas> +1
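
melwitt's scheduler-hint use case could look something like the following. Everything here is hypothetical: the hint name 'target_cell', the HostState stand-in, and its cell_name attribute are invented, and host_passes only approximates the shape of a Nova host filter of this era.

    from collections import namedtuple

    # Hypothetical stand-in for the scheduler's per-host view; Nova's
    # real HostState carries far more than this.
    HostState = namedtuple('HostState', ['host', 'cell_name'])


    class TargetCellFilter:
        """Pass only hosts living in the cell named by a 'target_cell'
        scheduler hint (an invented hint, shown for illustration)."""

        def host_passes(self, host_state, filter_properties):
            hints = filter_properties.get('scheduler_hints') or {}
            wanted = hints.get('target_cell')
            if not wanted:
                # No hint given: stay out of the way.
                return True
            return host_state.cell_name == wanted


    props = {'scheduler_hints': {'target_cell': 'cell2'}}
    f = TargetCellFilter()
    print(f.host_passes(HostState('hostA', 'cell1'), props))  # False
    print(f.host_passes(HostState('hostC', 'cell2'), props))  # True

Note that exposing such a hint implies exposing at least cell names to users, which is exactly the segregation question raised above.
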
17:32:16 <alaski> back to scheduling
17:32:47 <alaski> I think we should discuss some specifics to better shape what will be needed in the scheduler
17:33:28 <alaski> I started putting together a list of scheduler filters, and alphabetically affinity is first
17:34:32 <alaski> I have two questions on it: how do we revise the semantics for cells, and how do we implement it
17:34:54 <alaski> for the semantics I'm thinking about anti-affinity and cells
17:35:20 <alaski> should it mean anything as far as cells are concerned, like anti-affinity in the same cell vs a different cell?
17:35:54 <bauzas> mmm, affinity is a big word
17:36:05 <bauzas> we have affinity for VMs, server groups and aggregates
17:36:45 <bauzas> I'm fine with having a new level for cells, but that ties into the question I mentioned about cells scope
17:38:08 <melwitt> I can think of it meaning anti-affinity within a cell, if a cell represents a group of instances belonging to a certain security zone, for example. this kind of brings up the two-level scheduling possibility I think, allowing in-cell schedulers to filter if they want to. at the top, anti-affinity would mean different cells; in-cell it would be within the same cell
17:38:18 <alaski> most of the cells filters I deal with are primarily about the capabilities of a cell - this type of hardware for this type of flavor, that kind of thing
17:38:42 <alaski> so my initial thinking is that cells scope only deals with large things like that; everything else is global host-level stuff
17:39:12 <bauzas> melwitt: that's just two filters, if we consider a top-level scheduler
17:39:34 <melwitt> bauzas: okay
17:40:08 <bauzas> alaski: are you grouping by capabilities?
17:40:42 <alaski> bauzas: not sure what you mean
17:41:11 <bauzas> alaski: sorry, I was referring to what you mentioned as "capabilities of a cell"
17:41:28 <bauzas> alaski: that implies homogeneous hardware
17:41:39 <alaski> bauzas: ahh, gotcha
17:42:00 <alaski> bauzas: it doesn't have to be fully homogeneous, but close enough
17:42:12 <alaski> like all SSDs, but perhaps different cpus
17:42:22 <bauzas> alaski: because I'm considering cells as just something for scaling Nova, not necessarily for grouping
17:42:34 <bauzas> alaski: because aggregates are used for that purpose
17:43:05 <alaski> sure, but it is a grouping as well and I think it will be used that way
17:43:39 <bauzas> alaski: sorry, but why not create a big aggregate per cell and leave it as it is now?
17:43:40 <alaski> but it's a good point that it doesn't need to be
17:44:12 <bauzas> alaski: I'm asking because that sends a bad signal that cells are just aggregates
17:44:46 <bauzas> alaski: but I agree with you on the colocation aspect
17:45:03 <bauzas> alaski: I guess you tend to group cells by hardware proximity
17:45:25 <alaski> yeah
17:45:50 <alaski> I'm currently rethinking how much of a concept cells need to be in the scheduler
17:46:45 <alaski> we do need to figure out how to migrate the concepts we currently have into something the scheduler can handle
17:46:51 <bauzas> alaski: well, I'm considering cells as just a new colocation object
17:47:04 <alaski> but if we can move cell capabilities into aggregates that could be a path
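
bauzas' "that's just two filters" observation, under melwitt's two-level semantics, might be sketched as below, again with invented names: one filter excludes cells that already hold members of the server group, the other excludes hosts within the chosen cell. Neither filter exists in Nova; this only mirrors the general shape of a host filter.

    from collections import namedtuple

    # Hypothetical per-host view, as in the previous sketch.
    HostState = namedtuple('HostState', ['host', 'cell_name'])


    class CellAntiAffinityFilter:
        """Top level: reject hosts in any cell that already holds a
        member of the request's server group."""

        def host_passes(self, host_state, filter_properties):
            group_cells = filter_properties.get('group_cells', set())
            return host_state.cell_name not in group_cells


    class HostAntiAffinityFilter:
        """In-cell: reject hosts that already hold a member of the group."""

        def host_passes(self, host_state, filter_properties):
            group_hosts = filter_properties.get('group_hosts', set())
            return host_state.host not in group_hosts


    # With both filters in a single global pass, "anti-affinity" means
    # "different cell" when group_cells is populated, and falls back to
    # the familiar "different host" semantics within a cell otherwise.
    props = {'group_cells': {'cell1'}, 'group_hosts': {'hostA'}}
    candidates = [HostState('hostA', 'cell1'), HostState('hostC', 'cell2')]
    survivors = [h for h in candidates
                 if CellAntiAffinityFilter().host_passes(h, props)
                 and HostAntiAffinityFilter().host_passes(h, props)]
    print([h.host for h in survivors])  # ['hostC']
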
17:47:30 <alaski> bauzas: yeah, but should that be different than an implicit aggregate?
17:47:51 <bauzas> alaski: because of the affinity stuff you mentioned
17:48:08 <bauzas> alaski: like saying that you have a DC with 3 floors
17:48:37 <bauzas> alaski: you could use aggregates for grouping your SSD hosts which are not on the same floor
17:49:16 <bauzas> alaski: but you could need to make a boot request saying "I want to boot on a host on the 3rd floor, because I know that my power consumption is lower there"
17:49:29 <bauzas> alaski: of course, it can be implicit
17:50:06 <bauzas> so the affinity can be done orthogonally
17:50:17 <alaski> right
17:51:00 <bauzas> alaski: and that would solve one big confusion about Nova AZs != AWS AZs
17:51:24 <bauzas> because we could say that Nova cells are somehow related to AWS AZs
17:52:13 <alaski> possibly
17:52:20 <alaski> that's still up to the deployer I think
17:52:25 <bauzas> agreed
17:52:42 <bauzas> that's why I'm saying it's just a new segregation object for the scheduler
17:52:54 <bauzas> so new filters
17:53:10 <bauzas> - leaving off the scaling problem, of course -
17:53:36 <alaski> the new filter would live in the same place the other scheduler filters do
17:53:50 <bauzas> alaski: if the scheduler is a global scheduler, yes
17:53:55 <alaski> going this route I'm not sure we're introducing a real scalability difference
17:54:23 <bauzas> that's just an abstraction that leaves the scalability problem unresolved
17:54:35 <alaski> bauzas: right, I'm assuming a global scheduler for now since that's what we have
17:54:52 <alaski> bauzas: unresolved, but no different than today, right?
17:55:01 <bauzas> alaski: exactly, and it's necessary to keep that abstraction if we want "orthogonal" filtering
17:55:18 <bauzas> alaski: right, I'm just trying to say those are distinct efforts
17:55:34 <bauzas> #1 scalability problem of a global scheduler
17:55:42 <bauzas> #2 cells-related filters
17:56:05 <alaski> okay. agreed, though for this meeting I would reverse the ordering :)
17:56:13 <bauzas> lol
17:56:36 <bauzas> just thinking about hash rings for the scheduler, lol
17:56:48 <alaski> well, that felt productive to me, because I'm totally rethinking scheduling now
17:57:24 <alaski> bauzas: heh, if we could do that it would be pretty cool
17:57:42 <bauzas> yeah.... but claims are compute-based :/
17:57:43 <alaski> so my takeaway is that I need to rewrite my scheduling spec
17:58:11 <bauzas> eh
17:58:37 <bauzas> you have to jump in on the scheduler, as I have to do with cells :)
17:58:51 <alaski> I'm seeing that
17:59:05 <alaski> I'll look up the meeting time and start showing up
17:59:18 <bauzas> cool
17:59:25 <alaski> any last-minute thoughts from anyone, since we didn't get to open discussion?
17:59:41 <melwitt> the regex fix merged, I put rechecks on the two patches
17:59:48 <bauzas> melwitt: \o/
17:59:48 <alaski> melwitt: awesome!
17:59:57 <bauzas> the men are chattering...
18:00:10 <bauzas> congrats!
18:00:13 <melwitt> heh
18:00:14 <bauzas> doing rechecks now
18:00:29 <alaski> she already did them
18:00:34 <alaski> thanks everyone!
18:00:34 <bauzas> oh cool
18:00:38 <bauzas> thanks
18:00:40 <alaski> #endmeeting