17:00:32 #startmeeting nova_cells
17:00:33 Meeting started Wed Apr 8 17:00:32 2015 UTC and is due to finish in 60 minutes. The chair is alaski. Information about MeetBot at http://wiki.debian.org/MeetBot.
17:00:34 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
17:00:36 The meeting name has been set to 'nova_cells'
17:00:51 anyone here for the cells meeting?
17:00:54 o/
17:00:58 o/
17:01:03 o/
17:01:22 bauzas: ?
17:01:28 oh
17:01:29 \o
17:01:36 #topic Cellsv1 Tempest Testing
17:01:37 alaski: thanks for pinging me
17:01:57 soooo
17:01:59 so cells is green right now
17:02:01 o/
17:02:05 but.. https://review.openstack.org/#/c/171414/
17:02:47 once that gets in we'll be failing hypervisor tests again
17:02:49 who can we hassle for +W'ing it?
17:02:55 :)
17:02:59 and then there's https://review.openstack.org/#/c/171306 and https://review.openstack.org/#/c/160506/ for those
17:03:39 bauzas: I'm not sure, could probably ask in -infra about it
17:03:56 dheeraj-gupta-4: you asked me a question this morning about why the cells job only runs a few tests, see https://review.openstack.org/#/c/171414/ for the explanation
17:04:12 alaski: yeah, it's good that we have at least one +2
17:04:12 bauzas: Yes, got it.
17:04:25 I'd like to see that change in before the service objects changes are merged
17:04:32 so we can see a good test run on them
17:04:37 alaski: agreed
17:04:51 alaski: but that would require a recheck for both
17:05:14 bauzas: yep
17:05:28 fortunately jenkins seems in good shape atm
17:05:29 anyway, let's see what we can do soon
17:05:45 RC1 is tomorrow :/
17:06:35 yep. we're close though
17:06:43 anything more on testing?
17:06:54 I'll ask in infra and see if we can get +W today
17:07:04 melwitt: thanks
17:07:04 melwitt: cool
17:07:13 #topic Specs
17:07:27 this is mainly a reminder that there are some specs now
17:07:49 and scheduling in particular is going to get a lot of feedback I think
17:07:58 so the earlier that happens the better
17:08:23 https://review.openstack.org/#/c/141486/ https://review.openstack.org/#/c/136490/ https://review.openstack.org/#/c/169901/
17:08:25 alaski: nit: you should amend https://wiki.openstack.org/wiki/Nova-Cells-v2 by modifying the Gerrit search URL
17:08:54 bauzas: ok
17:09:15 #action alaski update https://wiki.openstack.org/wiki/Nova-Cells-v2 with new specs/reviews/info
17:10:00 anything to discuss on specs today?
17:10:08 alaski: since we are in scheduling..
17:10:13 alaski: I saw jaypipes's point and I will put some notes
17:10:30 bauzas: thanks
17:10:38 is it finalized that the top level will schedule a new instance down to the node level?
17:10:49 alaski: because I think what cells v2 is trying to do with scheduling can directly benefit the scheduler
17:11:05 vineetmenon: that's what's discussed in https://review.openstack.org/#/c/141486/
17:11:06 I mean the scheduling decision (cell, host) will be made entirely by the top level
17:11:07 vineetmenon: nothing is finalized at this point
17:11:25 vineetmenon: but the idea is that asking to be scheduled will return that
17:11:46 vineetmenon: and the scheduler might be broken into two levels behind the scenes. that's up in the air right now though
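
A minimal sketch of the placement answer being discussed: the caller asks to be scheduled and gets back a (cell, host) decision, without knowing whether one or two scheduler levels produced it. All names here (Destination, SchedulerClient, select_destination, scheduler_rpc) are hypothetical illustrations, only loosely modeled on Nova's scheduler interface.

```python
from collections import namedtuple

# Hypothetical result type: the answer to "schedule this instance" now
# names the cell as well as the host/node, per the discussion above.
Destination = namedtuple('Destination', ['cell', 'host', 'node'])


class SchedulerClient(object):
    """Hypothetical client API for scheduling requests.

    Callers only see a Destination; whether it was computed by one
    global scheduler or by a cell pass followed by a host pass is an
    internal detail, which is the abstraction argued for here.
    """

    def __init__(self, scheduler_rpc):
        self.scheduler_rpc = scheduler_rpc  # assumed transport object

    def select_destination(self, context, request_spec):
        # Delegate the actual decision; repackage it as (cell, host, node).
        cell, host, node = self.scheduler_rpc.schedule(context, request_spec)
        return Destination(cell=cell, host=host, node=node)
```
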
17:11:47 alaski: I was thinking of an approach like Google Omega, i.e. multiple optimistic schedulers sharing a global view
17:12:05 alaski: kk
17:12:21 alaski: but that's a scheduler approach - the biggest problem is what jaypipes said, i.e. the current scheduler doesn't scale at all
17:13:05 bauzas: I saw you post http://eurosys2013.tudos.org/wp-content/uploads/2013/paper/Schwarzkopf.pdf a while ago and have that in a tab to read
17:13:22 alaski: that's a long long journey
17:13:32 alaski: but I'm trying to move in that direction
17:13:33 bauzas: but if you could distill that into comments on the spec that would be very helpful
17:13:49 alaski: honestly, it's all about deliverables
17:14:12 :)
17:14:17 alaski: we can hardly assume that the nova scheduler will magically be able to cope with thousands of nodes
17:14:24 at least for L
17:14:30 agreed
17:14:46 alaski: so I think we should keep a divide-and-conquer approach as a short-term solution
17:15:05 what I'm looking at is whether we can get cells into the scheduler and have it cope with the load of a current non-cells deployment
17:15:19 I don't expect it to handle a Rackspace or CERN load yet
17:15:20 bauzas: I was thinking the same +1
17:15:21 with the magic Scheduler (with a big S) as a mid-term goal
17:15:46 alaski: well, I think that the idea behind cells is duplication
17:16:47 i.e. call it mitosis
17:17:05 i.e. doesn't scale? duplicate it
17:17:22 bauzas: I'm on board with keeping a two-level approach, assuming that's what you mean by divide and conquer, if we abstract it properly
17:17:43 i.e. hide it behind a client api
17:17:56 +1, with two levels it can scale up to n levels.. if properly thought out..
17:18:01 alaski: shard it, if you prefer
17:18:23 bauzas: gotcha. I'm fine with that too, but that's a big change
17:18:42 and it has to answer many of the same questions as cells, like how do we handle affinity/anti-affinity
17:18:49 alaski: I think that's a smaller change than a fully scalable scheduler
17:19:11 alaski: IIUC, RAX is using one sched per cell?
17:19:19 bauzas: yes
17:19:29 okay, plus the cells scheduler for parenting this
17:19:41 plus the CacheScheduler, got it
17:20:27 okay, sounds like a big big thing anyway - either we scale up or we scale out
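
A sketch of the two-level, shard-per-cell ("divide and conquer") approach just discussed: a parent scheduler filters on coarse cell properties, then delegates host selection to the chosen cell's own scheduler, mirroring the one-scheduler-per-cell deployment mentioned above. Class and method names are invented for illustration, not Nova code.

```python
class TwoLevelScheduler(object):
    """Hypothetical parent scheduler: pick a cell, then ask that cell."""

    def __init__(self, cell_schedulers, cell_filters):
        # cell_schedulers: {cell_name: per-cell scheduler object}, i.e.
        # one scheduler per cell, as in the deployment discussed above.
        self.cell_schedulers = cell_schedulers
        # cell_filters: callables (cell_name, request_spec) -> bool.
        self.cell_filters = cell_filters

    def schedule(self, request_spec):
        # Pass 1: keep only cells that satisfy coarse, cell-level filters
        # (capabilities, segregation, cell-level anti-affinity, ...).
        candidates = [c for c in self.cell_schedulers
                      if all(f(c, request_spec) for f in self.cell_filters)]
        if not candidates:
            raise LookupError('no cell satisfies the request')
        # A real implementation would weigh candidates; take the first here.
        cell = candidates[0]
        # Pass 2: delegate host selection to the chosen cell's scheduler.
        host = self.cell_schedulers[cell].select_host(request_spec)
        return cell, host
```

Wrapped behind a client API like the one sketched earlier, callers could not tell this sharded implementation apart from a single global scheduler, which is what "hide it behind a client api" buys.
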
17:20:47 I think we need to have an understanding of what requirements need to be met for scheduling, and then we can look at how to do it
17:21:23 alaski: well, the scheduler itself is fairly easy to understand
17:21:34 alaski: the main problem is that it's racy
17:21:54 because while I agree that sharding is easier than scaling the scheduler as-is, it needs to handle affinity, which is then trickier
17:22:01 alaski: so the bigger your deployment is, the more retries you get
17:22:11 alaski: agreed
17:22:54 alaski: in particular if we consider that aggregates and server groups are colocated with the cloud itself, not a child cell
17:23:28 but that question hasn't been answered AFAIK
17:23:56 alaski: do you plan inter-cell migrations with cells?
17:23:56 it hasn't, but I agree
17:24:37 bauzas: that's tricky too. I think Nova should support it, but you may want to turn it off.
17:24:48 agreed
17:25:12 bauzas: what does that mean, 'inter-cell migrations'?
17:25:59 vineetmenon: do you want to live/cold migrate or evacuate VMs from cell1@hostA to cell2@hostB?
17:26:34 this actually gets into the next topic so let me switch real quick
17:26:41 #topic Cells Scheduling
17:26:43 :)
17:26:51 alaski: that reminds me that I probably missed the biggest question: do we hide the complexity of the deployment from users and just show instances - not cells
17:27:07 alaski: or is it a segregation thing?
17:27:33 I don't want to hide it, not fully
17:27:44 alaski: speaking of end-users, not admins of course
17:27:51 it doesn't have to be exposed as cells though
17:28:02 alaski: that's my thought
17:28:24 alaski: but if it's hidden, then that's just like aggregates, an op thing
17:29:06 I think it's useful to know if two instances are in the same or different cells
17:29:11 if I think about the use case of scheduling to a specific cell using a scheduler hint, then I think it would be nice to expose something about the cells
17:30:09 melwitt: agreed
17:30:27 melwitt: that's the difference between AZs and aggregates
17:31:10 anyway, I'm asking those questions not to point out all the difficulties, just to say that we'll need to make decisions and approximations
17:31:16 this is a good question to get some operator feedback on
17:31:23 +1
17:32:16 back to scheduling
17:32:47 I think we should discuss some specifics to better shape what will be needed in the scheduler
17:33:28 I started compiling a list of scheduler filters, and alphabetically affinity is first
17:34:32 I have two questions on it: how do we revise the semantics for cells, and how do we implement it
17:34:54 for the semantics I'm thinking about anti-affinity and cells
17:35:20 should it mean anything as far as cells are concerned, like anti-affinity in the same cell vs a different cell?
17:35:54 mmm, affinity is a big word
17:36:05 we have affinity for VMs, server groups and aggregates
17:36:45 I'm fine with having a new level for cells, but that ties into the question I mentioned about cells scope
17:38:08 I can think of it meaning anti-affinity within a cell, if a cell represents a group of instances belonging to a certain security zone for example. this kind of brings up the two-level scheduling possibility I think, allowing in-cell schedulers to filter if they want to. at the top, anti-affinity would mean different cells; in-cell it would be within the same cell
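
To make those two semantics concrete, here is a sketch of the top-level half of that reading of anti-affinity: at the cell level the filter rejects any cell already holding a member of the server group. The class name and the request_spec/group shapes below are invented for illustration, written only in the style of a Nova filter rather than copied from one.

```python
class CellGroupAntiAffinityFilter(object):
    """Hypothetical top-level filter: anti-affinity means a different cell."""

    def cell_passes(self, cell, request_spec):
        # Assumed shape: the request carries the server group's policy and
        # the set of cells already hosting members of that group.
        group = request_spec.get('instance_group') or {}
        if group.get('policy') != 'anti-affinity':
            # The filter only constrains anti-affinity groups.
            return True
        # Reject cells that already contain a member of the group.
        return cell not in group.get('cells', set())
```

An in-cell scheduler could then still apply today's host-level ServerGroupAntiAffinityFilter semantics inside the chosen cell, so the two passes compose rather than conflict.
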
17:38:18 most of the cells filters I deal with are primarily about the capabilities of a cell, this type of hardware for this type of flavor, that kind of thing
17:38:42 so my initial thinking is that cells scope only deals with large things like that, everything else is global host-level stuff
17:39:12 melwitt: that's just two filters, if we consider a top-level scheduler
17:39:34 bauzas: okay
17:40:08 alaski: are you grouping by capabilities?
17:40:42 bauzas: not sure what you mean
17:41:11 alaski: sorry, I was referring to what you mentioned as "capabilities of a cell"
17:41:28 alaski: that implies homogeneous hardware
17:41:39 bauzas: ahh, gotcha
17:42:00 bauzas: it doesn't have to be fully homogeneous, but close enough
17:42:12 like all SSDs, but perhaps different CPUs
17:42:22 alaski: because I'm considering cells as just something for scaling Nova, not necessarily for grouping
17:42:34 alaski: because aggregates are used for that purpose
17:43:05 sure, but it is a grouping as well and I think it will be used that way
17:43:39 alaski: sorry, but why not create a big aggregate per cell and leave it as it is now?
17:43:40 but it's a good point that it doesn't need to be
17:44:12 alaski: I'm asking because that sends a bad signal that cells are just aggregates
17:44:46 alaski: but I agree with you on the colocation aspect
17:45:03 alaski: I guess you tend to group cells by hardware proximity
17:45:25 yeah
17:45:50 I'm currently rethinking how much of a concept cells need to be in the scheduler
17:46:45 we do need to figure out how to migrate the concepts we currently have into something the scheduler can handle
17:46:51 alaski: well, I'm considering cells as just a new colocation object
17:47:04 but if we can move cell capabilities into aggregates that could be a path
17:47:30 bauzas: yeah, but should that be different from an implicit aggregate?
17:47:51 alaski: because of the affinity stuff you mentioned
17:48:08 alaski: like saying that you have a DC with 3 floors
17:48:37 alaski: you could use aggregates for grouping your SSD hosts which are not on the same floor
17:49:16 alaski: but you could need to do a boot request saying "I want to boot on a host on the 3rd floor, because I know that my power consumption is lower there"
17:49:29 alaski: of course, it can be implicit
17:50:06 so the affinity can be done orthogonally
17:50:17 right
17:51:00 alaski: and that would solve one big confusion, that Nova AZs != AWS AZs
17:51:24 because we could say that Nova cells are somehow related to AWS AZs
17:52:13 possibly
17:52:20 that's still up to the deployer I think
17:52:25 agreed
17:52:42 that's why I'm saying it's just a new segregation object for the scheduler
17:52:54 so new filters
17:53:10 - leaving aside the scaling problem, of course -
17:53:36 the new filter would live in the same place the other scheduler filters do
17:53:50 alaski: if the scheduler is a global scheduler, yes
17:53:55 going this route I'm not sure we're introducing a real scalability difference
17:54:23 that's just an abstraction that leaves the scalability problem unresolved
17:54:35 bauzas: right, I'm assuming a global scheduler for now since that's what we have
17:54:52 bauzas: unresolved, but not any different from today, right?
17:55:01 alaski: exactly, and it's necessary to keep that abstraction if we want "orthogonal" filtering
17:55:18 alaski: right, I'm just trying to say those are distinct efforts
17:55:34 #1 the scalability problem of a global scheduler
17:55:42 #2 cells-related filters
17:56:05 okay. agreed, though for this meeting I would reverse the ordering :)
17:56:13 lol
17:56:36 just thinking about hash rings for the scheduler, lol
17:56:48 well that felt productive to me, because I'm totally rethinking scheduling now
17:57:24 bauzas: heh, if we could do that it would be pretty cool
17:57:42 yeah.... but claims are compute-based :/
17:57:43 so my takeaway is that I need to rewrite my scheduling spec
17:58:11 eh
17:58:37 you have to jump in on the scheduler, as I have to do with cells :)
17:58:51 I'm seeing that
17:59:05 I'll look up the meeting time and start showing up
17:59:18 cool
17:59:25 any last-minute thoughts from anyone, since we didn't get an open discussion?
17:59:41 regex fix merged, I put rechecks on the two patches
17:59:48 melwitt: \o/
17:59:48 melwitt: awesome!
17:59:57 mens are chattering...
18:00:10 congrats!
18:00:13 heh
18:00:14 doing rechecks now
18:00:29 she already did them
18:00:34 thanks everyone!
18:00:34 oh cool
18:00:38 thanks
18:00:40 #endmeeting
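
As a postscript on the "hash rings for the scheduler" aside near the end: a minimal sketch of what consistent hashing over scheduler workers could look like. It is purely illustrative, with invented names (SchedulerRing, scheduler_for); it spreads requests across several schedulers so each sees a slice of the load, but, as noted in the meeting, claims happen on the computes, so this alone would not remove the raciness.

```python
import bisect
import hashlib


def _hash(key):
    # Stable hash; md5 is used here only for placement, not security.
    return int(hashlib.md5(key.encode('utf-8')).hexdigest(), 16)


class SchedulerRing(object):
    """Hypothetical consistent-hash ring over scheduler workers."""

    def __init__(self, schedulers, replicas=64):
        # Place each scheduler at several points on the ring so load
        # stays roughly even when workers are added or removed.
        self._ring = sorted((_hash('%s-%d' % (s, i)), s)
                            for s in schedulers for i in range(replicas))
        self._keys = [h for h, _ in self._ring]

    def scheduler_for(self, request_id):
        # Walk clockwise to the first point at or after the request's hash.
        idx = bisect.bisect(self._keys, _hash(request_id)) % len(self._keys)
        return self._ring[idx][1]
```
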