21:00:25 #startmeeting nova_cells
21:00:26 Meeting started Wed Feb 21 21:00:25 2018 UTC and is due to finish in 60 minutes. The chair is dansmith. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:00:28 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:00:30 The meeting name has been set to 'nova_cells'
21:00:33 o/
21:00:36 I got distracted talking to belmoreira
21:00:41 o/
21:00:43 hence my 26 second tardiness
21:01:32 #topic bugs
21:01:53 we've got a pretty good set on the agenda, cultivated by tssurya: https://wiki.openstack.org/wiki/Meetings/NovaCellsv2
21:02:16 tssurya: I'm not sure how I feel about continuing to work on that first one, for cellsv1, to be honest
21:02:33 o/
21:02:46 dansmith : well we have that patch in production now
21:03:01 and we would be moving away from cellsv1 soon :)
21:03:08 tssurya: yeah, it just doesn't work for our test environment, hence my concern
21:03:35 dansmith : so maybe we just keep it as WIP ?
21:03:38 ack, so I think I'll just leave it up in case people need it, but not really push on it
21:03:38 yeah
21:03:42 I'll make a note on it
21:04:16 the rest of the bugs up there look straightforward and almost all have reviews, which melwitt just added to the priorities list, so ... review those
21:04:23 tssurya: any of those you want to highlight?
21:04:26 I would appreciate some pointers on trying to write a test case for this : https://review.openstack.org/#/c/546660/ with
21:04:36 respect to deleting RPs
21:04:54 tssurya: okay cool
21:05:37 dansmith : thanks,
21:05:46 will wait for your comments in the review then
21:06:00 tssurya: sure, or mriedem..
he's good with that stuff
21:06:00 I don't have anything else to highlight
21:06:03 okay
21:06:15 dansmith : okay
21:06:18 #topic open reviews
21:06:23 I have this set up: https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/placement-req-filter
21:06:35 which is about a pre-filtering mechanism for the scheduler, which isn't cells-specific,
21:06:52 but came up because of the concerns tssurya and belmoreira had about the scheduler choking on the full result set from placement
21:07:04 this will let us fine-tune what we ask of placement for lots of cases
21:07:18 this being one solution for the cells case: https://review.openstack.org/#/c/545002/
21:07:26 dansmith : thanks again for doing this
21:07:29 specifically over tenant cell assignment
21:07:46 * melwitt adds to priorities etherpad
21:07:50 here's the start of another one that isn't cells-specific: https://review.openstack.org/546282
21:08:01 which would let us do AZs without a post-scheduler filter like we do today,
21:08:09 which will be way more efficient when users ask for a specific AZ
21:08:35 there is some placement API work that has to be done first in order for both of these to work, but it's just a parity thing and not too major
21:08:50 dansmith : so the placement aggregates would be modelled to accommodate the avz ?
21:09:05 tssurya: for the AZ thing yeah
21:09:15 cool
21:09:26 jay is working on a spec to allow mirroring of aggregate operations up to placement,
21:09:36 so when you add an aggregate and add hosts to it, nova will tell placement about those things
21:09:40 so you don't have to do everything twice
21:09:51 dansmith by not cell specific is because it uses aggregates?
21:10:00 however, until that happens, you'd just have to make sure placement knows about the links
21:10:05 I need to read up on that. placement will do some aggregate stuff but not all, like metadata I assume?
21:10:15 belmoreira: not cells-specific because people that just use AZs today would still want this
21:10:39 melwitt: right, placement already has aggregates for things like knowing which computes are connected to which networks, shared storage, etc
21:10:48 but it's not as heavy as nova's implementation
21:10:55 k, cool
21:11:09 not going to have all of the key=value stuff in it
21:11:13 correct
21:11:21 got it
21:11:51 dansmith do you think that then we can also have the cell abstraction?
21:12:15 belmoreira: what do you mean?
21:12:23 model cells in placement i assume
21:12:33 like ed's idea about nested providers
21:12:37 for large sites aggregates are fine grained. We organize things with cells
21:12:38 even though cells don't provide inventory
21:13:13 yeah, cells don't provide inventory, which is why I think it's a bad idea to model cells as parent providers
21:13:21 not to mention it makes the entire deployment in one tree
21:13:24 meaning that we will need to duplicate the host-cell mapping that we already have per cell for the aggregates
21:14:03 belmoreira: placement is definitely not going to get a cell notion
21:14:19 belmoreira: the closest would be nova maintaining an aggregate per cell when hosts are mapped or something
21:14:42 which I guess we could do, but it doesn't excite me :)
21:14:56 dansmith ack :)
21:15:08 nested aggregates anybody?
21:15:46 melwitt: that's what I'll tell people when they ask why I'm applying for my next job, yeah
21:15:46 or wait, we can already do that
21:15:59 but if not done by nova, operators will need to keep them in sync (aggregate/cell). Not easy...
21:16:00 no, we don't have nested aggregates, we have overlapping aggregates
21:16:05 but maybe AZs messes that up.
anyway
21:16:21 overlapping is what I was thinking of
21:16:23 okay
21:16:25 belmoreira: well, operators that need cell based scheduling
21:17:01 belmoreira: so far you're the only one I know of like that, and the other large operators I've talked to want *more* management-via-aggregate, like the allocation ratios thing
21:17:31 fwiw I predict other large operators wanting it
21:17:49 like, if they had cells, they'd want to manage ratios per cell if they could
21:17:52 so, I get that your case would require some manual syncing of those concepts, and I get why that sucks, I just need to kinda get my head around what we can do about it
21:18:08 melwitt: they can, by defining aggregates
21:18:27 but what if you have multi aggregates in one cell? then they can't, right?
21:18:37 sure
21:18:54 that's why I think forcing people into one per cell is wrong anyway
21:19:06 because some people may have smallish cells and deal with things only on the cell level,
21:19:19 others may have giant cells, for which no one rule applies to all things in that cell
21:19:35 that's why aggregates can overlap and why they have metadata and not fixed attributes
21:20:05 and you define aggregates around the things with similar characteristics, and assign meaning to those as appropriate
21:20:12 dansmith true
21:20:34 but also because we lack metadata in cells
21:21:27 that's intentional though, so we don't have to apply all the things we can do with aggregates to cells in a different way
21:22:06 yeah ... that makes sense
21:22:19 there are a ton of things you can do with aggregates, and replicating that onto cells is just a terribly complex undertaking
21:22:27 isn't a lot of this trying to shoe-horn the old multi-level cells scheduler stuff into the new flat world rather than just doing things the way we can with what we have in flat scheduling?
21:22:40 mriedem: yes
21:22:41 it might, yeah
21:22:41 like, in cells v1 we have 2 level scheduling and can optimize the cell that's picked,
21:22:51 ok
21:23:03 it's, IMHO, more about "tenants are fixed into these silos", which is valid
21:23:21 those silos used to be naturally cellsv1 cells, but I don't want to tie more meaning into a cell than we have to,
21:23:29 because i don't think we want to add a bunch of new complexity to maintain how things were done the cells v1 way
21:23:34 which is why I'm resistant to giving them more meaning than just a group of computes that share a db/mq
21:23:39 right
21:23:41 ack
21:23:52 * melwitt nods
21:23:53 i realize that makes the transition harder
21:24:26 mirroring cells into aggregates (i.e. when we discover a new cell mapping, we add an aggregate, and when we map a new host, we add it to the aggregate) is an option, I just don't want to make that too easy :)
21:24:56 mriedem the transition yes, but I'm most worried about the operations
21:25:51 I need to set up something to keep the aggregates in sync and we will have few of them
21:25:55 this talk is making me think it might be useful to brainstorm a few reference deployment layouts to include in our docs
21:26:07 for example: aggregate-cell; aggregate-avz
21:26:29 complete with how you could draw your aggregates and cells
21:26:57 "if you currently do this with cells v1, this is how you would do that in cells v2"
21:27:29 going from the multi-level stuff to the flat. anyway, just an idea
21:28:01 melwitt: as long as we're not making a direct mapping, but describing how you achieve things in the new system
21:28:18 belmoreira: so, since this is going on and to wrap up a bit:
21:28:33 * melwitt nods
21:29:03 belmoreira: are you willing to do the mapping with aggregates and this pre-filter thing for a first-go, and report back with how heavy it is in reality for maintenance?
21:29:39 presumably the worst case here is "yes, this is very hard, we have aggregates out of sync sometimes, because $reasons, etc"
21:30:03 dansmith sure. let's give it a go
21:30:36 belmoreira: okay cool.. this aggregate idea is the result of me transitioning from "hell no" to this, which gets us close,
21:31:12 so I think we'll end up with something workable given more soak time and learning more about the pain points, as we have already
21:31:42 and we have next week to smash our brains together on ideas for refining things
21:31:57 okay, so.. any other open reviews to highlight? :)
21:32:01 other than tssurya's bug reviews
21:32:32 not yet, consoles stuff is still up but spec not re-approved yet. fyi
21:32:36 dansmith having some filtering in the placement is already something great. thanks for that. don't get me wrong :)
21:32:43 belmoreira: okay :)
21:32:59 melwitt: ack
21:33:12 #topic open discussion
21:33:20 we've already done a lot of discussing openly
21:33:25 no meeting next week because obvious.
21:33:32 anything else to bring up?
21:33:38 nope
21:33:51 * dansmith 's fingers are already tired
21:34:00 nay
21:34:06 tssurya: looking forward to meeting you next week!
21:34:16 dansmith : same here!
21:34:17 ++
21:34:28 aight, cells team out
21:34:30 #endmeeting