14:00:19 <edleafe> #startmeeting nova_scheduler
14:00:20 <openstack> Meeting started Mon Oct 3 14:00:19 2016 UTC and is due to finish in 60 minutes. The chair is edleafe. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:21 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:23 <openstack> The meeting name has been set to 'nova_scheduler'
14:00:26 <edleafe> Who's around?
14:00:28 <takashin> o/
14:00:29 <_gryf> o/
14:00:37 <rpodolyaka> o/
14:00:37 <Yingxin> o/
14:00:42 <alex_xu> o/
14:00:43 <cdent> o/
14:01:03 * johnthetubaguy lurks
14:01:42 <jaypipes> o/
14:01:43 <edleafe> Let's wait another minute
14:02:31 <edleafe> #topic Specs and reviews
14:02:37 <edleafe> Nothing on the agenda
14:02:54 <edleafe> Anyone have something that needs special attention?
14:03:02 <bauzas> \o
14:03:19 <alex_xu> the traits API's spec and poc is ready
14:03:27 <cdent> just a reminder about the existence of the aggregates reviews: https://review.openstack.org/#/c/362863/
14:03:42 <Yingxin> yup, traits
14:03:48 <edleafe> #link https://review.openstack.org/#/c/362863/
14:04:03 <edleafe> alex_xu: care to add a link to the start of those series?
14:04:11 <Yingxin> start from https://review.openstack.org/#/c/376198
14:04:18 <edleafe> thanks Yingxin
14:04:48 <edleafe> Anything else?
14:04:52 <Yingxin> and the spec https://review.openstack.org/#/c/345138
14:05:07 <edleafe> #link https://review.openstack.org/#/c/376198
14:05:07 <alex_xu> Yingxin: thanks
14:05:15 <edleafe> #link https://review.openstack.org/#/c/345138
14:05:18 <alex_xu> a little trouble with my browser..
14:05:21 <bauzas> I need to write the (hopefully) specless blueprint about the left usage of RequestSpec
14:05:45 <edleafe> Yingxin: you can add them as #link entries so that people can find them in the meeting minutes
14:06:08 <edleafe> bauzas: ah, good - that leads us to...
14:06:12 <edleafe> #topic Opens
14:06:21 <Yingxin> edleafe: ok sorry
14:06:24 <edleafe> bauzas: care to summarize what's left?
14:06:28 <edleafe> Yingxin: no worries
14:06:45 <bauzas> edleafe: well, are you guys knowing how the RequestSpec object is used ?
14:07:01 <edleafe> bauzas: probably not as well as you
14:07:08 <edleafe> :)
14:07:10 <bauzas> I first wrote the implementation for creating a RequestSpec object in the conductor and passing it to the scheduler
14:07:20 <bauzas> that was the first item
14:07:44 <bauzas> then, alaski helped me to persist that object in DB so that we could use it for cells v2 needs
14:08:06 * alaski peeks in
14:08:32 <bauzas> then, I wrote a 3rd item where all the API move operations check whether the RequestSpec already exists, and if so, pass it to the conductor
14:09:03 <bauzas> plus, I wrote a special DB migration for making sure that all the instances are now having a related RequestSpec object
14:09:32 <bauzas> that means that now, either you boot and then you go to the conductor that writes a RequestSpec object and passes it to the scheduler
14:10:07 <bauzas> or, you do another API operation (resize, migrate, live-mig, etc.) and then you're getting it straight from the API
14:10:09 <bauzas> buuuut
14:10:15 <bauzas> there are a lot of things left to do
14:10:44 <jaypipes> bauzas: short list of those remaining items would be useful.
14:10:58 <bauzas> jaypipes: yeah, hence me saying I need to write a bp
14:11:06 <bauzas> but edleafe wanted to know about :)
14:11:32 <edleafe> So, here's my concern: bauzas is pretty much the only one working on this in enough detail to understand what needs to be changed, and what the effects will be
14:11:51 <edleafe> What I'd like to see is some knowledge distribution
14:12:21 <edleafe> Maybe have bauzas act as oversight, and have others do the work
14:12:33 <edleafe> The RequestSpec czar, if you will
14:12:46 <bauzas> so, what's missing is basically to make sure our computes are getting our spec object instead of getting the legacy dicts, so that they could reschedule by passing back that object
14:12:58 <edleafe> This way he's not the only one who might feel confident enough to make updates to it in the future
14:13:28 <bauzas> and then we could cleanup our RPC interfaces by removing all ugly request_spec/filter_props dictionaries and all our code conditionals where we test whether it's a dict or a nova object
14:13:29 <edleafe> The same concern goes for cdent and the placement API/engine
14:13:34 <cdent> +1
14:13:38 <bauzas> that's it for me
14:13:55 <bauzas> edleafe: the only problem I see with that is that I like coding too :p
14:14:06 <edleafe> bauzas: fair enough :)
14:14:14 <edleafe> bauzas: you can do _some_ of it
14:14:20 <bauzas> but I could help with the placement series, like I promised :)
14:14:35 <edleafe> but I'd feel better if more people understood it in depth
14:14:45 <bauzas> either way, I think it's more a problem about how we can, as a team, share our work
14:14:53 <edleafe> bauzas: 'zactly
14:15:11 <bauzas> edleafe: but like I pointed to you, we're having very different backgrounds
14:15:20 <bauzas> so, that's not trivial
14:15:47 <edleafe> bauzas: That's true, but it seems like the technical debt problem
14:16:00 <edleafe> Not enough time to address current debt, so it keeps increasing
14:16:02 <dave-johnston> o/
14:16:18 <edleafe> Maybe since Ocata is a short cycle, this might be the best time to address it
14:17:05 * bauzas shrugs
14:17:16 <bauzas> I mean, sure that would be cool
14:17:29 <edleafe> Anyone else have opinions on this one way or the other?
14:18:20 <cdent> Other than "pay more attention to tech debt and share workload more" are you proposing something more specific edleafe ?
14:18:42 <cdent> because if it is mostly what I've quoted, then yeah, of course we should do that
14:19:00 <edleafe> cdent: yes, something like http://blog.leafe.com/pair-development/
14:19:24 <cdent> yeah, as I've said before I'm pretty keen on that
14:19:25 <_gryf> I agree, working together on certain topic is a better way for understanding code underneath than doing reviews only
14:19:37 <edleafe> But of course there are other options that we should consider
14:20:29 <Yingxin> we can also work on a shift :)
14:20:50 <edleafe> So maybe have bauzas write the BP, and then we can consider how to divide that up so as to spread the knowledge better?
14:21:01 <bauzas> oh man, work for me :p
14:21:03 <edleafe> Yingxin: yes, good point
14:21:15 <bauzas> (and not, work*s* for me :p )
14:21:50 <edleafe> Yingxin: but it does help to have enough timezone overlap for those times that a hangout or similar might be needed
14:22:07 <edleafe> Boss Bauzas :)
14:22:21 <Yingxin> edleafe: yes, indeed
14:23:05 <edleafe> This is certainly something we can discuss at the summit, too.
14:23:17 <edleafe> OK, let's move on
14:23:26 <edleafe> Placement DB Spec
14:23:55 <edleafe> cdent rpodolyaka and I are trying to hammer this one out
14:23:57 <cdent> gist is that Matt has asked that there be a spec for the optional placement db to avoid confusion
14:24:24 <cdent> there's been some noodling at
14:24:26 <cdent> #link https://etherpad.openstack.org/p/placement-optional-db-spec
14:24:36 <cdent> some of which includes "should we even bother"
14:24:42 <rpodolyaka> ++
14:25:02 <edleafe> cdent: "Should we even bother? I mean, we're all going to die anyway"
14:25:05 <edleafe> :)
14:25:08 <rpodolyaka> I just went over the changes that actually made it to Newton and looks like "self healing" should migrate most of the things
14:25:17 <rpodolyaka> *except for aggregates
14:25:36 <edleafe> rpodolyaka: and the initial db/table creation
14:25:40 <cdent> rpodolyaka: it should, but it might stall adding new consumers for too long
14:25:49 <rpodolyaka> edleafe: sure
14:25:52 <cdent> if I remember the concerns correctly
14:26:00 <cdent> we probably need to inquire with dansmith
14:26:24 <rpodolyaka> my understanding was that we report resources state on start in nova-compute and in a periodic task that is executed like every minute
14:26:46 <edleafe> rpodolyaka: but does that account for allocations, too?
14:26:49 <cdent> but in case people were wondering: we're trying to use this as an example of "working on stuff together"
14:26:52 <bauzas> not sure I understand what the problem is
14:27:14 <edleafe> cdent: yes, exactly
14:27:27 <rpodolyaka> edleafe: it turns out we also create allocations based on instances usages. this is done in update_resources_...() in RT, so this should be covered as well
14:27:50 <edleafe> rpodolyaka: ok, thanks. I need to dig into that code path deeper
14:28:01 <rpodolyaka> yeah, I want to give it a try on devstack
14:28:10 <johnthetubaguy> so just looking at the etherpad, is this about the DB no longer being optional? or did I get that back to front?
14:28:24 <edleafe> johnthetubaguy: you got it right
14:28:37 <bauzas> wait, what ?
14:28:44 <cdent> this is still about it being optional
14:29:02 <edleafe> optional in Ocata??
14:29:02 <johnthetubaguy> but about the need for some stuff to be migrated if you want it to be separate?
14:29:02 <cdent> but that in ocata the cost of that optional-ness is different than it would have been if we had managed to get it out in newton
14:29:15 <bauzas> cdent: not sure I understand the same point
14:29:25 <bauzas> cdent: I still think the placement DB should be optional
14:29:45 <cdent> bauzas: yes, that's the plan
14:29:57 <bauzas> just that we need to consider what was written in the API DB could possibly be now written in a separate DB
14:30:05 <johnthetubaguy> so I think what I am saying, is I don't understand the problem that etherpad is trying to solve right now, I am missing context somewhere
14:30:14 <cdent> but if people choose to use it, then there will need to be some kind of migration and what we're discussing (on the etherpad) is how much migration will be required and how to do it
14:30:15 <edleafe> Hmmm, I thought that the optional in Newton was to smooth the transition to non-optional in Ocata
14:30:26 <bauzas> edleafe: not at all
14:30:36 <edleafe> that's disappointing
14:30:49 <bauzas> edleafe: the point about that being optional is that we share the same schema
14:30:52 <edleafe> So is there a plan for when it will be non-optional?
14:30:57 <johnthetubaguy> I thought optional was so you could avoid a nasty migrate, if you were willing to plan ahead, but yeah
14:31:17 <cdent> johnthetubaguy: that's right, both edleafe and bauzas are a bit out of sync
14:31:28 <bauzas> edleafe: so operators wanting to use it heavily could have the benefit of a separate DB without getting the PITA for us to maintain a 3rd schema and all the tooling associated
14:32:22 <johnthetubaguy> so I guess some of the good news here, is we have hit the confusion already, so that's good news
14:32:24 <bauzas> cdent: specs are good for clarifying the intent, I would say :)
14:32:28 <edleafe> I get the 'same schema' concept
14:33:01 <cdent> johnthetubaguy: the reason I've suggested that perhaps we shouldn't even bother with the optional database is now that a migration of some kind is going to be required anyway, may as well just wait and do a big one when placement is extracted, if it ever is
14:33:09 <edleafe> But I'm still wondering if the plan is to always have an option to keep it in the API DB
14:33:37 <cdent> edleafe: that's kind of the flip side of what I just said
14:34:04 <edleafe> cdent: the "if it ever is" part?
14:34:31 <bauzas> edleafe: cdent: the migration we're talking about are about inventories, right ?
14:34:50 <edleafe> inventories, allocations, and aggregates
14:34:54 <bauzas> because allocations are written using the periodic update or within claims, so that's not really an issue
14:35:01 <bauzas> aggregates aren't a thing yet
14:35:11 <cdent> edleafe: as in: what's the actual goal here, where will the boundary lie, how permeable will it be? will it ever migrate? will there always be an option to stay in api db, etc.
14:35:13 <bauzas> so that leaves inventories and allocations
14:35:31 <bauzas> plus the fact that nobody is actually calculating the resource usage
14:35:42 <bauzas> (yet)
14:36:05 <cdent> I'd like to think that we could hash this out on the etherpad and in email in a more...considered... fashion because we're pretty much just throwing words at one another right now and not really making any sense
14:36:05 <bauzas> we rushed in Newton because we wanted our Newton computes to be able to send allocations and inventories
14:36:22 <edleafe> cdent: My understanding was that the separation will happen, with the question being when
14:36:26 <cdent> In large part because we don't actually have consensus on the goals. Making a spec without a big picture goal is _useless_
14:36:49 <cdent> edleafe: me too, but bauzas apparently disagrees
14:37:16 <edleafe> The rush to get stuff in newton was specifically so that the change in Ocata would be possible
14:37:22 * jaypipes reading back, sorry had emergency call
14:37:37 <edleafe> Otherwise, we'd have to wait until Pike to do it
14:38:41 <edleafe> Let's timebox this until 14:45 because we have other items on the agenda we need to get to
14:38:50 <cdent> yes please
14:38:51 <edleafe> We can always continue on -nova
14:39:10 * edleafe is waiting for jaypipes to catch up
14:40:36 <edleafe> Well, let's cover the other stuff. We can circle back to jaypipes after
14:40:46 <edleafe> Placement leftovers:
14:40:50 <edleafe> #link http://lists.openstack.org/pipermail/openstack-dev/2016-October/104900.html
14:41:09 <edleafe> cdent?
14:41:29 <cdent> thanks, I simply wanted to draw people's attention to that email
14:41:44 <cdent> that's a list of things I thought of that are loose ends on the placement work already done
14:41:55 <cdent> stuff that we should let slip away lest we create more cruft and debt
14:41:58 <cdent> shouldn't!
14:41:59 <cdent> :)
14:42:08 <edleafe> Freudian slip there
14:42:30 <jaypipes> bauzas: what are you talking about with "nobody is calculating the resource usage yet"?
14:42:39 <cdent> As I say in the message I'd like feedback on what matters, and people who want to work on it with me and others.
14:42:49 <edleafe> #action Everyone read cdent's email at http://lists.openstack.org/pipermail/openstack-dev/2016-October/104900.html and provide feedback
14:42:51 <bauzas> jaypipes: I mean that we're not using the placement API yet for doing our filtering
14:43:00 <jaypipes> bauzas: ah.
14:43:03 <jaypipes> bauzas: yes
14:43:15 <bauzas> ie. we don't have a client actually consuming those placement information
14:43:26 <bauzas> we have clients that provide resource information
14:43:43 <jaypipes> bauzas: well, we do. the scheduler reporting client is consuming the allocation and inventory information.
14:43:57 <bauzas> that's correct
14:44:40 <bauzas> jaypipes: okay, I think it's gonna be tough discussing it there, I would suggest to draw something out later either way
14:45:02 <jaypipes> bauzas: k
14:45:02 <bauzas> I'm just not super convinced that we should make it non-optional if we share the same models
14:45:39 <bauzas> and providing a separate model for placement seems a hard call for Ocata given the shorter cycle
14:45:54 <jaypipes> bauzas: the optional-ness of the placement DB was more about the tooling around db sync and migrations, no? It wasn't about whether we wanted a separate placement DB...
14:46:04 <jaypipes> dansmith: is ^ your recollection?
14:46:09 <bauzas> jaypipes: I agree
14:46:17 <bauzas> hence me wanting to stick with the plan
14:46:18 <edleafe> jaypipes: that was my understanding, too
14:46:53 <cdent> bauzas: can you restate what you agree with please?
14:46:57 <jaypipes> bauzas: I guess I need a refresher on the specifics of the current plan, then. I'll go read that etherpad.
14:47:09 <dansmith> bauzas: yeah, I don't want them to be forced to have another db until it's actually split, but we should provide all the machinery to make it possible
14:47:12 <dansmith> er, jaypipes ^
14:47:49 <jaypipes> dansmith: yes, totes. but the definition of "all the machinery" needs to be specified.
14:47:50 <cdent> I wonder if confusion is from the etherpad point out that placement _api_ is non optional, which is not the same as placement _db_ being optional or not
14:47:59 <dansmith> jaypipes: agreed
14:48:11 <cdent> there's code here: https://review.openstack.org/#/c/362766/ it already works
14:48:24 <cdent> (in local devstacks)
14:48:40 <cdent> we stalled that code for newton because we didn't want to add to the confusion
14:48:46 <jaypipes> dansmith: for instance, do we want to provide machinery to do a migrate/sync on a separate placement DB connection, or do we continue the strategy from Newton of "if you're a big provider and want to proactively do X, then run these commands..."
14:49:31 <edleafe> cdent: my confusion came from the changes between Austin and the midcycle, where the DB strategy seemed to change. I thought the resolution was to make it temporary short-term, and mandatory after that
14:49:40 <dansmith> jaypipes: I thought you were going to just write a script for people that wanted to?
14:49:58 <edleafe> cdent: I don't recall always being optional as, well, an option
14:50:12 * bauzas needs to drop-off for doing some homework to my elder
14:50:23 <edleafe> ok, thanks bauzas
14:50:31 <jaypipes> dansmith: that would certainly be fine. just need to write down that that is indeed our strategy/plan for Ocata.
14:50:48 <dansmith> cool
14:51:04 <cdent> Can jaypipes, dansmith and bauzas hash out what the end game is, on that etherpad so we don't lose the details?
14:51:06 <edleafe> Yeah, I get the feeling that decisions are being made and not spread publicly very well
14:51:14 <jaypipes> cdent: yep
14:51:19 <cdent> awesome thanks jaypipes
14:51:32 <edleafe> yes, let's move on now
14:51:36 <edleafe> Next up are 2 cold migration specs from takashin:
14:51:36 <edleafe> #link https://review.openstack.org/#/c/334286/
14:51:36 <edleafe> #link https://review.openstack.org/#/c/334725/
14:51:48 <takashin> Could you review them?
14:51:50 <edleafe> takashin: do you have any comments about them?
14:52:15 <takashin> They are about cold migration.
14:53:01 <edleafe> takashin: ok, thanks. I just wanted to make sure that you didn't have any extra concerns about them
14:53:18 <takashin> they are my specs.
14:54:09 <edleafe> ok, thanks
14:54:14 <edleafe> Anything else for opens?
14:54:32 <alex_xu> jaypipes: edleafe so let us hangout for traits API after the meeting?
14:55:08 <edleafe> I'm free
14:55:15 <edleafe> jaypipes?
14:55:53 <alex_xu> edleafe: cool, thanks
14:56:48 <edleafe> alex_xu: let's wait until jaypipes is available. I know it's pretty late for you
14:56:57 <alex_xu> edleafe: yeah
14:57:02 <edleafe> alex_xu: we'll continue in -nova
14:57:05 <cdent> alex_xu, edleafe If you guys can keep me informed on when that is, I'll try to make it
14:57:12 <alex_xu> edleafe: yea, cool
14:57:15 <alex_xu> cdent: ok, got it
14:57:17 <edleafe> cdent: it was supposed to be now
14:57:32 <edleafe> but with no jaypipes it might have to wait
14:57:41 <dave-johnston> ]
14:57:46 <edleafe> Anyway, let's continue in -nova
14:57:50 * cdent nods
14:57:53 <edleafe> #endmeeting