14:04:00 <cdent> #startmeeting nova scheduler
14:04:01 <openstack> Meeting started Mon Jun 4 14:04:00 2018 UTC and is due to finish in 60 minutes. The chair is cdent. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:04:02 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:04:06 <bauzas> noo
14:04:06 <openstack> The meeting name has been set to 'nova_scheduler'
14:04:16 <bauzas> howdy, it works \o/
14:04:19 <takashin> o/
14:04:26 <cdent> #chair efried bauzas edleafe jaypipes
14:04:26 <bauzas> s/<space>/_/
14:04:29 <openstack> Current chairs: bauzas cdent edleafe efried jaypipes
14:04:41 <efried> ō/ again
14:04:46 <cdent> it fixes it automagically as I recall
14:05:05 <bauzas> I wasn't expecting to have "nova scheduler" be transformed into "nova_scheduler", I was rather thinking it was creating a "nova" one
14:05:11 <bauzas> anyway, gtk
14:05:15 <cdent> #link agenda https://wiki.openstack.org/wiki/Meetings/NovaScheduler#Agenda_for_next_meeting
14:05:25 <bauzas> I'm there for 20 mins
14:05:45 <bauzas> please thank the daylight summer time
14:06:00 <cdent> As there hasn't been a recent placement update email (I'll try to do one this coming Friday) I'm not sure of the state of specs and reviews but:
14:06:04 <cdent> #topic specs and reviews
14:06:13 <cdent> any that people would like to bring up?
14:06:20 <efried> nrp-in-alloc-cands series needs second +2s
14:06:29 <efried> current bottom: https://review.openstack.org/#/c/567113/
14:06:46 <cdent> will be exciting to see that live
14:06:48 <efried> melissaml is +1, so it's clearly ready.
14:07:03 <bauzas> I can honestly approve the spec, not the series
14:07:08 <bauzas> because SQL, dudes
14:07:23 <bauzas> I looked at it, I understand it
14:07:40 <bauzas> but how can I know whether we have problems with that except just trusting ?
14:07:46 <efried> The second +2 is really on jaypipes
14:07:51 <bauzas> maybe jaypipes is the best person to +W
14:07:53 <cdent> very close reading of the tests?
14:07:53 <bauzas> yeah that
14:07:54 <efried> yes
14:07:58 <alex_xu_> o/
14:08:15 <cdent> any other reviews in a similar state?
14:08:23 <tetsuro> im ready to response questions on that series and hope that jay have a second look on that
14:08:33 <efried> I'd like to know the state of consumer gen series.
14:08:41 <efried> Whether that's ready for a final look yet.
14:08:51 <bauzas> my own series is blocked by https://review.openstack.org/#/c/557065/
14:08:55 <bauzas> I beg here for reviews
14:08:58 <efried> current bottom of consumer gen: https://review.openstack.org/#/c/567678/9
14:09:34 <efried> bauzas: Have we given up on using yaml right out of the gate on that one?
14:09:37 <bauzas> I have a merge conflict on my implementation but since I need to update my series because of a spec's revision, please at least review the spec first
14:09:43 <bauzas> efried: indeed
14:09:44 <jaypipes> tetsuro, efried, bauzas: I don't believe we should proceed with anything in n-r-p until we settle on a design for the in-place upgrade of the compute nodes.
14:10:08 <bauzas> jaypipes: that's a reasonable assumption, at least for new resource allocations
14:10:34 <bauzas> jaypipes: I think the problem is not existing with resource classes that don't need to be migrated
14:10:36 <efried> jaypipes: That's not a necessary blocker, esp. for drivers that haven't yet modeled resources that will need to move.
14:10:43 <efried> exactly
14:10:43 <bauzas> what efried said
14:10:56 <bauzas> also
14:11:01 <bauzas> one thing we need to keep in mind
14:11:04 <efried> Concrete example: I want to start modeling GPUs in powervm.
14:11:07 <bauzas> we support rolling upgrade
14:11:22 <bauzas> so we still need to support N-1 computes
14:11:34 <jaypipes> bauzas, efried: guys, if the drivers begin modeling resources with trees, that's when stuff will blow up, right? so we need to put all those series on hold until we figure out a path forward for the "healing" process stuff.
14:12:05 <efried> jaypipes: No, only in cases where *existing* resources are being *moved* from the compute node RP to subtrees
14:12:15 <gibi> jaypipes: does it include the case when Neutron will start report nested RPs?
14:12:21 <jaypipes> efried: right. so, NUMA, VGPU, etc.
14:12:35 <efried> jaypipes: PowerVM doesn't have GPUs yet - been deliberately waiting for this reason.
14:12:56 <bauzas> again, one thing to keep in mind is that even if we merge compute patches for NRP in Rocky, scheduler and placement still have to handle Queens RPs
14:13:04 <efried> So as soon as we have nrp-in-alloc-cands, we can start - we don't have to wait for the migration stuff. Because we won't have migration yet.
14:13:44 <jaypipes> efried: here comes the bluntness everyone hates me for apparently... but I really don't prioritize PowerVM's driver stuff over getting the fundamentals correct for the whole rolling upgrade problem that bauzas brought up on ML.
14:13:47 <bauzas> efried: tbc, nrp-in-alloc-cands work with rolling upgrades because we check microversions
14:14:22 <efried> I'm giving PowerVM as an example of why nrp shouldn't be blocked on the upgrade issue.
14:14:24 <bauzas> ok, let's not freak out
14:14:30 <bauzas> (like the GitHub thing :p )
14:14:40 <bauzas> we're still end of Roxky-2
14:15:04 <efried> Not asking anyone to prioritize PowerVM. Just asking to prioritize nrp like we've been asking to prioritize it since the beginning of the cycle.
14:15:08 <bauzas> so we can settle down a solution for rolling upgrades *and* merge bits on nrp in time for Rocky hopefully
14:15:42 <bauzas> I was just trying to identify the upgrade impact
14:15:55 <bauzas> hence me saying that some things shouldn't be blocked
14:15:59 <jaypipes> efried: I'm just saying if we're going to focus on n-r-p things, it should be anything that will enable the fixes for rolling upgrade.
14:16:00 <bauzas> but looks like we're diverting
14:16:11 <bauzas> jaypipes: that sounds reasonable
14:16:23 <bauzas> in terms of upstream effort on priotization
14:16:35 <bauzas> gosh, I have gloves
14:16:46 <jaypipes> efried: if that's tetsuro's n-r-p with allocs series, fine, but I don't *think* that series will have anything to do with rolling upgrades will it? I mean, we've been discussing a completely new HTTP endpoint for handling these mass migrations.
14:17:06 <bauzas> jaypipes: tetsuro is proposing a new microversion for that
14:17:15 <jaypipes> bauzas: for what?
14:17:22 <bauzas> jaypipes: for returning child resources
14:17:23 <efried> sorry, I didn't follow that last question, jaypipes
14:17:37 <bauzas> so I guess we can pretend in Rocky to not know about nested RPs
14:17:46 <efried> agh, no.
14:18:00 <bauzas> inventories will report the new model, but then scheduler will only speak old greek
14:18:06 <jaypipes> bauzas: right, but that (while a very useful addition for the scheduler to make use of n-r-p) isn't relevant to the mass migration API.
14:18:12 <bauzas> that's one way to tackle this
14:18:34 <bauzas> jaypipes: agreed, it's orthogonal, good point
14:18:44 <bauzas> jaypipes: but I guess we somehow needs to address that too
14:18:54 <bauzas> ie. we have two upgrade concerns
14:18:59 <jaypipes> OK, what patches *currently up* would enable any of the mass migration stuffs? anything?
14:19:12 <bauzas> 1/ is the allocations and inventories of resource classes that are now attached to a child RP
14:19:29 <efried> no, we haven't started that yet jaypipes, as far as I'm aware.
14:19:35 <bauzas> 2/ is the fact that scheduler will have to handle the fact we have both Queens and Rocky RPs
14:19:53 <jaypipes> AFAICT, we should -W the XenAPI VGPU series as well as the multi-gpu-type series until the mass migration stuff is resolved. Would bauzas, efried, cdent you agree with that?
14:19:54 <cdent> last I recall was last friday's (or maybe thursday's) discussion of the etherpad
14:19:58 <bauzas> (hope people follow me)
14:20:14 <bauzas> jaypipes: can't disagree
14:20:24 <efried> jaypipes: Agree we should hold up series impacted by upgrade issue, yes.
14:20:29 <bauzas> it's sad but it's necessary
14:20:43 <jaypipes> bauzas, efried: specifically those two series, temporarily, yes?
14:20:54 <cdent> we should _not_, however, block tetsuro's stuff, right?
14:20:59 <jaypipes> bauzas, efried: are there any *other* patch series currently up that should be -W'd for the same reason?
14:21:06 <efried> I haven't thought through thoroughly, but that's a qualified agreement that it's those two series that need to be held.
14:21:20 <jaypipes> cdent: block, no. de-prioritize (slightly) to deal with the upgrade-y stuff, maybe.
14:21:21 <efried> cdent: Correct, we should *not* block nrp-in-alloc-cands - we should get that reviewed and merged asap.
14:21:43 <jaypipes> efried: ack. I can review it today.
14:21:48 <bauzas> jaypipes: my own series wasn't yet providing nested inventories yet
14:21:50 <cdent> I'd like to see it merged asap as well, just so it is out of the way, but still usable
14:22:00 <jaypipes> efried: just want to point out the upgrade stuff should take precedence/priority as much as possible.
14:22:11 <efried> jaypipes: Cool; if it helps, I can propose spec and/or code for the upgrade stuff so you're clear to review tetsuro's series.
14:22:41 <efried> I'm +2 on the bottom few patches, and mostly up to speed on the rest
14:22:46 <jaypipes> efried: well, if you can whip up a spec that would be cool, yes. we still need to settle differences on the various proposals in that etherpad, though.
14:23:02 <efried> yes. Let's talk about that some more in a bit.
14:23:05 <jaypipes> efried: you're +2 on tetsuro's series' bottom patches?
14:23:07 <bauzas> can we put the etherpad link here ?
14:23:17 <efried> jaypipes: yes
14:23:20 <bauzas> and snaaaap, I need to go babysitting
14:23:24 <jaypipes> https://etherpad.openstack.org/p/placement-migrate-operations
14:23:26 <jaypipes> #link https://etherpad.openstack.org/p/placement-migrate-operations
14:23:49 <jaypipes> I'd actually appreciate a hangout session on the ^^ for higher-bandwidth communication.
14:24:26 <efried> Cool, are folks free after this? cdent jaypipes, anyone else?
14:24:29 <jaypipes> would anyone have time to discuss the above on a hangout?
14:24:46 <jaypipes> efried: I'd like at least cdent and edleafe on the hangout if possible.
14:24:53 <cdent> I'm free 2.5 hours from now
14:24:57 <jaypipes> our API experts-in-resident
14:25:30 <jaypipes> cdent: heh, unfortunately I have a call from 2.5 hours from now to 3.5 hours from now.
14:25:32 <bauzas> jaypipes: I can do the hangout right around 3pm UTC
14:25:45 <bauzas> uhu
14:25:51 <bauzas> anyway, I need to leave now
14:25:51 <jaypipes> perhaps we can schedule a hangout for tomorrow morning (EST) and do some brainstorming today on the ehterpad?
14:25:57 <cdent> I could maybe do 1 hour from now if it was only .5 hour long?
14:26:01 <bauzas> find a slot and I'll see how I can sneak into
14:26:13 <cdent> or tomorrow morning works too
14:26:24 <cdent> (I'm still pst, but operating early)
14:26:35 <jaypipes> reminder to folks on the etherpad... please set your name in the participants color box thing.
14:26:46 <efried> currently on the table: 1530 UTC for half an hour, or 1700 UTC
14:26:57 <jaypipes> cdent: oh, crap, forgot about you being in PST...
14:27:07 <efried> edleafe: either of those better for you?
14:27:18 <cdent> jaypipes: no worries, I'm waking up early
14:27:19 <jaypipes> efried: I think edleafe may be commuting to the office?
14:27:24 <jaypipes> cdent: ack
14:27:41 * jaypipes needs to head to his daily standup meeting in 2 minutes.
14:27:57 * alex_xu only wants to wait the conclusion
14:27:58 <cdent> let's work out a when in #openstack-placement
14:28:08 <jaypipes> k
14:28:13 <cdent> for having one tomorrow morning
14:28:18 <cdent> with
14:28:25 <cdent> #action review https://etherpad.openstack.org/p/placement-migrate-operations today
14:28:33 <jaypipes> ++
14:28:38 <efried> wfm
14:28:59 <jaypipes> ok, unfortunately I need to run to another meeting :(
14:29:09 <tetsuro> I got that we're focusing on upgrade issue. After this I should go, but will have a look on that etherpad tomorrow
14:29:17 <jaypipes> #action review tetsuro's n-r-p alloc cands patch series as soon as possible
14:29:22 <cdent> okay, moving on, any other reviews to discuss?
14:29:39 * jaypipes goes ethereal
14:29:45 <cdent> #topic bugs
14:29:54 <cdent> #link bugs https://bugs.launchpad.net/nova/+bugs?field.tag=placement&orderby=-id
14:30:26 <cdent> 32 of them, nothing super new
14:30:37 <cdent> anybody want to talk about a bug?
14:31:06 <cdent> #topic opens
14:31:14 <cdent> anyone on anything?
14:31:49 <cdent> If not, then our primary next steps are to look at the migration etherpad and to get tetsuro's nrp in allocation-candidates reviewed and merged
14:31:56 <cdent> cool?
14:32:24 <gibi> cool
14:32:30 <efried> cool
14:32:36 <melwitt> ++
14:32:38 <cdent> thanks gibi, for not leaving me hanging
14:32:54 <cdent> thanks for coming everyone
14:32:57 <cdent> #endmeeting