13:59:59 <efried> #startmeeting nova_scheduler
14:00:00 <openstack> Meeting started Mon Jul 16 13:59:59 2018 UTC and is due to finish in 60 minutes.  The chair is efried. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:01 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:03 <openstack> The meeting name has been set to 'nova_scheduler'
14:00:14 <cdent> o_
14:00:17 <takashin> o/
14:00:23 <gibi> o/
14:00:24 <efried> Get up cdent!
14:00:24 <alex_xu> o/
14:00:32 * cdent is tired
14:00:34 <edleafe> \o
14:01:26 <efried> #topic last meeting
14:01:38 <efried> #link last minutes: http://eavesdrop.openstack.org/meetings/nova_scheduler/2018/nova_scheduler.2018-07-09-14.01.html
14:01:52 <efried> Any old business to bring up?
14:02:51 <jaypipes> o/
14:02:54 <tssurya> o/
14:02:56 <efried> #topic specs and review
14:03:03 <efried> #link latest pupdate: http://lists.openstack.org/pipermail/openstack-dev/2018-July/132252.html (Thanks for covering this, Jay)
14:03:27 <efried> Highest priority at the moment is the
14:03:27 <efried> #link reshaper series: https://review.openstack.org/#/q/topic:bp/reshape-provider-tree+status:open
14:04:08 <efried> cdent: jaypipes and I are working through stuff this morning.
14:04:15 <efried> s/:/,/
14:04:47 <cdent> I _almost_ managed to reshape something. but not quite
14:05:04 <jaypipes> cdent: what did you run into?
14:05:20 <jaypipes> cdent: you mean reshaping with gabbits? or something else?
14:05:23 <cdent> resource provider generation conflict in set_allocations
14:05:35 <cdent> yes, reshaping over http
14:05:47 <cdent> I noted it in my latest push. it's the last test in reshaper.yaml
14:05:52 <jaypipes> ack
14:05:56 <jaypipes> ok, will look shortly.
14:06:06 <efried> Last week we stuffed all of the patches into a single series; we'll see if multiple-authors-on-a-series is more or less painful than maintaining dependencies in other ways.
14:07:11 <efried> Note that there's a
14:07:11 <efried> #link reshaper spec update: https://review.openstack.org/582350
14:07:11 <efried> open to collect design tweaks as needed for implementation.
14:07:27 * efried re -Ws that...
14:08:19 <efried> Going to hold that open until we're pretty much done. Folks should feel free to submit patch sets there.
14:08:46 <efried> Any other specs or reviews people would like to highlight at this time?
14:08:57 <efried> or discussion needed on the above?
14:09:02 <jaypipes> not from me, no
14:09:10 <cdent> carry on
14:09:25 <efried> #topic bugs
14:09:25 <efried> #link placement bugs: https://bugs.launchpad.net/nova/+bugs?field.tag=placement&orderby=-id
14:10:34 <efried> We've closed the more urgent consumer gen related bugs, which was probably the most important/urgent thing.
14:10:56 <jaypipes> https://bugs.launchpad.net/nova/+bug/1781439 should be an easy one for people looking for low-hanging-fruit
14:10:57 <openstack> Launchpad bug 1781439 in OpenStack Compute (nova) "Test & document 1.28 (consumer gen) changes for /resource_providers/{u}/allocations" [Undecided,New]
14:11:00 * jaypipes adds tag in LP
14:11:34 <jaypipes> I'll shoot a note to the ML
14:11:51 <efried> I'd like to point out that bug #1781430 should still be fixed, but is no longer in the critical path since latest tweaks to the reshaper db patch.
14:11:51 <openstack> bug 1781430 in OpenStack Compute (nova) "AllocationList.delete_all() incorrectly assumes a single consumer" [High,In progress] https://launchpad.net/bugs/1781430 - Assigned to Jay Pipes (jaypipes)
14:12:46 <efried> Any other bugs anyone would like to highlight or discuss at this time?
14:13:33 <jaypipes> efried: err, isn't that bug fixed in the bottom patch of the reshaper series?
14:13:46 <jaypipes> i.e. https://review.openstack.org/#/c/582382/
14:13:59 <efried> jaypipes: Yes it is, but not yet merged, and no longer essential for the series.
14:13:59 <jaypipes> efried: or were you thinking of a different bug?
14:14:13 <efried> jaypipes: I don't anticipate it coming to this, but if necessary, it could be yanked out of the series.
14:14:23 <gibi> efried: that fix is on the gate now :)
14:14:29 <jaypipes> k
14:14:32 <efried> gibi: Thanks. That's the easiest thing :)
14:15:08 <efried> moving on...
14:15:15 <efried> #topic opens
14:15:15 <efried> Planning/Doing support in nova/report client for: consumer generation handling
14:15:56 <cdent> I added that because I figured we needed the reminder, but I don't have much to say beyond that
14:16:30 <efried> This entails scouring the rt/reportclient for stuff related to allocations, bumping the calls to microversions that handle consumer generations, and possibly handling retries/races accordingly.
14:17:12 <efried> I'm guessing cdent jaypipes efried will probably not have the bandwidth to look at this until reshaper winds down. If anyone else wishes to jump in here, that help would be welcomed.
14:17:50 <gibi> efried, cdent: I will try to keep that on my radar
14:17:59 <efried> Thanks gibi
14:18:14 <efried> next up:
14:18:14 <efried> Planning/Doing support in nova/report client for: nested and shared providers when modifying migration (and other?) allocations
14:18:48 <efried> Similar in spirit to the above, though probably considerably more complicated.
14:19:31 <gibi> efried: also, I think this depends a bit on the former, as nested a_c is a higher microversion than consumer gen
14:20:14 <cdent> yeah, that too was "we needed the reminder"
14:20:44 <efried> gibi: Well, the nested microversion (1.29) only affects GET /a_c; whereas the consumer gen (1.28) affects the [resource_providers/{u}]/allocations[/{u}] paths.
14:20:49 <efried> So they can still be done independently of each other.
14:20:53 <efried> However,
14:21:27 <efried> I think we should broaden this topic (or add a precursor) for making sure our initial allocations work with nested + shared in the first place.
14:21:49 <efried> We have one example where we've proven shared works on initial allocations - that's libvirt with shared DISK_GB.
14:22:28 <efried> I believe gibi is working on a patch for some func tests along these lines with nrp.  Finding...
14:22:49 <gibi> efried: here is the patch, which shows that we need 1.29 for nested a_c: https://review.openstack.org/#/c/527728/
14:23:02 <efried> beaut.
14:24:34 <efried> next up:
14:24:34 <efried> Planning/Doing support in nova/report client for: whatever else is not being remembered right now
14:24:34 <efried> Can anyone think of more things we should do on the rt/reportclient side to exploit work we've done in placement lately?
14:25:07 <cdent> I think we need to be clear to separate what we must do from what we'd like to do. I'm not clear where that boundary currently is
14:25:22 <efried> Agree with that.
14:25:59 <efried> We know we need reshaper before we can support nrp for vgpu or numa. That's an easy one.
14:26:23 * cdent nods
14:27:26 <gibi> also we need 1.28 and 1.29 support in the report client to support nrp at all outside of reshaping situations
14:27:40 * mriedem forgot the meeting started
14:27:43 <efried> Do we need 1.28 though?
14:28:02 <efried> Do we actually need consumer gens for anything from nova right now, considering the big lock in the rt?
14:28:28 <gibi> efried: it feels dangerous to get allocation candidates back from 1.29 and then pass them to /allocations at < 1.28
14:29:08 <efried> That's fair.
14:30:35 <efried> So that reminds me: generation support for aggregate operations.
14:31:10 <efried> #link Check provider generation and retry on conflict: https://review.openstack.org/#/c/556669/
14:31:44 <efried> This *is* crucial because we *do* have a race on aggregates, since we're mirroring host aggs in the api service.
14:32:10 <efried> This patch has been through the wringer a fair bit, but I think it's ready to be reviewed now.
14:32:29 <efried> cdent, jaypipes, mriedem: you have had eyes on this previously; would you mind having another look please?
14:32:38 <cdent> yessir
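[Editor's sketch] The "check provider generation and retry on conflict" idea discussed above can be illustrated roughly as follows. This is not the code from the patch under review: FakeProviderStore, GenerationConflict and add_aggregate_with_retry are hypothetical names invented for this illustration, and the in-memory store only stands in for a service that rejects writes carrying a stale generation.

```python
class GenerationConflict(Exception):
    """Raised when the caller's generation no longer matches the stored one."""


class FakeProviderStore:
    """Hypothetical in-memory stand-in for a generation-guarded service."""

    def __init__(self):
        self.generation = 0
        self.aggregates = set()

    def get(self):
        # Return the current generation together with a copy of the data.
        return self.generation, set(self.aggregates)

    def put(self, generation, aggregates):
        # Reject the write if someone else changed the record in between.
        if generation != self.generation:
            raise GenerationConflict()
        self.aggregates = set(aggregates)
        self.generation += 1


def add_aggregate_with_retry(store, agg_uuid, max_attempts=3):
    """Read, modify, write; on a generation conflict, re-read and try again."""
    for _ in range(max_attempts):
        generation, aggregates = store.get()
        aggregates.add(agg_uuid)
        try:
            store.put(generation, aggregates)
            return True
        except GenerationConflict:
            # Another writer won the race; loop to pick up the fresh state.
            continue
    return False
```

The point of re-reading inside the loop is that the retry merges the new aggregate into whatever the other writer stored, rather than overwriting it with stale data.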
14:33:11 <mriedem> currently working on fixing https://bugs.launchpad.net/nova/+bug/1781710
14:33:11 <openstack> Launchpad bug 1781710 in OpenStack Compute (nova) "ServersOnMultiNodesTest.test_create_server_with_scheduler_hint_group_anti_affinity failing with "Servers are on the same host"" [High,Triaged] - Assigned to Matt Riedemann (mriedem)
14:33:15 <mriedem> but yeah for later
14:34:24 <efried> So I think making sure both initial and migrating allocs work for shared+nested <= this is in the critical path for nrp support, possibly even more important than reshaper because it affects even initial nrp impls (those not requiring reshaper to get started).
14:35:17 <efried> cdent: Sounds like the answer is: It's all "must do" o_O
14:35:34 <cdent> yay?
14:35:50 <gibi> business as usual
14:36:47 <efried> If nobody else has volunteered by the time reshaper winds down, I'll probably hit that last one, because I'm going to be implementing such a driver (initial nrp not requiring reshaper).
14:37:40 <cdent> i'll have a more clear picture in a few days
14:38:41 <efried> Okay; any other open discussion topics?
14:39:59 <efried> Here's one then: anyone feel like (co-)proposing a placement (or other scheduler) topic for Berlin?
14:40:59 <cdent> I'm trying really hard to avoid presenting at summit.
14:41:32 <cdent> jaypipes: you continuing your plan of not going?
14:42:08 <gibi> efried: I and others proposed one for bandwidth
14:44:39 <efried> Okay. I'm about 20% motivated to propose another placement update like the last one. It went well, but not sure the value:effort ratio is high enough.
14:45:07 <cdent> efried: check with me about joining for one wherever the post berlin one is. I should be off the tc by then
14:45:18 <efried> roger wilco.
14:45:28 <efried> Okay, any other topics before we close?
14:45:48 <cdent> let's call it
14:46:15 <efried> Thanks all.
14:46:15 <efried> #endmeeting