21:00:44 <efried> #startmeeting nova
21:00:45 <openstack> Meeting started Thu Jul 18 21:00:44 2019 UTC and is due to finish in 60 minutes. The chair is efried. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:00:46 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:00:48 <openstack> The meeting name has been set to 'nova'
21:01:10 <takashin> o/
21:01:19 <dustinc_OSCON> o/
21:01:24 <efried> https://wiki.openstack.org/wiki/Meetings/Nova#Agenda_for_next_meeting
21:02:07 <dustinc_OSCON> Only half here, on my phone at a conference.
21:02:48 <artom> Here, but will cowardly flee halfway through to pick up kids
21:03:04 <efried> foreshadowing of PTG
21:03:31 <efried> okay, might as well rip through this
21:03:32 <efried> #topic Last meeting
21:03:32 <efried> #link Minutes from last meeting: http://eavesdrop.openstack.org/meetings/nova/2019/nova.2019-07-11-14.01.html
21:03:36 <efried> any old bidniss?
21:04:21 <efried> #topic Release News
21:04:21 <efried> Spec freeze in one week (July 25th)
21:04:21 <efried> #link spec review etherpad (I'm still tracking this) https://etherpad.openstack.org/p/nova-spec-review-day
21:04:21 <efried> Look for things that aren't Action: author -- these are our responsibility until spec freeze
21:04:45 <efried> #topic Bugs (stuck/critical)
21:04:45 <efried> No Critical bugs
21:04:45 <efried> #link 69 new untriaged bugs (+2 since the last meeting): https://bugs.launchpad.net/nova/+bugs?search=Search&field.status=New
21:04:45 <efried> #link 4 untagged untriaged bugs (-1 since the last meeting): https://bugs.launchpad.net/nova/+bugs?field.tag=-*&field.status%3Alist=NEW
21:04:59 <efried> #topic Gate status
21:04:59 <efried> #link check queue gate status http://status.openstack.org/elastic-recheck/index.html
21:04:59 <efried> #link 3rd party CI status TBD
21:05:28 <efried> You may see infra node counts crest 700 now (+100 ish) thanks to donnyd and fortnebula
21:05:30 <artom> Gate status reminds me, need to hunt for a second +2 for https://review.opendev.org/#/c/670848/
21:05:58 <donnyd> efried: still working on getting to 100
21:06:02 <donnyd> at 60 ATM
21:06:11 <efried> artom: can we get that passing zuul?
21:06:44 <artom> efried, yeah, doh :( But I'm at the mercy of rechecks
21:07:31 <efried> btw, donnyd I've noticed a fair number of timeouts on jobs being run on fortnebula. That ^ is one of them. Here's another: https://review.opendev.org/#/c/671341/
21:07:43 <efried> no idea if that's coincidence or anything you can tweak or what.
21:08:00 <efried> but thought you would want to know
21:08:10 <efried> #topic Reminders
21:08:10 <efried> any?
21:08:30 <efried> #topic Stable branch status
21:08:30 <efried> #link stable/stein: https://review.openstack.org/#/q/status:open+(project:openstack/os-vif+OR+project:openstack/python-novaclient+OR+project:openstack/nova)+branch:stable/stein
21:08:30 <efried> #link stable/rocky: https://review.openstack.org/#/q/status:open+(project:openstack/os-vif+OR+project:openstack/python-novaclient+OR+project:openstack/nova)+branch:stable/rocky
21:08:30 <efried> #link stable/queens: https://review.openstack.org/#/q/status:open+(project:openstack/os-vif+OR+project:openstack/python-novaclient+OR+project:openstack/nova)+branch:stable/queens
21:08:41 * melwitt irc client weirdly finally showed the messages in this channel
21:08:55 <efried> o/ melwitt, anything to add on topics so far?
21:09:05 <melwitt> nay
21:09:11 <efried> #topic Sub/related team Highlights
21:09:11 <efried> Placement (cdent)
21:09:11 <efried> #link latest pupdate http://lists.openstack.org/pipermail/openstack-discuss/2019-July/007716.html
21:10:02 <efried> not a lot of change since last week. Looking for update to the
21:10:02 <efried> #link NUMA topology in placement spec https://review.opendev.org/#/c/552924/
21:10:19 <efried> ...to make use of recent placement features.
21:10:36 <efried> API (gmann)
21:10:36 <efried> Updates on ML- #link http://lists.openstack.org/pipermail/openstack-discuss/2019-July/007877.html
21:10:43 <melwitt> so that spec will be the one for consuming placement things in order to implement vGPU with affinity?
21:11:11 <melwitt> ok, I think so
21:11:41 <efried> melwitt: more broadly, nova modeling more things, including *CPU and MEMORY_MB, in placement, so we can do affinity at GET /a_c time rather than late in the NUMATopologyFilter.
21:11:56 <efried> s/more things/more NUMA things/
21:12:22 <donnyd> efried: yea it's from the storage being too slow, still waiting on gear to fix it
21:12:28 <efried> ack, thanks donnyd
21:12:33 <melwitt> efried: ok, so is a new/different spec expected for the vGPU stuff specifically?
21:13:01 <efried> melwitt: you refer to
21:13:01 <efried> #link NUMA affinity with VGPU https://review.opendev.org/#/c/650963/
21:13:05 <melwitt> yes
21:13:17 <efried> I've been recalcitrant on that one because I don't see it moving us forward.
21:13:38 <efried> bauzas has convinced me that there are pieces of it that would make sense to do anyway
21:13:48 <efried> so if he wanted to pare it down to just those pieces, I would get out of the way.
21:14:11 <melwitt> I understand. I'm trying to determine what is the way forward with consuming the placement stuff for the vGPU affinity
21:14:41 <melwitt> like do we draw up a spec for that? or is the spec you mentioned earlier what would cover it? or other?
21:15:42 <efried> I could see the first spec covering VCPU, PCPU, MEMORY_MB, and VGPU
21:15:48 <efried> we have all the placement tools to make that work.
21:15:58 <artom> Maybe one thing at a time?
21:16:08 <efried> that --^ is "one thing" IMO
21:16:29 <efried> we can't really reshape proc/mem resources into child providers without also dealing with the existing PGPU child provider
21:16:35 <artom> I get that they're all related, and that you can't do affinity until you have them all
21:17:11 <efried> well, it makes no sense to do VCPU without MEMORY_MB or vice versa, and the incremental cost of both vs either is negligible really.
21:17:23 <artom> Yeah, so cpus and mem together
21:17:33 <efried> incremental cost of putting VGPU in the mix is also less than if we tried to do it separately IMO.
21:17:34 <artom> But GPUs can stay on the compute node, no?
21:17:42 <efried> VGPUs are not on the compute RP.
21:17:49 <efried> they're already in a child PGPU provider.
21:18:07 <artom> Sorry, right, but that child is a child of the compute node, not a NUMA node
21:18:16 <efried> correct, because we don't have numa nodes yet
21:18:44 <artom> So that can stay as is until we move VCPUs and MEM?
21:18:46 <efried> so we could leave it alone, and it would be sitting there as a peer of numa nodes, and we couldn't support affinity with it, and we would have to do another reshape later to put it under the right NUMA node.
21:18:59 <artom> We don't support affinity for it now...
21:19:21 <artom> Are reshapes expensive?
21:19:23 <efried> but IMO that's more painful than just moving the PGPU provider at the same time as we create & reshape to the NUMA providers.
21:19:46 <efried> in the sense that you can only do them on release boundaries (as the current rules are written)
21:19:49 <artom> My gut is that many less complex reshapes are better than 1 big one
21:21:13 <efried> the two efforts are going to be intertwined no matter what. Whichever effort you implement second is going to have to accommodate changes made by the first.
21:22:38 <efried> If the PGPU provider sits parallel with the NUMA providers, then the PGPU affinity spec still needs to deal with the fact that the procs/mem it's looking at are no longer on the root RP.
21:22:38 <efried> And if you implement the VGPU affinity thing first, you'll have to change that code likewise ^ when you do the reshape & affinity work for the proc/mem splitup
21:22:48 <efried> but sure, we could do that.
21:22:53 <efried> to what advantage?
21:23:15 <efried> anyway, this is probably moot until we have actual proposals to look at.
21:23:16 <artom> Ah, you're assuming that GPUs would get reshaped first
21:23:26 <efried> no, I cited either direction above.
21:23:31 <artom> (which makes sense, given that melwitt is pushing for it)
21:24:02 <artom> Yeah, it's moot. I'll only be happy to be proven wrong
21:24:23 <artom> I'm still scarred by that GPU reshape I reviewed a while ago
21:24:52 <efried> Realistically, considering this would not be a trivial spec, and considering it effectively hasn't been started yet, I don't see proc/mem split landing in Train.
21:25:15 <melwitt> I don't think I'm pushing for anything in particular, just that we (RH) would like to deliver NUMA affinity with vGPU and I wanted to find out what are the next steps as far as which spec we need
21:26:28 <efried> At this point I don't even remember which bit of the vgpu affinity spec bauzas convinced me was still useful.
21:27:41 <melwitt> yeah, I don't know about that. but I was wondering should we be updating that spec to be about consuming placement features for vGPU affinity or is there an already existing spec we need to update or do we need a new spec
21:29:07 <efried> Okay, I will try to answer this specific question.
21:29:07 <efried> Without splitting proc/mem into numa RPs, there's nothing that can be done placement-wise for vgpu affinity, so https://review.opendev.org/#/c/650963/ is the only way vgpu affinity is happening.
21:29:43 <melwitt> it sounds like it would be either the "all the things" spec or a new spec that is only about the vGPU part, with the latter being of questionable easier-ness
21:29:51 <melwitt> I see, ok
21:30:09 <melwitt> thanks
21:31:52 <efried> The general numa-modeling-in-placement spec (https://review.opendev.org/#/c/552924/) could be taken in two different directions: a) proc/mem; or b) proc/mem/vgpu. If a), then we would eventually need a third spec to describe VGPU affinity via placement-isms
21:32:46 <efried> but if b) then we could do away with https://review.opendev.org/#/c/650963/ entirely
21:32:46 <efried> or
21:32:46 <efried> implement https://review.opendev.org/#/c/650963/ and then have to retrofit it when we do b)
21:33:01 <efried> dead horse at this point?
21:33:17 <efried> #topic Stuck Reviews
21:33:17 <efried> any?
21:33:27 <efried> (other than perhaps the vgpu affinity one :P )
21:33:57 <efried> #topic Review status page
21:33:57 <efried> #link http://status.openstack.org/reviews/#nova
21:33:57 <efried> Count: 461 (-0); Top score: 1433 (+21)
21:33:57 <efried> #help Pick a patch near the top, shepherd it to closure
21:34:07 <efried> #topic Open discussion
21:34:34 <efried> any?
21:34:40 <efried> Anything else to discuss?
21:35:24 <efried> Thanks all
21:35:24 <efried> o/
21:35:24 <efried> #endmeeting
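
Note on the reshape options discussed under "Sub/related team Highlights": the three provider-tree layouts that came up (today's layout, option a) proc/mem only, option b) proc/mem/vgpu) can be pictured with a small toy model. This is only a sketch for illustration, not nova or placement code: the `Provider` class, the `is_numa` flag, the provider names, and all inventory totals are hypothetical. Only the resource class names (VCPU, MEMORY_MB, VGPU) and the parent/child arrangements come from the discussion.

```python
from dataclasses import dataclass, field


@dataclass
class Provider:
    """Hypothetical stand-in for a placement resource provider."""
    name: str
    inventory: dict = field(default_factory=dict)  # resource class -> total
    children: list = field(default_factory=list)
    is_numa: bool = False  # illustrative flag, not a real placement attribute

    def walk(self):
        """Yield this provider and every provider below it."""
        yield self
        for child in self.children:
            yield from child.walk()

    def classes_in_subtree(self):
        """Resource classes offered anywhere in this provider's subtree."""
        return {rc for p in self.walk() for rc in p.inventory}


def current_tree():
    # Today: proc/mem on the root compute RP, VGPU on a child PGPU provider.
    pgpu = Provider("pgpu0", {"VGPU": 4})
    return Provider("compute", {"VCPU": 16, "MEMORY_MB": 65536}, [pgpu])


def option_a_tree():
    # a) proc/mem reshaped into NUMA providers; PGPU left as their peer.
    numa0 = Provider("numa0", {"VCPU": 8, "MEMORY_MB": 32768}, is_numa=True)
    numa1 = Provider("numa1", {"VCPU": 8, "MEMORY_MB": 32768}, is_numa=True)
    pgpu = Provider("pgpu0", {"VGPU": 4})
    return Provider("compute", children=[numa0, numa1, pgpu])


def option_b_tree():
    # b) same reshape, with the PGPU provider moved under its NUMA node.
    pgpu = Provider("pgpu0", {"VGPU": 4})
    numa0 = Provider("numa0", {"VCPU": 8, "MEMORY_MB": 32768}, [pgpu],
                     is_numa=True)
    numa1 = Provider("numa1", {"VCPU": 8, "MEMORY_MB": 32768}, is_numa=True)
    return Provider("compute", children=[numa0, numa1])


def affinity_expressible(root, wanted):
    """True if a single NUMA subtree offers every wanted resource class."""
    wanted = set(wanted)
    return any(p.is_numa and wanted <= p.classes_in_subtree()
               for p in root.walk())
```

In this toy model, only the option b) layout lets one NUMA subtree satisfy VCPU, MEMORY_MB, and VGPU together, which matches the point made in the meeting that leaving the PGPU provider as a peer of the NUMA nodes would need another reshape later before affinity could be expressed through placement.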