14:03:20 <gibi> #startmeeting nova
14:03:20 <openstack> Meeting started Thu Oct 4 14:03:20 2018 UTC and is due to finish in 60 minutes. The chair is gibi. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:03:21 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:03:23 <openstack> The meeting name has been set to 'nova'
14:03:34 <takashin> o/
14:03:36 <tssurya> oops o/ again
14:03:38 <Luzi> o/
14:03:39 <gmann> o/
14:03:44 <dansmith> o/
14:03:44 <efried> ō/
14:03:47 <mhen> o/
14:04:06 <edleafe> \o
14:04:08 <gibi> hi, I think I will be your guide today
14:04:13 <bauzas> \o
14:04:31 <gibi> let's get started
14:04:36 <gibi> #topic Release News
14:04:42 <gibi> #link Stein release schedule: https://wiki.openstack.org/wiki/Nova/Stein_Release_Schedule
14:04:50 * cdent wanders in late. time is hard.
14:05:08 <gibi> #link Stein runway etherpad: https://etherpad.openstack.org/p/nova-runways-stein
14:05:15 <gibi> #link runway #1: https://blueprints.launchpad.net/nova/+spec/use-nested-allocation-candidates (gibi) [END: 2018-10-04] next patch is https://review.openstack.org/583667
14:05:31 <gibi> #link runway #2: https://blueprints.launchpad.net/nova/+spec/vmware-live-migration (rgerganov) [END: 2018-10-12] one patch https://review.openstack.org/270116
14:05:55 <gibi> #link runway #3: https://blueprints.launchpad.net/nova/+spec/boot-instance-specific-storage-backend (brinzhang) [END: 2018-10-14]
14:06:09 <gibi> the runway queue is empty
14:06:35 <gibi> anything about release schedule or runways to discuss?
14:07:23 <gibi> then moving on
14:07:28 <gibi> #topic Bugs
14:07:36 <gibi> no critical bugs
14:08:08 <gibi> #link 60 new untriaged bugs (small decrease since last meeting): https://bugs.launchpad.net/nova/+bugs?search=Search&field.status=New
14:08:36 <gibi> #link 5 untagged untriaged bugs (down 18 since the last meeting): https://bugs.launchpad.net/nova/+bugs?field.tag=-*&field.status%3Alist=NEW
14:08:47 <gibi> that is actually pretty good ^^
14:08:53 <gibi> thanks to whoever did the triage
14:09:09 <gibi> #link bug triage how-to: https://wiki.openstack.org/wiki/Nova/BugTriage#Tags
14:09:22 <gibi> anything about bugs to discuss?
14:09:41 <sean-k-mooney> https://bugs.launchpad.net/nova/+bug/1795920
14:09:41 <openstack> Launchpad bug 1795920 in OpenStack Compute (nova) "SR-IOV shared PCI numa not working " [Undecided,Confirmed]
14:10:02 <sean-k-mooney> we did not fully implement the spec resulting in this bug
14:10:20 <sean-k-mooney> i am going to repropose the spec for completion if there are no objections
14:11:05 <gibi> sean-k-mooney: makes sense to me. Thanks for taking care of it
14:11:50 <gibi> I think the silence means no objection :)
14:11:58 <gibi> anything else?
14:12:02 <gibi> on bugs
14:12:15 <gibi> then
14:12:17 <gibi> Gate status
14:12:20 <gibi> #link check queue gate status http://status.openstack.org/elastic-recheck/index.html
14:12:20 <bauzas> gibi: you're correct on the assumption :)
14:12:27 <gibi> 3rd party CI
14:12:31 <gibi> #link 3rd party CI status http://ci-watch.tintri.com/project?project=nova&time=7+days
14:12:52 <gibi> the gate feels faster for me but the parallel evacuation bug hit us hard
14:12:57 <bauzas> right
14:13:01 <gibi> http://status.openstack.org/elastic-recheck/index.html#1763181
14:13:25 <gibi> mriedem proposed a patch to turn the test off but mdbooth has a fix up as well
14:13:34 * gibi grabbing link
14:13:51 <gibi> #link https://review.openstack.org/#/c/607620
14:14:01 <gibi> #link https://review.openstack.org/#/c/605436
14:14:12 <bauzas> we can skip for now
14:14:15 <bauzas> I'll +W the change
14:14:25 <gibi> bauzas: works for me
14:14:26 <bauzas> and then we can test the change separately
14:14:56 <efried> FWIW, the fix fixes the test case
14:15:16 <efried> But dansmith is concerned (rightly so) about the scope/impact/side effects of the fix
14:15:49 <dansmith> yup
14:15:57 <gibi> I saw that mdbooth answered dansmith in the review so I think that discussion can move forward
14:15:57 <bauzas> efried: sure, but we can ask to remove the skipTest in the change
14:16:06 <efried> yes
14:16:09 <dansmith> I see I have things to read this morning
14:16:10 <bauzas> efried: so while other changes aren't impacted, we can check it works
14:16:21 <bauzas> nothing really blocking the fix
14:16:34 <bauzas> except maybe to have it rebased on top of https://review.openstack.org/#/c/607620
14:16:43 <bauzas> so that it can remove the skipTest
14:17:15 <bauzas> mdbooth : ^
14:17:22 <efried> I just didn't want to have us forget about the fix because it's hard and the failure is no longer occurring because we simply skipped it.
14:17:33 <bauzas> efried: I promise I won't :)
14:17:37 <efried> :)
14:17:41 <bauzas> and I guess matt won't too
14:17:45 <gibi> OK I think we have a clear way forward. moving on :)
14:18:04 <gibi> anything else about gate?
14:18:37 <gibi> #topic Reminders
14:18:46 <gibi> #link high level nova PTG summary: http://lists.openstack.org/pipermail/openstack-dev/2018-September/135122.html
14:18:55 <gibi> #link Stein Subteam Patches n Bugs: https://etherpad.openstack.org/p/stein-nova-subteam-tracking
14:19:23 <gibi> any other reminder to note?
14:19:49 <gibi> #topic Stable branch status
14:19:55 <gibi> #link stable/rocky: https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:stable/rocky,n,z
14:19:59 <gibi> #link stable/queens: https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:stable/queens,n,z
14:20:02 <gibi> #link stable/pike: https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:stable/pike,n,z
14:20:06 <gibi> #link stable/ocata: https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:stable/ocata,n,z
14:20:37 <gibi> mriedem is not here, does somebody have a short summary about the stable status?
14:20:57 <gibi> Are we still working on flushing out the last ocata release before EM?
14:22:04 <gibi> we have a couple of patches in the ocata queue so I guess those need to be released before EM
14:22:35 <gibi> anything else on stable?
14:23:01 <gibi> #topic Subteam Highlights
14:23:04 <mdbooth_> o/
14:23:06 <gibi> Cells v2 (dansmith)
14:23:17 <jding1_> o/
14:23:18 <dansmith> no meeting this week,
14:23:27 <dansmith> but down cell is proceeding,
14:23:44 <dansmith> and matt has a PoC up for the cross-cell migration stuff
14:23:46 <dansmith> which is fairly amazing
14:24:05 <dansmith> oh, and there's a bug with the console stuff we deprecated last cycle
14:24:20 <dansmith> which melwitt has some patches up for.. I haven't fully grokked that situation but I think it's progressing as well
14:24:23 <dansmith> that's about it I think,.
14:24:29 <gibi> thanks
14:24:34 <gibi> Scheduler (efried)
14:24:51 <efried> #link Last NovaScheduler meeting minutes http://eavesdrop.openstack.org/meetings/nova_scheduler/2018/nova_scheduler.2018-10-01-14.00.html
14:25:11 <bauzas> heh
14:25:14 <bauzas> good luck
14:25:26 <efried> We talked about the consumer gen series. I'll get through as much of that as I can today.
14:25:33 <efried> Extraction is still progressing.
14:25:39 <efried> #link ML thread on "intended purpose" of traits http://lists.openstack.org/pipermail/openstack-dev/2018-September/135209.html
14:25:40 <efried> was mentioned and briefly discussed
14:25:47 <efried> Talked a bit further about how to handle configuring min_unit (and others) in a forward-looking (i.e. generic NRP) way. jaypipes agreed to look at the specs that might move us in that direction:
14:25:55 <efried> #link kosamara's spec for modeling passthrough https://review.openstack.org/#/c/591037/
14:25:55 <efried> #link sean-k-mooney's wip spec for generic device discovery/modeling https://review.openstack.org/#/c/603805/
14:26:01 <efried> Idea being to pick one of those so we can narrow our focus and move the ball forward.
14:26:05 <efried> END
14:26:16 <gibi> efried: thanks
14:26:35 <gibi> Notification (gibi)
14:26:49 <gibi> I've cancelled the weekly meeting indefinitely due to low interest: https://review.openstack.org/#/c/607314/
14:27:15 <gibi> so I guess this is the last report
14:27:27 <gibi> but I will still be around for notification work (too)
14:27:29 <gibi> END
14:27:34 <gibi> API (gmann)
14:27:48 <gibi> gmann left note on the wiki: no office hour this week.
14:27:49 <gmann> no API office hour this week. i have added API related stein items in subteam etherpad last week #link https://etherpad.openstack.org/p/stein-nova-subteam-tracking
14:28:13 <gmann> no other update to share. will start reviewing those items
14:28:20 <gibi> gmann: thanks
14:28:29 <gibi> anything else subteam related?
14:28:57 <gibi> #topci Stuck Reviews
14:29:04 <gibi> nothing on the agenda
14:29:15 <gibi> does anybody want to mention something here?
14:29:28 <gibi> #topic Stuck Reviews
14:29:46 <bauzas> gibi: #undo is cool for such things ;)
14:30:01 <gibi> bauzas: the keyword was wrong, so nothing to undo
14:30:20 <gibi> #topic Open discussion
14:30:32 <gibi> we have one item on the agenda
14:30:33 <gibi> HPET support on x86 guests: https://blueprints.launchpad.net/nova/+spec/support-hpet-on-guest
14:30:40 <efried> I took a preliminary look at the code:
14:30:40 <efried> #link HPET patch https://review.openstack.org/#/c/605902/
14:30:46 * bauzas won't make the joke again
14:30:56 <efried> The thing I thought we might want to discuss is a) how strict we should be in terms of making sure we get on a HPET-capable host if that was requested in the flavor
14:31:07 <efried> and b) whether we should be using traits in here
14:31:19 <efried> The case for b) becomes stronger if a) is a yes.
14:31:33 <efried> rather, if a) is "we should be strict"
14:31:52 <sean-k-mooney> efried: if we request a hpet in the guest i think we should refuse to spawn if the host can't support it
14:32:22 <gibi> efried: would libvirt fail to spawn if the host does not support hpet ?
14:32:22 <efried> Okay. Not really having a handle on how users perceive this thing, that was my reaction too.
14:32:39 <efried> No, right now the code is set up to be really forgiving. If you include the extra spec, but you spell it wrong, or the host isn't capable, it just spawns anyway with HPET off.
14:32:56 <gibi> efried: I see
14:33:17 <efried> If we make it stricter just in the driver, then the failure would be late, which isn't ideal, and is precisely what traits are meant for.
14:33:21 <sean-k-mooney> well spelling mistakes can be ignored but the latter is a bug in my view
14:33:46 <efried> we can't do anything about spelling the key wrong. But if you misspell "treu" that should be an early failure.
14:33:50 <sean-k-mooney> efried: we could report hpet support as a cpu/compute node trait
14:33:57 <efried> yes, sean-k-mooney, exactly.
14:34:02 <gibi> efried: I totally agree that if we want to fail we should fail in placement GET a_c
14:34:30 <efried> and either parlay hw:hpet=True into a required=<trait_name> or require the flavor to include the latter explicitly.
14:35:06 <efried> This goes back to the discussion about having to say a thing twice, once to schedule and once to turn it on.
14:35:23 <efried> which is not an ideal ux
14:35:55 <sean-k-mooney> efried: well we do not want traits to enable things correct so if we removed one of the two we would keep the extra spec and have nova generate the trait
14:35:56 <efried> if we make a call for this case, it sets a precedent for the whole ironic deploy template design too...
14:36:03 <gibi> I like the idea of saying once "I want this to be turned on" and as a result the instance gets scheduled accordingly and the feature is turned on in the backend
14:36:15 <efried> sean-k-mooney: I think that's probably the decision least likely to result in all-out war.
14:36:53 <efried> The problem of course being that it's kind of arcane knowledge that some extra specs get parlayed into required traits. Doc doc doc.
14:36:53 * sean-k-mooney wait i'm not on the all-out war side this feels weird
14:37:40 <bauzas> well
14:37:59 <sean-k-mooney> efried: that is true but i think that is preferable to documenting you must add extra spec x and trait y
14:38:02 <bauzas> I don't have a particular opinion on that, except that fat-fingering shouldn't be a nova issue
14:38:39 <bauzas> whether it should be a direct extraspec trait or something we would transform is not really important for me
14:38:47 <sean-k-mooney> typos could be addressed separately with the extra spec validation proposal so i think that's a separate issue
14:38:52 <efried> Using strict=True when interpreting the bool value
14:39:09 <bauzas> if you make typos in your flavor, it's your fault, right?
14:39:09 <efried> seems like a pretty low-risk, low-cost validation strategy
14:39:35 <efried> to some extent, yes. Like I say, there's no way we can validate the key, because there's not a prescriptive set of keys - you can put anything you want in there.
14:39:52 <jding1_> I second that document both but only expose hw:hpet to extra spec and let now generate traits
14:39:53 <efried> But we can at least validate that the value - of a known key - matches the "data type" we expect.
14:40:06 <jding1_> let nova generate traits
14:40:12 <efried> I would like to hear jaypipes and/or dansmith weigh in on this
14:40:43 <dansmith> sorry, something unfortunate came up that is distracting me from paying attention here
14:40:48 <sean-k-mooney> efried: there is a schema for validation via the glance metadef api already
14:40:55 <sean-k-mooney> for extra spec validation that is
14:41:12 <gibi> efried: then I guess we need to wait for them to express their opinion on the review
14:41:34 <dansmith> is this spec discussion about hpet and how to request it from placement?
14:41:53 <gibi> dansmith: I think it is a specless bp right now
14:42:04 <gibi> dansmith: and an implementation patch
14:42:14 <dansmith> okay
14:42:22 <dansmith> I've only skimmed,
14:42:31 <efried> To summarize:
14:42:31 <efried> We add a trait, say HPET_CAPABLE, which x86 libvirt adds to the host RP next to other "capabilities".
14:42:31 <efried> Operator puts extra spec hw:hpet=True in their flavor.
14:42:31 <efried> Nova looks at hw:hpet=True and adds required=HPET_CAPABLE to the GET /a_c request
14:42:31 <efried> libvirt sees hw:hpet=True and switches HPET on in the guest.
14:42:54 <dansmith> efried: yeah, was just typing that out.. what's wrong with that approach?
14:43:15 <efried> dansmith: Nothing, I'm good with it. Wanted to make sure you were.
14:43:26 * cdent is confused
14:43:27 <dansmith> am I missing the thing I should hate there?
14:43:38 <cdent> why is this different from requiring a trait?
14:43:42 <efried> Well, I find it slightly arcane that we're special-casing an extra spec to push a required trait to the placement request
14:43:47 <cdent> jinx
14:43:52 <dansmith> we're not?
14:43:52 <bauzas> efried: just one point
14:44:02 <efried> cdent: Oh, because we're using it to switch on a thing in the guest, as well as making it part of the scheduling decision.
14:44:02 <dansmith> you can already do required=$trait in a flavor yeah?
14:44:10 <dansmith> oh, I see, but..
14:44:13 <bauzas> efried: we're talking a lot of transforming extra specs into traits and/or request groups
14:44:16 <dansmith> isn't that kinda the approach we have with GPUs already?
14:44:30 <bauzas> efried: I think someone should take his guts and write something clean for that
14:44:30 <efried> dansmith: This is the whole crux of the ironic deploy template discussion right now.
14:44:41 <bauzas> efried: it could be me because $NUMA, or it could be you
14:44:45 <cdent> the use of traits in the GET /a_c part seems solid and normal
14:44:47 <dansmith> efried: it is, except that the deploy template is more than just one thing
14:45:01 <bauzas> dansmith: we directly ask a resource class in the flavor
14:45:02 <cdent> it's the issue of having to transform from hw:hpet=True that is odd
14:45:13 <efried> dansmith: take the UEFI example - that's just one thing, right?
14:45:13 <dansmith> cdent: we don't
14:45:15 <bauzas> dansmith: here, I think efried is asking to hide this into an extra spec
14:45:22 <cdent> dansmith: that's what efried described above
14:45:27 <bauzas> I'm not opposed to, I just want to make the mapping easy
14:45:39 <dansmith> cdent: oh sorry, I see, I missed those two steps
14:45:47 <dansmith> why wouldn't we just put the required trait in the flavor?
14:45:53 <cdent> bingo
14:46:04 <dansmith> that said, the whole point of the request filter stuff is to translate novaisms into placementisms to some degree
14:46:07 <efried> dansmith: Because then the operator has to remember to say both things, which kind of sucks.
14:46:40 <dansmith> efried: oh, because libvirt doesn't see the request we made to placement, thus doesn't see that we asked for it,
14:46:44 <dansmith> but it can see the flavor
14:46:50 <bauzas> efried: like I said, my NUMA spec proposes some transformation-ism from an extra spec to a list of numbered request groups
14:46:51 <dansmith> and that the flavor asked for that trait
14:46:59 * dansmith is catching up
14:47:15 <bauzas> wait
14:47:20 <bauzas> we have allocations, right?
14:47:29 <sean-k-mooney> dansmith: even if it could see the trait we do not enable capabilities based on traits
14:47:38 <bauzas> for VGPUs, we exactly know what the user asked
14:47:40 <dansmith> sean-k-mooney: currently.
14:47:55 <dansmith> this seems like the trait version of what we do for GPU to me
14:47:55 <efried> So here's the spectrum:
14:47:56 <efried> - Operator says hw:hpet=True and we magically add required=HPET_CAPABLE to the placement call.
14:47:56 <efried> - Operator says required=HPET_CAPABLE and libvirt uses that as its prompt to set it on the guest <== this is what jaypipes hates
14:47:56 <efried> - Operator says both hw:hpet=True,required=HPET_CAPABLE, which is poor ux because if he forgets one or the other, he doesn't get what he wants.
14:48:05 <sean-k-mooney> dansmith: or ever if i understand jaypipes view on that
14:48:10 <dansmith> efried: jaypipes said he hates it?
14:48:21 <efried> Well, he hates the principle
14:48:22 <dansmith> I guess that's on the review?
14:48:32 <efried> he hasn't weighed in on HPET specifically
14:48:46 <efried> but using traits to effect guest config he has come down hard on.
14:48:51 <dansmith> I feel like there might be some confusion about how this does or does not overlap with the ironic case
14:49:10 <bauzas> oh wait
14:49:13 <dansmith> I think that is because of the traits being complex in the ironic case, and needing to be basically parsed to extract the key=value aspect
14:49:14 <sean-k-mooney> dansmith: jaypipes has said previously that we should not use required=X to enable X in the context of secure boot and other things
14:49:16 <dansmith> but I could be wrong
14:49:16 <bauzas> for VGPUs, we only get allocations
14:49:17 <efried> For the ironic case, if we just stick to the UEFI case, which is 1:1, it's a good parallel.
14:49:25 <bauzas> so we don't really know the traits
14:49:41 <efried> but yeah, the multifaceted deploy templates gets off into deeper weeds.
14:49:42 <bauzas> we only see whether we have a specific resource class
14:49:45 <dansmith> can we maybe just schedule a hangout with the man himself instead of everyone parroting what they think he intends?
14:49:56 <bauzas> for HPET, we would need to pass the request down the virt driver
14:50:10 <dansmith> I think we could burn through this quicker that way anyway
14:50:15 <dansmith> instead of everyone watching this
14:50:33 <efried> wfm. jding1_ would you be available for a google hangout?
14:50:35 <bauzas> +
14:50:40 <bauzas> +1 even
14:50:50 <bauzas> but I have to run in a call right after this meeting
14:51:04 <jding1_> will see what time
14:51:09 <efried> jding1_: and cfriesen too
14:51:27 <efried> so
14:51:30 <sean-k-mooney> sure that said i also do not like using traits to enable features e.g. required=hpet_capable gives you a hpet unless i can do that with everything like hugepages
14:51:49 <efried> It's probably worth writing up these three alternatives at least into the blueprint template, if not into a short spec.
14:51:59 <gibi> efried: I agree
14:52:37 <efried> cfriesen: You may want to read the scrollback. Or I can catch you up in -nova after the meeting.
14:53:03 * cdent ducks out early
14:53:24 <gibi> dansmith: will you organize the hangouts session for this?
14:53:28 <cfriesen> efried: checking scrollback now
14:53:49 <dansmith> gibi: that won't go well, given my week
14:54:03 <sean-k-mooney> cfriesen: did you also have a topic related to vtpm that you wanted to get eyes on?
14:54:05 <dansmith> I'm sure efried wants that job
14:54:08 <efried> I can write words, but not sure where I should put them.
14:54:26 <gibi> efried: put it in the blueprint as a start
14:54:29 <cfriesen> sean-k-mooney: yeah, I wasn't sure if I was going to make it so I didn't put it on the agenda
14:54:37 <efried> I could work with cfriesen/jding1_ on a short spec and we could discuss there, if that'll ease schedules.
14:54:46 <gibi> efried: thank you
14:55:03 <efried> I don't think I have authority to edit the bp, and not sure the whiteboard is a good spot for it.
14:55:44 <cfriesen> If people have a few minutes...there's a sort of similar question around how flexible to make the virtual TPM stuff. (https://review.openstack.org/#/c/571111)
14:55:47 <sean-k-mooney> cfriesen: i think we are just done with the hpet topic if you want to use the last 5 mins
14:55:58 <sean-k-mooney> :)
14:56:06 <efried> cfriesen, jding1_: I'll write up the essentials and propose a spec review we can discuss on, and then give it over to y'all to fill in the details and make it buildable?
14:56:15 <jaypipes> sorry folks, reading back...
14:56:42 <gibi> cfriesen: go ahead you have 3 minutes :)
14:56:54 <cfriesen> efried: sounds good. Sean had suggested specifying the tpm type and version number, (with version defaulting to something if not specified). then nova would translate that to a resource/trait request.
14:57:12 <efried> so same design pattern
14:57:12 <cfriesen> alternately we could have the flavor specify it directly in placement terminology
14:57:17 <jding1_> efried: sounds good
14:58:04 <cfriesen> so for tpm we have an actual spec
14:58:24 <cfriesen> maybe we want to put the alternatives in there and use that as proxy for the hpet discussion?
14:59:05 <sean-k-mooney> efried: yes same pattern. if we model hugepages in placement in the future i would also like to translate hw:mem_page_size into traits too
14:59:14 <gibi> cfriesen: hpet is a bit special as it only uses traits not resource classes
14:59:27 <cfriesen> gibi: ah, good point
14:59:35 <gibi> we are running out of time
14:59:47 <gibi> continue this on #openstack-nova
14:59:51 <cfriesen> yep
14:59:58 <gibi> thank you all
15:00:02 <gibi> #endmeeting
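
Editor's note: the first option in efried's spectrum (14:47:56), where the operator sets only hw:hpet=True and nova adds the required trait to the placement request, might look roughly like the sketch below. This is an illustration under stated assumptions, not nova code. HPET_CAPABLE is the placeholder trait name used in the meeting, and strict_bool, required_traits and allocation_candidates_query are hypothetical helpers written for this note. The strict parsing stands in for the "strict=True when interpreting the bool value" idea from 14:38:52, so a value such as "treu" fails early while a misspelled key is simply not seen.

```python
# Hypothetical sketch of "extra spec -> required trait" translation.
# None of these names are real nova APIs; HPET_CAPABLE is the placeholder
# trait name mentioned in the meeting.

TRUE_STRINGS = {"1", "t", "true", "on", "y", "yes"}
FALSE_STRINGS = {"0", "f", "false", "off", "n", "no"}

HPET_EXTRA_SPEC = "hw:hpet"
HPET_TRAIT = "HPET_CAPABLE"


def strict_bool(value):
    """Parse a boolean-like string; raise ValueError if it is unrecognized."""
    lowered = str(value).strip().lower()
    if lowered in TRUE_STRINGS:
        return True
    if lowered in FALSE_STRINGS:
        return False
    raise ValueError(f"Unrecognized boolean value: {value!r}")


def required_traits(extra_specs):
    """Return the traits the scheduler should require for these extra specs."""
    required = set()
    # Only a known key can be validated; a misspelled key is never looked up.
    if HPET_EXTRA_SPEC in extra_specs:
        if strict_bool(extra_specs[HPET_EXTRA_SPEC]):
            required.add(HPET_TRAIT)
    return required


def allocation_candidates_query(extra_specs):
    """Build the 'required' query parameter for the GET /a_c request."""
    traits = sorted(required_traits(extra_specs))
    return {"required": ",".join(traits)} if traits else {}
```

With this shape the operator states the intent once in the flavor. The virt driver would still read hw:hpet from the flavor to switch HPET on in the guest, which keeps the trait purely a scheduling input.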