21:00:07 <melwitt> #startmeeting nova
21:00:08 <openstack> Meeting started Thu Feb 28 21:00:07 2019 UTC and is due to finish in 60 minutes. The chair is melwitt. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:00:09 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:00:12 <openstack> The meeting name has been set to 'nova'
21:00:14 <mriedem> o/
21:00:15 <melwitt> hi everyone, welcome to the nova meeting
21:00:18 <dansmith> o/
21:00:19 <takashin> o/
21:00:21 <melwitt> agenda https://wiki.openstack.org/wiki/Meetings/Nova
21:00:21 <artom> ~o~
21:00:33 <mriedem> that's my move
21:00:46 <melwitt> let's make a start
21:00:47 <edleafe> \o
21:00:49 <melwitt> #topic Release News
21:00:56 <melwitt> #link Stein release schedule: https://wiki.openstack.org/wiki/Nova/Stein_Release_Schedule
21:01:03 <melwitt> #info non-client library freeze is today Feb 28, os-vif 1.15.1 was released, os-resource-classes 0.3.0 was released. os-traits did not have anything new to release since last version.
21:01:19 <melwitt> so all of our non-client library releases are done
21:01:26 <melwitt> #info s-3 feature freeze is March 7
21:01:29 <melwitt> one week away
21:01:37 <melwitt> #link Stein blueprint status tracking: https://etherpad.openstack.org/p/nova-stein-blueprint-status
21:01:53 <melwitt> we're tracking progress here ^
21:02:00 <efried> o/
21:02:01 <melwitt> #link Stein RC potential changes tracking: https://etherpad.openstack.org/p/nova-stein-rc-potential
21:02:25 <melwitt> RC potential blocker bugs and other related RC stuff goes here ^
21:02:39 <melwitt> #link Stein runway etherpad: https://etherpad.openstack.org/p/nova-runways-stein
21:02:47 <melwitt> #link runway #1: https://blueprints.launchpad.net/nova/+spec/flavor-extra-spec-image-property-validation (jackding) [END 2019-03-06] https://review.openstack.org/#/c/620706/ Flavor extra spec and image properties validation
21:02:53 <melwitt> #link runway #2: https://blueprints.launchpad.net/nova/+spec/ironic-conductor-groups (jroll) [END 2019-03-06] https://review.openstack.org/#/c/635006/ ironic: partition compute services by conductor group
21:03:00 <melwitt> this is merged and bp marked as complete today ^
21:03:08 <melwitt> #link runway #3: https://blueprints.launchpad.net/nova/+spec/enable-rebuild-for-instances-in-cell0 (ttsiouts) [END 2019-03-07 - feature freeze] https://review.openstack.org/570201
21:03:34 <melwitt> does anyone have anything else to mention for release news? or questions?
21:03:55 <melwitt> ok, moving on
21:03:57 <melwitt> #topic Bugs (stuck/critical)
21:04:02 <melwitt> no critical bugs
21:04:09 <melwitt> #link 69 new untriaged bugs (up 5 since the last meeting): https://bugs.launchpad.net/nova/+bugs?search=Search&field.status=New
21:04:16 <melwitt> #link 9 untagged untriaged bugs (up 3 since the last meeting): https://bugs.launchpad.net/nova/+bugs?field.tag=-*&field.status%3Alist=NEW
21:04:25 <melwitt> #link bug triage how-to: https://wiki.openstack.org/wiki/Nova/BugTriage#Tags
21:04:31 <melwitt> #help need help with bug triage
21:05:08 <melwitt> when doing bug triage, use the nova-stein-rc-potential bug tag for potential RC blockers
21:05:47 <melwitt> #link ML post http://lists.openstack.org/pipermail/openstack-discuss/2019-February/003343.html
21:05:59 <melwitt> Gate status
21:06:04 <melwitt> #link check queue gate status http://status.openstack.org/elastic-recheck/index.html
21:06:10 <melwitt> 3rd party CI
21:06:15 <melwitt> #link 3rd party CI status http://ciwatch.mmedvede.net/project?project=nova&time=7+days
21:06:37 <melwitt> anything else to mention for bugs or gate/CI?
21:06:56 <melwitt> ok, continuing
21:07:05 <melwitt> #topic Stable branch status
21:07:13 <melwitt> #link stable/rocky: https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:stable/rocky,n,z
21:07:31 <melwitt> very few rocky backports proposed
21:07:38 <melwitt> #link stable/queens: https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:stable/queens,n,z
21:07:42 <efried> That must mean rocky was perfect
21:07:56 <melwitt> yeah that's what I assume
21:07:56 * artom prefers Rocky II personally
21:08:08 <melwitt> queens backports could use some review help
21:08:14 <melwitt> lots o backports
21:08:20 <melwitt> #link stable/pike: https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:stable/pike,n,z
21:08:31 <efried> something something Stallone can beat up Freddy Mercury
21:08:32 <melwitt> bunch for pike too. help wanted
21:08:58 <melwitt> I'll propose stable releases next week on s-3 day
21:09:24 <melwitt> since we usually aim for doing stable release at milestone
21:09:40 <melwitt> maybe we should do that a week later bc FF actually
21:10:19 <melwitt> so maybe the week after FF aim to flush stable reviews and release
21:10:35 <melwitt> anything else for stable branches before we move on?
21:10:46 <melwitt> ok
21:10:50 <melwitt> #topic Subteam Highlights
21:11:01 <melwitt> efried: any updates for scheduler?
21:11:07 <efried> you know it
21:11:14 <efried> #link n-sch minutes http://eavesdrop.openstack.org/meetings/nova_scheduler/2019/nova_scheduler.2019-02-25-14.00.html
21:11:22 <efried> We discussed
21:11:22 <efried> #link alloc cands in_tree series starting at https://review.openstack.org/#/c/638929/
21:11:22 <efried> ...which has since merged \o/ (microversion 1.31)
21:11:38 <efried> We discussed
21:11:38 <efried> #link the OVO-ectomy https://review.openstack.org/#/q/topic:cd/less-ovo+(status:open+OR+status:merged)
21:11:38 <efried> ...all of which has since merged. There is a continuation of
21:11:38 <efried> #link refactors and cleanup currently starting at https://review.openstack.org/#/c/637325/
21:11:51 <efried> We discussed
21:11:51 <efried> #link libvirt reshaper (new bottom of series) https://review.openstack.org/#/c/636591/
21:11:51 <efried> That bottom patch has merged, and the rest of the series is mostly green except for one issue noted at
21:11:52 <efried> #link what happens to mdevs on reboot? https://review.openstack.org/#/c/636591/5/nova/virt/libvirt/driver.py@586
21:12:02 <efried> We discussed
21:12:02 <efried> #link ML thread about placement & related bug/bp tracking #link http://lists.openstack.org/pipermail/openstack-discuss/2019-February/003102.html
21:12:02 <efried> As well as another couple of operational things that should be hashed out on the ML, possibly initiated there by the PTL (old or new):
21:12:02 <efried> - Format/fate of the n-sch meeting
21:12:02 <efried> - Placement team logistics at the PTG
21:12:15 <efried> END
21:12:34 <melwitt> cool, so based on that I will mark the in_tree bp as complete
21:12:42 <efried> there is a spec update pending
21:13:07 <melwitt> I think that's ok. I'll bug people to review the spec update
21:13:14 <efried> #link in-tree alloc candidates spec update https://review.openstack.org/#/c/639033/
21:13:24 <efried> yeah, self.review_that_sucker()
21:13:39 <melwitt> yeah, me too
21:13:41 <melwitt> ok
21:13:48 <melwitt> no updates for API from gmann on the agenda
21:13:59 <melwitt> so we'll move on to...
21:14:05 <melwitt> #topic Stuck Reviews
21:14:25 <melwitt> (mriedem): Decide what to do about attaching volumes with tags to shelved offloaded servers for https://review.openstack.org/#/c/623981
21:14:35 <melwitt> #link ML thread with options: http://lists.openstack.org/pipermail/openstack-discuss/2019-February/003356.html
21:14:50 <mriedem> you want me to just go?
21:14:57 <melwitt> yeah, sure
21:14:57 <mriedem> the options are in the email
21:15:06 <mriedem> the way the root detach/attach code is written today,
21:15:14 <mriedem> when detaching a root volume, the tag is reset to None,
21:15:25 <mriedem> with the idea that when you attach a new root volume, you could specify a new tag,
21:15:33 <mriedem> the problem is, root detach/attach is only allowed on shelved offloaded instances,
21:15:47 <mriedem> but the api does not allow you to attach a volume with a tag to a shelved offloaded instance
21:15:50 <mriedem> the tag part specifically
21:16:03 <mriedem> the original thinking was because when we unshelve, we don't know if the compute will support tags
21:16:05 <mriedem> and honor them
21:16:08 <mriedem> however,
21:16:27 <mriedem> that's already a latent bug because i can create a server with device tags, shelve it and then unshelve it and if i land on a host that does not support device tags, it passes but my tags aren't exposed to the guest
21:16:46 <mriedem> that's recorded with bug 1817927
21:16:47 <openstack> bug 1817927 in OpenStack Compute (nova) "device tagging support is not checked during move operations" [Undecided,New] https://launchpad.net/bugs/1817927
21:16:51 <mriedem> same is true for any move operation actually,
21:16:59 <mriedem> because we don't consider the user-requested device tags during scheduling at all, not even create
21:17:05 <mriedem> so,
21:17:23 <mriedem> i think we're restricting attaching volumes with tags to shelved offloaded servers for really no good reason
21:17:46 <artom> I guess realistically, how many people are running heterogeneous clouds with the potential to hit bug 1817927? It was reported by mriedem, not an end user/operator...
21:17:46 <openstack> bug 1817927 in OpenStack Compute (nova) "device tagging support is not checked during move operations" [Undecided,New] https://launchpad.net/bugs/1817927
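(For readers following along: the restriction mriedem is describing lives in the compute API's volume attach path. Below is a minimal, self-contained Python sketch of that kind of guard; the exception and function names are stand-ins for illustration, not quotes of the actual nova source.)

    # Illustrative sketch only -- names are invented, not nova's actual code.
    class TaggedAttachToShelvedNotSupported(Exception):
        """Tagged volume attach was requested for a shelved offloaded server."""

    def check_shelved_offloaded_attach(vm_state, tag):
        # A shelved offloaded server has no compute host, so there is no
        # hypervisor driver to ask whether a block device tag would be
        # honored on unshelve; the API therefore refuses the tag outright.
        if vm_state == 'shelved_offloaded' and tag is not None:
            raise TaggedAttachToShelvedNotSupported()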
21:17:46 <mriedem> the question is what to do about it in the context of the root volume detach/attach series
21:17:57 <mriedem> artom: i would say slim
21:18:05 <artom> mriedem, yeah, so I'm partial to 1, with you
21:18:13 <mriedem> also, looking back,
21:18:28 <mriedem> we probably should have put a policy rule in the api for device tags if your deployment doesn't support them
21:18:29 <melwitt> IIUC, when you create an instance, tags are not guaranteed to be supported by the compute host the server lands on?
21:18:32 <mriedem> like we have for trusted certs
21:18:47 <mriedem> melwitt: correct, and if they land on a compute that doesn't support them during create, it aborts
21:18:54 <mriedem> no reschedule, nothing - you're dead
21:19:00 <artom> melwitt, this was more true in the past with the possibility of older versions, but now it's just about running a supported hypervisor
21:19:09 <melwitt> ok. then I guess I don't see why to restrict it for shelve/unshelve
21:19:10 <mriedem> again correct
21:19:13 <artom> hyperv and xen and libvirt have them (for boot time)
21:19:30 <mriedem> yeah so if you're running VIO and your users try to specify tags during server create, kaboom
21:19:46 <mriedem> we could policy rule that out of existence if we wanted, but it hasn't come up
21:19:58 <melwitt> yeah. it seems like the existing restriction doesn't make sense given that there's not even a restriction for create
21:20:05 <mriedem> also, with the compute-driven capabilities traits stuff that aspiers is working on,
21:20:27 <mriedem> we can modify the request spec in train to say, "the user wants tags, so make sure you give them a compute which supports tags"
21:21:01 <melwitt> yeah, that would be nice
21:21:12 <mriedem> so if we're leaning to option 1, we would lift that restriction in the same microversion Kevin_Zheng is adding for the root attach/detach support
21:21:17 <mriedem> i assume anyway
21:21:36 <mriedem> we can't really just remove the restriction and say 'oops' for interop reasons
21:21:56 <melwitt> yeah. I can't immediately think of how a separate microversion would help
21:22:08 <mriedem> this does make his change more complicated
21:22:14 <melwitt> yeah :(
21:22:21 <mriedem> but i think it needs to happen this way, i don't want to half ass around with multiple microversions for this
21:22:39 <artom> mriedem, actually, hold up
21:23:04 <artom> IIRC one of the reasons we outright refused tagged attach to shelved is because we had to communicate with the compute manager
21:23:20 <artom> Which we didn't know at the time of attach
21:23:30 <artom> Has this been "solved" by Kevin's work?
21:23:48 <mriedem> when attaching a volume to a not-shelved server, we call down to compute to reserve a device name
21:23:58 <mriedem> when attaching a volume to a shelved offloaded server, we just create the bdm in the api
21:24:21 <mriedem> in the case of your tagged attach code, it will also check the compute capabilities to see if it supports tagged attach and blow up if not
21:24:32 <mriedem> so we wouldn't have ^ in the case of shelved offloaded attach
21:24:43 <mriedem> however, as noted, we're already not honoring device tags on unshelve anyway
21:24:50 <mriedem> so....who cares?
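(To make the asymmetry mriedem just described concrete: a rough sketch of the compute-side capability check that a normal attach passes through and that the shelved offloaded path, which only creates the BDM in the API, never reaches. The capability key, class, and return value are approximations assumed for illustration, not verbatim nova code.)

    # Illustrative sketch only -- capability key and names are approximate.
    class VolumeTaggingNotSupported(Exception):
        """The hypervisor driver on this host cannot honor block device tags."""

    class FakeDriver:
        # Virt drivers advertise what they can do via a capabilities dict.
        capabilities = {'supports_tagged_attach_volume': False}

    def reserve_block_device_name(driver, tag):
        # Reached only when the instance has a compute host (i.e. is not
        # shelved offloaded): the host can refuse a tag its driver cannot
        # honor before the attach proceeds.
        if tag is not None and not driver.capabilities.get(
                'supports_tagged_attach_volume', False):
            raise VolumeTaggingNotSupported()
        return '/dev/vdb'  # placeholder device name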
21:25:16 <mriedem> the long-term fix for doing that properly is the scheduling based on required traits stuff
21:25:31 <mriedem> i don't think we can just start exploding unshelve because servers have tags with them now
21:25:46 <mriedem> until the scheduler piece is worked in
21:26:23 <mriedem> artom: you're thinking of this https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L5480
21:26:31 <mriedem> ^ happens during an attach to a non-shelved server
21:27:07 <mriedem> multiattach volumes are kind of broken in the same way wrt unshelve
21:27:22 <mriedem> you can boot from multiattach volume, shelve and then unshelve elsewhere on something that doesn't support it
21:27:59 <mriedem> the api kicks out trying to attach multiattach volumes to shelved servers as well
21:28:26 <mriedem> https://github.com/openstack/nova/blob/master/nova/compute/api.py#L4199
21:28:48 <melwitt> I assume volume attach is the only time you can add device tags to something
21:29:01 <mriedem> create and attach
21:29:15 <melwitt> otherwise a workaround would be to set them after attaching sans tags
21:29:17 <melwitt> got it
21:29:33 <mriedem> we don't have that today
21:29:38 <artom> mriedem, hah, found it https://review.openstack.org/#/c/391941/50/nova/compute/api.py
21:29:39 <melwitt> right
21:29:47 <artom> And yeah, it was only checking for compute host support
21:30:37 <melwitt> so that means a server create could reschedule to land on a host with support?
21:31:00 <mriedem> server create aborts if it lands on a host that doesn't support tags
21:31:04 <mriedem> it does not reschedule
21:31:18 <melwitt> oh that's in manager
21:31:19 <melwitt> I see. ok
21:31:26 <mriedem> https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L1898
21:31:48 <mriedem> so another option is,
21:31:59 <mriedem> land kevin's code as-is (option 2 in my email),
21:32:28 <mriedem> and when we have smarter scheduling to take device tags and multiattach volumes into account, we could add a microversion to top the tag/multiattach restriction on attaching volumes to shelved offloaded instances
21:32:52 <mriedem> which is these 2 checks https://github.com/openstack/nova/blob/master/nova/compute/api.py#L4191
21:33:10 <mriedem> s/top/drop/
21:33:11 <artom> It'd be kinda weird to just disappear a tag without warnin
21:33:13 <artom> *warning
21:33:36 <melwitt> that also sounds reasonable, and would make it easier on Kevin for landing this series. the only potential pitfall is that ^ you could lose your tags and be unable to re-add them
21:33:41 <mriedem> well, we'd probably put something in the api reference saying 'at this microversion you can detach a root volume but note that the tag will be gone with it and you cannot provide a new tag when attaching a new root volume'
21:34:13 <artom> We'll need a big warning regardless
21:34:16 <melwitt> ++
21:34:22 <artom> Hrmm, so actually, the API is user-facing, right?
21:34:36 <mriedem> our api is meant to be used by users yes...
21:34:37 <artom> So if we're going to warn about stuff in the API, it should be about what users can change/control
21:34:48 <artom> Ie, telling them their tag will disappear is fair game
21:34:51 <melwitt> I think ideal is option 1, the restriction seems artificial based on what's been explained. but can we even get option 1 done within a week
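(A purely hypothetical sketch of what "lifting the restriction in a microversion", whether as part of option 1 or as the later follow-up in option 2, could look like at the API layer; the version value and names below are invented for illustration and do not correspond to any real nova microversion.)

    # Hypothetical sketch only -- the microversion value is a placeholder.
    LIFT_TAGGED_SHELVED_ATTACH = (2, 999)  # not a real nova microversion

    def tagged_shelved_attach_allowed(request_version, tag):
        # request_version is a (major, minor) tuple for the negotiated API
        # microversion. Older microversions keep today's behavior and reject
        # the tag; at or above the (hypothetical) new microversion the tag
        # would be accepted and stored on the BDM, to be honored on unshelve
        # once scheduling can target a tag-capable host.
        if tag is None:
            return True
        return request_version >= LIFT_TAGGED_SHELVED_ATTACH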
21:35:08 <artom> Telling them their unshelve might blow up because behind the scenes the operator is running different HVs isn't fair
21:35:12 <artom> Because they can't do anything about that
21:35:38 <artom> So with that reasoning I'm leaning more 2 now
21:36:05 <mriedem> artom: as in do what we have now, and in the future when we don't suck at scheduling, lift the api restriction
21:36:16 <artom> mriedem, yeah
21:36:19 <artom> (Heh, "when")
21:36:21 <melwitt> I guess I could see how unshelve is worse than rejecting a create in a mixed HVs env because you haven't invested much into your server yet
21:36:42 <mriedem> well, unshelve just fails,
21:36:46 <mriedem> we don't delete your snapshot
21:37:08 <mriedem> actually unshelve doesn't even fail if you have device tags
21:37:12 <mriedem> that's that bug from earlier
21:37:17 <mriedem> bug 1817927
21:37:18 <openstack> bug 1817927 in OpenStack Compute (nova) "device tagging support is not checked during move operations" [Undecided,New] https://launchpad.net/bugs/1817927
21:37:20 <melwitt> oh yeah, right.
21:37:26 <artom> (That'd be hilarious if the FaultWrapper just deleted a random instance)
21:37:27 <mriedem> but nor does evacuate, resize or live migrate
21:38:00 <artom> So what does actually happen? The tag is just ignored?
21:38:19 <mriedem> yes
21:38:25 <artom> That's harmless
21:38:36 <mriedem> we don't honor the user request
21:38:42 <artom> So really 1 and 2 are the same in that sense
21:38:47 <artom> You end up with a tagless server
21:39:06 <melwitt> ok, this is pretty complicated to reason about but I think the problem has been adequately explained. so we could continue discussing in #openstack-nova and/or the ML
21:39:06 <artom> In 1 it's ignored by the unshelve
21:39:12 <artom> In 2 it's removed by the detach
21:39:41 <mriedem> ok we can move on, people can dump opinions in the ML
21:39:46 <artom> I have to bounce to pick up kids anyways
21:39:48 <artom> o/
21:39:52 <melwitt> ok, cool
21:39:58 <melwitt> last thing, open discussion
21:40:15 <melwitt> #topic Open discussion
21:40:26 <melwitt> anyone have anything for open discussion before we wrap up?
21:40:52 <melwitt> going
21:41:00 <melwitt> going
21:41:18 <melwitt> ok, guess that's it, thank you all
21:41:19 <melwitt> #endmeeting