13:00:02 <alex_xu> #startmeeting nova api
13:00:02 <openstack> Meeting started Wed Jun 21 13:00:02 2017 UTC and is due to finish in 60 minutes. The chair is alex_xu. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:00:03 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:00:06 <openstack> The meeting name has been set to 'nova_api'
13:00:09 <alex_xu> who is here today?
13:00:18 <gmann_> o/
13:00:25 <mriedem> o/ for about 15 minutes
13:00:52 <alex_xu> sdague: are you around for api meeting?
13:01:33 <alex_xu> ok, looks like just three of us
13:01:47 <alex_xu> mriedem: we can talk your item directly
13:02:03 <mriedem> ok i need to find the link
13:02:13 <alex_xu> #link https://review.openstack.org/#/c/391941/
13:02:17 <alex_xu> mriedem: ^ here is
13:02:20 <mriedem> yeah,
13:02:36 <mriedem> so that's part of the series for allowing tags when attaching volumes and interfaces
13:03:04 <mriedem> that patch is currently checking to see if the user is trying to attach a volume to a shelved offloaded instance with a tag and if so, it fails
13:03:36 <mriedem> if the instance is not shelved offloaded, we make an rpc call to the compute to reserve the block device name, and as part of that check on the compute it's checking a capabilities flag for the virt driver to see if it supports tagged attach
13:03:56 <mriedem> since in the case of shelved offloaded instances there isn't a compute host, we can't do that rpc call so artom decided to not support it
13:04:08 <Kevin_Zheng> o/
13:04:42 <mriedem> now, i think we could support attaching a volume with a tag to a shelved offloaded instances, the tag wouldn't be applied until we prep block devices during unshelve - which is similar to how we handle tagged bdms during normal server create
13:05:13 <mriedem> the main difference with unshelve and normal server create, if the driver doesn't support tagged devices, is that with server create, the instance goes to ERROR state and the build is aborted,
13:05:25 <mriedem> with unshelve, we don't put the instance into error state. we'll record a fault but that's it.
13:05:36 <mriedem> so it's less obvious that something went wrong when you tried unshelving
13:05:45 <alex_xu> mriedem: yea, probably a record in instance-action
13:06:11 <mriedem> i really don't like shelve as an operation
13:06:13 <artom> Wasn't alex_xu saying that we rollback the instance to offloaded?
13:06:17 <mriedem> it's incomplete
13:06:23 <artom> That way it's obvious something went wrong
13:06:29 <artom> Since your VM isn't active
13:06:42 <artom> But you'd still have to go look in instance_actions
13:06:44 <mriedem> artom: no, the vm state doesn't change until unshelve is successful
13:06:52 <mriedem> the task state changes, but it's reverted to None on failure
13:07:35 <mriedem> https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L4548
13:07:37 * artom tries to think of precedent
13:07:47 <alex_xu> mriedem: what do mean about doesn't like shevle as an operation?
13:07:52 <artom> What happens if we attach a volume to an offloaded instance, then delete the volume and attempt unshelving?
13:08:32 <artom> Does it just silently unshelve without the volume?
13:08:35 <mriedem> artom: you can detach the volume and the bdm is deleted
13:08:36 <mriedem> in the api
13:08:43 <gmann_> mriedem: alex_xu artom if people tagged device on shelve offload it means they know it is supported or not so unshelve should be either error(if no tagging supported) or successful (if it is)
13:08:50 <mriedem> alex_xu: shelve is a weird api with bugs
13:09:04 <mriedem> alex_xu: like if unshelve fails, we don't cleanup volume connections on the host
13:09:09 <mriedem> probably similar for vifs
13:09:32 <mriedem> gmann_: the problem with putting the instance in error state is the user can't attempt to unshelve it again,
13:09:36 <mriedem> without the admin resetting the state
13:09:38 <alex_xu> mriedem: will we fix the shelve API in the future?
13:09:40 <artom> gmann_, assuming a hybrid cloud with some virt drivers that support tagged attach and some that don't, we don't know whether the particular host on which the unshelved instance will end up will support tagged attach
13:09:45 <alex_xu> or expect to deprecate the API
13:09:54 <artom> gmann_, so that error can only happen at unshelve-time, not at attach-time
13:10:10 <gmann_> humm
13:10:34 <mriedem> it might be possible to put the instance on a host via some aggregate metadata for the shelved image, but i'm not sure - it's a mess
13:10:42 <artom> gmann_, or more implementation-wise, it's a cast to the compute manager, not a call, so we have to way to get an error back to the user
13:10:47 <mriedem> alex_xu: i don't know if we'll deprecate it, that's kind of the issue i'm struggling with,
13:10:57 <artom> *no way
13:11:02 <mriedem> do we pile more feature complexity into an already semi broken API
13:11:18 <artom> Do we have number on how often people use it?
13:11:22 <mriedem> no idea
13:11:30 <mriedem> the other side of this is,
13:11:38 <artom> I guess if bugs are reported in it frequently it's used *somewhere*
13:12:13 <mriedem> is if i'm an api user and can attach a volume to a shelved offloaded instance, and can attach a volume with a tag to a normal running instance, is it reasonable for me to expect that i can attach a volume with a tag to a shelved offloaded instance?
13:12:25 <mriedem> from an end user point of view, it seems like an odd restriction
13:12:56 <alex_xu> yea
13:13:14 <artom> This could technically all be fixed if the virt driver becomes a resource provider with traits
13:13:23 <alex_xu> if we stop to enable new features on the unshelve API, that sounds like we give up the API :)
13:13:28 <artom> Instead of capabilities
13:13:45 <artom> So the scheduler would know not to put instance with certain features on hosts that don't support them
13:13:52 <artom> But that's *waay* off in the future
13:14:02 <gmann_> mriedem: you mean cannot attach volume with tag on shelve offload right?
13:14:17 <mriedem> gmann_: that's what is currently written in artom's patch
13:14:43 <gmann_> yea
13:14:55 <sfinucan> o/
13:15:06 <mriedem> so i don't have a good answer, or a strong opinion either way. i feel like i'll be unhappy about either direction. :)
13:15:10 * sfinucan would love to kill shelve - it's caused him much pain in the past also
13:15:19 <mriedem> yes i also really dislike shelve
13:15:34 <sfinucan> (and it seems to be only RAX that care(d?), fwict)
13:15:44 <sfinucan> but I digress
13:15:48 <artom> I understand the breaking expectations aspect...
13:15:56 <alex_xu> I only clearly to know I'm on the side not set the instance to error state if we plan to support tag on a shelve offload instance
13:16:03 <artom> But wouldn't a workaround be to unshelve first, *then* do the tagged attach?
13:16:17 <gmann_> for me, behaving like same as boot instance can be consistent by considering that user did operation waht can cause xyz behaviour
13:16:35 <gmann_> and we document that risk of tagging on shelve offload instance clearly
13:16:43 <mriedem> artom: you could, but i assume people wanted to be able to attach a volume to a shelved offloaded instance for a reason
13:16:52 <mriedem> artom: also, if you did that, the tag wouldn't get in the config drive
13:17:09 <artom> mriedem, tagged attach already discounted the config drive
13:17:14 <artom> So it's not a huge stretch
13:17:22 <mriedem> artom: for unshelve it's a new spawn,
13:17:25 <mriedem> new guest,
13:17:28 <mriedem> so you'd have a new config drive
13:17:41 <artom> I get that
13:17:51 <artom> But from an API pov, it's the tagged attachment
13:18:05 <mriedem> i'm not following you
13:18:08 <artom> And we're already saying tagged attach doesn't update the config drive
13:18:35 <mriedem> except it would when unshelving, if we allowed that
13:18:37 <artom> So tagged attach to a shelved instance can also not update the config drive, and we're staying consistent in our message
13:18:45 <mriedem> sure it's not an 'update',
13:18:49 <mriedem> we'd be 'creating' config drive
13:19:02 <mriedem> but that's a pretty thin play on words
13:19:20 <mriedem> you must have been a lawyer in a past life :)
13:19:53 <artom> That reminds me of a thing my mom says, but that's neither here nor there
13:20:22 <mriedem> the other thing is we could land this now and add support for tagged attached to a shelved offloaded instance later if someone wanted that
13:20:32 <mriedem> it pushes the decision to the future
13:20:48 <mriedem> at the expense of weird conditions on the API behavior today
13:21:15 <sfinucan> personally, I'd opt for that, so long as it's well documented
13:21:19 <sfinucan> YAGNI, and all that
13:21:19 <artom> See my comment about virt drivers become RPs with traits in the future :)
13:21:44 <artom> Side-step that problem entirely by only scheduling instances on hosts that can support all their features
13:22:16 <mriedem> i don't know if we plan on putting traits on compute node resource providers with their driver capabilties
13:22:21 <mriedem> would be a question for jaypipes
13:22:30 <mriedem> but yeah i see it as an option,
13:22:37 <mriedem> the request spec for unshelve could have a required trait
13:22:39 <alex_xu> I remember that is part of capabilities API, not part of traits
13:22:59 <alex_xu> of course, no progress on capabilities API...
13:23:00 <mriedem> alex_xu: right, although the capabilities API is probably even further in the future since it does'nt have an owner or a plan
13:23:06 <mriedem> jinx :)
13:23:09 <artom> I don't know much about RPs/traits/caps api - it just sounded like a cool solution to our problem
13:24:36 <artom> It would also apply device tagging in general - ie, avoid the BuildAbort situation we currently have
13:25:03 <artom> And probably a whole bunch of other things
13:25:04 <mriedem> yes, i'll ask jay about it today
13:25:40 <mriedem> so as much as i don't really like it, i'm ok with deferring support for tagged attach to shelved offloaded instances
13:25:47 <mriedem> like i said, i don't like doing either
13:26:17 <mriedem> alternatively, if all the virt drivers supported tagged devices we wouldn't have this issue as much
13:26:20 * artom finds he doesn't really prefer one option over the other, though he secretly hopes the consensus would be to keep whatever's currently written in place to save work ;)
13:26:39 <artom> mriedem, I don't see that happening
13:26:49 <artom> Claudiu added tagging to hyperv
13:26:58 <artom> xenapi has been in progress since Newton
13:27:04 <artom> But nothing else
13:27:12 <mriedem> the powervm guys are pretty on top of things,
13:27:22 <mriedem> but i don't know if they can technically do it with their hypervisor, would have to ask
13:27:32 <mriedem> as for vmware, i guess we'll ask cdent when he starts there in 2 weeks :)
13:27:43 <alex_xu> wow, news
13:28:00 <gmann_> :)
13:28:05 <alex_xu> :)
13:28:15 <mriedem> alex_xu: gmann_: are you also ok with just restricting tagged attachments to non-shelved offloaded instances for now?
13:28:51 <alex_xu> mriedem: i'm ok
13:29:05 <gmann_> mriedem: for now ok.
13:29:11 <mriedem> ok let's go with that then
13:29:16 <mriedem> thanks everyone, i've got to drop off now
13:29:23 <sfinucan> o/
13:29:29 <gmann_> should we discuss that in spec with revised version?
13:29:30 <artom> Thanks mriedem!
13:29:32 <sfinucan> (that was goodbye wave)
13:30:13 <artom> gmann_, I think so - I'll push a new revision addressing mriedem's feedback based on what we've decided here, and you guys can review
13:30:33 <alex_xu> artom: cool
13:30:52 <gmann_> artom: nice, thanks.
13:31:01 <alex_xu> ok, let us move on
13:31:05 <alex_xu> #link https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bug/1693168
13:31:16 <alex_xu> thanks gmann_ for working on the fix
13:31:28 <gmann_> alex_xu: np!
13:31:38 <alex_xu> I already +2, so sfinucan mriedem sdague ^ it is your turn :)
13:31:38 <gmann_> mriedem: sfinucan if you can check those
13:32:10 <sfinucan> (y)
13:32:23 <alex_xu> #link https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/api-no-more-extensions-pike
13:32:27 <gmann_> alex_xu: i will push the follow up on quota keys refactoring, that was nice catch
13:32:36 <alex_xu> gmann_: thanks
13:33:00 <gmann_> alex_xu: i remember to do that for schema also not to import anything from object etc but need to check whether i fixed those or not
13:33:30 <alex_xu> gmann_: sorry, I didn't get you
13:35:05 <gmann_> alex_xu: like this #link https://github.com/openstack/nova/blob/master/nova/api/validation/parameter_types.py#L424
13:35:30 <gmann_> alex_xu: if things changed on object.* it change the schema and very dangerous
13:35:52 <gmann_> alex_xu: we should have hard coded values on schema side
13:36:09 <alex_xu> gmann_: yea
13:36:27 <alex_xu> the API shoudn't bind with object
13:36:51 <gmann_> ll fix those in this week, planned long time ago but always forget
13:36:59 <alex_xu> gmann_: cool
13:37:31 <alex_xu> for remove stevedore, when I work on extension_info and version API, I saw circular import....
13:37:46 <alex_xu> I hate that, but I didn't get a chance to find a good way to resolve that
13:38:01 <gmann_> alex_xu: oh, you have patch up?
13:38:15 <alex_xu> gmann_: no, I gave up those change
13:38:48 <alex_xu> gmann_: think of maybe leave those two in the last, and try to remove the loaded_extension_info together, maybe get rid of those problem
13:39:10 <gmann_> alex_xu: ohk
13:40:54 <alex_xu> althought not sure that works, because I only can see a very strange error message, but let us see
13:41:23 <alex_xu> that is all I have
13:41:32 <alex_xu> anymore more people want to bring up?
13:41:46 <sfinucan> Nothing from me
13:41:58 <gmann_> nothing for me too
13:42:12 <alex_xu> ok, thanks all!
13:42:20 <alex_xu> #endmeeting