16:00:30 <ildikov> #startmeeting cinder-nova-api-changes 16:00:31 <openstack> Meeting started Thu May 25 16:00:30 2017 UTC and is due to finish in 60 minutes. The chair is ildikov. Information about MeetBot at http://wiki.debian.org/MeetBot. 16:00:32 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. 16:00:34 <openstack> The meeting name has been set to 'cinder_nova_api_changes' 16:00:40 <jungleboyj> o/ 16:00:40 <ildikov> DuncanT ameade cFouts johnthetubaguy jaypipes takashin alaski e0ne jgriffith tbarron andrearosa hemna erlon mriedem gouthamr ebalduf patrickeast smcginnis diablo_rojo gsilvis xyang1 raj_singh lyarwood breitz jungleboyj 16:00:51 <mriedem> o/ 16:00:52 <hemna> @! 16:00:52 <pewp> hemna (╯°□°)╯︵ ┻━┻ 16:01:06 <jungleboyj> @!h 16:01:06 <pewp> jungleboyj (/ .□.) ︵╰(゜Д゜)╯︵ /(.□. ) 16:01:26 <ildikov> hemna: already? :) 16:01:40 <hemna> it's the new hi. 16:01:52 <ildikov> hemna: oh, got it :) 16:02:06 <jungleboyj> hemna: Is just angry all the time. 16:02:11 <ildikov> I guess jungleboyj said smth like 'Hi All' then :) 16:02:24 <smcginnis> hah 16:02:27 <stvnoyes> o? 16:02:35 <jungleboyj> ildikov: :-) 16:02:42 <stvnoyes> i guess i am here 16:03:18 <smcginnis> stvnoyes: Not sure about that? :) 16:03:18 <ildikov> ok, so for today we have live migrate, touch on attach and swap a bit and then we might look into some other things stvnoyes found recently in the new Cinder attach/detach API 16:03:48 <stvnoyes> :-) 16:03:54 <ildikov> stvnoyes: I saw you put live_migrate into WIP 16:04:08 <mriedem> do we have any nova changes which aren't WIP that need review? 16:04:08 <ildikov> stvnoyes: is that because of the comments for rollback? 16:04:12 <mriedem> and can actually move forward? 16:04:34 <ildikov> mriedem: your two changes I think still 16:04:55 <mriedem> ildikov: ok i thought so 16:05:20 <ildikov> mriedem: and this: https://review.openstack.org/#/c/456896/ 16:05:51 <mriedem> which i co-authored 16:06:01 <mriedem> i'll need to just find some helpless nova core to pick on here 16:06:06 <stvnoyes> is there a proposed fix yet for this cinder bug? - https://bugs.launchpad.net/cinder/+bug/1692153 - it will be important to get that fixed 16:06:07 <openstack> Launchpad bug 1692153 in Cinder "v3 attachment connection_info formatting is different than v2" [High,New] 16:06:24 <ildikov> mriedem: and we could decide on this one too: https://review.openstack.org/#/c/446671/ 16:06:34 <mriedem> stvnoyes: is that on 3.0? 16:06:36 <mriedem> or 3.27? 16:06:46 <ildikov> mriedem: 3.27 16:06:57 <stvnoyes> only with the new v3 cinder attach code 16:07:24 <hemna> stvnoyes, so, I think I can put in a fix in os-brick to cast those values to ints 16:07:33 <hemna> but I still think it's a bug in the output IMHO 16:07:43 <mriedem> bug in the output of the api? 16:07:47 <hemna> yes 16:07:51 <mriedem> at this point it's in the response that way 16:07:54 <mriedem> you can't change it 16:07:58 <mriedem> w/o a new microversion 16:07:58 <hemna> the OVO stuff completely reformats it 16:08:12 <hemna> even if it's a bug ? 16:08:16 <mriedem> correct 16:08:21 <stvnoyes> i think there were boolean issues too... 16:08:25 <hemna> yes 16:08:31 <hemna> "False" vs. False 16:08:33 <hemna> values. 16:08:47 <stvnoyes> ok 16:08:57 <hemna> the oslo versioned object stuff forces it to strings :( 16:09:04 <ildikov> :/ 16:09:09 <mriedem> need response schema validation in tempest 16:09:12 <mriedem> however, 16:09:25 <mriedem> given (1) you don't really want to test initialize_connection in tempest and (2) it's a freeform dict, 16:09:29 <mriedem> that's probably not trivial 16:09:29 <hemna> attaches won't work w/o a fix somewhere 16:09:35 <mriedem> you could do it with your in-tree tempest plugin 16:09:43 <mriedem> the fixes will have to be client-side for now 16:09:43 <ildikov> so we use OVO and then convert stuff back before sending out the response? 16:09:46 <hemna> I discovered it while trying to update the brick cinderclient extension to do bare metal attaches 16:10:19 <stvnoyes> i ran into it doing a v3 attach using the POC bits 16:10:22 <ildikov> hemna: yeah, I remember now you were discussing it on the Cinder channel not long ago 16:10:47 <hemna> nobody is using this new API yet..... 16:11:19 <mriedem> heh 16:11:21 <mriedem> again, 16:11:25 <mriedem> that's not the answer 16:11:44 <mriedem> once you guys are getting used outside openstack, like you want to be, you can't make those assumptions 16:11:46 <hemna> :P 16:11:47 <mriedem> plus, 16:11:56 <mriedem> changing this on pike means it's still busted on ocata 16:11:57 <hemna> come on Dad! 16:12:04 <mriedem> and client side doesn't know b/c you didnt bump the microversion 16:12:07 <smcginnis> Until we backport the fix. 16:12:11 <mriedem> wrong answer 16:12:25 <smcginnis> Which we can't do if we bump the mv just for a bug fix. 16:12:32 <hemna> I'm not sure how to fix it on the cinder side at the moment 16:12:49 <hemna> who are the ovo "guys" in oslo ? 16:12:57 <mriedem> dansmith and jaypipes 16:13:04 <mriedem> i don't see how this is a problem for ovo though 16:13:11 <hemna> ok, I'll raise this with them and see what their thoughts are on a fix 16:13:13 <mriedem> smcginnis: correct, you can't backport it 16:13:15 <mriedem> it's a new microversion 16:13:21 <hemna> in the mean time, I can patch brick to do a fix 16:13:25 <mriedem> even if you changed ovo, the response would change 16:13:48 <hemna> then we can microversion the fix on the cinder side once we figure out the right way. 16:14:32 <mriedem> yup 16:14:48 <mriedem> see 2.42 in nova if you want to see how we've also had to deal with this kind of pain 16:14:58 <hemna> are we ok with me patching brick to fix at least the LUN int ? 16:15:07 <mriedem> brick is client side, 16:15:11 <mriedem> and can straddle releases 16:15:28 <mriedem> and nova is also client side, 16:15:33 <mriedem> and can require a minimum version of brick 16:15:34 <mriedem> with the fix 16:15:47 <hemna> ok I'll try and get that up today 16:16:01 <ildikov> cool, so brick it is 16:16:41 <ildikov> and also check what we do with ovo and this type of issues long term 16:16:48 <hemna> ildikov, +1 16:17:22 <hemna> mriedem, if you can find me a nova review that I can see how you guys handled something similar that'd help 16:18:32 <ildikov> is there a note/warning anywhere for ovo that it does things you would not necessarily expect? 16:19:09 <hemna> ildikov, well, I think it's doing exactly as you'd expect 16:19:10 <hemna> https://github.com/openstack/cinder/blob/master/cinder/objects/volume_attachment.py#L53 16:19:22 <hemna> the connection_info was declared as a dict of strings. 16:19:27 <hemna> hence the conversion 16:19:32 <mriedem> hemna: https://specs.openstack.org/openstack/nova-specs/specs/ocata/implemented/fix-tag-attribute-disappearing.html is a story about how we f'ed up 16:19:59 <hemna> I think you have to declare those vars as something, and as far as I can tell, there is no DictOfLeaveMeAlone 16:20:01 <ildikov> hemna: hmm, I was just wondering why we only realized it now... :) 16:20:16 <hemna> ildikov, because nobody was using the result coming back from the API yet 16:20:22 <hemna> to try and do a real attach 16:20:28 <mriedem> well, 16:20:41 <mriedem> the long-term awesome thing to do is model ConnectionInfo as an object itself, 16:20:42 <hemna> brick puked, because "1" != 1 16:20:46 <mriedem> with it's own fields that are various types 16:20:53 <mriedem> but, 16:20:56 <hemna> ouch 16:20:56 <ildikov> hemna: I know, but still 16:20:59 <hemna> yah 16:20:59 <mriedem> that means actually codifying that thing 16:21:15 <mriedem> actually knowing wtf goes into connection_info would be awesome 16:21:21 <stvnoyes> +1 16:21:23 <mriedem> even if you needed a vendor_extras field which is a dict of strings 16:21:29 <mriedem> and unvalidated 16:21:38 <mriedem> anyway, that's way down the road and will probably never happen 16:21:47 <hemna> brick 2.0 16:21:52 <ildikov> uh, that looks a tough goal to aim for, but I definitely agree it would be nice to have some sort of a structure for that thing 16:21:55 <mriedem> hemna: it's server side 16:21:57 <mriedem> not brick 16:22:04 <mriedem> think nova.objects.ImageMetaProps in nova 16:22:09 <hemna> well, there is the connector dict too 16:22:11 <mriedem> codifies image metadata properties from glance 16:22:14 <hemna> I guess I was thinking of that too 16:22:16 <mriedem> yes connector should also be a versioned object 16:22:22 <mriedem> they are both a wild west of terrible 16:22:36 <mriedem> anyway, wayyyyyy down the rabbit hole here 16:22:38 <mriedem> can we move on? 16:22:41 <hemna> yup 16:22:45 <ildikov> yes 16:23:34 <ildikov> so we have a recent WIP, which is live migrate 16:23:49 <ildikov> stvnoyes: do you want to discuss that here? 16:24:17 <stvnoyes> hold off on rv'ing that for now. i am in the middle of functionally testing it. i just got v3 attach working yesterday. 16:24:36 <ildikov> I saw your latest comments about rollback and was wondering whether that's something to raise here? 16:24:47 <stvnoyes> i've been updating the POC and my rv with what I'm finding 16:25:22 <stvnoyes> well there is a odd bug in nova in rollback.... 16:25:29 <stvnoyes> new or old code 16:26:37 <stvnoyes> https://review.openstack.org/#/c/463987/3/nova/compute/manager.py - self.compute_rpcapi.remove_volume_connection has the wrong args 16:26:50 <ildikov> mriedem: have you seen that ^? 16:26:56 <stvnoyes> ~line 5758 16:27:25 <stvnoyes> also, i'm not sure fixing the args is the real answer, i don't know why we'd do this on a source node. 16:28:41 <mriedem> why is the signature wrong? 16:29:25 <mriedem> this is the source host telling the dest host to rollback volume connections from the dest host right? 16:29:30 <mriedem> b/c we're rolling back 16:30:13 <stvnoyes> hmmm, re-looking at this, it seems ok. not sure what i was thinking. 16:30:25 <stvnoyes> ignore this for now 16:30:37 <mriedem> so, 16:30:41 <mriedem> live migration is a turd 16:30:43 <mriedem> just fyi 16:30:43 <mriedem> :) 16:31:02 <stvnoyes> yeah, we had similar issues in the virtual iron product 16:31:09 <mriedem> it's like 20 compute manager methods doing rpc calls back and forth and you rarely know which one is on the source or target host 16:31:20 <mriedem> none of it is documented, 16:31:21 <mriedem> or mapped out 16:31:28 <mriedem> you're welcome 16:31:34 <stvnoyes> :-) 16:31:43 <ildikov> nice :) 16:32:11 <mriedem> you're lucky the docstring even says 16:32:11 <mriedem> :param dest: 16:32:12 <mriedem> This method is called from live migration src host. 16:32:12 <mriedem> This param specifies destination host. 16:32:25 <mriedem> that's a rare bit of critical information in these methods 16:33:16 <ildikov> ok, then I guess whomever can make attach work with the new flow should test the live migrate patch for now to see how it works out 16:33:18 <stvnoyes> the odd part is that i was getting wrong number of args exception in here. i need to go back and re-look at where that was coming from. 16:33:26 <mriedem> $5 to a new contributor to document the flow :) 16:33:54 <ildikov> lol :) 16:34:06 <stvnoyes> i need to check the other patches that i applied... 16:35:23 <mriedem> moving on? 16:35:36 <mriedem> stvnoyes and i reviewed jgriffith's attach WIP https://review.openstack.org/#/c/330285/ 16:35:37 <ildikov> the attach PoC throws warious things I would guess, the other bits should be fine 16:36:05 <ildikov> mriedem: stvnoyes: we will skip the microversion discovery for now 16:36:25 <ildikov> so we can have attach itself working 16:36:29 <ildikov> first 16:36:45 <mriedem> i'm fine with that for now 16:36:54 <mriedem> we 16:36:56 <mriedem> we'll want t 16:36:58 <mriedem> damn 16:37:00 <ildikov> shelve should be moved out of the patch too, but as it contained more things originally than just attach I didn't bother as of yet... 16:37:05 <mriedem> we'll want to split that out into a separate patch first anyway i think 16:37:19 <mriedem> well, 16:37:28 <mriedem> this is attaching a volume to a shelved offloaded server 16:37:31 <mriedem> so it's still new style attach 16:37:41 <mriedem> just a wrinkle 16:37:46 <mriedem> i don't think you can split that out really 16:38:31 <ildikov> I hoped we can 16:39:20 <ildikov> I talked to John this morning, he's cleaning up things in the patch like using the attachment_id we have in the bdm now, etc. 16:39:41 <ildikov> I also finally have a DevStack up and running so I will chime in testing and cleaning up this stuff 16:40:21 <mriedem> i think there are some obvious things to fix in there, yes, which is a good start 16:40:28 <mriedem> which should also decomplicate it 16:40:45 <ildikov> mriedem: stvnoyes: also tnx for the thorough review :) 16:41:15 <stvnoyes> np, i am anxious to see migrate work 16:41:17 <ildikov> mriedem: yeah, I started to fix up check and reserve for instance, but couldn't get there to clean up the rest and things like these 16:41:25 <stvnoyes> hopefully shortly 16:41:49 <ildikov> the connection_info format changes in the driver are also old ones, we had a stage when the format didn't match 16:42:41 <ildikov> that needs to be checked again too 16:44:07 <ildikov> and when we have attach then we will start write a new version of swap 16:44:17 <ildikov> that thing still confuses me... 16:45:33 <ildikov> if someone could once describe me what migrate_volume_completion is supposed to do I would be pretty pleased :) 16:45:49 <mriedem> it's nova calling back to cinder saying i'm done 16:45:59 <mriedem> which was either successful or an error 16:46:09 <mriedem> if cinder initiated the swap, i thought telling it we failed initiated a rollback 16:46:12 <mriedem> on the cinder side 16:46:27 <mriedem> if cinder didn't initiate the swap, i assume cinder is like, 'ok great but i don't know what you're talking about' 16:47:30 <ildikov> and what does it do when Nova initiated a swap and it was successful? 16:47:35 <smcginnis> mriedem: I believe so: https://github.com/openstack/cinder/blob/114e3bd7223ae957bd9bb71e3fb7e114af5c9f72/cinder/volume/manager.py#L2018 16:48:40 <mriedem> ildikov: i don't know 16:48:44 <mriedem> that's a question for cinder 16:48:53 <ildikov> mriedem: ok :) 16:49:10 <mriedem> i would think if cinder wasn't tracking a migration for the volume it would ignore it, 16:49:11 <mriedem> but idk 16:49:41 <mriedem> https://github.com/openstack/cinder/blob/114e3bd7223ae957bd9bb71e3fb7e114af5c9f72/cinder/volume/api.py#L1449 16:49:59 <mriedem> we don't cast to the volume manager then 16:50:33 <mriedem> so i think that's just updating the db on the cinder side 16:50:38 <mriedem> in the case that it didn't fail 16:50:47 <mriedem> which if nova already did that i think it's just a noop 16:50:53 <mriedem> but again i'm not sure 16:50:55 <ildikov> you mean with all that attach/detach calls in the non-error state? 16:50:58 <mriedem> yes 16:51:26 <ildikov> ok, cool 16:51:56 <mriedem> http://logs.openstack.org/95/467995/3/check/gate-grenade-dsvm-neutron-multinode-live-migration-nv/46deb21/logs/new/tempest_conf.txt.gz 16:51:59 <mriedem> oops wrong window 16:52:20 <ildikov> then we will look into how to reproduce those many noop's with the new flow then... 16:52:52 <ildikov> I think mainly that's it for today 16:52:57 <mriedem> cool, 16:53:08 <mriedem> i've also put out the nova core call to 2 specific cores to review the bottom changes which have the +2 16:53:10 <ildikov> mriedem: who to ping with those few patches to merge? 16:53:10 <mriedem> alex and kenichi 16:53:22 <mriedem> alex is in china so he's probably done for the day already 16:53:23 <ildikov> mriedem: awesome, tnx 16:53:31 <mriedem> but kenichi is in california so he can't hide 16:53:40 <ildikov> ok :) 16:54:32 <ildikov> I pinged other two guys, hopefully we will have at least one looking at those finally 16:55:07 <ildikov> cool 16:55:32 <ildikov> if nothing else then thank y'all for the today's meeting 16:55:32 <mriedem> which other two guys? 16:55:55 <ildikov> lyarwood and sfinucan IIRC 16:55:59 <mriedem> oh, not cores 16:56:01 <mriedem> but ok 16:56:05 <mriedem> anyway yeah let's call it a meeting 16:56:13 <mriedem> oh sfinucan is 16:56:15 <mriedem> duh 16:56:32 <ildikov> :) 16:57:15 <ildikov> talk to you next week the latest! :) 16:57:20 <ildikov> have a good day 16:57:26 <ildikov> #endmeeting