16:00:30 <ildikov> #startmeeting cinder-nova-api-changes
16:00:31 <openstack> Meeting started Thu May 25 16:00:30 2017 UTC and is due to finish in 60 minutes.  The chair is ildikov. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:32 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:34 <openstack> The meeting name has been set to 'cinder_nova_api_changes'
16:00:40 <jungleboyj> o/
16:00:40 <ildikov> DuncanT ameade cFouts johnthetubaguy jaypipes takashin alaski e0ne jgriffith tbarron andrearosa hemna erlon mriedem gouthamr ebalduf patrickeast smcginnis diablo_rojo gsilvis  xyang1 raj_singh lyarwood breitz jungleboyj
16:00:51 <mriedem> o/
16:00:52 <hemna> @!
16:00:52 <pewp> hemna (╯°□°)╯︵ ┻━┻
16:01:06 <jungleboyj> @!h
16:01:06 <pewp> jungleboyj (/ .□.) ︵╰(゜Д゜)╯︵ /(.□. )
16:01:26 <ildikov> hemna: already? :)
16:01:40 <hemna> it's the new hi.
16:01:52 <ildikov> hemna: oh, got it :)
16:02:06 <jungleboyj> hemna:  Is just angry all the time.
16:02:11 <ildikov> I guess jungleboyj said smth like 'Hi All' then :)
16:02:24 <smcginnis> hah
16:02:27 <stvnoyes> o?
16:02:35 <jungleboyj> ildikov:  :-)
16:02:42 <stvnoyes> i guess i am here
16:03:18 <smcginnis> stvnoyes: Not sure about that? :)
16:03:18 <ildikov> ok, so for today we have live migrate, touch on attach and swap a bit and then we might look into some other things stvnoyes found recently in the new Cinder attach/detach API
16:03:48 <stvnoyes> :-)
16:03:54 <ildikov> stvnoyes: I saw you put live_migrate into WIP
16:04:08 <mriedem> do we have any nova changes which aren't WIP that need review?
16:04:08 <ildikov> stvnoyes: is that because of the comments for rollback?
16:04:12 <mriedem> and can actually move forward?
16:04:34 <ildikov> mriedem: your two changes I think still
16:04:55 <mriedem> ildikov: ok i thought so
16:05:20 <ildikov> mriedem: and this: https://review.openstack.org/#/c/456896/
16:05:51 <mriedem> which i co-authored
16:06:01 <mriedem> i'll need to just find some helpless nova core to pick on here
16:06:06 <stvnoyes> is there a proposed fix yet for this cinder bug? - https://bugs.launchpad.net/cinder/+bug/1692153 - it will be important to get that fixed
16:06:07 <openstack> Launchpad bug 1692153 in Cinder "v3 attachment connection_info formatting is different than v2" [High,New]
16:06:24 <ildikov> mriedem: and we could decide on this one too: https://review.openstack.org/#/c/446671/
16:06:34 <mriedem> stvnoyes: is that on 3.0?
16:06:36 <mriedem> or 3.27?
16:06:46 <ildikov> mriedem: 3.27
16:06:57 <stvnoyes> only with the new v3 cinder attach code
16:07:24 <hemna> stvnoyes, so, I think I can put in a fix in os-brick to cast those values to ints
16:07:33 <hemna> but I still think it's a bug in the output IMHO
16:07:43 <mriedem> bug in the output of the api?
16:07:47 <hemna> yes
16:07:51 <mriedem> at this point it's in the response that way
16:07:54 <mriedem> you can't change it
16:07:58 <mriedem> w/o a new microversion
16:07:58 <hemna> the OVO stuff completely reformats it
16:08:12 <hemna> even if it's a bug ?
16:08:16 <mriedem> correct
16:08:21 <stvnoyes> i think there were boolean issues too...
16:08:25 <hemna> yes
16:08:31 <hemna> "False" vs. False
16:08:33 <hemna> values.
16:08:47 <stvnoyes> ok
16:08:57 <hemna> the oslo versioned object stuff forces it to strings :(
16:09:04 <ildikov> :/
16:09:09 <mriedem> need response schema validation in tempest
16:09:12 <mriedem> however,
16:09:25 <mriedem> given (1) you don't really want to test initialize_connection in tempest and (2) it's a freeform dict,
16:09:29 <mriedem> that's probably not trivial
16:09:29 <hemna> attaches won't work w/o a fix somewhere
16:09:35 <mriedem> you could do it with your in-tree tempest plugin
16:09:43 <mriedem> the fixes will have to be client-side for now
16:09:43 <ildikov> so we use OVO and then convert stuff back before sending out the response?
16:09:46 <hemna> I discovered it while trying to update the brick cinderclient extension to do bare metal attaches
16:10:19 <stvnoyes> i ran into it doing a v3 attach using the POC bits
16:10:22 <ildikov> hemna: yeah, I remember now you were discussing it on the Cinder channel not long ago
16:10:47 <hemna> nobody is using this new API yet.....
16:11:19 <mriedem> heh
16:11:21 <mriedem> again,
16:11:25 <mriedem> that's not the answer
16:11:44 <mriedem> once you guys are getting used outside openstack, like you want to be, you can't make those assumptions
16:11:46 <hemna> :P
16:11:47 <mriedem> plus,
16:11:56 <mriedem> changing this on pike means it's still busted on ocata
16:11:57 <hemna> come on Dad!
16:12:04 <mriedem> and client side doesn't know b/c you didnt bump the microversion
16:12:07 <smcginnis> Until we backport the fix.
16:12:11 <mriedem> wrong answer
16:12:25 <smcginnis> Which we can't do if we bump the mv just for a bug fix.
16:12:32 <hemna> I'm not sure how to fix it on the cinder side at the moment
16:12:49 <hemna> who are the ovo "guys" in oslo ?
16:12:57 <mriedem> dansmith and jaypipes
16:13:04 <mriedem> i don't see how this is a problem for ovo though
16:13:11 <hemna> ok, I'll raise this with them and see what their thoughts are on a fix
16:13:13 <mriedem> smcginnis: correct, you can't backport it
16:13:15 <mriedem> it's a new microversion
16:13:21 <hemna> in the mean time, I can patch brick to do a fix
16:13:25 <mriedem> even if you changed ovo, the response would change
16:13:48 <hemna> then we can microversion the fix on the cinder side once we figure out the right way.
16:14:32 <mriedem> yup
16:14:48 <mriedem> see 2.42 in nova if you want to see how we've also had to deal with this kind of pain
16:14:58 <hemna> are we ok with me patching brick to fix at least the LUN int ?
16:15:07 <mriedem> brick is client side,
16:15:11 <mriedem> and can straddle releases
16:15:28 <mriedem> and nova is also client side,
16:15:33 <mriedem> and can require a minimum version of brick
16:15:34 <mriedem> with the fix
16:15:47 <hemna> ok I'll try and get that up today
16:16:01 <ildikov> cool, so brick it is
16:16:41 <ildikov> and also check what we do with ovo and this type of issues long term
16:16:48 <hemna> ildikov, +1
16:17:22 <hemna> mriedem, if you can find me a nova review that I can see how you guys handled something similar that'd help
16:18:32 <ildikov> is there a note/warning anywhere for ovo that it does things you would not necessarily expect?
16:19:09 <hemna> ildikov, well, I think it's doing exactly as you'd expect
16:19:10 <hemna> https://github.com/openstack/cinder/blob/master/cinder/objects/volume_attachment.py#L53
16:19:22 <hemna> the connection_info was declared as a dict of strings.
16:19:27 <hemna> hence the conversion
16:19:32 <mriedem> hemna: https://specs.openstack.org/openstack/nova-specs/specs/ocata/implemented/fix-tag-attribute-disappearing.html is a story about how we f'ed up
16:19:59 <hemna> I think you have to declare those vars as something, and as far as I can tell, there is no DictOfLeaveMeAlone
16:20:01 <ildikov> hemna: hmm, I was just wondering why we only realized it now... :)
16:20:16 <hemna> ildikov, because nobody was using the result coming back from the API yet
16:20:22 <hemna> to try and do a real attach
16:20:28 <mriedem> well,
16:20:41 <mriedem> the long-term awesome thing to do is model ConnectionInfo as an object itself,
16:20:42 <hemna> brick puked, because "1" != 1
16:20:46 <mriedem> with it's own fields that are various types
16:20:53 <mriedem> but,
16:20:56 <hemna> ouch
16:20:56 <ildikov> hemna: I know, but still
16:20:59 <hemna> yah
16:20:59 <mriedem> that means actually codifying that thing
16:21:15 <mriedem> actually knowing wtf goes into connection_info would be awesome
16:21:21 <stvnoyes> +1
16:21:23 <mriedem> even if you needed a vendor_extras field which is a dict of strings
16:21:29 <mriedem> and unvalidated
16:21:38 <mriedem> anyway, that's way down the road and will probably never happen
16:21:47 <hemna> brick 2.0
16:21:52 <ildikov> uh, that looks a tough goal to aim for, but I definitely agree it would be nice to have some sort of a structure for that thing
16:21:55 <mriedem> hemna: it's server side
16:21:57 <mriedem> not brick
16:22:04 <mriedem> think nova.objects.ImageMetaProps in nova
16:22:09 <hemna> well, there is the connector dict too
16:22:11 <mriedem> codifies image metadata properties from glance
16:22:14 <hemna> I guess I was thinking of that too
16:22:16 <mriedem> yes connector should also be a versioned object
16:22:22 <mriedem> they are both a wild west of terrible
16:22:36 <mriedem> anyway, wayyyyyy down the rabbit hole here
16:22:38 <mriedem> can we move on?
16:22:41 <hemna> yup
16:22:45 <ildikov> yes
16:23:34 <ildikov> so we have a recent WIP, which is live migrate
16:23:49 <ildikov> stvnoyes: do you want to discuss that here?
16:24:17 <stvnoyes> hold off on rv'ing that for now. i am in the middle of functionally testing it. i just got v3 attach working yesterday.
16:24:36 <ildikov> I saw your latest comments about rollback and was wondering whether that's something to raise here?
16:24:47 <stvnoyes> i've been updating the POC and my rv with what I'm finding
16:25:22 <stvnoyes> well there is a odd bug in nova in rollback....
16:25:29 <stvnoyes> new or old code
16:26:37 <stvnoyes> https://review.openstack.org/#/c/463987/3/nova/compute/manager.py - self.compute_rpcapi.remove_volume_connection has the wrong args
16:26:50 <ildikov> mriedem: have you seen that ^?
16:26:56 <stvnoyes> ~line 5758
16:27:25 <stvnoyes> also, i'm not sure fixing the args is the real answer, i don't know why we'd do this on a source node.
16:28:41 <mriedem> why is the signature wrong?
16:29:25 <mriedem> this is the source host telling the dest host to rollback volume connections from the dest host right?
16:29:30 <mriedem> b/c we're rolling back
16:30:13 <stvnoyes> hmmm, re-looking at this, it seems ok. not sure what i was thinking.
16:30:25 <stvnoyes> ignore this for now
16:30:37 <mriedem> so,
16:30:41 <mriedem> live migration is a turd
16:30:43 <mriedem> just fyi
16:30:43 <mriedem> :)
16:31:02 <stvnoyes> yeah, we had similar issues in the virtual iron product
16:31:09 <mriedem> it's like 20 compute manager methods doing rpc calls back and forth and you rarely know which one is on the source or target host
16:31:20 <mriedem> none of it is documented,
16:31:21 <mriedem> or mapped out
16:31:28 <mriedem> you're welcome
16:31:34 <stvnoyes> :-)
16:31:43 <ildikov> nice :)
16:32:11 <mriedem> you're lucky the docstring even says
16:32:11 <mriedem> :param dest:
16:32:12 <mriedem> This method is called from live migration src host.
16:32:12 <mriedem> This param specifies destination host.
16:32:25 <mriedem> that's a rare bit of critical information in these methods
16:33:16 <ildikov> ok, then I guess whomever can make attach work with the new flow should test the live migrate patch for now to see how it works out
16:33:18 <stvnoyes> the odd part is that i was getting wrong number of args exception in here. i need to go back and re-look at where that was coming from.
16:33:26 <mriedem> $5 to a new contributor to document the flow :)
16:33:54 <ildikov> lol :)
16:34:06 <stvnoyes> i need to check the other patches that i applied...
16:35:23 <mriedem> moving on?
16:35:36 <mriedem> stvnoyes and i reviewed jgriffith's attach WIP https://review.openstack.org/#/c/330285/
16:35:37 <ildikov> the attach PoC throws warious things I would guess, the other bits should be fine
16:36:05 <ildikov> mriedem: stvnoyes: we will skip the microversion discovery for now
16:36:25 <ildikov> so we can have attach itself working
16:36:29 <ildikov> first
16:36:45 <mriedem> i'm fine with that for now
16:36:54 <mriedem> we
16:36:56 <mriedem> we'll want t
16:36:58 <mriedem> damn
16:37:00 <ildikov> shelve should be moved out of the patch too, but as it contained more things originally than just attach I didn't bother as of yet...
16:37:05 <mriedem> we'll want to split that out into a separate patch first anyway i think
16:37:19 <mriedem> well,
16:37:28 <mriedem> this is attaching a volume to a shelved offloaded server
16:37:31 <mriedem> so it's still new style attach
16:37:41 <mriedem> just a wrinkle
16:37:46 <mriedem> i don't think you can split that out really
16:38:31 <ildikov> I hoped we can
16:39:20 <ildikov> I talked to John this morning, he's cleaning up things in the patch like using the attachment_id we have in the bdm now, etc.
16:39:41 <ildikov> I also finally have a DevStack up and running so I will chime in testing and cleaning up this stuff
16:40:21 <mriedem> i think there are some obvious things to fix in there, yes, which is a good start
16:40:28 <mriedem> which should also decomplicate it
16:40:45 <ildikov> mriedem: stvnoyes: also tnx for the thorough review :)
16:41:15 <stvnoyes> np, i am anxious to see migrate work
16:41:17 <ildikov> mriedem: yeah, I started to fix up check and reserve for instance, but couldn't get there to clean up the rest and things like these
16:41:25 <stvnoyes> hopefully shortly
16:41:49 <ildikov> the connection_info format changes in the driver are also old ones, we had a stage when the format didn't match
16:42:41 <ildikov> that needs to be checked again too
16:44:07 <ildikov> and when we have attach then we will start write a new version of swap
16:44:17 <ildikov> that thing still confuses me...
16:45:33 <ildikov> if someone could once describe me what migrate_volume_completion is supposed to do I would be pretty pleased :)
16:45:49 <mriedem> it's nova calling back to cinder saying i'm done
16:45:59 <mriedem> which was either successful or an error
16:46:09 <mriedem> if cinder initiated the swap, i thought telling it we failed initiated a rollback
16:46:12 <mriedem> on the cinder side
16:46:27 <mriedem> if cinder didn't initiate the swap, i assume cinder is like, 'ok great but i don't know what you're talking about'
16:47:30 <ildikov> and what does it do when Nova initiated a swap and it was successful?
16:47:35 <smcginnis> mriedem: I believe so: https://github.com/openstack/cinder/blob/114e3bd7223ae957bd9bb71e3fb7e114af5c9f72/cinder/volume/manager.py#L2018
16:48:40 <mriedem> ildikov: i don't know
16:48:44 <mriedem> that's a question for cinder
16:48:53 <ildikov> mriedem: ok :)
16:49:10 <mriedem> i would think if cinder wasn't tracking a migration for the volume it would ignore it,
16:49:11 <mriedem> but idk
16:49:41 <mriedem> https://github.com/openstack/cinder/blob/114e3bd7223ae957bd9bb71e3fb7e114af5c9f72/cinder/volume/api.py#L1449
16:49:59 <mriedem> we don't cast to the volume manager then
16:50:33 <mriedem> so i think that's just updating the db on the cinder side
16:50:38 <mriedem> in the case that it didn't fail
16:50:47 <mriedem> which if nova already did that i think it's just a noop
16:50:53 <mriedem> but again i'm not sure
16:50:55 <ildikov> you mean with all that attach/detach calls in the non-error state?
16:50:58 <mriedem> yes
16:51:26 <ildikov> ok, cool
16:51:56 <mriedem> http://logs.openstack.org/95/467995/3/check/gate-grenade-dsvm-neutron-multinode-live-migration-nv/46deb21/logs/new/tempest_conf.txt.gz
16:51:59 <mriedem> oops wrong window
16:52:20 <ildikov> then we will look into how to reproduce those many noop's with the new flow then...
16:52:52 <ildikov> I think mainly that's it for today
16:52:57 <mriedem> cool,
16:53:08 <mriedem> i've also put out the nova core call to 2 specific cores to review the bottom changes which have the +2
16:53:10 <ildikov> mriedem: who to ping with those few patches to merge?
16:53:10 <mriedem> alex and kenichi
16:53:22 <mriedem> alex is in china so he's probably done for the day already
16:53:23 <ildikov> mriedem: awesome, tnx
16:53:31 <mriedem> but kenichi is in california so he can't hide
16:53:40 <ildikov> ok :)
16:54:32 <ildikov> I pinged other two guys, hopefully we will have at least one looking at those finally
16:55:07 <ildikov> cool
16:55:32 <ildikov> if nothing else then thank y'all for the today's meeting
16:55:32 <mriedem> which other two guys?
16:55:55 <ildikov> lyarwood and sfinucan IIRC
16:55:59 <mriedem> oh, not cores
16:56:01 <mriedem> but ok
16:56:05 <mriedem> anyway yeah let's call it a meeting
16:56:13 <mriedem> oh sfinucan is
16:56:15 <mriedem> duh
16:56:32 <ildikov> :)
16:57:15 <ildikov> talk to you next week the latest! :)
16:57:20 <ildikov> have a good day
16:57:26 <ildikov> #endmeeting