16:00:05 <ildikov> #startmeeting cinder-nova-api-changes
16:00:06 <openstack> Meeting started Thu Apr 20 16:00:05 2017 UTC and is due to finish in 60 minutes.  The chair is ildikov. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:07 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:09 <mriedem> o/
16:00:10 <openstack> The meeting name has been set to 'cinder_nova_api_changes'
16:00:14 <mriedem> o/ o/ o/
16:00:16 <johnthetubaguy> o/
16:00:20 <ildikov> DuncanT ameade cFouts johnthetubaguy jaypipes takashin alaski e0ne jgriffith tbarron andrearosa hemna erlon mriedem gouthamr ebalduf patrickeast smcginnis diablo_rojo gsilvis  xyang1 raj_singh lyarwood breitz jungleboyj
16:00:22 <lyarwood> o/
16:00:31 <hemna> \o
16:00:34 <ildikov> mriedem: hi :)
16:00:42 <smcginnis> ./
16:01:05 <patrickeast> o/
16:01:14 <jungleboyj> o/
16:01:55 <ildikov> I think we can start :)
16:02:36 <stvnoyes> o/
16:02:49 <ildikov> so ongoing things are updating the detach flow in Nova
16:02:57 <ildikov> and remove check_detach at this point
16:03:02 <mriedem> https://review.openstack.org/#/c/438750/ has a +2
16:03:41 <lyarwood> a quick nit on that, do we also need to delete the attachment_id from the bdm when we are removing the attachment?
16:03:41 <ildikov> mriedem: awesome, thank you :)
16:04:06 <mriedem> oh hmm
16:04:18 <ildikov> lyarwood: actually we should :S
16:04:22 <johnthetubaguy> does the BDM get deleted at that point?
16:04:30 <johnthetubaguy> I guess not always
16:05:32 <lyarwood> yeah sorry trying and failing to think of a use case where it would remain, ignore me, I'll follow up in the review if I find one
16:05:46 <mriedem> well,
16:05:52 <mriedem> it's a good question,
16:05:56 <johnthetubaguy> terminate connection ones
16:06:07 <mriedem> but things that are keying off attachment_id would see it's not set and then do any old path stuff
16:06:11 <johnthetubaguy> but wait, no attachment delete there
16:06:23 <johnthetubaguy> mriedem: good point
16:06:54 <johnthetubaguy> although after you detach a bdm, not sure what you would do with it, unless its already attached somewhere else or about to be attached somewhere else
16:06:55 <lyarwood> but that's only on the detach flow right? attach isn't going to use attachment_id, it's just always going to use the new flow.
16:07:08 <johnthetubaguy> lyarwood: I think that might be the key bit
16:07:17 <mriedem> detach_volume in the compute manager deletes the bdm
16:07:21 <mriedem> after the detach
16:07:27 <ildikov> lyarwood: attach will set it in the BDM
16:07:39 <mriedem> unless it's doing a rebuild
16:07:45 <ildikov> lyarwood: attach will use microversions to decide what flow to use
16:08:06 <mriedem> hmm, this is going to be weird with rebuild/evacuate
16:08:07 <mdbooth> o/
16:08:29 <mriedem> i believe rebuild models things with a single bdm
16:08:30 <johnthetubaguy> well, on the new host its an attach flow
16:09:55 <johnthetubaguy> ... I wonder if we really detach the attachment, rather than the BDM, oh my, my head is mashed :(
16:10:03 <ildikov> mriedem: you mean updating or not updating the attachment_id?
16:11:17 <ildikov> mriedem: johnthetubaguy: do you have kind of a state machine for the BDM?
16:11:34 <ildikov> just to try to figure out what happens with it in each use case
16:12:06 <mriedem> hells no
16:12:11 <johnthetubaguy> I think we wrote it out in the spec
16:12:19 <johnthetubaguy> I wouldn't want to go down that path here
16:12:29 <mriedem> johnthetubaguy: so rebuild is either going to delete the old bdm and create a new bdm, or update the single bdm with the new attachment_id
16:12:31 <mriedem> i think
16:12:40 <johnthetubaguy> yeah, I think it does the update
16:12:51 <johnthetubaguy> you know connection_info
16:12:52 <mriedem> because it's the same volume
16:12:58 <johnthetubaguy> we have to store that
16:12:58 <mriedem> same volume, different attachment
16:13:01 <johnthetubaguy> so we don't loose it
16:13:07 <mriedem> yeah so i'm still +2
16:13:10 <johnthetubaguy> I think attachment_id will have to be done in the same way
16:13:13 <johnthetubaguy> yeah, I think me too
16:13:36 <johnthetubaguy> either way, I think its fine with what uses that code right now
16:13:47 <mriedem> yeah that too,
16:13:51 <mriedem> we can flush out other demons later
16:13:55 <johnthetubaguy> +1
16:14:00 <johnthetubaguy> its a problem for live-migrate
16:14:02 <johnthetubaguy> etc
16:14:12 <ildikov> yeap, I'm trying to organize the chain of patches right now
16:14:28 <ildikov> I had to accept I cannot really avoid doing that... :/
16:14:39 <mriedem> yeah i was trying to figure out what to review next
16:14:42 <mriedem> since it's not a series that's hard
16:15:08 <mriedem> i guess https://review.openstack.org/#/c/456896/ is next?
16:15:33 <mriedem> since some things build on top of that
16:16:21 <johnthetubaguy> hmm... is there not a way to do an explicit flow
16:16:29 <johnthetubaguy> like API operation X
16:16:42 <mriedem> i think that's what ildikov tried to do
16:16:49 <mriedem> the common things are at the bottom,
16:16:53 <ildikov> mriedem: that can be
16:16:54 <mriedem> but then each api is handled separately
16:16:56 <mriedem> not in a chain
16:16:57 <johnthetubaguy> ah, my bad, I see
16:17:07 <mriedem> i just don't personally want to handle live migration right now...
16:17:17 <johnthetubaguy> yeah
16:17:25 <ildikov> mriedem: I tried to re-use terminate_connection, re your comment on try not to do too much copy-paste there
16:17:26 <johnthetubaguy> maybe migrate would be easier, or rebuild?
16:17:46 <johnthetubaguy> honestly, I would add a new method, and consolidate if it makes sense
16:18:04 <johnthetubaguy> to stop accidentally using the new stuff, I was thinking
16:18:13 <ildikov> mriedem: but it's not crystal clean yet whether that was a good idea or not
16:18:23 <ildikov> mriedem: I meant _terminate_volume_connections to be clear
16:18:25 <mriedem> ildikov: in https://review.openstack.org/#/c/456896/3/nova/compute/manager.py ?
16:18:41 <mriedem> ok i see
16:19:16 <johnthetubaguy> you know what, we need create before delete
16:19:55 <ildikov> mriedem: yes
16:20:13 <ildikov> mriedem: attachment_create is there by mistake in that patch
16:20:47 <ildikov> johnthetubaguy: accidentally use new stuff?
16:21:19 <mriedem> ildikov: i agree with lyarwood in that change, comments inline
16:21:27 <mriedem> i don't see anything calling _terminate_volume_connections that already has a connector in scope
16:21:30 <mriedem> unless those come later in the series
16:21:37 <johnthetubaguy> ildikov: I will describe more in the patch
16:21:43 <ildikov> johnthetubaguy: ok
16:21:53 <johnthetubaguy> ah, so I think I mean what mriedem just said
16:22:07 <johnthetubaguy> oops, maybe not
16:22:38 <mriedem> i guess swap_volume is the one that already has the connector
16:22:47 <mriedem> so https://review.openstack.org/#/c/456896/3/nova/compute/manager.py is prematurely optimizing
16:23:41 <ildikov> mriedem: we have it here for instance: https://review.openstack.org/#/c/456988/1/nova/compute/manager.py
16:24:22 <johnthetubaguy> doesn't attachment_create need the connector here?
16:24:31 <johnthetubaguy> my bad, its the wrong one
16:24:43 <mriedem> ildikov: yeah we don't need that yet in https://review.openstack.org/#/c/456896/3/nova/compute/manager.py
16:24:46 <mriedem> comments inline
16:24:52 <mriedem> anyway, we could review by committee later :)
16:24:55 <mriedem> should we move on?
16:25:03 <ildikov> johnthetubaguy: I thought it just reserves at this point, but we might actually want to connect as well
16:25:22 <johnthetubaguy> I don't think you do the connect in there right now
16:25:42 <mriedem> creating an attachment in the _terminate_volume_connections method is wrong
16:25:44 <johnthetubaguy> I think thats correct, because we are on the wrong host to get the destination connector
16:25:56 <johnthetubaguy> mriedem: I think its only needed for the old API case
16:25:57 <ildikov> I only split the code into multiple patches and that was it
16:26:01 <mriedem> so whatever patches do that, like swap_volume, will need something different
16:26:11 <ildikov> my bad, but I didn't have more bandwidth for it this week :(
16:26:39 <johnthetubaguy> mriedem: ++
16:26:45 <ildikov> mriedem: when you create an attachment without connector that only keeps the volume reserved
16:27:06 <johnthetubaguy> mriedem: I think thats what I was trying to say earlier with my create a new method for this, it feels like we have many uses, not all need to do the attachment create
16:27:11 <ildikov> mriedem: we can do that later, but then the volume will be out there available
16:27:16 <johnthetubaguy> oh, its worse than that isn't it....
16:27:58 <johnthetubaguy> the attachment delete is needed when we both detatch and terminate connection, but when we just terminate connection, we probably want to create then delete
16:28:12 <ildikov> I added a keep_reserved boolean to that method later on and when that's True that's when I wanted to do the attachment_create to reserve
16:28:19 <johnthetubaguy> I am not sure I understood what I just said there
16:28:44 <johnthetubaguy> honestly, I would do swap with all brand new code, then we look for shared code later?
16:28:52 <ildikov> the first patch there shouldn't have attachment_create regardless
16:28:54 <johnthetubaguy> I know thats going to mean cut and paste in the mean time
16:29:09 <smcginnis> johnthetubaguy: Seems a reasonable approach though.
16:29:11 <johnthetubaguy> mriedem: do you think that would work ^
16:29:18 <smcginnis> As long as the clean up work gets identified and done.
16:29:24 <johnthetubaguy> I just think we are going to confuse ourselves with premature optimizations here
16:29:27 <ildikov> johnthetubaguy: works for me, I only tried this per request from mriedem, I can switch back to new code
16:29:50 <johnthetubaguy> its a good idea to try, but its bending my mind reviewing this stuff
16:30:01 <johnthetubaguy> agreed we need no cut and paste in the end, like smcginnis said
16:30:26 <smcginnis> Let's get it working first, then get it cleaned up.
16:30:44 <johnthetubaguy> so... I am thinking about to nova-network and neutron integration
16:30:45 <mriedem> i think re-using code is fine, if it's done properly,
16:30:45 <jungleboyj> smcginnis:  +2
16:30:47 <ildikov> fair enough, I will reorganize this when I do that chaining up
16:30:51 <mriedem> there are just simple mistakes in these patches
16:30:55 <mriedem> that once removed are not a problem
16:31:07 <johnthetubaguy> mriedem: maybe
16:31:12 <mriedem> i've commented in two of them
16:31:36 <lyarwood> ildikov: is bandwidth going to be an issue over the next week? Happy to take on swap_volume for example if it is.
16:31:54 <mriedem> so what's the issue with the keeping the volume reserved think?
16:31:57 <mriedem> that's for swap volume?
16:32:14 <mriedem> because as it stands, the code is deleting the attachment and then creating a new attachment w/o a connector to fake out that the volume is still reserved
16:32:17 <mriedem> that's lossy
16:32:26 <mriedem> we have to have a better way to make that atomic
16:32:32 <mriedem> can't we update the attachment to drop the connector?
16:32:36 <johnthetubaguy> you don't need the connector
16:32:39 <ildikov> lyarwood: let's sync up after the meeting as if I put all these into one chain then it's tough to share the work on it
16:32:55 <johnthetubaguy> oh, sorry, thats not your question
16:33:09 <johnthetubaguy> mriedem: the API we have is create a new one, then delete it, to make it atomic
16:33:10 <lyarwood> mriedem: I suggested that during the spec review but I don't think the api can do that at the moment
16:33:16 <ildikov> mriedem: the Cinder code does not support that at the moment
16:33:30 <johnthetubaguy> ildikov: does it support the reverse ordering?
16:33:36 <johnthetubaguy> create then delete
16:33:50 <ildikov> mriedem: and when I last talked to jgriffith we kind of agreed it would be better not to mess up update with that case
16:33:55 <mriedem> ok, create new attachment (w/o connector), delete old attachment (that had connector), then update new attachment with new connector?
16:34:08 <johnthetubaguy> mriedem: yeah, that I thought what the plan
16:34:17 <mriedem> it'd be nice if attachment update could just be atomic
16:34:20 <johnthetubaguy> mriedem: agreed its messy
16:34:26 <mriedem> since the client has to jump through hoops now
16:34:37 <ildikov> johnthetubaguy: we might be able to do that, I don't think anything prevent us from reversing that order
16:34:37 <mriedem> and it means the client has to know the internal state machine of cinder for how attachments work
16:34:39 <johnthetubaguy> so live-migrate has to do that flow anyway, as both are connected for some time
16:34:58 <johnthetubaguy> ildikov: we need the reverse order for live-migrate, so I hope it works
16:35:28 <mriedem> yeah create new attachment on destination host, once live migration is done, delete old attachment from source host
16:35:32 <mriedem> we do the same with ports
16:36:40 <lyarwood> johnthetubaguy: iirc you called that flow out in the spec so it is allowed
16:36:41 <ildikov> johnthetubaguy: I think it should, I don't recall anything that would fail with that case right now
16:37:11 <johnthetubaguy> lyarwood: yeah, I remember jgriffith gave me a funny look about that, so I am curious if it made it into the code
16:37:47 <johnthetubaguy> mriedem: FWIW, its causing big problems with ports
16:38:05 <ildikov> johnthetubaguy: what kind of problems?
16:38:34 <johnthetubaguy> so its probably a total distraction here, its specific to how neutron does a partial attach
16:38:43 <johnthetubaguy> and live-migrate, naturally
16:38:57 <ildikov> I just asked as if we have anything similar to that here then we could try to do it better for volumes
16:39:05 <mriedem> johnthetubaguy: are you referring to all of the stuff we talked about with neutron in BCN that we never implemented?
16:39:11 <johnthetubaguy> mriedem: yeah
16:39:31 <johnthetubaguy> mriedem: they wouldn't implement it, because we wouldn't approve our side of the work, or something like that
16:39:40 <johnthetubaguy> and it wasn't high priority
16:39:44 <mriedem> oh mututally assured destruction, i see
16:39:45 <johnthetubaguy> but anyways, thats a total rat hole
16:39:54 <mriedem> *mutually
16:39:58 * johnthetubaguy nods
16:40:13 <mriedem> well it was ocata,
16:40:17 <mriedem> we had 3 people and 3 months to do anything
16:40:18 <mriedem> so...
16:40:35 <mriedem> anyway,
16:40:38 <mriedem> comments are in the patches
16:40:41 <mriedem> what else?
16:41:10 <johnthetubaguy> seems like we have the stuff for next week
16:41:26 <johnthetubaguy> who was working on those, I forget lyarwood or ildikov or both?
16:41:30 <ildikov> we could decide what to do with remove check_detach: https://review.openstack.org/#/c/446671/
16:41:55 <ildikov> johnthetubaguy: me for sure, I will sync up with lyarwood and stvnoyes to see how we can share work items
16:42:04 <johnthetubaguy> sweet
16:42:24 <johnthetubaguy> #action ildikov will sync up with lyarwood and stvnoyes to see how we can share work items
16:42:31 <johnthetubaguy> check_detach
16:42:34 <ildikov> johnthetubaguy: and I will hunt down jgriffith as we're in the same town today and figure out what we touched on today to see what to do with these edge cases
16:42:42 <johnthetubaguy> so I totally missread the code and got all worried about breaking the API
16:43:10 <johnthetubaguy> it seems the extra checks are very much "belt and braces" stuff, by which I mean overkill
16:43:37 <johnthetubaguy> ildikov: is that right, its just redundant checks we are removing here really
16:44:35 <ildikov> johnthetubaguy: check_attach and check_detach were/are checks remained here from the ancient times when Cinder became a separate entity in my understanding and from what I saw in the code so far
16:45:02 <ildikov> johnthetubaguy: and I personally believe in code clean up and deleting code time to time and this seems like a good time for it
16:45:33 <mriedem> check_attach had implications for api behavior,
16:45:34 <ildikov> johnthetubaguy: and finally in my view Nova should not track the volume state in these cases as it's Cinder's responsibility really
16:45:38 <mriedem> so it required service version checks and the like
16:45:45 <hemna> ildikov, +1
16:45:52 <mriedem> i also love deleting code
16:45:54 <ildikov> mriedem: that was because of a missed reserve_volume call
16:46:07 <mriedem> but i don't assuming every service is always upgraded to the latest level at the same time
16:46:07 <johnthetubaguy> ildikov: I support all your aims, I am just trying to work out if the change is safe
16:46:28 <ildikov> so I think we made the Nova side code better overall with that :)
16:46:31 <mriedem> having said all that, i haven't gone through the check_detach patch, it might be less messy
16:46:35 <johnthetubaguy> I mean if it were me merge code deletes should deliver you a cookie
16:46:48 <johnthetubaguy> mriedem: I think it is
16:46:50 <mriedem> i didn't get a cookie for removing os-pci
16:47:10 <mriedem> anyway...moving on?
16:47:14 <ildikov> mriedem: we can sort this out in Boston :)
16:47:24 <johnthetubaguy> so... I have a quick question
16:47:28 <ildikov> mriedem: plz check the patch, it's pretty small, should be a piece of cake
16:47:29 <johnthetubaguy> begin_detaching...
16:47:36 <ildikov> mriedem: sorry, cookie :)
16:47:44 <johnthetubaguy> what does that get replaced by in the new world?
16:48:15 <johnthetubaguy> i.e. one we use the new attachment API
16:48:31 <ildikov> johnthetubaguy: you or someone from Cinder should remind me what that does exactly before I try to answer that question
16:48:51 <johnthetubaguy> its the API that does the state check, to see if we can detach
16:48:54 <johnthetubaguy> cinder API
16:49:11 <johnthetubaguy> I don't think it exists in the new API, so Nova would have to manual check the state
16:49:38 <johnthetubaguy> I think it does attached -> detaching state change on the volume in the old world
16:50:06 <ildikov> johnthetubaguy: we shouldn't solve anything on the Nova side that belongs to Cinder
16:50:21 <johnthetubaguy> I agree, but the API doesn't let us do that
16:50:59 <ildikov> we can also re-evaluate the states that we're maintaining in Cinder, if all of them are needed then we can check what's missing and fix it
16:51:15 <johnthetubaguy> so... actually, about that patch we just approved
16:51:25 <ildikov> I mean fix where the fix should happen
16:51:29 <johnthetubaguy> I guess we actually still call begin_detaching, which I don't think we are allowed to?
16:51:55 <johnthetubaguy> so maybe begin_detaching is in the new API too, I wasn't ever 100% clear on that
16:52:29 <mriedem> begin_detaching changes the volume status from in-use to detaching,
16:52:35 <johnthetubaguy> right
16:52:38 <johnthetubaguy> sorry
16:52:39 <ildikov> I don't think it is
16:52:40 <mriedem> if we fail to detach, we call roll_detaching to put it back into in-use state
16:53:23 <johnthetubaguy> maybe thats why we dropped it in the new API, my memory is sketchy on that now
16:53:30 <mriedem> if begin_detaching isn't available in the new api, shouldn't it fail if you request it at microversion >= 2.37?
16:53:32 <mriedem> *3.27
16:53:47 <mriedem> or is it just a noop?
16:53:57 <ildikov> mriedem: I meant that the new flow doesn't use it, the code is available still
16:53:59 <mriedem> i guess this is homework for cinder
16:54:07 <johnthetubaguy> I think so
16:54:18 <johnthetubaguy> so to be clear, I think this is the new flow
16:54:36 <johnthetubaguy> check BDM state, continue (maybe check cinder is in sync-> attachment still exists)
16:54:44 <johnthetubaguy> give early error to uses it was a bad API call
16:54:45 <johnthetubaguy> etc
16:54:48 <johnthetubaguy> then we do stuff
16:54:52 <johnthetubaguy> then delete attachment
16:55:01 <mriedem> why does begin_detaching even exist?
16:55:03 <johnthetubaguy> so its like one cinder API call only in the new flow, I think
16:55:04 <smcginnis> I thought there was a reason it was no longer needed, but we can verify that.
16:55:16 <mriedem> who cares if the volume is being detached?
16:55:21 <johnthetubaguy> so the code we are deleting seems to be needed in the new flow
16:55:22 <mriedem> it's either in-use or it's not
16:55:26 <johnthetubaguy> mriedem: correct
16:55:35 <smcginnis> mriedem: +1
16:55:43 <mriedem> begin_detaching was probably some internal nova-volume thing, externalized in the cinder api
16:56:12 <johnthetubaguy> its just so you get a detaching... state in the API
16:56:18 <johnthetubaguy> actually, thats a good point
16:56:18 <mriedem> same with putting the volume in reserved state, but that's a bit trickier because we need to reserve the volume in nova-api and then put it in-use in the compute
16:56:24 <johnthetubaguy> the new flow changes the API state changes
16:56:35 <hemna> well it was also used to try and prevent race conditions
16:56:43 <johnthetubaguy> reserve just == "in use" I think
16:56:43 <mriedem> hemna: which one?
16:56:52 <ildikov> mriedem: if we would ever get there to multi-attach I guess we might not want to attach and detach at the same time?
16:56:57 <johnthetubaguy> two people calling detatch... damm it
16:56:59 <hemna> multiple detaches happening at the same time
16:57:14 <hemna> re: volume state mgmt
16:57:18 <johnthetubaguy> grr, we missed that one
16:57:31 <johnthetubaguy> right now it will fail on the compute due to the locking going on there
16:57:39 <mriedem> so request 1 puts the volume into detaching state, request 2 fails because the volume is already detaching
16:57:44 <johnthetubaguy> wich is a poor API expereience
16:57:45 <hemna> yah
16:57:59 <mriedem> and we don't have this with the new flows?
16:58:08 <hemna> if it's already detaching, then isn't the 2nd api request a success ?
16:58:10 <mriedem> i.e. we can't use begin_detaching with the >=3.27?
16:58:11 <hemna> it's detaching....
16:58:19 <hemna> except multi-attach
16:58:29 <mriedem> we have 2 minutes left
16:58:32 <ildikov> mriedem: we can and I think we still do actually
16:58:36 <mriedem> but this is why we can't just rip code out willy nilly
16:58:48 <mriedem> i said it, willy nilly
16:58:57 <ildikov> lol :)
16:59:12 <mriedem> anyway we're out of time
16:59:13 <ildikov> mriedem: it's check_detach I would love to rip out
16:59:35 <ildikov> we can continue on the Cinder channel and code reviews
16:59:50 * willynilly wants to remove check_detach
16:59:51 * johnthetubaguy needs to go cook his dinner
16:59:57 <ildikov> or Nova channel, whichever we want to
17:00:00 <johnthetubaguy> heh
17:00:09 <ildikov> thanks everyone for today
17:00:29 <ildikov> #endmeeting