16:00:06 <ildikov> #startmeeting cinder-nova-api-changes
16:00:07 <openstack> Meeting started Thu Jun  8 16:00:06 2017 UTC and is due to finish in 60 minutes.  The chair is ildikov. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:09 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:11 <openstack> The meeting name has been set to 'cinder_nova_api_changes'
16:00:13 <mriedem> o/
16:00:21 <ildikov> DuncanT ameade cFouts johnthetubaguy jaypipes takashin alaski e0ne jgriffith tbarron andrearosa hemna erlon mriedem gouthamr ebalduf patrickeast smcginnis diablo_rojo gsilvis  xyang1 raj_singh lyarwood breitz jungleboyj
16:00:25 <stvnoyes> o/
16:00:41 <pewp> hemna (=゚ω゚)ノ
16:00:47 <johnthetubaguy> o/
16:01:18 <ildikov> hi :)
16:01:24 <jgriffith> o/
16:01:56 <ildikov> ok, let's start
16:02:02 <mriedem> you don't have stvnoyes in your list
16:02:17 <mriedem> and should probably remove alaski :)
16:02:28 <ildikov> mriedem: will update, tnx :)
16:03:08 <ildikov> so the open reviews are looking good: https://review.openstack.org/#/q/topic:bp/cinder-new-attach-apis
16:03:44 <ildikov> we have two patches on the gate to handle the microversion discovery, tnx to mriedem and all the reviewers
16:04:01 <ildikov> the attach PoC is under continuous updates
16:04:21 <mriedem> stvnoyes: i think we can abandon https://review.openstack.org/#/c/471142/
16:04:27 <ildikov> stvnoyes: were you be able to test live_migrate at all with that one?
16:04:44 <stvnoyes> agreed
16:05:08 <mriedem> changing course on the live migration stuff,
16:05:24 <mriedem> going to try the LiveMigrateData object to stash the old volume attachment ids
16:05:27 <mriedem> during the live migration
16:05:27 <stvnoyes> i am testing migrate now with the new approach of having the old attach info in the migrate_data. that seems to be working ok with one small problem that i'm tracking down now
16:06:22 <stvnoyes> some strange problem with os-brick. i haven't confirmed this, but it doesn't seem to like having the 'serial' value in connection_info
16:06:38 <hemna> have any logs ?
16:06:39 <stvnoyes> during connect
16:07:08 <ildikov> stvnoyes: isn't it another cast issue?
16:07:16 <stvnoyes> well, I am doing connects right now with and with that and when I seem to get a pattern I'll let you know
16:07:30 <hemna> ok, let me know if I can fix something
16:07:34 <stvnoyes> no, no exceptions, the volume is just not seen from within the vm
16:07:55 <stvnoyes> ^ with and withOUT that...
16:07:56 <hemna> maybe pastebin some logs at the time os-brick is called and returns
16:08:01 <stvnoyes> ok
16:08:29 <stvnoyes> just hit this within the last hour
16:08:34 <hemna> with debug logging on, you should be able to see what os-brick returns from connect_volume back to nova
16:09:04 <stvnoyes> ok
16:09:22 <hemna> ping me if you any help.
16:09:30 <johnthetubaguy> in the new flow, I guess we call os-brick more often? like when you create the second attachment?
16:09:57 <johnthetubaguy> ignore more, I should re-read that
16:10:02 <stvnoyes> I think it's the same # of times calling, this is during pre-live_migration where you connect to the vol on the dest
16:10:03 <johnthetubaguy> ignore me that was meant to say
16:10:14 <johnthetubaguy> yeah, I forgot about the stashed connect_info's
16:10:46 <stvnoyes> btw, what is the purpose of 'serial'?
16:11:01 <stvnoyes> it has the volume_id in it
16:11:06 <mriedem> not sure, i've wondered that over time too
16:11:06 <hemna> I'm unfamiliar with that
16:11:06 <jgriffith> stvnoyes that's an *excellent* question :)
16:11:28 <stvnoyes> ;-) I guess I'm not alone
16:11:29 <ildikov> mriedem: haven't we all...
16:12:23 <jgriffith> stvnoyes I've never figured out why that's there as an additional volume_id arg
16:12:53 <mriedem> it's a libvirt thing
16:12:54 <mriedem> set in LibvirtConfigGuestDisk
16:13:12 <hemna> the cryptsetup stuff uses it for some reason
16:13:13 <johnthetubaguy> its the thing we connect the VM to, I guess?
16:13:42 <hemna> https://github.com/openstack/os-brick/blob/master/os_brick/encryptors/cryptsetup.py#L51
16:13:44 <stvnoyes> anyways, I think i'm very close to getting this migrate work up for review. the unit tests are done... Just one last thing... :-)
16:13:45 <mriedem> hemna: probably just b/c it knows it's the volume id
16:13:58 <hemna> only place it's accessed afaik
16:14:10 <mriedem> no it's also used in the libvirt driver
16:14:12 <mriedem> during snapshot
16:14:16 <mriedem> _volume_snapshot_create
16:14:17 <ildikov> stvnoyes: never say 'just one last thing' out loud! :)
16:14:20 <mriedem> for the glusterfs stuff
16:14:26 <hemna> ah ok, I wasn't looking in the nova code
16:14:29 <mriedem> which we've removed now, but...
16:14:38 <stvnoyes> let me work on this more. it's too early to dive too deep on this yet
16:15:09 <stvnoyes> ildikov- that's why the smile is there...
16:15:13 <stvnoyes> :-)
16:15:34 <mriedem> "If present, this specify serial number of virtual hard drive.           For example, it may look           like <serial>WD-WMAP9A966149</serial>.           Not supported for scsi-block devices, that is those using           disk type 'block' using device 'lun'           on bus 'scsi'.           Since 0.7.1"
16:15:42 <ildikov> stvnoyes: last time I hit iscsiadm issues in my test env for instance, so there's always something that's odd at least in my env with attach...
16:16:10 <mriedem> https://libvirt.org/formatdomain.html#elementsDisks
16:16:14 <ildikov> stvnoyes: it's still a pretty thin line :)
16:17:10 <stvnoyes> thanks mriedem - this makes sense - serial - If present, this specify serial number of virtual hard drive.
16:17:28 <mriedem> i don't think nova makes any distinction for "Not supported for scsi-block devices, that is those using           disk type 'block' using device 'lun'           on bus 'scsi'." though
16:17:36 <mriedem> nova just always sets it i think
16:18:23 <jgriffith> mriedem it does, and uses the cinder volume-id in the case of block type
16:18:50 <stvnoyes> with John's new attach patch, it's not set during attach (which works ok).
16:19:05 <stvnoyes> though I am not suing John's latest
16:19:11 <stvnoyes> using
16:19:16 <mriedem> i can ask some libvirt/kvm people about serial
16:19:22 <jgriffith> I mean "it does not distinguish"
16:19:37 <stvnoyes> it's OK, let me verify i's the problem first.
16:19:57 <jgriffith> stvnoyes I will look and see if the latest has any more *missing* translations of that
16:20:26 <jgriffith> stvnoyes during attach I explicitly check if that exists and if it doesn't, set it to volume-id
16:20:43 <johnthetubaguy> I think serial is just a thing the virtual device exposes to the guest, right?
16:22:53 <stvnoyes> let's move on since I think it's still too early in my debugging of this.
16:22:54 <mriedem> anyway
16:23:05 <mriedem> i've asked kashyap who will probably ask qemu devs
16:23:26 <mriedem> has there been any movement on the swap volume questions on the cinder side?
16:23:31 <mriedem> or is that no longer a question/issue?
16:23:38 <ildikov> stvnoyes: do you see this issue only with live_migrate?
16:23:51 <ildikov> mriedem: the idea now is to re-write swap for the new flow
16:23:55 <ildikov> on both sides
16:24:08 <ildikov> we just wanted to make attach work before doing that
16:24:24 <mriedem> ildikov: that's kind of a chicken and egg though,
16:24:25 <ildikov> so we can test what we're doing
16:24:48 <ildikov> mriedem: you mean from merging perspective?
16:24:48 <mriedem> i thought we were going to do swap before the attach change at the very end
16:25:11 <mriedem> swap keys off if the volume that is attached, and being swapped from, was attached with the new api flow
16:25:24 <mriedem> https://review.openstack.org/#/c/456971/6/nova/compute/api.py
16:27:01 <ildikov> that's an old review
16:27:11 <mriedem> well,
16:27:14 <mriedem> it's the only review
16:27:15 <mriedem> for swap
16:27:31 <ildikov> I mean since then we realized that the migration_completion stuff on the Cinder side will not work either so we need to re-think this
16:27:31 <mriedem> so now we're talking about a new cinder api microversion for swap?
16:27:46 <mriedem> right, so i'm asking, has it been re-thought :)
16:27:46 <ildikov> we will look into this with jgriffith soon I think
16:28:04 <ildikov> as attach is operational enough for testing now, except a few cleanups
16:28:05 <mriedem> today is p-2 so if there needs to be a new cinder api, it's getting late for pike
16:28:17 <johnthetubaguy> this is blocking us turning on the new API right?
16:28:21 <mriedem> yes
16:28:28 <jgriffith> I don't think there's a need for a new API to simplify that
16:28:34 <mriedem> because swap won't handle new style attachments othrewise
16:28:59 <johnthetubaguy> right, ideally the new and old volumes each have a separate attachment during the swap, because nova has two attachments
16:29:01 <ildikov> mriedem: +1
16:29:03 * jgriffith will look at it again today
16:29:22 <jgriffith> I think there's some confusion also because there's two types of swap
16:29:38 <jgriffith> Literally swap a-->b using libvirts copy function
16:29:48 <johnthetubaguy> cinder started and Nova API started with a different volume
16:29:51 <jgriffith> and the swap associated with migration
16:29:56 <jgriffith> they're two very different things
16:30:03 <johnthetubaguy> right, we are talking about the first one right?
16:30:10 <johnthetubaguy> the Nova swap_volume API?
16:30:11 <mriedem> they both result in the libvirt driver doing the blockRebase though right?
16:30:13 <jgriffith> so that's easy
16:30:19 <jgriffith> mriedem yes
16:30:36 <mriedem> who initiates them is what impacts the migration_completion callback to cinder
16:30:40 <mriedem> and i believe is the pickle
16:30:56 <johnthetubaguy> ah, so its the same two things I am thinking about, just different names
16:30:59 <mriedem> today if cinder didn't initiate, it just ignores the migration_completion callback i think
16:31:01 <jgriffith> mriedem yes, but in the first case the callback is a noop anyway so can just go away
16:31:17 <jgriffith> the second case is a mess, and I honestly haven't come up with an answer for it as of yet
16:31:18 <johnthetubaguy> I thought the only difference is the result volume-uuid for the thing thats left attached at the end?
16:31:43 <jgriffith> johnthetubaguy yes, the second case does a swap of the volume-uuid to make it an "invisible" thing
16:31:43 <mriedem> jgriffith: sure, it can't go away on the nova side w/o us knowing if we should call it or not though, is our problem
16:31:47 <jgriffith> which is stupid IMO
16:32:14 <jgriffith> mriedem but Nova already has all of the control on that
16:32:21 <jgriffith> (in the first case)
16:32:25 <johnthetubaguy> we don't right?
16:32:29 <jgriffith> I'm agreeing that the second case is different
16:32:37 <mriedem> we don't know who called the swap volume REST API
16:32:39 <johnthetubaguy> we don't know if its a user calling us or cinder after the user called cinder right?
16:32:42 <mriedem> if it was cinder or an admin
16:33:01 <jgriffith> johnthetubaguy yes, it's not written that way due to call-backs etc but the call-backs don't actually *do* anything on the Cinder side
16:33:24 <mriedem> jgriffith: in the case of a non-volume migrate initiated swap, right?
16:33:26 <johnthetubaguy> no, no, we are saying the REST API calls on the Nova side, and Nova state is identical in both cases
16:33:32 <jgriffith> mriedem correct
16:33:33 <mriedem> in the case of volume migration, i thought they completed the migration on the cinder side
16:33:55 <mriedem> i think we all know what we're saying, we're just saying different things :)
16:34:02 <jgriffith> haha.. perhaps
16:34:06 <johnthetubaguy> mriedem: +1
16:34:13 <mriedem> the issue is what migrate_completion does on the cinder side for new-style attachments for a volume-migrate initiated swap
16:34:18 <jgriffith> regardless there are problems there that aren't trivial
16:34:31 <jgriffith> IMHO that code should all be reverted :)
16:35:16 <johnthetubaguy> if there was a new swap API, that took two attachment_ids, and was the one cinder called, and for that one we made a new callback to a new API that took the same two attachment_ids... we would be in a more sane place?
16:35:23 <ildikov> jgriffith: that might not be an option :)
16:35:34 <jgriffith> bummer
16:35:35 <johnthetubaguy> actually... thats missing too much context
16:36:00 <johnthetubaguy> to be clear, this is now blocking multi-attach, and us moving to the new API
16:36:23 <ildikov> it pretty much does
16:37:01 <johnthetubaguy> for me, if we passed attachment_ids around, instead of volume-ids, for cinder and nova APIs, wouldn't we have all the state we needed?
16:37:34 <hemna> especially in the case of multi-attach that makes sense
16:38:14 <jgriffith> johnthetubaguy yes, that was kinda the whole idea of the attachment-object
16:38:17 <johnthetubaguy> that means passing Nova an attachment we would update from os-brick, then attach to the instance, and the other attachment would have to be already attached, etc
16:38:32 <ildikov> johnthetubaguy: how would that help with the we don't know who called the swap-volume API problem?
16:39:23 <johnthetubaguy> ildikov: it doesn't fix that, but we would be telling you what the attchement was, so cinder should know if thats a migration attachment or not
16:39:35 <hemna> attaching the attached attachment.
16:39:40 <johnthetubaguy> it helps a bit in that we care less if the volume-uuid changes, because the attachment-uuid is the same
16:40:00 <ildikov> johnthetubaguy: ah ok, got it
16:40:00 <johnthetubaguy> hemna: whos doing what now?
16:40:08 <hemna> :P
16:40:10 <hemna> ignore me
16:40:21 <johnthetubaguy> heh
16:41:05 <johnthetubaguy> it seems like a new cinder API to be added, and a new Nova API that cinder calls into?
16:41:27 <johnthetubaguy> thats one approach, I guess the question is, is there a simpler option available?
16:41:39 <jgriffith> johnthetubaguy that honestly might be the safest and most expedient answer
16:42:03 <johnthetubaguy> yeah, its the safest explcit one I think
16:42:08 <ildikov> johnthetubaguy: jgriffith: +1
16:42:08 <jgriffith> but that means we need to come up with something and agree on both sides in like 24 hours :)
16:42:41 <ildikov> jgriffith: I don't see the issue :)
16:42:46 <johnthetubaguy> well... ideally three months ago, but either way could work
16:43:00 <jgriffith> ha!
16:43:06 * jgriffith fires up his way-back machine
16:43:36 <mriedem> i'm not totally following here, you're suggesting a new cinder api where nova passes in the newly attached attachment uuid for the attachment we swapped *to*?
16:43:41 <mriedem> to complete the migratoin?
16:43:58 <johnthetubaguy> a new nova API to pass the existing and new attachment id
16:44:02 <mriedem> and if swap was initiated by a user, not cinder, then it's ignored like today?
16:44:10 <jgriffith> mriedem I *think* a new cinder/nova interaction for the volume-migration call from Cinder
16:44:11 <johnthetubaguy> then a new cinder API it calls back to, passing both the attachments
16:44:12 <mriedem> new nova api?
16:44:31 <johnthetubaguy> yeah, I think its a new nova api too really
16:44:39 <jgriffith> johnthetubaguy +1
16:44:43 <mriedem> umm
16:44:45 <johnthetubaguy> so the cinder started one, calls a new API that gets attachments
16:45:00 <johnthetubaguy> we can try and stop regular using using that this time, for real
16:45:05 <mriedem> can we not do more special case service-specific APIs please
16:45:14 <mriedem> you can't stop regular users from using it
16:45:26 <johnthetubaguy> well, I mean good default RBAC really
16:45:34 <mriedem> default rbac on swap is already admin only
16:45:36 <johnthetubaguy> its user safe, if they create attachments
16:45:38 <jgriffith> ok, I'll look again later today and see if there's a way to fix the mess we already have
16:45:52 <johnthetubaguy> yeah, admin only isn't good enough, but thats a different debate
16:46:17 <mriedem> anyway - my fear is, if swap depends on the new attach flow, and now we can't do swap w/o this new api interaction, which is a microversion newer than the new attach api, things start to get super f'ed
16:46:43 <johnthetubaguy> so.. if we ignore multi attach, I am surpized why the cinder side can't just be "fixed up", are we missing something there?
16:47:01 <johnthetubaguy> mriedem: yeah, we would have to raise to this new API when checking if we can use the new flow
16:47:42 <mriedem> maybe we don't design this here,
16:47:48 <mriedem> i don't understand the issues on the cinder side,
16:47:58 <mriedem> but can the problem be stated in the ML?
16:48:05 <johnthetubaguy> thats my previous question really, whats the issue cinder side, if we keep the API the same
16:48:15 <mriedem> and then we, and hopefully others, can sort it out where/when we have time to digest the actual problem?
16:48:18 <johnthetubaguy> yeah, some written down form would be good
16:48:39 <johnthetubaguy> it seems clear we can't move forward here and now without that
16:49:11 <mriedem> i can live with new cinder apis if we have to, because then the deps are still one direction,
16:49:24 <mriedem> but if we're doing new apis on both sides, which is a circular dependency, then we start to get into crazy land
16:49:36 <johnthetubaguy> so proposal 1: re-write the lot to be attachment aware, propsal 2: just fail if any attachment is multi-attach and use the old APIs, question: what does proposal 2 not work for cinder?
16:49:58 <ildikov> mriedem: there are old attach/detach snippets in the call-bak on the Cinder side, which should not hit in the simple case though
16:50:01 <mriedem> i'm not really sure why we're talking about multi-attach at this point
16:50:19 <mriedem> "swapping multi-attached volumes is not supported, fin"
16:50:24 <mriedem> at least in v1
16:50:31 <ildikov> mriedem: +1
16:50:37 <mriedem> we do'nt support multiattach today in nova anyway
16:50:40 <mriedem> and likely won't in pike
16:50:51 <johnthetubaguy> well, we should include swapping a volume that has multiple attachments
16:51:00 <johnthetubaguy> i.e. half way through a live-migration
16:51:13 <mriedem> that's not the same operatoin though
16:51:19 <mriedem> there is no migrate_completion callback during nova live migratoin
16:51:21 <johnthetubaguy> Nova will just reject those API calls anyways, so no change there
16:51:33 <mriedem> the migration_completion cinder api is the issue
16:51:38 <mriedem> as far as i understand
16:51:42 <mriedem> and that's *only* for swap volume
16:51:44 <mriedem> on the nova side
16:51:49 <ildikov> mriedem: correct
16:51:53 <johnthetubaguy> yes, I am talking about when you call swap volume
16:51:59 <johnthetubaguy> what is the state of the system
16:52:17 <johnthetubaguy> if we have a volume attached on two hosts, its "complicated" on the cinder side, as there are two attachments
16:52:29 <johnthetubaguy> but Nova doesn't care about that case
16:52:39 <mriedem> johnthetubaguy: yo'ure talking about swapping a multiattach volume though right?
16:52:43 <johnthetubaguy> no
16:52:51 <johnthetubaguy> multi-attach=false
16:53:02 <johnthetubaguy> two attachments
16:53:10 <johnthetubaguy> source host and destination host
16:53:15 <mriedem> how is the same volume attached to two hosts during swap volume?
16:53:19 <mriedem> if not multiattach?
16:53:25 <mriedem> swap volume is literally 2 separate volumes
16:53:28 <mriedem> 1 attachment each
16:53:31 <johnthetubaguy> on the cinder side you can call migrate at any time
16:53:43 <johnthetubaguy> the VM could be in the live-migrating state
16:53:47 <johnthetubaguy> Nova fails the API call
16:54:00 <johnthetubaguy> but theoretically, you can have two attachments in the Cinder DB
16:54:02 <johnthetubaguy> same volume
16:54:02 <mriedem> can you swap volumes on an instance being migrated?
16:54:05 <johnthetubaguy> same instance
16:54:25 <johnthetubaguy> so if you call that, I think you get a Nova API error
16:54:31 <johnthetubaguy> but its a case on the cinder side
16:54:33 <mriedem> you can't swap volumes on an instance that's undergoing a task state change i don't think
16:54:42 <mriedem> we're either talking past each other,
16:54:42 <johnthetubaguy> I am trying to say cinder can ignore that case
16:54:48 <mriedem> or i'm just totally not understanding what you're saying
16:54:50 <johnthetubaguy> so... yeah, that
16:55:07 <mriedem> with 5 minutes left, and i need lunch, let's table this and move it to the ML,
16:55:11 <johnthetubaguy> probably a hangout thing, after its all written down
16:55:12 <johnthetubaguy> +1
16:55:28 <mriedem> but i'd really just like jgriffith or someone from cinder to describe the problem we have today with non-multiattach volumes and the migration_callback in cinder
16:55:41 <mriedem> with the new flow
16:55:49 <ildikov> mriedem: johnthetubaguy: we will get it written down with jgriffith and then we can figure out whether it's an ML thread a Hangouts call or both or what else
16:56:02 <mriedem> i really need writing to digest it
16:56:10 <johnthetubaguy> jgriffith: do you understand the question we are asking?
16:56:31 <jgriffith> johnthetubaguy partially, but frankly you are all over the board :)
16:56:42 <mriedem> jgriffith: "i'd really just like jgriffith or someone from cinder to describe the problem we have today with non-multiattach volumes and the migration_callback in cinder"
16:56:43 <jgriffith> So I'll put a write up together on the two cases
16:56:46 <mriedem> that's step 1
16:56:49 <jgriffith> and what the challenges are problems etc
16:56:55 <johnthetubaguy> basically, if we keep all the APIs the same, whats broken in the new world?
16:56:57 <johnthetubaguy> perfect
16:57:09 <ildikov> johnthetubaguy: your last scenario is unclear to me but we will get to the base case and let's take it from there
16:57:35 <johnthetubaguy> I think I am making option 2 possible, but yeah, lets get the basics written down
16:58:36 <ildikov> johnthetubaguy: I think we will get a better view with some written material first
16:58:36 <johnthetubaguy> cool, seems like we are done, and just waiting for that write up?
16:58:50 <ildikov> johnthetubaguy: yes
16:59:27 <stvnoyes> jgriffith: can you stick around here after the mtg?  I have a couple of questions for you about the attach POC operation
17:00:02 <mriedem> stvnoyes: could move to nova or cinder channels
17:00:04 <jgriffith> stvnoyes sure
17:00:06 <ildikov> ok, let's end the meeting and move to the Cinder channel with further chats
17:00:15 <jgriffith> stvnoyes mriedem yeah... I'll catch you cinder
17:00:15 <ildikov> or the Nova one :)
17:00:17 <stvnoyes> ok see on the cinder irc
17:00:26 <ildikov> thank you all!
17:00:34 <ildikov> will catch up later on swap
17:00:45 <ildikov> #endmeeting