16:00:06 <ildikov> #startmeeting cinder-nova-api-changes 16:00:07 <openstack> Meeting started Thu Jun 8 16:00:06 2017 UTC and is due to finish in 60 minutes. The chair is ildikov. Information about MeetBot at http://wiki.debian.org/MeetBot. 16:00:09 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. 16:00:11 <openstack> The meeting name has been set to 'cinder_nova_api_changes' 16:00:13 <mriedem> o/ 16:00:21 <ildikov> DuncanT ameade cFouts johnthetubaguy jaypipes takashin alaski e0ne jgriffith tbarron andrearosa hemna erlon mriedem gouthamr ebalduf patrickeast smcginnis diablo_rojo gsilvis xyang1 raj_singh lyarwood breitz jungleboyj 16:00:25 <stvnoyes> o/ 16:00:41 <pewp> hemna (=゚ω゚)ノ 16:00:47 <johnthetubaguy> o/ 16:01:18 <ildikov> hi :) 16:01:24 <jgriffith> o/ 16:01:56 <ildikov> ok, let's start 16:02:02 <mriedem> you don't have stvnoyes in your list 16:02:17 <mriedem> and should probably remove alaski :) 16:02:28 <ildikov> mriedem: will update, tnx :) 16:03:08 <ildikov> so the open reviews are looking good: https://review.openstack.org/#/q/topic:bp/cinder-new-attach-apis 16:03:44 <ildikov> we have two patches on the gate to handle the microversion discovery, tnx to mriedem and all the reviewers 16:04:01 <ildikov> the attach PoC is under continuous updates 16:04:21 <mriedem> stvnoyes: i think we can abandon https://review.openstack.org/#/c/471142/ 16:04:27 <ildikov> stvnoyes: were you be able to test live_migrate at all with that one? 16:04:44 <stvnoyes> agreed 16:05:08 <mriedem> changing course on the live migration stuff, 16:05:24 <mriedem> going to try the LiveMigrateData object to stash the old volume attachment ids 16:05:27 <mriedem> during the live migration 16:05:27 <stvnoyes> i am testing migrate now with the new approach of having the old attach info in the migrate_data. that seems to be working ok with one small problem that i'm tracking down now 16:06:22 <stvnoyes> some strange problem with os-brick. i haven't confirmed this, but it doesn't seem to like having the 'serial' value in connection_info 16:06:38 <hemna> have any logs ? 16:06:39 <stvnoyes> during connect 16:07:08 <ildikov> stvnoyes: isn't it another cast issue? 16:07:16 <stvnoyes> well, I am doing connects right now with and with that and when I seem to get a pattern I'll let you know 16:07:30 <hemna> ok, let me know if I can fix something 16:07:34 <stvnoyes> no, no exceptions, the volume is just not seen from within the vm 16:07:55 <stvnoyes> ^ with and withOUT that... 16:07:56 <hemna> maybe pastebin some logs at the time os-brick is called and returns 16:08:01 <stvnoyes> ok 16:08:29 <stvnoyes> just hit this within the last hour 16:08:34 <hemna> with debug logging on, you should be able to see what os-brick returns from connect_volume back to nova 16:09:04 <stvnoyes> ok 16:09:22 <hemna> ping me if you any help. 16:09:30 <johnthetubaguy> in the new flow, I guess we call os-brick more often? like when you create the second attachment? 16:09:57 <johnthetubaguy> ignore more, I should re-read that 16:10:02 <stvnoyes> I think it's the same # of times calling, this is during pre-live_migration where you connect to the vol on the dest 16:10:03 <johnthetubaguy> ignore me that was meant to say 16:10:14 <johnthetubaguy> yeah, I forgot about the stashed connect_info's 16:10:46 <stvnoyes> btw, what is the purpose of 'serial'? 16:11:01 <stvnoyes> it has the volume_id in it 16:11:06 <mriedem> not sure, i've wondered that over time too 16:11:06 <hemna> I'm unfamiliar with that 16:11:06 <jgriffith> stvnoyes that's an *excellent* question :) 16:11:28 <stvnoyes> ;-) I guess I'm not alone 16:11:29 <ildikov> mriedem: haven't we all... 16:12:23 <jgriffith> stvnoyes I've never figured out why that's there as an additional volume_id arg 16:12:53 <mriedem> it's a libvirt thing 16:12:54 <mriedem> set in LibvirtConfigGuestDisk 16:13:12 <hemna> the cryptsetup stuff uses it for some reason 16:13:13 <johnthetubaguy> its the thing we connect the VM to, I guess? 16:13:42 <hemna> https://github.com/openstack/os-brick/blob/master/os_brick/encryptors/cryptsetup.py#L51 16:13:44 <stvnoyes> anyways, I think i'm very close to getting this migrate work up for review. the unit tests are done... Just one last thing... :-) 16:13:45 <mriedem> hemna: probably just b/c it knows it's the volume id 16:13:58 <hemna> only place it's accessed afaik 16:14:10 <mriedem> no it's also used in the libvirt driver 16:14:12 <mriedem> during snapshot 16:14:16 <mriedem> _volume_snapshot_create 16:14:17 <ildikov> stvnoyes: never say 'just one last thing' out loud! :) 16:14:20 <mriedem> for the glusterfs stuff 16:14:26 <hemna> ah ok, I wasn't looking in the nova code 16:14:29 <mriedem> which we've removed now, but... 16:14:38 <stvnoyes> let me work on this more. it's too early to dive too deep on this yet 16:15:09 <stvnoyes> ildikov- that's why the smile is there... 16:15:13 <stvnoyes> :-) 16:15:34 <mriedem> "If present, this specify serial number of virtual hard drive. For example, it may look like <serial>WD-WMAP9A966149</serial>. Not supported for scsi-block devices, that is those using disk type 'block' using device 'lun' on bus 'scsi'. Since 0.7.1" 16:15:42 <ildikov> stvnoyes: last time I hit iscsiadm issues in my test env for instance, so there's always something that's odd at least in my env with attach... 16:16:10 <mriedem> https://libvirt.org/formatdomain.html#elementsDisks 16:16:14 <ildikov> stvnoyes: it's still a pretty thin line :) 16:17:10 <stvnoyes> thanks mriedem - this makes sense - serial - If present, this specify serial number of virtual hard drive. 16:17:28 <mriedem> i don't think nova makes any distinction for "Not supported for scsi-block devices, that is those using disk type 'block' using device 'lun' on bus 'scsi'." though 16:17:36 <mriedem> nova just always sets it i think 16:18:23 <jgriffith> mriedem it does, and uses the cinder volume-id in the case of block type 16:18:50 <stvnoyes> with John's new attach patch, it's not set during attach (which works ok). 16:19:05 <stvnoyes> though I am not suing John's latest 16:19:11 <stvnoyes> using 16:19:16 <mriedem> i can ask some libvirt/kvm people about serial 16:19:22 <jgriffith> I mean "it does not distinguish" 16:19:37 <stvnoyes> it's OK, let me verify i's the problem first. 16:19:57 <jgriffith> stvnoyes I will look and see if the latest has any more *missing* translations of that 16:20:26 <jgriffith> stvnoyes during attach I explicitly check if that exists and if it doesn't, set it to volume-id 16:20:43 <johnthetubaguy> I think serial is just a thing the virtual device exposes to the guest, right? 16:22:53 <stvnoyes> let's move on since I think it's still too early in my debugging of this. 16:22:54 <mriedem> anyway 16:23:05 <mriedem> i've asked kashyap who will probably ask qemu devs 16:23:26 <mriedem> has there been any movement on the swap volume questions on the cinder side? 16:23:31 <mriedem> or is that no longer a question/issue? 16:23:38 <ildikov> stvnoyes: do you see this issue only with live_migrate? 16:23:51 <ildikov> mriedem: the idea now is to re-write swap for the new flow 16:23:55 <ildikov> on both sides 16:24:08 <ildikov> we just wanted to make attach work before doing that 16:24:24 <mriedem> ildikov: that's kind of a chicken and egg though, 16:24:25 <ildikov> so we can test what we're doing 16:24:48 <ildikov> mriedem: you mean from merging perspective? 16:24:48 <mriedem> i thought we were going to do swap before the attach change at the very end 16:25:11 <mriedem> swap keys off if the volume that is attached, and being swapped from, was attached with the new api flow 16:25:24 <mriedem> https://review.openstack.org/#/c/456971/6/nova/compute/api.py 16:27:01 <ildikov> that's an old review 16:27:11 <mriedem> well, 16:27:14 <mriedem> it's the only review 16:27:15 <mriedem> for swap 16:27:31 <ildikov> I mean since then we realized that the migration_completion stuff on the Cinder side will not work either so we need to re-think this 16:27:31 <mriedem> so now we're talking about a new cinder api microversion for swap? 16:27:46 <mriedem> right, so i'm asking, has it been re-thought :) 16:27:46 <ildikov> we will look into this with jgriffith soon I think 16:28:04 <ildikov> as attach is operational enough for testing now, except a few cleanups 16:28:05 <mriedem> today is p-2 so if there needs to be a new cinder api, it's getting late for pike 16:28:17 <johnthetubaguy> this is blocking us turning on the new API right? 16:28:21 <mriedem> yes 16:28:28 <jgriffith> I don't think there's a need for a new API to simplify that 16:28:34 <mriedem> because swap won't handle new style attachments othrewise 16:28:59 <johnthetubaguy> right, ideally the new and old volumes each have a separate attachment during the swap, because nova has two attachments 16:29:01 <ildikov> mriedem: +1 16:29:03 * jgriffith will look at it again today 16:29:22 <jgriffith> I think there's some confusion also because there's two types of swap 16:29:38 <jgriffith> Literally swap a-->b using libvirts copy function 16:29:48 <johnthetubaguy> cinder started and Nova API started with a different volume 16:29:51 <jgriffith> and the swap associated with migration 16:29:56 <jgriffith> they're two very different things 16:30:03 <johnthetubaguy> right, we are talking about the first one right? 16:30:10 <johnthetubaguy> the Nova swap_volume API? 16:30:11 <mriedem> they both result in the libvirt driver doing the blockRebase though right? 16:30:13 <jgriffith> so that's easy 16:30:19 <jgriffith> mriedem yes 16:30:36 <mriedem> who initiates them is what impacts the migration_completion callback to cinder 16:30:40 <mriedem> and i believe is the pickle 16:30:56 <johnthetubaguy> ah, so its the same two things I am thinking about, just different names 16:30:59 <mriedem> today if cinder didn't initiate, it just ignores the migration_completion callback i think 16:31:01 <jgriffith> mriedem yes, but in the first case the callback is a noop anyway so can just go away 16:31:17 <jgriffith> the second case is a mess, and I honestly haven't come up with an answer for it as of yet 16:31:18 <johnthetubaguy> I thought the only difference is the result volume-uuid for the thing thats left attached at the end? 16:31:43 <jgriffith> johnthetubaguy yes, the second case does a swap of the volume-uuid to make it an "invisible" thing 16:31:43 <mriedem> jgriffith: sure, it can't go away on the nova side w/o us knowing if we should call it or not though, is our problem 16:31:47 <jgriffith> which is stupid IMO 16:32:14 <jgriffith> mriedem but Nova already has all of the control on that 16:32:21 <jgriffith> (in the first case) 16:32:25 <johnthetubaguy> we don't right? 16:32:29 <jgriffith> I'm agreeing that the second case is different 16:32:37 <mriedem> we don't know who called the swap volume REST API 16:32:39 <johnthetubaguy> we don't know if its a user calling us or cinder after the user called cinder right? 16:32:42 <mriedem> if it was cinder or an admin 16:33:01 <jgriffith> johnthetubaguy yes, it's not written that way due to call-backs etc but the call-backs don't actually *do* anything on the Cinder side 16:33:24 <mriedem> jgriffith: in the case of a non-volume migrate initiated swap, right? 16:33:26 <johnthetubaguy> no, no, we are saying the REST API calls on the Nova side, and Nova state is identical in both cases 16:33:32 <jgriffith> mriedem correct 16:33:33 <mriedem> in the case of volume migration, i thought they completed the migration on the cinder side 16:33:55 <mriedem> i think we all know what we're saying, we're just saying different things :) 16:34:02 <jgriffith> haha.. perhaps 16:34:06 <johnthetubaguy> mriedem: +1 16:34:13 <mriedem> the issue is what migrate_completion does on the cinder side for new-style attachments for a volume-migrate initiated swap 16:34:18 <jgriffith> regardless there are problems there that aren't trivial 16:34:31 <jgriffith> IMHO that code should all be reverted :) 16:35:16 <johnthetubaguy> if there was a new swap API, that took two attachment_ids, and was the one cinder called, and for that one we made a new callback to a new API that took the same two attachment_ids... we would be in a more sane place? 16:35:23 <ildikov> jgriffith: that might not be an option :) 16:35:34 <jgriffith> bummer 16:35:35 <johnthetubaguy> actually... thats missing too much context 16:36:00 <johnthetubaguy> to be clear, this is now blocking multi-attach, and us moving to the new API 16:36:23 <ildikov> it pretty much does 16:37:01 <johnthetubaguy> for me, if we passed attachment_ids around, instead of volume-ids, for cinder and nova APIs, wouldn't we have all the state we needed? 16:37:34 <hemna> especially in the case of multi-attach that makes sense 16:38:14 <jgriffith> johnthetubaguy yes, that was kinda the whole idea of the attachment-object 16:38:17 <johnthetubaguy> that means passing Nova an attachment we would update from os-brick, then attach to the instance, and the other attachment would have to be already attached, etc 16:38:32 <ildikov> johnthetubaguy: how would that help with the we don't know who called the swap-volume API problem? 16:39:23 <johnthetubaguy> ildikov: it doesn't fix that, but we would be telling you what the attchement was, so cinder should know if thats a migration attachment or not 16:39:35 <hemna> attaching the attached attachment. 16:39:40 <johnthetubaguy> it helps a bit in that we care less if the volume-uuid changes, because the attachment-uuid is the same 16:40:00 <ildikov> johnthetubaguy: ah ok, got it 16:40:00 <johnthetubaguy> hemna: whos doing what now? 16:40:08 <hemna> :P 16:40:10 <hemna> ignore me 16:40:21 <johnthetubaguy> heh 16:41:05 <johnthetubaguy> it seems like a new cinder API to be added, and a new Nova API that cinder calls into? 16:41:27 <johnthetubaguy> thats one approach, I guess the question is, is there a simpler option available? 16:41:39 <jgriffith> johnthetubaguy that honestly might be the safest and most expedient answer 16:42:03 <johnthetubaguy> yeah, its the safest explcit one I think 16:42:08 <ildikov> johnthetubaguy: jgriffith: +1 16:42:08 <jgriffith> but that means we need to come up with something and agree on both sides in like 24 hours :) 16:42:41 <ildikov> jgriffith: I don't see the issue :) 16:42:46 <johnthetubaguy> well... ideally three months ago, but either way could work 16:43:00 <jgriffith> ha! 16:43:06 * jgriffith fires up his way-back machine 16:43:36 <mriedem> i'm not totally following here, you're suggesting a new cinder api where nova passes in the newly attached attachment uuid for the attachment we swapped *to*? 16:43:41 <mriedem> to complete the migratoin? 16:43:58 <johnthetubaguy> a new nova API to pass the existing and new attachment id 16:44:02 <mriedem> and if swap was initiated by a user, not cinder, then it's ignored like today? 16:44:10 <jgriffith> mriedem I *think* a new cinder/nova interaction for the volume-migration call from Cinder 16:44:11 <johnthetubaguy> then a new cinder API it calls back to, passing both the attachments 16:44:12 <mriedem> new nova api? 16:44:31 <johnthetubaguy> yeah, I think its a new nova api too really 16:44:39 <jgriffith> johnthetubaguy +1 16:44:43 <mriedem> umm 16:44:45 <johnthetubaguy> so the cinder started one, calls a new API that gets attachments 16:45:00 <johnthetubaguy> we can try and stop regular using using that this time, for real 16:45:05 <mriedem> can we not do more special case service-specific APIs please 16:45:14 <mriedem> you can't stop regular users from using it 16:45:26 <johnthetubaguy> well, I mean good default RBAC really 16:45:34 <mriedem> default rbac on swap is already admin only 16:45:36 <johnthetubaguy> its user safe, if they create attachments 16:45:38 <jgriffith> ok, I'll look again later today and see if there's a way to fix the mess we already have 16:45:52 <johnthetubaguy> yeah, admin only isn't good enough, but thats a different debate 16:46:17 <mriedem> anyway - my fear is, if swap depends on the new attach flow, and now we can't do swap w/o this new api interaction, which is a microversion newer than the new attach api, things start to get super f'ed 16:46:43 <johnthetubaguy> so.. if we ignore multi attach, I am surpized why the cinder side can't just be "fixed up", are we missing something there? 16:47:01 <johnthetubaguy> mriedem: yeah, we would have to raise to this new API when checking if we can use the new flow 16:47:42 <mriedem> maybe we don't design this here, 16:47:48 <mriedem> i don't understand the issues on the cinder side, 16:47:58 <mriedem> but can the problem be stated in the ML? 16:48:05 <johnthetubaguy> thats my previous question really, whats the issue cinder side, if we keep the API the same 16:48:15 <mriedem> and then we, and hopefully others, can sort it out where/when we have time to digest the actual problem? 16:48:18 <johnthetubaguy> yeah, some written down form would be good 16:48:39 <johnthetubaguy> it seems clear we can't move forward here and now without that 16:49:11 <mriedem> i can live with new cinder apis if we have to, because then the deps are still one direction, 16:49:24 <mriedem> but if we're doing new apis on both sides, which is a circular dependency, then we start to get into crazy land 16:49:36 <johnthetubaguy> so proposal 1: re-write the lot to be attachment aware, propsal 2: just fail if any attachment is multi-attach and use the old APIs, question: what does proposal 2 not work for cinder? 16:49:58 <ildikov> mriedem: there are old attach/detach snippets in the call-bak on the Cinder side, which should not hit in the simple case though 16:50:01 <mriedem> i'm not really sure why we're talking about multi-attach at this point 16:50:19 <mriedem> "swapping multi-attached volumes is not supported, fin" 16:50:24 <mriedem> at least in v1 16:50:31 <ildikov> mriedem: +1 16:50:37 <mriedem> we do'nt support multiattach today in nova anyway 16:50:40 <mriedem> and likely won't in pike 16:50:51 <johnthetubaguy> well, we should include swapping a volume that has multiple attachments 16:51:00 <johnthetubaguy> i.e. half way through a live-migration 16:51:13 <mriedem> that's not the same operatoin though 16:51:19 <mriedem> there is no migrate_completion callback during nova live migratoin 16:51:21 <johnthetubaguy> Nova will just reject those API calls anyways, so no change there 16:51:33 <mriedem> the migration_completion cinder api is the issue 16:51:38 <mriedem> as far as i understand 16:51:42 <mriedem> and that's *only* for swap volume 16:51:44 <mriedem> on the nova side 16:51:49 <ildikov> mriedem: correct 16:51:53 <johnthetubaguy> yes, I am talking about when you call swap volume 16:51:59 <johnthetubaguy> what is the state of the system 16:52:17 <johnthetubaguy> if we have a volume attached on two hosts, its "complicated" on the cinder side, as there are two attachments 16:52:29 <johnthetubaguy> but Nova doesn't care about that case 16:52:39 <mriedem> johnthetubaguy: yo'ure talking about swapping a multiattach volume though right? 16:52:43 <johnthetubaguy> no 16:52:51 <johnthetubaguy> multi-attach=false 16:53:02 <johnthetubaguy> two attachments 16:53:10 <johnthetubaguy> source host and destination host 16:53:15 <mriedem> how is the same volume attached to two hosts during swap volume? 16:53:19 <mriedem> if not multiattach? 16:53:25 <mriedem> swap volume is literally 2 separate volumes 16:53:28 <mriedem> 1 attachment each 16:53:31 <johnthetubaguy> on the cinder side you can call migrate at any time 16:53:43 <johnthetubaguy> the VM could be in the live-migrating state 16:53:47 <johnthetubaguy> Nova fails the API call 16:54:00 <johnthetubaguy> but theoretically, you can have two attachments in the Cinder DB 16:54:02 <johnthetubaguy> same volume 16:54:02 <mriedem> can you swap volumes on an instance being migrated? 16:54:05 <johnthetubaguy> same instance 16:54:25 <johnthetubaguy> so if you call that, I think you get a Nova API error 16:54:31 <johnthetubaguy> but its a case on the cinder side 16:54:33 <mriedem> you can't swap volumes on an instance that's undergoing a task state change i don't think 16:54:42 <mriedem> we're either talking past each other, 16:54:42 <johnthetubaguy> I am trying to say cinder can ignore that case 16:54:48 <mriedem> or i'm just totally not understanding what you're saying 16:54:50 <johnthetubaguy> so... yeah, that 16:55:07 <mriedem> with 5 minutes left, and i need lunch, let's table this and move it to the ML, 16:55:11 <johnthetubaguy> probably a hangout thing, after its all written down 16:55:12 <johnthetubaguy> +1 16:55:28 <mriedem> but i'd really just like jgriffith or someone from cinder to describe the problem we have today with non-multiattach volumes and the migration_callback in cinder 16:55:41 <mriedem> with the new flow 16:55:49 <ildikov> mriedem: johnthetubaguy: we will get it written down with jgriffith and then we can figure out whether it's an ML thread a Hangouts call or both or what else 16:56:02 <mriedem> i really need writing to digest it 16:56:10 <johnthetubaguy> jgriffith: do you understand the question we are asking? 16:56:31 <jgriffith> johnthetubaguy partially, but frankly you are all over the board :) 16:56:42 <mriedem> jgriffith: "i'd really just like jgriffith or someone from cinder to describe the problem we have today with non-multiattach volumes and the migration_callback in cinder" 16:56:43 <jgriffith> So I'll put a write up together on the two cases 16:56:46 <mriedem> that's step 1 16:56:49 <jgriffith> and what the challenges are problems etc 16:56:55 <johnthetubaguy> basically, if we keep all the APIs the same, whats broken in the new world? 16:56:57 <johnthetubaguy> perfect 16:57:09 <ildikov> johnthetubaguy: your last scenario is unclear to me but we will get to the base case and let's take it from there 16:57:35 <johnthetubaguy> I think I am making option 2 possible, but yeah, lets get the basics written down 16:58:36 <ildikov> johnthetubaguy: I think we will get a better view with some written material first 16:58:36 <johnthetubaguy> cool, seems like we are done, and just waiting for that write up? 16:58:50 <ildikov> johnthetubaguy: yes 16:59:27 <stvnoyes> jgriffith: can you stick around here after the mtg? I have a couple of questions for you about the attach POC operation 17:00:02 <mriedem> stvnoyes: could move to nova or cinder channels 17:00:04 <jgriffith> stvnoyes sure 17:00:06 <ildikov> ok, let's end the meeting and move to the Cinder channel with further chats 17:00:15 <jgriffith> stvnoyes mriedem yeah... I'll catch you cinder 17:00:15 <ildikov> or the Nova one :) 17:00:17 <stvnoyes> ok see on the cinder irc 17:00:26 <ildikov> thank you all! 17:00:34 <ildikov> will catch up later on swap 17:00:45 <ildikov> #endmeeting