21:01:21 <ildikov> #startmeeting cinder nova api changes
21:01:21 <openstack> Meeting started Wed Apr 6 21:01:21 2016 UTC and is due to finish in 60 minutes. The chair is ildikov. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:01:22 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:01:24 <openstack> The meeting name has been set to 'cinder_nova_api_changes'
21:01:27 <scottda> Good timing!
21:01:30 <scottda> scottda ildikov DuncanT ameade cFouts johnthetubaguy jaypipes takashin alaski e0ne jgriffith tbarron andrearosa hemna erlon mriedem gouthamr ebalduf patrickeast smcginnis diablo_rojo
21:01:39 <jgriffith> o/
21:01:40 <andrearosa> o/
21:01:45 <takashin> o/
21:01:45 <DuncanT> o/
21:01:48 <ildikov> scottda: thanks :)
21:01:58 <ildikov> #chair scottda
21:01:59 <openstack> Current chairs: ildikov scottda
21:02:12 <mriedem> o/
21:02:18 <smcginnis> o/
21:02:27 <diablo_rojo> Hello
21:02:38 <smcginnis> Nice, learned about the #chair command.
21:02:47 <hemna> yough
21:03:10 <ildikov> smcginnis: I learnt new things like this today too :)
21:03:24 <smcginnis> Always more to learn. ;)
21:03:39 <ildikov> etherpad: #link https://etherpad.openstack.org/p/cinder-nova-api-changes
21:03:53 <ildikov> smcginnis: I hear ya ;)
21:04:00 <scottda> I was concerned that this meeting is pretty late for some people. I think DuncanT is UTC+3 so it's 24:00 there.
21:04:09 <ildikov> so the first topic on the etherpad is the meeting time slot
21:04:19 <DuncanT> Yup, midnight here
21:04:21 <scottda> Who is furthest East, i.e. is anyone earlier than UTC-7 (PDT)?
21:04:33 <scottda> takashin: ?
21:04:34 <ildikov> takashin: what is your time zone?
21:04:46 <takashin> JST
21:04:51 <takashin> UTC+9
21:05:15 <ildikov> so it's 6am for you now, right?
21:05:21 <takashin> Yes.
21:05:31 <scottda> OK, so this is about as good as it gets then.
21:05:55 <scottda> Just checking. Duncan already said he was going to hate me in the morning :)
21:06:02 <ildikov> scottda: we couldn't get much better :(
21:06:13 <DuncanT> I'll survive and just curse Scott in the morning
21:06:15 <hemna> dang, that's rough
21:06:15 <ildikov> DuncanT: sorry, it's 11pm here too, I almost feel your pain
21:06:23 <scottda> ildikov: Fair enough, I just wanted to check. We can move on....
21:06:37 <ildikov> alright
21:06:41 <andrearosa> I am on holiday ATM and it is 11pm here so I win!
21:06:55 <ildikov> andrearosa: kudos!
21:07:18 <scottda> Hopefully, we do a couple more of these before the Summit and then get to some consensus at the Summit...
21:07:19 <ildikov> so we have a few alternatives on the etherpad to track more info about attachments
21:07:29 <scottda> So we won't have to keep doing this for too terribly long.
21:07:36 <ildikov> scottda: +1
21:07:51 <ildikov> the idea about the meetings is to ensure progress
21:08:07 <ildikov> hopefully the time zone issue will give us some extra motivation
21:08:25 <scottda> So, this is a bit of work, but hemna started creating diagrams....
21:08:36 <scottda> https://www.gliffy.com/go/publish/image/10360157/L.png
21:09:08 <scottda> We don't want to just make some pictures for the fun of it, but it seems useful to figure out the flow of things for attach, detach, nova live-migration, and anything else relevant...
21:09:20 <andrearosa> wow very useful
21:09:51 <scottda> Before we jump into details about how to fix all this, it'd be good to decide all the things that need fixing at the high level first. That's my opinion.
21:10:04 <smcginnis> scottda: +1
21:10:37 <hemna> scottda, +1
21:10:39 <hemna> yah
21:10:49 <hemna> I wanted to create another diagram like that for live migration as well
21:10:52 <ildikov> I think we could add these diagrams to developer docs as well
21:10:58 <hemna> since that has a lot of interactions with Cinder as well
21:11:08 <hemna> I guess another with detach would be helpful too
21:11:09 <scottda> Yeah, once you start in on the diagrams, there are uses everywhere.
21:11:11 <jgriffith> scottda: might be worth mentioning there's also an explanation of the cinder side in the official docs here: http://docs.openstack.org/developer/cinder/devref/attach_detach_conventions.html
21:11:35 <jgriffith> scottda: might be good to level set on the design and how things are intended to work
21:11:46 <scottda> jgriffith: Thanks, that's very useful.
21:11:47 <ildikov> jgriffith: looks like a good fit for diagrams like the one above
21:12:22 <cFouts> o/
21:12:33 <ildikov> hemna: yeah, detach would be needed as well, it's similar regarding interaction with Cinder and os-brick
21:12:42 <hemna> yup
21:13:15 <ildikov> is it only me or is that flow really complicated?
21:13:24 <scottda> ildikov: It's just you.
21:13:28 <scottda> just kidding
21:13:31 <hemna> :)
21:13:40 <ildikov> lol :)
21:13:42 <hemna> the flow is complicated.
21:13:48 <scottda> So, I had thought a few months back about creating a simpler API...
21:13:55 <hemna> and my diagram probably isn't the best way to show it.
21:14:03 <scottda> https://review.openstack.org/#/c/267715/
21:14:14 <smcginnis> hemna: That does capture a lot. Thanks for pulling that together.
21:14:15 <scottda> But then issues around multi-attach came up, and live migration....
21:14:47 <ildikov> smcginnis: +1
21:14:50 <hemna> smcginnis, welcome :)
21:15:00 <scottda> But this does bring up a question: Should we be thinking more globally, and see if there's a simpler API that we could create that makes most/all of these problems go away?
21:15:02 <ildikov> scottda: do you mean that those features contradict your spec?
21:15:13 <scottda> ildikov: No, not a contradiction...
21:15:14 <smcginnis> scottda: I'm all for a simpler API.
21:15:33 <jgriffith> smcginnis: scottda can you clarify what you mean by "simpler" API
21:15:34 <scottda> Just that we are like the 5 blind men trying to describe the elephant......
21:15:40 <jgriffith> smcginnis: scottda the API is pretty simple as it is IMO
21:15:46 <scottda> Everyone looks at a different part, and sees it differently
21:15:50 <jgriffith> "nova volume-attach xxx yyyy /dev/vdb"
21:15:52 <jgriffith> how much simpler?
21:15:57 <DuncanT> jgriffith: I think we can get it down to three calls
21:16:08 <scottda> jgriffith: I mean the underlying calls, not the initial command
21:16:11 <jgriffith> DuncanT: again, please clarify what layer/api you're referring to
21:16:14 <jgriffith> scottda: thanks
21:16:16 <DuncanT> Not that API I think, the one between cinder and nova
21:16:21 <smcginnis> DuncanT: +1
21:16:22 <jgriffith> DuncanT: it's already only 3 calls
21:16:34 <hemna> jgriffith, didn't we whiteboard the 'simpler' API in one of the cinder midcycle meetups and it kinda turned into exactly what we have already today?
21:16:40 <scottda> jgriffith: You and hemna and I started talking about this a couple mid-cycles ago...
21:16:46 <scottda> snap
21:16:48 <smcginnis> hemna: Hah, really. OK.
21:16:55 <scottda> hemna: Yes, in Ft. Collins last summer
21:17:03 <jgriffith> scottda: hemna correct
21:17:08 <jgriffith> I don't know how/what you want to make simpler
21:17:11 <scottda> See that spec for a basic idea of where we went with that ^^
21:17:15 <jgriffith> it's pretty basic as is today
21:17:36 <jgriffith> it's just not followed and there's some clunky things we have going on with managing and tracking
21:17:53 <DuncanT> jgriffith: I'd get rid of reserve and its converse... not a huge win for nova but helps reduce failure windows for bare metal
21:18:07 <jgriffith> DuncanT: how come?
21:18:30 <DuncanT> jgriffith: Because things die between calls. With annoying frequency in dev systems
21:18:34 <hemna> I thought nova needed reserve
21:18:36 <jgriffith> DuncanT: I'm not familiar with what reserving the resource in the DB causes?
21:18:53 <DuncanT> jgriffith: Things stuck in 'attaching'
21:18:56 <jgriffith> DuncanT: hmm... yeah, but then you introduce races
21:18:57 <hemna> to ensure the volume was 'locked' for it to attach through the process
21:18:58 <mriedem> nova needs reserve so the volume is 'attaching' from the api before we cast to the compute to do the actual attach
21:19:08 <hemna> mriedem, +1
21:19:32 <mriedem> we don't have a reserve like that for networks, for example, so we can hit races with neutron today too
21:19:36 <mriedem> and quota issues
21:19:44 <jgriffith> DuncanT: I think the idea of something failing between reserve and initialize is another problem that should be solved independently
21:19:46 <DuncanT> jgriffith: Make it valid to call initialise_connection with do_reserve_for_me=True and bare metal is easier I think
21:19:59 <scottda> Yeah, but why not wait until Cinder returns the connector info before Nova even proceeds? Do we need to be async here?
21:20:01 <jgriffith> mriedem: +1 and thus my point about races... that's why that call is there
21:20:19 <jgriffith> scottda: because somebody could come in and delete the volume
21:20:21 <mriedem> does the ironic driver in nova even support volume attach?
21:20:41 <hemna> jgriffith, +1
21:20:45 <scottda> internally, we can still set to 'attaching' and then proceed to initialize connector:
21:20:53 <hemna> or another n-cpu node could put it into attaching as well
21:20:57 <hemna> (multi-attach)
21:21:06 <DuncanT> jgriffith: Either the API call wins the race against the delete, puts the volume into 'attached' and the delete fails, or the API call loses and returns 404
21:21:13 <scottda> https://www.irccloud.com/pastebin/1L3KWe0j/
21:21:18 <DuncanT> mriedem: It's being worked on
21:21:42 <jgriffith> DuncanT: ok... but personally I'd say if your resources are disappearing in the 1 second between those two calls we should fix something else up to recover from it
21:21:53 <jgriffith> DuncanT: and I don't know how you do that anyway if you lost something
21:22:04 <mriedem> scottda: that diagram assumes that smart_attach is going to be fast on the cinder side?
21:22:20 <mriedem> scottda: nova api can't be waiting for a long call to return from cinder else we timeout the request
21:22:23 <jgriffith> I don't want to take up a ton of time on this though because I'm not overly familiar or understanding the problem completely
21:22:27 <DuncanT> jgriffith: It is a small optimisation, but as I said, I managed to hit that window a fair bit, and we used to see it in the public cloud too, e.g. rabbit restarts
21:22:27 <scottda> mriedem: It does, but that's not necessarily a good assumption....
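For reference, a minimal sketch of the existing three-call sequence being debated here, roughly as described in the devref link above. The cinderclient and os-brick calls are real for this era; the helper itself, the auth setup, and the connector values are illustrative, and error handling and rollback are omitted. scottda's smart_attach pastebin (contents not preserved in this log) appears to combine steps 1 and 2 into a single synchronous call, which is what mriedem's timeout concern is about.

```python
# Rough sketch of today's attach flow from the Nova side; the
# cinderclient/os-brick method names are real, everything around them
# is illustrative and skips error handling and rollback.
from os_brick.initiator import connector as brick

def attach_volume(cinder, volume_id, instance_uuid, mountpoint):
    # 1. reserve: flips the volume to 'attaching' in the Cinder DB so a
    #    concurrent delete/attach loses the race (mriedem's point above),
    #    before nova-api casts to the compute node.
    cinder.volumes.reserve(volume_id)

    # 2. initialize_connection: send this host's connector properties,
    #    get back the target details (iqn, portal, lun, ...).
    props = brick.get_connector_properties(root_helper='sudo',
                                           my_ip='192.0.2.10',
                                           multipath=False,
                                           enforce_multipath=False)
    conn_info = cinder.volumes.initialize_connection(volume_id, props)

    # os-brick then does the host-side work (iscsiadm login, scan, ...).
    c = brick.InitiatorConnector.factory(
        conn_info['driver_volume_type'], root_helper='sudo')
    device = c.connect_volume(conn_info['data'])

    # 3. attach: finalize, flipping the volume to 'in-use'.
    cinder.volumes.attach(volume_id, instance_uuid, mountpoint)
    return device
```

DuncanT's suggestion amounts to folding step 1 into step 2 (his do_reserve_for_me=True), which matters most when the caller can die between the two calls.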
21:22:38 <hemna> mriedem, that smart_attach might take a while
21:22:51 <mriedem> scottda: right, this is why we rejected doing the get-me-a-network allocation with neutron from nova-api
21:22:54 <mriedem> b/c we can't hold up nova-api
21:22:56 <hemna> because it looks like it is doing the work of initialize_connection today, which is entirely driver dependent.
21:23:03 <DuncanT> mriedem: I wouldn't force nova to do one step rather than two, I'm thinking of bare metal and non-nova consumers
21:23:08 <scottda> Nova still has to wait for the connector info during initialize_connection...
21:23:22 <mriedem> DuncanT: ok, well, i'm not really interested in the non-nova case
21:23:25 <hemna> if smart_attach was async to nova?
21:23:31 <hemna> but man that is a lot of changes
21:23:45 <mriedem> scottda: sure, but we wait on the compute
21:23:51 <mriedem> after we already cast from nova-api
21:23:58 <scottda> And maybe not worth it. It's just an idea that gets proposed at various times.
21:24:03 <jgriffith> mriedem: scottda I'm not sure why this is *good*?
21:24:03 <DuncanT> I guess I'm in a minority though, and I don't think my proposal overlaps with the other discussion much at all, so maybe shelve it for now and let people think about it later?
21:24:16 <jgriffith> mriedem: scottda as opposed to just doing better management of what we have?
21:24:34 <DuncanT> jgriffith: initialise_connection is slow, so kicking it off earlier and picking up the result later can be a win
21:24:36 <jgriffith> mriedem: scottda or... a "more efficient", better implementation of what we have
21:24:54 <jgriffith> DuncanT: why? / how? sorry, I'm not following
21:24:56 <scottda> jgriffith: That may be the way to go. I'm just bringing this up because it had been discussed previously.
21:25:13 <jgriffith> DuncanT: no matter what, you can't attach a volume w/out an iqn
21:25:21 <scottda> We can move on to discussing just fixing what is missing at the moment, as far as I'm concerned.
21:25:28 <mriedem> i've had thoughts on ways to make boot from volume better in nova, but it doesn't really have anything to do with what we need to accomplish before the summit, which is what this meeting is supposed to be about
21:25:29 <DuncanT> jgriffith: Look at EMC.... they were taking long enough for the RPC to time out when having to do FC zone setup too
21:25:29 <jgriffith> so if the backend is slow to give that for some reason??? you can't do anything without it
21:25:33 <mriedem> i have a hard stop in like 30 minutes
21:25:39 <jgriffith> DuncanT: no, that's different
21:25:47 <scottda> Let's move on....
21:25:52 <DuncanT> Ok, we can come back to this later
21:25:57 <jgriffith> DuncanT: that was an API response issue with multiple clones to their backend API simultaneously
21:26:07 <jgriffith> "their" API
21:26:24 <mriedem> so i see 3 solutions in https://etherpad.openstack.org/p/cinder-nova-api-changes
21:26:27 <mriedem> for multiattach
21:26:29 <jgriffith> DuncanT: they have a limit on # simultaneous API requests on the backend
21:26:31 <mriedem> who wants to go over those?
21:26:36 <scottda> mriedem: +1
21:26:43 * jgriffith can go over his
21:26:44 <ildikov> mriedem: +1
21:27:08 <ildikov> jgriffith: the floor is yours :)
21:27:16 <jgriffith> k... thanks
21:27:38 <jgriffith> So IMO the bigger challenge with a lot of things we've been circling on is more around managing attach status
21:27:46 <mriedem> which one is this? #4?
21:27:55 <scottda> I think it's #2
21:27:56 <hemna> mriedem, #2
21:28:03 <mriedem> ok
21:28:14 <jgriffith> oh... yeah sorry
21:28:21 <jgriffith> Also to help: #link https://review.openstack.org/#/c/284867/
21:28:46 <jgriffith> so eharney brought up a point about that being a bit iSCSI centric, we can revisit that
21:29:11 <jgriffith> but the proposal I think solves the multi-attach problem as well as some of this circling about the current API and how things work
21:29:29 <jgriffith> we already pass the connector info into initialize_connection... but sadly we don't really do much with it
21:29:51 <jgriffith> if we took that info and built out the attachments table properly, we could just give the caller (Nova) back the attachment ID
21:30:00 <jgriffith> Nova wouldn't need to care about multi-attach etc
21:30:05 <jgriffith> Just the attachment ID
21:30:17 <jgriffith> no tracking, no state checking etc
21:30:29 <jgriffith> Same holds true on the cinder side for detach
21:30:46 <jgriffith> we don't care anymore... if we get a detach request for a volume with an attachment-ID we just act on it
21:30:55 <jgriffith> quit adding complexity
21:31:04 <scottda> Who would determine when to remove the export in your case jgriffith?
21:31:12 <hemna> so, I'm confused how initialize_connection works for multi-attach looking at the code. re: attachment = attachments[0]
21:31:14 <scottda> In the case where some drivers multiplex connections.
21:31:32 <ildikov> jgriffith: we still need to figure out in Nova when to call disconnect_volume
21:31:32 <jgriffith> scottda: that's always up to the consumer
21:31:50 <jgriffith> scottda: that's the point here... if it's multiplexed that's fine, but each "plex" has its own attachment-id
21:31:52 <scottda> Would Nova just call detach with an attachment_id, and then Cinder (manager? or driver?) figures it out.
21:32:00 <hemna> ildikov, if we stored the host and the instance_uuid, nova can make the correct choice on its side.
21:32:12 <jgriffith> scottda: yes, that's the only way you can handle the deltas between drivers and their implementations
21:32:29 <jgriffith> hemna: sorry... which part are you confused on?
21:32:31 <ildikov> hemna: we need the back end info as well, regarding the multiplex or not cases
21:32:52 <jgriffith> hemna: ildikov wait.. back up a second
21:32:55 <scottda> jgriffith: OK, I do like that the consumer (Nova) just calls detach(attachment_id) and lets Cinder figure out when to un-export based on driver
21:32:57 <jgriffith> let me try this
21:33:00 <jgriffith> nova has node-A
21:33:05 <hemna> jgriffith, line 920 in manager.py
21:33:18 <jgriffith> we request an attach of a volume to instance Z on node-A (Z-A)
21:33:20 <hemna> makes an assumption that the correct attachment is always the first found.
21:33:37 <jgriffith> we then request another attach of the same volume also on node-A but now instance X
21:33:44 <jgriffith> (X-A)
21:33:53 <jgriffith> we create an attachment ID for each one
21:34:03 <hemna> as attachments has a list of all attachments for that volume, which can be on any host or any instance.
21:34:15 <jgriffith> we don't care if your particular backend shares a target, creates a new one or boils a pigeon
21:34:25 <jgriffith> hemna: please let me finish
21:34:41 <hemna> ok, sorry, you asked why I was confused.
21:34:43 <jgriffith> at this point if nova is done with the volume on Instance Z
21:34:58 <jgriffith> it issues the detach using the attachment-id associated with Z-A
21:35:39 <mriedem> i don't think that's the hard part, the os-detach call to cinder that is, the hard part is figuring out if/when we call virt.driver.disconnect_volume, which comes before calling cinder os-detach
21:35:49 <jgriffith> that goes into the detach flow on Cinder.. and IF and ONLY if a driver needs to do some special magic they can do something different
21:36:03 <jgriffith> like "hey... how many attachments to this host, volume etc etc"
21:36:10 <ildikov> mriedem: yeah, my point exactly
21:36:36 <jgriffith> mriedem: I'm not sure which part you're referring to?
21:36:46 <hemna> mriedem, if cinder has the host and the instance_uuid for every entry in the volume_attachments table, nova can loop through those and decide if the volume has anything left on that n-cpu node.
21:36:48 <jgriffith> mriedem: I mean...
21:36:55 <hemna> if it doesn't, then it calls disconnect_volume.
21:36:57 <jgriffith> I am unclear on the challenge there?
21:37:11 <jgriffith> hemna: +1
21:37:12 <mriedem> jgriffith: during detach in nova, the first thing we do is disconnect the volume in the virt driver
21:37:17 <smcginnis> hemna: So Cinder would call disconnect_volume in that case, not nova, right?
21:37:22 <hemna> smcginnis, no
21:37:25 <hemna> nova does
21:37:29 <jgriffith> mriedem: right...sorry, hemna's comment pointed it out for me
21:37:33 <mriedem> hemna: yeah we talked about this last week a bit
21:37:44 <smcginnis> hemna: Oh, right, I see.
21:37:45 <mriedem> i had some pseudo logic in that meeting for the nova code
21:37:55 <hemna> only n-cpu can, because that's the host where the volume exists (/dev/disk/by-path/<entry here>)
21:38:06 <jgriffith> mriedem: so the trick is that in my case and LVM's case I've written it such that you get a unique target for each attach
21:38:26 <jgriffith> mriedem: so even if you have the same LVM volume attached twice on a compute node it has two attachments/iqns
21:38:30 <jgriffith> mriedem: so we don't care
21:38:38 <jgriffith> you said detach... we detach
21:38:44 <jgriffith> mriedem: that's how it solves the problem
21:39:01 <ildikov> I think we said something about having a flag about the back end
21:39:08 <mriedem> so disconnecting Z-A doesn't mess up X-A
21:39:14 <jgriffith> mriedem: exactly
21:39:20 <jgriffith> mriedem: the idea is to completely decouple them
21:39:25 <mriedem> ildikov: jgriffith is saying that wouldn't be a problem with his solution
21:39:29 <jgriffith> mriedem: make them totally independent
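From the Nova side, the model jgriffith is describing might look like the sketch below: every attach gets its own attachment row (and, on backends like the LVM driver, its own target), so detach needs nothing but the attachment id and no reference counting. This is purely illustrative; the table layout, the iqn naming, and the helper's signatures are assumptions, not code from the proposal.

```python
# Purely illustrative: possible volume_attachments rows after the Z-A
# and X-A attaches above, assuming each attach gets its own target:
#
#   id  volume  instance  host    target_iqn
#   A1  vol-1   Z         node-A  iqn.2010-10.org.openstack:vol-1_1
#   A2  vol-1   X         node-A  iqn.2010-10.org.openstack:vol-1_2
#
# Undoing A1 cannot disturb A2: they share no target or device path.

def detach(cinder, brick_connector, attachment):
    # hypothetical: the attachment record carries the connection_info
    # it was created with, so nothing else needs to be looked up or
    # reference-counted
    conn_info = attachment['connection_info']
    brick_connector.disconnect_volume(conn_info['data'], device_info=None)
    cinder.volumes.detach(attachment['volume_id'],
                          attachment_uuid=attachment['id'])
```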
21:39:29 <ildikov> as when we have a target per volume then we need to call disconnect volume regardless of how many attachments we have on the host
21:39:43 <mriedem> ildikov: the flag, if I remember correctly, was something in the connection_info about the cinder backend
21:39:53 <jgriffith> mriedem: for devices that 'can't' do multiple targets for the same volume that's fixable as well
21:40:13 <jgriffith> mriedem: you can still spoof it easily enough on the /dev/disk-by-path entry
21:40:28 <mriedem> you being cinder
21:40:29 <mriedem> ?
21:40:32 <ildikov> mriedem: yeah, that might be the one, I don't remember where we placed it
21:40:43 <hemna> ildikov, yah I think that's where it was.
21:40:44 <jgriffith> mriedem: no, that is up to Nova/Brick when they make the iscsi attachment
21:40:50 <hemna> shared/notshared flag
21:41:00 <jgriffith> mriedem: well... yeah, cinder via volume-name change
21:41:12 <mriedem> volume-name change?
21:41:18 <jgriffith> mriedem: yeah
21:41:25 <ildikov> hemna: right, I didn't like the name, but I like the flag itself I remember now :)
21:41:25 <mriedem> i'm not following
21:41:46 <jgriffith> mriedem: so in the model info instead of attaching and mounting at /dev/disk-by-path/volume-xxxxx.iqn umptysquat
21:41:49 <jgriffith> twice!!
21:42:07 <jgriffith> you do something like: iqn umptysquat_2
21:42:10 <jgriffith> you do something like: iqn umptysquat_3
21:42:12 <jgriffith> you do something like: iqn umptysquat_4
21:42:14 <jgriffith> etc
21:42:35 <cFouts> no reference counting anywhere then
21:42:39 <jgriffith> nova gets a detach call... uses the attach ID to get the device path and disconnects that one
21:42:55 <jgriffith> cFouts: correct, ZERO reference counting because they're independent
21:43:01 <jgriffith> and decoupled
21:43:29 <hemna> so I think the question was, how to accomplish that for backends that can't do separate targets for the same volume on the same initiator
21:43:39 <mriedem> the attach id gets us the attachment which has the device path from the connector dict?
21:43:45 <jgriffith> mriedem: I haven't worked out a POC patch on the Nova side yet because I've been told that people don't like the proposal so I don't want to spend time on something for nothing :)
21:43:46 <mriedem> and we pass that connector to os-brick to disconnect?
21:44:11 <jgriffith> mriedem: it can, we have everything in the attach object
21:44:24 <jgriffith> hemna: that's the exact case that I just described?
21:44:48 <jgriffith> hemna: you just attach it using a modified name
21:44:55 <hemna> but udev is what creates those paths in /dev/disk/by-path
21:45:01 <hemna> so I'm confused
21:45:10 <jgriffith> hemna: nah, we have the ability to specify those
21:45:27 <hemna> we don't create those paths though
21:45:39 <hemna> and by we, I mean os-brick
21:45:54 <jgriffith> hemna: I can modify how that works
21:46:05 <jgriffith> anyway....
21:46:13 <jgriffith> is there even any general interest here?
21:46:23 <ildikov> scottda: I think you mentioned NFS and SMBFS on the etherpad which have detach issues as well
21:46:40 <jgriffith> the snowflakes that require only a single target are a challenge, but it should be solvable
21:46:52 <scottda> jgriffith: I think the idea sounds good. I certainly would keep it around until we have all alternatives discussed.
21:47:08 <ildikov> scottda: do we have an alternative that addresses all types of detach issues we're facing?
21:47:13 <jgriffith> scottda: well I wasn't going to burn it in the next 15 minutes :)
21:47:21 <smcginnis> jgriffith: My concern is the single target systems. But if we can get that to work, I like the simplicity.
21:47:33 <scottda> ildikov: You mean including NFS and SMB? I'm not sure. Maybe hemna can help with that...
21:47:52 <ildikov> jgriffith: I need to process the snowflakes part more to be able to have a solid opinion
21:47:59 <jgriffith> smcginnis: the single target systems are going to be a challenge no matter what I think. I've yet to see a proposal that I think will really work
21:48:04 <hemna> smcginnis, that's always been the problem we haven't solved yet in general, nova not knowing when it can safely call os-brick.disconnect_volume
21:48:24 <jgriffith> hemna: add some info or another check to Cinder
21:48:34 <jgriffith> I mean worst case scenario you could just do that
21:48:47 <jgriffith> hemna: cinder.api.safe-to-discon
21:48:56 <ildikov> scottda: there's a note in the Issues summary part that says "Problem exists in non-multi-attach for NFS and SMBFS volumes"
21:49:03 <hemna> jgriffith, if we stored both the host and instance_uuid in each volume_attachment table entry, nova can use that along with the 'shared' flag in the connection_info coming back from initialize_connection, to decide if it should call disconnect_volume or not.
21:49:04 <jgriffith> hemna: snowflakes can implement a check/response
21:49:05 <jgriffith> True/False
21:49:14 <jgriffith> hemna: my proposal does just that
21:49:49 <jgriffith> without the extra tracking complexity you mention
21:50:16 <ildikov> jgriffith: I think if we can have something like that shared flag Nova can figure out at detach time that would help
21:50:23 <hemna> ok, maybe I simply don't understand the nova side of your changes then.
21:50:34 <jgriffith> hemna: https://review.openstack.org/#/c/284867/2/cinder/volume/manager.py Line 947
21:50:36 <hemna> re: /dev/disk/by-path entries being created outside of udev
21:50:48 <jgriffith> alright, fair enough
21:50:55 <jgriffith> let's hear your proposal?
21:51:26 <ildikov> 10 minutes left of the official hour
21:51:27 <hemna> mine isn't that different really.
21:51:29 <hemna> heh
21:51:34 <jgriffith> ildikov: yeah, I think it's a lot easier to add another check
21:51:39 <ildikov> hemna: can we run through yours?
21:51:45 <jgriffith> hemna: well how do you solve the problem that you said I don't solve?
21:51:51 <hemna> have os-reserve create the attachment table entry and return the attachment_id
21:51:57 <hemna> nova has it for every cinder call after that.
21:52:01 <jgriffith> hemna: uhhh
21:52:06 <hemna> including initialize_connection
21:52:09 <jgriffith> so 2 questions:
21:52:18 <hemna> can I finish please?
21:52:22 <jgriffith> 1. why put it in reserve
21:52:39 <jgriffith> 2. how is it better to just use my proposal in a different call?
21:52:44 <jgriffith> hemna: yes, sorry
21:52:49 <hemna> thanks
21:53:14 <hemna> so I like having it returned in os-reserve, because then every nova call has the attachment_id for what it's working on. it's clean, and explicit.
21:53:47 <hemna> that solves the issue that I see in your wip for initialize_connection not handling multi-attach (re: manager.py line 920)
21:54:10 <jgriffith> ?
21:54:14 <hemna> we still have the issue of nova needing to know when it's safe to call os-brick.disconnect_volume.
21:54:46 <hemna> but that can be overcome with the shared flag in connection_info coming back from initialize_connection, as well as having the host in the attachments table along with instance_uuid.
21:55:08 <hemna> https://review.openstack.org/#/c/284867/2/cinder/volume/manager.py line 920
21:55:15 <hemna> attachment = attachments[0]
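hemna's alternative, sketched below from the Nova side. None of these signatures exist at this point; the attachment_id plumbing, the 'shared' flag name, and both helpers are hypothetical, only meant to show where the id would travel and how the disconnect_volume decision could be made.

```python
# Hypothetical shape of hemna's proposal -- names and signatures are
# illustrative only, no such API exists yet.

def attach(cinder, volume_id, instance_uuid, host, connector):
    # os-reserve would create the volume_attachments row up front and
    # return its id, so every later call names the exact attachment
    attachment_id = cinder.reserve(volume_id,
                                   instance_uuid=instance_uuid,
                                   host=host)
    conn_info = cinder.initialize_connection(volume_id, connector,
                                             attachment_id=attachment_id)
    # ... os-brick connect_volume, guest XML attach, etc. ...
    return attachment_id, conn_info

def safe_to_disconnect(conn_info, attachments, this_host, this_attachment):
    # hemna's rule: os-brick disconnect_volume is safe when the backend
    # gives each attach its own target, or when this is the last
    # attachment of the volume left on this compute host
    if not conn_info['data'].get('shared', False):  # hypothetical flag
        return True
    others = [a for a in attachments
              if a['host'] == this_host and a['id'] != this_attachment]
    return not others
```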
21:55:15 <jgriffith> hemna: so what you're proposing though just creates an empty entry in the table during reserve, then it gets updated in the same places as it does in my wip, no?
21:55:25 <jgriffith> hemna: because reserve doesn't have any info (it can't)
21:55:34 <hemna> that gets the first attachment it finds for that volume, which can be on any host or against any instance_uuid.
21:55:48 <hemna> all reserve needs to do is create the volume_attachments entry
21:55:54 <hemna> and get the attachment_id and return that.
21:56:12 <jgriffith> hemna: that's better why?
21:56:14 <hemna> nova has the instance_uuid, it can pass that to os-reserve as well.
21:56:36 <mriedem> i was going to say, seems os-reserve needs the instance uuid
21:56:37 <hemna> because we have the attachment_id for initialize_connection and we can always work on the correct attachment.
21:56:42 <hemna> mriedem, yup.
21:56:49 <hemna> then the API from nova to cinder is explicit
21:57:04 <hemna> both nova and cinder know which attachment is being worked on
21:57:13 <hemna> there is no guesswork.
21:57:17 <jgriffith> mriedem: hemna I'm still unclear on how this is any different?
21:57:17 <mriedem> hemna: does nova also pass the instance.host to os-reserve?
21:57:29 <hemna> it can
21:57:46 <hemna> then the attachment has both of those items already set at reserve time.
21:57:52 <jgriffith> hemna: mriedem in fact all the code/impl is exactly the same. You just turn reserve into something different that creates a dummy db entry?
21:57:55 <ildikov> maybe it should
21:57:57 <scottda> maybe we should focus on how these 2 alternatives are different... next meeting.
21:58:07 <ildikov> I don't know the calls during live migration, etc. though
21:58:42 <ildikov> scottda: +1
21:58:45 <hemna> live migration just updates the attachment and changes the host from A to B
21:58:58 <scottda> Perhaps disregard whether the attachment_id is created and returned to Nova in reserve() or initialize_conn and see where there is agreement.
21:59:11 <mriedem> i have to head out folks
21:59:12 <jgriffith> mriedem: hemna but it already passed *ALL* of that during the existing initialize_connection that we already have
21:59:18 <scottda> And, if possible, create more diagrams!
21:59:23 <mriedem> can we agree on some action items for next week?
21:59:36 <hemna> I can try and work on the other diagrams
21:59:42 <hemna> I'll see if I can switch to using dia
21:59:50 <hemna> since it's open source.
21:59:52 <scottda> Perhaps some diagrams that show the proposed changes as well?
22:00:12 <mriedem> jgriffith: the only difference from my POV there is os-reserve is in nova-api before we cast to compute where we do os-initialize_connection
22:00:14 <hemna> initialize_connection doesn't have the attachment_id currently
22:00:16 <mriedem> i don't know if that makes much difference
22:01:01 <jgriffith> mriedem: sure.. but it's odd to me. Reserve was specifically to address race conditions and only set the state in the DB to keep from losing the volume during the process
22:01:02 <ildikov> we could also evaluate the cases like live migration, shelve, etc. to see whether we have issues with any of the above listed calls
22:01:18 <jgriffith> I can't see what the advantage to changing that is
22:01:29 <mriedem> only the attachment_id if we need that later i guess
22:01:31 <hemna> jgriffith, it sets the volume to attaching and yet there is no attachment entry.
22:01:35 <mriedem> like a reservation id
22:01:39 <jgriffith> mriedem: it's not a big deal, and probably would be just fine. Just trying to understand the advantage
22:01:56 <hemna> then nova could have the attachment_id during calls to initialize_connection
22:02:04 <hemna> which works for single and multi-attach
22:02:12 <mriedem> if there are other gaps with live migration/evacuate/shelve offload, then maybe we work those through both approaches and see if there is an advantage, like ildikov said
22:02:13 <jgriffith> hemna: ok, sure I guess
22:02:31 <hemna> ok, so diagram for detach
22:02:35 <hemna> diagram for live migration
22:02:39 <mriedem> #action hemna diagram for detach
22:02:40 <hemna> if I can get to those.
22:02:49 <mriedem> #action hemna diagram for live migration
22:02:56 <scottda> hemna: I'll help with those
22:02:58 <mriedem> #action hemna and jgriffith enter the pit, one leaves
22:03:05 <hemna> lol
22:03:06 <cFouts> heh
22:03:08 <jgriffith> mriedem: LOL
22:03:10 <jgriffith> mriedem: nahhh
22:03:13 <scottda> mriedem: You didn't know about the cage at the Summit?
22:03:14 <jgriffith> hemna can have it
22:03:20 <ildikov> mriedem: yeah, that might show some differences, if not then we can still decide by complexity, amount of changes, any risks, etc.
22:03:30 <ildikov> mriedem: lol :)
22:03:52 <scottda> Whichever we decide, we're still going to have to explain it to other Cinder and Nova reviewers...
22:03:54 <jgriffith> just change all of the api signatures... wtf, it'll be fun :)
22:03:55 <mriedem> as a cinder outsider, i'm interested in both but don't know the details or pitfalls enough either way
22:03:58 <scottda> so the diagrams will come in handy.
22:04:16 <mriedem> i.e. i'll need to be told in both cases what the nova changes would look like
22:04:24 <hemna> yup
22:04:32 <mriedem> i think i understood hemna's more last week when we were talking through it
22:04:47 <ildikov> scottda: those are handy in general, I'm already a fan!
22:04:50 <mriedem> anyway, my 2 cents
22:04:56 <mriedem> gotta go though
22:04:59 <mriedem> thanks everyone
22:05:00 <ildikov> scottda: helps a lot in figuring out what's going on
22:05:06 <scottda> mriedem: Thanks!
22:05:18 <scottda> I'm going to have to head out soon myself....
22:05:26 <scottda> Anything else we can get done here today?
22:05:34 <ildikov> I think we're done for today
22:05:43 <andrearosa> thanks everyone
22:05:44 <ildikov> or well, my brain is at least :)
22:06:06 <ildikov> I will announce the next meeting on the ML for next week as a reminder
22:06:06 <scottda> Get some sleep ildikov andrearosa DuncanT
22:06:14 <scottda> Thanks!
22:06:15 <ildikov> we can track the action items on the etherpad
22:06:37 <ildikov> hemna: can you link the diagrams there if you haven't already?
22:06:46 <DuncanT> G'night all
22:06:58 <hemna> ildikov, I linked the attach diagram in the etherpad
22:07:08 <ildikov> hemna: coolio, thanks much
22:07:19 <scottda> Bye all
22:07:21 <ildikov> ok, then thanks everyone
22:07:43 <ildikov> have a good night/evening/afternoon/morning :)
22:07:54 <ildikov> #endmeeting