17:00:14 #startmeeting cinder-nova-api-changes
17:00:15 Meeting started Thu Mar 23 17:00:14 2017 UTC and is due to finish in 60 minutes. The chair is ildikov. Information about MeetBot at http://wiki.debian.org/MeetBot.
17:00:17 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
17:00:19 The meeting name has been set to 'cinder_nova_api_changes'
17:00:29 DuncanT ameade cFouts johnthetubaguy jaypipes takashin alaski e0ne jgriffith tbarron andrearosa hemna erlon mriedem gouthamr ebalduf patrickeast smcginnis diablo_rojo gsilvis xyang1 raj_singh lyarwood breitz
17:00:29 o/
17:00:30 o/
17:00:45 smcginnis: Jinx
17:00:52 o/
17:01:03 ildikov: Can you add me to your ping list please?
17:01:19 jungleboyj: sure
17:01:21 o/
17:01:25 Thank you.
17:01:40 jungleboyj: done
17:01:45 hi all
17:01:49 let's start :)
17:02:25 so we have lyarwood's patches out
17:02:50 I saw debates on get_attachment_id
17:03:11 is there a chance to get to a conclusion on what to do with that?
17:03:31 johnthetubaguy: did you have a chance to catch mdbooth?
17:03:55 He did
17:04:09 yeah, I think we are on the same page
17:04:17 it shouldn't block anything
17:04:18 i'm +2 on the bdm.attachment_id changes
17:04:22 or was
17:04:58 yeah, I am basically +2 as well, but mdbooth suggested waiting to see the first patch that reads the attachment_id, which I am fine with
17:04:58 mdbooth: johnthetubaguy: what's the conclusion?
17:05:23 basically waiting to get the patch on top of that that uses the attachment id
17:05:31 which is the detach volume patch, I think
17:05:34 * mdbooth thought they looked fine, but saw no advantage merging them until they're used. Merging early risks a second db migration if we didn't get it right first time.
17:06:18 * mdbooth is uneasy about not having an index on attachment_id, but can't be specific about that without seeing a user.
17:06:41 is it a get_by_attachment_id that the detach patch uses? I think it's not
17:07:25 it seems like we're getting ourselves into a catch-22 with two cores on the +2 side
17:07:37 we need jgriffith's detach patch restored and rebased
17:07:49 sorry... late
17:07:52 https://review.openstack.org/#/c/438750/
17:08:02 https://review.openstack.org/#/c/438750/2/nova/compute/manager.py
17:08:12 that checks if the bdm.attachment_id is set and makes the detach call decision based on that
17:08:21 we list the BDMs by instance, not attachment id
17:08:33 but we check the attachment_id to see if we detach with the new api or the old api
17:08:37 mriedem: +1
17:09:00 mriedem that's wrong :)
17:09:34 mriedem: but that will not get the get_by_attachment_id usage for us
17:09:48 ildikov: we don't need get_by_attachment_id
17:09:54 jgriffith: why is that wrong?
17:10:03 mriedem never mind
17:10:20 mriedem: I thought that's why we hold the attachment_id patches
17:10:21 mriedem I don't think I'm following your statement... going through it again
17:10:47 jgriffith: so are you planning to restore and rebase that on top of https://review.openstack.org/#/c/437665/
17:10:57 ildikov: that's why mdbooth wanted to hold them,
17:10:59 mriedem when it looks like it's going to land yes
17:11:00 which i don't agree with
17:11:12 mriedem wait... don't agree with what?
17:11:14 jgriffith: i'm ready to go with those bdm.attachment_id changes
17:11:20 sorry,
17:11:28 i don't agree that we need to hold the bdm.attachment_id changes
17:11:29 mriedem: I'm on that page too
17:11:31 they are ready to ship
17:11:39 https://review.openstack.org/#/c/437665/ and below
17:11:40 johnthetubaguy: ^^?
17:11:45 mriedem then yes, I'll do that today, I would LOVE to do it after it merges if possible :)
17:11:50 mriedem: I don't necessarily disagree, btw. They're just not being used yet, so I thought we might as well merge them later.
17:11:59 so in an hour or so :)
17:12:03 mdbooth: they aren't being used b/c jgriffith is waiting for us to merge them
17:12:08 hence the catch-22
17:12:15 so yeah we agree, let's move forward
17:12:23 mriedem: Does jgriffith have a nova patch which uses them?
17:12:29 These aren't exposed externally yet.
17:12:35 mdbooth I did yes
17:12:37 mriedem: thanks for confirming :)
17:12:44 mdbooth: https://review.openstack.org/#/c/438750/ will yes
17:12:50 I could go either way, merge now, or rebase old patch on top. Doing neither I can't subscribe to.
17:13:32 johnthetubaguy: mriedem: let's ship it
17:13:43 johnthetubaguy: mriedem: one item done, move to the next one
17:13:46 * mdbooth is confused. That patch can't use attachment_id because it passed tests.
17:13:58 And it doesn't have attachment_id
17:14:09 mdbooth: jgriffith and lyarwood originally wrote duplicate changes
17:14:12 at the PTG
17:14:14 or shortly after
17:14:26 jgriffith's patch ^ relied on his version of bdm.attachment_uuid
17:14:28 mdbooth: to be honest the detach flow is currently built on the assumption that we don't have attachment_id set, and therefore it uses the old detach flow
17:14:32 we dropped those and went with lyarwood's
17:14:50 Ok :)
17:14:53 so jgriffith will have to rebase and change attachment_uuid to attachment_id in his change
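
The detach decision the group keeps referring to is just a conditional on whether the BDM carries an attachment_id. A minimal sketch of that idea follows, with illustrative names only (this is not the code in the patch under review):

    # Sketch of the detach decision discussed above; volume_api stands in for
    # nova's Cinder API wrapper and bdm for a BlockDeviceMapping-like object.
    def detach_volume(volume_api, context, bdm, instance, connector):
        if bdm.attachment_id is None:
            # Attached before the new Cinder attachments API existed:
            # use the legacy terminate_connection/detach flow.
            volume_api.terminate_connection(context, bdm.volume_id, connector)
            volume_api.detach(context, bdm.volume_id, instance.uuid)
        else:
            # Attachment was created through the new API, so tear it down
            # by deleting the attachment record instead.
            volume_api.attachment_delete(context, bdm.attachment_id)
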
17:15:37 what's next?
17:15:49 mriedem: so we ship or hold now?
17:16:00 we merge lyarwood's bdm.attachment_id changes
17:16:10 mriedem: awesome, thanks
17:16:16 sweet
17:16:30 i plan on going over johnthetubaguy's spec again this afternoon but honestly,
17:16:37 at this point, barring any calamity in there,
17:16:39 i'm going to just approve it
17:16:48 mriedem: +2
17:16:48 mriedem: I would just ship that too
17:17:19 mriedem: looks sane to me and we cannot figure everything out on paper anyhow
17:17:34 we all agree on the general direction now, so yeah, let's steam ahead and deal with the icebergs as they come
17:17:44 johnthetubaguy: +1
17:18:14 we need to get the cinderclient version discovery patch back to life
17:18:21 * ildikov looks at jgriffith :)
17:18:52 ildikov johnthetubaguy so my understanding is you guys wanted that baked in to the V3 detach patch
17:19:07 so I plan on that being part of the refactor against lyarwood's changes
17:19:26 i'm not sure we need that,
17:19:31 if the bdm has attachment_id set,
17:19:36 it means we attached with a new enough cinder
17:19:57 so on detach, if we have bdm.attachment_id, i think we assume cinder v3 is there and just call the delete_attachment method or whatever
17:19:59 so I am almost +2 on that refactor patch, FWIW, there is just a tiny nit left
17:20:00 mriedem: we need that some time before implementing attach
17:20:04 the version checking happens in the attach method
17:20:09 s/method/patch/
17:20:12 mriedem that works
17:20:19 ildikov:
17:20:23 yes it comes before the attach patch
17:20:25 which is the end
17:20:42 mriedem: ok, then we're kind of on the same page I think
17:20:43 yeah, attach is version negotiation time, everyone else just checks attachment_id to see if they have to use the new version or not
17:20:47 and just hope something with the stupid migrate or HA stuff didn't happen :)
17:21:11 well, i think we'd get a 400 back if the microversion isn't accepted
17:21:16 and we fail the detach
17:21:18 mriedem correct
17:21:28 we can make the detach code smarter wrt version handling later if needed
17:21:28 that's all legit, seems fine
17:21:32 mriedem and I can revert to the old call in the cinder method itself
17:21:32 i just wouldn't worry about that right now
17:21:39 no revert, just fail hard
17:21:44 right hard fail
17:21:53 ummm... ok, you's the bosses
17:22:07 we want to eventually drop the old calls,
17:22:11 so we only get in that mess if you downgraded Cinder after creating the attachment
17:22:12 so once we attach new path, we detach new path
17:22:13 or fail
17:22:22 yeah if you downgrade cinder, well,
17:22:26 that's on you
17:22:30 right, dragons
17:22:44 johnthetubaguy: if someone does that then I would guess they might have other issues too anyhow...
17:23:05 ildikov: right, it's just not... allowed
17:23:15 johnthetubaguy: I think that is a very unlikely edge case.
17:23:28 Something has really gone wrong in that case.
17:23:37 ok what's next?
17:23:37 certainly not something we would want to support or encourage people to do
17:23:38 ... so we fail hard in detach, if attachment_id is present but the microversion request goes bad, simples
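
So version discovery only has to happen once, at attach time; after that, the presence of bdm.attachment_id implies the new flow, and a rejected microversion on detach is a hard failure rather than a fallback. A toy sketch of that check follows; the 3.27 microversion and the helper names are assumptions here, not taken from the cinderclient patch being discussed:

    # Toy sketch: decide at attach time whether the new Cinder attachments
    # API can be used, based on the max microversion the endpoint reports.
    REQUIRED_MICROVERSION = (3, 27)  # assumed microversion for the attachments API

    def parse_microversion(version_string):
        major, minor = version_string.split('.')
        return int(major), int(minor)

    def can_use_new_attach_flow(reported_max_microversion):
        """reported_max_microversion: e.g. '3.27' from Cinder's version document."""
        return parse_microversion(reported_max_microversion) >= REQUIRED_MICROVERSION

    # Detach never re-negotiates: if bdm.attachment_id is set and Cinder
    # rejects the microversion (e.g. a 400 after a downgrade), the detach
    # fails instead of silently reverting to the legacy call.
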
17:23:57 i think the detach patch is the goal for next week yes?
17:24:08 I think we need to get detach working
17:24:12 is there anything else?
17:24:20 the refactor is the question
17:24:39 this? https://review.openstack.org/#/c/439520/
17:24:51 we could try to merge lyarwood's refactor first, I think it's crazy close to being ready after mdbooth's hard work reviewing it
17:25:02 that's the one
17:25:19 johnthetubaguy: ah, right, the detach refactor
17:25:36 I think it will make the detach patch much simpler if we get the refactor in first
17:25:50 mdbooth: what's your take on that ^
17:25:57 ok i haven't looked at the refactor at all yet
17:26:09 * mdbooth reads
17:26:15 to be clear, it's just the above patch, and not the follow-on patch
17:26:22 jgriffith's detach patch is pretty simple, it's just a conditional on the bdm.attachment_id being set or not
17:26:47 johnthetubaguy: I think lyarwood is out of office this week, so if that's the only thing missing we can add that and get this one done too
17:27:31 mriedem: hmm, I should double check why I thought it would be a problem
17:27:42 I haven't had a chance to look at lyarwood's updated refactor in detail yet (been buried downstream for a few days), but when I glanced at it it looked like a very simple code motion which brings the code inline with attach.
17:28:40 It looked low risk and obvious (barring bugs). Don't think I've seen the other patches.
17:28:51 so, I remember now...
17:28:53 https://review.openstack.org/#/c/438750/2/nova/compute/manager.py
17:28:58 we have many if statements
17:29:21 I think after lyarwood's patch we have many fewer, or at least, they are in a better place
17:29:57 if that's not the case, then I think the order doesn't matter
17:30:11 johnthetubaguy: Right. All the mess is in one place.
17:30:19 Centralised mess > distributed mess
17:30:28 yeah, mess outside the compute manager
17:30:33 johnthetubaguy: yeah that's true
17:30:51 ok so i'm cool with that
17:30:56 if you guys think the refactor is close
17:31:31 +1
17:32:46 so in summary, I guess that means jgriffith to rebase on top of https://review.openstack.org/#/c/439520
17:32:47 ?
17:32:54 johnthetubaguy: mdbooth: are one of you going to address the remaining nit?
17:33:03 it's just defining the method in the base class right?
17:33:16 we should be using ABCs there... but that's another story
17:33:19 mriedem: It's open in my browser. I'll probably look in the morning.
17:33:26 mdbooth: ok
17:33:33 otherwise i'll address johnthetubaguy's comment
17:33:37 and he can +2 if he's happy
17:33:47 I was going to do that now
17:33:53 ok
17:34:07 yes so let's rebase https://review.openstack.org/#/c/438750/ on top of https://review.openstack.org/#/c/439520
17:34:12 and the goal for next week is detach being done
17:34:36 #info let's rebase https://review.openstack.org/#/c/438750/ on top of https://review.openstack.org/#/c/439520 and the goal for next week is detach being done
17:34:39 mriedem johnthetubaguy thanks, that's the one I pointed out this morning when I tried to hijack your other meeting :)
17:35:06 mriedem: johnthetubaguy: +1, thanks
17:36:19 ok, it seems we're done for today
17:36:29 anything else from anyone?
17:36:31 ok, so let's be done with multiattach talk for this meeting.
17:36:32 yes
17:36:39 at the ptg, we talked about a cinder imagebackend in nova
17:36:45 mdbooth is interested in that as well,
17:36:49 but we need a spec,
17:36:57 so i'm wondering if mdbooth or jgriffith would be writing a spec
17:37:19 mriedem so I have some grievances with that approach :)
17:37:34 Time for the airing of grievances.
17:37:34 air them out
17:37:44 bring out the aluminum pole
17:37:46 i'll get the festivus pole
17:37:52 ;)
17:38:04 Here's the thing... I know you all hate flavor extra-specs, but lord it's simple
17:38:04 jgriffith: I missed the PTG unfortunately. Do you want to have a detailed chat tomorrow?
17:38:26 more importantly, there's a side effect of the cinder image-backend that I didn't think of
17:38:29 it's all or nothing
17:38:46 so you do that and you lose the ability to do ephemerals if you wanted to
17:38:51 i thought your whole use case was "i have cinder for storage, i want to use it for everything"
17:38:59 "i don't have any local storage on my computes"
17:39:00 which meh... ok, I could work around that
17:39:09 That's one use case yes
17:39:10 is this per compute node? I think that's OK?
17:39:19 yes it's per compute node
17:39:22 so i think that's fine
17:39:33 so that's my question :)
17:39:43 that addresses my customers that have ceph...
17:39:47 if you really want ephemeral, then you have hosts with ephemeral
17:39:52 that's totally fine right, some flavors go to a subset of the boxes
17:39:53 and put them in aggregates
17:39:56 right
17:40:00 BUT, how do I schedule... because it seems you don't like scheduler hints either ;)
17:40:10 you have your cheap flavors go to the ephemeral aggregates
17:40:18 and your "enterprise" flavors go to the cinder-backend hosts
17:40:27 mriedem so we use flavors?
17:40:30 jgriffith: the flavors tie to the aggregates
17:40:30 yes
17:40:34 isn't that how I started this?
17:40:37 no
17:40:41 jgriffith: the approach we are talking about uses the normal APIs, no hints required
17:40:49 you started by baking the bdm into extra specs which we're not doing
17:40:57 right, this is all normal stuff
17:41:04 mriedem flavor extra-specs TBC
17:41:04 Incidentally, I've been trying to get the virtuozzo folks to implement this for their stuff: I'd like imagebackend to be stored in metadata attached to the bdm
17:41:24 To me, this is the most sane way for this stuff to co-exist
17:41:26 jgriffith: but standard extra specs that are tied to aggregate metadata
17:41:45 Having it hard-coded in nova.conf is extremely limiting
17:41:57 we also want the cinder imagebackend for other storage drivers, like scaleio
17:42:03 and i'm sure veritas at some point
17:42:54 mdbooth: so there is a thing where we should support multiple ephemeral backends on a single box at some point, and we totally need that, I just don't think that's this (I could be totally wrong)
17:43:01 I believe there are going to be mixed use cases. For example, HPC where you want some of your storage to be local and really fast, even at the cost of persistence.
17:43:29 we talked about multiple ephemeral backends at the ptg didn't we?
17:43:44 the vz guys were talking about something like that, but i might be misremembering the use case
17:43:55 Well the cinder imagebackend would be all or nothing as it stands
17:44:07 Because it's always all or nothing right now
17:44:09 so the HPC folks care about resource usage being optimal, so they would segregate the workloads that want SSD, and just use volumes for the extra storage, I think...
17:44:23 right,
17:44:27 but that's probably a rat hole we might want to dig out of
17:44:31 mriedem: Yes, the vz guys want to add a new hard-coded knob to allow multiple backends
17:44:34 i think if you have different classes of storage needs you have aggregates and flavors
17:44:38 I think that's just adding mess to mess
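
The flavors-to-aggregates mapping mriedem describes above is the standard scheduler mechanism: aggregate metadata plus aggregate-scoped flavor extra specs, no scheduler hints involved. A toy sketch of the matching logic (the real thing is nova's AggregateInstanceExtraSpecsFilter; this only illustrates the idea, and the 'storage' key and values are made up):

    # Toy sketch: a flavor lands only on hosts whose aggregate metadata
    # matches its aggregate-scoped extra specs.
    SCOPE = 'aggregate_instance_extra_specs:'

    def host_matches(flavor_extra_specs, aggregate_metadata):
        for key, wanted in flavor_extra_specs.items():
            if not key.startswith(SCOPE):
                continue
            if aggregate_metadata.get(key[len(SCOPE):]) != wanted:
                return False
        return True

    # e.g. an "enterprise" flavor pinned to cinder-backed hosts, while a
    # cheap flavor stays on the local-ephemeral aggregate:
    enterprise = {SCOPE + 'storage': 'cinder'}
    assert host_matches(enterprise, {'storage': 'cinder'})
    assert not host_matches(enterprise, {'storage': 'local'})
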
17:44:39 jgriffith: do you still support local volumes?
17:44:48 johnthetubaguy: they don't
17:44:57 that came up again in the ML recently
17:44:59 ah, that got taken out
17:45:00 i can dig up the link
17:45:00 johnthetubaguy I don't know what you mean
17:45:07 LocalBlockDevice
17:45:10 or whatever cinder called it
17:45:19 yeah, like a qcow2 on a local filesystem
17:45:19 Oh!
17:45:23 No, we removed that
17:45:42 I'm happy to entertain a better way to do it, but not for Pike
17:45:48 Technically, it's still there and deprecated and will be removed in Q.
17:45:51 so... I kinda see a world where all storage is in Cinder only, once we add that back in (properly)
17:45:52 when I say entertain, I mean implement
17:45:59 I already know how to do it
17:46:06 but the cinder ephemeral backend seems a good starting place
17:46:15 johnthetubaguy I agree with that
17:46:18 johnthetubaguy: +1
17:46:20 +1
17:46:27 I'm just concerned WRT whether we'll get there or not
17:46:49 so if we don't write the spec, we're certain not to get there
17:46:53 http://lists.openstack.org/pipermail/openstack-dev/2016-September/104479.html is the thread
17:46:59 Looking at the *tricks* some backends have used to leverage Nova instances is disturbing
17:47:09 johnthetubaguy touché
17:47:27 jgriffith: I had to say that, the urge was too strong :)
17:47:38 I was hoping to make one last-ditch effort at a plea for flavor extra-specs... but I see that ship has certainly sailed and I fell off the pier
17:47:39 mriedem: thanks, that's the one
17:47:40 http://lists.openstack.org/pipermail/openstack-dev/2017-February/112847.html was picking it back up during the PTG
17:47:58 i've also asked jaypipes to chime in on that thread,
17:47:59 johnthetubaguy I would've thought less of you if you hadn't said it ;)
17:48:03 because it came up in the nova channel again yesterday
17:48:08 and the answer was traits and placement,
17:48:13 but i'm not sure how that works yet in that scenario
17:49:14 i need the answer to this http://lists.openstack.org/pipermail/openstack-dev/2017-February/112903.html
17:49:20 mriedem: yeah, we get into the ensemble scheduler that was proposed at one point, but let's not go there, but worth finding the notes on that
17:50:06 i think the answer during the PTG was,
17:50:12 you have a resource provider aggregate,
17:50:20 and the compute and storage pool are in the aggregate,
17:50:32 mriedem: I think we need the aggregate to have both resource providers in it
17:50:33 and a resource type that matches the volume provider
17:50:35 and there is a 'distance' attribute on the aggregate denoting the distance between providers,
17:50:49 and if you want compute and storage close, then you request the server with distance=0
17:50:59 so there is a spec about mapping out the DC "nearness" in aggregates
17:51:01 good lord
17:51:06 we can't escape that one forever
17:51:15 Look.. there's another idea you might want to consider
17:51:19 johnthetubaguy: i think that's essentially what jay told me this would be
17:51:24 We land the cinder image-backend
17:51:30 mriedem: yeah, sounds like the same plan
17:51:42 Have a cinder-agent or full cinder-volume service on each compute node...
17:52:00 If you want something fast, place on said node with cinder capacity, skip the iscsi target and use it
17:52:24 but the fact is, all this talk about "iscsi isn't fast enough" has proven false in most cases
17:52:34 we are really just talking about how that gets modelled in the scheduler, we are kinda 60% there right now I think
17:52:59 johnthetubaguy I know, but i'm suggesting before doing the other 40% we consider an alternative
17:53:08 should we suggest this as a forum topic?
17:53:12 jgriffith: depends on how much you spend on networking vs the speed you want, agreed it's possible right now
17:53:22 the "i want to use cinder for everything" thing has come up a few different ways
17:53:28 mriedem: we might want to draw this up a bit more crisply first, but yeah, I think we should
17:53:31 mriedem I'm willing to try and discuss it
17:53:48 mriedem but honestly if it's like other topics that are already decided I don't want to waste anyone's time, including my own
17:53:57 * johnthetubaguy sits on his hands so he doesn't say he will try to write this up
17:54:05 well the point of the forum session is to get ops and users in the room too
17:54:20 mriedem: ++
17:54:31 anyway, i can take a todo to try and flesh out something for that
17:54:41 mriedem: I am happy to review that
17:54:54 so the baby step towards that is the Cinder backend ephemeral driver
17:55:01 yeah
17:55:06 which helps the "diskless" folks
17:55:21 johnthetubaguy the good thing is that certainly does open up many more long-term possibilities
17:55:46 it seems aligned with our strategic direction
17:55:49 There has been enough request around the idea that we should keep trying to move something forward.
17:55:59 * johnthetubaguy puts down management books
17:56:13 jungleboyj which request?
17:56:23 or... I mean "which idea"
17:56:41 using cinder for everything
17:56:43 local disk? Ephemeral Cinder? Cinder image-backend
17:56:44 always boot from volume
17:56:49 mriedem thank you
17:57:02 3 minutes left
17:57:02 I get distracted easily
17:57:04 Ephemeral Cinder
17:57:04 root_disk = -1, etc
17:57:19 so next steps...
17:57:33 forum session, and nova spec on cinder ephemeral back end?
17:57:40 mriedem takes the first one
17:57:44 So, the plan is that the question of being able to co-locate volumes and instances will be handled by the placement service?
17:57:44 the second one?
17:57:44 yup
17:58:01 jungleboyj: that will be part of the forum discussion i think
17:58:12 mriedem: Great
17:58:16 who's going to start the nova spec for the cinder image backend?
17:58:24 mriedem I'll do it
17:58:26 I promise
17:58:32 ok, work with mdbooth
17:58:37 he's our imagebackend expert
17:58:43 mdbooth Lucky you :)
17:58:47 I want to review both of those things when they exist
17:59:02 mdbooth I meant that you get to work with me, not an enviable task
17:59:18 jgriffith: Hehe
18:00:05 #action mriedem to raise forum session about cinder managing all storage, scheduling, etc
18:00:22 #action jgriffith to work with mdbooth on getting a spec together for the cinder image backend
18:00:28 and a timeout warning
18:00:29 something like that?
18:00:32 yes
18:00:34 works for me
18:00:37 sweet
18:00:38 time's up
18:00:45 I guess we are done done
18:00:53 :-)
18:01:02 awesome, thank you all!
18:01:07 thanks
18:01:09 great meeting today :)
18:01:23 see you next week :)
18:01:26 #endmeeting