17:00:16 #startmeeting cinder-nova-api-changes
17:00:17 Meeting started Thu Feb 2 17:00:16 2017 UTC and is due to finish in 60 minutes. The chair is ildikov. Information about MeetBot at http://wiki.debian.org/MeetBot.
17:00:18 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
17:00:21 The meeting name has been set to 'cinder_nova_api_changes'
17:00:28 scottda DuncanT ameade cFouts johnthetubaguy jaypipes takashin alaski e0ne jgriffith tbarron andrearosa hemna erlon mriedem gouthamr ebalduf patrickeast smcginnis diablo_rojo gsilvis xyang1 raj_singh lyarwood
17:00:34 hi
17:00:36 yough
17:00:41 hi
17:00:56 \o_
17:01:11 That time again? :)
17:01:13 hi :)
17:01:36 I will need to switch to my phone in about 25 minutes as I need to get from one place to the other
17:01:42 wait, what day is it?
17:01:47 #chair scottda smcginnis
17:01:48 Current chairs: ildikov scottda smcginnis
17:01:59 jgriffith: don't worry it's Thursday :)
17:02:04 :)
17:02:50 I pinged the Nova guys, hopefully they can join
17:03:01 I updated the etherpad with the latest items
17:03:04 #link https://etherpad.openstack.org/p/cinder-nova-api-changes
17:03:41 I think the most urgent is to figure out how to get Cinder V3 tested on the gate with Nova
17:03:53 ildikov +1
17:04:04 both from the perspective of the new API and V3 in general
17:04:32 jgriffith: Were you still hitting issues trying to run just v3?
17:04:37 did mattR put up a patch for that?
17:04:38 we need some updates so we can pass extra config options to devstack on the gate to use the right endpoints, etc
17:04:41 smcginnis yup
17:04:59 scottda: https://review.openstack.org/#/c/326585/
17:04:59 Anyone else tried this?
17:05:06 scottda: https://etherpad.openstack.org/p/infra-ptg-pike
17:05:08 I've been running v3 with nova and it's working
17:05:15 smcginnis I actually just tried running master against V3 (without the new attach stuff) and it seems broken
17:05:19 scottda oh really?
17:05:32 scottda you had a successful tempest run? do tell?
17:05:46 jgriffith: Not with nova patches for new attach stuff, just using old stuff and cinder v3
17:05:57 no, haven't run tempest
17:05:58 scottda: tempest you mean or manual testing?
17:06:09 scottda: ok :)
17:06:18 I actually ran yesterday and had failures with volume create and delete stuff timing out.
17:06:20 scottda so I've been trying to just get a clean tempest run WITHOUT new attach/detach in Nova, just setting nova/volume/cinder.py endpoint to V3
17:06:37 scottda I get all sorts of failures for invalid parameters, unexpected results etc etc
17:06:49 jgriffith: yes, me too, and getting failures as well.
17:07:05 well what the heck?
17:07:24 can I just move the new code back into V2 then? and we can remove all the V3 stuff since it doesn't work anyway :)
17:07:32 but I'm setting nova.conf to use cinder v3. and seeing errors in unexpected places...
17:07:44 jgriffith: sure, go for it.
17:07:52 scottda yeah, because we never tested any of this :(
17:08:14 * bswartz sneaks in late
17:08:39 the basic volume create works with manual testing, so it must be some config issue I guess, or at least partially that
17:08:44 I'm confused. It should be the exact same code when not using microversions.
17:09:13 smcginnis: yeah, that's why I think there's something with config
17:09:15 smcginnis emphasis on *should*
17:09:17 yeah, v2 and v3.0 are the same
17:10:29 As our microversion expert, I blame scottda. :P
17:10:42 I can set nova to use cinder v3 and attach and detach. I can verify the calls go through cinder v3
17:10:59 I can use my POC patch to use cinder v3.27
17:11:00 it would be great to figure this out before the PTG, so if we can get the devstack bits through then we can turn on testing in the placement API job in Nova
17:11:21 I'll work on finding and debugging tempest failures
17:11:37 And I'm working on getting rid of the term "microversions" and using "API versions"
17:11:39 to get at least V3 tested on a regular basis as a starting point
17:11:39 :)
17:12:10 scottda: or just use apple pie :)
17:12:18 ha..I prefer pizza
17:13:12 I'm fine with that too :)
17:13:58 Well, ultimately the nova people will need to decide how to use the new cinder apis, based on how to figure out if the cinder server supports them
17:14:30 scottda ummmm
17:15:01 scottda that's not really how this community works... "Here, let me throw this over the wall... now it's your problem"
17:15:13 jgriffith: That's not what I'm saying
17:15:23 I'm talking about in the code for nova/compute/manager.py
17:15:26 in attach
17:15:39 scottda "Well, ultimately the nova people will need to decide how to use the new cinder apis"
17:15:41 scottda: which part of that?
17:15:44 There could be: try:...new stuff except: use old stuff
17:15:54 there already is
17:16:02 or some kind of if: is_new_stuff() else: use old stuff
17:16:04 scottda ildikov and I have already been working on that
17:16:23 scottda: in compute/manager.py it works like a charm already for attach
17:16:24 scottda the problem is the Cinder V3 stuff doesn't seem to work right
17:16:34 We need to make sure the new stuff is working right. But I think scottda's saying _how_ those get implemented in Nova and handled will need to be up to the nova team.
17:16:39 scottda the only calls that work are the new attach/detach calls
17:16:50 jgriffith: Not for me.
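[Editor's note] The version-gated fallback discussed around 17:15:44 ("try:...new stuff except: use old stuff") can be sketched roughly as below. This is a hedged illustration, not the actual nova/compute/manager.py code: the helper names (`supports_new_attach_flow`, `attachment_create`, `reserve_volume`) and the fake client are assumptions, though 3.27 is the real Cinder microversion that introduced the new attachment API.

```python
# Hypothetical sketch of the version-gated attach fallback discussed
# above; names are illustrative, not the actual Nova code.

NEW_ATTACH_MICROVERSION = (3, 27)  # Cinder microversion adding the new attach API


def supports_new_attach_flow(server_version):
    """Return True if the Cinder server speaks the new attach API."""
    return tuple(server_version) >= NEW_ATTACH_MICROVERSION


def attach_volume(cinder, server_version, instance_id, volume_id):
    """Prefer the new attachment flow, fall back to the old reserve path."""
    if supports_new_attach_flow(server_version):
        # New flow: attachment_create reserves the volume and returns
        # an attachment record that is finalized later.
        return cinder.attachment_create(instance_id, volume_id)
    # Old flow: reserve the volume, then initialize the connection.
    cinder.reserve_volume(volume_id)
    return cinder.initialize_connection(volume_id, connector={})
```

The point debated in the log is exactly where this branch should live (compute/api.py vs compute/manager.py) and how `server_version` gets discovered rather than hardcoded.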
They all work
17:16:51 scottda: we manual tested it with jgriffith only as tempest is broken anyhow and used hardcoded numbers, but the mechanism itself works
17:17:19 scottda you're confusing me... I *thought* you said you saw tempest failures as well?
17:17:38 smcginnis: if we make it work they will be fine, that's why we play with the PoC, but they will have the final word on it for sure
17:17:42 scottda ok, if they all work for you, paste your config, I'll give it another go and see what I missed
17:17:54 smcginnis: unfortunately we don't have merge rights for that repo... :)
17:17:55 but on my side any time nova calls volume-create/delete/snapshot it fails
17:18:12 smcginnis: or maybe luckily, I'm not sure :)
17:18:12 I saw tempest failures that seemed unrelated.
17:18:16 jgriffith: do you have your nova POC change to hand?
17:18:20 ildikov: ;)
17:18:38 lyarwood https://review.openstack.org/#/c/330285/
17:18:45 jgriffith: thanks
17:18:59 lyarwood I modified that a bit and just set the URL by brute force in my cinder.py file
17:19:03 I have this in /etc/nova/nova.conf
17:19:10 https://www.irccloud.com/pastebin/j1ynr31m/
17:19:21 and restart n-cpu
17:19:24 scottda ok, I'll set that and run tempest
17:19:28 scottda thanks
17:19:43 and I use this: https://review.openstack.org/420201
17:19:51 scottda: yeap, that's what I use too
17:23:21 scottda: I think you can leave out the manager.py part from that patch and just get the microversion part fixed
17:23:43 ildikov: yeah, that's just a demo to show it works
17:24:04 ildikov: I don't want to go further until we discuss *how* and *when* to get cinder server version info
17:24:19 Some folks in cinder suggest we could put something in nova.conf
17:24:47 just have the admin configure to say "use_cinder_version: 3.27" or something like that
17:24:51 scottda: for PoC purposes it's fine, we have those parts in compute/api.py and will figure out the manager part as it might get tricky with old/new volumes mixed
17:25:21 scottda: that's API facing config, that should not go into nova.conf
17:25:30 scottda: I agree with the Nova folks regarding that
17:25:47 Ok, so that's even worse than what I had: http://paste.openstack.org/show/597404/
17:26:12 my config obviously must be wrong
17:26:17 jgriffith: :(
17:26:39 yeah... all Identity errors
17:26:41 jgriffith: I'm guessing. I'm running now and not seeing errors yet
17:26:55 scottda cool, merge it
17:26:57 :)
17:27:00 scottda: smcginnis: I need to switch to my phone, but I made you chairs, so if I seem to be lost please end the meeting when it's time
17:27:03 jgriffith: LOL
17:27:22 ildikov: Got it.
17:27:28 My URL obviously must be incorrect
17:27:50 smcginnis: tnx :)
17:27:58 jgriffith: Maybe try isolating the run down to one or a handful of tests to try to see how/where it's breaking down?
17:28:11 smcginnis I did that yesterday :)
17:28:35 first I want to get to the same level that scottda and ildikov are instead of brute forcing the URL in myself
17:28:56 jgriffith: ATM, I'm running a fresh devstack with only the new cinder v3 in nova.conf
17:29:09 that's the only change
17:29:37 scottda alright, well if it works then awesome
17:29:39 After that, I'll test my POC and have nova prefer v3.0
17:29:48 then work on 3.27
17:30:03 jgriffith: you can hardcode the catalog_info in the Nova code too if you don't like the config file
17:30:33 ildikov: It's be nice to get things working without hard-coded changes.
17:30:46 s/It's/It'd
17:31:07 scottda: we will switch that one day anyhow :)
17:31:16 I hope at least
17:31:36 ildikov: OK. I'm not really sure where in the code you are talking about. but OK
17:32:03 So 1) get it working with hard coding, 2) figure out preferred way to do it without hard coding, 3) propose way for nova to start using the microversion calls?
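[Editor's note] The nova.conf change being tested here (pointing Nova's Cinder client at the v3 endpoint via `catalog_info`) looks roughly like the fragment below. The actual pastebin contents are not in the log, so treat this as an assumed reconstruction; the `catalog_info` format is `service_type:service_name:endpoint_type`, and the default at the time was the v2 triple.

```ini
# Assumed reconstruction of the nova.conf snippet referenced above.
# Default in this era was: catalog_info = volumev2:cinderv2:publicURL
[cinder]
catalog_info = volumev3:cinderv3:publicURL
```

After changing this, n-cpu must be restarted (as noted at 17:19:21) for Nova to pick up the new endpoint.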
17:32:20 ildikov yeah, that's what I was doing when I only had like 22 failures
17:32:42 but since scottda gets a clean run it must be something wrong on my side, trying to figure it out now
17:32:52 scottda: where the config stuff is and the catalog_info is set to v2 by default
17:33:13 jgriffith: I haven't gotten a clean run yet...just progressed beyond the 111 failures you've pasted.
17:33:19 I'm running now
17:33:28 91
17:33:48 jgriffith: I started my tempest run with the new stuff in, but will let you know if I hit any unrelated changes
17:34:22 s/changes/failures :)
17:37:24 Do we have anything more regarding Tempest?
17:38:04 Guess we can't really do much until we see that passing.
17:38:28 Don't think so either
17:39:03 We need to look into what we would want to discuss at the PTG
17:40:07 Both regarding Nova and what we need to still fix on the Cinder part for multi-attach
17:40:30 As we still haven't figured out detach in that scenario
17:41:34 https://www.irccloud.com/pastebin/fDEInZJz/
17:41:44 And I became very unpopular now it seems :)
17:41:48 I'll start looking at those 12 failures
17:42:15 ildikov: Sorry. Yeah, I agree we need to discuss those.
17:42:15 ooh multi-attach. is that still a 4 letter word?
17:42:22 That's a bit more friendly than 91
17:43:32 hemna: Yes. :)
17:43:33 hemna: it depends on which four letters you pick :)
17:43:39 * bswartz is sad that multiattach is so hard
17:43:51 looks to be all "no valid host found"
17:44:04 Had to laugh (sadly) to myself yesterday looking back through some old Cinder meeting logs. We were talking about this in June of 2015.
17:44:14 bswartz: it's not, it's the drivers' fault...
17:44:31 ildikov: lies!
17:45:00 the driver interface encourages drivers to not support multiattach, but it can be done
17:45:25 bswartz: maybe a bit of an exaggeration but not a lie
17:45:27 live migration would never have been possible without drivers implementing multiattach through a back door interface
17:45:51 and live migration works perfectly well on at least a few drivers
17:45:55 bswartz: I meant the detach issues
17:46:08 ah
17:46:39 bswartz: you don't have it attached to the same host twice with live migration
17:47:11 ildikov: that's a different statement than multi attach not working though
17:47:40 bswartz: but you can have two instances on the same host with multiattach and that's when things get messy
17:47:51 yup
17:47:53 I agree we need some kind of smarter ref counting to know when detaches are safe
17:48:02 that's can't possibly be blamed on drivers however
17:48:10 s/that's/that/
17:48:11 bswartz: if you cannot safely detach I consider that as not working
17:48:37 Technically, multi-detach doesn't work
17:48:46 multi-attach is fine
17:49:29 bswartz: back ends handle the target differently, one exports a new one for each attachment, others have one per host
17:49:49 so it's not just ref count anymore
17:50:26 but you need to know which back end we're using, etc
17:50:38 that problem isn't solvable at the driver level, it needs the manager to implement something
17:50:57 It needs both I think
17:51:13 well yes, the drivers have to implement whatever new logic the manager requires
17:51:22 You cannot solve this without driver support
17:51:47 I thought drivers were supposed to put a shared flag in the return of initialize_connection?
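[Editor's note] The ref-counting idea above (only terminate a connection when no other attachment still shares the backend's export) could look something like the sketch below. This is a hypothetical helper illustrating the proposal being discussed, not existing Cinder manager code; the `shared_targets` name and the attachment-dict shape are assumptions.

```python
# Sketch of the "shared target" ref-count check discussed above;
# hypothetical helper, not actual cinder/volume/manager.py code.

def safe_to_terminate_connection(attachments, host, shared_targets):
    """Decide whether terminate_connection is safe for an attachment on `host`.

    attachments:    remaining attachment records, each a dict with an
                    'attached_host' key.
    shared_targets: True if the backend exports one target per host
                    (other attachments on the same host share it),
                    False if it exports a new target per attachment.
    """
    if not shared_targets:
        # One export per attachment: tearing ours down affects nobody else.
        return True
    # One export per host: only safe when ours is the last attachment
    # left on this host.
    on_host = [a for a in attachments if a['attached_host'] == host]
    return len(on_host) <= 1
```

This is why the log concludes the fix needs both sides: the driver must report how it exports targets, and the manager must do the counting.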
17:51:48 ildikov: maybe we can agree on "it's all the driver INTERFACE's fault"
17:52:18 bswartz: I can live with that, yes :)
17:53:15 so as a summary we need to figure out how to solve this in Cinder
17:54:19 Also things are popping up like supporting volume name for the new attach/detach calls
17:54:50 I added that to the etherpad, we have a bug report for that
17:55:49 We can just have bug reports for things that pop up and see what's urgent to fix
17:56:11 ildikov: well we can blame the driver interface on vish because he isn't here
17:56:22 hemna: I think that flag sounds good
17:56:49 hemna: we just need to agree that's the way and get the drivers updated
17:56:56 I think I'm the owner of an ancient bug related to detach not working on the NFS driver for exactly this kind of reason
17:57:15 ildikov, so by default drivers don't support the multi attach flag
17:57:32 hemna that could work, but I was also going to add supports_multiattach to the capabilities
17:57:36 so they'll have to submit a patch to support it. part of that review could be ensuring that they pass back the shared flag
17:57:44 bswartz: oh, then plz get involved in fixing this :)
17:57:51 jgriffith, I think we already have one...
17:57:54 default is False, so if you don't update your driver when we get this all settled then you're out of luck
17:58:07 hemna can't use it
17:58:12 hemna: not even LVM?
17:58:15 unless I set everybody that has it to False
17:58:21 ildikov: if I had the spare time I would
17:58:28 jgriffith: That'd be fair for now.
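[Editor's note] The opt-in capability gating discussed here (drivers must explicitly advertise multiattach, defaulting to False) might be sketched as below. A `multiattach` capability did exist in driver-reported stats at the time, but the helper functions and the scheduler-side filtering shown are illustrative assumptions, not actual Cinder code.

```python
# Illustrative sketch of opt-in multiattach capability gating;
# hypothetical helpers, not actual Cinder scheduler code.

def driver_supports_multiattach(capabilities):
    """Backends must explicitly report multiattach=True to opt in;
    anything missing or falsy counts as unsupported."""
    return bool(capabilities.get('multiattach', False))


def filter_multiattach_backends(backends):
    """Keep only backends whose reported stats allow multiattach."""
    return [name for name, caps in backends.items()
            if driver_supports_multiattach(caps)]
```

The agreed action in the log is the inverse of this opt-in: explicitly set the flag to False for the drivers currently reporting it (pure, zte, netapp, ibm, 3par) until safe detach is sorted out.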
17:58:36 defeats the purpose of flagging drivers that need updating
17:58:56 until the terminate_connection stuff is sorted and updated in the driver we can't trust it
17:59:45 bswartz: any idea on how to fix it, or even just testing stuff sometimes is a help, so whatever you can chime in with is good
18:00:07 ildikov: I'll do what I can
18:00:26 bswartz: thanks
18:00:42 yah we already have folks reporting multiattach in capabilities
18:01:26 we need to turn this off somehow for now as the code does not prevent you from using it
18:01:32 https://github.com/openstack/cinder/blob/master/cinder/volume/drivers/pure.py#L545
18:01:45 We're out of time :(
18:02:15 well, we can submit a patch to set all of those to False
18:02:16 :)
18:02:25 it's a small list of drivers
18:02:48 pure, zte, netapp, ibm
18:03:06 sounds good to me!
18:03:18 3par as well
18:04:20 Let's set those to False too for now and figure out detach
18:04:44 Do we have anything else for today?
18:06:19 Ok, let's close this meeting and fix what we agreed on
18:06:25 ildikov: Thanks
18:06:33 ildikov scottda smcginnis FYI clean tempest run!
18:06:49 Have a nice rest of your day
18:06:49 jgriffith: Really? Awesome. What was the difference?
18:06:53 jgriffith: Cool
18:07:08 jgriffith: awesome!
18:07:08 smcginnis the URL being used is V2
18:07:20 jgriffith: What do you have for RAM and number of cpus?
18:07:21 "GET /v2/422f4b0126b4473f93f7e3ce46918c32/volumes/ce025819-f959-46a9-8
18:07:23 vs
18:07:31 jgriffith: I knew it!!! :)
18:07:34 "GET /v3/422f4b0126b4473f93f7e3ce46918c32/volumes/ce025819-f959-46a9-8
18:07:50 Darn version #s. :)
18:07:52 8 gig and 8 cores
18:08:08 hmmm..ok, I've 8GB and 2 cores
18:08:14 My hack was using the 'v3'
18:08:39 but I think you may miss the point :)
18:08:53 I think the point is that it works with v2
18:08:54 :)
18:09:06 scottda :). In that case you got my point :)
18:09:25 Let's stick with v2 then :)
18:09:33 ildikov LOL
18:09:51 Drop v3 and micro versions :)
18:10:05 ildikov is it Christmas already?
18:10:31 jgriffith: I see snow so it might be :)
18:12:05 Ok, let's switch to the Cinder channel for more version fun :)
18:12:21 Thank y'all for today :)
18:12:44 #endmeeting