17:00:23 <ildikov> #startmeeting cinder-nova-api-changes
17:00:25 <openstack> Meeting started Thu May 12 17:00:23 2016 UTC and is due to finish in 60 minutes. The chair is ildikov. Information about MeetBot at http://wiki.debian.org/MeetBot.
17:00:27 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
17:00:29 <smcginnis> o/
17:00:30 <openstack> The meeting name has been set to 'cinder_nova_api_changes'
17:00:35 <scottda> hi
17:00:36 <mriedem> o/
17:00:41 <ildikov> scottda ildikov DuncanT ameade cFouts johnthetubaguy jaypipes takashin alaski e0ne jgriffith tbarron andrearosa hemna erlon mriedem gouthamr ebalduf patrickeast smcginnis diablo_rojo gsilvis
17:00:52 <ildikov> hi
17:01:19 <alaski> o/
17:01:31 <thingee> mriedem: I'm not sure why my question is being avoided after I asked twice. Is this because it's a priority problem in nova, or because it never will be for multiattach.
17:01:33 <smcginnis> Do we have an agenda up somewhere?
17:01:36 <ildikov> as far as I know jgriffith_ is out today, but we still have a few items to touch on
17:01:40 <cFouts> o/
17:01:50 <mriedem> thingee: later
17:02:02 <ildikov> etherpad with info: #link https://etherpad.openstack.org/p/cinder-nova-api-changes
17:02:07 <aimeeu> lurking and learning
17:02:22 <mriedem> there were some items from ildikov's meeting minutes from last week
17:02:33 <ildikov> smcginnis: I added the list of items we are targeting to get done to the etherpad
17:02:38 <mriedem> "John Griffith will work on the above described solution, the target is to have patches up by next week."
17:02:39 <smcginnis> ildikov: Thanks!
17:02:49 <ildikov> we can go through those
17:03:16 <hemna> I have a question
17:03:46 <hemna> I'm working on a nova patch to not call check_attach at attach time
17:03:47 <ildikov> I haven't seen patch(es) up from John yet
17:04:14 <hemna> and check_attach does 2 things: 1) it checks the internal state of the volume and 2) checks the availability zone
17:04:38 <hemna> does it make sense to add an optional AZ param to os-attach ?
17:04:50 <hemna> and have cinder check at os-reserve ?
17:05:00 <hemna> or just keep the check on the nova side only
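For context, a minimal standalone sketch of the split hemna describes: check_attach does (1) volume state checks, which Cinder's os-reserve already enforces, and (2) an AZ check worth factoring out. Everything below is illustrative, not Nova's actual code.

```python
class InvalidVolume(Exception):
    pass

def check_availability_zone(volume, instance, cross_az_attach=False):
    """The AZ check: the piece that would stay in Nova (hypothetical)."""
    if cross_az_attach:
        return  # operator explicitly allows attaching across AZs
    if instance is not None and \
            volume['availability_zone'] != instance['availability_zone']:
        raise InvalidVolume('volume AZ %s does not match instance AZ %s'
                            % (volume['availability_zone'],
                               instance['availability_zone']))

def check_attach(volume, instance, cross_az_attach=False):
    # (1) state checks -- redundant with Cinder's os-reserve, hence the
    #     proposal to drop them and rely on reserve_volume instead
    if volume['status'] != 'available':
        raise InvalidVolume("status must be 'available'")
    if volume['attach_status'] == 'attached':
        raise InvalidVolume('volume is already attached')
    # (2) AZ check -- kept separate so it survives the state-check removal
    check_availability_zone(volume, instance, cross_az_attach)
```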
17:05:01 <mriedem> https://github.com/openstack/nova/blob/026468772672215d34a593e631d1e62d6a615aa4/nova/volume/cinder.py#L279
17:05:18 <hemna> https://github.com/openstack/nova/blob/master/nova/volume/cinder.py#L289-L299
17:05:38 <mriedem> so, the az stuff is kind of a mess
17:05:38 <hemna> I was just working on moving that code into a check_availability_zone() call in there instead
17:05:53 <hemna> but before I go forward, I'd like to hear opinions on it
17:05:53 <mriedem> see https://github.com/openstack/nova/blob/026468772672215d34a593e631d1e62d6a615aa4/nova/virt/block_device.py#L60
17:06:40 <mriedem> ^ is really for boot from volume where nova creates the volume
17:06:52 <hemna> I'd prefer to change nova's attach code to simply call os-reserve
17:06:57 <mriedem> because nova will create the volume in the same AZ that the instance is in, which might not exist in cinder
17:07:00 <hemna> instead of a volume get, then check, then os-reserve
17:07:15 <mriedem> hemna: i think the az check in the api just needs to remain a separate thing
17:07:34 <mriedem> see my todo here https://github.com/openstack/nova/blob/026468772672215d34a593e631d1e62d6a615aa4/nova/virt/block_device.py#L79
17:07:57 <mriedem> i've had a long-term wish of creating the volume in nova-api for boot from volume, and then attaching it later
17:08:05 <hemna> https://github.com/openstack/nova/blob/master/nova/compute/api.py#L3095
17:08:07 <mriedem> so we do all of the az checking and stuff with cinder in the api rather than on the compute
17:08:08 <hemna> that thing
17:08:22 <scottda> But for nova to do the AZ check, it will still need the volume.get, which defeats the point of what hemna is trying to do.
17:08:23 <hemna> I was hoping it could simply be a call to self.volume_api.reserve_volume()
17:08:32 <hemna> scottda, +1
17:08:33 <hemna> yah
17:08:45 <hemna> so there is that.
17:09:13 <hemna> the get, then reserve means there is still a race
17:09:26 <mriedem> so you'd have to pass the az to os-reserve
17:09:29 <hemna> yah
17:09:42 <hemna> as an optional param
17:09:49 <hemna> if it's there, cinder tests it.
17:09:57 <mriedem> re: https://github.com/openstack/nova/blob/master/nova/compute/api.py#L3095 ndipanov had a patch for a race in there also: https://review.openstack.org/#/c/290793/
17:09:59 <hemna> if it's not, it assumes it's open, re: no AZ
17:10:45 <mriedem> yeah, and nova's logic for passing the az would be based on what we have in https://github.com/openstack/nova/blob/026468772672215d34a593e631d1e62d6a615aa4/nova/virt/block_device.py#L60
17:10:46 <mriedem> for bfv
17:11:59 <mriedem> shall we take notes in https://etherpad.openstack.org/p/cinder-nova-api-changes ?
17:11:59 <hemna> https://github.com/openstack/nova/blob/master/nova/volume/cinder.py#L289
17:12:05 <hemna> so right now, that's checked
17:12:34 <hemna> kinda the same thing
17:12:43 <mriedem> yeah, nova would just need to re-use some logic to determine if it needs to pass the az to os-reserve
17:12:52 <hemna> sounds like _get_volume_create_az_value() needs to be public
17:12:57 <mriedem> if CONF.cinder.cross_az_attach, we'd pass None
17:12:58 <ildikov> mriedem: I will add the decision points to the etherpad after the meeting
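A hedged sketch of the Nova-side flow being floated above: one os-reserve call that carries the AZ, closing the get-then-reserve race. The availability_zone parameter to reserve_volume is the proposed optional param and does not exist in the current API.

```python
def reserve_for_attach(volume_api, context, volume_id, instance,
                       cross_az_attach=False):
    # Per mriedem: if CONF.cinder.cross_az_attach is set, pass None so
    # the Cinder-side check is a noop.
    az = None if cross_az_attach else instance['availability_zone']
    # 'availability_zone' here is the hypothetical os-reserve extension
    # under discussion, not a real parameter today.
    volume_api.reserve_volume(context, volume_id, availability_zone=az)
```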
17:14:20 <hemna> I don't see any AZ check on the cinder side
17:14:29 <hemna> so I dunno
17:14:48 <scottda> I don't think there are any AZ checks enforced in Cinder
17:14:51 <mriedem> yes there is
17:14:55 <mriedem> when creating the volume
17:15:06 <mriedem> nova can pass an az and if it doesn't exist cinder fails the volume create request
17:15:08 <mriedem> UNLESS
17:15:14 <mriedem> you set a backdoor config option to ignore that
17:15:20 <hemna> the create flow passes in an AZ
17:15:20 <smcginnis> Unless a fallback is configured.
17:15:34 <mriedem> smcginnis: right, which was a hack because we didn't have the fix in nova
17:15:39 <mriedem> which is https://github.com/openstack/nova/blob/026468772672215d34a593e631d1e62d6a615aa4/nova/virt/block_device.py#L60
17:15:42 <hemna> bleh
17:15:47 <mriedem> https://github.com/openstack/nova/commit/f9a51b970f688b90baf0ae3ef31d79b3fec02ed1
17:15:52 <hemna> ok, so I don't want to make the AZ nightmare worse
17:16:05 <scottda> hemna: You made it worse by mentioning it.
17:16:19 <mriedem> well, passing the az to os-reserve and having cinder check it only if it's provided isn't really making it worse
17:16:24 <mriedem> if nova doesn't provide it, it's a noop
17:16:32 <hemna> scottda, :)
17:16:43 <mriedem> if the cinder microversion isn't new enough for nova to pass it, then nova still has to check like it does today
17:16:47 <hemna> I guess the real question is, should cinder care?
17:17:00 <hemna> should cinder be doing the check and failing if the AZ doesn't match ?
17:17:10 <mriedem> so,
17:17:25 <mriedem> when i was fixing this bug in nova, i had a thread in the ML about removing the nova cross_az_attach option
17:17:25 <scottda> There are use cases where deployers had geographically distinct AZs, so this was needed.
17:17:26 <hemna> afaik AZ is a nova concept ?
17:17:30 <mriedem> and there were operators saying they relied on it
17:17:35 <mriedem> scottda: yes
17:17:36 <mriedem> that
17:17:49 <scottda> We did it in our (now defunct) public cloud...
17:18:02 <mriedem> see http://lists.openstack.org/pipermail/openstack-operators/2015-September/008252.html
17:18:06 <mriedem> for some light bedtime reading
17:18:11 <hemna> :)
17:18:17 <smcginnis> The backdoor config option work brought up the fact that AZs were never fully baked.
17:18:25 <mriedem> the thread starts here http://lists.openstack.org/pipermail/openstack-operators/2015-September/008224.html
17:18:45 <scottda> Yeah, the terminology is vague, and that's part of the problem... but we still have to live with it.
17:18:49 <hemna> and by 'fully baked' does that mean that nova should be passing the AZ in calls to Cinder ?
17:18:56 <hemna> so that they can both be on the same page?
17:18:58 <mriedem> this is the cinder workaround https://review.openstack.org/#/c/217857/
17:19:15 <smcginnis> hemna: I think to fully support and enforce AZs, yeah. :/
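And a matching sketch of the Cinder side, per mriedem's "noop if not provided" point; the parameter and handler are hypothetical, since today Cinder only enforces AZs at volume create.

```python
class InvalidInput(Exception):
    pass

def reserve_volume(volume, availability_zone=None):
    # Hypothetical optional AZ check: a noop when no AZ is passed, so
    # older Nova clients see no behavior change.
    if (availability_zone is not None
            and volume['availability_zone'] != availability_zone):
        raise InvalidInput('volume is in AZ %s, not %s'
                           % (volume['availability_zone'],
                              availability_zone))
    # Existing os-reserve behavior: reject bad state, then move the
    # volume to 'attaching' so concurrent attachers race on one call.
    if volume['status'] != 'available':
        raise InvalidInput('volume status must be available')
    volume['status'] = 'attaching'
    return volume
```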
17:19:47 <mriedem> there are some decent details and background in the commit message of https://review.openstack.org/#/c/227564/
17:19:53 <hemna> so if a user creates a volume, is the AZ set? and to what? and how is that checked against attach calls from nova ?
17:19:56 <smcginnis> But maybe we should shelve this az discussion for now and get back to multiattach. AZs are an issue for single and multi attach.
17:19:57 <hemna> bleh
17:20:01 * hemna cowers in defeat
17:20:31 <mriedem> so ftr, to fully remove nova's check_attach, cinder's os-reserve would need to take an az
17:20:33 <mriedem> to validate it
17:20:39 <hemna> mriedem, yah
17:20:40 <mriedem> at least to be consistent with how things are today
17:20:46 <hemna> that's why I brought it up
17:20:47 <mriedem> let it be written in the etherpad for all time!
17:20:47 <ildikov> I guess we can make the 'check_attach' removal a two-step process
17:21:06 <hemna> so, if I still do the AZ check on the nova side
17:21:16 <hemna> the race is smaller
17:21:24 <hemna> at least nova won't be checking volume state
17:21:48 <scottda> Yeah, but that's a bit of code churn and review time for an incomplete fix...
17:21:52 <hemna> I think eventually we do want to just pass the AZ to cinder, and then nova can call reserve w/o a get.
17:22:25 <hemna> I won't change the functionality of check_attach for now.
17:22:33 <hemna> but I will refactor the AZ check out of there
17:22:34 <mriedem> scottda: it's just a bug fix really
17:22:42 <hemna> and then simply call the new AZ check after the get.
17:22:51 <mriedem> yeah i think hemna and i are on the same page
17:22:53 <scottda> fair enough
17:22:55 <hemna> then reserve_volume will catch the state checks.
17:23:07 <ildikov> is it only the BFV case?
17:23:14 <mriedem> no
17:23:16 <scottda> ildikov: no
17:23:18 <ildikov> I mean when the AZ check will need to be called
17:23:35 <mriedem> so in the remaining 7 minutes i have...
17:23:42 <hemna> ok I'll forge ahead with this and push it up today then.
17:23:59 <mriedem> hemna: you might want to look at https://review.openstack.org/#/c/290793/ too
17:24:04 <ildikov> cool, added a note to the etherpad about the AZ check
17:25:11 <hemna> mriedem, ok will do
17:25:16 <ildikov> mriedem: can you check the multiattach spec when you have some time?
17:25:39 <mriedem> ildikov: is it any different from mitaka?
17:26:20 <ildikov> mriedem: slightly updated, I added a link to the etherpad so that we would not need to add implementation details to the spec regarding how to sort out things in Cinder
17:26:22 <mriedem> because i was under the impression that the multiattach spec was going to be dependent on the POC that jgriffith_ was going to be doing
17:26:59 <ildikov> does this mean we can talk about approving it when that is ready?
17:27:31 <mriedem> i'd prefer to not land a bunch of technical debt in nova just to get this in
17:28:13 <ildikov> the Cinder part is a dependency in the spec; if these issues are not sorted out, then we're in trouble anyway
17:28:28 <ildikov> it does not mean we should sort it out in Nova instead, in my view
17:29:15 <mriedem> ok i'll have to review the spec to see the changes then
17:29:21 <ildikov> and the plan is to get them done :)
17:29:25 <mriedem> #action mriedem to review multiattach nova spec
17:29:36 <ildikov> tnx
17:29:41 <mriedem> #action hemna to poke at cleaning up nova check_attach
17:29:53 <hemna> coolio
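Putting hemna's interim plan together, the attach-time flow it implies looks roughly like this (check_availability_zone is from the sketch above; the volume_api calls are illustrative):

```python
def attach_interim(volume_api, context, volume_id, instance,
                   cross_az_attach=False):
    # The get is still needed while the AZ check lives in Nova, so a
    # smaller race window remains between check and reserve.
    volume = volume_api.get(context, volume_id)
    check_availability_zone(volume, instance, cross_az_attach)
    # reserve_volume flips the volume to 'attaching' and rejects the
    # states check_attach used to screen for.
    volume_api.reserve_volume(context, volume_id)
```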
17:29:55 <mriedem> what's the status on cinder migrate testing on the multinode job in the gate?
17:29:57 <ildikov> if there's anything Nova specific that's missing I will add it
17:30:30 <scottda> mriedem: We're starting with cinder migrate on a single node. We think we can get that working...
17:30:54 <scottda> But it looks like Devstack support for multi-backend was removed. I'm trying to figure out why, and what alternative exists.
17:31:09 <mriedem> scottda: as in resize?
17:31:10 <scottda> But eventually we want multi-node as well.
17:31:30 <hemna> wait, what?
17:31:38 <hemna> cinder multi-backend removed from devstack ?
17:31:41 <scottda> no, just have 2 LVM volume groups as separate backends, and migrate between them on a single node.
17:31:51 <thingee> mriedem: no, like multiple drivers
17:32:04 <scottda> hemna: No, I've actually found a way to do it, the syntax has changed...
17:32:06 <scottda> and
17:32:20 <scottda> and the Tempest multi-backend tests are failing for me. Not sure why.
17:32:47 <mriedem> and that will still test swap volume?
17:33:08 <scottda> yes, calling cinder migrate will call swap volume.
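For reference, a rough sketch of the single-node test scottda describes: two LVM backends on one devstack host, migrating an in-use volume between them so that Cinder drives Nova's swap volume. The auth values and backend/host strings are illustrative devstack-style defaults, not anything settled in the meeting.

```python
from keystoneauth1 import loading, session
from cinderclient import client as cinder_client

# Illustrative devstack-style credentials (assumptions, not real config).
loader = loading.get_plugin_loader('password')
auth = loader.load_from_options(
    auth_url='http://127.0.0.1/identity', username='admin',
    password='secret', project_name='admin',
    user_domain_id='default', project_domain_id='default')
cinder = cinder_client.Client('2', session=session.Session(auth=auth))

vol = cinder.volumes.create(size=1, name='migrate-test',
                            volume_type='lvmdriver-1')
# ... attach the volume to an instance via Nova, then migrate it to the
# second backend; 'host@backend#pool' is Cinder's host string format.
cinder.volumes.migrate_volume(vol, 'devstack@lvmdriver-2#lvmdriver-2',
                              force_host_copy=False, lock_volume=False)
```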
17:34:35 <mriedem> ok, do we want to talk about https://review.openstack.org/#/c/312773/ ?
17:35:39 <scottda> What do you think of that patch, mriedem ?
17:35:49 <mriedem> honestly i haven't had the time to dig into it
17:36:19 <mriedem> would be nice to see the live migration job or multinode job passing on it
17:36:22 <mriedem> but those are super flaky
17:36:50 <mriedem> i can dig into the test failures for volume-backed live migration
17:36:54 <mriedem> and see if they are related
17:37:54 <mriedem> finally, before i go,
17:38:04 <mriedem> has anyone talked to jgriffith_ about the os-initialize_connection changes?
17:38:31 <scottda> no, I haven't
17:38:37 <ildikov> mriedem: the job says live migration passed, but I might have missed smth in the logs...
17:38:57 <ildikov> mriedem: I talked to him briefly, he's working on it, but we couldn't go into details
17:39:03 <mriedem> ildikov: yeah http://logs.openstack.org/73/312773/1/experimental/gate-tempest-dsvm-multinode-live-migration/c57f6b9/console.html#_2016-05-08_09_14_44_572
17:39:11 <hemna> the experimental jobs seem... borked almost every time. :(
17:39:26 <scottda> I think John said in IRC that he had the unit tests for his patch mostly passing...
17:39:54 <ildikov> hemna: it's a bit weird, it congratulates you and then marks the test failed...
17:40:17 <hemna> hehe
17:40:41 <mriedem> it is
17:40:41 <scottda> like a participation trophy.
17:40:41 <mriedem> http://logs.openstack.org/73/312773/1/experimental/gate-tempest-dsvm-multinode-live-migration/c57f6b9/console.html#_2016-05-08_09_18_59_360
17:40:44 <mriedem> setting up ceph
17:40:50 <mriedem> i've pinged tdurakov on that, he works on that job
17:41:02 <ildikov> mriedem: scottda: I will try to catch him and add notes to the etherpad about that item this week or early next
17:41:03 <mriedem> that job sets up various storage backends in a single job
17:41:10 <mriedem> and runs the same 4 tests
17:41:13 <mriedem> looks like it's not working for ceph atm
17:41:21 <mriedem> ildikov: ok
17:41:42 <mriedem> alright, over by 11 minutes
17:41:44 <mriedem> anything else?
17:41:52 <ildikov> also this time next week might be tricky for me
17:42:00 <ildikov> but will try my best
17:42:20 <mriedem> change the time as needed
17:42:25 <scottda> Let's work on a new time. It'd be nice to have JohnGarbutt here, and JohnG as well
17:42:35 <ildikov> also I know johnthetubaguy cannot make it at this slot, so if it's problematic for either of you in general please let me know and then we can find another one
17:42:37 <hemna> ok
17:42:48 <mriedem> also, fyi, i'm out from 5/20-5/30
17:42:51 <hemna> thanks for the help guys
17:42:56 <mriedem> back on 5/31
17:43:09 <smcginnis> mriedem: Nice
17:43:11 <ildikov> mriedem: ok, thanks for the info
17:43:20 <scottda> ok, bye all.
17:43:33 <ildikov> I will reach out to you regarding time slots
17:43:41 <ildikov> thanks all!
17:44:15 <ildikov> #endmeeting