16:00:47 #startmeeting cinder-nova-api-changes 16:00:52 Meeting started Thu Sep 7 16:00:47 2017 UTC and is due to finish in 60 minutes. The chair is ildikov. Information about MeetBot at http://wiki.debian.org/MeetBot. 16:00:53 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. 16:00:56 The meeting name has been set to 'cinder_nova_api_changes' 16:01:04 johnthetubaguy jaypipes e0ne jgriffith hemna mriedem patrickeast smcginnis diablo_rojo xyang1 raj_singh lyarwood jungleboyj stvnoyes 16:01:10 o/ 16:01:23 o. 16:01:43 * jungleboyj is watching out of the corner of my eye 16:01:43 hi all 16:03:12 let's dive in, I think the today's meeting will be a short one 16:03:22 we have a bunch of open reviews: https://review.openstack.org/#/q/topic:bp/cinder-new-attach-apis+status:open 16:04:27 the gate is in general happy with the patches right now 16:04:40 we have the new cinderclient out which fixed a bunch of things for us 16:05:04 mriedem: do you think we can get these things merged by the end of next week? 16:05:35 that big patch on version 133 is scaring me 16:05:46 mriedem: I'm happy to sit together with you and johnthetubaguy and/or other cores interested and get things fixed and merged 16:05:50 https://review.openstack.org/#/c/330285/ will likely not be merged by the end of next week 16:06:01 johnthetubaguy: it has a ton of tests if you mean the size of that thing 16:06:35 mriedem: if we could spend time on figuring out if there's anything fundamentally questionable with that still that would be great 16:06:51 mriedem: and then I can polish it up as needed that should not be a big issue 16:06:53 initially the size of it, there are some specifics, its more about the context of the changes being hard to reason about 16:07:12 how about a review for https://review.openstack.org/#/c/463987/ ? (migrate) While I still remember what I did... 16:07:49 johnthetubaguy: we cannot really split it up to smaller changes honestly 16:08:08 stvnoyes: i'm runn‌ing tests in another patch on that again today 16:08:13 https://review.openstack.org/#/c/481290/ 16:08:20 i would like to get that one in soon 16:09:07 ok great. let me know if there's anything I can do 16:09:15 looks like it's passing so that's good 16:09:32 I checked that briefly too, it looked good 16:09:34 {0} tempest.api.compute.admin.test_live_migration.LiveMigrationRemoteConsolesV26Test.test_volume_backed_live_migration [26.192704s] ... ok 16:09:41 {0} tempest.api.compute.admin.test_live_migration.LiveMigrationRemoteConsolesV26Test.test_volume_backed_live_migration [26.192704s] ... ok 16:09:48 * smcginnis catches up on scrollback 16:09:51 oh those are the same 16:09:57 {2} tempest.api.compute.admin.test_live_migration.LiveAutoBlockMigrationV225Test.test_volume_backed_live_migration [35.446023s] ... ok 16:10:04 mriedem: i also updated the live migrate tempest test to check that the attachments were done correctly - https://review.openstack.org/#/c/487884/ 16:10:33 nice 16:10:48 stvnoyes: problem is, 16:10:51 we don't ever run test_iscsi_volume 16:11:31 hmmm, I'll need to recheck, I did that test because I was finding that the other tests didn't exercise the new code, but let me go back and recheck that 16:11:45 i'll let you know 16:12:10 if the other tests do what we need i'll move the checks over to them 16:12:29 BFV one just have some attachments that need testing, although that is a different path 16:12:32 http://logs.openstack.org/90/481290/3/check/gate-tempest-dsvm-multinode-live-migration-ubuntu-xenial/e8a67e8/console.html#_2017-09-07_15_03_37_331325 16:12:57 bummer 16:13:23 yeah :/ 16:13:47 i did have https://review.openstack.org/#/c/459316/ 16:14:25 so, i could roll a dnm change on top of https://review.openstack.org/#/c/481290/ that depends on ^ 16:14:31 and see if the scsi live migration works 16:14:43 it's just that it intermittenly fails on it's own 16:15:27 mriedem: do you remember how it failed? 16:15:58 it was april, so no 16:16:12 well, although, 16:16:19 in https://review.openstack.org/#/c/459316/ it's not the live migration job that's failing 16:16:35 yeah, that's all I saw 16:17:08 so for the scsi one i'd have to recheck those as the logs are gone 16:17:16 and i think the patch above it failed the scsi test 16:17:19 but the logs are gone 16:17:37 "The live migration job failed on the iscsi test which is enabled in the change below it, so maybe that's not very stable after all." 16:18:34 hmm, I see now 16:19:07 like you say though, getting a pass from that would be no bad thing 16:19:28 yeah, so i can stack something up there to run it 16:19:33 plus stvnoyes's tempest patch as a dependency 16:19:42 jesus my tab farm is growing 16:21:10 that would be great to have those tests in 16:21:37 and this farm is just crazy I always end up cleaning up around 5-6 merged dependencies, etc... 16:22:20 mriedem: do you have capacity to do these this week or early next? 16:22:33 do how? 16:22:42 no i don't have capacity, but i'll hump what i can 16:23:38 ok, no worries, I'm just trying to figure out what needs to get done and who can do it 16:24:02 i will try to stack these iscsi live migration volume test changes to run those against stvnoyes's changes 16:24:11 and go through the live migration change again 16:24:22 that sounds great 16:24:35 thank you 16:25:12 does anything else need more testing BTW? 16:25:43 we have coverage in Grenade and stvnoyes made sure we have updates in tempest for a few items 16:25:56 I added unit and functional tests to the attach patch 16:26:12 can I ask a dumb question? 16:26:21 shoot :) 16:26:45 the live-migrate patch, we are only testing the old code path isn't broken, because the attach new flow patch is separate? 16:27:11 i.e. bdm.attachment_id is always empty? 16:27:15 johnthetubaguy: the attach patch depends on the live migrate one 16:28:02 johnthetubaguy: i'm testing those together here https://review.openstack.org/#/c/481290/ 16:28:02 currently the ci is testing the patch with empty attach id's since the live migrate patch will go in forst 16:28:05 johnthetubaguy: and I think the DNM patch also has the Depends-On tags to get the new code in 16:28:08 first 16:28:26 the attach patch has the tests running with the full code base 16:28:35 OK, got it 16:28:37 thanks 16:28:53 totally didn't see that depends-on in there 16:29:22 I just wonder if we want live-migrate patch passing those extra tests before and after 16:29:31 the new attach flow 16:29:38 johnthetubaguy: not your fault we have a ton everywhere that I clean up occasionally, so it's hard to follow 16:30:44 johnthetubaguy- not sure what you mean, which tests? 16:30:53 the live migration patch itself is passing live migration with the old flow 16:31:01 my patch with the depends-on is passing it with the new flow 16:31:10 since it depends on both the live migration change and the api change to use the new flow 16:31:15 ah, sweet 16:31:21 which is why i was making sure it was working first 16:31:37 the api change depends-on the live migrate one too to pass the functional test for it... 16:31:40 and i'll stack something on top of that to run the iscsi test 16:32:19 cool, sorry I wasn't following those moving bits before, makes sense to me now 16:33:17 it's hard to follow 16:33:49 we have less moving bits now, so it's getting there 16:34:37 yeah, there is some light at the end of the tunnel 16:35:01 should we try to get together for a short group review next week while we're all at the same place? 16:35:17 or we now manage without that? 16:35:35 I like the idea of trying that 16:35:43 I think we plan to update the teams on this activity, but that will not get things merged 16:36:05 and I don't think we would want to go into that deep details on the joint session 16:36:37 I think I get the new attaching -> attached stuff, is there context on what changed peoples minds on that? 16:36:37 so I'm happy to block some time next week to get a better progress on these patches and sort out anything that's still questionable 16:37:22 johnthetubaguy: we needed to have a final 'OK' from the consumer, in this case Nova to tell Cinder the connection is all built up on the host 16:38:03 ildikov: its for the API state I guess, it went attached to early without that? 16:38:08 johnthetubaguy: otherwise the volume would get to 'in-use' before brick finished it's business and you can start a detach too early which messed up the things on the compute 16:38:25 *its 16:38:27 ah... you can start detach too early, got it 16:38:42 yeah, we had broken iscsi connections 16:39:14 in the test logs as they execute the calls in a sequence, so we realized we need the extra state change 16:39:15 I remember wanting it before, I couldn't remember why, thats it 16:39:25 we got there eventually :) 16:39:38 i wonder if that had something to do with the failing iscsi test 16:39:40 I think I commented in one patch, delete is called before create, during a move 16:40:23 it's quite complicated especially with having the two flows existing in parallel, so it just makes it harder to get a clean slate to re-design it or even just to see the big picture... 16:40:33 I thought it was going to the other way around to avoid a window where the volume gets detached 16:41:12 ildikov: upgrade certainly spoils almost every party 16:42:04 grenade seems fine with the simple attach-detach at least 16:42:35 we tried to cover the create/delete path, although without re-writing the whole code it's tough to keep orders 16:42:49 although I think we didn't really leave a big window anywhere 16:43:30 its line 2951 in here: https://review.openstack.org/#/c/330285/133/nova/compute/manager.py 16:43:42 it might be correct, just not got my head around that bit 16:45:36 I am struggling to remember what part of the flow all this little helper methods are, probably have to go re-read that 16:45:44 I need to double check that part as we had a few rebase rounds and we might left it there by mistake 16:46:05 johnthetubaguy: I have to re-understand those helper methods every second week... 16:46:25 ildikov: might be worth adding some comments on those methods, if you re-discover what they are really doing? 16:46:34 johnthetubaguy: partially why I asked whether we want to get together next week and go through these 16:46:43 stvnoyes: here is the change to test the iscsi tempest thing https://review.openstack.org/#/c/501805/ 16:47:11 johnthetubaguy: will try, although sometimes they are used by multiple things and I didn't feel that I want to be responsible for a not fully correct note... :) 16:47:12 ok great, I had just been running it locally. 16:47:23 johnthetubaguy: as letting people to figure it out is still better than misleading them 16:47:42 ildikov: yeah, I think we might want to bit the bullet on understanding those 16:48:12 ildikov: I am happy to review the comments and share the blame, normally I am totally against such comments because of the missleading over time thing 16:48:27 johnthetubaguy: same here 16:48:46 just to go back to a think 16:48:47 johnthetubaguy: we can try to draw some high level diagrams just to visualize the flows 16:49:09 assuming its needed, can we do attach before the detach there? 16:49:27 johnthetubaguy: which one do you mean? 16:49:49 I suspect it may fail if the rebuild lands on the same host like it does in the gate 16:50:27 johnthetubaguy: do you mean the prep_block_device one now? 16:50:35 both of them 16:50:51 well, the detach followed by attach_create 16:51:00 I mean attachment_delete followed by create 16:51:21 the reverse is less racey 16:51:24 yeah, we reserve the volume with create there 16:51:41 totally need the create, I just want it before the delete, ideally 16:52:24 I need to check the Cinder API as I remember we said that we will allow things for the same instance, just need to double check on where things evolved in practice 16:53:07 it's also tough with these helper methods to ensure we get the right order of things, etc. 16:54:31 johnthetubaguy: please add all questions/comments you have to the review and I will ensure we catch mriedem and jgriffith next week to go through the questionable ones 16:54:49 yeah, they should all be in comments on the review now 16:54:53 ok, we have 6 more minutes left 16:55:00 johnthetubaguy: +1, thanks 16:55:17 is there anything else to discuss so we're all prepared for next week? 16:56:07 multi-attach? 16:56:09 * johnthetubaguy ducks 16:56:22 johnthetubaguy: I uploaded the spec for review 16:56:42 johnthetubaguy: it's in WIP, I will add a few small changes to it, so we kind of have the base for discussions 16:56:44 I should totally re-read that, lots of the folks I work with really want that 16:57:09 johnthetubaguy: there were a lot of people from a lot of areas who want that 16:57:38 johnthetubaguy: that's why I'm pushing for it to talk about it next week and get progress 16:57:52 totally, just hinting I should be able to make time to help keep moving that, not 100% sure 16:58:07 johnthetubaguy: feel free to add a ton of comments/questions to the spec review 16:58:13 johnthetubaguy: that would be great! 16:58:21 you got the link for that spec? 16:58:43 https://review.openstack.org/#/c/499777/ 16:58:56 #link https://review.openstack.org/#/c/499777/ 16:59:08 sweet 16:59:46 ok, we hit the top of the hour, but we can continue on the channels if there's more to discuss for prep 16:59:52 johnthetubaguy: good to have you back :) 17:00:01 nice to be around again 17:00:09 :) 17:00:17 thank you all for today 17:00:22 and see you all next week! :) 17:00:31 #endmeeting