14:04:47 <enriquetaso> #startmeeting cinder_bs
14:04:47 <opendevmeet> Meeting started Wed Oct 26 14:04:47 2022 UTC and is due to finish in 60 minutes. The chair is enriquetaso. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:04:47 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:04:47 <opendevmeet> The meeting name has been set to 'cinder_bs'
14:04:50 <rosmaita> geguileo: i replied to your osc question in an email!
14:04:55 <enriquetaso> hey everyone
14:05:02 <rosmaita> o/
14:05:27 <geguileo> rosmaita: I saw, but that looks like a lot of work and probably not the biggest reason why osc is soooooooooooooooooooooooooooooooooooooo slow
14:05:28 <enriquetaso> Full list of bugs
14:05:30 <enriquetaso> #link https://lists.openstack.org/pipermail/openstack-discuss/2022-October/031001.html
14:05:41 <rosmaita> geguileo: yeah, it was disappointing
14:06:17 <enriquetaso> I'm not sure how to prioritize the next two bugs:
14:06:23 <enriquetaso> #topic The volume multiattach and in-use after retyping another backend, then can not detach it
14:06:28 <enriquetaso> #link https://bugs.launchpad.net/cinder/+bug/1994018
14:06:34 <enriquetaso> After retyping from Ceph to Huawei the volume can not be detached because the connection info volume_id has been changed.
14:06:39 <enriquetaso> I haven't reproduced this bug yet. That's why I haven't assigned importance to the bug yet. Looks like Nova is involved but I'm not sure what to do with this bug.
14:08:28 <geguileo> enriquetaso: We used to have that issue in Nova a while back
14:08:37 <geguileo> enriquetaso: and it was fixed
14:08:41 <geguileo> though this could be a different one
14:08:58 <geguileo> once we know the openstack release we may look if it was fixed or not
14:09:29 <geguileo> oh, wait it looks different
14:09:40 <enriquetaso> yes, i've remember we discussed something like that a while ago
14:09:49 <enriquetaso> oh
14:10:26 <geguileo> but I see where the problem is...
14:12:25 <geguileo> and we are in trouble...
14:13:14 <rosmaita> i don't like the sound of that
14:13:22 <geguileo> I'm 99% sure that's a legitimate bug
14:13:46 <geguileo> and surprise, surprise, it's the multi-attach design/implementation
14:13:59 <geguileo> that didn't take into account that the volume_id changes during a live migration
14:15:36 <geguileo> but didn't we prevent multi-attach volume from live migrating/retyping?
14:15:36 <enriquetaso> :/
14:17:12 <geguileo> wait, wait, wait
14:17:40 <enriquetaso> "You can migrate only detached volumes with no snapshots."
14:18:00 <geguileo> enriquetaso: what's that from?
14:18:53 <enriquetaso> #link https://opendev.org/openstack/cinder/src/branch/master/doc/source/cli/cli-manage-volumes.rst#migrate-a-volume
14:20:48 <geguileo> enriquetaso: afaik that is incorrect
14:21:03 <geguileo> we can migrate in-use volumes
14:21:16 <eharney> with retype, right?
14:21:34 <geguileo> but I don't understand WHY we allow migrating/retyping with migration of multi-attach in-use volumes
14:21:52 <eharney> i thought we didn't
14:22:13 <geguileo> eharney: I'm looking at the code and unless I'm missing something, we do
14:22:26 <geguileo> as long as both the source and target types are both multi-attach
14:22:50 <geguileo> I'm pretty sure that's a bug
14:23:17 <geguileo> WTF!! It's an if condition bug
14:23:24 <geguileo> MF!!!
14:23:42 <geguileo> apparently it was too much to ask to write an if clause
14:23:54 <enriquetaso> :(
14:23:59 <geguileo> it's a trivial fix
14:24:05 <geguileo> it's a oneliner
14:24:16 <enriquetaso> share link to the code please
14:24:25 <geguileo> basically changing != with or
14:24:47 <geguileo> it's in cinder/volume/api.py
14:24:49 <geguileo> if src_is_multiattach != tgt_is_multiattach:
14:24:57 <geguileo> s/!=/or/
14:25:30 <enriquetaso> #action(enriquetaso): open bug to fix https://opendev.org/openstack/cinder/src/branch/master/doc/source/cli/cli-manage-volumes.rst#migrate-a-volume . It's incorrect, with allow retype in-use volumes (without multi-attach).
14:25:32 <enriquetaso> oops
14:25:39 <geguileo> reading a bit more, we may want to make a couple more changes besides that one
14:26:25 <geguileo> because, as expected, it looks like the whole check section is not properly thought of
14:26:55 <geguileo> it should only complain if the source is multi-attach, it shouldn't matter about the target
14:27:44 <eharney> if the source is a multi-attach type, or is attached to multiple instances?
14:28:06 <eharney> because right now the check is based on the type
14:28:09 <geguileo> eharney: I think it should check if the source is multi-attach and the volume is not available ==> error
14:28:20 <geguileo> if the target is multi-attach and it's not authorized ==> error
14:28:41 <eharney> geguileo: not sure why it matters if the source is multi-attach if it's not actually multiply attached
14:28:42 <geguileo> and that should be it
14:29:14 <geguileo> eharney: I think it would be possible to do the second attachment while we are migrating
14:29:32 <geguileo> or have a race condition right there...
14:30:56 <geguileo> eharney: ideally we would code a conditional DB update that takes into account the actual attachments
14:31:29 <geguileo> that way we would allow a live migration of a multi-attach volume just by reducing the number of VMs using it
14:31:44 <eharney> right
14:31:54 <geguileo> the TL;DR, you are correct, that would be the right way to fix it
14:32:07 <geguileo> but then it's not a small patch, but a large one to prevent races
14:39:09 <enriquetaso> OK, so the bug is valid and we need to work on some bugfixes then
14:39:28 <enriquetaso> and also in some doc fixes
14:43:01 <enriquetaso> OK, thanks!!
14:43:05 <enriquetaso> moving to the next one
14:43:28 <enriquetaso> #topic Cinder cannot work when 1 node of 3 rabbit node cluster down
14:43:32 <enriquetaso> #link https://bugs.launchpad.net/cinder/+bug/1994021
14:43:35 <enriquetaso> There's a discussion on the mailing list regarding this problem.
14:43:39 <enriquetaso> #link https://lists.openstack.org/pipermail/openstack-discuss/2022-October/030968.html
14:43:49 <enriquetaso> is something we could do in cinder?
14:50:05 <enriquetaso> should i link oslo project as well?
14:51:36 <geguileo> enriquetaso: I think this has 0 to do with Cinder specifically...
14:51:56 <geguileo> I mean, we get the transport_url parameter, and that's what gets used
14:52:33 <geguileo> so it's either an oslo.messaging issue or a configuration/deployment issue
14:53:04 <enriquetaso> thanks geguileo, makes sense, i'll update the bug report
14:53:53 <geguileo> we could also be setting it incorrectly in our code, but I wouldn't think that's the issue...
14:54:59 <enriquetaso> thanks geguileo
14:55:02 <geguileo> enriquetaso: I have added a comment to the retype LP bug with the technical discussion we had here
14:55:16 <enriquetaso> cool!
14:55:27 <enriquetaso> that always helps
14:56:01 <enriquetaso> OK, we are running out of time. The bug meeting should be half an hour and I took one hour
14:56:48 <enriquetaso> please check the bug email for all the bugs for this week.
14:57:48 <enriquetaso> #endmeeting
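
Note on the retype check discussed between 14:24 and 14:28: the sketch below is not Cinder code. It only models, with plain booleans, the condition quoted from cinder/volume/api.py, the proposed s/!=/or/ one-liner, and the two rules geguileo suggested afterwards. The function names, the RetypeNotAllowed exception and the multiattach_authorized flag are hypothetical and exist only for illustration; what the real code does inside the `if` block is not shown in the log and is not modelled here.

```python
class RetypeNotAllowed(Exception):
    """Hypothetical error type for this sketch; not a Cinder exception."""


def enters_check_as_quoted(src_is_multiattach, tgt_is_multiattach):
    # Condition quoted at 14:24:49: true only when the two types differ,
    # so a multi-attach -> multi-attach retype skips the check block.
    return src_is_multiattach != tgt_is_multiattach


def enters_check_with_or(src_is_multiattach, tgt_is_multiattach):
    # The s/!=/or/ one-liner: true when either type is multi-attach.
    return src_is_multiattach or tgt_is_multiattach


def validate_retype(src_is_multiattach, tgt_is_multiattach,
                    volume_status, multiattach_authorized):
    """The two rules proposed at 14:28, expressed on plain values."""
    if src_is_multiattach and volume_status != "available":
        raise RetypeNotAllowed(
            "source type is multi-attach and the volume is not available")
    if tgt_is_multiattach and not multiattach_authorized:
        raise RetypeNotAllowed(
            "target type is multi-attach and the caller is not authorized")


if __name__ == "__main__":
    # Print the truth table for both conditions side by side.
    for src in (False, True):
        for tgt in (False, True):
            print(src, tgt,
                  enters_check_as_quoted(src, tgt),
                  enters_check_with_or(src, tgt))
```

The only row where the two conditions disagree is source and target both multi-attach, which is the case reported in the bug. As eharney and geguileo point out, a type-based check like this does not close the race with a second attachment arriving during the migration; that would need the conditional DB update on the actual attachments mentioned at 14:30:56.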