#openstack-cinder log

14:04:47 <enriquetaso> #startmeeting cinder_bs
14:04:47 <opendevmeet> Meeting started Wed Oct 26 14:04:47 2022 UTC and is due to finish in 60 minutes.  The chair is enriquetaso. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:04:47 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:04:47 <opendevmeet> The meeting name has been set to 'cinder_bs'
14:04:50 <rosmaita> geguileo: i replied to your osc question in an email!
14:04:55 <enriquetaso> hey everyone
14:05:02 <rosmaita> o/
14:05:27 <geguileo> rosmaita: I saw, but that looks like a lot of work and probably not the biggest reason why osc is soooooooooooooooooooooooooooooooooooooo slow
14:05:28 <enriquetaso> Full list of bugs
14:05:30 <enriquetaso> #link https://lists.openstack.org/pipermail/openstack-discuss/2022-October/031001.html
14:05:41 <rosmaita> geguileo: yeah, it was disappointing
14:06:17 <enriquetaso> I'm not sure how to prioritize the next two bugs:
14:06:23 <enriquetaso> #topic The volume multiattach and in-use after retyping another backend, then can not detach it
14:06:28 <enriquetaso> #link https://bugs.launchpad.net/cinder/+bug/1994018
14:06:34 <enriquetaso> After retyping from Ceph to Huawei the volume can not be detached because the connection info volume_id has been changed.
14:06:39 <enriquetaso> I haven't reproduced this bug yet. That's why I haven't assigned importance to the bug yet. Looks like Nova is involved but I'm not sure what to do with this bug.
14:08:28 <geguileo> enriquetaso: We used to have that issue in Nova a while back
14:08:37 <geguileo> enriquetaso: and it was fixed
14:08:41 <geguileo> though this could be a different one
14:08:58 <geguileo> once we know the openstack release we may look if it was fixed or not
14:09:29 <geguileo> oh, wait it looks different
14:09:40 <enriquetaso> yes, i remember we discussed something like that a while ago
14:09:49 <enriquetaso> oh
14:10:26 <geguileo> but I see where the problem is...
14:12:25 <geguileo> and we are in trouble...
14:13:14 <rosmaita> i don't like the sound of that
14:13:22 <geguileo> I'm 99% sure that's a legitimate bug
14:13:46 <geguileo> and surprise, surprise, it's the multi-attach design/implementation
14:13:59 <geguileo> that didn't take into account that the volume_id changes during a live migration
14:15:36 <geguileo> but didn't we prevent multi-attach volume from live migrating/retyping?
14:15:36 <enriquetaso> :/
14:17:12 <geguileo> wait, wait, wait
14:17:40 <enriquetaso> "You can migrate only detached volumes with no snapshots."
14:18:00 <geguileo> enriquetaso: what's that from?
14:18:53 <enriquetaso> #link https://opendev.org/openstack/cinder/src/branch/master/doc/source/cli/cli-manage-volumes.rst#migrate-a-volume
14:20:48 <geguileo> enriquetaso: afaik that is incorrect
14:21:03 <geguileo> we can migrate in-use volumes
14:21:16 <eharney> with retype, right?
14:21:34 <geguileo> but I don't understand WHY we allow migrating/retyping with migration of multi-attach in-use volumes
14:21:52 <eharney> i thought we didn't
14:22:13 <geguileo> eharney: I'm looking at the code and unless I'm missing something, we do
14:22:26 <geguileo> as long as the source and target types are both multi-attach
14:22:50 <geguileo> I'm pretty sure that's a bug
14:23:17 <geguileo> WTF!!  It's an if condition bug
14:23:24 <geguileo> MF!!!
14:23:42 <geguileo> apparently it was too much to ask to write an if clause
14:23:54 <enriquetaso> :(
14:23:59 <geguileo> it's a trivial fix
14:24:05 <geguileo> it's a oneliner
14:24:16 <enriquetaso> share link to the code please
14:24:25 <geguileo> basically changing != with or
14:24:47 <geguileo> it's in cinder/volume/api.py
14:24:49 <geguileo> if src_is_multiattach != tgt_is_multiattach:
14:24:57 <geguileo> s/!=/or/
14:25:30 <enriquetaso> #action(enriquetaso): open bug to fix https://opendev.org/openstack/cinder/src/branch/master/doc/source/cli/cli-manage-volumes.rst#migrate-a-volume . It's incorrect: we allow retyping in-use volumes (without multi-attach).
14:25:32 <enriquetaso> oops
14:25:39 <geguileo> reading a bit more, we may want to make a couple more changes besides that one
14:26:25 <geguileo> because, as expected, it looks like the whole check section is not properly thought out
14:26:55 <geguileo> it should only complain if the source is multi-attach, the target shouldn't matter
14:27:44 <eharney> if the source is a multi-attach type, or is attached to multiple instances?
14:28:06 <eharney> because right now the check is based on the type
14:28:09 <geguileo> eharney: I think it should check if the source is multi-attach and the volume is not available ==> error
14:28:20 <geguileo> if the target is multi-attach and it's not authorized ==> error
14:28:41 <eharney> geguileo: not sure why it matters if the source is multi-attach if it's not actually multiply attached
14:28:42 <geguileo> and that should be it
14:29:14 <geguileo> eharney: I think it would be possible to do the second attachment while we are migrating
14:29:32 <geguileo> or have a race condition right there...
14:30:56 <geguileo> eharney: ideally we would code a conditional DB update that takes into account the actual attachments
14:31:29 <geguileo> that way we would allow a live migration of a multi-attach volume just by reducing the number of VMs using it
14:31:44 <eharney> right
14:31:54 <geguileo> the TL;DR, you are correct, that would be the right way to fix it
14:32:07 <geguileo> but then it's not a small patch, but a large one to prevent races
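
Note on the check discussed above: the following is a minimal sketch of the three variants mentioned in the log, written as standalone functions. The function names, the `volume_status` and `tgt_authorized` parameters, and the return convention (True means "reject the retype") are assumptions made for illustration. This is not the code in cinder/volume/api.py, whose surrounding context is not shown in the log, and it does not address the race with a concurrent second attachment that geguileo raises.

```python
def current_check(src_is_multiattach: bool, tgt_is_multiattach: bool) -> bool:
    """Condition as quoted in the log: reject only when the two flags differ."""
    return src_is_multiattach != tgt_is_multiattach


def oneliner_check(src_is_multiattach: bool, tgt_is_multiattach: bool) -> bool:
    """The s/!=/or/ variant: reject when either type is multi-attach."""
    return src_is_multiattach or tgt_is_multiattach


def proposed_check(src_is_multiattach: bool, volume_status: str,
                   tgt_is_multiattach: bool, tgt_authorized: bool) -> bool:
    """Rule sketched later in the discussion (hypothetical signature).

    Reject when the source type is multi-attach and the volume is not
    available, or when the target type is multi-attach and the caller
    is not authorized for it.
    """
    if src_is_multiattach and volume_status != "available":
        return True
    if tgt_is_multiattach and not tgt_authorized:
        return True
    return False


# With both types multi-attach, the quoted condition does not reject,
# which is the gap identified in the meeting.
print(current_check(True, True))    # False
print(oneliner_check(True, True))   # True
print(proposed_check(True, "in-use", True, True))  # True
```

The first two functions differ only in the case where both types are multi-attach. The third one also lets an available multi-attach volume be retyped, which the one-line change would block.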
14:39:09 <enriquetaso> OK, so the bug is valid and we need to work on some bugfixes then
14:39:28 <enriquetaso> and also on some doc fixes
14:43:01 <enriquetaso> OK, thanks!!
14:43:05 <enriquetaso> moving to the next one
14:43:28 <enriquetaso> #topic Cinder cannot work when 1 node of 3 rabbit node cluster down
14:43:32 <enriquetaso> #link https://bugs.launchpad.net/cinder/+bug/1994021
14:43:35 <enriquetaso> There's a discussion on the mailing list regarding this problem.
14:43:39 <enriquetaso> #link https://lists.openstack.org/pipermail/openstack-discuss/2022-October/030968.html
14:43:49 <enriquetaso> is there something we could do in cinder?
14:50:05 <enriquetaso> should i link oslo project as well?
14:51:36 <geguileo> enriquetaso: I think this has 0 to do with Cinder specifically...
14:51:56 <geguileo> I mean, we get the transport_url parameter, and that's what gets used
14:52:33 <geguileo> so it's either an oslo.messaging issue or a configuration/deployment issue
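
Note on transport_url: oslo.messaging accepts a URL that lists several brokers separated by commas, in the form driver://user:pass@host:port,user:pass@host:port/virtual_host. The sketch below only shows how to see which brokers a given value names, which can help when checking whether a deployment lists all three RabbitMQ nodes. The hostnames, credentials, and the `rabbit_hosts` helper are hypothetical; the helper is not part of oslo.messaging or Cinder, and it ignores query parameters.

```python
from typing import List


def rabbit_hosts(transport_url: str) -> List[str]:
    """Return the host:port entries named in a transport_url (hypothetical helper)."""
    # Drop the "driver://" prefix, then keep everything before the virtual host.
    _, _, rest = transport_url.partition("://")
    netloc = rest.split("/", 1)[0]
    hosts = []
    for entry in netloc.split(","):
        # Each entry may carry "user:pass@" in front of host:port.
        hosts.append(entry.rpartition("@")[2])
    return hosts


# Example value with made-up hostnames and credentials.
url = ("rabbit://cinder:secret@rabbit1:5672,"
       "cinder:secret@rabbit2:5672,"
       "cinder:secret@rabbit3:5672/")
print(rabbit_hosts(url))
```

If only one node appears in the list, losing that node would be a configuration or deployment issue, not something specific to Cinder.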
14:53:04 <enriquetaso> thanks geguileo, makes sense, i'll update the bug report
14:53:53 <geguileo> we could also be setting it incorrectly in our code, but I wouldn't think that's the issue...
14:54:59 <enriquetaso> thanks geguileo
14:55:02 <geguileo> enriquetaso: I have added a comment to the retype LP bug with the technical discussion we had here
14:55:16 <enriquetaso> cool!
14:55:27 <enriquetaso> that always helps
14:56:01 <enriquetaso> OK, we are running out of time. The bug meeting should be half an hour and I took one hour
14:56:48 <enriquetaso> please check the bug email for all the bugs for this week.
14:57:48 <enriquetaso> #endmeeting