14:03:47 <jbernard> #startmeeting cinder
14:03:47 <opendevmeet> Meeting started Wed May 14 14:03:47 2025 UTC and is due to finish in 60 minutes.  The chair is jbernard. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:03:47 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:03:47 <opendevmeet> The meeting name has been set to 'cinder'
14:03:51 <jayaanand_> hi
14:03:53 <simondodsley> o/
14:03:55 <agalica> o/
14:03:58 <Luzi> o/
14:04:02 <jbernard> o/
14:04:04 <hvlcchao1> o/
14:04:09 <eharney> hi
14:04:17 <gireesh> o/
14:04:18 <jbernard> #link https://etherpad.opendev.org/p/cinder-flamingo-meetings
14:04:35 <rosmaita> o/
14:05:43 <whoami-rajat> hi
14:06:04 <jungleboyj> o/
14:06:52 <jbernard> welcome everyone, let's get started
14:07:00 <jbernard> rosmaita: you have some announcements
14:07:03 <jbernard> #topic announcements
14:07:29 <rosmaita> yeah, looks like maybe possibly the gates will get under control today
14:07:39 <rosmaita> info is on the agenda
14:08:20 <simondodsley> exciting - been waiting for this one
14:08:30 <rosmaita> keep your fingers crossed
14:08:38 <rosmaita> the other CI news is about the NFS CI job
14:08:39 <simondodsley> thanks rosmaita for sorting this out
14:09:11 <rosmaita> tosky has a patch up to add the cinder-tempest-plugin tests to the devstack-plugin-nfs job
14:09:28 <rosmaita> there's a question on his patch about whether the swap size change will work
14:09:35 <Sai> o/
14:09:44 <rosmaita> so i put up a patch to run the job 10 times
14:09:49 <rosmaita> it's still running
14:10:11 <rosmaita> but that should give us some info ... key thing is, we need to be running the cinder-tempest-plugin tests on NFS
14:10:46 <rosmaita> but hopefully we can get https://review.opendev.org/c/openstack/devstack-plugin-nfs/+/898965 merged soon
14:10:58 <simondodsley> is there a simple way 3rd party CIs can force this to run on our side?
14:11:46 <rosmaita> by "this" do you mean the NFS job, or the cinder-tempest-plugin tests?
14:11:53 <simondodsley> NFS
14:12:31 <simondodsley> I'm happy to run the NFS generic plugin on our CI but need to understand how to configure Zuul so it uses NFS shares I provide
14:12:53 <simondodsley> This would also provide some validation of using the generic driver on 3rd party platforms
14:13:16 <rosmaita> that would be nice ... but to answer your question, i'm not sure
14:13:16 <eharney> definitely possible, if you have NFS shares, you'd just need to configure the generic driver and its related options in cinder
14:13:45 <rosmaita> what Eric said
14:14:02 <eharney> https://opendev.org/openstack/devstack/src/branch/master/lib/cinder_backends/nfs    shows what needs to be set
14:14:43 <simondodsley> great - I'll look into that
14:15:06 <rosmaita> sounds good!
14:15:10 <rosmaita> that's all from me
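For anyone setting up such a third-party run, a minimal sketch of the generic NFS backend settings in cinder.conf that the devstack reference above ultimately configures; the backend name and export path here are placeholders:

    [nfs-1]
    volume_backend_name = nfs-1
    volume_driver = cinder.volume.drivers.nfs.NfsDriver
    nfs_shares_config = /etc/cinder/nfs_shares
    nfs_mount_point_base = /var/lib/cinder/mnt

    # /etc/cinder/nfs_shares lists one export per line, e.g.:
    # nfs-server.example.com:/exports/cinder

The backend also has to be listed in enabled_backends under [DEFAULT].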
14:16:07 <whoami-rajat> I've proposed some patches to better address the new location API issue, but they need attention from glance rather than cinder https://review.opendev.org/q/topic:%22fix-location-apis%22
14:17:42 <rosmaita> whoami-rajat: those would allow us to run our tests with 'do_secure_hash = true'?
14:18:04 <whoami-rajat> with the tempest change merged, YES
14:18:46 <rosmaita> ok, but for now, we can just happily run with 'do_secure_hash = false' and the ceph job will pass
14:19:14 <whoami-rajat> yep, I also have a patch to disable it in glance since it might cause issues in deployments as well
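For reference, the option being discussed is a glance setting; a sketch of how a deployment would disable it, assuming it lives in the [DEFAULT] section of glance-api.conf:

    [DEFAULT]
    # skip the background secure-hash calculation for images
    # registered through the new location API
    do_secure_hash = false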
14:19:56 <rosmaita> that secure hash seems to be a performance hit
14:20:55 <eharney> the test failures aren't because it's a perf issue
14:22:33 <eharney> the tests fail b/c the image is in "active" while there are still tasks happening that interfere with other operations
14:23:45 <rosmaita> yeah, but isn't the "tasks still happening" taking up time & cpu ?
14:24:04 <eharney> yes
14:24:06 <rosmaita> i mean, if they were instantaneous, we wouldn't see this
14:24:10 <rosmaita> ok
14:24:36 <whoami-rajat> if we disable hashing then we are not really testing it anywhere
14:24:50 <eharney> in short i think there are design problems that that tempest test exposed, and we need to document those (as a bug) before we start merging workarounds into tempest tests
14:27:57 <whoami-rajat> bug: https://bugs.launchpad.net/glance/+bug/2110581
14:27:57 <whoami-rajat> fix: https://review.opendev.org/c/openstack/glance/+/949668
14:27:57 <whoami-rajat> doc: https://review.opendev.org/c/openstack/glance/+/949672
14:29:21 <eharney> i think there is another bug in that it is unclear what the expected behavior is for an image that's in "active" (and therefore usable) but that still has ongoing hash calculation tasks running -- it seems like it shouldn't be in the active state
14:29:29 <eharney> but we don't have to sort all that out here
14:31:24 <whoami-rajat> it was designed this way intentionally, but cases like delete were not considered, so it turned out to be problematic
14:31:25 <whoami-rajat> yep, let us meet with glance and we can discuss
14:31:39 <eharney> delete is not the scariest part of this design IMO
14:33:54 <jbernard> ok, thanks for getting all of this sorted
14:34:31 <jbernard> the glance meeting is tomorrow, maybe this is a good topic for their agenda
14:35:09 <jbernard> do we have any specific plans to introduce this at that meeting, or are the relevant persons already aware?
14:36:53 <eharney> there still isn't a bug fully describing the problem so i'm not assuming that all the relevant material is being discussed yet
14:38:49 <jbernard> ok, unless someone wants to volunteer, i can jump in there tomorrow and at least raise the issue
14:39:28 <hemna> mep
14:40:47 <jbernard> ok, next on the list
14:41:07 <jbernard> tobias-urdin has an rbd data pool patch up
14:41:18 <jbernard> https://review.opendev.org/c/openstack/cinder/+/914930
14:42:07 <jbernard> asking if it should honor the configured data pool when cloning, or use the one from the original volume
14:42:45 <eharney> in the situation where you've configured a data pool on an existing deployment that didn't have one?
14:43:20 <jbernard> or if the configured data pool doesn't match the data pool of the volume being cloned
14:44:06 <eharney> good question
14:44:09 <jbernard> yeah
14:44:16 <eharney> it's also not obvious to me how this impacts our capacity stats collection
14:44:27 <eharney> because the patch seems to ignore that currently
14:44:52 <jbernard> i lean toward honoring the configuration defined, but ive only been thinking about it for a minute
14:45:19 <jbernard> eharney: that's a good question, that patch definitely needs some input
14:46:59 <jbernard> eharney: ive added us as reviewers, so i should show up in your queue
14:47:05 <jbernard> s/i/it
14:48:47 <whoami-rajat> I'm missing the use case of the patch, is the purpose to have RBD image data in a separate data pool and the metadata in the volumes pool?
14:49:45 <eharney> yes, it doesn't really spell out the use case very well, but one of the reasons that's desired is to enable use of erasure-coded pools
14:49:56 <jbernard> yep
14:50:41 <whoami-rajat> ok, don't we also need to document the performance impact of using an erasure-coded pool? I'm sure the parity calculation is pretty CPU intensive
14:51:11 <eharney> it will definitely have a perf impact (on the ceph cluster side)
14:52:17 <jbernard> related to tobias-urdin, his microversion patches are all posted and ready for review
14:52:30 <jbernard> https://etherpad.opendev.org/p/cinder-flamingo-reviews#L18
14:52:58 <whoami-rajat> yes, so deployers/tooling need to allocate more resources to the nodes deploying the OSDs hosting the data pool
14:53:21 <whoami-rajat> to me this patch is very abstract given the impact it can have on deployments
14:54:01 <jbernard> whoami-rajat: these would be good review comments ;)
14:54:21 <eharney> it's geared toward someone who already decided they wanted to setup erasure-coded pools, but yeah, more documentation would definitely help
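For context on the data-pool use case, a rough sketch of the ceph-side setup such a deployment would pair with this option; pool names are illustrative:

    # create an erasure-coded pool for the data objects and allow partial overwrites
    ceph osd pool create volumes-data erasure
    ceph osd pool set volumes-data allow_ec_overwrites true
    ceph osd pool application enable volumes-data rbd
    # the image header/metadata stays in a replicated pool (here: volumes),
    # while the data objects land in the EC pool
    rbd create --size 10G --pool volumes --data-pool volumes-data test-volume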
14:55:32 <jbernard> i want to get this one in before the bell,
14:55:48 <jbernard> mhen, Luzi have some cycles to work on image encryption
14:56:00 <jbernard> https://review.opendev.org/q/topic:%22LUKS-image-encryption%22
14:56:03 <whoami-rajat> not for a specific customer, but a lot of deployments adopt new features thinking they're suited to their use case without knowing the full context, so I'm more inclined towards documenting this better
14:56:35 <Luzi> well, starting next week we will have a few hours per week to work on the image encryption again
14:57:14 <Luzi> it would be nice to finally come to terms with this :D
14:57:16 <jbernard> those patches need review (as do others), if anyone is looking for things in immediate need of attention
14:57:23 <jayaanand_> we have multiple customers looking for in-use NFS volume extension. there is a proposal to fix the generic NFS issue tracked in bug https://bugs.launchpad.net/cinder/+bug/1870367. can we implement the NetApp driver along similar lines?
14:57:28 <jbernard> https://etherpad.opendev.org/p/cinder-flamingo-reviews
14:58:03 <jbernard> eharney: i meant to relay jayaanand_'s questions from earlier, i think you might know the answer best
14:59:07 <eharney> there is a patch up currently implementing this in the netapp nfs driver, right?
14:59:17 <eharney> https://review.opendev.org/q/topic:%22bp/extend-volume-completion-action%22
14:59:18 <eharney> in there ^
14:59:51 <jayaanand_> it didn't get past review
14:59:56 <agamboa> hey rosmaita, we took a look at your comments and addressed them for this merge: https://review.opendev.org/c/openstack/cinder/+/901318/
15:00:06 <eharney> i'm not sure i understand the question. this area is still being worked on
15:00:15 <rosmaita> agamboa: ok, will take a look
15:00:35 <jayaanand_> https://review.opendev.org/c/openstack/cinder/+/873889
15:01:52 <jayaanand_> also the blueprint has been re-proposed https://review.opendev.org/c/openstack/cinder-specs/+/922374
15:02:11 <eharney> i think the intent is to complete 873889 and land it
15:02:49 <jayaanand_> ok
15:02:59 <jayaanand_> thank you!
15:03:50 <jbernard> we're just a bit over time, last call everyone
15:04:20 <agalica> I have a quick question if we're in the open questions section
15:04:34 <agalica> If we were to support a new type of storage array which in turn requires a new volume type for one of the required features, do we require two patches or just one?
15:04:55 <agalica> My suspicion is 2, but I wanted to confirm since the new storage device requires said feature
15:05:16 <jbernard> agalica: you mean a new volume driver?
15:05:25 <jbernard> agalica: do you have any code up that we can look at?
15:05:56 <agalica> not at this time as it's future work.  I mean a storage device that we currently do not support in the driver.  We are adding support for it, but it requires a volume type feature as well.
15:06:02 <whoami-rajat> added my comment to the RBD patch
15:06:14 <jbernard> whoami-rajat: thank you!
15:06:22 <tobias-urdin> sorry late to the meeting
15:06:44 <jbernard> tobias-urdin: no worries
15:07:14 <jbernard> agalica: i personally am not sure, i say post the code and we can discuss
15:07:22 <whoami-rajat> agalica, volume types can have driver-specific attributes in extra specs, so a special volume type should not require code changes, but it would be better to understand your use case in more detail
15:07:25 <agalica> So, for example, supporting storage type X requires feature Y which has not been implemented.  In order to support X, we must support Y.  Does this mean that Y requires its own patch and then we can have a patch for storage type X?  Or can both of these things happen at once?  If it helps, most likely the only changes required will be these new supporting features
15:08:11 <agalica> ok, thanks guys.
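To illustrate whoami-rajat's point about extra specs: a hypothetical example of exposing a driver feature through a volume type without code changes; the type name and property keys below are made up and would normally come from the driver's documentation:

    openstack volume type create array-x-feature-y
    openstack volume type set array-x-feature-y \
        --property volume_backend_name=array_x \
        --property vendor:feature_y=true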
15:10:19 <jbernard> ok, lets wrap up, thank you everyone!
15:10:23 <jbernard> #endmeeting