14:06:57 <rosmaita> #startmeeting cinder
14:06:57 <opendevmeet> Meeting started Wed Feb 12 14:06:57 2025 UTC and is due to finish in 60 minutes.  The chair is rosmaita. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:06:57 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:06:57 <opendevmeet> The meeting name has been set to 'cinder'
14:07:07 <rosmaita> #topic roll call
14:07:12 <whoami-rajat> hi
14:07:14 <sp-bmilanov> hello!
14:07:17 <sfernand> hi
14:07:19 <rosmaita> o/
14:07:22 <akawai> o/
14:09:54 <rosmaita> ok, guess we can get started
14:10:05 <rosmaita> #topic announcements
14:10:24 <rosmaita> first, os-brick release for epoxy is next week
14:10:43 <rosmaita> so, obviously, os-brick reviews are a priority right now
14:11:28 <rosmaita> #link https://review.opendev.org/q/project:openstack/os-brick+status:open+branch:master
14:12:08 <rosmaita> if you want any hope of your patch getting in, make sure it is *not* in Merge Conflict and has +1 from Zuul
14:12:15 <whoami-rajat> i wanted to get this one in but it needs to be updated with a few details: commit message, release note, etc.
14:12:17 <whoami-rajat> #link https://review.opendev.org/c/openstack/os-brick/+/939916
14:12:26 <msaravan> hi
14:13:17 <rosmaita> whoami-rajat: go ahead and add that stuff
14:13:24 <rosmaita> to your patch, i mean
14:13:39 <rosmaita> seems like we are seeing a lot more FC issues lately
14:13:56 <rosmaita> is that an industry trend?
14:14:20 <whoami-rajat> sure, the main thing holding me back was doing some more testing with this, but I'm kind of stuck there for known reasons ... though I've at least tested it once
14:15:18 <whoami-rajat> that's a good topic i wanted to bring up as well, are the vendor teams seeing sudden slowness in newer distros related to 1. LUN scanning 2. multipath device creation 3. multipath being ready for I/O
14:16:28 <rosmaita> while people are thinking about ^^ , just want to say that the next os-brick release (after the official epoxy one next week) would have to wait until after April 2
14:16:43 <rosmaita> (unless some kind of major regression or CVE is discovered)
14:18:35 <rosmaita> hmmm ... the silence is deafening
14:18:44 <rosmaita> whoami-rajat: you may want to ask that on the ML
14:19:04 <rosmaita> ok, week after next is the python-cinderclient release for Epoxy
14:19:09 <whoami-rajat> can do that, thanks
14:19:19 <rosmaita> #link https://review.opendev.org/q/project:openstack/python-cinderclient+status:open+branch:master
14:20:00 <rosmaita> not much in there, but the admin-editable metadata can't go in unless we get the cinder-side change merged first
14:20:08 <rosmaita> not sure what the status of that is
14:20:48 <rosmaita> #link https://review.opendev.org/c/openstack/cinder/+/928794
14:21:02 <rosmaita> looks like sfernand caught an issue with the cinder patch
14:22:31 <rosmaita> ok, and finally, the Epoxy feature freeze is the week of Feb 24
14:22:48 <rosmaita> which is coming up fast
14:23:13 <whoami-rajat> I had a question about the unmerged specs for features, any plans for them?
14:24:37 <rosmaita> there seem to be a lot of them
14:24:41 <rosmaita> #link https://review.opendev.org/q/project:openstack/cinder-specs+status:open+branch:master
14:27:29 <rosmaita> guess that's really a question for Jon
14:28:20 <whoami-rajat> i see, i was just asking about my reproposal of an old spec :D
14:28:22 <whoami-rajat> #link https://review.opendev.org/c/openstack/cinder-specs/+/931581
14:29:39 <rosmaita> i think if you have the code ready, it would be ok to do a spec-freeze-exception, especially since the spec had been accepted earlier
14:30:33 <rosmaita> quick question for you ... in that path, the virtual_size of the image will not be set in glance?
14:30:53 <rosmaita> (follow up from takashi's min_disk patch discussion last week)
14:32:00 <rosmaita> i think we may want to continue with the min_disk patch, even if glance is currently setting virtual_size for "normal" image uploads
14:32:20 <whoami-rajat> +1
14:32:35 <rosmaita> but that's a side issue, if i'm right, please leave a comment on takashi's patch
14:32:58 <rosmaita> #link https://review.opendev.org/c/openstack/cinder/+/804584
14:33:12 <rosmaita> ok, that's it for announcements
14:33:34 <rosmaita> no "old patch of the week" from me ... looks like we have a few of those sitting for os-brick
14:33:44 <rosmaita> so please take a look when you have time
14:34:07 <rosmaita> ok, the next item on the agenda is the Review Requests
14:34:53 <rosmaita> what i would like to do is give people who have revised a patch a minute to say something about how they have addressed the reviewers' concerns
14:35:29 <rosmaita> so for example nileshthathagar i think you addressed my question about extending the timeouts in some dell patches?
14:36:34 <rosmaita> #link https://review.opendev.org/c/openstack/cinder/+/804584
14:36:52 <rosmaita> sorry, bad paste there
14:37:04 <rosmaita> #link https://review.opendev.org/c/openstack/cinder/+/939514
14:37:37 <rosmaita> i am curious what other people think about extending the retries
14:38:08 <rosmaita> if my calculations are correct, the retry loop could hold for a little over half an hour
14:38:50 <rosmaita> seems like maybe it would be better to just fail after a reasonable time?
14:39:00 <rosmaita> even 17 min seems more reasonable
14:39:23 <rosmaita> i am just worried that the issues caused by waiting that long with still no success would outweigh the cost of an earlier failure
14:39:53 <rosmaita> (i picked 17 min because iirc, that was the worst case in your testing)
14:40:15 <nileshthathagar> Hi Brian
14:40:27 <rosmaita> hello!
14:40:36 <nileshthathagar> yes, in the worst case it is going to retry a 6th time
14:40:52 <nileshthathagar> but not for every request
14:41:31 <nileshthathagar> it can be one in 10-15 requests or more
14:41:32 <rosmaita> right, i just worry that if something happens to the backend, these could pile up for a half hour
14:42:41 <nileshthathagar> yeah, that can happen but it will be rare.
14:43:12 <rosmaita> maybe it's nothing to worry about, i just have a bad feeling
14:43:46 <rosmaita> ok when you push a new patch set to update the release note, that will clear my -1
14:44:20 <nileshthathagar> ok thanks i will do it
14:44:48 <whoami-rajat> do we know what's the reason for the wait here?
14:44:50 <nileshthathagar> also there are other patches which have retries.
14:45:06 <whoami-rajat> from the code, i feel we are trying to delete the volume and if it has dependent snapshots, we retry?
14:45:42 <rosmaita> yeah, it sounds like these aren't normal cinder snapshots, some kind of backend thing
14:45:53 <nileshthathagar> yes, there are some active snapshots that do not get cleared from powermax
14:45:54 <rosmaita> and sometimes the backend takes a long time to clean them up
14:46:40 <nileshthathagar> it is taking some time, but it is only happening occasionally
14:47:36 <whoami-rajat> okay, it would be good to investigate under which circumstances those delays happen
14:48:03 <whoami-rajat> since the _cleanup_device_retry is only called during delete volume, the retry change is isolated to only one operation which is good
14:48:28 <rosmaita> well, except, like nileshthathagar says, there are some retries added on other patches
14:48:34 <rosmaita> all with the same parameters, iirc
14:49:03 <nileshthathagar> yes
14:49:07 <whoami-rajat> hmm, that sounds concerning ...
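To make the scoping point concrete, here is a hedged sketch of a backoff retry confined to the single delete path, written against the tenacity library; the names (_cleanup_device_retry, BackendBusy, ExampleDriver) are hypothetical, and the real patches may use cinder's own retry helper instead.

```python
# Hypothetical illustration: the retry wraps only the backend cleanup that
# delete_volume depends on, so other driver operations are unaffected.
import tenacity


class BackendBusy(Exception):
    """Raised while the backend still reports dependent snapshots."""


class ExampleDriver:

    @tenacity.retry(retry=tenacity.retry_if_exception_type(BackendBusy),
                    wait=tenacity.wait_exponential(multiplier=30, max=960),
                    stop=tenacity.stop_after_attempt(6),
                    reraise=True)
    def _cleanup_device_retry(self, device_id):
        # Gives up and re-raises after the final attempt.
        if self._has_active_snapshots(device_id):
            raise BackendBusy(device_id)
        self._delete_device(device_id)

    def delete_volume(self, volume):
        # Only this operation pays the (potentially long) retry cost.
        self._cleanup_device_retry(volume['provider_location'])

    def _has_active_snapshots(self, device_id):
        # Backend query stub; a real driver would call the array's API here.
        return False

    def _delete_device(self, device_id):
        # Backend delete stub.
        pass
```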
14:50:34 <rosmaita> yes, i would feel better if there could be some kind of load test done over 30 hours with a large number of simulated users to see what happens to cinder
14:50:57 <rosmaita> but i would like to ride a unicorn, too
14:51:04 <whoami-rajat> :D
14:51:35 <whoami-rajat> at least analysing the logs from the storage array would give us some clue, like in this case, if the driver is sending the snapshot delete request, why does it still exist
14:51:37 <sfernand> wondering why not implement periodic tasks for dealing with such longer waits. The volume could be marked as deleted and a periodic task could make sure it gets cleared in the backend
14:51:49 <whoami-rajat> there should be some logging around that
14:52:39 <rosmaita> good points from both of you ... i mean, i would feel better if we merge the patch with better logging and ultimately go in the direction sfernand suggests
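A minimal sketch of the direction sfernand describes, assuming an oslo.service periodic task that retries backend cleanup after the volume has already been marked deleted; the class and method names here are hypothetical, not taken from the PowerMax driver.

```python
# Hypothetical sketch: defer backend cleanup to a periodic task instead of
# blocking delete_volume for up to half an hour.
from oslo_log import log as logging
from oslo_service import periodic_task

LOG = logging.getLogger(__name__)


class DeferredCleanupMixin(periodic_task.PeriodicTasks):

    def __init__(self, conf):
        super().__init__(conf)
        self._pending_cleanup = set()

    def defer_cleanup(self, device_id):
        LOG.info("Deferring backend cleanup of device %s", device_id)
        self._pending_cleanup.add(device_id)

    @periodic_task.periodic_task(spacing=300)
    def _run_deferred_cleanup(self, context):
        for device_id in list(self._pending_cleanup):
            if self._cleanup_device(device_id):
                self._pending_cleanup.discard(device_id)
            else:
                LOG.warning("Device %s still has active snapshots; "
                            "will retry on the next cycle", device_id)

    def _cleanup_device(self, device_id):
        # Backend-specific call; returns True once the device is gone.
        raise NotImplementedError
```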
14:54:33 <rosmaita> ok, moving on ... anyone else on the review request list like to make a statement?
14:55:14 <nileshthathagar> As of now it is happening sometimes. Will do some load testing.
14:55:43 <nileshthathagar> But it will take some time
14:55:55 <rosmaita> yeah, i understand
14:56:20 <rosmaita> i think the worry here is that this fix might cause problems later on
14:56:52 <rosmaita> but, it might make sense to do the fix now and pro-actively try to figure out how bad the later problems will be
14:57:01 <nileshthathagar> yes
14:57:04 <rosmaita> and then address them before users hit them
14:57:21 <kpdev> @rosmaita: we need approval on storpool PR https://review.opendev.org/c/openstack/cinder/+/933078 ; you have reviewed it, so someone else from core, please review
14:58:43 <rosmaita> it's a quick review, and i think a safe patch, so someone please take a look
14:59:03 <rosmaita> definitely important to get that one in soon
14:59:12 <nileshthathagar> will definitely do that. but in case some customer hits the issue, we have some kind of fix for them
14:59:13 <inori> we need review on Fujitsu Eternus https://review.opendev.org/c/openstack/cinder/+/907126 ; this patch has received comments in the past, and we've replied to them.
14:59:15 <rosmaita> kpdev: are all your os-brick changes merged?
14:59:41 <kpdev> yes
14:59:46 <rosmaita> great
14:59:51 <rosmaita> inori: ack
15:00:10 <inori> Thank you rosmaita
15:00:53 <rosmaita> looks like there are a bunch of child patches, so it would be good for us to get that one out of the way
15:00:56 <rosmaita> #link https://review.opendev.org/c/openstack/cinder/+/907126
15:01:15 <rosmaita> just noticed that we are over time
15:01:24 <rosmaita> thanks everyone for attending!
15:01:32 <rosmaita> #endmeeting