15:00:42 <enriquetaso> #startmeeting cinder_bs
15:00:42 <opendevmeet> Meeting started Wed Jul 6 15:00:42 2022 UTC and is due to finish in 60 minutes. The chair is enriquetaso. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:42 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:42 <opendevmeet> The meeting name has been set to 'cinder_bs'
15:00:59 <enriquetaso> Only one bug for today's meeting
15:01:00 <enriquetaso> #link https://lists.openstack.org/pipermail/openstack-discuss/2022-July/029420.html
15:01:44 <enriquetaso> #topic creating a bootable volume fails async when vol_size < virtual_size of image
15:01:51 <enriquetaso> #link https://bugs.launchpad.net/cinder/+bug/1980268
15:01:58 <enriquetaso> Summary: When creating an instance from a bootable volume, if the virtual size of the image is greater than the volume size, the check is done later in the cinder-volume service and fails, whereas nova waits for a reply and times out.
15:02:09 <enriquetaso> cinder logs:
15:02:09 <enriquetaso> "ERROR cinder.volume.manager ImageUnacceptable: Image <image uuid> is unacceptable: Image virtual size is 10GB and doesn't fit in a volume of size 8GB."
15:02:20 <enriquetaso> Fix proposed to master
15:02:24 <enriquetaso> #link https://review.opendev.org/c/openstack/cinder/+/847335
15:03:35 <whoami-rajat> hi
15:04:01 <whoami-rajat> yeah, I've added a check to the API layer for virtual_size
15:04:22 <geguileo> whoami-rajat: but I think this is a Nova bug, right?
15:04:28 <whoami-rajat> it is currently failing the grenade job, which is something I've seen before, but it might be a different issue
15:04:42 <geguileo> because Cinder moves the status of the volume to "error", right?
15:04:44 <whoami-rajat> geguileo, it is, kind of, it can be handled at both ends
15:04:54 <geguileo> but the right approach is fixing it in Nova
15:05:03 <geguileo> because they have to check the status of the volume for "error"
15:05:09 <geguileo> not only "available"
15:05:18 <geguileo> and if it goes into error they should not time out, but instead fail
15:06:00 <whoami-rajat> that's true
15:06:27 <whoami-rajat> but it is helpful for us as well when creating a bootable volume since we will fail fast
15:06:40 <rosmaita> i thought lyarwood fixed the nova side already
15:06:47 <geguileo> whoami-rajat: yeah, but if we fix it too soon Nova may never fix their side
15:07:30 <rosmaita> we do need both, because there's no guarantee of virtual_size being populated
15:07:35 <whoami-rajat> rosmaita, i discussed this with lyarwood and i don't think he did
15:07:55 <whoami-rajat> it was discovered long ago but no changes were made on the nova side to handle it
15:08:07 <rosmaita> i thought i saw a patch, guess not
15:08:22 <whoami-rajat> rosmaita, maybe you're right and i might have missed it
15:08:24 <geguileo> this is a larger problem than this specific case
15:08:36 <rosmaita> that's a good point
15:08:38 <geguileo> because cinder can fail for literally 10^6 different reasons
15:08:51 <rosmaita> only 10^6 ?
15:08:54 <geguileo> and if they are just expecting the happy path, that's a real problem
15:09:01 <geguileo> rosmaita: yeah, feeling optimistic today ;-)
15:09:08 <rosmaita> :)
15:09:42 <geguileo> btw, I'm not saying that I'm against the cinder fix, I actually think it's a great fix
15:10:05 <geguileo> I'm in favor of fail-fast systems
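For context, the API-layer check whoami-rajat describes amounts to roughly the following. This is a minimal sketch, not the actual patch under review in 847335; the function name and its placement are illustrative, and note (per rosmaita's point above) that virtual_size may not be populated at all, in which case the late check in cinder-volume still applies.

```python
# Minimal sketch of an API-layer virtual_size check; illustrative only,
# not the code proposed in https://review.opendev.org/c/openstack/cinder/+/847335
from cinder import exception

GB = 1024 ** 3

def check_image_fits_volume(image_meta, volume_size_gb):
    """Reject the request up front if the image can't fit in the volume."""
    virtual_size = image_meta.get('virtual_size')  # bytes, may be absent
    if not virtual_size:
        return  # fall back to the late check in the volume service
    if virtual_size > volume_size_gb * GB:
        raise exception.InvalidInput(
            reason="Image virtual size is %d GB and doesn't fit in a "
                   "volume of size %d GB."
                   % (virtual_size // GB, volume_size_gb))
```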
15:10:54 <HappyStacker> have a question as I don't know if we can categorize it as a bug or not. We at Dell have a storage system called PowerFlex which rounds volume sizes to multiples of 8GB. But in case an operator creates a volume and then extends it to a size which is not a multiple of 8GB, cinder will report a size which doesn't match the backend, because it doesn't expect a size different from what we told it.
15:10:55 <whoami-rajat> geguileo, I understand your point but don't know what we can do to convince the nova team to fix it or fix similar related errors ...
15:11:36 <geguileo> whoami-rajat: I can write a bot that creates 1000 replicas of the original bug report...
15:11:44 <geguileo> };-)
15:11:55 <rosmaita> don't you mean 10^6 replicas?
15:11:56 <geguileo> whoami-rajat: because there's a Nova bug, right?
15:12:07 <geguileo> rosmaita: lol
15:12:47 <whoami-rajat> geguileo, on launchpad? not one I'm aware of; since i reported the cinder one, we can add nova to it
15:12:49 <geguileo> HappyStacker: we can discuss that particular case at the end of the bug meeting
15:13:06 <geguileo> whoami-rajat: well, if there is no Nova bug then it would be hard for them to fix it
15:13:07 <HappyStacker> sure geguileo
15:13:16 <geguileo> and I can't complain XD
15:13:31 <geguileo> anyway, the fix seems reasonable and appropriate
15:13:38 <geguileo> I'll stop complaining
15:14:05 <enriquetaso> sorry, what should we do regarding the nova bug?
15:14:16 <enriquetaso> should I link nova in the bug?
15:15:01 <geguileo> enriquetaso: no, I think we need a different bug, because their bug is, if I understood it correctly, that they don't handle the error status gracefully in that flow
15:15:08 <whoami-rajat> geguileo, thanks for your inputs, we should at least inform the nova team to fix it and see if they get it fixed
15:16:14 <enriquetaso> whoami-rajat, would you mind creating a bug for nova with the error?
15:16:32 <whoami-rajat> enriquetaso, sure will do
15:17:39 <enriquetaso> thanks
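The Nova-side fix geguileo describes is, in essence, a wait loop that treats "error" as a terminal state instead of polling until timeout. A minimal sketch of that pattern follows; this is not Nova's actual code, and get_volume and VolumeCreationFailed are made-up stand-ins for the real plumbing.

```python
# Sketch of fail-fast volume polling: stop waiting as soon as cinder
# reports "error", rather than letting the timeout expire.
import time

class VolumeCreationFailed(Exception):
    """Hypothetical exception; Nova's real error handling differs."""

def wait_for_volume(get_volume, volume_id, timeout=300, interval=2):
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        volume = get_volume(volume_id)  # stand-in for a Cinder API call
        if volume['status'] == 'available':
            return volume
        if volume['status'] == 'error':
            # cinder has already given up on this volume, so waiting
            # any longer can only end in a spurious timeout
            raise VolumeCreationFailed(volume_id)
        time.sleep(interval)
    raise TimeoutError('volume %s never became available' % volume_id)
```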
15:17:51 <enriquetaso> #topic PowerFlex rounds volume sizes to multiples of 8GB
15:17:56 <enriquetaso> HappyStacker, ^
15:18:05 <enriquetaso> would you mind sharing the question again
15:18:14 <enriquetaso> or do you have a launchpad bug for it?
15:18:34 <enriquetaso> have a question as I don't know if we can categorize it as a bug or not. We at Dell have a storage system called PowerFlex which rounds volume sizes to multiples of 8GB. But in case an operator creates a volume and then extends it to a size which is not a multiple of 8GB, cinder will report a size which doesn't match the backend, because it doesn't expect a size different from what we told it.
15:18:37 <HappyStacker> we don't have any launchpad for that yet and it's on our to-do list
15:19:06 <HappyStacker> the question is do we need to raise a bug?
15:19:17 <HappyStacker> is it a bug? as cinder reacts as expected
15:19:22 <geguileo> HappyStacker: does this cause operational problems?
15:19:51 <HappyStacker> monitoring storage capacity from an operator's standpoint
15:20:13 <HappyStacker> the backend is fine, but it will not match what an operator sees
15:20:14 <enriquetaso> i think maybe override the extend() method on the powerflex driver to do as you need?
15:20:22 <enriquetaso> oh
15:20:27 <enriquetaso> ok
15:20:57 <geguileo> but when you create a 1GB volume in that backend (results in an 8GB volume) and attach it to a VM, format it, and use it
15:20:58 <whoami-rajat> enriquetaso, that method currently doesn't return a model update so cinder won't update the size even if they return a new size
15:21:01 <enriquetaso> so in the backend you have an 8GB multiple but 'cinder list' shows something else? wow, that's weird
15:21:13 <geguileo> can it then be migrated to another backend with a 1GB volume and still work?
15:21:21 <HappyStacker> yes that's the case
15:21:28 <geguileo> enriquetaso: not wow, it's the expected behavior
15:21:44 <HappyStacker> the real volume will be 8GB as seen by the operating system
15:21:47 <enriquetaso> true
15:21:51 <geguileo> customers should NEVER request a 1GB volume and suddenly see an 8GB volume
15:22:04 <geguileo> regardless of how much is used in the backend
15:22:07 <enriquetaso> same with encrypted volumes tho :P
15:22:27 <geguileo> enriquetaso: that's related, yes
15:22:40 <geguileo> HappyStacker: so would that operation work fine?
15:22:43 <whoami-rajat> geguileo, for the create case the size is correct, but for the extend case it's mismatched between the cinder DB and the actual backend vol
15:22:58 <geguileo> whoami-rajat: how can it be correct for the create?
15:23:05 <geguileo> ooooooh, I see what you mean
15:23:13 <rosmaita> eharney always mentions this case, i think there is another driver that has this issue
15:23:13 <geguileo> that they are returning the wrong value on the extend!
15:23:14 <whoami-rajat> geguileo, because we return size in the model update for create but not for extend
15:23:18 <HappyStacker> At creation time, I think it works, but it doesn't reflect reality when we do an extend
15:23:48 <geguileo> HappyStacker: reality is a subjective term ;-)
15:23:58 <geguileo> Cinder must report the USER requested size (1GB)
15:24:02 <HappyStacker> We want to fix it, but we don't know where to start
15:24:05 <geguileo> not the actual allocated size
15:24:40 <geguileo> HappyStacker: when you extend from 1GB to 2GB does the size end up saying 8GB?
15:24:43 <HappyStacker> but how is an operator supposed to monitor the size allocation?
15:24:48 <whoami-rajat> here for create they return the updated size https://github.com/openstack/cinder/blob/master/cinder/volume/drivers/dell_emc/powerflex/driver.py#L635
15:24:49 <HappyStacker> yes
15:24:53 <geguileo> HappyStacker: that's a different issue
15:25:09 <geguileo> whoami-rajat: that's a bug right there
15:25:20 <whoami-rajat> but for extend, we don't return anything (but even if we returned something, cinder doesn't expect a return and won't do anything) https://github.com/openstack/cinder/blob/master/cinder/volume/drivers/dell_emc/powerflex/driver.py#L804
15:25:22 <geguileo> HappyStacker: one thing is a bug in a driver, another is a new feature
15:25:33 <geguileo> whoami-rajat: then extend is correct
15:25:37 <HappyStacker> as whoami-rajat said, there is no issue at creation time
15:25:40 <geguileo> and the bug they have is in create
15:25:52 <geguileo> HappyStacker: on the contrary, the creation is *WRONG*
15:26:04 <HappyStacker> yes, from a cinder standpoint
15:26:08 <whoami-rajat> geguileo, ok, so for creation, we shouldn't modify the size value?
15:26:10 <geguileo> a driver should never, ever change the size
15:26:42 <geguileo> if I'm a user and I request a 1GB volume and then check the volume and see an 8GB volume I'm going to call somebody to complain
15:26:43 <HappyStacker> ok, so the bottom line is that at creation the size should reflect the user request
15:26:54 <geguileo> HappyStacker: at creation a driver should not return the size
15:27:08 <geguileo> that's a cinder core thingy
15:27:18 <HappyStacker> ok
15:27:26 <geguileo> so, there is a bug there
15:27:42 <geguileo> on the other hand a new feature would be necessary for what you want
15:27:55 <geguileo> which is reporting the real/allocated size for a volume
15:28:03 <HappyStacker> but how can we make our powerflex driver comply with cinder best practices?
15:28:08 <HappyStacker> yes
15:28:15 <geguileo> not returning the size
15:28:21 <geguileo> that's the best practice
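The create-time bug and geguileo's "1 line patch" look roughly like the following. This is a simplified sketch of the pattern under discussion, not the verbatim PowerFlex driver code (see the driver.py links above for the real thing); _backend_create is a made-up stand-in for the backend call.

```python
# Simplified sketch of a driver that must round to backend granularity
# while still honoring the user-requested size in the cinder DB.
import math

GRANULARITY_GB = 8  # PowerFlex allocates space in 8GB chunks

def create_volume(self, volume):
    # the backend only accepts multiples of 8GB, so round up internally
    rounded = int(math.ceil(volume.size / float(GRANULARITY_GB))) * GRANULARITY_GB
    provider_id = self._backend_create(volume.name, rounded)  # hypothetical helper
    model_update = {'provider_id': provider_id}
    # The "1 line patch": do NOT also return {'size': rounded} here.
    # Returning the rounded size overwrites the user-requested size in
    # the cinder DB, which is the create-time bug discussed above.
    return model_update
```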
15:29:01 <HappyStacker> so the bug is that cinder should ALWAYS report the size requested by the user and not the size returned by the backend
15:29:05 <geguileo> a new DB field in volumes and snapshots
15:29:27 <HappyStacker> and the feature would be to have a new field called allocation, for instance
15:29:27 <geguileo> HappyStacker: yes, though the way of fixing it is for the driver not to return the size
15:29:41 <geguileo> yeah, or actual_size
15:30:05 <geguileo> it would need to be in both the volumes and snapshots DB tables
15:30:11 <HappyStacker> I came with one problem, I'll leave with two, thanks guys! just kidding, gotcha
15:30:29 <geguileo> HappyStacker: one is a problem which is fixed with a 1 line patch
15:30:37 <geguileo> HappyStacker: the other is a feature, not a problem ;-P
15:30:58 <geguileo> HappyStacker: The feature would require a spec
15:31:09 <HappyStacker> yes
15:31:26 <HappyStacker> and the other needs to have a launchpad bug
15:31:36 <HappyStacker> with a patchset
15:31:46 <geguileo> yup
15:31:54 <HappyStacker> including the one line fix
15:31:57 <HappyStacker> ok
15:32:18 <enriquetaso> great
15:32:32 <HappyStacker> I'll bring it up to my team
15:32:56 <geguileo> HappyStacker: feel free to ping us regarding the feature/spec
15:33:16 <HappyStacker> yes, I certainly will, as it won't be trivial for me
15:34:46 <vbel> Happy powerflexing! ;)
15:35:01 <HappyStacker> ROFL
15:35:39 <vbel> there is another old problem that is fixed - https://review.opendev.org/c/openstack/cinder/+/806605 . Unfortunately no +2s :(
15:36:31 <enriquetaso> OK.. Moving on
15:36:35 <enriquetaso> #topic open discussion
15:37:08 <opendevreview> Merged openstack/cinder master: mypy: cinder/api/common.py https://review.opendev.org/c/openstack/cinder/+/841125
15:38:45 <vbel> Is anything else needed for getting the new Tatlin Unified storage driver into Zed? https://review.opendev.org/c/openstack/cinder/+/825492
15:40:18 <enriquetaso> Cinder people please review ^
15:41:14 <vbel> thanks!
15:41:34 <enriquetaso> looks like eharney reviewed it a while ago and the code was updated :)
15:41:48 <enriquetaso> OK
15:41:53 <enriquetaso> we've run out of time!!
15:41:58 <enriquetaso> #endmeeting