15:00:42 <enriquetaso> #startmeeting cinder_bs
15:00:42 <opendevmeet> Meeting started Wed Jul 6 15:00:42 2022 UTC and is due to finish in 60 minutes. The chair is enriquetaso. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:42 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:42 <opendevmeet> The meeting name has been set to 'cinder_bs'
15:00:59 <enriquetaso> Only one bug for today's meeting
15:01:00 <enriquetaso> #link https://lists.openstack.org/pipermail/openstack-discuss/2022-July/029420.html
15:01:44 <enriquetaso> #topic creating a bootable volume fails async when vol_size < virtual_size of image
15:01:51 <enriquetaso> #link https://bugs.launchpad.net/cinder/+bug/1980268
15:01:58 <enriquetaso> Summary: When creating an instance from a bootable volume, if the virtual size of the image is greater than the volume size, the check is done later in the cinder-volume service and fails, whereas nova waits for a reply and times out.
15:02:09 <enriquetaso> cinder logs:
15:02:09 <enriquetaso> "ERROR cinder.volume.manager ImageUnacceptable: Image <image uuid> is unacceptable: Image virtual size is 10GB and doesn't fit in a volume of size 8GB."
15:02:20 <enriquetaso> Fix proposed to master
15:02:24 <enriquetaso> #link https://review.opendev.org/c/openstack/cinder/+/847335
15:03:35 <whoami-rajat> hi
15:04:01 <whoami-rajat> yeah, I've added a check to the API layer for virtual_size
15:04:22 <geguileo> whoami-rajat: but I think this is a Nova bug, right?
15:04:28 <whoami-rajat> it is currently failing the grenade job, which is something I've seen before, but it might be a different issue
15:04:42 <geguileo> because Cinder moves the status of the volume to "error", right?
15:04:44 <whoami-rajat> geguileo, it is, kind of, it can be handled at both ends
15:04:54 <geguileo> but the right approach is fixing it in Nova
15:05:03 <geguileo> because they have to check the status of the volume for "error"
15:05:09 <geguileo> not only "available"
15:05:18 <geguileo> and if it goes into error they should not time out, but instead fail
15:06:00 <whoami-rajat> that's true
15:06:27 <whoami-rajat> but it is helpful for us as well when creating a bootable volume since we will fail fast
15:06:40 <rosmaita> i thought lyarwood fixed the nova side already
15:06:47 <geguileo> whoami-rajat: yeah, but if we fix it too soon Nova may never fix their side
15:07:30 <rosmaita> we do need both, because there's no guarantee of virtual_size being populated
15:07:35 <whoami-rajat> rosmaita, i discussed this with lyarwood and i don't think he did
15:07:55 <whoami-rajat> it was discovered long ago but no changes were made on the nova side to handle it
15:08:07 <rosmaita> i thought i saw a patch, guess not
15:08:22 <whoami-rajat> rosmaita, maybe you're right and i might have missed it
15:08:24 <geguileo> this is a larger problem than this specific case
15:08:36 <rosmaita> that's a good point
15:08:38 <geguileo> because cinder can fail for literally 10^6 different reasons
15:08:51 <rosmaita> only 10^6 ?
15:08:54 <geguileo> and if they are just expecting the happy path, that's a real problem
15:09:01 <geguileo> rosmaita: yeah, feeling optimistic today ;-)
15:09:08 <rosmaita> :)
15:09:42 <geguileo> btw, I'm not saying that I'm against the cinder fix, I actually think it's a great fix
15:10:05 <geguileo> I'm in favor of fail-fast systems
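For context, the API-layer check whoami-rajat describes amounts to roughly the following. This is a minimal sketch, not the actual patch under review in 847335; the function name and its placement are illustrative, and note (per rosmaita's point above) that virtual_size may not be populated at all, in which case the late check in cinder-volume still applies.

```python
# Minimal sketch of an API-layer virtual_size check; illustrative only,
# not the code proposed in https://review.opendev.org/c/openstack/cinder/+/847335
from cinder import exception

GB = 1024 ** 3

def check_image_fits_volume(image_meta, volume_size_gb):
    """Reject the request up front if the image can't fit in the volume."""
    virtual_size = image_meta.get('virtual_size')  # bytes, may be absent
    if not virtual_size:
        return  # fall back to the late check in the volume service
    if virtual_size > volume_size_gb * GB:
        raise exception.InvalidInput(
            reason="Image virtual size is %d GB and doesn't fit in a "
                   "volume of size %d GB."
                   % (virtual_size // GB, volume_size_gb))
```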
15:10:54 <HappyStacker> have a question as I don't know if we can categorize it as a bug or not. We at Dell have a storage system called PowerFlex which rounds volume sizes to multiples of 8GB. But in case an operator creates a volume and then extends it to a size which is not a multiple of 8GB, cinder will report a size which doesn't match the backend, because it doesn't expect a size different from what we told it.
15:10:55 <whoami-rajat> geguileo, I understand your point but don't know what we can do to convince the nova team to fix it or fix similar related errors ...
15:11:36 <geguileo> whoami-rajat: I can write a bot that creates 1000 replicas of the original bug report...
15:11:44 <geguileo> };-)
15:11:55 <rosmaita> don't you mean 10^6 replicas?
15:11:56 <geguileo> whoami-rajat: because there's a Nova bug, right?
15:12:07 <geguileo> rosmaita: lol
15:12:47 <whoami-rajat> geguileo, on launchpad? not one I'm aware of; since i reported the cinder one, we can add nova to it
15:12:49 <geguileo> HappyStacker: we can discuss that particular case at the end of the bug meeting
15:13:06 <geguileo> whoami-rajat: well, if there is no Nova bug then it would be hard for them to fix it
15:13:07 <HappyStacker> sure geguileo
15:13:16 <geguileo> and I can't complain XD
15:13:31 <geguileo> anyway, the fix seems reasonable and appropriate
15:13:38 <geguileo> I'll stop complaining
15:14:05 <enriquetaso> sorry, what should we do regarding the nova bug?
15:14:16 <enriquetaso> should I link nova in the bug?
15:15:01 <geguileo> enriquetaso: no, I think we need a different bug, because their bug is, if I understood it correctly, that they don't handle the error status gracefully in that flow
15:15:08 <whoami-rajat> geguileo, thanks for your inputs, we should at least inform the nova team to fix it and see if they get it fixed
15:16:14 <enriquetaso> whoami-rajat, would you mind creating a bug for nova with the error?
15:16:32 <whoami-rajat> enriquetaso, sure will do
15:17:39 <enriquetaso> thanks
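The Nova-side fix geguileo describes is, in essence, a wait loop that treats "error" as a terminal state instead of polling until timeout. A minimal sketch of that pattern follows; this is not Nova's actual code, and get_volume and VolumeCreationFailed are made-up stand-ins for the real plumbing.

```python
# Sketch of fail-fast volume polling: stop waiting as soon as cinder
# reports "error", rather than letting the timeout expire.
import time

class VolumeCreationFailed(Exception):
    """Hypothetical exception; Nova's real error handling differs."""

def wait_for_volume(get_volume, volume_id, timeout=300, interval=2):
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        volume = get_volume(volume_id)  # stand-in for a Cinder API call
        if volume['status'] == 'available':
            return volume
        if volume['status'] == 'error':
            # cinder has already given up on this volume, so waiting
            # any longer can only end in a spurious timeout
            raise VolumeCreationFailed(volume_id)
        time.sleep(interval)
    raise TimeoutError('volume %s never became available' % volume_id)
```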
15:17:51 <enriquetaso> #topic PowerFlex rounds volume sizes to multiples of 8GB
15:17:56 <enriquetaso> HappyStacker, ^
15:18:05 <enriquetaso> would you mind sharing the question again
15:18:14 <enriquetaso> or do you have a launchpad bug for it?
15:18:34 <enriquetaso> have a question as I don't know if we can categorize it as a bug or not. We at Dell have a storage system called PowerFlex which rounds volume sizes to multiples of 8GB. But in case an operator creates a volume and then extends it to a size which is not a multiple of 8GB, cinder will report a size which doesn't match the backend, because it doesn't expect a size different from what we told it.
15:18:37 <HappyStacker> we don't have any launchpad for that yet and it's on our to-do list
15:19:06 <HappyStacker> the question is do we need to raise a bug?
15:19:17 <HappyStacker> is it a bug? as cinder reacts as expected
15:19:22 <geguileo> HappyStacker: does this cause operational problems?
15:19:51 <HappyStacker> monitoring storage capacity from an operator's standpoint
15:20:13 <HappyStacker> the backend is fine, but it will not match what an operator sees
15:20:14 <enriquetaso> i think maybe override the extend() method on the powerflex driver to do as you need?
15:20:22 <enriquetaso> oh
15:20:27 <enriquetaso> ok
15:20:57 <geguileo> but when you create a 1GB volume in that backend (results in an 8GB volume) and attach it to a VM, format it, and use it
15:20:58 <whoami-rajat> enriquetaso, that method currently doesn't return a model update so cinder won't update the size even if they return a new size
15:21:01 <enriquetaso> so in the backend you have an 8GB multiple but 'cinder list' shows something else? wow, that's weird
15:21:13 <geguileo> can it then be migrated to another backend with a 1GB volume and still work?
15:21:21 <HappyStacker> yes that's the case
15:21:28 <geguileo> enriquetaso: not wow, it's the expected behavior
15:21:44 <HappyStacker> the real volume will be 8GB as seen by the operating system
15:21:47 <enriquetaso> true
15:21:51 <geguileo> customers should NEVER request a 1GB volume and suddenly see an 8GB volume
15:22:04 <geguileo> regardless of how much is used in the backend
15:22:07 <enriquetaso> same with encrypted volumes tho :P
15:22:27 <geguileo> enriquetaso: that's related, yes
15:22:40 <geguileo> HappyStacker: so would that operation work fine?
15:22:43 <whoami-rajat> geguileo, for the create case the size is correct, but for the extend case it's mismatched between the cinder DB and the actual backend vol
15:22:58 <geguileo> whoami-rajat: how can it be correct for the create?
15:23:05 <geguileo> ooooooh, I see what you mean
15:23:13 <rosmaita> eharney always mentions this case, i think there is another driver that has this issue
15:23:13 <geguileo> that they are returning the wrong value on the extend!
15:23:14 <whoami-rajat> geguileo, because we return size in the model update for create but not for extend
15:23:18 <HappyStacker> At creation time, I think it works, but it doesn't reflect reality when we do an extend
15:23:48 <geguileo> HappyStacker: reality is a subjective term ;-)
15:23:58 <geguileo> Cinder must report the USER requested size (1GB)
15:24:02 <HappyStacker> We want to fix it, but we don't know where to start
15:24:05 <geguileo> not the actual allocated size
15:24:40 <geguileo> HappyStacker: when you extend from 1GB to 2GB does the size end up saying 8GB?
15:24:43 <HappyStacker> but how is an operator supposed to monitor the size allocation?
15:24:48 <whoami-rajat> here for create they return the updated size https://github.com/openstack/cinder/blob/master/cinder/volume/drivers/dell_emc/powerflex/driver.py#L635
15:24:49 <HappyStacker> yes
15:24:53 <geguileo> HappyStacker: that's a different issue
15:25:09 <geguileo> whoami-rajat: that's a bug right there
15:25:20 <whoami-rajat> but for extend, we don't return anything (but even if we returned something, cinder doesn't expect a return and won't do anything) https://github.com/openstack/cinder/blob/master/cinder/volume/drivers/dell_emc/powerflex/driver.py#L804
15:25:22 <geguileo> HappyStacker: one thing is a bug in a driver, another is a new feature
15:25:33 <geguileo> whoami-rajat: then extend is correct
15:25:37 <HappyStacker> as whoami-rajat said, there is no issue at creation time
15:25:40 <geguileo> and the bug they have is in create
15:25:52 <geguileo> HappyStacker: on the contrary, the creation is *WRONG*
15:26:04 <HappyStacker> yes, from a cinder standpoint
15:26:08 <whoami-rajat> geguileo, ok, so for creation, we shouldn't modify the size value?
15:26:10 <geguileo> a driver should never, ever change the size
15:26:42 <geguileo> if I'm a user and I request a 1GB volume and then check the volume and see an 8GB volume I'm going to call somebody to complain
15:26:43 <HappyStacker> ok, so the bottom line is that at creation the size should reflect the user request
15:26:54 <geguileo> HappyStacker: at creation a driver should not return the size
15:27:08 <geguileo> that's a cinder core thingy
15:27:18 <HappyStacker> ok
15:27:26 <geguileo> so, there is a bug there
15:27:42 <geguileo> on the other hand a new feature would be necessary for what you want
15:27:55 <geguileo> which is reporting the real/allocated size for a volume
15:28:03 <HappyStacker> but how can we make our powerflex driver comply with cinder best practices?
15:28:08 <HappyStacker> yes
15:28:15 <geguileo> not returning the size
15:28:21 <geguileo> that's the best practice
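The create-time bug and geguileo's "1 line patch" look roughly like the following. This is a simplified sketch of the pattern under discussion, not the verbatim PowerFlex driver code (see the driver.py links above for the real thing); _backend_create is a made-up stand-in for the backend call.

```python
# Simplified sketch of a driver that must round to backend granularity
# while still honoring the user-requested size in the cinder DB.
import math

GRANULARITY_GB = 8  # PowerFlex allocates space in 8GB chunks

def create_volume(self, volume):
    # the backend only accepts multiples of 8GB, so round up internally
    rounded = int(math.ceil(volume.size / float(GRANULARITY_GB))) * GRANULARITY_GB
    provider_id = self._backend_create(volume.name, rounded)  # hypothetical helper
    model_update = {'provider_id': provider_id}
    # The "1 line patch": do NOT also return {'size': rounded} here.
    # Returning the rounded size overwrites the user-requested size in
    # the cinder DB, which is the create-time bug discussed above.
    return model_update
```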
15:29:01 <HappyStacker> so the bug is that cinder should ALWAYS report the size requested by the user and not the size returned by the backend
15:29:05 <geguileo> a new DB field in volumes and snapshots
15:29:27 <HappyStacker> and the feature would be to have a new field called allocation, for instance
15:29:27 <geguileo> HappyStacker: yes, though the way of fixing it is for the driver not to return the size
15:29:41 <geguileo> yeah, or actual_size
15:30:05 <geguileo> it would need to be in both the volumes and snapshots DB tables
15:30:11 <HappyStacker> I came with one problem, I'll leave with two, thanks guys! just kidding, gotcha
15:30:29 <geguileo> HappyStacker: one is a problem which is fixed with a 1 line patch
15:30:37 <geguileo> HappyStacker: the other is a feature, not a problem ;-P
15:30:58 <geguileo> HappyStacker: The feature would require a spec
15:31:09 <HappyStacker> yes
15:31:26 <HappyStacker> and the other needs to have a launchpad bug
15:31:36 <HappyStacker> with a patchset
15:31:46 <geguileo> yup
15:31:54 <HappyStacker> including the one line fix
15:31:57 <HappyStacker> ok
15:32:18 <enriquetaso> great
15:32:32 <HappyStacker> I'll bring it up to my team
15:32:56 <geguileo> HappyStacker: feel free to ping us regarding the feature/spec
15:33:16 <HappyStacker> yes, I certainly will, as it won't be trivial for me
15:34:46 <vbel> Happy powerflexing! ;)
15:35:01 <HappyStacker> ROFL
15:35:39 <vbel> there is another old problem that is fixed - https://review.opendev.org/c/openstack/cinder/+/806605 . Unfortunately no +2s :(
15:36:31 <enriquetaso> OK.. Moving on
15:36:35 <enriquetaso> #topic open discussion
15:37:08 <opendevreview> Merged openstack/cinder master: mypy: cinder/api/common.py https://review.opendev.org/c/openstack/cinder/+/841125
15:38:45 <vbel> Is anything else needed for getting the new Tatlin Unified storage driver into Zed? https://review.opendev.org/c/openstack/cinder/+/825492
15:40:18 <enriquetaso> Cinder people please review ^
15:41:14 <vbel> thanks!
15:41:34 <enriquetaso> looks like eharney reviewed it a while ago and the code was updated :)
15:41:48 <enriquetaso> OK
15:41:53 <enriquetaso> we've run out of time!!
15:41:58 <enriquetaso> #endmeeting