14:00:04 <whoami-rajat> #startmeeting cinder
14:00:04 <opendevmeet> Meeting started Wed Jun 28 14:00:04 2023 UTC and is due to finish in 60 minutes. The chair is whoami-rajat. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:04 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:04 <opendevmeet> The meeting name has been set to 'cinder'
14:00:08 <whoami-rajat> #topic roll call
14:00:11 <happystacker> Hello all
14:00:22 <eharney> hi
14:00:38 <rosmaita> o/
14:00:40 <senrique> hi
14:00:54 <tosky> o/
14:01:18 <whoami-rajat> #link https://etherpad.opendev.org/p/cinder-bobcat-meetings
14:01:35 <whoami-rajat> guess who forgot this was the last wednesday of the month?
14:01:37 <Tony_Saad> 0/
14:01:38 * whoami-rajat facepalms
14:01:47 <MatheusAndrade[m]> o/
14:02:15 <jungleboyj> o/
14:02:34 <thiagoalvoravel> o/
14:02:57 <felipe_rodrigues> o/
14:03:48 <luizsantos[m]> o/
14:04:48 <whoami-rajat> good turnout today
14:04:51 <whoami-rajat> let's get started
14:04:54 <whoami-rajat> #topic announcements
14:05:06 <whoami-rajat> Milestone-2 (06 July)
14:05:17 <whoami-rajat> we have the Volume and target Driver freeze coming up next week
14:05:23 <whoami-rajat> #link https://etherpad.opendev.org/p/cinder-2023-2-bobcat-drivers
14:05:34 <whoami-rajat> I have created this etherpad to track the open drivers
14:05:54 <whoami-rajat> currently there are only 2, Yadro and Lustre
14:06:10 <whoami-rajat> if there are any other drivers which i missed, please add it to the list
14:06:22 <helenadantas[m]> o/
14:06:45 <whoami-rajat> others, please review the drivers
14:06:48 <caiquemello[m]> o/
14:07:06 <whoami-rajat> next, SQLAlchemy 2.0 resolution
14:07:12 <whoami-rajat> #link https://review.opendev.org/c/openstack/governance/+/887083
14:07:29 <whoami-rajat> JayF proposed a governance patch to provide details about migration to SQLAlchemy 2.0
14:07:54 <whoami-rajat> basically they are planning to bump the requirements to sqlalchemy2.0 early in 2024.1 cycle
14:08:05 <whoami-rajat> to get an idea of which projects are compatible with it and which are not
14:08:26 <whoami-rajat> it is also mentioned that projects need to move away from sqlalchemy-migrate and implement alembic
14:08:45 <whoami-rajat> that work has already been done in Cinder, thanks to Stephen
14:08:57 <whoami-rajat> hopefully the requirement bump to sqlalchemy 2.0 won't affect cinder but let's see
14:09:04 <whoami-rajat> we will have sufficient amount of time to fix the gate
14:09:23 <whoami-rajat> also the resolution isn't final yet so there might be changes but above points should be constant, as they have been for past few cycles
14:09:43 <whoami-rajat> next, EM Discussion
14:09:50 <whoami-rajat> there is no update on this
14:10:13 <whoami-rajat> was just checking the TC meeting discussion and I guess Kristi or JayF is going to start a ML thread for continuing discussion from PTG
14:10:32 <whoami-rajat> let's see if we get an update this week
14:10:47 <whoami-rajat> not sure if we want to hold on our EOL patches till then
14:10:55 <whoami-rajat> rosmaita, what are your thoughts? ^
14:11:24 <rosmaita> not sure, really
14:12:10 <rosmaita> maybe we could do one more announcement to give packagers a heads-up
14:12:26 <whoami-rajat> sounds good
14:12:30 <rosmaita> i think we can separate out the overall EM discussion from what's best for cinder right now
14:12:34 <whoami-rajat> do we want to use the old thread or create a new one?
14:12:55 <rosmaita> i think create a new one
14:12:57 <whoami-rajat> yeah true, not sure how long this discussion going to go
14:13:12 <rosmaita> so we can get something like "FINAL CALL FOR COMMENTS: " in the subject line
14:13:41 <whoami-rajat> sure, i will take an action item for this
14:13:49 <whoami-rajat> and good idea to separate it out from the EM discussion
14:14:14 <whoami-rajat> #action whoami-rajat to send a mail regarding reminder for cinder branches being EOL
14:14:30 <rosmaita> sounds good
14:14:39 <whoami-rajat> thanks!
14:14:54 <whoami-rajat> that's all the announcements we had for today
14:15:00 <whoami-rajat> anyone has anything else?
14:15:32 <whoami-rajat> rosmaita, i did do a summary of your summary of PTG, not sure if you got time to go through it but apart from that, anything from PTG you would like to highlight?
14:16:08 <rosmaita> i can't think of anything
14:16:22 <rosmaita> i hope the poor turnout for the PTG was because of bad timing, not lack of interest
14:16:38 <rosmaita> that is, we already had the 2023.2 PTG virtually a few months ago
14:17:05 <whoami-rajat> yeah, that could be the same reason why we weren't able to gather much topics for cinder sessions as well
14:17:07 <rosmaita> so maybe companies didn't think it was a priority to go to this one
14:18:14 <whoami-rajat> agreed, doesn't seem the project is losing interest, it was just an event at odd time
14:19:02 <whoami-rajat> ok, guess that's all for announcements
14:19:05 <whoami-rajat> let's move to topics
14:19:13 <whoami-rajat> #topic Bug: https://bugs.launchpad.net/cinder/+bug/2002535
14:19:24 <whoami-rajat> not sure who added it, there is no IRC nick
14:19:47 <happystacker> my bad it's me
14:20:30 <happystacker> so it seems that there is a conversion from qcow2 to raw during the creation from an image
14:20:31 <whoami-rajat> oh ok, please go ahead
14:20:53 <happystacker> and the metadata of the volume is not updated
14:20:59 <happystacker> so it remains qcow2
14:21:06 <happystacker> while the block dev is raw
14:21:34 <happystacker> as a result, when creating an instance the attachment shows qcow2 but it works fine
14:21:48 <happystacker> until you want to resize it where an error is thrown
14:21:57 <happystacker> saying that the format mismatch
14:22:21 <happystacker> So I have proposed a fix: https://review.opendev.org/c/openstack/cinder/+/881549
14:22:27 <whoami-rajat> are you using the generic nfs driver or vendor nfs?
14:22:52 <eharney> is that the right gerrit link?
14:23:13 <happystacker> I'm using powerstore but it happens in the nfs generic space
14:23:15 <happystacker> weird
14:23:40 <happystacker> https://review.opendev.org/c/openstack/cinder/+/887081
14:23:46 <happystacker> that's the right one
14:24:11 <senrique> I reproduced part of the problem of the launchpad bug..
14:24:33 <eharney> i think that patch has some problems but i'll review it more closely and comment there
14:24:44 <happystacker> sure, I'm still learning
14:24:53 <senrique> When creating a volume from a glance image.. cinder created a raw volume. This happens because the nfs driver first fetches the glance image glance and keeps using the same format.
14:25:01 <eharney> (it probably autodetects the wrong format when using raw images)
14:25:12 <senrique> fetch the image to raw*
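To make the mismatch described above concrete: the actual on-disk format can be compared against whatever format cinder has recorded for the volume. A minimal sketch using cinder's existing qemu_img_info helper; the expected_format parameter is illustrative (not an existing driver attribute), and direct access to the volume file on the NFS mount is assumed.

    # Minimal sketch, assuming Python access on a node that can see the NFS
    # mount. expected_format is illustrative: whatever format cinder has
    # recorded for the volume (e.g. 'qcow2' in the bug report above).
    from cinder.image import image_utils

    def volume_format_mismatch(volume_path, expected_format):
        """Return the actual on-disk format if it differs from the recorded one."""
        info = image_utils.qemu_img_info(volume_path)  # wraps `qemu-img info`
        if info.file_format != expected_format:
            # e.g. recorded as qcow2 while the image fetch converted the file to raw
            return info.file_format
        return None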
14:25:14 <whoami-rajat> happystacker, is this config option enabled in your deployment? nfs_qcow2_volumes
14:25:44 <whoami-rajat> and are you using a master deployment or an older version?
14:25:47 <happystacker> Yes, but it seems not to change anything
14:26:17 <happystacker> master deployment
14:27:52 <whoami-rajat> hmm, i think we have a problem but need to test it
14:28:06 <happystacker> let me know if I can help more
14:28:16 <whoami-rajat> we set the format during driver initialization to 'raw' https://github.com/openstack/cinder/blob/e673bcc368d3a24ec21713adbd83b4ab6cbcae18/cinder/volume/drivers/remotefs.py#L180
14:29:03 <whoami-rajat> but if we enable the qcow2 option, we switch the format to 'qcow2' https://github.com/openstack/cinder/blob/e673bcc368d3a24ec21713adbd83b4ab6cbcae18/cinder/volume/drivers/remotefs.py#L326
14:29:27 <whoami-rajat> but we will restart the service while doing the config change
14:29:30 <whoami-rajat> so i guess it should be fine
14:30:02 <whoami-rajat> anyway, as i said, need to be tested but it is possible that the format value is not assigned correctly in some cases
14:30:12 <senrique> whoami-rajat, in L530 `image_utils.fetch_to_raw`
14:30:54 <happystacker> problem happens here: https://github.com/openstack/cinder/blob/e673bcc368d3a24ec21713adbd83b4ab6cbcae18/cinder/volume/drivers/remotefs.py#L530
14:31:52 <happystacker> so maybe we just can add a validation step which prevents the conversion if the qcow2 option is enabled?
14:32:01 <whoami-rajat> i think we always convert the qcow2 image to raw when we want to create a bootable volume from image
14:32:23 <whoami-rajat> do we support writing a qcow2 image to a qcow2 volume?
14:32:26 <happystacker> As for now, there is a check which verifies that both format are the same, if it's not the case, it'll update the metadata
14:32:50 <happystacker> it's always converted yes
14:32:51 <eharney> we could support qcow2->qcow2, not sure if it tries to currently
14:33:25 <happystacker> so if the conversion happens, then the metadata should be updated as well right?
14:33:32 <whoami-rajat> i remember disabling qcow2->qcow2 in glance but not sure what was the reason
14:33:36 <eharney> it looks like the key is checking the format when copying the image and doing the right thing when the volume is supposed to be a qcow2 format
14:35:24 <whoami-rajat> yep we don't support it in glance cinder store, let me see if i can find the reason https://github.com/openstack/glance_store/blob/5a81f77bd48e46eac6ab0636f0f52dbceec4e8d3/glance_store/_drivers/cinder/nfs.py#L72
14:35:55 <eharney> because the glance store is expecting to deal with cinder volumes that are raw, it doesn't support nfs snaps either
14:36:04 <eharney> presumably because that restriction is much easier than adding support for all of that
14:36:10 <happystacker> ok
14:36:37 <senrique> I'm not sure if possible but maybe we replace fetch_to_raw with `fetch_to_volume_format` https://github.com/openstack/cinder/blob/e673bcc368d3a24ec21713adbd83b4ab6cbcae18/cinder/image/image_utils.py#L822
14:36:51 <happystacker> which means we need to keep the conversion
14:37:04 <happystacker> but update the metadata to reflect the change?
14:37:26 <happystacker> so that the attachment will be raw based but not qcow2
14:37:32 <whoami-rajat> https://github.com/openstack/glance_store/commit/85c7a06687291eba30510d63d3ee8b9e9cb33c5f
14:38:04 <whoami-rajat> looks like there is some problem with extend volume in case of qcow2->qcow2, as i see written in the commit message
14:38:25 <whoami-rajat> but don't remember anything else ...
14:40:01 <eharney> also, we need to support writing images to qcow2 volumes to support nfs encrypted volumes
14:40:01 <senrique> in that case.. i think updating the metadata looks good
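Sketching senrique's suggestion above: instead of unconditionally calling fetch_to_raw() in the remotefs image-copy path, the fetch could target the format the volume is supposed to have. This is only an illustration of the idea under discussion, not the patch under review; the driver parameter, target_format lookup, and abbreviated argument lists are assumptions.

    # Illustrative sketch only; `driver` stands in for the remotefs/NFS driver
    # instance and target_format for however the driver decides the volume's
    # format (raw by default, qcow2 when nfs_qcow2_volumes is enabled).
    from cinder.image import image_utils

    def fetch_image_for_volume(driver, context, volume, image_service,
                               image_id, target_format):
        dest = driver.local_path(volume)
        blocksize = driver.configuration.volume_dd_blocksize
        if target_format == 'qcow2':
            # keep a qcow2 volume in qcow2 instead of silently converting to raw
            image_utils.fetch_to_volume_format(
                context, image_service, image_id, dest, 'qcow2', blocksize,
                size=volume.size)
        else:
            image_utils.fetch_to_raw(
                context, image_service, image_id, dest, blocksize,
                size=volume.size)

Either way, happystacker's point still applies: if a conversion does happen, the recorded format needs to be updated to match what actually landed on the share.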
14:40:28 <senrique> yes, nfs encryption works with qcow2
14:40:51 <whoami-rajat> eharney, ack, i think your comments makes sense that we are lacking a lot of support around it so better to just block it
14:42:58 <whoami-rajat> senrique, i don't exactly know the reason why we call fetch to raw here but we do that in all other reference drivers like lvm, ceph etc
14:43:47 <whoami-rajat> but again they don't have to deal with qcow2 volumes
14:43:52 <whoami-rajat> not really sure
14:46:43 <whoami-rajat> let's continue this discussion in the next event, maybe good for midcycle -2 ? let's see
14:46:48 <whoami-rajat> we have another topic to discussi
14:46:51 <whoami-rajat> discuss
14:47:15 <whoami-rajat> #topic test_rebuild_volume_backed_server failing 100% on ceph job
14:47:19 <whoami-rajat> senrique, that's you
14:47:32 <senrique> CI was failing because of that
14:47:42 <senrique> #link https://bugs.launchpad.net/cinder/+bug/2025096
14:47:56 <senrique> but now the "test_rebuild_volume_backed_server" test is skipped for the Ceph job until the problem is fixed.
14:48:14 <senrique> #link https://review.opendev.org/c/openstack/devstack-plugin-ceph/+/887003
14:48:24 <senrique> that's all :)
14:48:28 <whoami-rajat> senrique, i analyzed the job results yesterday and found the issue
14:48:41 <whoami-rajat> let me see if it's still in my notes
14:48:50 <senrique> i think the nfs job is failling with the same issue tho
14:49:21 <whoami-rajat> currently we have a timeout of 20 seconds per GB on the nova side during reimage
14:49:38 <whoami-rajat> which we optimistically set for a good deployment
14:49:50 <whoami-rajat> but our gate jobs are pretty slow with IO and hence takes more time
14:50:02 <whoami-rajat> the cinder logs tell the truth about the time taken for the operation
14:50:05 <whoami-rajat> first log: 01:02:49.782746
14:50:11 <whoami-rajat> last log: 01:03:23.661182
14:50:24 <whoami-rajat> so it's around 34 seconds which is greater than the timeout
14:50:34 <whoami-rajat> hence we fail on the nova side
14:50:35 <whoami-rajat> Jun 26 01:03:11.721921 np0034441113 nova-compute[97944]: ERROR oslo_messaging.rpc.server eventlet.timeout.Timeout: 20 seconds
14:51:07 <whoami-rajat> I didn't know there was a bug report, i will add my analysis to it though dansmith has already proposed patches to fix it
14:51:13 <whoami-rajat> thanks for linking the bug senrique
14:51:36 <senrique> thanks whoami-rajat, good report
14:52:03 <rosmaita> that's a good catch whoami-rajat
14:52:21 <eharney> i'm currently looking into cinder-tempest-plugin-cbak-ceph failures as well, unclear if those are related
14:52:23 <senrique> ++
14:57:26 <whoami-rajat> sorry got busy updating the bug, will do it later
14:57:31 <whoami-rajat> let's move to open discussion for 3 minutes
14:57:34 <whoami-rajat> #topic open discussion
14:57:47 <whoami-rajat> there is a big list of review requests today
14:58:13 <whoami-rajat> please take a look when you get some time
14:58:25 <happystacker> thank you!
14:58:40 <drencrom> Hi all. I just would like to have some eyes on https://review.opendev.org/c/openstack/cinder/+/868485, its been a while. Thanks
15:00:15 <rosmaita> drencrom: ack
15:00:20 <whoami-rajat> we are out of time
15:00:24 <whoami-rajat> thanks everyone for attending
15:00:26 <whoami-rajat> #endmeeting
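A quick check of the reimage-timeout numbers quoted in the CI-failure topic above: the nova error shows a 20 second timeout, which at 20 seconds per GB is consistent with a 1 GB test volume (an inference, not stated in the log), while the cinder log timestamps show the operation took roughly 34 seconds, so nova gives up before cinder finishes.

    # Arithmetic from the log lines quoted above; the 1 GB volume size is
    # inferred from the 20 second timeout in the nova error (20 s/GB * 1 GB).
    from datetime import datetime

    first = datetime.strptime("01:02:49.782746", "%H:%M:%S.%f")
    last = datetime.strptime("01:03:23.661182", "%H:%M:%S.%f")
    elapsed = (last - first).total_seconds()
    timeout = 20 * 1  # 20 seconds per GB for a 1 GB volume
    print(round(elapsed, 1), elapsed > timeout)  # 33.9 True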