14:00:05 <rosmaita> #startmeeting cinder
14:00:05 <opendevmeet> Meeting started Wed Jun 16 14:00:05 2021 UTC and is due to finish in 60 minutes. The chair is rosmaita. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:05 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:05 <opendevmeet> The meeting name has been set to 'cinder'
14:00:12 <sfernand> hi
14:00:16 <rosmaita> #topic roll call
14:00:32 <rosmaita> (#topic doesn't work, but i will use it anyway)
14:00:41 <eharney> hi
14:00:43 <walshh_> hi
14:00:54 <e0ne> hi
14:01:03 <enriquetaso> hi
14:01:28 <rosmaita> #link https://etherpad.opendev.org/p/cinder-xena-meetings
14:01:36 <rosmaita> ok, that doesn't work either
14:01:51 <tbarron> hi
14:02:31 <rosmaita> a lot on the agenda today, so let's get started
14:02:40 <rosmaita> #topic announcements
14:02:54 <rosmaita> cinder-tempest-plugin-lvm-lio-barbican job is failing because sqlalchemy 1.4 broke barbican's alembic migration
14:03:03 <rosmaita> this is now fixed, courtesy of geguileo
14:03:11 <rosmaita> https://review.opendev.org/c/openstack/barbican/+/796284/
14:03:34 <rosmaita> barbican isn't currently included in the requirements check job
14:03:44 <rosmaita> https://review.opendev.org/c/openstack/requirements/+/796647
14:04:04 <rosmaita> ^^ proposes a barbican crosscheck job
14:04:37 <rosmaita> are there any other projects we depend on that should be checked?
14:04:39 <jungleboyj> o/
14:04:45 * jungleboyj sneaks in late
14:04:57 <geguileo> rosmaita: I don't think that would have detected this issue
14:05:16 <geguileo> rosmaita: isn't that the unit tests?
14:05:18 <rosmaita> geguileo: it would have caught the ut failures you fixed
14:05:27 <rosmaita> not the migration, though
14:05:37 <geguileo> rosmaita: which was the one that blocked our gate
14:06:14 <geguileo> they need something like our cinder/tests/unit/db/test_migrations.py
14:07:10 <rosmaita> we can suggest that, but the barbican team seems to be understaffed these days
14:08:21 <geguileo> good to know :-(
14:08:34 <rosmaita> also, i noticed that the nova functional tests are run
14:08:52 <rosmaita> could add ours, not sure how much that would help detect problems
14:09:59 <rosmaita> we should keep this in mind for the next time sqlalchemy is updated to 1.5 or 2.0
14:10:19 <rosmaita> next item: vulnerability:managed tag accepted for os-brick
14:10:37 <rosmaita> which doesn't really change anything because we all thought it was already managed
14:10:51 <rosmaita> next item: request from jungleboyj
14:11:00 <rosmaita> Help me vote for the Y release name: https://twitter.com/jungleboyj/status/1404464680349929474
14:11:18 <rosmaita> jungleboyj: when is the deadline for that?
14:11:26 <jungleboyj> Yes. :-) Just a note that I have a naming poll out there for the Y release. Have had good participation.
14:11:44 <jungleboyj> I need to submit my vote today, so, if you want to help me pick the name. Please vote.
14:12:03 <rosmaita> my personal favorite is "You"
14:12:11 <rosmaita> so that no one will know what release you are talking about
14:12:24 * jungleboyj isn't surprised
14:12:26 <rosmaita> next item: reminder about festival of reviews on Friday
14:12:34 <eharney> Yoghurt but no Yogurt? i dunno...
14:12:35 <jungleboyj> Enter the Chaos Monkey
14:12:37 <rosmaita> info here: http://lists.openstack.org/pipermail/openstack-discuss/2021-June/023100.html
14:13:05 <rosmaita> friday is a holiday in some locations, but not enough to reschedule
14:13:16 <rosmaita> at least that's my impression?
14:14:06 <rosmaita> hearing nothing to the contrary, next item:
14:14:17 <rosmaita> cinder-coresec: need comments on https://bugs.launchpad.net/bugs/1929223 before 23:59 UTC on Friday
14:14:43 <rosmaita> so please comment at your earliest convenience
14:14:58 <rosmaita> finally, reminder that the spec freeze is next friday
14:15:28 <rosmaita> so we need to review specs in a responsive manner, that is, right away
14:15:38 <rosmaita> #link https://review.opendev.org/q/project:openstack%252Fcinder-specs+status:open
14:16:07 <rosmaita> that's it for announcements, unless someone else has something to share?
14:16:52 <rosmaita> ok, moving on
14:17:02 <rosmaita> #topic Two cinder patches blocking glance feature
14:17:07 <rosmaita> whoami-rajat: that's you
14:18:17 <rosmaita> not sure whoami-rajat is around
14:18:23 <rosmaita> but he left enough info in the agenda
14:18:35 <rosmaita> he's been working on hardening the glance_store cinder driver
14:18:50 <rosmaita> and found a cinder issue that needs to be addressed
14:18:59 <rosmaita> #link https://review.opendev.org/c/openstack/cinder/+/783389
14:19:18 <rosmaita> i think ^^ is fine and corrects a mistake when the validation schema stuff was added to cinder
14:19:23 <rosmaita> see my comment on the patch
14:19:42 <rosmaita> the other one is a cinderclient change
14:19:46 <rosmaita> #link https://review.opendev.org/c/openstack/python-cinderclient/+/783628
14:20:26 <rosmaita> i think that one addresses a similar problem in the cinderclient, that is, it is requiring an optional parameter
14:21:02 <rosmaita> anyway, please review so rajat can get that glance_store patch out of his life, which will allow him to concentrate on cinder
14:21:07 <eharney> makes sense
14:21:28 <rosmaita> ok, next topic is a big one
14:21:40 <rosmaita> #topic some concerns about the frequency of Cinder failures in the gate
14:21:51 <rosmaita> jungleboyj is getting pressure from other members of the TC
14:21:59 <rosmaita> and in turn, is passing some pressure onto us
14:22:06 <jungleboyj> :-) Yes.
14:22:07 <eharney> is there anything going on here other than the known issues with LVM crashing?
14:22:30 <jungleboyj> Based on the discussion yesterday it appears that that is the likely cause for concern.
14:22:30 <rosmaita> eharney: it's hard to tell
14:22:32 <whoami-rajat> rosmaita: sorry was afk, thanks for covering it
14:22:37 <jungleboyj> But it is hard to tell.
14:22:53 <eharney> well it should be easy to quantify the LVM issues with elastic-recheck, has anyone tried that?
14:23:08 <rosmaita> eharney: "should be" and "no"
14:23:43 <enriquetaso> LVM issues is https://bugs.launchpad.net/cinder/+bug/1901783/ ?
14:24:12 <eharney> yes but it fails on more operations than just volume delete
14:24:19 <rosmaita> short-term, i would like to propose no more naked rechecks
14:24:29 <jungleboyj> rosmaita: ++
14:24:37 <rosmaita> for one thing, the first two items i looked at weren't cinder's fault
14:26:04 <jungleboyj> We should also work on getting the LVM crashes fixed.
14:26:16 <rosmaita> yes, i agree
14:26:18 <jungleboyj> And fix the barbican issues.
14:26:30 <eharney> we know how to work around them in a messy way, i think we don't have a way to properly fix them
14:26:39 <eharney> the barbican issues are already fixed AFAIK
14:26:49 <rosmaita> anyway, short term i propose that when you issue a recheck, do this:
14:26:58 <rosmaita> recheck <job> <failed test name>
14:27:08 <rosmaita> and maybe some info about the failure if relevant
14:27:15 <rosmaita> and you can add info here:
14:27:21 <rosmaita> https://etherpad.opendev.org/p/cinder-xena-ci-tracking
14:27:43 <rosmaita> but if someone has time to set up an elasticsearch query to automate this, that would be better
14:27:58 <rosmaita> but short term it would be good to get some quick data about what is going on
14:28:42 <rosmaita> i thought that most of the failures are in teardown, and not related to actual tests, but i am not sure whether that's true or not
14:29:53 <rosmaita> any questions about our short term data collection?
14:30:29 <whoami-rajat> from a personal standpoint, gates were passing consistently before this barbican issue and now also it's working, maybe failure is more often seen in other project gates
14:30:37 <jungleboyj> And the teardown failures are often due to volumes being left around due to something like the LVM crash.
14:30:53 <jungleboyj> whoami-rajat: That is a concern.
14:31:28 <rosmaita> well, i think it's mostly in tempest-integrated-storage
14:31:35 <rosmaita> so nova, glance, and cinder
14:33:06 <rosmaita> anyway, no more naked rechecks ... at least pretend you care
14:33:26 <rosmaita> and i guess, review the lvm patches
14:33:36 <eharney> i think we still need to write more lvm patches
14:33:52 <eharney> lvdisplay is not covered, not sure what else
14:36:00 <Guest2396> rosmaita: could we somehow keep that kind of info in the cinder wiki or something?
14:36:19 <rosmaita> Guest2396: which kind of info?
14:36:38 <rosmaita> btw, everyone keep an eye on https://review.opendev.org/c/openstack/cinder/+/772126 (it's in recheck now)
14:36:50 <geguileo> rosmaita: the etherpad link for ci tracking, and the recheck job failedtestname thingy
14:37:11 <eharney> 722126 doesn't fix the crash (which was the initial hope)
14:37:17 <rosmaita> geguileo: glad you asked, i think i want to put it into the channel topic
14:37:28 <rosmaita> and the wiki, that is a good idea
14:37:46 <geguileo> rosmaita: having the etherpads we are currently using in the wiki could be useful
14:37:59 <geguileo> (for those of us with fish memory)
14:38:01 <rosmaita> #action rosmaita check with opendev team about getting topic changed
14:38:19 <rosmaita> #action rosmaita add current etherpads to wiki
14:38:37 <eharney> 722126 needs a follow-up to retry on crash
14:38:40 <rosmaita> geguileo: there's also the spotlight links on the meeting agenda
14:39:58 <jungleboyj> rosmaita: We used to have that list in the Wiki. Needs to be updated.
14:40:03 <rosmaita> eharney: what's the best way to track these? use https://launchpad.net/bugs/1901783 or other bugs?
14:40:17 <rosmaita> jungleboyj: noted
14:40:40 <rosmaita> #action rosmaita publicize the ci-tracking effort in all available methods
14:41:03 <eharney> we should probably write a new bug for retrying all of the other lvm commands that can segfault that weren't covered by 1901783, since that bug already has backports spanning a few branches
14:41:37 <enriquetaso> i can help with that eharney
14:41:43 <rosmaita> ok, let's discuss that during the upcoming bug squad meeting
14:41:58 <enriquetaso> sure
14:41:58 <eharney> ok
14:42:20 <rosmaita> ok, sounds like we have some strategy to address our CI problems
14:42:47 <rosmaita> #topic Community goal proposal: Test with TLS (formerly SSL) by Default
14:42:51 <rosmaita> enriquetaso: that's you
14:43:04 <enriquetaso> Hello, just a quick question around TLS. As this is sort of a community-goal and I'm not familiar with TLS, I wonder if enabling TLS is a problem for us?
14:43:16 <enriquetaso> Or should we prioritize this XS review? maybe for our XS meeting next friday.
14:43:28 <rosmaita> well, the review will not pass
14:43:33 * enriquetaso reading brian's comments on the etherpad
14:43:44 <rosmaita> yeah, i looked into this a bit yesterday
14:44:43 <rosmaita> not sure how big a deal it is if CI can't reliably use TLS for this one test job
14:45:00 <rosmaita> my impression is that there's something going on in the botocore library
14:45:10 <rosmaita> since 2014
14:45:13 <enriquetaso> oops
14:45:16 <enriquetaso> OK
14:45:18 <tosky> I see a reference to an aws-cli issue, but is it relevant for that different S3 implementation?
14:45:37 <rosmaita> tosky: i don't know, you wouldn't think so
14:45:47 <rosmaita> but i think the cli also uses botocore
14:46:25 <rosmaita> it's weird that we would also see it in a fake s3 implementation
14:46:34 <tosky> alternative, not fake :)
14:47:07 <rosmaita> well, apparently it is so close to the original that it is causing the same problem
14:47:14 <rosmaita> :)
14:48:12 <rosmaita> i guess at this point, i can leave a comment on ricolin's etherpad about this and send a note to the ML for anyone interested in the s3 backup service to please take a look?
14:48:23 <enriquetaso> +1
14:48:31 <rosmaita> i guess file a bug as well if there isn't one already for this
14:48:56 <enriquetaso> #action(enriquetaso): reply to ricolin's and file a bug
14:49:08 <rosmaita> enriquetaso: thanks!
14:49:23 <rosmaita> ok, next topic
14:49:36 <rosmaita> #topic Finishing up snapshotting in-use volumes spec
14:49:39 <rosmaita> eharney: that's you
14:50:10 <eharney> geguileo found one complication i needed to cover here, wanted to make sure people were generally on-board as we get close to the specs deadline
14:50:40 <eharney> my initial spec left out what should happen for people who are, for whatever reason, passing force=False as a parameter to snapshot create
14:51:05 <eharney> since i'm not sure those people exist, i'm leaning toward just not allowing that going forward, as part of this change
14:51:31 <geguileo> eharney: not allowing the force parameter in general, or just the force=false?
14:51:51 <eharney> hrm
14:52:34 <eharney> the spec at the moment says the latter, but you have me wondering if we should do the former now
14:53:50 <rosmaita> #link https://review.opendev.org/c/openstack/cinder-specs/+/781914
14:54:04 <eharney> i outlined some of the options (apparently not including that one) in comments on the previous patchset version
14:54:06 <eharney> rosmaita: right, thanks
14:55:06 <rosmaita> what's the use case for force=false ? just to remind yourself that the volume is attached?
14:55:20 <eharney> i'm not sure there is a very good use case for it
14:55:27 <eharney> it's more just that you could do it..
14:56:30 <eharney> geguileo: thoughts?
14:57:00 <geguileo> rosmaita: the case is where you have code and you are unconditionally passing the parameter, but use a variable to pass it to true or false
14:57:09 <geguileo> eharney: I like the removal of the force parameter
14:57:26 <geguileo> but failing on force=false is probably less code
14:57:37 <eharney> the downside of just removing it altogether is that it's more of a hurdle for people who had just added force=True (which would be the common case)
14:57:53 <geguileo> yeah, that's a big one
14:58:08 <geguileo> they may be using the highest microversion and just passing it like you say
14:58:20 <geguileo> that's a big reason for accepting it
14:58:34 <rosmaita> ok, we are about out of time ... let's discuss on the spec
15:00:06 <rosmaita> thanks everyone, join the bug squad meeting in #openstack-cinder
15:00:22 <rosmaita> #endmeeting
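
Editor's note on the last topic: the option eharney says the spec currently describes (keep accepting force=True, reject an explicit force=False) can be sketched roughly as below. This is only an illustration of the behaviour discussed in the meeting, not code from Cinder or from the spec. The names check_snapshot_force and InvalidInput are hypothetical, and the sketch ignores microversions entirely.

```python
from typing import Optional


class InvalidInput(Exception):
    """Hypothetical error type standing in for an HTTP 400 response."""


def check_snapshot_force(force: Optional[bool]) -> None:
    """Validate the 'force' value of a snapshot-create request.

    Assumed behaviour, taken from the meeting discussion only:
    - force omitted (None): allowed, in-use volumes can be snapshotted
    - force=True: still accepted so existing callers keep working
    - force=False: rejected, since it no longer has a meaning
    """
    if force is None or force is True:
        return
    raise InvalidInput("force=False is not supported for snapshot create")


# The caller geguileo describes: the parameter is always passed,
# and a variable decides whether it is True or False.
for user_choice in (True, False):
    try:
        check_snapshot_force(user_choice)
        print(f"force={user_choice}: accepted")
    except InvalidInput as exc:
        print(f"force={user_choice}: rejected ({exc})")
```

The alternative raised by geguileo, removing the parameter altogether, would reject both values in the loop above, which is the hurdle eharney points out for callers who had simply added force=True.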