14:00:05 #startmeeting cinder
14:00:05 Meeting started Wed Jun 16 14:00:05 2021 UTC and is due to finish in 60 minutes. The chair is rosmaita. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:05 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:05 The meeting name has been set to 'cinder'
14:00:12 hi
14:00:16 #topic roll call
14:00:32 (#topic doesn't work, but i will use it anyway)
14:00:41 hi
14:00:43 hi
14:00:54 hi
14:01:03 hi
14:01:28 #link https://etherpad.opendev.org/p/cinder-xena-meetings
14:01:36 ok, that doesn't work either
14:01:51 hi
14:02:31 a lot on the agenda today, so let's get started
14:02:40 #topic announcements
14:02:54 cinder-tempest-plugin-lvm-lio-barbican job is failing because sqlalchemy 1.4 broke barbican's alembic migration
14:03:03 this is now fixed, courtesy of geguileo
14:03:11 https://review.opendev.org/c/openstack/barbican/+/796284/
14:03:34 barbican isn't currently included in the requirements check job
14:03:44 https://review.opendev.org/c/openstack/requirements/+/796647
14:04:04 ^^ proposes a barbican crosscheck job
14:04:37 are there any other projects we depend on that should be checked?
14:04:39 o/
14:04:45 * jungleboyj sneaks in late
14:04:57 rosmaita: I don't think that would have detected this issue
14:05:16 rosmaita: isn't that the unit tests?
14:05:18 geguileo: it would have caught the ut failures you fixed
14:05:27 not the migration, though
14:05:37 rosmaita: which was the one that blocked our gate
14:06:14 they need something like our cinder/tests/unit/db/test_migrations.py
14:07:10 we can suggest that, but the barbican team seems to be understaffed these days
14:08:21 good to know :-(
14:08:34 also, i noticed that the nova functional tests are run
14:08:52 could add ours, not sure how much that would help detect problems
14:09:59 we should keep this in mind for the next time sqlalchemy is updated to 1.5 or 2.0
14:10:19 next item: vulnerability:managed tag accepted for os-brick
14:10:37 which doesn't really change anything because we all thought it was already managed
14:10:51 next item: request from jungleboyj
14:11:00 Help me vote for the Y release name: https://twitter.com/jungleboyj/status/1404464680349929474
14:11:18 jungleboyj: when is the deadline for that?
14:11:26 Yes. :-) Just a note that I have a naming poll out there for the Y release. Have had good participation.
14:11:44 I need to submit my vote today, so if you want to help me pick the name, please vote.
14:12:03 my personal favorite is "You"
14:12:11 so that no one will know what release you are talking about
14:12:24 * jungleboyj isn't surprised
14:12:26 next item: reminder about festival of reviews on Friday
14:12:34 Yoghurt but no Yogurt? i dunno...
14:12:35 Enter the Chaos Monkey
14:12:37 info here: http://lists.openstack.org/pipermail/openstack-discuss/2021-June/023100.html
14:13:05 friday is a holiday in some locations, but not enough to reschedule
14:13:16 at least that's my impression?
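[The migration-walk test suggested above (cinder/tests/unit/db/test_migrations.py applies every schema migration in sequence so a broken one fails in unit tests instead of blocking the gate) can be sketched roughly as follows. This is an illustrative simplification, not barbican's or cinder's actual code; the table and statements are made up.]

```python
# Illustrative sketch of a "walk the migrations" unit test: apply each
# schema migration, in order, to a throwaway in-memory SQLite database,
# so a migration broken by (say) a sqlalchemy upgrade fails here rather
# than in the integration gate. Table name and DDL are hypothetical.
import sqlite3

MIGRATIONS = [
    "CREATE TABLE secrets (id TEXT PRIMARY KEY, name TEXT)",
    "ALTER TABLE secrets ADD COLUMN status TEXT",
]

def walk_migrations(conn):
    for version, stmt in enumerate(MIGRATIONS, start=1):
        try:
            conn.execute(stmt)
        except sqlite3.OperationalError as exc:
            raise AssertionError(f"migration {version} failed: {exc}")

conn = sqlite3.connect(":memory:")
walk_migrations(conn)
# After walking all migrations, the final schema should be present.
cols = [row[1] for row in conn.execute("PRAGMA table_info(secrets)")]
print(cols)  # → ['id', 'name', 'status']
```

[In cinder the same idea is implemented with oslo.db's migration test mixins against the real migration scripts; the point is only that the walk runs in the unit test job.]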
14:14:06 hearing nothing to the contrary, next item:
14:14:17 cinder-coresec: need comments on https://bugs.launchpad.net/bugs/1929223 before 23:59 UTC on Friday
14:14:43 so please comment at your earliest convenience
14:14:58 finally, reminder that the spec freeze is next friday
14:15:28 so we need to review specs in a responsive manner, that is, right away
14:15:38 #link https://review.opendev.org/q/project:openstack%252Fcinder-specs+status:open
14:16:07 that's it for announcements, unless someone else has something to share?
14:16:52 ok, moving on
14:17:02 #topic Two cinder patches blocking glance feature
14:17:07 whoami-rajat: that's you
14:18:17 not sure whoami-rajat is around
14:18:23 but he left enough info in the agenda
14:18:35 he's been working on hardening the glance_store cinder driver
14:18:50 and found a cinder issue that needs to be addressed
14:18:59 #link https://review.opendev.org/c/openstack/cinder/+/783389
14:19:18 i think ^^ is fine and corrects a mistake when the validation schema stuff was added to cinder
14:19:23 see my comment on the patch
14:19:42 the other one is a cinderclient change
14:19:46 #link https://review.opendev.org/c/openstack/python-cinderclient/+/783628
14:20:26 i think that one addresses a similar problem in the cinderclient, that is, it is requiring an optional parameter
14:21:02 anyway, please review so rajat can get that glance_store patch out of his life, which will allow him to concentrate on cinder
14:21:07 makes sense
14:21:28 ok, next topic is a big one
14:21:40 #topic some concerns about the frequency of Cinder failures in the gate
14:21:51 jungleboyj is getting pressure from other members of the TC
14:21:59 and in turn, is passing some pressure onto us
14:22:06 :-) Yes.
14:22:07 is there anything going on here other than the known issues with LVM crashing?
14:22:30 Based on the discussion yesterday it appears that that is the likely cause for concern.
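[The cinderclient problem described above — a client treating an optional API parameter as required — follows a common pattern, sketched below with hypothetical names (this is not the actual cinderclient code). The fix is to give the parameter a default and omit it from the request body when the caller doesn't set it.]

```python
# Hypothetical sketch of the bug class behind the cinderclient fix: an
# API field that is optional server-side must not be a required Python
# argument, and should be left out of the request body entirely when
# the caller doesn't supply it.

def build_snapshot_request(volume_id, name=None, force=None):
    """Build the JSON body for a snapshot-create call."""
    body = {"volume_id": volume_id}
    if name is not None:
        body["name"] = name
    if force is not None:  # only send the optional field when explicitly set
        body["force"] = force
    return {"snapshot": body}

print(build_snapshot_request("vol-1"))
print(build_snapshot_request("vol-1", force=True))
```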
14:22:30 eharney: it's hard to tell
14:22:32 rosmaita: sorry was afk, thanks for covering it
14:22:37 But it is hard to tell.
14:22:53 well it should be easy to quantify the LVM issues with elastic-recheck, has anyone tried that?
14:23:08 eharney: "should be" and "no"
14:23:43 LVM issues is https://bugs.launchpad.net/cinder/+bug/1901783/ ?
14:24:12 yes but it fails on more operations than just volume delete
14:24:19 short-term, i would like to propose no more naked rechecks
14:24:29 rosmaita: ++
14:24:37 for one thing, the first two items i looked at weren't cinder's fault
14:26:04 We should also work on getting the LVM crashes fixed.
14:26:16 yes, i agree
14:26:18 And fix the barbican issues.
14:26:30 we know how to work around them in a messy way, i think we don't have a way to properly fix them
14:26:39 the barbican issues are already fixed AFAIK
14:26:49 anyway, short term i propose that when you issue a recheck, do this:
14:26:58 recheck
14:27:08 and maybe some info about the failure if relevant
14:27:15 and you can add info here:
14:27:21 https://etherpad.opendev.org/p/cinder-xena-ci-tracking
14:27:43 but if someone has time to set up an elasticsearch query to automate this, that would be better
14:27:58 but short term it would be good to get some quick data about what is going on
14:28:42 i thought that most of the failures are in teardown, and not related to actual tests, but i am not sure whether that's true or not
14:29:53 any questions about our short term data collection?
14:30:29 from a personal standpoint, gates were passing consistently before this barbican issue and they're working now too; maybe the failure is seen more often in other projects' gates
14:30:37 And the teardown failures are often due to volumes being left around due to something like the LVM crash.
14:30:53 whoami-rajat: That is a concern.
14:31:28 well, i think it's mostly in tempest-integrated-storage
14:31:35 so nova, glance, and cinder
14:33:06 anyway, no more naked rechecks ...
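[The "messy workaround" mentioned above for the LVM crashes — retrying an LVM command when the process dies from a signal — can be sketched like this. It is illustrative only: cinder's real code runs these commands through oslo's processutils, and the command here is a stand-in, not an actual lvdisplay invocation.]

```python
# Illustrative sketch of retry-on-crash for an external command such as
# lvdisplay: a negative returncode from subprocess means the process was
# killed by a signal (e.g. a segfault), so retry a few times before
# giving up; a normal non-zero exit is returned to the caller as usual.
import subprocess
import time

def run_with_retry(cmd, attempts=3, delay=0.1):
    for attempt in range(1, attempts + 1):
        proc = subprocess.run(cmd, capture_output=True, text=True)
        if proc.returncode >= 0:      # normal exit (success or error)
            return proc
        # negative returncode: the process died from a signal
        if attempt < attempts:
            time.sleep(delay)
    raise RuntimeError(f"{cmd[0]} kept crashing (signal {-proc.returncode})")

# stand-in command so the sketch is runnable without LVM installed
result = run_with_retry(["echo", "lvdisplay-ok"])
print(result.stdout.strip())
```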
at least pretend you care
14:33:26 and i guess, review the lvm patches
14:33:36 i think we still need to write more lvm patches
14:33:52 lvdisplay is not covered, not sure what else
14:36:00 rosmaita: could we somehow keep that kind of info in the cinder wiki or something?
14:36:19 Guest2396: which kind of info?
14:36:38 btw, everyone keep an eye on https://review.opendev.org/c/openstack/cinder/+/772126 (it's in recheck now)
14:36:50 rosmaita: the etherpad link for ci tracking, and the recheck job failedtestname thingy
14:37:11 772126 doesn't fix the crash (which was the initial hope)
14:37:17 geguileo: glad you asked, i think i want to put it into the channel topic
14:37:28 and the wiki, that is a good idea
14:37:46 rosmaita: having the etherpads we are currently using in the wiki could be useful
14:37:59 (for those of us with fish memory)
14:38:01 #action rosmaita check with opendev team about getting topic changed
14:38:19 #action rosmaita add current etherpads to wiki
14:38:37 772126 needs a follow-up to retry on crash
14:38:40 geguileo: there's also the spotlight links on the meeting agenda
14:39:58 rosmaita: We used to have that list in the Wiki. Needs to be updated.
14:40:03 eharney: what's the best way to track these? use https://launchpad.net/bugs/1901783 or other bugs?
14:40:17 jungleboyj: noted
14:40:40 #action rosmaita publicize the ci-tracking effort in all available methods
14:41:03 we should probably write a new bug for retrying all of the other lvm commands that can segfault that weren't covered by 1901783, since that bug already has backports spanning a few branches
14:41:37 i can help with that eharney
14:41:43 ok, let's discuss that during the upcoming bug squad meeting
14:41:58 sure
14:41:58 ok
14:42:20 ok, sounds like we have some strategy to address our CI problems
14:42:47 #topic Community goal proposal: Test with TLS (formerly SSL) by Default
14:42:51 enriquetaso: that's you
14:43:04 Hello, just a quick question around TLS.
As this is sort of a community goal and I'm not familiar with TLS, I wonder if enabling TLS is a problem for us?
14:43:16 Or should we prioritize this XS review? maybe for our XS meeting next friday.
14:43:28 well, the review will not pass
14:43:33 * enriquetaso reading brian's comments on the etherpad
14:43:44 yeah, i looked into this a bit yesterday
14:44:43 not sure how big a deal it is if CI can't reliably use TLS for this one test job
14:45:00 my impression is that there's something going on in the botocore library
14:45:10 since 2014
14:45:13 oops
14:45:16 OK
14:45:18 I see a reference to an aws-cli issue, but is it relevant for that different S3 implementation?
14:45:37 tosky: i don't know, you wouldn't think so
14:45:47 but i think the cli also uses botocore
14:46:25 it's weird that we would also see it in a fake s3 implementation
14:46:34 alternative, not fake :)
14:47:07 well, apparently it is so close to the original that it is causing the same problem
14:47:14 :)
14:48:12 i guess at this point, i can leave a comment on ricolin's etherpad about this and send a note to the ML for anyone interested in the s3 backup service to please take a look?
14:48:23 +1
14:48:31 i guess file a bug as well if there isn't one already for this
14:48:56 #action enriquetaso: reply on ricolin's etherpad and file a bug
14:49:08 enriquetaso: thanks!
14:49:23 ok, next topic
14:49:36 #topic Finishing up snapshotting in-use volumes spec
14:49:39 eharney: that's you
14:50:10 geguileo found one complication i needed to cover here, wanted to make sure people were generally on board as we get close to the specs deadline
14:50:40 my initial spec left out what should happen for people who are, for whatever reason, passing force=False as a parameter to snapshot create
14:51:05 since i'm not sure those people exist, i'm leaning toward just not allowing that going forward, as part of this change
14:51:31 eharney: not allowing the force parameter in general, or just force=false?
14:51:51 hrm
14:52:34 the spec at the moment says the latter, but you have me wondering if we should do the former now
14:53:50 #link https://review.opendev.org/c/openstack/cinder-specs/+/781914
14:54:04 i outlined some of the options (apparently not including that one) in comments on the previous patchset
14:54:06 rosmaita: right, thanks
14:55:06 what's the use case for force=false? just to remind yourself that the volume is attached?
14:55:20 i'm not sure there is a very good use case for it
14:55:27 it's more just that you could do it..
14:56:30 geguileo: thoughts?
14:57:00 rosmaita: the case is where you have code that unconditionally passes the parameter, but uses a variable to set it to true or false
14:57:09 eharney: I like the removal of the force parameter
14:57:26 but failing on force=false is probably less code
14:57:37 the downside of removing it altogether is that it's more of a hurdle for people who had just added force=True (which would be the common case)
14:57:53 yeah, that's a big one
14:58:08 they may be using the highest microversion and just passing it like you say
14:58:20 that's a big reason for accepting it
14:58:34 ok, we are about out of time ... let's discuss on the spec
15:00:06 thanks everyone, join the bug squad meeting in #openstack-cinder
15:00:22 #endmeeting
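[The two options weighed at the end of the meeting — reject only an explicit force=False, or drop the force parameter entirely — could look roughly like this at the request-validation layer. This is a hypothetical sketch of the trade-off, not the spec's actual proposal or cinder's schema code.]

```python
# Hypothetical sketch of the two options for snapshot-create's 'force'
# parameter. Option A keeps 'force' but rejects an explicit force=False
# (least disruptive for callers already sending force=True); option B
# refuses the parameter entirely (cleaner API, but breaks those callers).

class BadRequest(Exception):
    pass

def validate_option_a(body):
    # force=True, or force omitted, is fine; force=False becomes an error
    if body.get("force") is False:
        raise BadRequest("force=False is no longer supported")

def validate_option_b(body):
    # simpler rule, but a hurdle for code that unconditionally passes force
    if "force" in body:
        raise BadRequest("the 'force' parameter has been removed")

validate_option_a({"volume_id": "vol-1", "force": True})  # accepted
try:
    validate_option_a({"volume_id": "vol-1", "force": False})
except BadRequest as exc:
    print(exc)
```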