15:00:26 <tbarron> #startmeeting manila
15:00:27 <openstack> Meeting started Thu Jan 31 15:00:26 2019 UTC and is due to finish in 60 minutes.  The chair is tbarron. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:28 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:30 <openstack> The meeting name has been set to 'manila'
15:00:33 <bswartz> .o/
15:00:38 <tbarron> bswartz: hi
15:00:42 <tbarron> ping ganso
15:00:47 <tbarron> ping gouthamr
15:00:50 <tbarron> ping amito
15:00:58 <amito> hey o/
15:01:12 <carthaca> hi
15:01:16 <tbarron> zhongjun isn't online but recently did a review :)
15:01:26 <tbarron> hi amito carthaca
15:01:30 <tbarron> ping vkmc
15:01:48 <tbarron> ping toabctl
15:01:58 <toabctl> hey
15:01:59 <tbarron> ping erlon
15:02:00 <ganso> hello
15:02:23 <tbarron> ping tpsilva (wonder if he's still doing manila ...)
15:02:47 <tbarron> that's the ping list
15:02:57 <tbarron> i'll give it a couple minutes ...
15:02:59 <ganso> tbarron: he doesn't work for NetApp anymore
15:03:10 * bswartz hands gouthamr a coffee
15:03:21 <tbarron> ganso: that's what I heard, but w'd like canonical folks here too!
15:03:44 * tbarron is certainly wearing his upstream hat today
15:03:56 <ganso> lol
15:04:08 <tbarron> toabctl: hi!
15:04:15 <gouthamr> o/
15:04:20 <gouthamr> ty bswartz :)
15:04:41 <tbarron> ping xyang
15:05:26 <tbarron> OK let's go
15:05:29 <tbarron> Hi All!
15:05:54 <tbarron> #link https://wiki.openstack.org/wiki/Manila/Meetings
15:06:06 <tbarron> #topic announcements
15:06:43 <tbarron> cinder meetup is next week in RTP area of North Carolina
15:07:00 <tbarron> if you are around and can attend talk to jungleboyj
15:07:21 <tbarron> I have a bunch of conflicts but may do some of it
15:07:41 <tbarron> the manila zorillas always consider WWCD
15:07:43 <jungleboyj> :-)
15:07:46 <tbarron> what would cinder do
15:07:59 <tbarron> even if we like to do it better :)
15:08:07 <jungleboyj> If people are interested we would be happy to have you attend.
15:08:13 <tbarron> That's all I have for announcements.
15:08:20 <tbarron> jungleboyj: thanks
15:08:43 <tbarron> Anyone else have announcements?
15:09:40 <tbarron> #topic share shrinking issue - gouthamr you are up.
15:10:05 <gouthamr> o/ ty tbarron, this was a review question
15:10:11 <jungleboyj> tbarron:  Hope you can make it for some of the meetings.  :-)  Or post meeting fun.
15:10:12 <gouthamr> #LINK LINK: https://review.openstack.org/#/c/632615/13/manila/share/drivers/dell_emc/plugins/unity/client.py@309
15:10:25 <gouthamr> #LINK: https://review.openstack.org/#/c/632615/13/manila/share/drivers/dell_emc/plugins/unity/client.py@309
15:11:01 <bswartz> I would have expected a shrink to fail in this case
15:11:48 <gouthamr> bswartz: ack, that's what i'm familiar with too
15:11:58 <tbarron> gouthamr: do they not have the ability to detect this condition?
15:12:11 <bswartz> Unless we're able to shrink the share all the way down to the desired size, I would expect failure and no resize at all
15:12:13 <tbarron> gouthamr: in advance of modifying the share ...
15:12:22 <ganso> it should fail
15:12:22 <gouthamr> tbarron: per the response, seems like no..
15:12:31 <ganso> because the manila DB will believe the share is a 2GB share
15:12:34 <ganso> while it is not
15:12:48 <bswartz> Then the user can retry again for a shrink to a larger size
15:13:52 <tbarron> bswartz: I agree that should be the requirement
15:13:59 <bswartz> so gouthamr: what is the specific issue we need to decide here?
15:14:13 <gouthamr> that's it, actually - i was wondering what we'd do here
15:14:43 <tbarron> gouthamr: so they can do the shrink, detect that it was partial, do an implicit expand back to the original size, and fail  the shrink :)
15:14:45 <gouthamr> because the committer says they can't detect the usage and fail the operation
15:15:06 <gouthamr> tbarron: that's one option in case they can't predetermine what the usage on the share is
15:15:42 <tbarron> gouthamr: I think you should explain the semantics manila expects and propose that if they can't do anything better.
15:15:53 <gouthamr> ack
15:15:54 <tbarron> it's kinda like a db transaction rollback.
15:16:10 <tbarron> back ends with limitations have to do more work
15:16:14 <gouthamr> on those lines, however, how does one recover from that error condition they've put themselves in?
15:16:21 <gouthamr> "STATUS_SHRINKING_POSSIBLE_DATA_LOSS_ERROR"
15:16:35 * gouthamr is sorry for the caps - copy pasta
15:16:46 <tbarron> why are they in that condition?  I don't understand.
15:17:05 <tbarron> They are shrinking ... not done yet in manila
15:17:22 <gouthamr> tbarron: when a user tries to shrink a share beyond the usage size, the 'status' is set to that ^
15:17:22 <tbarron> in their back end they shrunk to 3gb instead of 2
15:18:13 <tbarron> their usage is 3gb
15:18:33 <tbarron> the driver brings the share back to 4gb on their back end
15:18:34 <gouthamr> afaict, the only ways seem to be: try to shrink again to a valid size, or contact a privileged user (admin) to reset the status
15:18:39 <tbarron> and failis the shrink
15:19:22 <gouthamr> tbarron: yes, the driver failing with that error, puts the share in "share_shrinking_possible_data_loss" status
15:19:40 <tbarron> gouthamr their driver shouldn't fail with that error
15:19:52 <tbarron> oh
15:19:56 <tbarron> i see now
15:20:21 <tbarron> why do we say possible data loss just b/c the shrink failed
15:20:39 <ganso> gouthamr: so you think that, if they were able to detect and by raising a proper exception, manila should set the share status back to available instead of shrink error, correct?
15:21:20 <tbarron> if a driver can know it can't do an op b/c it's unsafe and refuses it seems like there should be a different status
15:21:26 <ganso> tbarron: because the driver raises the exception and we currently (AFAIK) have no way to tell if the shrink has been attempted and failed, or if the driver just detected that it cannot do.
15:21:41 <tbarron> ganso: yeah I understand now
15:21:54 <tbarron> seems like we have a limitation then.
15:21:55 <ganso> tbarron: we should handle different exceptions for this method
15:22:03 <ganso> tbarron: but we aren't
15:22:30 <gouthamr> ganso: yeah, would that be weird? we can set a user-message
15:22:34 <ganso> tbarron: I wouldn't say it is a limitation, we need a code improvement
15:22:45 <tbarron> so they should also propose some core work for that code improvement
15:22:56 <ganso> gouthamr: but we need a code improvement to avoid the shrink error status.
15:23:05 <ganso> gouthamr: user message is not enough
15:23:09 <tbarron> ganso: one person's limitation is another's enhancement opportunity
15:23:37 <ganso> tbarron: =)
15:24:17 <ganso> either way, I think we have several drivers that fall into the same category
15:24:21 <gouthamr> ganso: why though? i know it is terribly weird if something appears to be happening, and then randomly goes back to "available"
15:24:38 <tbarron> gouthamr: ganso: today the user will have to go to an admin to get it fixed but they could do a little bit of work and add exception handling for a diffrent exception and it would be  better
15:25:02 <gouthamr> ganso: but the alternatives to get yourselves out of that situation is 1) Shrink the share again to a valid size (which isn't too obvious)
15:25:05 <gouthamr> 2) go to the admin
15:25:07 <ganso> tbarron: gouthamr thinks the lack of status change is wierd
15:25:32 <ganso> I partially agree with that
15:25:43 <ganso> both solutions have advantages and disadvantages
15:26:00 <gouthamr> yeah, because you can't do anything else in manila with a share that's actually not in an error state if the driver bailed out correctly
15:26:05 <ganso> going back to available and triggering a user message sounds good to me
15:26:06 <tbarron> what's weird is that every kind of error leaves the user needing to go to an admin
15:26:32 <ganso> tbarron: +1
15:26:49 <tbarron> if a back end is smart enough to block an operation and leavae hte user safe we ought to be able to handle that
15:27:08 <tbarron> gouthamr: but we're moving into a bigger issue than thhis particular review
15:27:17 <gouthamr> tbarron: yes we are
15:27:21 * gouthamr its a trap
15:27:27 <tbarron> let's have them do the right thing given our current framework
15:27:33 <ganso> gouthamr you brought a can of worms
15:27:54 <tbarron> and I'll put the can of worms on next agenda
15:28:03 <tbarron> next week's agenda
15:28:20 <tbarron> we can all think about it and have headaches in the mean time
15:28:49 <tbarron> gouthamr: you ok with that?
15:28:54 <gouthamr> haha, works for me - to the current question on the review on what the Unity driver should do, i like the solution and will let them know
15:29:07 <tbarron> #topic pylint again
15:29:29 <tbarron> ganso brought up all the spurious pylint failures last week
15:29:41 <tbarron> but I think we can make some headway
15:29:51 <tbarron> as an experiment:
15:30:07 <tbarron> #link https://review.openstack.org/#/c/633576/
15:30:20 <tbarron> touches every file, so they all get examined
15:30:34 <tbarron> and filters out the test files, alembic migrations, etc
15:30:40 <tbarron> which are full of noise
15:30:59 <tbarron> the report is small enough that one can find genuine issues
15:31:03 <tbarron> and fix them
15:31:31 <tbarron> and some issues that should be handled with # pylint ignore comments
15:31:53 <tbarron> so I propose that we leave this job to go red for a while
15:32:06 <tbarron> but make progress on cleaning up the codebase
15:32:21 <tbarron> That means reviewers should look at it when it's red.
15:32:35 <tbarron> Only the files touched by the review are examined.
15:32:50 <gouthamr> tbarron: did things change significantly in pylint 1.9.0, or should we bring back the ignores we had in the old tools/lintstack.py
15:32:53 <tbarron> Try to clean up the non test code files in the course of a review
15:32:59 <tbarron> It can be a followup.
15:33:45 <tbarron> gouthamr: I don't really know but I think the main thing was not so much the ignores as that
15:34:06 <tbarron> the old job used to work as a ratchet, only looking for newly introduced issues
15:34:13 <ganso> tbarron: do you plan to merge that "skip workaround"?
15:34:26 <tbarron> ganso: not until it's more refined
15:35:09 <bswartz> So these pylint warning are legit?
15:35:28 <tbarron> bswartz: enough of them that they're worth looking at
15:35:40 <bswartz> Any new contributors looking for low hanging fruit?
15:35:45 <tbarron> I've done two reviews fixing stuff already
15:36:03 <tbarron> Sometimes a fix is just to add pylint ignores, as in
15:36:24 <bswartz> Yeah but in those cases the warnings aren't legit
15:36:40 <bswartz> I was asking if it's finding things that we would actually want to fix, absent any lint checker
15:36:44 <tbarron> #link https://review.openstack.org/#/c/634210/
15:36:58 <tbarron> bswartz: we do want to fix duplicate methods in general
15:37:09 <tbarron> bswartz: just did  that in a driver review
15:37:21 <bswartz> Then why not globally disable E0102?
15:37:25 <tbarron> bswartz: the author didn't know he had a method by the same name
15:37:34 <gouthamr> tbarron: afair we used to disable E0102
15:37:36 <bswartz> Oh I got it backwards
15:37:39 <bswartz> I see
15:37:41 <tbarron> bswartz: b/c usually you want to know and remove the method
15:37:56 <tbarron> bswartz: but in api code with microversion decorators it's ok
15:38:19 <gouthamr> oh, can we disable it in a subdirectory?
15:38:23 <tbarron> bswartz: and only two # pylint ignore comments were missing, we usually do it
15:38:43 <tbarron> gouthamr: well even in a file you only want to ignore sometimes
15:39:01 <tbarron> gouthamr: but to your question,  yes for pylint but no for our script
15:39:46 <tbarron> gouthamr: our script picks individual files and runs pylint against them in a way that the .pylintrc directives to skip certain dirs don't have effect
15:39:53 <tbarron> at least in my experimentation
15:40:09 <tbarron> that's why I'm modifying the script to ignore subdirs
15:40:25 <tbarron> we could give the script an ignore-list
15:40:43 <gouthamr> yes, that'd be helpful
15:40:44 <carthaca> maybe it it possible to let the microversion decorator do the job to pylint ignore?
15:40:47 <tbarron> like for the alembic stuff and maybe the wsgi stuff
15:41:00 <tbarron> carthaca: be my guest :)
15:41:24 <tbarron> I'm hearing lots of good optimization ideas, feel free to propose improvements!
15:41:55 <tbarron> My point wasn't to declare a policy but rather to give you folks a heads up and engage interest and efforts in cleanup.
15:42:13 <tbarron> bswartz point that a lot of these are low-hanging fruit is a good one.
15:42:16 <tbarron> ok
15:42:38 <tbarron> #topic skip  third party jobs on WIP patches??
15:42:57 <tbarron> Now that our job queue definitions are in tree it's easy
15:43:26 <tbarron> to modify .zuul.yaml in WIP patches to run only the jobs relevant to your patch
15:43:51 <bswartz> Are you talking about 3rd party CI?
15:43:52 <tbarron> You're working on the lvm driver, just run the lvm job in the check queue until
15:44:01 <tbarron> you're ready to merge.
15:44:04 <bswartz> Or the upstream CI?
15:44:08 <tbarron> save resources, run faster.
15:44:24 <tbarron> bswartz: so far laying groundwork, upstream so far.
15:44:35 <bswartz> Could the approach be used by 3rd party CIs too?
15:44:44 <tbarron> bswartz: bingo
15:45:08 <bswartz> What exactly do I do to .zuul.yaml if I have a WIP patch?
15:45:10 <tbarron> bswartz: and if a third party job sees WIP status on a patch it would skip
15:45:24 <gouthamr> bswartz: yank out all the jobs except the one you care about :)
15:45:25 <tbarron> unless there's an explicit run-netappm etc
15:45:45 <bswartz> Seems like a good way to economize on compute resources
15:45:50 <ganso> tbarron: I'm confused. The approach I saw in your pylint patch skips 1st party jobs, not 3rd party
15:46:10 <gouthamr> bswartz: like this:
15:46:12 <gouthamr> #LINK https://review.openstack.org/#/c/545695/28/.zuul.yaml
15:46:20 <tbarron> ganso: cause I don't have a way to skip third pary, that's why I brought this up
15:46:40 <tbarron> ganso: I think your CI has to check for something and decide to skip
15:46:42 <bswartz> gouthamr: that doesn't help with skipping 3rd party CI
15:47:02 <tbarron> ganso: gouthamr's idea is that we also mark the review WIP
15:47:06 <gouthamr> doesn't, thought your question pertained to our jobs
15:47:20 <tbarron> ganso: bswartz: your CI would then skp
15:47:22 <tbarron> skip
15:47:32 <tbarron> the code poster would do two things:
15:47:39 <ganso> tbarron: yep, but all the effort to skip 3rd party jobs would be on 3rd parties side, I believe there is nothing we can do right now to achieve that without changes in 3rd parties' sidde
15:47:40 <tbarron> 1) modify zuul.yaml
15:47:50 <tbarron> 2) mark the review WIP
15:48:02 <tbarron> ganso: that's what I'm saying also
15:48:06 <bswartz> You mean just the word "WIP" in the commit message should skip the jobs?
15:48:18 <bswartz> Or workflow -1?
15:48:21 <ganso> why not the workflow -1?
15:48:23 <gouthamr> we could use Workflow -1
15:48:23 <ganso> bswartz: +1
15:48:36 <tbarron> bswartz: ganso: gouthamr: that's better
15:48:45 <gouthamr> that's easier to detect/parse in the older zuul that most third party CIs are running
15:48:55 <gouthamr> just a simple change to their config
15:49:06 <bswartz> The problem is you can't workflow -1 until after the review is live
15:49:13 <bswartz> So there's a race condition
15:49:17 <gouthamr> bswartz: not really
15:49:29 <bswartz> I guess as long as 3rd party CIs aren't too fast it doesn't matter
15:49:36 <gouthamr> bswartz: we encourage third party CIs not to run until upstream CI has voted
15:49:49 <bswartz> But then it's a totally different approach to modifying .zuul.yaml to disable first party jobs
15:49:55 <ganso> bswartz: but most 3rd party CIs have their jobs triggered only when Zuul +1's
15:50:21 <tbarron> bswartz: yes, it would be nice to have an elegant central solution but I don't have one in hand
15:51:34 <tbarron> ganso: do you want to adjust netapp to not run if Workflow -1?
15:51:35 <bswartz> Okay, 2 different approaches are fine
15:52:02 <ganso> tbarron: when I have some time to, sure
15:52:18 * tbarron sends ganso some of gouthamr's free time
15:52:41 * ganso receives plenty of free time!
15:52:48 <tbarron> I'm guessing this is a small change for each third party CI that will actually
15:52:53 <gouthamr> free, what's that?
15:52:57 <tbarron> save its maintainers time.
15:53:27 <tbarron> #topic bugs
15:53:36 <tbarron> jgrosso is on PTO but
15:53:49 <tbarron> gouthamr has been working with him in his free time
15:53:52 <gouthamr> it's just as easy with zuulv3: https://zuul-ci.org/docs/zuul/user/config.html#pipeline
15:54:12 <tbarron> and jgrosso has a spreadsheet of bugs in the works
15:54:21 <gouthamr> #zuulIsCool
15:54:43 <tbarron> he's going to email cores and other active reviewers soon
15:55:05 <tbarron> and we'll post the spreadsheet and a proposed bug backlog reduction plan
15:55:21 <tbarron> a lot of it will be agreeing to some cleanup and prioiritization
15:55:31 <tbarron> can't spell ^^^
15:56:00 <tbarron> So don't put mail from jgrosso in your spam folder please :)
15:56:20 <tbarron> gouthamr: anyting you have on $topic?
15:56:37 <gouthamr> tbarron: nope, that's mostly it - i'm excited to get this work started
15:56:46 <tbarron> #topic open discussion
15:56:48 <tbarron> :)
15:56:53 * gouthamr jgrosso is clearly not as lazy as him
15:57:09 <tbarron> he's quite organized
15:57:18 <gouthamr> actually, i suggested he mail openstack-discuss
15:57:26 <tbarron> gouthamr: agree
15:58:12 <tbarron> ok, seems like we're through for today.
15:58:17 <tbarron> Thanks everyone!!
15:58:21 <bswartz> Perfectly on time
15:58:25 <tbarron> #endmeeting