15:01:44 #startmeeting manila
15:01:44 Meeting started Thu Feb 27 15:01:44 2020 UTC and is due to finish in 60 minutes. The chair is gouthamr. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:01:45 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:01:47 The meeting name has been set to 'manila'
15:01:51 o/
15:01:53 hi
15:01:54 o/
15:01:56 hi
15:01:57 Hello
15:01:59 hey
15:02:00 Hi
15:02:33 hello o/
15:02:40 o/
15:02:40 Agenda: https://wiki.openstack.org/wiki/Manila/Meetings
15:03:02 courtesy ping: xyang toabctl ganso vkmc amito
15:03:29 let's begin with
15:03:34 #topic Announcements
15:03:55 #link https://releases.openstack.org/ussuri/schedule.html
15:04:16 We're two weeks away from manila's Feature Proposal Freeze
15:04:47 hey
15:04:51 we're expecting new features to be substantially complete: i.e., unit, functional and integration tests passing by this deadline
15:05:39 Feature freeze itself isn't for a month after that - but it gives us enough time for review, rebases and other code churn
15:06:00 please let me know if you anticipate any problems with respect to that..
15:06:24 no other announcements for the week
15:06:29 does anyone else have any?
15:07:38 #topic Goals for Victoria
15:07:56 the goal search email went out to the ML
15:08:23 if you're interested in taking a look:
15:08:29 #link http://lists.openstack.org/pipermail/openstack-discuss/2020-February/012396.html
15:08:33 #link https://etherpad.openstack.org/p/YVR-v-series-goals
15:09:13 we've known of the zuulv3 goal for a while now, but we should anticipate another to be picked up by the community
15:09:53 if you're interested in proposing one, please do
15:10:59 We're looking ahead to Victoria, and a gentle reminder that we published our planning etherpad for the PTG here:
15:11:01 #link https://etherpad.openstack.org/p/vancouver-ptg-manila-planning (Victoria PTG Planning Etherpad)
15:11:23 so we'll dive into whatever goals evolve in more detail during the PTG
15:12:07 next up
15:12:11 #topic Tracking our work
15:12:54 in terms of reviews needing attention
15:13:05 #link https://review.opendev.org/#/q/owner:tamm.maari%2540gmail.com+status:open
15:13:26 this is maaritamm's work with manilaclient/OSC ^
15:14:17 if we don't find substantial issues, we should aim to get these patches merged by feature proposal freeze
15:14:32 maaritamm's internship is coming to an end :(
15:14:44 :/
15:14:47 :(
15:14:54 :(
15:15:37 feels like yesterday that she started - i think she's done an incredible job ramping up and learning all the nuances of manilaclient
15:15:46 and of OSC
15:15:57 gouthamr ++
15:15:58 for sure
15:15:58 I will still stick around as much as I can though :)
15:17:05 that's commendable, thank you maaritamm :)
15:17:31 please give these patches all the review attention you can..
15:18:04 any other reviews that need our attention?
15:18:31 what about the last rocky backport?
15:19:18 dviroel: good point, i'm a little bothered by one issue wrt rocky
15:19:52 i'd like some closure for it this week - if any patches need to land there, please alert me
15:20:08 i think i didn't find any that needed to be in the release
15:21:12 but, beyond sounding cryptic - we may have a significant bugfix coming that might be appropriate before we make that final release
15:21:40 if we can't have that bugfix by the end of the week, i'll +1 this:
15:21:49 #link https://review.opendev.org/#/c/709896/ (Rocky final release proposal)
15:22:29 does that sound sane?
15:22:34 yes
15:23:22 on the review, i made it sound like we're going to discuss the issue in this meeting ^
15:24:15 i'll keep you informed via #openstack-manila
15:24:34 gouthamr: let us know if you need anything
15:25:40 thanks dviroel
15:25:54 cool, anything else?
15:26:11 #topic Bugs (vhari)
15:26:24 let's hear from vhari!
15:26:25 #link https://etherpad.openstack.org/p/manila-bug-triage-pad-new (Bug Triage etherpad)
15:26:39 o/
15:27:01 let's take a look at the 1st bug: https://bugs.launchpad.net/manila/+bug/1858328
15:27:03 Launchpad bug 1858328 in Manila "Manila share does not get into "shrinking_possible_data_loss_error" status when shrinking a share" [Low,Confirmed] - Assigned to Douglas Viroel (dviroel)
15:27:18 dviroel: When I looked at this one, I wondered if the NetApp driver didn't return the expected "possible data loss" error because
15:27:24 it was of the opinion
15:27:34 that it had caught the shrink attempt in time
15:27:42 and there is no possible data loss
15:28:21 Of course we don't have an error state in manila that really accommodates that ....
15:29:12 tbarron: yes, at the is aborted.
15:29:37 s/at the/at the end
15:30:24 if so, then I think we have the question whether NetApp should adapt to manila manager expectations
15:30:39 and the user will be told there is possible data loss even though
15:30:43 there isn't really
15:30:59 or whether we should adapt the manila manager framework
15:31:43 which only allows AVAILABLE, ERROR, SHRINKING_POSSIBLE_DATA_LOSS
15:31:44 there's a comment in the code base that seems appropriate:
15:31:46 #link https://opendev.org/openstack/manila/src/commit/188705d58b7022b30955bfa49d7b62ba93b7e9ef/manila/share/manager.py#L3904-L3908
15:32:10 feels like deja vu
15:32:23 it is strange we'd raise an "error" if the share is perfectly alright
15:32:31 gouthamr: +1
15:32:56 but, we'd have to possibly look if drivers don't validate, but detect data loss? (is such a thing possible?)
15:33:04 lkuchlan is suggesting we set it to AVAILABLE but generate an ERROR user msg
15:33:26 but I kinda think maybe we should have an additional error state
15:33:27 ack, that seems to be u_glide's thought as well, when implementing this state
15:33:34 and trust the drivers to signal the right thing
15:34:30 for each driver, if it decides to indicate the safe error, we
15:34:38 validate that change in code review
15:35:15 this would require a corresponding tempest test case change to allow the new error state
15:35:39 maybe several tempest tests and unit tests
15:36:23 a quick code search confirms that assumption ^
15:36:39 I haven't checked, perhaps only NetApp is doing a safe check on shrinks
15:37:08 apart from the Dell/EMC Unity driver, all drivers check for instances where consumed space is greater than the requested space
15:37:13 and return the error
15:37:46 k
15:37:50 the Unity storage driver seems to perform this validation on their storage system
15:38:04 #link https://opendev.org/openstack/manila/src/commit/73b0bccd9f0e3238a153cb9ee461bbaefd6aa6d4/manila/share/drivers/dell_emc/plugins/unity/client.py#L309-L318 (Dell/EMC Unity share shrinking possible data loss)
15:38:33 in either case, it's a validation, and data loss has been prevented
15:38:50 yup
15:38:59 yes, same thing.
15:39:09 so this bug is tech debt that we haven't gotten to?
15:39:36 would it make sense to address it uniformly as such, rather than change the NetApp driver?
15:39:41 seems like it. How does the tempest test pass in 3rd party CI?
15:40:04 ^ good question, need to take a look
15:40:08 lkuchlan's issue seems to be with a scenario test
15:40:21 we don't test for this status in an API test
15:40:23 gouthamr: I'm for a uniform solution, but one that doesn't require that all drivers change at once.
15:40:43 change the manager to support setting the new state (and the tests)
15:40:51 then drivers can do it one by one
15:41:04 tbarron: new state?
15:41:09 as motivated to not alarm their users unnecessarily
15:41:18 gouthamr: I don't see any other way, do you?
15:41:42 new exception and new state
15:41:54 it's a bigger fix, but here's what i'm thinking:
15:42:15 raise SHRINK_REFUSED
15:42:28 1) Fix the share manager to raise a user message on this validation error, and set the share status to "available"
15:42:49 2) Fix the NetApp driver to return the exception expected by the share manager for this validation so it conforms with the other drivers
15:42:51 mgr sets STATUS_SHRINKING_ERROR
15:43:12 3) Fix our scenario test to expect the share to transition to "available" and for the share size to remain the same
15:43:55 +1
15:44:00 does this leave us with possible data loss with some back ends but the share is available?
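[Editor's note: the three-step proposal above - keep the share "available" and surface a user message when a driver refuses a shrink as a safe validation - can be sketched roughly as below. All names (ShrinkRefused, shrink_share, the dict-based share and message list) are illustrative stand-ins, not manila's actual manager or exception APIs.]

```python
# Sketch of the proposed share-manager behavior discussed above: a driver
# that refuses a shrink (no data touched) should leave the share
# "available" and produce a user-facing message, not an error status.
# Names here are hypothetical; manila's real code differs.

STATUS_AVAILABLE = "available"
STATUS_SHRINKING = "shrinking"


class ShrinkRefused(Exception):
    """Driver refused the shrink as a safe validation; data is intact."""


def shrink_share(share, new_size, driver_shrink, user_messages):
    """Attempt to shrink a share via the driver callable."""
    share["status"] = STATUS_SHRINKING
    try:
        driver_shrink(share, new_size)
    except ShrinkRefused as exc:
        # Safe refusal: restore "available" and record a user message
        # instead of parking the share in a data-loss error state.
        share["status"] = STATUS_AVAILABLE
        user_messages.append(str(exc))
        return share
    share["size"] = new_size
    share["status"] = STATUS_AVAILABLE
    return share
```

Under this sketch, the scenario test change in step 3 would assert the share comes back "available" with its size unchanged after a refused shrink.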
15:44:14 tbarron: that'd be the first thing to check
15:44:44 well we know a priori that it could, so this solution seems to do the wrong thing from a logical standpoint
15:44:50 tbarron: i looked through the drivers now, none of them are forcing a shrink, possibly because we call out the need for this validation
15:45:04 even if it might work empirically, with all our drivers
15:45:12 in the driver developer guide (or the interface doc)
15:45:22 we're keeping a possible data loss state and
15:45:51 having to make sure every driver really prevents it
15:45:58 #link https://docs.openstack.org/manila/queens/contributor/driver_requirements.html#share-shrinking (driver expectations)
15:46:20 tomorrow we get a driver where there might be possible data loss, and we keep a state for that, but never use it?
15:46:37 it seems to me a kludge and a bad design
15:46:50 #link https://opendev.org/openstack/manila/src/commit/14d3e268a05265db53b5cfd19d9a85a3ba73a271/manila/share/driver.py#L1163-L1165 (driver interface doc)
15:47:30 tbarron: yeah, not inclined to do that - if a storage system can somehow detect data loss while shrinking a share, it should be able to do so before shrinking the share
15:47:53 Sure, we set the expectation that drivers should do the right thing. But if we rely entirely on that then
15:47:59 tbarron: or during shrinking of the share, and refuse to shrink like Unity and NetApp do
15:48:09 we can get rid of the possible data loss state entirely
15:48:47 I think we all agree it's better for a back end to detect and prevent data loss.
15:48:48 yes, if there is an exception in that path, we'd set the status to "error" and log - allowing operators to take a look anyway
15:50:00 okay, dviroel are you still comfortable handling this bug?
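[Editor's note: the driver-side pattern the discussion converges on - validate before (or while) shrinking and refuse when consumed space exceeds the requested size, as the NetApp and Unity drivers do - could look roughly like this. The class, exception, and field names are made up for illustration and are not manila's real driver interface.]

```python
# Illustrative sketch of a driver refusing an unsafe shrink: if the data
# already consumed on the share would not fit in the requested size, the
# driver raises instead of shrinking, so no data is ever lost.
# All names are hypothetical, not manila's actual driver API.

class ShareShrinkingPossibleDataLoss(Exception):
    """Shrinking to the requested size would not preserve existing data."""


class SketchDriver:
    def __init__(self, consumed_gb):
        # A real driver would query consumed space from the backend.
        self._consumed_gb = consumed_gb

    def shrink_share(self, share, new_size_gb):
        # Validate before touching the backend: refuse rather than risk
        # truncating data the share already holds.
        if self._consumed_gb > new_size_gb:
            raise ShareShrinkingPossibleDataLoss(
                "share %s holds %d GiB; cannot shrink to %d GiB"
                % (share["id"], self._consumed_gb, new_size_gb))
        share["size"] = new_size_gb
        return share
```

This is the "validation, not detection" point made above: because the check happens up front, the refusal is safe and arguably should not be reported to users as a data-loss error.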
15:50:32 gouthamr: yes, we have danielarthurt looking at it right now
15:50:48 dviroel++
15:50:49 thanks dviroel
15:51:16 awesome, ty dviroel danielarthurt - when you have your findings, please summarize on the bug report
15:51:48 sure thing
15:52:24 cool, think we're almost out of time - passing the token back to gouthamr
15:52:26 this change should be backported, for us to continue using that test case as it is written
15:52:38 thank you vhari - this was an interesting one
15:52:51 indeed. yw
15:53:49 dviroel tbarron danielarthurt: we can brainstorm further on what that means - changing this behavior does seem like it gets into the grey area between a bugfix and a feature
15:54:10 but, i don't have bright ideas to fix a design issue like this
15:54:27 yup
15:54:34 +1
15:55:11 okay, let's take this discussion to #openstack-manila and to the bug
15:55:56 danielarthurt: i'll subscribe to the bug, but, if you don't see a response from me/tbarron/dviroel after your update - please ping us :)
15:56:22 Ok
15:56:33 ty
15:56:37 #topic Open Discussion
15:56:39 dviroel:
15:57:37 andrebeltrami has nothing to say, btw
15:57:46 lol
15:57:46 lol
15:57:48 haha
15:57:54 haha, it was okay to gossip about you during open discussion
15:58:07 sorry for that :(
15:58:15 lol, np andrebeltrami
15:58:32 alright folks, let's wrap up and see each other on #openstack-manila
15:58:40 thank you for attending
15:58:42 thanks gouthamr
15:58:43 thanks!
15:58:45 #endmeeting