15:00:54 <gouthamr> #startmeeting manila
15:00:54 <openstack> Meeting started Thu Jan 14 15:00:54 2021 UTC and is due to finish in 60 minutes.  The chair is gouthamr. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:55 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:58 <openstack> The meeting name has been set to 'manila'
15:01:01 <carloss> o/
15:01:03 <vkmc> o/
15:01:11 <gouthamr> courtesy ping: ganso dviroel lseki tbarron andrebeltrami felipe_rodrigues esantos
15:01:13 <carthaca> Hi
15:01:13 <andrebeltrami_> hello
15:01:47 <gouthamr> hello o/
15:02:12 <gouthamr> here's our agenda for today: https://wiki.openstack.org/wiki/Manila/Meetings#Next_meeting
15:02:17 <tbarron> hi
15:02:33 <gouthamr> thank you for tuning in, let's begin
15:02:38 <gouthamr> #topic Announcements
15:03:19 <gouthamr> We've an upcoming bug target deadline next week: Milestone-2
15:03:27 <dviroel> o/
15:03:29 <gouthamr> #link https://releases.openstack.org/wallaby/schedule.html
15:03:49 <gouthamr> our new driver deadline is in 3 weeks
15:04:11 <gouthamr> and the feature proposal freeze follows that
15:04:13 <felipe_rodrigues> o/
15:05:52 <mgoddard> o/
15:05:53 <gouthamr> so as we're slowly inching our way into the tail end of this release cycle, a lot of code could be submitted, and as ever, we'll need to keep the momentum on reviews :)
15:06:31 <gouthamr> no other pressing announcements atm, anyone else got any?
15:07:36 <gouthamr> cool, let's move on..
15:07:49 <gouthamr> #topic Manila deployment via Kolla-Ansible (mgoddard, eliaswimmer)
15:08:03 <mgoddard> Hi Zorillas
15:08:09 <gouthamr> hi mgoddard, eliaswimmer
15:08:15 <tbarron> o/
15:08:23 <eliaswimmer> hi
15:08:44 <mgoddard> This agenda item came about after discussing manila in yesterday's kolla meeting
15:09:19 <mgoddard> We have a bug; it seems we are doing multinode manila incorrectly
15:09:21 <mgoddard> https://bugs.launchpad.net/kolla-ansible/+bug/1905542
15:09:23 <openstack> Launchpad bug 1905542 in kolla-ansible "Manila ceph configuration won't work in HA mode" [Undecided,New]
15:09:52 <mgoddard> We naively deploy manila-share active/active, and hope for the best
15:10:29 <mgoddard> tbarron joined in the conversation and suggested we discuss the state of manila in Kolla here
15:10:34 <tbarron> Are you doing this for cinder-volume as well?
15:10:34 <mgoddard> so here we are
15:10:46 <mgoddard> well, that's another story
15:11:00 <tbarron> mgoddard: And thanks for joining us!
15:11:28 <mgoddard> we were previously using cinder backend_host, then it somehow got dropped, and now we have a bug which we're working on
15:11:38 <tbarron> I ask because manila developed as a fork of cinder and we inherited some issues w.r.t. the active-active service talking to the storage back end
15:11:49 <tbarron> They are making some progress.
15:11:56 <mgoddard> right
15:12:00 <tbarron> We keep talking about it but
15:12:23 <mgoddard> it certainly would be nice to have
15:12:36 <tbarron> as I mentioned to you many of us work with deployment arch (tripleo) that does a pretty good job running the service active-passive so
15:12:45 <mgoddard> I will do everything in my power not to have to deploy pacemaker
15:12:53 <tbarron> we've been able to get away with procrastination.
15:13:21 <tbarron> mgoddard: I am actually quite sympathetic to that POV and am not recommending pacemaker for kolla-ansible.
15:13:50 <gouthamr> yeah, testing/supporting active-active HA for manila-share has been on the backlog for a while
15:13:51 <tbarron> We already have the machinery in place so using it for manila isn't a big deal in that context.
15:13:53 <mgoddard> do we have any alternatives?
15:14:06 <mgoddard> (for existing releases)
15:14:54 <tbarron> mgoddard: I'm not trying to draft *you* as I know you are doing many things already but we really need some folks working in manila whose
15:15:03 <tbarron> day jobs don't depend on tripleo
15:15:10 <tbarron> to balance things out
15:15:37 <gouthamr> not that i'm aware of, we've only ever tested/documented using pacemaker:
15:15:39 <gouthamr> #link https://docs.openstack.org/ha-guide/storage-ha-file-systems.html#add-shared-file-systems-api-resource-to-pacemaker
15:15:46 <tbarron> mgoddard: the particular bug you raise is one thing, I think the independent keys idea may work for that
15:16:19 <tbarron> But the more general problem that active-active is not really safe for manila-share service yet is another matter.
15:16:20 * gouthamr scratch that link, it should be manila-share we're talking about
15:18:16 <tbarron> mgoddard: manila-share core code needs some fixes, then back-end by back-end there need to be fixes.
15:18:36 <tbarron> mgoddard: which back ends is the kolla community most interested in?
15:19:14 <eliaswimmer> for me it is ceph and generic
15:19:22 <mgoddard> we support generic, hnas, cephfsnative, cephfsnfs & glusterfsnfs
15:19:35 <mgoddard> although it's fairly customisable
15:19:51 <mgoddard> so other drivers too potentially
15:21:19 <mgoddard> so from our perspective I think there are two issues
15:21:41 <mgoddard> 1. what do we need to do today to safely use existing manila-share
15:22:07 <mgoddard> 2. if and when active/active manila-share becomes available, how do we use it
15:22:53 <mgoddard> I'd love to find you some contributors, although I expect many projects are in a similar position
15:23:49 <gouthamr> ^ +1 thanks for that - that'd be the prime concern - we could talk in theory about the pitfalls of attempting active-active HA
15:23:57 <gouthamr> with the workaround you have
15:24:13 <gouthamr> for the native cephfs backend
15:24:39 <mgoddard> that would be helpful to understand the severity of the issue
15:25:02 <gouthamr> off the top of my head: driver startup, and periodic driver interactions
15:25:25 <tbarron> note that depending on the back end there is an issue of HA for the back end as well
15:25:35 <tbarron> no issue for native cephfs
15:26:13 <tbarron> for generic driver we have the issues covered in this incomplete review https://review.opendev.org/c/openstack/manila-specs/+/504987
15:26:46 <mgoddard> queens :)
15:26:51 <tbarron> for cephfsnfs, tripleo uses pacemaker :(
15:27:09 <tbarron> mgoddard: yeah, we've been talking about this stuff for some time
15:27:25 <gouthamr> during startup, there's a reconciliation of exports on the backend, so this might run twice, but, if the driver startup is staggered, this isn't a biggie; and there aren't many periodic driver interactions for the native cephfs driver
15:27:32 <gouthamr> just the scheduler stats update
15:27:50 <mgoddard> it's very difficult to retrofit if this stuff isn't designed in from the beginning
15:28:07 <tbarron> mgoddard: +1
15:29:13 <tbarron> gouthamr: yeah, fixing for native cephfs and adding some missing tooz locks in manila-share core would be a good start
15:29:40 <tbarron> And then maybe documenting caveats for other back ends until someone can fix them.
15:30:37 <tbarron> eliaswimmer: is your interest in the generic driver for production use?
15:30:39 <gouthamr> yeah, that's a good point - one of the other pieces that relies on coordination is share access control - we currently protect critical sections there with file locks
15:31:10 <gouthamr> if the file locks aren't coordinated, this could potentially break
15:31:16 <eliaswimmer> tbarron: only until I have a good setup for using cephfs natively
15:31:44 <tbarron> gouthamr: the code changes to switch them to tooz wouldn't be hard, but we'd need to deploy tooz with a DLM and test test test
15:32:04 <gouthamr> tbarron: +1
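(For readers following along: a minimal sketch of what moving one of these critical sections from a per-node file lock to a tooz distributed lock could look like. The etcd endpoint, member id, and lock name are illustrative assumptions, not manila's actual code.)

    from tooz import coordination

    # Illustrative sketch only, not manila code: two manila-share hosts
    # serialize the same critical section through a DLM instead of a
    # per-node file lock. The backend URL and member id are assumptions;
    # a real deployment would point at whatever DLM it provides.
    coordinator = coordination.get_coordinator(
        'etcd3://controller:2379', b'manila-share-host1')
    coordinator.start(start_heart=True)

    # The lock name is hypothetical; it only matters that every node uses
    # the same name when touching the same share's access rules.
    lock = coordinator.get_lock(b'manila-share-access-rules-<share-id>')
    with lock:
        pass  # critical section: update the share's access rules

    coordinator.stop()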
15:32:47 <tbarron> gouthamr: mgoddard but consequences of races there are probably not too bad either.  Access to share isn't granted or withdrawn properly and needs to be re-done?
15:33:31 <tbarron> Not like data loss, the more general bogey with active-active cinder-volume or manila-share.
15:35:16 <mgoddard> that's all helpful stuff, thanks
15:35:21 <tbarron> mgoddard: so k-a gate runs multinode jobs with a DLM in play? If manila does make changes for a-a, they could be tested there?
15:35:22 <gouthamr> i think so - if you added a lot of access rules at once on a particular share, we could claim some of your access rules have been applied, but in reality cephfs could know nothing about them
15:36:02 <tbarron> gouthamr: so an undetected access failure, or rather one that end users only detect in use.
15:36:40 <gouthamr> yeah, the annoying workaround would be to apply access rules one at a time
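(A sketch of that workaround, assuming an already-authenticated python-manilaclient Client bound to the name `manila`; the share id and CIDRs are placeholders.)

    import time

    # Workaround sketch: grant access rules one at a time rather than in a
    # burst, so the back end applies each rule before the next one arrives.
    # `manila` is assumed to be an authenticated python-manilaclient Client.
    share = manila.shares.get('9ba52d94-...')  # hypothetical share id

    for cidr in ('10.0.0.0/24', '10.0.1.0/24', '10.0.2.0/24'):
        manila.shares.allow(share, 'ip', cidr, access_level='rw')
        # A real script would poll the rule's state until it is 'active'
        # instead of sleeping blindly.
        time.sleep(5)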
15:36:50 <mgoddard> currently we do not deploy a DLM in CI jobs, although I expect we will soon
15:37:07 <mgoddard> it should be a one-liner to enable
15:37:26 <tbarron> mgoddard: So perhaps, respecting your time, we should leave the conversation where it is atm.  We don't have a quick
15:37:57 <tbarron> solution for you but the good thing is we in manila are more aware of the need and potential for testing
15:37:58 <mgoddard> yes, we should let you continue with your other topics
15:38:15 <tbarron> outside of just manila gate  or tripleo.
15:38:36 <mgoddard> +1, I think we're all a bit more synced up now
15:38:42 <tbarron> mgoddard: Thanks much for your time, and you know where to find us!
15:38:56 <gouthamr> yes, thank you for bringing this up mgoddard eliaswimmer
15:39:18 <mgoddard> well, thank you for inviting us. Enjoy the rest of the meeting o/
15:39:31 <gouthamr> let's move on to regular programming
15:39:32 <tbarron> eliaswimmer: hang around please :D
15:39:39 <gouthamr> oh wait
15:40:01 <tbarron> gouthamr: he has a bug, we can cover it there
15:40:06 <gouthamr> tbarron: anything else about $topic
15:40:08 <gouthamr> ah i see
15:40:15 <gouthamr> #topic Reviews needing feedback
15:40:42 <tbarron> s/he, I shouldn't be assuming :D
15:40:44 <gouthamr> alright, time for our favorite review focus etherpad
15:41:01 <gouthamr> #link https://etherpad.opendev.org/p/manila-wallaby-review-focus (Wallaby cycle review focus etherpad)
15:42:01 <gouthamr> if you have anything that needs reviewer attention, please feel free to add your change to this etherpad, and we'll track it through the end of this release cycle
15:43:02 <gouthamr> it's brand new, but there could already be patches that you need a nudge on
15:43:52 <gouthamr> anything that needs to be discussed right now?
15:45:13 <gouthamr> cool; we'll talk about this again next week
15:45:25 <gouthamr> we can move on to:
15:45:28 <gouthamr> #topic Bugs (vhari)
15:45:39 <gouthamr> o/ vhari - the floor's yours
15:46:27 <vhari> gouthamr, I am on another call, can you pls drive the Bugs
15:46:33 <gouthamr> ah sure thing
15:46:44 <gouthamr> #link https://bugs.launchpad.net/manila/+bug/1911695
15:46:45 <openstack> Launchpad bug 1911695 in OpenStack Shared File Systems Service (Manila) "generic backend resize share failure when no access rule is set" [Undecided,New]
15:46:45 <vhari> gouthamr, all new bugs are on etherpad
15:47:15 <eliaswimmer> I think this is quite easy to fix
15:47:21 <eliaswimmer> some quotes are missing
15:47:32 <eliaswimmer> before exec via bash
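(Illustratively, the class of bug being described here, with a hedged sketch of the fix; the device path and the resize2fs command are placeholders, not the actual generic-driver code.)

    import shlex
    import subprocess

    # The pattern eliaswimmer points at: a value interpolated into a command
    # that is then run through "bash -c" has to be quoted, otherwise an empty
    # or space-containing value produces a malformed shell command.
    device_path = '/dev/mapper/my volume'  # hypothetical value with a space

    # Broken: bash word-splits the unquoted path into two arguments.
    subprocess.run(['bash', '-c', 'resize2fs %s' % device_path], check=False)

    # Fixed: quote the value before building the command line.
    subprocess.run(['bash', '-c', 'resize2fs %s' % shlex.quote(device_path)],
                   check=False)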
15:48:05 <gouthamr> ah i see
15:48:10 <eliaswimmer> If you want I can patch it
15:48:20 <gouthamr> that would be very welcome, eliaswimmer
15:48:30 <gouthamr> thanks for filing this bug
15:48:31 <tbarron> eliaswimmer++
15:49:24 <tbarron> should we target m2 or m3 for this one?
15:49:46 <eliaswimmer> m is milestone?
15:49:47 <tbarron> m2 is 3 weeks away
15:49:51 <tbarron> yes, sorry
15:49:55 <vkmc> eliaswimmer++
15:50:04 <eliaswimmer> I would like to have it in train :)
15:50:12 <gouthamr> m2 is next week, tbarron
15:50:19 <tbarron> gouthamr: ah yes
15:50:23 <tbarron> d'oh
15:50:32 <tbarron> i know that too
15:50:35 <tbarron> you just told us
15:50:45 <gouthamr> eliaswimmer: yes, we'll need to fix it in the trunk branch, and we can backport it one branch at a time
15:51:34 <gouthamr> we can set this one to m-3, and follow up on LP - do let us know if you have any questions
15:52:00 <gouthamr> for reference, the wallaby release schedule is here: https://releases.openstack.org/wallaby/schedule.html
15:52:08 <eliaswimmer> gouthamr: ok
15:52:14 <gouthamr> and milestone-3 is Mar 08 - Mar 12
15:53:07 <gouthamr> launchpad's very slow atm
15:53:22 <gouthamr> ty for picking this up eliaswimmer, let's move to the next bug
15:53:33 <gouthamr> #link https://bugs.launchpad.net/manila/+bug/1911071
15:53:36 <openstack> Launchpad bug 1911071 in OpenStack Shared File Systems Service (Manila) "lower constraints job broken on stable branches older than ussuri" [Medium,In progress] - Assigned to Goutham Pacha Ravi (gouthamr)
15:54:05 <gouthamr> alright this one's been targeted
15:54:41 <gouthamr> we're discussing lower-constraints jobs quite a bit over the mailing list
15:55:18 <gouthamr> so far, the consensus seems to be to keep them since they've been fixed - in manila, we've now fixed them train+
15:55:34 <gouthamr> we still have a problem with stein, rocky and queens
15:55:48 <gouthamr> and these branches are on extended maintenance
15:56:01 <gouthamr> i attempted a patch for stein
15:56:08 <gouthamr> #link https://review.opendev.org/c/openstack/manila/+/770207 ([stable/stein] Update requirements and constraints)
15:56:40 <gouthamr> still failing - will need to check what's going on there
15:57:09 <gouthamr> but i'd rather not do this on rocky and queens - i don't see the value of bumping requirement files on these really old branches
15:57:29 <gouthamr> specifically to call out lower requirements
15:58:08 <gouthamr> so i'm proposing we drop the lower-constraints testing on these
15:58:18 <gouthamr> #link https://review.opendev.org/c/openstack/manila/+/770704 ([stable/rocky] Adjust CI jobs)
15:58:33 <gouthamr> took the opportunity to do some more CI cleanup ^
15:58:54 <gouthamr> let's discuss the merits/pitfalls of this directly on the change
16:00:10 <carloss> thanks for working on this, gouthamr :)
16:00:10 <gouthamr> let's wrap up bug triage here..
16:00:11 <gouthamr> you're welcome carloss
16:00:13 <gouthamr> no time for open discussion today :)
16:00:25 <gouthamr> if you have something, please come on over to #openstack-manila
16:00:31 <gouthamr> thank you all for attending
16:00:34 <gouthamr> stay safe!
16:00:38 <gouthamr> #endmeeting