15:00:54 <gouthamr> #startmeeting manila
15:00:54 <openstack> Meeting started Thu Jan 14 15:00:54 2021 UTC and is due to finish in 60 minutes. The chair is gouthamr. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:55 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:58 <openstack> The meeting name has been set to 'manila'
15:01:01 <carloss> o/
15:01:03 <vkmc> o/
15:01:11 <gouthamr> courtesy ping: ganso dviroel lseki tbarron andrebeltrami felipe_rodrigues esantos
15:01:13 <carthaca> Hi
15:01:13 <andrebeltrami_> hello
15:01:47 <gouthamr> hello o/
15:02:12 <gouthamr> here's our agenda for today: https://wiki.openstack.org/wiki/Manila/Meetings#Next_meeting
15:02:17 <tbarron> hi
15:02:33 <gouthamr> thank you for tuning in, let's begin
15:02:38 <gouthamr> #topic Announcements
15:03:19 <gouthamr> We've an upcoming bug target deadline next week: Milestone-r
15:03:24 <gouthamr> Milestone-2*
15:03:27 <dviroel> o/
15:03:29 <gouthamr> #link https://releases.openstack.org/wallaby/schedule.html
15:03:49 <gouthamr> our new driver deadline is in 3 weeks
15:04:11 <gouthamr> and the feature proposal freeze follows that
15:04:13 <felipe_rodrigues> o/
15:05:52 <mgoddard> o/
15:05:53 <gouthamr> so as we're slowly inching our way into the tail end of this release cycle, a lot of code could be submitted, and as ever, we'll need to keep the momentum on reviews :)
15:06:31 <gouthamr> no other pressing announcements atm, anyone else got any?
15:07:36 <gouthamr> cool, let's move on..
15:07:49 <gouthamr> #topic Manila deployment via Kolla-Ansible (mgoddard, eliaswimmer)
15:08:03 <mgoddard> Hi Zorillas
15:08:09 <gouthamr> hi mgoddard, eliaswimmer
15:08:15 <tbarron> o/
15:08:23 <eliaswimmer> hi
15:08:44 <mgoddard> This agenda item came about after discussing manila in yesterday's kolla meeting
15:09:19 <mgoddard> We have a bug, it seems we are doing multinode manila incorrectly
15:09:21 <mgoddard> https://bugs.launchpad.net/kolla-ansible/+bug/1905542
15:09:23 <openstack> Launchpad bug 1905542 in kolla-ansible "Manila ceph configuration won't work in HA mode" [Undecided,New]
15:09:52 <mgoddard> We naively deploy manila-share active/active, and hope for the best
15:10:29 <mgoddard> tbarron joined in the conversation and suggested we discuss the state of manila in Kolla here
15:10:34 <tbarron> Are you doing this for cinder-volume as well?
15:10:34 <mgoddard> so here we are
15:10:46 <mgoddard> well, that's another story
15:11:00 <tbarron> mgoddard: And thanks for joining us!
15:11:28 <mgoddard> we were previously using cinder backend_host, then it somehow got dropped, and now we have a bug which we're working on
15:11:38 <tbarron> I ask because manila developed as a fork of cinder and we inherited some issues w.r.t. an active-active service talking to the storage back end
15:11:49 <tbarron> They are making some progress.
15:11:56 <mgoddard> right
15:12:00 <tbarron> We keep talking about it but
15:12:23 <mgoddard> it certainly would be nice to have
15:12:36 <tbarron> as I mentioned to you, many of us work with a deployment arch (tripleo) that does a pretty good job running the service active-passive so
15:12:45 <mgoddard> I will do everything in my power not to have to deploy pacemaker
15:12:53 <tbarron> we've been able to get away with procrastination.
15:13:21 <tbarron> mgoddard: I am actually quite sympathetic to that POV and am not recommending pacemaker for kolla-ansible.
15:13:50 <gouthamr> yeah, testing/supporting active-active HA for manila-share has been on the backlog for a while
15:13:51 <tbarron> We already have the machinery in place so using it for manila isn't a big deal in that context.
15:13:53 <mgoddard> do we have any alternatives?
15:14:06 <mgoddard> (for existing releases)
15:14:54 <tbarron> mgoddard: I'm not drafting *you*, as I know you are doing many things already, but we really need some folks working in manila whose
15:15:03 <tbarron> day jobs don't depend on tripleo
15:15:10 <tbarron> to balance things out
15:15:37 <gouthamr> not that i'm aware of, we've only ever tested/documented using pacemaker:
15:15:39 <gouthamr> #link https://docs.openstack.org/ha-guide/storage-ha-file-systems.html#add-shared-file-systems-api-resource-to-pacemaker
15:15:46 <tbarron> mgoddard: the particular bug you raise is one thing, I think the independent keys idea may work for that
15:16:19 <tbarron> But the more general problem that active-active is not really safe for the manila-share service yet is another matter.
15:16:20 * gouthamr scratch that link, it should be manila-share we're talking about
15:18:16 <tbarron> mgoddard: manila-share core code needs some fixes, then back end by back end there need to be fixes.
15:18:36 <tbarron> mgoddard: which back ends is the kolla community most interested in?
15:19:14 <eliaswimmer> for me it is ceph and generic
15:19:22 <mgoddard> we support generic, hnas, cephfsnative, cephfsnfs & glusterfsnfs
15:19:35 <mgoddard> although it's fairly customisable
15:19:51 <mgoddard> so other drivers too potentially
15:21:19 <mgoddard> so from our perspective I think there are two issues
15:21:41 <mgoddard> 1. what do we need to do today to safely use existing manila-share
15:22:07 <mgoddard> 2. if and when active/active manila-share becomes available, how do we use it
15:22:53 <mgoddard> I'd love to find you some contributors, although I expect many projects are in a similar position
15:23:49 <gouthamr> ^ +1 thanks for that - that'd be the prime concern - we could talk in theory about the pitfalls there may be in attempting active-active HA
15:23:57 <gouthamr> with the workaround you have
15:24:13 <gouthamr> for the native cephfs backend
15:24:39 <mgoddard> that would be helpful to understand the severity of the issue
15:25:02 <gouthamr> off the top of my head: driver startup, and periodic driver interactions
15:25:25 <tbarron> note that depending on the back end there is an issue of HA for the back end as well
15:25:35 <tbarron> no issue for native cephfs
15:26:13 <tbarron> for the generic driver we have the issues covered in this incomplete review https://review.opendev.org/c/openstack/manila-specs/+/504987
15:26:46 <mgoddard> queens :)
15:26:51 <tbarron> for cephfsnfs, tripleo uses pacemaker :(
15:27:09 <tbarron> mgoddard: yeah, we've been talking about this stuff for some time
15:27:25 <gouthamr> during startup, there's a reconciliation of exports on the backend, so this might run twice, but, if the driver startup is staggered, this isn't a biggie; and there aren't many periodic driver interactions for the native cephfs driver
15:27:32 <gouthamr> just the scheduler stats update
15:27:50 <mgoddard> it's very difficult to retrofit if this stuff isn't designed in from the beginning
15:28:07 <tbarron> mgoddard: +1
15:29:13 <tbarron> gouthamr: yeah, fixing for native cephfs and adding some missing tooz locks in manila-share core would be a good start
15:29:40 <tbarron> And then maybe documenting caveats for other back ends until someone can fix them.
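(A minimal sketch of the kind of tooz-based locking tbarron suggests above for manila-share core; it is not manila's actual code. The coordinator URL, member id, lock name, and the apply_access_rules() helper are all illustrative assumptions, and a DLM backend such as ZooKeeper or etcd would have to be deployed alongside every manila-share host.)

```python
# Illustrative only: swapping a per-node file lock for a tooz distributed
# lock so that concurrent manila-share instances coordinate with each other.
# The backend URL, member id, lock name, and apply_access_rules() are
# hypothetical names, not manila's real code.
from tooz import coordination

# A DLM backend (ZooKeeper here, purely as an example) must be reachable
# from every manila-share host for the lock to be meaningful cluster-wide.
coordinator = coordination.get_coordinator(
    'zookeeper://controller:2181', b'manila-share-host-1')
coordinator.start(start_heart=True)


def apply_access_rules(share_id, rules):
    # One lock per share serializes access-rule updates across all active
    # manila-share instances, not just within a single host's filesystem.
    lock = coordinator.get_lock(('manila-access-%s' % share_id).encode())
    if not lock.acquire(blocking=True):
        raise RuntimeError('could not acquire lock for share %s' % share_id)
    try:
        for rule in rules:
            pass  # call the back-end driver to apply each rule here
    finally:
        lock.release()
```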
15:30:37 <tbarron> eliaswimmer: is your interest in the generic driver for production use?
15:30:39 <gouthamr> yeah, that's a good point - one of the other pieces that relies on coordination is share access control - we currently protect critical sections there with file locks
15:31:10 <gouthamr> if the file locks aren't coordinated, this could potentially break
15:31:16 <eliaswimmer> tbarron: only until I have a good setup for using cephfs natively
15:31:44 <tbarron> gouthamr: the code changes to switch them to tooz wouldn't be hard, but we need to deploy tooz with a dlm and test test test
15:32:04 <gouthamr> tbarron: +1
15:32:47 <tbarron> gouthamr: mgoddard but the consequences of races there are probably not too bad either. Access to a share isn't granted or withdrawn properly and needs to be re-done?
15:33:31 <tbarron> Not like data loss, which is the more general bogey with active-active cinder-volume or manila-share.
15:35:16 <mgoddard> that's all helpful stuff, thanks
15:35:21 <tbarron> mgoddard: so the k-a gate runs multinode jobs with a DLM in play? If manila does make changes for a-a, they could be tested there?
15:35:22 <gouthamr> i think so - if you added a lot of access rules at once on a particular share, we could claim some of your access rules have been applied, but, in reality, cephfs might know nothing about them
15:36:02 <tbarron> gouthamr: so an undetected access failure, or rather one detected in use by end users.
15:36:40 <gouthamr> yeah, the annoying workaround would be to apply access rules one at a time
15:36:50 <mgoddard> currently we do not deploy a DLM in CI jobs, although I expect we will soon
15:37:07 <mgoddard> it should be a one-liner to enable
15:37:26 <tbarron> mgoddard: So perhaps, respecting your time, we should leave the conversation where it is atm. We don't have a quick
15:37:57 <tbarron> solution for you but the good thing is we in manila are more aware of the need and potential for testing
15:37:58 <mgoddard> yes, we should let you continue with your other topics
15:38:15 <tbarron> outside of just the manila gate or tripleo.
15:38:36 <mgoddard> +1, I think we're all a bit more synced up now
15:38:42 <tbarron> mgoddard: Thanks much for your time, and you know where to find us!
15:38:56 <gouthamr> yes, thank you for bringing this up mgoddard eliaswimmer
15:39:18 <mgoddard> well, thank you for inviting us. Enjoy the rest of the meeting o/
15:39:31 <gouthamr> let's move on to regular programming
15:39:32 <tbarron> eliaswimmer: hang around please :D
15:39:39 <gouthamr> oh wait
15:40:01 <tbarron> gouthamr: he has a bug, can be there
15:40:06 <gouthamr> tbarron: anything else about $topic
15:40:08 <gouthamr> ah i see
15:40:15 <gouthamr> #topic Reviews needing feedback
15:40:42 <tbarron> s/he, I shouldn't be assuming :D
15:40:44 <gouthamr> alright, time for our favorite review focus etherpad
15:41:01 <gouthamr> #link https://etherpad.opendev.org/p/manila-wallaby-review-focus (Wallaby cycle review focus etherpad)
15:42:01 <gouthamr> if you have anything that needs reviewer attention, please feel free to add your change to this etherpad, and we'll track it through the end of this release cycle
15:43:02 <gouthamr> it's brand new, but we could already have patches that you need a nudge on
15:43:52 <gouthamr> anything that needs to be discussed right now?
15:45:13 <gouthamr> cool; we'll talk about this again next week
15:45:25 <gouthamr> we can move on to:
15:45:28 <gouthamr> #topic Bugs (vhari)
15:45:39 <gouthamr> o/ vhari - the floor's yours
15:46:27 <vhari> gouthamr, I am on another call, can you pls drive the bugs
15:46:33 <gouthamr> ah sure thing
15:46:44 <gouthamr> #link https://bugs.launchpad.net/manila/+bug/1911695
15:46:45 <openstack> Launchpad bug 1911695 in OpenStack Shared File Systems Service (Manila) "generic backend resize share failure when no access rule is set" [Undecided,New]
15:46:45 <vhari> gouthamr, all new bugs are on the etherpad
15:47:15 <eliaswimmer> I think this is quite easy to fix
15:47:21 <eliaswimmer> some quotes are missing
15:47:32 <eliaswimmer> before exec via bash
15:48:05 <gouthamr> ah i see
15:48:10 <eliaswimmer> If you want I can patch it
15:48:20 <gouthamr> that would be very welcome, eliaswimmer
15:48:30 <gouthamr> thanks for filing this bug
15:48:31 <tbarron> eliaswimmer++
15:49:24 <tbarron> should we target m2 or m3 for this one?
15:49:46 <eliaswimmer> m is milestone?
15:49:47 <tbarron> m2 is 3 weeks away
15:49:51 <tbarron> yes, sorry
15:49:55 <vkmc> eliaswimmer++
15:50:04 <eliaswimmer> I would like to have it in train :)
15:50:12 <gouthamr> m2 is next week, tbarron
15:50:19 <tbarron> gouthamr: ah yes
15:50:23 <tbarron> d'oh
15:50:32 <tbarron> i know that too
15:50:35 <tbarron> you just told us
15:50:45 <gouthamr> eliaswimmer: yes, we'll need to fix it on the trunk branch first, and then we can backport it one branch at a time
15:51:34 <gouthamr> we can set this one to m-3 and follow up on LP - do let us know if you have any questions
15:52:00 <gouthamr> for reference, the wallaby release schedule is here: https://releases.openstack.org/wallaby/schedule.html
15:52:08 <eliaswimmer> gouthamr: ok
15:52:14 <gouthamr> and milestone-3 is Mar 08 - Mar 12
15:53:07 <gouthamr> launchpad's very slow atm
15:53:22 <gouthamr> ty for picking this up eliaswimmer, let's move to the next bug
15:53:33 <gouthamr> #link https://bugs.launchpad.net/manila/+bug/1911071
15:53:36 <openstack> Launchpad bug 1911071 in OpenStack Shared File Systems Service (Manila) "lower constraints job broken on stable branches older than ussuri" [Medium,In progress] - Assigned to Goutham Pacha Ravi (gouthamr)
15:54:05 <gouthamr> alright, this one's been targeted
15:54:41 <gouthamr> we're discussing lower-constraints jobs quite a bit on the mailing list
15:55:18 <gouthamr> so far, the consensus seems to be to keep them since they've been fixed - in manila, we've now fixed them on train and newer branches
15:55:34 <gouthamr> we still have a problem with stein, rocky and queens
15:55:48 <gouthamr> and these branches are in extended maintenance
15:56:01 <gouthamr> i attempted a patch for stein
15:56:08 <gouthamr> #link https://review.opendev.org/c/openstack/manila/+/770207 ([stable/stein] Update requirements and constraints)
15:56:40 <gouthamr> still failing - will need to check what's going on there
15:57:09 <gouthamr> but i'd rather not do this on rocky and queens - i don't see the value of bumping requirement files on these really old branches
15:57:29 <gouthamr> specifically to call out lower requirements
15:58:08 <gouthamr> so i'm proposing we drop the lower-constraints testing on these
15:58:18 <gouthamr> #link https://review.opendev.org/c/openstack/manila/+/770704 ([stable/rocky] Adjust CI jobs)
15:58:33 <gouthamr> took the opportunity to do some more CI cleanup ^
15:58:54 <gouthamr> let's discuss the merits/pitfalls of this directly on the change
16:00:10 <carloss> thanks for working on this, gouthamr :)
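(Looking back at bug 1911695 above, eliaswimmer's diagnosis is that "some quotes are missing before exec via bash". Below is a hedged illustration of that class of fix, not the actual generic-driver patch: build_remote_command() and the resize2fs example are hypothetical, and the point is only that arguments passed to "bash -c" need quoting, e.g. with shlex.quote(), so an empty value, such as when no access rule is set, does not silently drop out of the command.)

```python
# Illustrative only: a hypothetical helper showing why unquoted arguments
# break a "bash -c" command when one of them is empty or contains spaces.
import shlex


def build_remote_command(args):
    # shlex.quote() keeps an empty argument as an explicit '' and escapes
    # shell metacharacters, instead of letting the argument vanish or be
    # reinterpreted when the string is handed to bash.
    quoted = ' '.join(shlex.quote(a) for a in args)
    return ['bash', '-c', quoted]


# Example: the empty third argument survives as '' rather than disappearing.
print(build_remote_command(['resize2fs', '/dev/vdb', '']))
# -> ['bash', '-c', "resize2fs /dev/vdb ''"]
```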
16:00:10 <gouthamr> let's wrap up bug triage here..
16:00:11 <gouthamr> you're welcome, carloss
16:00:13 <gouthamr> no time for open discussion today :)
16:00:25 <gouthamr> if you have something, please come on over to #openstack-manila
16:00:31 <gouthamr> thank you all for attending
16:00:34 <gouthamr> stay safe!
16:00:38 <gouthamr> #endmeeting