15:01:06 <tbarron> #startmeeting manila
15:01:06 <openstack> Meeting started Thu Apr 11 15:01:06 2019 UTC and is due to finish in 60 minutes.  The chair is tbarron. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:01:07 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:01:09 <openstack> The meeting name has been set to 'manila'
15:01:11 <bswartz> .o/
15:01:19 <carloss> hello :)
15:01:23 <ganso> hello
15:01:25 <tbarron> courtesy ping: gouthamr xyang toabctl bswartz ganso erlon tpsilva vkmc amito jgrosso
15:01:51 <xyang> hi
15:02:16 <lseki> hi
15:02:41 <tbarron> we have some conflicting meetings at my workplace today so attendance may be low
15:02:49 <bswartz> doh!
15:02:55 <vh_> hi
15:02:56 <tbarron> agenda: https://wiki.openstack.org/wiki/Manila/Meetings
15:03:10 <tbarron> #topic announcements
15:03:24 <gouthamr> o/
15:03:26 <tbarron> Thanks bswartz for chairing last week's meeting!
15:03:36 <bswartz> np
15:03:56 <tbarron> So Stein shipped yesterday and stable/stein branch has been cut.
15:04:15 <tbarron> That's really all on that front except
15:04:22 <tbarron> I'll remind
15:04:33 <tbarron> that our PTG planning etherpad is here:
15:04:50 <tbarron> #link https://etherpad.openstack.org/p/manila-denver-train-ptg-planning
15:05:02 <tbarron> whew, I didn't paste in the ping list :)
15:05:36 <tbarron> We have lots of stuff to discuss and plan at PTG, so please look over the etherpad first
15:06:06 <tbarron> Anyone else have any announcements?
15:06:25 <tbarron> ok
15:06:34 <tbarron> #topic Stable Backports
15:07:03 <tbarron> We've spent some time during the release phase looking at backports to stable branches since
15:07:28 <tbarron> we didn't have any release candidate bugs.
15:07:35 <bswartz> Woot
15:07:37 <vkmc_> o/
15:07:37 <tbarron> hi vhariria lseki
15:08:07 <tbarron> so we have some that need reviews from stable/backports cores
15:08:20 <tbarron> as a reminder, this is a subset of the regular cores who
15:08:47 <tbarron> are accepted not just by manila cores but by the central stable/releases team
15:09:14 <tbarron> and they insist on seeing a track record of reviews of candidate backports before they let people join the club
15:09:34 <bswartz> Do they have to also be manila reviewers?
15:09:38 <tbarron> so if you are a core or an aspiring core and can't vote +/-2 your reviews are still helpful and
15:09:45 <bswartz> Or can generalized stable-maint reviewers help us merge backports?
15:09:53 <tbarron> you can build up a track record
15:10:04 <tbarron> bswartz: I think they can theoretically but they don't
15:10:18 <tbarron> they sometimes will -1/-2 though :)
15:10:22 <bswartz> That might be something to bring up with the wider community
15:10:40 <bswartz> If they want to centralize that function then more resource sharing would make more sense
15:10:58 <tbarron> bswartz: yeah, we can ask about that
15:11:08 <tbarron> anyways, we have some that need attention
15:11:18 <tbarron> #link https://review.openstack.org/#/q/status:open+project:openstack/manila+branch:stable/queens
15:11:33 <tbarron> #link https://review.openstack.org/#/c/648716/
15:12:15 <tbarron> We'd like to get these merged and then cut new stable/* releases before PTG
15:12:45 <tbarron> Also, stable/pike will soon join stable/ocata as an EM branch
15:12:52 <tbarron> "extended maintenance"
15:13:18 <tbarron> which means no more releases but as long as our team has interest and resources to keep
15:13:30 <tbarron> CI going for those branches we can still backport to them
15:14:04 <tbarron> We can check who still uses stable/ocata and stable/pike and see if this is worth it
15:14:44 <tbarron> Red Hat at least will be maintaining stable/queens equivalent for some time but we have less interest in
15:15:00 <tbarron> stable/ocata and stable/pike as time goes on
15:15:17 <tbarron> so SUSE, canonical, etc. can weigh in if you care
15:15:37 <tbarron> Anything else on this topic?
15:16:12 <tbarron> #topic Followup on: Third party CIs and devs: DevStack plugin changes - we will no longer support installing manila-tempest-plugin
15:16:33 <tbarron> gouthamr thanks for broaching this topic last week
15:16:43 <gouthamr> #link https://review.openstack.org/#/c/648716/
15:16:52 <gouthamr> is seemingly harmless, because there's at least one third party CI running there and passing
15:17:38 <gouthamr> i started compiling an updated list of 3rd party CI maintainers - will email openstack-discuss and this list soon
15:17:48 <tbarron> gouthamr: I tested that it didn't break first-party jobs even with the manila plugin left as is, so I think that's right
15:18:04 <tbarron> gouthamr: thanks for working on that.
15:18:20 <tbarron> gouthamr: seems like reasonable progress for one week.
15:18:41 <tbarron> but I'd request some reviews of 648716 then.
15:18:50 <gouthamr> tbarron: can we split out the devstack plugin changes from https://review.openstack.org/#/c/646037/ >
15:18:56 <bswartz> Ooh it's NetApp CI
15:19:11 <tbarron> I think we should go on and merge it and if there's any issues -- there shouldn't be -- we can flush them out.
15:19:55 <tbarron> gouthamr: yes, are you thinking of a particular order if we split?
15:20:35 <gouthamr> https://review.openstack.org/#/c/646037/ can depend on the manila devstack plugin changes and the manila-tempest-plugin patch (https://review.openstack.org/#/c/648716/)
15:21:49 <tbarron> gouthamr: sure, any particular reason?
15:22:16 <gouthamr> yes, because the changes to the devstack plugin aren't related to the python3 changes
15:22:42 <tbarron> gouthamr: ok, -1 with a comment saying we should separate those concerns :)
15:22:55 <gouthamr> sure will do
15:23:17 <tbarron> gouthamr: they're glommed together b/c getting the lvm job to work with python3 required the devstack plugin change
15:23:42 <tbarron> gouthamr: i don't fully understand why it didn't work w/o it, but both seemed the right thing to do and
15:23:50 <gouthamr> tbarron: all the jobs really, we were installing manila-tempest-plugin with py2
15:23:59 <gouthamr> s/were/are
15:24:20 <tbarron> gouthamr: well dummy and cephfs jobs were working with py3 I think
15:24:23 <bswartz> Py2 is like a zombie. You try to kill it but it keeps coming back
15:24:28 <tbarron> cephs is a different plugin though
15:24:36 <tbarron> dummy I didn't understand but anyways
15:24:54 <gouthamr> tbarron: iirc those were working because of the PYTHON3_VERSION variable
15:25:20 <tbarron> gouthamr: yeah, but i couldn't get lvm to work even with it
15:25:52 <tbarron> anyways, we'll separate the concerns, merge these in the right order, and have something we can understand a bit better I hope
15:26:18 <tbarron> Anything else on this topic?
15:26:42 <tbarron> #topic bugs
15:26:44 * gouthamr is slow to type thanks to attending multiple-meetings, sry
15:27:07 <tbarron> gouthamr: anything else on the previous topic?
15:27:11 <gouthamr> nope
15:27:25 <tbarron> our wonderful bug overlord is out today
15:27:38 <bswartz> In another meeting?
15:27:47 <tbarron> I don't know if gouthamr or vhariria have anything?
15:28:02 <tbarron> bswartz: no he's not in the office today.
15:28:18 <tbarron> I don't think we had a lot of new bugs this week.
15:28:39 <bswartz> I workflowed a few things
15:28:48 <tbarron> I closed out a couple, jgrosso cleaning up bugs and surfacing issues that we need to address has been
15:28:51 <tbarron> very helpful.
15:28:54 <tbarron> bswartz: thanks!
15:28:56 <bswartz> Is 646037 the one that will get split?
15:29:31 <tbarron> bswartz: yes
15:29:46 <vhariria> tbarron: nothing new, jgrosso is out no updates yet
15:29:51 <bswartz> Why must there be so much yaml?
15:30:00 <tbarron> its title indicates the original goal but I glommed all kinds of other stuff onto it
15:30:07 <tbarron> bswartz: so that it's not json
15:30:13 <tbarron> bswartz: or xml
15:30:23 <tbarron> I declare....
15:30:31 <tbarron> vhariria: thanks
15:30:34 <bswartz> Yes, but the volume of "markup" text is truly behemoth
15:30:51 <tbarron> bswartz: no argument from me on that :)
15:30:58 <bswartz> And I sense a lot of copy/pasting
15:31:13 <bswartz> Is this because of stuff we've done or is it because of indra?
15:31:18 <bswartz> s/indra/infra/
15:31:36 <tbarron> bswartz: I think both
15:32:10 <tbarron> bswartz: with JJB it was more centralized, but these "legacy jobs" don't really take advantage of
15:32:19 <gouthamr> hopefully, with converting these from legacy to native zuul v3 manifests, we can use templates to remove some of the duplication
15:32:24 <tbarron> what we can do with zuulv3 to make this stuff more modular
15:32:29 <bswartz> gouthamr: +100
15:32:32 <tbarron> ^^^ what gouthamr said
15:33:05 <tbarron> this is like halfway through a code refactor, when you expand it all to see how you can re-modularize
15:33:10 <bswartz> It strikes me as odd that we're embedding vast amounts of bash code in yaml files
15:33:24 <tbarron> bswartz: it is odd
15:34:05 <tbarron> I think you'll like the native zuulv3 jobs a lot better
15:34:44 <tbarron> but no one has had time to really tackle the conversion yet
15:35:04 <tbarron> While we're on bugs though.
15:35:21 <tbarron> lseki: how is that scaling bug going?
15:35:27 <tbarron> lseki: the one SAP found?
15:35:51 <tbarron> lseki: I think to reproduce it the dummy back end isn't going to be helpful.
15:36:04 <vkmc_> We need to plan that with time, I'm aware 3rd parties use the post and pre test scripts and might need time to migrate
15:36:08 <lseki> tbarron: the support team is contacting SAP folks to gather more info
15:36:11 <tbarron> lseki: it doesn't have any inertia in manila-share.
15:36:53 <tbarron> lseki: and as I understand it, the scheduler is showing manila-share as down when you have a lot of manila-share services under load.
15:37:02 <vkmc_> re zuulv3
15:37:38 <tbarron> vkmc_: yes, 3rd party conversion is another whole aspect :)
15:38:20 <tbarron> vkmc_: we might end up moving first party jobs off of those scripts but still need to keep them for some time.
15:38:45 <tbarron> lseki: it will be interesting to see what SAP says.
15:38:55 <bswartz> A service being marked "down" just means it hasn't sent a heartbeat in a while
15:39:12 <bswartz> For a service under load, I think it's easy to understand how that might happen
15:39:30 <bswartz> The question is how to manage the load better
15:40:09 <tbarron> bswartz: do you think that's all up to the driver for that back end or could we handle it better centrally?
15:40:22 <lseki> tbarron: yes, we'll keep it up-to-date at launchpad when SAP folks answer
15:40:43 <bswartz> tbarron: if the driver is able to behave badly enough to cause this symptom, then we have an architectural problem
15:41:03 <bswartz> Ideally all drivers would behave well, but there's no way to ensure that
15:41:25 <tbarron> bswartz: like maybe drivers should share some kind of threading library routine so that we can say it's healthy even if it's busy
15:41:30 <bswartz> So there has to be a mechanism to prevent driver slowness from causing availability problems
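[editor's aside: an illustrative sketch of the heartbeat check bswartz describes above — a service is reported "down" when its last heartbeat is older than a configured grace period, whether or not the process is merely busy. The constant and function names below are stand-ins for manila's actual configuration options and code.]

```python
# Illustrative sketch (not manila's actual code) of heartbeat-based
# liveness: the API/scheduler consider a service "down" when its last
# heartbeat is older than an assumed SERVICE_DOWN_TIME grace period.
from datetime import datetime, timedelta

SERVICE_DOWN_TIME = timedelta(seconds=60)  # assumed grace period


def is_service_up(last_heartbeat, now=None):
    """Return True if the service's last heartbeat is recent enough."""
    now = now or datetime.utcnow()
    return (now - last_heartbeat) <= SERVICE_DOWN_TIME


if __name__ == "__main__":
    # A manila-share process that is busy in the driver and misses a few
    # report intervals can exceed the grace period and be shown as "down"
    # even though it is healthy.
    busy_since = datetime.utcnow() - timedelta(seconds=90)
    print(is_service_up(busy_since))  # False: looks "down" under load
```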
15:42:22 <tbarron> I'm particularly interested in this issue b/c there are manila customers who want to deploy circa 100 manila-share services
15:42:47 <tbarron> The odds of some of these being busy are higher with more deployed.
15:42:59 <tbarron> And the odds of the scheduler somehow missing updates are higher.
15:43:14 <tbarron> And these will be "edge" deployments meaning that it is expected that
15:43:28 <tbarron> from time to time they will lose connectivity back to the central borg.
15:43:40 <tbarron> We don't want to confuse that kind of partition with
15:43:46 <bswartz> We need to think through this use case carefully
15:43:57 <tbarron> a false-negative like the stuff that this bug reports.
15:44:02 <bswartz> I can think of a few ways to attack the problem, but I don't understand the details well enough
15:44:25 <tbarron> bswartz: we have this as a PTG topic, so I am pointing ahead of time to it now.
15:44:36 <bswartz> Oh good
15:44:41 <tbarron> bswartz: cinder and manila both
15:45:02 <bswartz> Please try to schedule at a time I can participate (i.e. relatively early in the day)
15:45:37 <tbarron> we'll probably do early Friday on this one, after introductions, PTG schedule/agenda, and retrospective.
15:46:02 <tbarron> so later morning east coast US time
15:46:42 <tbarron> An aspect of this as well is that we really want to get manila-share (and cinder-volume) running active-active, w/o
15:46:46 <tbarron> pacemaker control.
15:47:16 <bswartz> That's a whole other dimension of complexity
15:47:21 <tbarron> That's because edge deployments may end up using software defined storage on hyperconverged nodes
15:47:42 <tbarron> and if you manage them active/passive with pacemaker then you risk losing a lot of your storage just
15:47:51 <tbarron> because some service appears unhealthy.
15:48:00 <tbarron> bswartz: yes :)
15:48:22 <tbarron> So there are some really interesting and important issues for this PTG I think.
15:49:03 <tbarron> plan of record for manila is to use DLM (tooz backed by etcd) for active-active synchronization issues but
15:49:12 <bswartz> If only python wasn't single-threaded
15:49:30 <tbarron> for these edge deployments there would be separate DLM clusters at the edges and given the
15:49:53 <tbarron> possibility of edge partitions, running an etcd cluster from the central borg all the way to the edges may be
15:49:56 <bswartz> Language issues bite us again and again
15:49:58 <tbarron> problematic
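[editor's aside: a minimal sketch of the plan-of-record approach tbarron mentions, a distributed lock via tooz backed by etcd. The backend URL, member id, and lock name are placeholders, not manila configuration.]

```python
# Minimal sketch of a tooz distributed lock backed by etcd. The endpoint,
# member id, and lock name are illustrative placeholders only.
from tooz import coordination

coordinator = coordination.get_coordinator(
    'etcd3+http://127.0.0.1:2379',   # assumed local etcd endpoint
    b'manila-share-host-1')          # unique member id for this service
coordinator.start(start_heart=True)

# Serialize work on a shared resource across active-active manila-share
# instances; the context manager blocks until the lock is acquired.
lock = coordinator.get_lock(b'share-server-provisioning')
with lock:
    pass  # critical-section work would go here

coordinator.stop()
```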
15:50:23 <tbarron> so an issue for manila to consider is re-architecting the cases where we rely on locking across services
15:52:27 <tbarron> we'd need to use the DB for exclusion in some cases where we use locks held across service casts today
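[editor's aside: a hedged sketch of what "using the DB for exclusion" could look like — an atomic conditional UPDATE acting as a compare-and-swap, so only one service instance wins a state transition. The table and column names are hypothetical, not manila's schema.]

```python
# Hedged sketch of DB-based exclusion in place of a lock held across a
# service cast: a conditional UPDATE is atomic, so only one caller wins.
# The 'shares' table and its columns here are hypothetical.
import sqlalchemy as sa

metadata = sa.MetaData()
shares = sa.Table(
    'shares', metadata,
    sa.Column('id', sa.String(36), primary_key=True),
    sa.Column('status', sa.String(255)),
)


def claim_share(engine, share_id):
    """Atomically move a share from 'available' to 'busy'.

    Returns True only for the single caller whose UPDATE matched the
    expected prior status; every other caller sees rowcount == 0.
    """
    stmt = (
        sa.update(shares)
        .where(shares.c.id == share_id)
        .where(shares.c.status == 'available')
        .values(status='busy')
    )
    with engine.begin() as conn:
        result = conn.execute(stmt)
        return result.rowcount == 1
```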
15:52:27 <tbarron> OK, just floating stuff for people to think about!
15:52:28 <tbarron> manila still has lots of very interesting issues to work on!
15:52:30 <tbarron> bswartz: mebbe we should do some of this with goroutines
15:52:52 <tbarron> ok, perhaps belated :)
15:52:52 <bswartz> Yes, rewrite all of openstack in go
15:52:59 <tbarron> #topic open discussion
15:53:04 <bswartz> That definitely won't cause any bugs
15:53:11 <tbarron> bswartz: :)
15:53:12 <bswartz> >_<
15:53:58 <bswartz> I really do like go as a language, and I regret that it wasn't mature enough when work on openstack started
15:53:59 <tbarron> less drastic may be squeezing out eventlet and leveraging some native threading and multiprocessing stuff
15:54:16 <tbarron> i find i like types
15:54:34 <bswartz> Indeed
15:54:35 <tbarron> for production code
15:54:53 <tbarron> though for scripting it's nice to be able to be lazy
15:55:00 <bswartz> +1
15:55:38 <tbarron> Seems like we're through for today.
15:56:00 <tbarron> Thanks everyone, see you in #openstack-manila and let's get ready for PTG!
15:56:04 <tbarron> #endmeeting