15:01:06 <tbarron> #startmeeting manila
15:01:06 <openstack> Meeting started Thu Apr 11 15:01:06 2019 UTC and is due to finish in 60 minutes. The chair is tbarron. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:01:07 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:01:09 <openstack> The meeting name has been set to 'manila'
15:01:11 <bswartz> .o/
15:01:19 <carloss> hello :)
15:01:23 <ganso> hello
15:01:25 <tbarron> courtesy ping: gouthamr xyang toabctl bswartz ganso erlon tpsilva vkmc amito jgrosso
15:01:51 <xyang> hi
15:02:16 <lseki> hi
15:02:41 <tbarron> we have some conflicting meetings at my workplace today so attendance may be low
15:02:49 <bswartz> doh!
15:02:55 <vh_> hi
15:02:56 <tbarron> agenda: https://wiki.openstack.org/wiki/Manila/Meetings
15:03:10 <tbarron> #topic announcements
15:03:24 <gouthamr> o/
15:03:26 <tbarron> Thanks bswartz for chairing last week's meeting!
15:03:36 <bswartz> np
15:03:56 <tbarron> So Stein shipped yesterday and stable/stein branch has been cut.
15:04:15 <tbarron> That's really all on that front except
15:04:22 <tbarron> I'll remind
15:04:33 <tbarron> that our PTG planning etherpad is here:
15:04:50 <tbarron> #link https://etherpad.openstack.org/p/manila-denver-train-ptg-planning
15:05:02 <tbarron> whew, I didn't paste in the ping list :)
15:05:36 <tbarron> We have lots of stuff to discuss and plan at PTG, so please look over the etherpad first
15:06:06 <tbarron> Anyone else have any announcements?
15:06:25 <tbarron> ok
15:06:34 <tbarron> #topic Stable Backports
15:07:03 <tbarron> We've spent some time during the release phase looking at backports to stable branches since
15:07:28 <tbarron> we didn't have any release candidate bugs.
15:07:35 <bswartz> Woot
15:07:37 <vkmc_> o/
15:07:37 <tbarron> hi vhariria lseki
15:08:07 <tbarron> so we have some that need reviews from stable/backports cores
15:08:20 <tbarron> as a reminder, this is a subset of the regular cores who
15:08:47 <tbarron> are accepted not just by manila cores but by the central stable/releases team
15:09:14 <tbarron> and they insist on seeing a track record of reviews of candidate backports before they let people join the club
15:09:34 <bswartz> Do they have to also be manila reviewers?
15:09:38 <tbarron> so if you are a core or an aspiring core and can't vote +/-2 your reviews are still helpful and
15:09:45 <bswartz> Or can generalized stable-maint reviewers help us merge backports?
15:09:53 <tbarron> you can build up a track record
15:10:04 <tbarron> bswartz: I think they can theoretically but they don't
15:10:18 <tbarron> they sometimes will -2/-2 though :)
15:10:22 <bswartz> That might be something to bring up with the wider community
15:10:40 <bswartz> If they want to centralize that function then more resource sharing would make more sense
15:10:58 <tbarron> bswartz: yeah, we can ask about that
15:11:08 <tbarron> anyways, we have some that need attention
15:11:18 <tbarron> #link https://review.openstack.org/#/q/status:open+project:openstack/manila+branch:stable/queens
15:11:33 <tbarron> #link https://review.openstack.org/#/c/648716/
15:12:15 <tbarron> We'd like to get these merged and then cut new stable/* releases before PTG
15:12:45 <tbarron> Also, stable/pike will soon join stable/ocata as an EM branch
15:12:52 <tbarron> "extended maintenance"
15:13:18 <tbarron> which means no more releases but as long as our team has interest and resources to keep
15:13:30 <tbarron> CI going for those branches we can still backport to them
15:14:04 <tbarron> We can check who still uses stable/ocata and stable/pike and see if this is worth it
15:14:44 <tbarron> Red Hat at least will be maintaining stable/queens equivalent for some time but we have less interest in
15:15:00 <tbarron> stable/ocata and stable/pike as time goes on
15:15:17 <tbarron> so SUSE, canonical, etc. can weigh in if you care
15:15:37 <tbarron> Anything else on this topic?
15:16:12 <tbarron> #topic Followup on: Third party CIs and devs: DevStack plugin changes - we will no longer support installing manila-tempest-plugin
15:16:33 <tbarron> gouthamr thanks for broaching this topic last week
15:16:43 <gouthamr> #link https://review.openstack.org/#/c/648716/
15:16:52 <gouthamr> is seemingly harmless, because there's at least one third party CI running there and passing
15:17:38 <gouthamr> i started compiling an updated list of 3rd party CI maintainers - will email openstack-discuss and this list soon
15:17:48 <tbarron> gouthamr: I tested that it didn't break first-party jobs even with the manila plugin left as is, so I think that's right
15:18:04 <tbarron> gouthamr: thanks for working on that.
15:18:20 <tbarron> gouthamr: seems like reasonable progress for one week.
15:18:41 <tbarron> but I'd request some reviews of 648716 then.
15:18:50 <gouthamr> tbarron: can we split out the devstack plugin changes from https://review.openstack.org/#/c/646037/ ?
15:18:56 <bswartz> Ooh it's NetApp CI
15:19:11 <tbarron> I think we should go on and merge it and if there's any issues -- there shouldn't be -- we can flush them out.
15:19:55 <tbarron> gouthamr: yes, are you thinking of a particular order if we split?
15:20:35 <gouthamr> https://review.openstack.org/#/c/646037/ can depend on the manila devstack plugin changes and the manila-tempest-plugin patch (https://review.openstack.org/#/c/648716/)
15:21:49 <tbarron> gouthamr: sure, any particular reason?
15:22:16 <gouthamr> yes, because the changes to the devstack plugin aren't related to the python3 changes
15:22:42 <tbarron> gouthamr: ok, -1 with a comment saying we should separate those concerns :)
15:22:55 <gouthamr> sure will do
15:23:17 <tbarron> gouthamr: they're glommed together b/c getting the lvm job to work with python3 required the devstack plugin change
15:23:42 <tbarron> gouthamr: i don't fully understand why it didn't work w/o it, but both seemed the right thing to do and
15:23:50 <gouthamr> tbarron: all the jobs really, we were installing manila-tempest-plugin with py2
15:23:59 <gouthamr> s/were/are
15:24:20 <tbarron> gouthamr: well dummy and cephfs jobs were working with py3 I think
15:24:23 <bswartz> Py2 is like a zombie. You try to kill it but it keeps coming back
15:24:28 <tbarron> cephfs is a different plugin though
15:24:36 <tbarron> dummy I didn't understand but anyways
15:24:54 <gouthamr> tbarron: iirc those were working because of the PYTHON3_VERSION variable
15:25:20 <tbarron> gouthamr: yeah, but i couldn't get lvm to work even with it
15:25:52 <tbarron> anyways, we'll separate the concerns, merge these in the right order, and have something we can understand a bit better I hope
15:26:18 <tbarron> Anything else on this topic?
15:26:42 <tbarron> #topic bugs
15:26:44 * gouthamr is slow to type thanks to attending multiple meetings, sry
15:27:07 <tbarron> gouthamr: anything else on the previous topic?
15:27:11 <gouthamr> nope
15:27:25 <tbarron> our wonderful bug overlord is out today
15:27:38 <bswartz> In another meeting?
15:27:47 <tbarron> I don't know if gouthamr or vhariria have anything?
15:28:02 <tbarron> bswartz: no he's not in the office today.
15:28:18 <tbarron> I don't think we had a lot of new bugs this week.
15:28:39 <bswartz> I workflowed a few things
15:28:48 <tbarron> I closed out a couple, jgrosso cleaning up bugs and surfacing issues that we need to address has been
15:28:51 <tbarron> very helpful.
15:28:54 <tbarron> bswartz: thanks!
15:28:56 <bswartz> Is 646037 the one that will get split?
15:29:31 <tbarron> bswartz: yes
15:29:46 <vhariria> tbarron: nothing new, jgrosso is out no updates yet
15:29:51 <bswartz> Why must there be so much yaml?
15:30:00 <tbarron> its title indicates the original goal but I glommed all kinds of other stuff onto it
15:30:07 <tbarron> bswartz: so that it's not json
15:30:13 <tbarron> bswartz: or xml
15:30:23 <tbarron> I declare....
15:30:31 <tbarron> vhariria: thanks
15:30:34 <bswartz> Yes, but the volume of "markup" text is truly behemoth
15:30:51 <tbarron> bswartz: no argument from me on that :)
15:30:58 <bswartz> And I sense a lot of copy/pasting
15:31:13 <bswartz> Is this because of stuff we've done or is it because of indra?
15:31:18 <bswartz> s/indra/infra/
15:31:36 <tbarron> bswartz: I think both
15:32:10 <tbarron> bswartz: with JJB it was more centralized, then these "legacy jobs" don't really take advantage of
15:32:19 <gouthamr> hopefully, with converting these from legacy to native zuul v3 manifests, we can use templates to remove some of the duplication
15:32:24 <tbarron> what we can do with zuulv3 to make this stuff more modular
15:32:29 <bswartz> gouthamr: +100
15:32:32 <tbarron> ^^^ what gouthamr said
15:33:05 <tbarron> this is like halfway through a code refactor, when you expand it all to see how you can re-modularize
15:33:10 <bswartz> It strikes me as odd that we're embedding vast amounts of bash code in yaml files
15:33:24 <tbarron> bswartz: it is odd
15:34:05 <tbarron> I think you'll like the native zuulv3 jobs a lot better
15:34:44 <tbarron> but no one has had time to really tackle the conversion yet
15:35:04 <tbarron> While we're on bugs though.
15:35:21 <tbarron> lseki: how is that scaling bug going?
15:35:27 <tbarron> lseki: the one SAP found?
15:35:51 <tbarron> lseki: I think to reproduce it the dummy back end isn't going to be helpful.
15:36:04 <vkmc_> We need to plan that with time, I'm aware 3rd parties use the post and pre test scripts and might need time to migrate
15:36:08 <lseki> tbarron: the support team is contacting SAP folks to gather more info
15:36:11 <tbarron> lseki: it doesn't have any inertia in manila-share.
15:36:53 <tbarron> lseki: and as I understand it, the scheduler is showing manila-share as down when you have a lot of manila-share services under load.
15:37:02 <vkmc_> re zuulv3
15:37:38 <tbarron> vkmc_: yes, 3rd party conversion is another whole aspect :)
15:38:20 <tbarron> vkmc_: we might end up moving first party jobs off of those scripts but still need to keep them for some time.
15:38:45 <tbarron> lseki: it will be interesting to see what SAP says.
15:38:55 <bswartz> A service being marked "down" just means it hasn't sent a heartbeat in a while
15:39:12 <bswartz> For a service under load, I think it's easy to understand how that might happen
15:39:30 <bswartz> The question is how to manage the load better
15:40:09 <tbarron> bswartz: do you think that's all up to the driver for that back end or could we handle it better centrally?
15:40:22 <lseki> tbarron: yes, we'll keep it up-to-date at launchpad when SAP folks answer
15:40:43 <bswartz> tbarron: if the driver is able to behave badly enough to cause this symptom, then we have an architectural problem
15:41:03 <bswartz> Ideally all drivers would behave well, but there's no way to ensure that
15:41:25 <tbarron> bswartz: like maybe drivers should share some kind of threading library routine so that we can say it's healthy even if it's busy
15:41:30 <bswartz> So there has to be a mechanism to prevent driver slowness from causing availability problems
15:42:22 <tbarron> I'm particularly interested in this issue b/c there are manila customers who want to deploy circa 100 manila-share services
15:42:47 <tbarron> The odds of some of these being busy are higher with more deployed.
15:42:59 <tbarron> And the odds of the scheduler somehow missing updates are higher.
15:43:14 <tbarron> And these will be "edge" deployments meaning that it is expected that
15:43:28 <tbarron> from time to time they will lose connectivity back to the central borg.
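[Editor's note: the exchange above turns on how a service gets marked "down" -- the scheduler compares the timestamp of the last heartbeat a manila-share service reported against a configured grace period, so a driver busy enough to starve the heartbeat loop looks dead. The sketch below is illustrative only; the function name and 60-second threshold are assumptions, not manila's actual code.]

    # Editor's sketch (assumed names/threshold): liveness by heartbeat age.
    from datetime import datetime, timedelta

    SERVICE_DOWN_TIME = timedelta(seconds=60)  # illustrative grace period

    def service_is_up(last_heartbeat, now=None):
        """Return True if the service heartbeated within the grace period."""
        now = now or datetime.utcnow()
        return (now - last_heartbeat) <= SERVICE_DOWN_TIME

    # A share service whose driver call blocked for three minutes has a stale
    # heartbeat, so the scheduler would treat its backend as "down".
    stale = datetime.utcnow() - timedelta(minutes=3)
    print(service_is_up(stale))  # False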
15:43:40 <tbarron> We don't want to confuse that kind of partition with
15:43:46 <bswartz> We need to think through this use case carefully
15:43:57 <tbarron> a false-negative like the stuff that this bug reports.
15:44:02 <bswartz> I can think of a few ways to attack the problem, but I don't understand the details well enough
15:44:25 <tbarron> bswartz: we have this as a PTG topic, so I am pointing ahead of time to it now.
15:44:36 <bswartz> Oh good
15:44:41 <tbarron> bswartz: cinder and manila both
15:45:02 <bswartz> Please try to schedule at a time I can participate (i.e. relatively early in the day)
15:45:37 <tbarron> we'll probably do early Friday on this one, after introductions, PTG schedule/agenda, and retrospective.
15:46:02 <tbarron> so later morning east coast US time
15:46:42 <tbarron> An aspect of this as well is that we really want to get manila-share (and cinder-volume) running active-active, w/o
15:46:46 <tbarron> pacemaker control.
15:47:16 <bswartz> That's a whole other dimension of complexity
15:47:21 <tbarron> That's because edge deployments may end up using software defined storage on hyperconverged nodes
15:47:42 <tbarron> and if you manage them active/passive with pacemaker then you risk losing a lot of your storage just
15:47:51 <tbarron> because some service appears unhealthy.
15:48:00 <tbarron> bswartz: yes :)
15:48:22 <tbarron> So there are some really interesting and important issues for this PTG I think.
15:49:03 <tbarron> plan of record for manila is to use DLM (tooz backed by etcd) for active-active synchronization issues but
15:49:12 <bswartz> If only python wasn't single-threaded
15:49:30 <tbarron> for these edge deployments there would be separate DLM clusters at the edges and given the
15:49:53 <tbarron> possibility of edge partitions running etcd cluster from central borg all the way to edges may be
15:49:56 <bswartz> Language issues bite us again and again
15:49:58 <tbarron> problematic
15:50:23 <tbarron> so an issue for manila to consider is re-architecting the cases where we rely on locking across services
15:52:26 <tbarron> an issue
15:52:27 <tbarron> we'd need to use the DB for exclusion in some cases where we use locks held across service casts today
15:52:27 <tbarron> OK, just floating stuff for people to think about!
15:52:28 <tbarron> manila still has lots of very interesting issues to work on!
15:52:30 <tbarron> bswartz: mebbe we should do some of this with goroutines
15:52:52 <tbarron> ok, perhaps belated :)
15:52:52 <bswartz> Yes, rewrite all of openstack in go
15:52:59 <tbarron> #topic open discussion
15:53:04 <bswartz> That definitely won't cause any bugs
15:53:11 <tbarron> bswartz: :)
15:53:12 <bswartz> >_<
15:53:58 <bswartz> I really do like go as a language, and I regret that it wasn't mature enough when work on openstack started
15:53:59 <tbarron> less drastic may be squeezing out eventlet and leveraging some native threading and multiprocessing stuff
15:54:16 <tbarron> i find i like types
15:54:34 <bswartz> Indeed
15:54:35 <tbarron> for production code
15:54:53 <tbarron> though for scripting it's nice to be able to be lazy
15:55:00 <bswartz> +1
15:55:38 <tbarron> Seems like we're through for today.
15:56:00 <tbarron> Thanks everyone, see you in #openstack-manila and let's get ready for PTG!
15:56:04 <tbarron> #endmeeting
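[Editor's note: for readers unfamiliar with the "DLM (tooz backed by etcd)" plan of record mentioned at 15:49:03, the sketch below shows roughly how a tooz distributed lock could serialize work across active-active manila-share workers. It is a minimal illustration, not manila's implementation: the etcd endpoint URL, member id, and lock name are assumptions, and edge deployments would point at their local etcd cluster rather than the central one, which is exactly the partition concern raised above.]

    # Editor's sketch: distributed locking via tooz with an etcd3 backend.
    from tooz import coordination

    # Each manila-share worker joins the coordination backend with a unique
    # member id; the URL below is an assumed local etcd endpoint.
    coordinator = coordination.get_coordinator(
        'etcd3://localhost:2379', b'manila-share-host1')
    coordinator.start(start_heart=True)

    # A named distributed lock stands in for locks that today are held
    # locally or across service casts.
    lock = coordinator.get_lock(b'example-shared-resource')
    with lock:
        # critical section: only one worker in the cluster runs this at a time
        pass

    coordinator.stop()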