15:01:06 #startmeeting manila
15:01:06 Meeting started Thu Apr 11 15:01:06 2019 UTC and is due to finish in 60 minutes. The chair is tbarron. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:01:07 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:01:09 The meeting name has been set to 'manila'
15:01:11 .o/
15:01:19 hello :)
15:01:23 hello
15:01:25 courtesy ping: gouthamr xyang toabctl bswartz ganso erlon tpsilva vkmc amito jgrosso
15:01:51 hi
15:02:16 hi
15:02:41 we have some conflicting meetings at my workplace today so attendance may be low
15:02:49 doh!
15:02:55 hi
15:02:56 agenda: https://wiki.openstack.org/wiki/Manila/Meetings
15:03:10 #topic announcements
15:03:24 o/
15:03:26 Thanks bswartz for chairing last week's meeting!
15:03:36 np
15:03:56 So Stein shipped yesterday and the stable/stein branch has been cut.
15:04:15 That's really all on that front except
15:04:22 I'll remind
15:04:33 that our PTG planning etherpad is here:
15:04:50 #link https://etherpad.openstack.org/p/manila-denver-train-ptg-planning
15:05:02 whew, I didn't paste in the ping list :)
15:05:36 We have lots of stuff to discuss and plan at PTG, so please look over the etherpad first
15:06:06 Anyone else have any announcements?
15:06:25 ok
15:06:34 #topic Stable Backports
15:07:03 We've spent some time during the release phase looking at backports to stable branches since
15:07:28 we didn't have any release candidate bugs.
15:07:35 Woot
15:07:37 o/
15:07:37 hi vhariria lseki
15:08:07 so we have some that need reviews from stable/backports cores
15:08:20 as a reminder, this is a subset of the regular cores who
15:08:47 are accepted not just by manila cores but by the central stable/releases team
15:09:14 and they insist on seeing a track record of reviews of candidate backports before they let people join the club
15:09:34 Do they have to also be manila reviewers?
15:09:38 so if you are a core or an aspiring core and can't vote +/-2 your reviews are still helpful and
15:09:45 Or can generalized stable-maint reviewers help us merge backports?
15:09:53 you can build up a track record
15:10:04 bswartz: I think they can theoretically but they don't
15:10:18 they sometimes will -1/-2 though :)
15:10:22 That might be something to bring up with the wider community
15:10:40 If they want to centralize that function then more resource sharing would make more sense
15:10:58 bswartz: yeah, we can ask about that
15:11:08 anyways, we have some that need attention
15:11:18 #link https://review.openstack.org/#/q/status:open+project:openstack/manila+branch:stable/queens
15:11:33 #link https://review.openstack.org/#/c/648716/
15:12:15 We'd like to get these merged and then cut new stable/* releases before PTG
15:12:45 Also, stable/pike will soon join stable/ocata as an EM branch
15:12:52 "extended maintenance"
15:13:18 which means no more releases but as long as our team has interest and resources to keep
15:13:30 CI going for those branches we can still backport to them
15:14:04 We can check who still uses stable/ocata and stable/pike and see if this is worth it
15:14:44 Red Hat at least will be maintaining a stable/queens equivalent for some time but we have less interest in
15:15:00 stable/ocata and stable/pike as time goes on
15:15:17 so SUSE, Canonical, etc. can weigh in if you care
15:15:37 Anything else on this topic?
15:16:12 #topic Followup on: Third party CIs and devs: DevStack plugin changes - we will no longer support installing manila-tempest-plugin
15:16:33 gouthamr thanks for broaching this topic last week
15:16:43 #link https://review.openstack.org/#/c/648716/
15:16:52 is seemingly harmless, because there's at least one third party CI running there and passing
15:17:38 i started compiling an updated list of 3rd party CI maintainers - will email openstack-discuss and this list soon
15:17:48 gouthamr: I tested that it didn't break first-party jobs even with the manila plugin left as is, so I think that's right
15:18:04 gouthamr: thanks for working on that.
15:18:20 gouthamr: seems like reasonable progress for one week.
15:18:41 but I'd request some reviews of 648716 then.
15:18:50 tbarron: can we split out the devstack plugin changes from https://review.openstack.org/#/c/646037/ ?
15:18:56 Ooh it's NetApp CI
15:19:11 I think we should go on and merge it and if there are any issues -- there shouldn't be -- we can flush them out.
15:19:55 gouthamr: yes, are you thinking of a particular order if we split?
15:20:35 https://review.openstack.org/#/c/646037/ can depend on the manila devstack plugin changes and the manila-tempest-plugin patch (https://review.openstack.org/#/c/648716/)
15:21:49 gouthamr: sure, any particular reason?
15:22:16 yes, because the changes to the devstack plugin aren't related to the python3 changes
15:22:42 gouthamr: ok, -1 with a comment saying we should separate those concerns :)
15:22:55 sure will do
15:23:17 gouthamr: they're glommed together b/c getting the lvm job to work with python3 required the devstack plugin change
15:23:42 gouthamr: i don't fully understand why it didn't work w/o it, but both seemed the right thing to do and
15:23:50 tbarron: all the jobs really, we were installing manila-tempest-plugin with py2
15:23:59 s/were/are
15:24:20 gouthamr: well dummy and cephfs jobs were working with py3 I think
15:24:23 Py2 is like a zombie. You try to kill it but it keeps coming back
15:24:28 cephfs is a different plugin though
15:24:36 dummy I didn't understand but anyways
15:24:54 tbarron: iirc those were working because of the PYTHON3_VERSION variable
15:25:20 gouthamr: yeah, but i couldn't get lvm to work even with it
15:25:52 anyways, we'll separate the concerns, merge these in the right order, and have something we can understand a bit better I hope
15:26:18 Anything else on this topic?
15:26:42 #topic bugs
15:26:44 * gouthamr is slow to type thanks to attending multiple meetings, sry
15:27:07 gouthamr: anything else on the previous topic?
15:27:11 nope
15:27:25 our wonderful bug overlord is out today
15:27:38 In another meeting?
15:27:47 I don't know if gouthamr or vhariria have anything?
15:28:02 bswartz: no he's not in the office today.
15:28:18 I don't think we had a lot of new bugs this week.
15:28:39 I workflowed a few things
15:28:48 I closed out a couple, jgrosso cleaning up bugs and surfacing issues that we need to address has been
15:28:51 very helpful.
15:28:54 bswartz: thanks!
15:28:56 Is 646037 the one that will get split?
15:29:31 bswartz: yes
15:29:46 tbarron: nothing new, jgrosso is out, no updates yet
15:29:51 Why must there be so much yaml?
15:30:00 its title indicates the original goal but I glommed all kinds of other stuff onto it
15:30:07 bswartz: so that it's not json
15:30:13 bswartz: or xml
15:30:23 I declare....
15:30:31 vhariria: thanks
15:30:34 Yes, but the volume of "markup" text is truly behemoth
15:30:51 bswartz: no argument from me on that :)
15:30:58 And I sense a lot of copy/pasting
15:31:13 Is this because of stuff we've done or is it because of indra?
15:31:18 s/indra/infra/
15:31:36 bswartz: I think both
15:32:10 bswartz: with JJB it was more centralized, then these "legacy jobs" don't really take advantage of
15:32:19 hopefully, with converting these from legacy to native zuul v3 manifests, we can use templates to remove some of the duplication
15:32:24 what we can do with zuulv3 to make this stuff more modular
15:32:29 gouthamr: +100
15:32:32 ^^^ what gouthamr said
15:33:05 this is like halfway through a code refactor, when you expand it all to see how you can re-modularize
15:33:10 It strikes me as odd that we're embedding vast amounts of bash code in yaml files
15:33:24 bswartz: it is odd
15:34:05 I think you'll like the native zuulv3 jobs a lot better
15:34:44 but no one has had time to really tackle the conversion yet
15:35:04 While we're on bugs though.
15:35:21 lseki: how is that scaling bug going?
15:35:27 lseki: the one SAP found?
15:35:51 lseki: I think to reproduce it the dummy back end isn't going to be helpful.
15:36:04 We need to plan that with time, I'm aware 3rd parties use the post and pre test scripts and might need time to migrate
15:36:08 tbarron: the support team is contacting SAP folks to gather more info
15:36:11 lseki: it doesn't have any inertia in manila-share.
15:36:53 lseki: and as I understand it, the scheduler is showing manila-share as down when you have a lot of manila-share services under load.
15:37:02 re zuulv3
15:37:38 vkmc_: yes, 3rd party conversion is another whole aspect :)
15:38:20 vkmc_: we might end up moving first party jobs off of those scripts but still need to keep them for some time.
15:38:45 lseki: it will be interesting to see what SAP says.
15:38:55 A service being marked "down" just means it hasn't sent a heartbeat in a while
15:39:12 For a service under load, I think it's easy to understand how that might happen
15:39:30 The question is how to manage the load better
15:40:09 bswartz: do you think that's all up to the driver for that back end or could we handle it better centrally?
15:40:22 tbarron: yes, we'll keep it up-to-date at launchpad when SAP folks answer
15:40:43 tbarron: if the driver is able to behave badly enough to cause this symptom, then we have an architectural problem
15:41:03 Ideally all drivers would behave well, but there's no way to ensure that
15:41:25 bswartz: like maybe drivers should share some kind of threading library routine so that we can say it's healthy even if it's busy
15:41:30 So there has to be a mechanism to prevent driver slowness from causing availability problems
15:42:22 I'm particularly interested in this issue b/c there are manila customers who want to deploy circa 100 manila-share services
15:42:47 The odds of some of these being busy are higher with more deployed.
15:42:59 And the odds of the scheduler somehow missing updates are higher.
15:43:14 And these will be "edge" deployments meaning that it is expected that
15:43:28 from time to time they will lose connectivity back to the central borg.
15:43:40 We don't want to confuse that kind of partition with
15:43:46 We need to think through this use case carefully
15:43:57 a false-negative like the stuff that this bug reports.
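[Editor's note, not part of the meeting log: a minimal sketch of the heartbeat-style liveness check described above, where a service is reported "down" if it has not checked in within a grace period. The names SERVICE_DOWN_TIME and service_is_up are invented for illustration and do not reflect manila's actual code.]

```python
# Illustrative only -- a heartbeat/grace-period liveness check of the kind
# discussed above, not manila's real implementation.
from datetime import datetime, timedelta

SERVICE_DOWN_TIME = timedelta(seconds=60)  # hypothetical grace period


def service_is_up(last_heartbeat, now=None):
    """Return True if the service heartbeated within the grace period."""
    now = now or datetime.utcnow()
    return (now - last_heartbeat) <= SERVICE_DOWN_TIME


# A busy manila-share service that misses a couple of heartbeat intervals
# is reported "down" by this kind of check even though its process is
# alive -- the false negative the bug above describes.
print(service_is_up(datetime.utcnow() - timedelta(seconds=30)))   # True
print(service_is_up(datetime.utcnow() - timedelta(seconds=120)))  # False
```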
15:44:02 I can think of a few ways to attack the problem, but I don't understand the details well enough
15:44:25 bswartz: we have this as a PTG topic, so I am pointing ahead of time to it now.
15:44:36 Oh good
15:44:41 bswartz: cinder and manila both
15:45:02 Please try to schedule it at a time I can participate (i.e. relatively early in the day)
15:45:37 we'll probably do early Friday on this one, after introductions, PTG schedule/agenda, and retrospective.
15:46:02 so later morning east coast US time
15:46:42 An aspect of this as well is that we really want to get manila-share (and cinder-volume) running active-active, w/o
15:46:46 pacemaker control.
15:47:16 That's a whole other dimension of complexity
15:47:21 That's because edge deployments may end up using software defined storage on hyperconverged nodes
15:47:42 and if you manage them active/passive with pacemaker then you risk losing a lot of your storage just
15:47:51 because some service appears unhealthy.
15:48:00 bswartz: yes :)
15:48:22 So there are some really interesting and important issues for this PTG I think.
15:49:03 plan of record for manila is to use DLM (tooz backed by etcd) for active-active synchronization issues but
15:49:12 If only python wasn't single-threaded
15:49:30 for these edge deployments there would be separate DLM clusters at the edges and given the
15:49:53 possibility of edge partitions, running an etcd cluster from the central borg all the way to the edges may be
15:49:56 Language issues bite us again and again
15:49:58 problematic
15:50:23 so and issue for manila to consider is re-architecting the cases where we rely on locking across services
15:52:26 an issue
15:52:27 we'd need to use the DB for exclusion in some cases where we use locks held across service casts today
15:52:27 OK, just floating stuff for people to think about!
15:52:28 manila still has lots of very interesting issues to work on!
15:52:30 bswartz: mebbe we should do some of this with goroutines
15:52:52 ok, perhaps belated :)
15:52:52 Yes, rewrite all of openstack in go
15:52:59 #topic open discussion
15:53:04 That definitely won't cause any bugs
15:53:11 bswartz: :)
15:53:12 >_<
15:53:58 I really do like go as a language, and I regret that it wasn't mature enough when work on openstack started
15:53:59 less drastic may be squeezing out eventlet and leveraging some native threading and multiprocessing stuff
15:54:16 i find i like types
15:54:34 Indeed
15:54:35 for production code
15:54:53 though for scripting it's nice to be able to be lazy
15:55:00 +1
15:55:38 Seems like we're through for today.
15:56:00 Thanks everyone, see you in #openstack-manila and let's get ready for PTG!
15:56:04 #endmeeting
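[Editor's note, not part of the meeting log: a minimal sketch of the tooz-backed DLM locking mentioned above, assuming an etcd backend. The endpoint URL, member id, and lock name are placeholders, and this is not manila's actual code.]

```python
# Illustrative only -- distributed locking via tooz with an assumed etcd
# backend; the values below are placeholders.
from tooz import coordination

coordinator = coordination.get_coordinator(
    'etcd3+http://192.168.0.10:2379',   # assumed etcd endpoint
    b'manila-share-host-1')             # unique member id for this service
coordinator.start(start_heart=True)

# Only one member can hold this lock at a time -- the cross-service
# exclusion the meeting suggests rethinking (e.g. using the DB instead)
# for partition-prone edge sites.
lock = coordinator.get_lock("share-instance-update")
with lock:
    pass  # work that must not run concurrently across services

coordinator.stop()
```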