15:00:30 <bswartz> #startmeeting manila
15:00:30 <openstack> Meeting started Thu Dec 10 15:00:30 2015 UTC and is due to finish in 60 minutes.  The chair is bswartz. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:31 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:34 <openstack> The meeting name has been set to 'manila'
15:00:43 <bswartz> hello all
15:00:47 <u_glide1> hello
15:00:48 <vponomaryov> hi
15:00:52 <markstur_> hi
15:00:55 <aovchinnikov> hi
15:00:56 <xyang1> hi
15:00:58 <ganso> hello
15:01:12 <zhongjun2> hi
15:01:13 <bswartz> #agenda https://wiki.openstack.org/wiki/Manila/Meetings
15:01:24 <csaba> hi
15:02:00 <cknight1> Hi
15:02:06 <bswartz> okay
15:02:23 <bswartz> #topic announcements
15:02:38 <toabctl> hi
15:02:54 <bswartz> The midcycle meeting has been scheduled for Jan 13-14
15:03:12 <dustins> \o
15:03:13 <bswartz> that's pretty early, which may be a good thing
15:03:30 <ameade> o/
15:03:36 <bswartz> I'll be updating the wiki and etherpad with relevant details
15:03:44 <bswartz> but the date is set now
15:03:46 <tbarron> hi
15:04:04 <bswartz> #topic QoS_support
15:04:13 <bswartz> zhongjun2: you're up
15:04:15 <zhongjun2> Link: https://review.openstack.org/#/c/247286/
15:04:24 <zhongjun2> Most of the drivers already report 'QoS_support': False to the scheduler.
15:04:24 <zhongjun2> Do we need to keep QoS_support as a common capability and document it in capabilities_and_extra_specs.rst?
15:05:16 <bswartz> zhongjun2: so the first step is for us to agree that we need a qos_support common extra spec, and to agree what it means
15:05:24 <bswartz> then we add it to the dev doc you mentioned
15:05:40 <bswartz> we've discussed how qos should work in manila a few times
15:06:07 <bswartz> and I want to thank you for taking lead on the topic of qos and continuing to push for progress
15:06:35 <bswartz> zhongjun2: do you have a definition for what the qos_support extra spec should mean?
15:07:01 <bswartz> oh n/m it's in the link
15:08:05 <zhongjun2> Drivers could define extra specs as they need.
15:08:30 <bswartz> okay so I think we all agreed that we want admins to be able to set qos limits in their share types, but that the limits should be set by vendor-specific extra specs
15:08:56 <bswartz> so what does this standard QoS_support extra spec give us?
15:09:16 <zhongjun2> Currently, QoS_support is a common capability in the Manila code, it's just not written in the doc.
15:09:53 <bswartz> is the idea that you'd like to setup a filter in your share type for QoS_support=True?
15:10:04 <zhongjun2> I wrote the extra spec in patch set 4.
15:10:18 <bswartz> then if you have multiple backend types, some of which have QoS and some of which don't, you'll only get the QoS-supporting backends?
15:10:55 <bswartz> when was the QoS_support extra spec added to the code? does anyone know?
15:11:03 <bswartz> was that part of the cinder fork? or was it added later?
15:11:22 <zhongjun2> Yes, I'd like to set up a filter in my share type for QoS_support=True
15:11:49 <kaisers> \o
15:12:52 <zhongjun2> https://github.com/openstack/manila/blob/master/manila/scheduler/host_manager.py#L111
15:12:55 <markstur_> I'd suggest we remove the old QoS_support=False as unused copy/paste code
15:13:06 <bswartz> okay so we know that different backends may have different levels of QoS support -- some may support throttling, while others may support minimum guarantees
15:13:31 <bswartz> if we just have a single extra spec that implies "support" it won't tell the admin that much
15:13:36 <zhongjun2> long ago, we already had qos_support
15:13:56 <markstur_> For zhongjun2: either add a capability like huawei_qos = True to do what he wants, or get some agreement on what qos = True will be
15:14:21 <bswartz> markstur_: I'm thinking something similar
15:14:46 <ganso> markstur_: s/he/she
15:15:02 <zhongjun2> :)
15:15:21 <markstur_> Thanks, ganso
15:15:23 <bswartz> I'm thinking if we have huawei_max_iops and huawei_max_bytes_per_sec (or whatever they are) then you set up your share_type with extra specs like:
15:15:53 <bswartz> driver_name <is> Huawei, huawei_max_iops = 1000
15:16:16 <zhongjun2> Is QoS_support = True plus huawei_max_iops=300 enough?
15:16:18 <bswartz> and in that case you don't need a QoS_support capability
15:16:34 <bswartz> if you have both netapp and huawei, then it looks like
15:17:06 <bswartz> driver_name <in> (Huawei, NetApp), huawei_max_iops = 1000, netapp_max_iops = 1000
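A minimal CLI sketch of the share types bswartz describes above. The max_iops spec names are illustrative rather than real driver capabilities, driver_name is assumed to be a capability the backends report, and the operator spelling follows the ExtraSpecsOps matcher manila inherited from its cinder fork:

    # single-vendor case: pin the type to Huawei and pass the vendor limit
    manila type-create qos_huawei false
    manila type-key qos_huawei set driver_name=Huawei huawei_max_iops=1000

    # two-vendor case, approximating "driver_name <in> (Huawei, NetApp)";
    # the vendor limits are spelled as scoped specs (huawei:max_iops) so the
    # scheduler ignores them and only the matching driver consumes them --
    # the "scoped specs" point markstur_ raises below
    manila type-create qos_any false
    manila type-key qos_any set driver_name="<or> Huawei <or> NetApp" huawei:max_iops=1000 netapp:max_iops=1000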
15:17:36 <bswartz> I think you can get what you want without needing a common extra_spec
15:18:14 <bswartz> I'm not opposed to a common extra spec if it adds something, but so far I can't see what it adds other than a simple shorthand for the driver_name <in> .... filter
15:18:23 <zhongjun2> In Cinder, QoS_support exists, and different manufacturers have different qos parameters; it is ok
15:18:42 <bswartz> and my worry is that as different vendors add different ways of doing QoS, the capability might actually cause confusion
15:19:19 <bswartz> zhongjun2: the difference in cinder is that they've defined a common way of specifying limits which everyone is supposed to support -- so the meaning of the QoS_support flag is explicit in that case
15:19:50 <bswartz> we decided that the common definition can actually lead to inconsistent behavior though, in case one vendor's IOP is different from another vendor's IOP
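For contrast, the cinder mechanism bswartz refers to is the qos-specs API. A rough sketch; the particular keys are illustrative, and back-end keys are still interpreted vendor by vendor:

    # define a common QoS spec and attach it to a volume type
    cinder qos-create silver-qos consumer=back-end read_iops_sec=500
    cinder qos-associate <qos-specs-id> <volume-type-id>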
15:19:51 <ganso> bswartz: I think QoS_support only says if it is enabled or not. Like, if a user has a share on a backend that supports QoS, and he would like to continue having QoS after a retype or migration, that would be the case... even if other QoS-related extra-specs are different
15:20:37 <ganso> bswartz: so instead of huawei specifying huawei_qos = True + other extra_specs, and HDS specifying hds_qos = True + other extra_specs, both can specify QoS_support = True
15:21:01 <bswartz> btw since Tokyo I've looked into exactly how NetApp does QoS and it's somewhat complicated
15:21:27 <bswartz> it seems unlikely that the NetApp method would match exactly with another vendor's method
15:21:38 <bswartz> the differences might be small enough to ignore, but I worry about the case when they're not
15:22:12 <zhongjun2> In the Huawei driver, some arrays support qos and some arrays do not, so we need a qos flag to choose the right one; driver_name <in> (Huawei, NetApp) is not enough.
15:22:20 <bswartz> ganso: I get what you're saying, but what happens when some drivers support qos ceilings only and other drivers support qos ceilings and floors?
15:22:24 <ganso> bswartz: defining specs like "max_iops" is complicated across vendors, but zhongjun2 is proposing something simpler for now
15:23:04 <ganso> bswartz: anything besides "QoS_support True/False" would be vendor specific, not cross-vendor
15:23:07 <bswartz> ganso: if qos_support only implies that the driver supports some kind of throttling, then how do you filter to drivers that also support qos guarantees?
15:23:49 <ganso> bswartz: I guess right now you don't, unless they are from the same vendor
15:23:59 <bswartz> if we do this then we just need to be very clear that qos_support doesn't have any specific meaning -- it's just a common convention that all drivers can use to express whether their specific QoS feature is enabled on that array
15:24:18 <ganso> bswartz: +1
15:24:27 <cknight1> bswartz: +1  But I still don't see the value.
15:25:04 <bswartz> cknight: you'd prefer that huawei adds a huawei_qos_support flag to distinguish between huawei arrays with qos and huawei arrays without?
15:25:55 <cknight1> bswartz:  something like that, I guess.  it just seems odd to add a flag that is common for all drivers if there is no related QoS feature in common.
15:26:22 <bswartz> cknight1: I agree -- the benefit that I see is less namespace pollution
15:26:47 <bswartz> and also the fact that the code has been in tree for a long time -- less disruption to leave it in and give it an explicit meaning
15:27:04 <bswartz> as long as we clearly document what the flag means and doesn't mean, I think we're okay
15:27:16 <markstur_> I guess a common capability would allow you to do the example w/ both huawei and netapp specs (scoped specs)
15:27:23 <cknight1> bswartz: no harm there, I suppose.  but you'll still need vendor-specific flags to take advantage of any QoS features.
15:27:36 <bswartz> it would be great if there was a QoS example for admins that showed how to configure qos on a real array in the real world
15:27:39 <markstur_> otherwise we'd need OR for scheduler capabilities (we don't have that, do we???)
15:27:56 <zhongjun2> Ok, I will add some examples (huawei and netapp etc)
15:28:05 <bswartz> markstur_: that's an excellent point
15:28:19 <bswartz> okay I'm sold
15:28:36 <markstur_> Previously I preferred huawei_qos, but it would get ugly if we all did that, I guess.
15:28:52 <ganso> markstur_: +1
15:28:55 <markstur_> for a similar capability
15:28:59 <bswartz> without an OR operator in the filter scheduler, a common extra spec solves a real serious problem
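In practice, the convention just agreed to would let one share type span vendors without an OR over driver names. A sketch, again with illustrative vendor-scoped spec names:

    # match any backend reporting the common capability, whatever the vendor
    manila type-create qos_gold false
    manila type-key qos_gold set QoS_support="<is> True" huawei:max_iops=1000 netapp:max_iops=1000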
15:29:21 <markstur_> "Common and/or Similar Capability-like-things"
15:29:40 <bswartz> okay so let's move this discussion to the gerrit review
15:29:59 <bswartz> I'll propose the wording I'd like to see in the dev docs, and we can iterate there until we're all happy
15:30:05 <bswartz> #link https://review.openstack.org/#/c/247286
15:30:11 <ganso> bswartz: +1
15:30:27 <bswartz> zhongjun2: does that work for you?
15:31:05 <zhongjun2> Ok, thanks. So we agree to use qos_support as a common capability?
15:31:30 <bswartz> zhongjun2: I don't hear any strong opposition
15:31:41 <bswartz> zhongjun2: thanks again for continuing to pursue this
15:31:52 <bswartz> #topic Architectural concerns and interoperability of new features
15:32:00 <bswartz> cknight: you're up
15:32:04 <cknight1> It's a touch premature to go into details until I can write everything down, so this is largely a teaser about an upcoming proposal.
15:32:16 <cknight1> Over the last several months, Manila has added or proposed a number of experimental features like migration, CGs, and replication.
15:32:30 * bswartz mutters about migrating replicated consistency groups...
15:32:37 <cknight1> Migration is generic, but CGs and replication will be limited to just a few backends, so the user experience is irregular.
15:33:00 <cknight1> Apart from Ben and me muttering, not much thought has been given to how those features will interoperate.
15:33:15 <cknight1> And there are more things on the roadmap, such as backup and retype, that must fit cleanly into the feature matrix.
15:33:30 <cknight1> I'm very concerned that we are adding major features without an overarching architectural vision.
15:33:48 <cknight1> If that continues, we will end up with a hugely complicated feature support matrix and codebase that makes life miserable for both users and developers.
15:33:53 <bswartz> +1
15:33:56 <dustins> And testers!
15:34:02 <ameade> +1
15:34:06 <cknight1> dustins: Yes!
15:34:13 <cknight1> dustins: never forget the testers
15:34:21 <cknight1> I've been developing a set of ideas that could provide an orderly framework for advanced features while making the user experience more uniform.
15:34:25 <dustins> bswartz's scenario is the stuff of my nightmares
15:34:28 <ganso> +1
15:34:39 <cknight1> (Incidentally, the vision I will propose would apply equally well to Cinder, which IMHO is further down the path towards a supportability disaster.)
15:34:51 <cknight1> But with new features labeled as experimental, Manila is better positioned to show the way.
15:35:20 <cknight1> Kudos to Ben for insisting the new stuff be experimental.
15:35:31 <bswartz> xyang1 and tbarron both know about the discussions going on in cinder to try to reconcile similar issues
15:35:44 <cknight1> Before next week's meeting, I will add everything to a wiki and post a link to the Manila DL.  So stay tuned!
15:35:54 <xyang1> bswartz: sure
15:35:57 <cknight1> And if you have ideas of your own, please share them or add to the wiki.
15:36:03 <dustins> cknight1: looking forward to it
15:36:13 <bswartz> cknight1: want to give us the 2 sentence summary of the proposal?
15:36:20 <ameade> i for one am really excited about mitigating these feature matrix issues
15:36:29 <cknight1> bswartz: OK, I'll try.
15:36:40 <tbarron> that was one sentence right there
15:36:47 <ganso> lol
15:37:13 <cknight1> bswartz: Look at CGs.  It's a highly specific grouping concept with very valuable but limited applicability.
15:37:30 <cknight1> bswartz: Anyone with a non-CG backend can't use it.
15:37:49 <cknight1> bswartz: And now, all other advanced features have to be CG aware.  That snowballs very fast.
15:38:07 <cknight1> bswartz: So instead, here are two ideas:
15:38:38 <cknight1> bswartz: #1  Add a generic grouping construct to Manila.  Without any guaranteed high-value features.
15:39:10 <cknight1> bswartz: If a driver doesn't do CGs or group-based replication, no problem, the share manager just handles groups in a simple loop.
15:39:25 <cknight1> bswartz: So all drivers, even LVM or Generic, get simple grouping functions.
15:39:59 <cknight1> bswartz: #2  Add the notion of a Share Group Type, similar to the existing Share Type.
15:40:20 <cknight1> bswartz: So a group can have capabilities like consistent snapshots, group_replication, etc.
15:40:39 <cknight1> bswartz: If a group has those set, the manager can defer to its driver to handle the advanced feature.
15:40:59 <toabctl> cknight1: #1 means that everything is a group (and groups with a single member are possible) ? so no difference between CGs and non-CGs ?
15:41:00 <bswartz> cknight1: that was more than 2 sentences, but I think it at least makes the proposal clear
15:41:03 <cknight1> bswartz: So *all* functions available on primitives (i.e. shares) are available in groups.
15:41:16 <vponomaryov> cknight1: what if we make all shares relate to some CG, but by default as a 1-to-1 relation?
15:41:16 <cknight1> bswartz: The user experience is much simpler and uniform.
15:41:41 <vponomaryov> toabctl: +1 ))
15:42:02 <cknight1> bswartz: And the framework is there for advanced grouping functions like CGs, replication of CGs, retype of things in groups, migration of replicated groups, etc.
15:42:40 <cknight1> vponomaryov: That's an implementation detail that I've thought about but not concluded on.  Insight welcome!
15:42:51 <bswartz> toabctl: I think there is a difference between CGs and non-CGs, but the difference comes in the extra specs on the group type
15:43:00 <cknight1> bswartz: OK, that's the high-level summary.  More coming in a wiki.
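To make the proposal concrete, a purely hypothetical sketch of ideas #1 and #2; none of these commands or keys existed at the time of this meeting:

    # idea #2: a group type that advertises advanced capabilities
    manila share-group-type-create cg_capable
    manila share-group-type-key cg_capable set consistent_snapshot_support="<is> True"

    # idea #1: any driver can serve a plain group; without native group
    # support the share manager would just loop over the member shares
    manila share-group-create --group-type cg_capable my_group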
15:43:01 <xyang1> I am opposed to doing this in Cinder because we have explored it before and decided not to do it
15:43:13 <bswartz> at the manila DB level the groups would look the same
15:43:55 <bswartz> xyang1: if you remember the reasons why cinder decided against a similar approach they would be really valuable feedback for cknight
15:44:34 <xyang1> a long story and I think a lot of those are captured in reviews
15:44:41 <cknight1> bswartz: +1.  Don't want to repeat the mistakes of others.
15:44:50 <cknight1> xyang1: Please send links to relevant reviews!
15:44:52 <xyang1> I can provide links later
15:44:53 <bswartz> xyang1: when cknight mentioned the idea to me it sounded like the proposal matched fairly well with what you've been pushing in cinder
15:45:13 <xyang1> we are not doing group type in cinder
15:45:35 <xyang1> #1 may be closer to what we are trying in cinder
15:45:43 <bswartz> so if we're missing a crucial detail then let's find out what it is
15:46:06 <xyang1> but we are trying to use the same replication api for volume and group
15:46:24 <xyang1> sure
15:46:37 <bswartz> xyang1: was it a case of difficulty with forward/backward compatibility with existing APIs? because we don't have those problems in Manila
15:46:54 <tbarron> xyang1: I think cknight1's idea may be related to winston's ?
15:47:05 <bswartz> we are fortunate that the existing CG implementation in Manila is marked experimental, so we can rewrite it if we choose
15:47:16 <xyang1> it was because of the complexity it will bring
15:47:35 <xyang1> we eventually gave up on it
15:47:42 <xyang1> that was in Grizzly
15:48:05 <xyang1> I am referring to the group type
15:48:19 <bswartz> xyang1: I agree that groups with types add a new level of complexity -- but my fear is that trying to mix features like replication, cgs, and migration without a common grouping concept will result in even worse complexity
15:48:28 <ameade> +1
15:48:29 <bswartz> so I'm willing to entertain this proposal
15:48:34 <cknight1> bswartz: +1  My thoughts exactly.
15:48:53 <xyang1> that is fine, but I don't think we should do it in cinder
15:49:53 <xyang1> our file based storage does not have a CG concept, so it is different in manila and cinder that way too
15:50:09 <bswartz> well that's a discussion for the cinder meeting -- we're not here to talk about cinder, other than as an example of what's been tried before and how it's turned out
15:50:42 <xyang1> you said something about cinder in the beginning
15:50:47 <bswartz> I think I care less than others about maintaining commonality between cinder and manila
15:51:33 <xyang1> If you want to try in Manila, I don't have problem
15:52:18 <cknight1> xyang1: fair enough, thanks.  I look forward to seeing some of the earlier Cinder discussions.
15:52:23 <bswartz> well I hope we can discuss cknight's idea in more depth next week after there is a doc about it
15:52:52 <bswartz> #topic open discussion
15:53:01 <bswartz> does anyone have something else for this week?
15:53:47 <xyang1> cknight1:  one important thing is the difference between a group and pool
15:53:57 <bswartz> next week the manila meeting should happen as usual, but the following 2 weeks we start to run into holidays so we should discuss if those meetings should be cancelled
15:54:07 <cknight1> xyang1: yes, very true.
15:54:25 <tbarron> xyang1: +1 :-)
15:54:48 <bswartz> does anyone plan to work on dec 24th and dec 31st?
15:55:01 <vponomaryov> bswartz: yes
15:55:03 <bswartz> I hate canceling 2 meetings in a row
15:55:03 <dustins> I'm not
15:55:32 <xyang1> vponomaryov: no holiday break?
15:55:45 <bswartz> okay well we can decide next week about meeting cancelations
15:55:47 <vponomaryov> xyang1: jan 1st and 7th
15:56:03 <cknight1> bswartz:  I have no issues canceling the meeting those weeks.  The DL & IRC are still available, but lots of folks would probably appreciate the break.
15:56:11 <bswartz> check the meeting wiki to see when the next meeting is
15:56:43 <bswartz> we may hold brief meetings if anything comes up -- in any case I don't expect everyone to be there
15:56:53 <bswartz> next week we're still on though
15:57:16 <bswartz> okay I guess we're done for today
15:57:18 <bswartz> thanks everyone
15:57:26 <dustins> thanks!
15:57:34 <bswartz> #endmeeting