15:00:30 <bswartz> #startmeeting manila
15:00:30 <openstack> Meeting started Thu Dec 10 15:00:30 2015 UTC and is due to finish in 60 minutes. The chair is bswartz. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:31 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:34 <openstack> The meeting name has been set to 'manila'
15:00:43 <bswartz> hello all
15:00:47 <u_glide1> hello
15:00:48 <vponomaryov> hi
15:00:52 <markstur_> hi
15:00:55 <aovchinnikov> hi
15:00:56 <xyang1> hi
15:00:58 <ganso> hello
15:01:12 <zhongjun2> hi
15:01:13 <bswartz> #agenda https://wiki.openstack.org/wiki/Manila/Meetings
15:01:24 <csaba> hi
15:02:00 <cknight1> Hi
15:02:06 <bswartz> okay
15:02:23 <bswartz> #topic announcements
15:02:38 <toabctl> hi
15:02:54 <bswartz> The midcycle meeting has been scheduled for Jan 13-14
15:03:12 <dustins> \o
15:03:13 <bswartz> that's pretty early, which may be a good thing
15:03:30 <ameade> o/
15:03:36 <bswartz> I'll be updating the wiki and etherpad with relevant details
15:03:44 <bswartz> but the date is set now
15:03:46 <tbarron> hi
15:04:04 <bswartz> #topic QoS_support
15:04:13 <bswartz> zhongjun2: you're up
15:04:15 <zhongjun2> Link: https://review.openstack.org/#/c/247286/
15:04:24 <zhongjun2> Most of the drivers already report 'QoS_support': False to the scheduler.
15:04:24 <zhongjun2> Do we need to keep QoS_support as a common capability and write 'QoS_support' into capabilities_and_extra_specs.rst?
15:05:16 <bswartz> zhongjun2: so the first step is for us to agree that we need a qos_support common extra spec, and to agree what it means
15:05:24 <bswartz> then we add it to the dev doc you mentioned
15:05:40 <bswartz> we've discussed how qos should work in manila a few times
15:06:07 <bswartz> and I want to thank you for taking the lead on the topic of qos and continuing to push for progress
15:06:35 <bswartz> zhongjun2: do you have a definition for what the qos_support extra spec should mean?
15:07:01 <bswartz> oh n/m it's in the link
15:08:05 <zhongjun2> Drivers could define extra specs as they need.
15:08:30 <bswartz> okay so I think we all agreed that we want admins to be able to set qos limits in their share types, but that the limits should be set by vendor-specific extra specs
15:08:56 <bswartz> so what does this standard QoS_support extra spec give us?
15:09:16 <zhongjun2> Currently, QoS_support is a common capability in the Manila code, it's just not written in the doc.
15:09:53 <bswartz> is the idea that you'd like to set up a filter in your share type for QoS_support=True?
15:10:04 <zhongjun2> I wrote the extra spec in patch set 4.
15:10:18 <bswartz> then if you have multiple backend types, some of which have QoS and some of which don't, you'll only get the QoS-supporting backends?
15:10:55 <bswartz> when was the QoS_support extra spec added to the code? does anyone know?
15:11:03 <bswartz> was that part of the cinder fork? or was it added later?
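For context, capabilities such as QoS_support reach the scheduler through each driver's periodic stats report. Below is a minimal sketch of that pattern; the class name, backend name, and exact set of stats keys are illustrative, not taken from any particular in-tree driver:

    from manila.share import driver

    class ExampleQoSDriver(driver.ShareDriver):
        """Hypothetical driver showing how a back end reports QoS_support."""

        def _update_share_stats(self):
            # Each back end periodically pushes its capabilities to the
            # scheduler; QoS_support is the common capability at issue here.
            data = {
                'share_backend_name': 'example_backend',  # illustrative
                'QoS_support': False,  # True only if this array's QoS is enabled
            }
            super(ExampleQoSDriver, self)._update_share_stats(data)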
15:11:22 <zhongjun2> Yes, I'd like to set up a filter in my share type for QoS_support=True
15:11:49 <kaisers> \o
15:12:52 <zhongjun2> https://github.com/openstack/manila/blob/master/manila/scheduler/host_manager.py#L111
15:12:55 <markstur_> I'd suggest we remove the old QoS_support=False as unused copy/paste code
15:13:06 <bswartz> okay so we know that different backends may have different levels of QoS support -- some may support throttling, while others may support minimum guarantees
15:13:31 <bswartz> if we just have a single extra spec that implies "support" it won't tell the admin that much
15:13:36 <zhongjun2> we have had qos_support for a long time
15:13:56 <markstur_> For zhongjun2: either add a capability like huawei_qos = True to do what he wants, or get some agreement on what qos = True will be
15:14:21 <bswartz> markstur_: I'm thinking something similar
15:14:46 <ganso> markstur_: s/he/she
15:15:02 <zhongjun2> :)
15:15:21 <markstur_> Thanks, ganso
15:15:23 <bswartz> I'm thinking if we have huawei_max_iops and huawei_max_bytes_per_sec (or whatever they are) then you set up your share_type with extra specs like:
15:15:53 <bswartz> driver_name <is> Huawei, huawei_max_iops = 1000
15:16:16 <zhongjun2> QoS_support = True, and huawei_max_iops=300 is enough?
15:16:18 <bswartz> and in that case you don't need a QoS_support capability
15:16:34 <bswartz> if you have both netapp and huawei, then it looks like
15:17:06 <bswartz> driver_name <in> (Huawei, NetApp), huawei_max_iops = 1000, netapp_max_iops = 1000
15:17:36 <bswartz> I think you can get what you want without needing a common extra_spec
15:18:14 <bswartz> I'm not opposed to a common extra spec if it adds something, but so far I can't see what it adds other than a simple shorthand for the driver_name <in> .... filter
15:18:23 <zhongjun2> In Cinder, QoS_support exists, and different manufacturers have different qos parameters; it works ok
15:18:42 <bswartz> and my worry is that as different vendors add different ways of doing QoS, the capability might actually cause confusion
15:19:19 <bswartz> zhongjun2: the difference in cinder is that they've defined a common way of specifying limits which everyone is supposed to support -- so the meaning of the QoS_support flag is explicit in that case
15:19:50 <bswartz> we decided that the common definition can actually lead to inconsistent behavior though, in case one vendor's IOP is different from another vendor's IOP
15:19:51 <ganso> bswartz: I think QoS_support only says if it is enabled or not. Like, if a user has a share in a backend that supports QoS, and he would like to continue having QoS after a retype or migration, that would be the case... even if other QoS-related extra-specs are different
15:20:37 <ganso> bswartz: so instead of huawei specifying huawei_qos = True + other extra_specs, and HDS specifying hds_qos = True + other extra_specs, both can specify QoS_support = True
15:21:01 <bswartz> btw since Tokyo I've looked into exactly how NetApp does QoS and it's somewhat complicated
15:21:27 <bswartz> it seems unlikely that the NetApp method would match exactly with another vendor's method
15:21:38 <bswartz> the differences might be small enough to ignore, but I worry about the case when they're not
15:22:12 <zhongjun2> In the Huawei driver, some arrays support qos and some don't, so we need a qos flag to choose the right one; driver_name <in> (Huawei, NetApp) is not enough.
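To make bswartz's filter examples concrete, here are the two share-type shapes under discussion, written as the extra-specs dicts the scheduler would evaluate. All key names are hypothetical, and the '<is>'/'<in>' operators follow the informal syntax used above:

    # Without a common capability: enumerate QoS-capable vendors by hand.
    gold_specs_vendor_only = {
        'driver_name': '<in> Huawei NetApp',
        'huawei_max_iops': '1000',  # interpreted only by Huawei back ends
        'netapp_max_iops': '1000',  # interpreted only by NetApp back ends
    }

    # With a common capability: one un-scoped spec selects any back end
    # that reports QoS_support=True, with no driver-name list to maintain.
    gold_specs_common_flag = {
        'qos_support': '<is> True',  # hypothetical spelling of the common spec
        'huawei_max_iops': '1000',
        'netapp_max_iops': '1000',
    }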
15:22:20 <bswartz> ganso: I get what you're saying, but what happens when some drivers support qos ceilings only and other drivers support qos ceilings and floors?
15:22:24 <ganso> bswartz: defining specs like "max_iops" is complicated across vendors, but zhongjun2 is proposing something simpler for now
15:23:04 <ganso> bswartz: anything besides "QoS_support True/False" would be vendor specific, not cross-vendor
15:23:07 <bswartz> ganso: if qos_support only implies that the driver supports some kind of throttling, then how do you filter to drivers that also support qos guarantees?
15:23:49 <ganso> bswartz: I guess right now you don't, unless they are from the same vendor
15:23:59 <bswartz> if we do this then we just need to be very clear that qos_support doesn't have any specific meaning -- it's just a common convention that all drivers can use to express whether their specific QoS feature is enabled on that array
15:24:18 <ganso> bswartz: +1
15:24:27 <cknight1> bswartz: +1 But I still don't see the value.
15:25:04 <bswartz> cknight: you'd prefer that huawei adds a huawei_qos_support flag to distinguish between huawei arrays with qos and huawei arrays without?
15:25:55 <cknight1> bswartz: something like that, I guess. it just seems odd to add a flag that is common for all drivers if there is no related QoS feature in common.
15:26:22 <bswartz> cknight1: I agree -- the benefit that I see is less namespace pollution
15:26:47 <bswartz> and also the fact that the code has been in tree for a long time -- less disruption to leave it in and give it an explicit meaning
15:27:04 <bswartz> as long as we clearly document what the flag means and doesn't mean, I think we're okay
15:27:16 <markstur_> I guess a common capability would allow you to do the example with both huawei and netapp specs (scoped specs)
15:27:23 <cknight1> bswartz: no harm there, I suppose. but you'll still need vendor-specific flags to take advantage of any QoS features.
15:27:36 <bswartz> it would be great if there was a QoS example for admins that showed how to configure qos on a real array in the real world
15:27:39 <markstur_> otherwise we'd need OR for scheduler capabilities (we don't have that, do we?)
15:27:56 <zhongjun2> Ok, I will add some examples (huawei and netapp etc)
15:28:05 <bswartz> markstur_: that's an excellent point
15:28:19 <bswartz> okay I'm sold
15:28:36 <markstur_> Previously I preferred huawei_qos, but it would get ugly if we all did that, I guess.
15:28:52 <ganso> markstur_: +1
15:28:55 <markstur_> for a similar capability
15:28:59 <bswartz> without an OR operator in the filter scheduler, a common extra spec solves a real, serious problem
15:29:21 <markstur_> "Common and/or Similar Capability-like-things"
15:29:40 <bswartz> okay so let's move this discussion to the gerrit review
15:29:59 <bswartz> I'll propose the wording I'd like to see in the dev docs, and we can iterate there until we're all happy
15:30:05 <bswartz> #link https://review.openstack.org/#/c/247286
15:30:11 <ganso> bswartz: +1
15:30:27 <bswartz> zhongjun2: does that work for you?
15:31:05 <zhongjun2> Ok, thanks. So do we agree to use qos_support as a common capability?
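markstur_'s point about the missing OR operator is the crux. A simplified sketch of AND-only capability matching shows why a common flag helps; this is an illustration of the idea only, not manila's actual filter code (the real filter also handles scoped keys and operators like <is> and <in>):

    def backend_matches(extra_specs, capabilities):
        # AND-only: every extra spec must be satisfied by the back end's
        # reported capabilities. With no OR across specs, a single common
        # QoS_support flag is the only way to say "any vendor whose QoS
        # feature is enabled".
        return all(str(capabilities.get(key)) == str(wanted)
                   for key, wanted in extra_specs.items())

    # One un-scoped spec matches QoS-enabled back ends from any vendor:
    assert backend_matches({'QoS_support': 'True'},
                           {'QoS_support': 'True', 'driver_name': 'Huawei'})
    assert backend_matches({'QoS_support': 'True'},
                           {'QoS_support': 'True', 'driver_name': 'NetApp'})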
15:31:30 <bswartz> zhongjun2: I don't hear any strong opposition
15:31:41 <bswartz> zhongjun2: thanks again for continuing to pursue this
15:31:52 <bswartz> #topic Architectural concerns and interoperability of new features
15:32:00 <bswartz> cknight: you're up
15:32:04 <cknight1> It's a touch premature to go into details until I can write everything down, so this is largely a teaser for an upcoming proposal.
15:32:16 <cknight1> Over the last several months, Manila has added or proposed a number of experimental features like migration, CGs, and replication.
15:32:30 * bswartz mutters about migrating replicated consistency groups...
15:32:37 <cknight1> Migration is generic, but CGs and replication will be limited to just a few backends, so the user experience is irregular.
15:33:00 <cknight1> Apart from Ben and me muttering, not much thought has been given to how those features will interoperate.
15:33:15 <cknight1> And there are more things on the roadmap, such as backup and retype, that must fit cleanly into the feature matrix.
15:33:30 <cknight1> I'm very concerned that we are adding major features without an overarching architectural vision.
15:33:48 <cknight1> If that continues, we will end up with a hugely complicated feature support matrix and a codebase that makes life miserable for both users and developers.
15:33:53 <bswartz> +1
15:33:56 <dustins> And testers!
15:34:02 <ameade> +1
15:34:06 <cknight1> dustins: Yes!
15:34:13 <cknight1> dustins: never forget the testers
15:34:21 <cknight1> I've been developing a set of ideas that could provide an orderly framework for advanced features while making the user experience more uniform.
15:34:25 <dustins> bswartz's scenario is the stuff of my nightmares
15:34:28 <ganso> +1
15:34:39 <cknight1> (Incidentally, the vision I will propose would apply equally well to Cinder, which IMHO is further down the path towards a supportability disaster.)
15:34:51 <cknight1> But with new features labeled as experimental, Manila is better positioned to show the way.
15:35:20 <cknight1> Kudos to Ben for insisting the new stuff be experimental.
15:35:31 <bswartz> xyang1 and tbarron both know about the discussions going on in cinder to try to reconcile similar issues
15:35:44 <cknight1> Before next week's meeting, I will add everything to a wiki and post a link to the Manila DL. So stay tuned!
15:35:54 <xyang1> bswartz: sure
15:35:57 <cknight1> And if you have ideas of your own, please share them or add them to the wiki.
15:36:03 <dustins> cknight1: looking forward to it
15:36:13 <bswartz> cknight1: want to give us the 2-sentence summary of the proposal?
15:36:20 <ameade> I for one am really excited about mitigating these feature matrix issues
15:36:29 <cknight1> bswartz: OK, I'll try.
15:36:40 <tbarron> that was one sentence right there
15:36:47 <ganso> lol
15:37:13 <cknight1> bswartz: Look at CGs. It's a highly specific grouping concept with very valuable but limited applicability.
15:37:30 <cknight1> bswartz: Anyone with a non-CG backend can't use it.
15:37:49 <cknight1> bswartz: And now, all other advanced features have to be CG aware. That snowballs very fast.
15:38:07 <cknight1> bswartz: So instead, here are two ideas:
15:38:38 <cknight1> bswartz: #1 Add a generic grouping construct to Manila. Without any guaranteed high-value features.
15:39:10 <cknight1> bswartz: If a driver doesn't do CGs or group-based replication, no problem, the share manager just handles groups in a simple loop.
15:39:25 <cknight1> bswartz: So all drivers, even LVM or Generic, get simple grouping functions.
15:39:59 <cknight1> bswartz: #2 Add the notion of a Share Group Type, similar to the existing Share Type.
15:40:20 <cknight1> bswartz: So a group can have capabilities like consistent snapshots, group_replication, etc.
15:40:39 <cknight1> bswartz: If a group has those set, the manager can defer to its driver to handle the advanced feature.
15:40:59 <toabctl> cknight1: #1 means that everything is a group (and groups with a single member are possible)? so no difference between CGs and non-CGs?
15:41:00 <bswartz> cknight1: that was more than 2 sentences, but I think it at least makes the proposal clear
15:41:03 <cknight1> bswartz: So *all* functions available on primitives (i.e. shares) are available on groups.
15:41:16 <vponomaryov> cknight1: what if we make all shares relate to some CG, but by default as a 1-to-1 relation?
15:41:16 <cknight1> bswartz: The user experience is much simpler and more uniform.
15:41:41 <vponomaryov> toabctl: +1 ))
15:42:02 <cknight1> bswartz: And the framework is there for advanced grouping functions like CGs, replication of CGs, retype of things in groups, migration of replicated groups, etc.
15:42:40 <cknight1> vponomaryov: That's an implementation detail that I've thought about but not concluded on. Insight welcome!
15:42:51 <bswartz> toabctl: I think there is a difference between CGs and non-CGs, but the difference comes in the extra specs on the group type
15:43:00 <cknight1> bswartz: OK, that's the high-level summary. More coming in a wiki. (A rough sketch of both ideas follows below.)
15:43:01 <xyang1> I am opposed to doing this in Cinder because we have explored it before and decided not to do it
15:43:13 <bswartz> at the manila DB level the groups would look the same
15:43:55 <bswartz> xyang1: if you remember the reasons why cinder decided against a similar approach, they would be really valuable feedback for cknight
15:44:34 <xyang1> it's a long story and I think a lot of it is captured in reviews
15:44:41 <cknight1> bswartz: +1. Don't want to repeat the mistakes of others.
15:44:50 <cknight1> xyang1: Please send links to relevant reviews!
15:44:52 <xyang1> I can provide links later
15:44:53 <bswartz> xyang1: when cknight mentioned the idea to me it sounded like the proposal matched fairly well with what you've been pushing in cinder
15:45:13 <xyang1> we are not doing group types in cinder
15:45:35 <xyang1> #1 may be closer to what we are trying in cinder
15:45:43 <bswartz> so if we're missing a crucial detail then let's find out what it is
15:46:06 <xyang1> but we are trying to use the same replication api for volumes and groups
15:46:24 <xyang1> sure
15:46:37 <bswartz> xyang1: was it a case of difficulty with forward/backward compatibility with existing APIs? because we don't have those problems in Manila
15:46:54 <tbarron> xyang1: I think cknight1's idea may be related to winston's?
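Since the wiki write-up is still to come, here is a minimal sketch of the two ideas as described above. Every name in it is hypothetical; this is not manila code, just the shape of the proposal:

    def create_share_group(driver, context, group):
        """Idea #1: generic grouping with a driver-native fast path."""
        try:
            # Drivers with native group support (e.g. consistent
            # snapshots) handle the whole group in one call.
            driver.create_share_group(context, group)
        except NotImplementedError:
            # All other drivers, even LVM or Generic, still get basic
            # grouping: the manager handles members in a simple loop.
            for share in group.shares:
                driver.create_share(context, share)

    # Idea #2: a share group type, mirroring the existing share type.
    # Group specs advertise which advanced behaviors to require of a
    # back end; a group with no specs works everywhere via the
    # fallback loop above.
    group_type_advanced = {
        'name': 'consistent-replicated',            # illustrative
        'group_specs': {
            'consistent_snapshot_support': 'True',  # hypothetical key
            'group_replication_type': 'dr',         # hypothetical key
        },
    }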
15:47:05 <bswartz> we are fortunate that the existing CG implementation in Manila is marked experimental, so we can rewrite it if we choose
15:47:16 <xyang1> it was because of the complexity it would bring
15:47:35 <xyang1> we eventually gave up on it
15:47:42 <xyang1> that was in Grizzly
15:48:05 <xyang1> I am referring to the group type
15:48:19 <bswartz> xyang1: I agree that groups with types add a new level of complexity -- but my fear is that trying to mix features like replication, cgs, and migration without a common grouping concept will result in even worse complexity
15:48:28 <ameade> +1
15:48:29 <bswartz> so I'm willing to entertain this proposal
15:48:34 <cknight1> bswartz: +1 My thoughts exactly.
15:48:53 <xyang1> that is fine, but I don't think we should do it in cinder
15:49:53 <xyang1> our file-based storage does not have a CG concept, so it is different in manila and cinder that way too
15:50:09 <bswartz> well that's a discussion for the cinder meeting -- we're not here to talk about cinder, other than as an example of what's been tried before and how it's turned out
15:50:42 <xyang1> you said something about cinder in the beginning
15:50:47 <bswartz> I think I care less than others about maintaining commonality between cinder and manila
15:51:33 <xyang1> If you want to try it in Manila, I don't have a problem with that
15:52:18 <cknight1> xyang1: fair enough, thanks. I look forward to seeing some of the earlier Cinder discussions.
15:52:23 <bswartz> well I hope we can discuss cknight's idea in more depth next week after there is a doc about it
15:52:52 <bswartz> #topic open discussion
15:53:01 <bswartz> does anyone have something else for this week?
15:53:47 <xyang1> cknight1: one important thing is the difference between a group and a pool
15:53:57 <bswartz> next week the manila meeting should happen as usual, but the following 2 weeks we start to run into holidays, so we should discuss whether those meetings should be cancelled
15:54:07 <cknight1> xyang1: yes, very true.
15:54:25 <tbarron> xyang1: +1 :-)
15:54:48 <bswartz> does anyone plan to work dec 24th and dec 31st?
15:55:01 <vponomaryov> bswartz: yes
15:55:03 <bswartz> I hate canceling 2 meetings in a row
15:55:03 <dustins> I'm not
15:55:32 <xyang1> vponomaryov: no holiday break?
15:55:45 <bswartz> okay well we can decide next week about meeting cancellations
15:55:47 <vponomaryov> xyang1: jan 1st and 7th
15:56:03 <cknight1> bswartz: I have no issues canceling the meetings those weeks. The DL & IRC are still available, but lots of folks would probably appreciate the break.
15:56:11 <bswartz> check the meeting wiki to see when the next meeting is
15:56:43 <bswartz> we may hold brief meetings if anything comes up -- in any case I don't expect everyone to be there
15:56:53 <bswartz> next week we're still on though
15:57:16 <bswartz> okay I guess we're done for today
15:57:18 <bswartz> thanks everyone
15:57:26 <dustins> thanks!
15:57:34 <bswartz> #endmeeting