15:00:38 <bswartz> #startmeeting manila
15:00:39 <openstack> Meeting started Thu Nov 12 15:00:38 2015 UTC and is due to finish in 60 minutes. The chair is bswartz. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:40 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:43 <openstack> The meeting name has been set to 'manila'
15:00:46 <bswartz> hello all
15:00:52 <dustins> \o
15:00:53 <JayXu> hello
15:00:54 <xyang1> hi
15:00:57 <ganso> hello
15:00:57 <u_glide1> hello
15:01:00 <markstur__> hello
15:01:10 <Zhongjun> hi
15:01:19 <bswartz> long agenda today
15:01:33 <bswartz> I saw a lot of agenda updates in the last 4 hours ;-)
15:01:39 <cknight> Hi
15:01:45 <bswartz> #agenda https://wiki.openstack.org/wiki/Manila/Meetings
15:01:52 <JayXu> first topic I submitted. :)
15:01:52 <ameade> o/
15:02:01 <JayXu> in the last minute
15:02:05 <bswartz> no announcements today so let's get going
15:02:10 <bswartz> #topic Which method is better for Manila QoS
15:02:23 <bswartz> zhongjun: you're up
15:03:01 <bswartz> #link http://paste.openstack.org/show/477677/
15:03:08 <Zhongjun> link http://paste.openstack.org/show/477677/
15:03:40 <bswartz> wow paste.openstack.org is ultra slow this morning
15:04:17 <ameade> I'd like to note that at the summit folks were leaning towards #2 but I don't think we all clearly understood the downsides
15:04:19 <Zhongjun> yes, ameade shows some differences in this link.
15:04:48 <bswartz> yeah #2 is what I proposed
15:05:22 <bswartz> I'm not convinced that it's worth the extra complexity in Manila to make it easy to share a common qos spec between share types
15:05:44 <ameade> if we do #2, then the complexity for handling share types as an admin is harder...that i think we all understand
15:05:51 <ameade> but in order for #2 to work
15:06:00 <bswartz> we have no data on the relative number of share types and qos specs deployers actually use
15:06:12 <ameade> we need a way to have netapp_iops=10 OR huawei_iops=20
15:06:25 <bswartz> and I don't see why qos-related extra specs are more deserving of a reusable wrapper than other extra specs
15:06:49 <ameade> we would also need to have extra specs that are values and not just bools
15:06:55 <toabctl> hey
15:06:59 <ameade> for the min_ and max_ stuff
15:07:01 <markstur__> re-usable spec groups sounds useful
15:07:38 <ameade> as an admin, i would love to have a single qos group that means gold and has what that means for all vendors i support
15:07:42 <bswartz> markstur__: if we go down that path, I would prefer something more generic, such as inheritable share types or something
15:08:01 <cknight> bswartz: +1
15:08:11 <ganso> bswartz: +1
15:08:17 <markstur__> bswartz: +1
15:08:19 <jasonsb> inheritable share types sounds more sympathetic to programmatic discovery
15:08:27 <jasonsb> or programmatic in general
15:08:33 <ameade> i'd prefer having multiple share types over inheritable maybe
15:08:38 <ameade> or share type bundles
15:08:50 <ameade> inheritance sucks
15:08:52 <bswartz> okay but can we agree that inheritable share types is a totally separate enhancement on top of basic qos extra spec support?
15:09:08 <ameade> yeah i agree with that
15:09:10 <ganso> bswartz: +1
15:09:14 <bswartz> I'd like to get qos working, then come back and focus on the management complexity
15:09:21 <ameade> we still need a way to do ORs and ranges
15:09:48 <bswartz> ameade: example?
15:10:25 <cknight> why do you need a range if you can use min/max values?
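For readers following the option #2 discussion (QoS settings carried directly as share-type extra specs), here is a minimal sketch of what a single "gold" share type's extra specs could look like. The key names (qos_support, max_read_bps, netapp:maxiops, huawei:maxiops) are illustrative assumptions, not an agreed convention; only the idea of one common capability plus vendor-scoped keys comes from the surrounding discussion.

    # Illustrative only: extra specs for a hypothetical "gold" share type under
    # option #2, where QoS settings live directly in the share type. All key
    # names below are made up for the example.
    gold_extra_specs = {
        # common, unscoped capability the filter scheduler could match on
        "qos_support": "<is> True",
        # common throttle-style specs of the kind proposed later in the log
        "max_read_bps": "104857600",   # 100 MB/s
        "max_write_bps": "52428800",   # 50 MB/s
        # vendor-scoped keys: scoped keys are ignored by the scheduler, so one
        # share type can carry settings for more than one backend vendor
        "netapp:maxiops": "10",
        "huawei:maxiops": "20",
    }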
15:10:40 <ameade> cknight: i mean, wouldn't that mean the driver reports a range?
15:11:10 <cknight> ameade: not sure I see the difference
15:11:13 <ganso> ameade: quite the opposite the way I understand, the type specifies a range, the driver reports a value
15:11:40 <cknight> ganso: +1
15:11:44 <bswartz> I think we're starting to discuss something else -- which is more like performance capacity based scheduling
15:12:00 <bswartz> that's an interesting topic, but not the same as QoS (IMO)
15:12:05 <ameade> both do ranges?
15:12:33 <ameade> bswartz: so the need for OR is so i can specify qos values for 2+ vendors in a single share type
15:12:36 <bswartz> the most basic form of qos is a throttle
15:12:48 <bswartz> for a throttle, the admin just specifies a number and the backend implements it
15:12:48 <ameade> otherwise even with only 2 vendors the number of share_types explodes
15:13:26 <bswartz> ameade: I'm not sure I see why
15:13:45 <bswartz> your paste shows how to do 2 vendors in one share type -- it seems trivial to extend to 3
15:14:29 <ameade> that example wouldnt work because it would try to find a backend that matches both netapp_iops and huawei_iops no?
15:14:41 <bswartz> no
15:14:52 <bswartz> in reality we'd use scoped keys
15:14:57 <bswartz> so the filter scheduler would never see them
15:15:07 <bswartz> I would edit your paste if it were possible
15:15:26 <ameade> so we still need to know backend capable of applying those qos values
15:15:28 <bswartz> we should use a wiki for qos design not paste
15:15:57 <bswartz> we can agree on one common extra spec -- qos_support = True/False
15:16:06 <bswartz> that's what you would filter on
15:16:10 <ameade> so it's all or nothing for qos support?
15:16:25 <bswartz> well part of the agreement would be to define EXACTLY what it means
15:16:34 <bswartz> and document that definition
15:16:41 <ameade> so any vendor who has a unique qos thing they can provide is out of the picture?
15:16:50 <ganso> if the most common production approach is something similar to "bronze, silver, gold, platinum", then for each type the admin would specify QoS ranges that fit those share type standards
15:16:55 <bswartz> we would have to agree on what the basic requirement is for qos support
15:16:58 <cknight> ameade: no, you can just use vendor-specific scoped keys
15:17:14 <ameade> cknight: and report vendor specific qos capabilities?
15:17:19 <ganso> in order to implement OR, I think we need a QoS spec group... because no backend can match both huawei and netapp QoS extra specs
15:17:48 <bswartz> cknight: I think what ameade is getting at is that if we use unscoped keys for filtering, like netapp_qos_support and huawei_qos_support, then your share type needs to have an OR expression
15:18:15 <cknight> bswartz: thanks, I get it
15:18:24 <bswartz> that's why I'd be in favor of a common capability for the basic qos_support
15:18:47 <bswartz> but the unscoped keys can be vendor specific because the scheduler never sees them anyways
15:18:55 <bswartz> s/unscoped/scoped/
15:19:38 <bswartz> I like the idea of inheritable share types or some way to bundle multiple types -- we should discuss that later on
15:19:40 <ameade> so my huawei backend could report qos_support = true but my share type has all netapp qos specs
15:19:49 <ameade> so it ends up on huawei but doesnt apply anything
15:20:13 <bswartz> ameade: I think we could agree on some common extra specs for throttling too
15:20:25 <bswartz> like max_read_bps and max_write_bps
15:20:55 <bswartz> for other things, they would need to be vendor specific and we'd just have to document how to avoid doing the wrong thing
15:21:14 <ameade> sounds like we need to flesh out the design for option #2 in the paste to think about these corner cases
15:21:16 <bswartz> I really don't see this being a huge problem in practice
15:21:41 <ameade> which is method one in
15:21:44 <ameade> #link https://wiki.openstack.org/wiki/Manila/QoS
15:22:22 <bswartz> clouds with multiple storage vendors are rare, and when they exist it's even more rare to have a share type that covers 2 or more vendors -- typically people create different share types for each backend vendor
15:22:49 <ameade> bswartz: I think that last point is a problem in itself tbh
15:23:05 <bswartz> Zhongjun: I'm sorry we don't seem to be getting closer to a decision here
15:23:16 <bswartz> we might need to schedule a working session to get qos hammered out
15:23:32 <ameade> +1
15:23:41 <bswartz> I don't want to take up the whole meeting with qos though because we have other business
15:23:46 <ameade> and if someone has the bandwidth to design what 'method 1' would need to look like exactly
15:23:47 <Zhongjun> bswartz: It's ok
15:24:26 <bswartz> I'd like to see more detailed examples of what it should look like in the real world
15:25:01 <bswartz> ameade's paste is a good start, but we need more examples that support a particular design or show why a particular design has problems
15:25:15 <ameade> +1
15:25:26 <bswartz> and right now I'm less concerned with administrator quality of life and more concerned with whether we can even implement something that makes sense
15:25:57 <bswartz> I think we can come back and solve manageability issues after we have a design that makes sense
15:26:14 <bswartz> Let's use the ML to continue discussing this one
15:26:24 <ameade> just want to make sure we don't pigeon hole ourselves
15:26:25 <bswartz> and I will try to schedule a specific time next week to discuss qos
15:26:30 <ameade> +1
15:26:39 <bswartz> #topic Manila DR update
15:26:46 <bswartz> ameade: this one's your
15:26:48 <ameade> #link https://review.openstack.org/#/c/238572/
15:26:49 <bswartz> yours
15:27:04 <ameade> haven't gotten any reviews but i have a couple things to run by folks in the meeting
15:27:19 <ameade> at the summit we agreed that we need a first party driver implementation of DR to run in the gate
15:27:33 <bswartz> ameade: yes
15:27:48 <ameade> i want to know if we should do that in the current generic driver or if we were going to have another generic driver soon
15:28:00 <ameade> and who can do that work and do we need it right away
15:28:05 <ameade> while the api is experimental
15:28:24 <bswartz> ameade: there are 2 new first party drivers in development -- both should have reviewable code
15:29:08 <bswartz> I honestly don't know if it would be easier to build replication support in the existing generic driver or one of the new ones
15:29:28 <bswartz> at the very least, replication support would create additional dependencies in the driver
15:29:36 <bswartz> so it would need to be optional
15:30:07 <bswartz> the 2 most promising approaches I'm aware of are:
15:30:07 <cknight> bswartz: I'm very hesitant to continue investing in the current generic driver
15:30:09 <ganso> I don't know the implementation details of DR, but wouldn't it make more sense to use Cinder to replicate in the current generic driver since it uses Cinder... as soon as Cinder implements DR properly...
15:30:33 <bswartz> 1) block layer replication using DRBD, and using a filesystem which supports replication on top of it
15:30:35 <ameade> is this something we want in one of these drivers right away in Mitaka?
15:30:47 <bswartz> and 2) filesystem layer replication, using ZFS
15:31:23 <bswartz> ganso: cinder's replication semantics are too weak for us to use them, even if it was working today (it's not)
15:31:59 <jasonsb> maybe 3) filesystem layer replication, using glusterfs?
15:32:15 <bswartz> the advantage to a DRBD solution is that we could do active/active
15:32:48 <bswartz> the ZFS-based approach would be active/passive for sure, but could probably implement "readable" semantics rather than "dr"
15:32:50 <ameade> if we do active/active then it can test promote :P
15:32:53 <ameade> cant*
15:33:09 <bswartz> ameade: maybe we should just do both then
15:33:47 <toabctl> DRBD is also relatively simple to set up
15:33:49 <bswartz> jasonsb: I'm not familiar enough with glusterfs to know what it can do, replication-wise
15:34:42 <jasonsb> bswartz: i can check on it and give some more details on how suitable
15:34:45 <bswartz> I'd be happy to hear about additional proposals for replication in a first-party driver
15:34:54 <bswartz> especially if it's less work to get it up and running
15:35:17 <ameade> ok so design aside, is it ok to just require this for when we transition off of experimental?
15:35:24 <csaba> bswartz: wrt. glusterfs: I'm not familiar with state of the art either, but can check about
15:35:24 <bswartz> toabctl: I hope you're right -- do you have any interest in writing a prototype?
15:35:42 <ameade> or is it something we need right away (obviously sooner is better)?
15:35:55 <cknight> bswartz: so this could be in *any* of the drivers that run in the gate (gluster, generic, ceph, hdfs, etc.), right?
15:37:13 <bswartz> ameade: I'm of 2 minds about that -- part of me thinks we need a first party implementation before it can merge, but then I can also imagine merging it as experimental and adding support in a first party driver afterwards
15:37:26 <toabctl> bswartz: interest yes, but no time. next SUSE cloud release is on the agenda currently. sorry
15:37:39 <bswartz> I'd like to hear more opinions on that
15:37:49 <bswartz> toabctl: okay I understand
15:37:53 <cknight> bswartz: I'd say it cannot leave the experimental state until it's tested in the gate.
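As background on approach 2 (filesystem-layer replication with ZFS): ZFS replicas are normally kept in sync with incremental snapshot send/receive. Below is a minimal sketch of that mechanism only, assuming made-up dataset names, an ssh-reachable passive host, and a hypothetical helper name; it is not the driver design in the review linked above.

    # Illustrative sketch of ZFS snapshot-based replica sync (approach 2).
    import subprocess

    def sync_replica(dataset, remote_host, remote_dataset, prev_snap, new_snap):
        """Send an incremental ZFS snapshot stream to a passive replica."""
        # Take the new snapshot on the active side.
        subprocess.check_call(["zfs", "snapshot", "%s@%s" % (dataset, new_snap)])

        # Stream the delta between the previous and new snapshots to the
        # passive side; 'zfs receive -F' applies it to the replica dataset.
        send = subprocess.Popen(
            ["zfs", "send", "-i",
             "%s@%s" % (dataset, prev_snap),
             "%s@%s" % (dataset, new_snap)],
            stdout=subprocess.PIPE)
        subprocess.check_call(
            ["ssh", remote_host, "zfs", "receive", "-F", remote_dataset],
            stdin=send.stdout)
        send.stdout.close()
        if send.wait() != 0:
            raise RuntimeError("zfs send failed")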
15:37:54 <ameade> bswartz: same, it definitely can't be promoted from experimental without it, i think we all agree there
15:38:08 <ameade> * time check, 23min left *
15:38:25 <bswartz> ameade: you're okay with moving on?
15:38:52 <ameade> yeah for now, we can revisit later
15:38:54 <bswartz> #topic Manila Driver minimum requirements document update
15:38:57 <bswartz> ganso: you're up
15:39:27 <ganso> ok so, it seems the last issue remaining in the document is that we are not sure if the manage feature is mandatory or not in DHSS=False mode
15:39:54 <bswartz> ganso: I think we agreed that it's optional
15:39:57 <ganso> we know it is not mandatory in general, because the driver can implement DHSS=True mode and not have to worry about this
15:40:14 <ganso> but if the driver operates in DHSS=False, does it have to implement manage?
15:40:25 <bswartz> there's confusion about what not supporting it means, so we should be very clear that drivers don't need to implement anything
15:41:02 <ganso> also, Valeriy questioned that if all drivers can implement this, then this should be mandatory for DHSS=False mode...
15:41:12 <ganso> I agree with him, for interoperability
15:41:34 <bswartz> so it's an admin-only feature, so interoperability is less critical
15:41:48 <bswartz> and even for drivers that support it, it's allowed to fail for arbitrary reasons
15:42:19 <bswartz> therefore even if we made it mandatory, a driver could simply always fail and still meet the contract of the driver interface
15:42:33 <markstur__> that sounds optional
15:42:33 <bswartz> thus it's silly to make it mandatory
15:42:54 <ganso> humm ok, so it is completely optional, even in DHSS=False... I will update the document
15:43:06 <bswartz> yes I think that's what we said in tokyo
15:43:22 <cknight> ganso: thanks for handling this document!
15:43:32 <bswartz> fwiw this is not a change in thinking, but just a change in how we communicate to driver maintainers
15:43:33 <markstur__> +1 thanks
15:44:07 <bswartz> we've always known that drivers can get away with a noop implementation so it's effectively optional
15:44:20 <bswartz> #topic Manila Data Copy service name
15:44:27 <bswartz> #link https://review.openstack.org/#/c/244286/
15:44:37 <bswartz> so thanks to ganso for starting work on the new data copy service
15:45:17 <bswartz> the proposed name was manila-data_copy which is gross for 2 reasons (mixing hyphens and underscores, and it's 2 words instead of 1)
15:45:31 <bswartz> I prefer manila-data, or m-dat for short
15:45:43 <bswartz> but I'm soliciting other ideas
15:45:59 <bswartz> we don't need to spend much time on this -- please use the code review to register feedback
15:46:09 <bswartz> I just wanted to raise awareness that we need to choose a name
15:46:13 <ganso> I already changed to m-dat and manila-data
15:46:24 <ganso> if we all agree to this, then it is decided
15:46:30 <cknight> bswartz: +1 on needing another name. your suggestion seems a good starting point for a service in the control plane that must access backends on the data plane.
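Stepping back to the manage discussion above: since manage is optional even with driver_handles_share_servers=False, a driver can simply decline to implement it. A rough sketch, assuming the manage_existing hook from manila's share driver interface; the driver class itself and its error text are hypothetical.

    # Illustrative sketch: a DHSS=False driver that opts out of manage support.
    from manila.share import driver

    class MinimalShareDriver(driver.ShareDriver):
        """A driver meeting the minimum requirements without manage support."""

        def manage_existing(self, share, driver_options):
            # Manage is optional even in DHSS=False mode, so refusing the
            # operation here still satisfies the driver contract.
            raise NotImplementedError("manage is not supported by this backend")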
15:46:36 <bswartz> ganso: thanks -- I want to know if anyone else has better/different ideas
15:47:05 <bswartz> if everyone is fine with manila-data, then we're done
15:47:14 <bswartz> #topic Upcoming change for Manila CI hooks
15:47:21 <bswartz> vponomaryov: you're up
15:47:53 <xyang1> is he on irc today
15:48:01 <bswartz> oh vponomaryov isn't here
15:48:16 <bswartz> must be connection issues
15:48:22 <bswartz> he added this topic right before the meeting
15:48:28 <bswartz> vponomaryov: hello!
15:48:35 <bswartz> it's time for your topic
15:48:57 <vponomaryov> sorry, missed the timings
15:49:17 <vponomaryov> so, this topic is a heads up for driver maintainers
15:49:24 <vponomaryov> and their Third-party CIs
15:49:42 <bswartz> oh yes I remember this
15:49:51 <vponomaryov> #link https://review.openstack.org/#/c/243233/
15:50:28 <bswartz> yeah this change has the potential to break CI systems, depending on how they're implemented
15:50:32 <vponomaryov> first reason - we use a fixed version of Tempest and our plugin is updated from time to time, so we store the value in the manila CI hooks
15:50:43 <bswartz> most of them should be fine (as we can see from the votes)
15:51:15 <bswartz> CI maintainers should take a closer look at this one, and after it merges make sure they're not broken
15:51:18 <vponomaryov> and to ease sync for all third-party CIs, common parts are being separated into another file
15:51:58 <bswartz> any questions on this?
15:52:00 <vponomaryov> and now is the right time to make updates to Third-party CIs to support both - old and new approach
15:52:15 <ganso> new approach is much better :)
15:52:31 <bswartz> ganso: agreed
15:52:47 <bswartz> #topic Manilaclient enhancement to provide request_id when http_log_debug is set to True
15:52:59 <bswartz> JayXu: you're up
15:53:17 <JayXu> we are refactoring our component test
15:53:24 <bswartz> #link https://review.openstack.org/#/c/243233/
15:53:47 <JayXu> and found that there is no way to correlate the failed request with its id
15:54:04 <bswartz> err
15:54:05 <bswartz> wrong link
15:54:06 <JayXu> so I propose to add request_id into http resp body
15:54:13 <bswartz> the agenda got screwed up
15:54:22 <JayXu> http://paste.openstack.org/show/478675/
15:54:56 <JayXu> any comment on that?
15:55:13 <xyang1> there is a cross project effort to add request id to response data, may be related to this
15:55:13 <vponomaryov> bswartz: just update page )
15:55:20 <vponomaryov> bswartz: *refresh
15:55:26 <bswartz> JayXu: was there a cross project discussion about this in tokyo?
15:55:29 <xyang1> let me dig out some info
15:55:45 <JayXu> no
15:55:50 <bswartz> I think it's a good idea -- especially if it can be done cheaply
15:56:07 <JayXu> I just got it this week when I tried to update my component test cases
15:56:12 <xyang1> I have a link, give me a sec
15:56:18 <bswartz> when I first heard about this idea I wondered if it would require much code to track the ID everywhere we want to log it
15:56:19 <jasonsb> #link https://etherpad.openstack.org/p/Mitaka_Cross_Project_Logging
15:56:25 <jasonsb> no spec yet i think
15:57:03 <bswartz> I'm sure QA guys and deployers/troubleshooters would be thrilled to have this though
15:57:42 <xyang1> https://blueprints.launchpad.net/python-cinderclient/+spec/return-request-id-to-caller
15:57:43 <bswartz> it might even enable some cool scripting to tie together log files
15:58:13 <xyang1> there is also an oslo spec
15:58:21 <bswartz> xyang1: thanks
15:58:49 <bswartz> JayXu: if you haven't seen these, I suggest reading them and making sure the proposal for Manila fits in with what others are doing
15:59:00 <JayXu> okay, thx
15:59:01 <bswartz> it sounds like a good idea, but we should be consistent with our approach
15:59:09 <bswartz> #topic open discussion
15:59:13 <bswartz> only 1 minute left
15:59:20 <bswartz> anyone have a last minute topic?
15:59:43 <bswartz> we need to follow up on both qos open items and replication open items
16:00:05 <bswartz> let's use the ML for those
16:00:08 <bswartz> thanks everyone
16:00:22 <bswartz> #endmeeting
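As background for the request_id topic: the cross-project blueprint xyang1 linked returns the request id to the caller, conventionally via an x-openstack-request-id response header (some older APIs used x-compute-request-id). Below is a minimal sketch of how a component test could correlate a failed call with its id; the endpoint, token handling, helper name, and header fallback here are assumptions, and this is not the proposal in JayXu's paste.

    # Illustrative only: read the request id from the response headers so a
    # failed call can be correlated with server-side logs.
    import requests

    def show_request_id(manila_endpoint, token, share_id):
        resp = requests.get(
            "%s/shares/%s" % (manila_endpoint, share_id),
            headers={"X-Auth-Token": token})
        # Newer services expose x-openstack-request-id; check the older
        # x-compute-request-id header as a fallback.
        req_id = (resp.headers.get("x-openstack-request-id")
                  or resp.headers.get("x-compute-request-id"))
        if not resp.ok:
            print("request %s failed with HTTP %s" % (req_id, resp.status_code))
        return req_id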