16:10:27 <DuncanT> #startmeeting cinder
16:10:28 <openstack> Meeting started Wed Jun  4 16:10:27 2014 UTC and is due to finish in 60 minutes.  The chair is DuncanT. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:10:29 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:10:31 <openstack> The meeting name has been set to 'cinder'
16:10:35 <DuncanT> # topuc Volume backup
16:10:43 <DuncanT> #topic Volume backup
16:10:52 <navneet> ok starting again
16:10:57 <navneet> here is the spec https://review.openstack.org/#/c/97663/1
16:11:10 <navneet> n the blueprint  https://blueprints.launchpad.net/cinder/+spec/vol-backup-service-per-backend
16:11:26 <navneet> DuncanT: u hv ny concerns
16:11:34 <DuncanT> Yes
16:11:36 <navneet> or questions
16:11:38 <navneet> ok
16:12:01 <DuncanT> The original aim of volume-backup as a separate service is that it can be scaled indpendantly of the volume service
16:12:05 <DuncanT> (among aims)
16:12:16 <navneet> DuncanT: theoritically yes
16:12:21 <navneet> but actually no
16:12:38 <navneet> DuncanT: u still have vol manager for each backend
16:12:42 <DuncanT> There are bugs that stop this happening, but I'd rather fix those than increase the coupling
16:13:00 <navneet> DuncanT: dis aims at decoupling
16:13:11 <DuncanT> You have a volume-manager per backend, but you don't need a backup service per volume service
16:13:22 <navneet> DuncanT: which bugs ? theylisted somewhr
16:13:35 <DuncanT> navneet: I've not looked at the issues recently
16:13:37 <navneet> DuncanT: dats th fault in design
16:13:59 <navneet> DuncanT: dis is handling msg routing in mgr
16:14:07 <winston-d> navneet: why do you need a backup service per vol service?
16:14:32 <navneet> winston-d: to route the requests properly
16:14:35 <DuncanT> navneet: if the backup service is fixed so it always does a proper remote attach to the volume it is backing up, then it doesn't matter what the backend is
16:14:40 <navneet> manager is not the place to do it
16:15:08 <DuncanT> navneet: Then all the routing can be removed and we can just let the scheduler pick the least busy backup manager
16:15:14 <navneet> DuncanT: mgr is routing the requests to backends internally
16:15:27 <winston-d> navneet: can you elabrate?
16:15:34 <navneet> DuncanT: I dn thnk thr is any scheduler for back up
16:15:40 <navneet> and is not even needed
16:16:01 <navneet> sure
16:16:07 <DuncanT> navneet: There was, it got removed because remote attach didn't work for iscsi based drivers at the time
16:16:36 <navneet> DuncanT: its not needed now as well
16:16:42 <navneet> m not proposing to bring in the sch
16:16:57 <winston-d> navneet: who gets to decide where to backup a volume then?
16:16:58 <navneet> m just telling that the msg handling is better off in a service
16:17:19 <DuncanT> navneet: Scheduling is nice to have... some backends at way more efficient if the backup manager is co-located with the volume manager, and otherwise you want to pick a free backup service for new requests where possible
16:17:21 <navneet> winston-d: its the service
16:17:44 <navneet> DuncanT: its nice to have but m not aiming in this change
16:18:00 <navneet> and with this change it will enable sch later
16:18:02 <DuncanT> navneet: It is far from clear what you /are/ proposing in this change
16:18:05 <navneet> if somebody wants it
16:18:31 <Arkady_Kanevsky> What is API looks like? DOe sit allows user to choose where to back and to which back end to restore?
16:18:38 <navneet> DuncanT: u understand hw d request flows?
16:18:45 <navneet> using a msg server
16:19:01 <DuncanT> navneet: In general, yes
16:19:07 <zhithuang> i thought you were proposed to add a scheduler
16:19:23 <navneet> DuncanT: ok  so dis ll enable rquest routing at service level
16:19:33 <navneet> than the manager level whch is d case now
16:19:53 <navneet> and we do routing everywhr at service level which has a defined host
16:19:57 <DuncanT> So the first routing is currently done by the API
16:20:06 <navneet> DuncanT: yes
16:20:17 <navneet> DuncanT: it selects a topic and a host
16:20:28 <navneet> DuncanT: and then goes to msg server
16:20:37 <navneet> service pick it from thr
16:20:40 <DuncanT> The topic and host it selects is already wrong
16:20:54 <navneet> DuncaT: host is a common host rt now
16:21:01 <navneet> which needs to be per backend
16:21:11 <navneet> *DuncanT
16:21:32 <navneet> DuncanT: u mean u r loking into correcting host?
16:21:36 <DuncanT> Common between the volume-manager and the backup-manager, yes. That restriction needs removing
16:21:49 <navneet> DuncanT: dats wat m planning
16:22:02 <navneet> DuncanT: dis ll clean it up
16:22:06 <DuncanT> navneet: Then your spec does a terrible job of explaining it ;-)
16:22:13 <winston-d_> DuncanT: agree
16:22:19 <navneet> DuncanT: sorry if it does not :(
16:22:36 <navneet> DuncanT: I can modify if u suggest
16:22:52 <DuncanT> It sounds like you want to change the backup-manager code - the breakage happens before the message ahs even left the API node
16:23:18 <navneet> DucanT: the api ll now pick up the rt host for vol
16:23:22 <navneet> and not the common host
16:23:31 <DuncanT> What is the 'right' host though?
16:23:38 <winston-d_> what is a common host?
16:23:39 <navneet> DuncanT: n a service for that host ll address it
16:23:57 <navneet> DuncanT: for multibackend it ll be host@backend
16:24:20 <navneet> DuncanT: currently its just host
16:24:27 <navneet> with topic volume-backup
16:24:28 <DuncanT> No, that means you now need a cinder-backup service for every backend, which is *exactly* what I *don't* want
16:24:44 <navneet> DuncanT: y u dn want
16:24:45 <navneet> ?
16:25:10 <DuncanT> I want to be able to have one cinder-backup service serving N backends, which might be running on a toally different host to the cinder-volume service
16:25:47 <navneet> DuncanT: dat can be achieved with changes as well
16:25:59 <DuncanT> navneet: Not the changes you are proposing
16:26:11 <winston-d_> time check, 34 mins left
16:26:13 <navneet> DuncanT: we read the cinder.conf to have vol managers for each backend
16:26:33 <navneet> n we wud do same for service as well
16:26:41 <navneet> if it works wid current desing
16:26:46 <navneet> it ll work with canges as well
16:26:51 <navneet> *changes
16:26:52 <DuncanT> navneet: I want to completely and utterly decouple any relationship between volume-managers and backup service
16:27:07 <DuncanT> No relationship what so ever
16:27:18 <navneet> DuncanT: dats seems to be new feature
16:27:31 <navneet> DuncanT: u dn have it rt now
16:27:46 <navneet> DuncanT unknowingly u tie it up in manager
16:27:47 <DuncanT> navneet: It was an original feature that got removed because iscsi remote attach didn't work
16:27:47 <hemna> navneet, any chance you can use complete words ?
16:27:54 <DuncanT> hemna++
16:28:03 <navneet> hemna: sure :)
16:28:05 <winston-d_> hemna: +++
16:28:17 <hemna> great thanks
16:28:20 <winston-d_> my eyes are bleeding. thx
16:28:43 <navneet> DuncanT: can u plz explain me the problem ffline
16:28:45 <DuncanT> navneet: What I don't understand is why you want to have *more* cinder-backup services running
16:28:51 <hemna> *sigh*
16:29:00 <winston-d_> hemna: :)
16:29:08 <navneet> DuncanT: its not the right way to handle messages
16:29:16 <navneet> hemna: now works
16:29:41 <navneet> DuncanT: and its very difficult to extend bacup manager with current design
16:29:46 <DuncanT> navneet: What isn't? I don't *care* about messages, I care where things have to be run, and how many need to be run
16:29:55 <DuncanT> navneet: Extend it how?
16:30:35 <navneet> DuncanT: for any new feature which involves back feature support will need changes in manager
16:30:49 <DuncanT> navneet: Why?
16:31:20 <DuncanT> navneet: I'm writing a feature today, and it is no problem at all. Can you give a specific example?
16:31:22 <navneet> DuncanT: because you handle request to a particular backend in manager
16:31:37 <DuncanT> navneet: That is by design
16:31:38 <navneet> DuncanT: like in the pools stuff
16:31:43 <DuncanT> navneet: Why is that bad?
16:31:52 <hemna> this seems to be ratholing
16:32:02 <navneet> DuncanT: wat m telling is that its not the right place
16:32:24 <winston-d_> 28 mins left, maybe we should move on to next topic
16:32:25 <navneet> DuncanT: and services are there for that same exact reason
16:32:25 <DuncanT> e.g. ceph can't easily attach a volume as a block device, so it handles the I/O itself
16:32:47 <DuncanT> navneet: I don't want any backup stuff happening in the volume-manager, ever
16:32:57 <DuncanT> navneet: That is too much coupling
16:32:57 <navneet> DuncanT: I will upload a WIP
16:32:57 <hemna> DuncanT, +1
16:32:59 <winston-d_> we have 3 more topic
16:33:03 <navneet> can u guys comment on that
16:33:11 <DuncanT> navneet: Ok, a WIP might help make things clearer
16:33:17 <navneet> DuncanT: sure
16:33:19 <navneet> thx
16:33:24 <winston-d_> navneet: +1 for wip
16:33:36 <DuncanT> Ok, next topic?
16:33:38 <navneet> winston-d_:thx
16:33:41 <navneet> next topic
16:33:47 <DuncanT> #topic Dynamic multipool
16:34:05 <navneet> Did any of you got a chance to look into it
16:34:09 <navneet> the WIP?
16:34:38 <navneet> https://review.openstack.org/#/c/85760/
16:34:39 <hemna> not I.   I fundamentally disagree with the approach.
16:34:42 <DuncanT> I had a quick look, and my worry that the driver instance is no longer a singleton stands
16:34:52 <winston-d> navneet: did you put together a spec into gerrid?
16:35:03 <navneet> winston-d: for multipools no
16:35:13 <navneet> winston-d: thr is a blueprint howwevr
16:35:36 <navneet> DncanT: the driver instances u worry about?
16:35:41 <navneet> DuncanT:
16:35:59 <kmartin> navneet: you should have a matching cinder-spec for https://blueprints.launchpad.net/cinder/+spec/multiple-service-pools
16:36:06 <navneet> DuncanT: ok let me put this way
16:36:07 <hemna> I still think we need to do the more simple approach using volume types.
16:36:34 <navneet> hemna: I was curious if you hae this documented anywhere
16:36:38 <navneet> I can have a look
16:36:49 <hemna> there is no reason to be dynamically creating driver instances simply because a backend has a pool or pools.
16:36:58 <navneet> hemna: I want to do a comparative study
16:37:13 <hemna> navneet, there really isn't anything to document.   It's volume types.
16:37:31 <navneet> hemna: I think you are not understanding the real problem
16:37:33 <winston-d> just so you know, i'm working on a scheduler based change to address multi-pool needs
16:37:33 <hemna> put your pool in a volume type.  report stats for the pools in get_volume_stats() and make the scheduler aware.
16:37:35 <hemna> done
16:37:45 <winston-d> POC patchs is 50% ready
16:37:50 <navneet> hemna: there are backends like ours where you have multiple flexvols
16:37:52 <hemna> winston-d, +1
16:37:59 <kmartin> navneet: a number of driver today can change the storage pool with volume types today
16:38:07 <navneet> hemna: and each have a differen capability and capacity
16:38:13 <hemna> navneet, same with ours
16:38:16 <hemna> we use volume types.
16:38:43 <navneet> hemna: if it does not find a pariculr flexvol then it ll fials at cinder-vol
16:38:45 <navneet> rt now
16:38:58 <navneet> hemna: I saw the 3par thing
16:39:06 <navneet> hemna: u also have qualified specs
16:39:13 <hemna> each our our pools has different capabilities and sizes.  We put the pool name/id in the volume type, and we enter a default pool to use if no volume type specified.  it just works.
16:39:22 <navneet> hemna: which does not evaluate at the scheduler
16:39:55 <hemna> correct, that's why we need to make the scheduler aware of pools via get_volume_stats()
16:39:57 <navneet> hemna: putting pool name is the type is actually hiding the problem than solvng it
16:39:57 <kmartin> update the get_volume_stats to include capabilities for each volume type
16:39:59 <hemna> if we do that, then everything just works.
16:40:08 <hemna> on the contrary
16:40:15 <DuncanT> I'm interested in seeing winston-d's PoC since I'm pretty sure he is planning on doing it exactly as I would
16:40:18 <hemna> putting the pool in the volume type makes the admin directly aware
16:40:33 <navneet> DuncanT: I will be interested too
16:40:42 <hemna> dynamically creating drivers per pool, is exactly as you described, hiding the problem.
16:40:51 <navneet> DuncanT: if u r coming with ny new approach?
16:40:56 <winston-d> I'll put it up tomorrow
16:40:59 <navneet> heman: not thats not
16:41:19 <navneet> hemna: it does not mandate admin to create vol type for each pool
16:41:19 <hemna> ok, agree to disagree.
16:41:33 <kmartin> winston-d: I'm interesting in see as well, did you have a cinder-spec for it yet?
16:41:50 <navneet> hemna: your vol type will go on increasing if you have pool name in volume specs
16:41:52 <xyang1_> winston-d: what is your proposal about?
16:41:56 <kmartin> winston-d: code is fine as well? :) for a POC
16:42:00 <DuncanT> Putting the pool name in the type is an unnecessary limitation IMO, makign the scheduler aware of pools and just adding @pool to the end of the create host and making drivers understand that (or adding a pool param to the driver create and letting the manager unwrap it, if that seems cleaner) should work
16:42:13 <hemna> what does vol type increasing ?  that doesn't make sense.
16:42:36 <navneet> hemna: u need to put pool nam ein the extra spec rt?
16:42:40 <winston-d_> kmartin: not really, but i might get the code up into gerrit first. ;) but sure I will submit a spec as well
16:42:54 * jungleboyj has to drop.  Will bring my topic up in openstack-cinder or next week unless you guys want to discuss and I will catch up from the meeting notes.
16:42:55 <kmartin> winston-d: perfect, thanks
16:42:56 <navneet> hemna: for each pool u need to do the same
16:42:56 <hemna> navneet, I can't make sense of what you are writing, as you aren't using words.
16:43:22 <navneet> heman: vol type ties specs with key value pai
16:43:25 <navneet> pair
16:43:26 <DuncanT> jungleboyj: Sorry, sisn't see your note about time on the agenda
16:43:42 <DuncanT> *didn't
16:43:49 <navneet> hemna: pool name should be in the spec
16:44:07 <navneet> hemna: and so each pool will have a new volume type
16:44:20 <navneet> hemna: as per the approach u mentioned
16:44:35 <DuncanT> navneet: That certainly isn't true of the scheduler appoach
16:44:58 <navneet> DuncanT: scheduler lessens the work for admin
16:45:03 <hemna> yah, I don't see a problem with that actually.   The admin knows about the pools on the backend.  If they want them available or not available to use, it should be up to the admin to decide.
16:45:27 <navneet> DuncanT: and its dynamic  which does not require any changes to driver for having new pools
16:45:42 <navneet> hemna: I feel thats too much work for the admin
16:45:55 <winston-d_> when scheduler is made to be pool-aware, it treats every pool like a standalone backend, the only difference is pool itself doesn't really have to be in a cinder-volume service
16:45:59 <navneet> and they ll hate pools if new vol type needs to be created
16:46:10 <hemna> winston-d, +1
16:46:14 <DuncanT> winston-d++
16:46:17 <navneet> winston-d_: I see ur point
16:46:37 <navneet> but do you how many changes it requirs?
16:46:59 <hemna> there is no reason to create a new manager/driver instance just to support another pool on the same backend.
16:47:02 <DuncanT> navneet: the changes don't look that huge to me, though I might be missing some
16:47:25 <navneet> DuncanT: is thr a WIP for winstons changes?
16:47:43 <Arkady_Kanevsky> what is the mechanism for scehduler to know that multiple pools have "the same charateristics" to choose which pool to create volumes on?
16:47:47 <hemna> I think winston-d's approach is far better, as long as the admin still has a way of either creating a whitelist and/or a blacklist of which pools to use.
16:47:49 <navneet> winston-d_: do you have a spec for it?
16:48:01 <DuncanT> navneet: It does require a driver interface change so it has some way of telling the driver which pool to pick on create, but that seems a better hit than changing the driver singleton, which affects every driver more fundamentally
16:48:17 <hemna> It's bad to force the use of all available pools on the array, as the array may be shared by non OpenStack stuff.
16:48:27 <winston-d_> navneet: i don't have a spec yet, but my code is almost ready for review
16:48:38 <navneet> DuncanT: singleton drivers will be in separate threads
16:48:44 <DuncanT> hemna: The driver get_stats decides what pools to expose...
16:48:57 <navneet> DuncanT: no security issue
16:49:03 <DuncanT> navneet: Threads don't help much... not security, races
16:49:17 <hemna> DuncanT, we should have some standard mechanism for informing the driver which pools to whitelist/blacklist though, so it's consistent for all drivers.
16:49:23 <navneet> DuncanT: I dont see that happening
16:49:30 <navneet> we have green  threads in place
16:49:53 <DuncanT> navneet: I do. Two creates can now be started concurrently, inside the same process. That could not happen before
16:50:06 <navneet> DuncanT: true
16:50:09 <DuncanT> navneet: That is a *big* change
16:50:16 <navneet> DuncanT: but only one ll run
16:50:25 <navneet> DUncanT: dats the beauty
16:50:29 <DuncanT> Why will only one run?
16:50:32 <winston-d_> 10 mins left. We have 3rd party CI in the queue
16:50:39 <navneet> DuncanT: becoz we have green threads
16:50:48 <navneet> DuncanT: within same process
16:50:49 <hemna> and for N pools, you'll have N drivers, which creates M connections to the backend.   This seems like an explosion of resource usage to me as well.
16:51:12 <ameade> yeah lets progress on the agenda
16:51:26 <navneet> DuncanT, winston-d_: r u guys working on alternative approaches?
16:51:28 <DuncanT> navneet: You've still got problems if you e.g. cache backend info in your driver
16:51:36 <winston-d_> navneet: yes, i am.
16:51:38 <hemna> and with everyone using locks in their driver APIs, you don't really get any benefit of concurrency with the N drivers.  It's just a hack.
16:51:47 <DuncanT> navneet: winston-d ahs said repeatedly he has something nearly finished
16:51:52 <navneet> winston-d_: is thr a WIP out thr? ll like to coordinate
16:52:06 <DuncanT> navneet: WIP soon from winston
16:52:11 <winston-d_> navneet: i didn't put it up yet, will do tomorrow
16:52:21 <DuncanT> #action winston-d to publish scheduler based WIP
16:52:34 <DuncanT> Right, last topic, 3rd party CI
16:52:35 <navneet> DuncanT: sure ll like to see that
16:52:41 <DuncanT> #topic 3rd party CI
16:52:55 <navneet> winston-d_: thx
16:53:56 <DuncanT> Has anybody looked at asselin's brnach? Or got any other comments?
16:54:16 <xyang1_> hemna: are your Jenkins slaves all running on VMs?
16:54:26 <bruff> Can I get a URL for the brnach?
16:54:43 <DuncanT> https://github.com/rasselin/os-ext-testing/tree/nodepool
16:54:53 <hemna> xyang1_, we are working on getting nodepool support in with jaypipes code
16:55:21 <hemna> we haven't tested FC PCI passthrough yet either.
16:55:28 <akerr> hemna: and this branch that was linked is that WIP?
16:55:49 <hemna> akerr, yah
16:55:56 <akerr> is it usable?
16:56:07 <hemna> I think it's close
16:56:11 <hemna> might be worth trying
16:57:21 <xyang1_> hemna: do you have service account all setup?  Can tests be triggered manually without that?
16:57:36 <hemna> xyang1_, we are just triggering manually for now
16:57:50 <kmartin> akerr, asselin_ will be online in the cinder channel later and should be able to answer any question, he is just in another meeting currrently
16:57:51 <hemna> until everything is dialed in, then we'll get the service account setup
16:58:04 <DuncanT> tests can be manually triggered without a service account, or you can attach to the event stream with a personal account (just don't vote on a personal account)
16:58:11 <xyang1_> hemna: Are the tests supposed to run only with drivers already merged?
16:58:23 <xyang1_> hemna: I had that problem last time with cert test
16:58:35 <akerr> kmartin: thanks, I have wall-to-wall meetings today, but I'll look for him later on
16:58:43 <xyang1_> hemna: it was designed to test merged code, and it erased my new driver changes when I start the test
16:58:45 <hemna> xyang1_, ideally, they need to pull the patch that triggered the event.
16:59:06 <xyang1_> hemna: so I had to change the script to work around that
16:59:23 <xyang1_> hemna: we may have to do it for CI test, for new driver test
16:59:27 <tjones> guys - you close to wrapping up?  the next meeting starts in 1 minute
16:59:46 <DuncanT> tjones: Ok, thanks
16:59:50 <hemna> yah, it'll have to get the patchset that triggered the CI test, once it's integrated with the upstream
16:59:56 <hemna> 30 seconds
17:00:03 <DuncanT> Right all, time to move things to the cinder channel.....
17:00:15 <DuncanT> #end meeting
17:00:19 <tjones> DuncanT: thanks
17:00:26 <DuncanT> #endmeeting