09:00:14 #startmeeting smaug
09:00:15 Meeting started Tue Jul 5 09:00:14 2016 UTC and is due to finish in 60 minutes. The chair is saggi. Information about MeetBot at http://wiki.debian.org/MeetBot.
09:00:16 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
09:00:19 The meeting name has been set to 'smaug'
09:00:22 hi everyone
09:00:26 hello
09:00:57 hi
09:01:13 is it just the 3 of us?
09:01:25 not sure
09:01:25 o/
09:01:27 hi
09:01:57 chenying, chenzeng and xiangxinyong will not attend
09:02:08 hi
09:02:29 OK, let's start. They can all look at the logs later.
09:02:47 #topic Core Nomination
09:02:58 hi
09:03:47 I would like to nominate yinwei_computer as a core team member for Smaug. She has been one of the leading minds on the project from the start and I think she deserves it.
09:04:10 +1
09:04:13 +1
09:04:24 +1
09:04:29 though I am not core, +1
09:04:36 +1
09:04:47 I think that's unanimous.
09:04:58 thanks saggi
09:05:40 So from here on forth, yinwei, by my PTL powers I dub thee a core member.
09:05:49 Congratulations yinwei!
09:05:50 congratulations
09:06:07 thanks guys!
09:06:10 congratulations
09:06:13 congratulations
09:06:19 congratulations
09:06:43 it's my pleasure and honor to work with you guys. Let's do it better!
09:07:39 #topic enhance restore object with status and resource info
09:08:24 I did not put this on the docket, so whoever wanted this discussed please speak.
09:08:35 I put it
09:09:03 so the background is we want to support concurrent restore on the same checkpoint
09:09:26 which requires maintaining restore status on the restore object
09:09:59 But one issue smile-luobin has raised is that, since cinder doesn't support multiple restores on the same backup
09:10:18 is it necessary for smaug to support that feature?
09:10:33 Can we queue it on our end?
09:10:49 Also, it might not matter when restoring since cinder might not be involved.
09:11:25 ok. so the answer is smaug will support concurrent restore on one checkpoint, right?
09:11:33 yes
09:12:20 btw, since you can't back up a volume concurrently, that means that protecting the same resources twice at the same time has this issue as well
09:12:22 If this is the requirement, then we need to maintain a session for restore, same as for checkpoint
09:12:27 We might get less concurrency if we block on cinder, but if/when they fix it we will be ready.
09:12:50 keystone session?
09:13:02 yuval, do you mean we back up one volume in two plans at the same time?
09:13:08 yinwei_computer: yes
09:13:35 saggi, sorry, when I said session here I meant lease
09:13:59 yuval, yes, the same issue.
09:14:41 How hard is it to internally reuse the backup?
09:15:06 Or is it too complex for now
09:15:08 reuse the existing lease mechanism, you mean?
09:15:19 saggi: and what happens upon delete?
09:15:29 refcount
09:15:49 yes, ref count
09:15:51 saggi: maintained where?
09:16:10 yuval: has to be on the backup md
09:16:28 another index
09:17:00 ref from restore to checkpoint
09:18:05 I'm less worried about the refcounting and more about internal synchronization
09:18:08 what do you think about assigning smile-luobin to update the current lease rst to show the solution, so people can check the details on the review board?
09:18:38 you mean sync from the protection service to the api service?
09:18:56 yinwei_computer: Sure, it should be similar to checkpoint. Since it's just there to protect the restore object.
09:19:12 sync the protection service with itself
09:19:28 Using leases for cinder backup sounds like a solution, but this is in the hands of the protection plugin
09:19:46 maybe we can provide protection plugins with a general lease api
09:19:55 could you tell us more about the sync? Sync what, from whom, to whom?
09:19:59 Why do we need leases for cinder? Don't they tell us if we double act?
09:20:19 the lease is not a cinder-specific solution
09:20:33 saggi: I figured that was what yinwei_computer meant
09:20:40 I would prefer not having a resource-specific lease
09:20:52 a lease for checkpoint and restore makes sense
09:20:56 the lease is a general solution which is maintained by the protection service itself
09:21:03 from protection to bank server
09:21:24 I think we are having multiple discussions at once
09:21:25 saggi, I think we're on the same page
09:21:49 **cinder not allowing multiple actions on the same resource
09:22:23 Let's say that we just wait until cinder allows us to perform the action. I assume if you try while another operation is in progress you get a special error.
09:22:27 lease for checkpoint and restore - how does it cope with different providers and the same resources?
09:22:36 yuval: in a minute
09:23:04 yinwei_computer: would that solve the cinder issue?
09:23:25 yes
09:23:55 I get your idea: we just queue inside the protection service to avoid parallel failures on cinder
09:24:08 yinwei_computer: exactly
09:24:29 the lease is to check for restore objects idle in restore status
09:24:38 But it needs to also work if multiple protection services on different hosts act
09:25:14 for restore, adding a lease seems like the right solution to check for stale restores
09:25:23 hmm, sounds like we need a test before queueing
09:26:02 yinwei_computer: yes, it needs to keep checking. Unless there is a way to get a notification from cinder
09:27:07 agree
09:27:39 As for restore: is the lease just for liveness, or do we want to support continuing a failed restore?
09:28:09 I think for a start we just use it for liveness unless there is some Heat magic that can solve this for us easily.
09:28:17 liveness for now
09:29:04 unless we first have taskflow working with persistent storage, we are not able to support continuing a failed restore/protect
09:29:21 OK
09:29:29 we depend on taskflow to maintain the status
09:29:36 BTW errors should also be reported on the restore object. Similar to checkpoint.
09:29:43 yes
09:30:14 ok. Let's check the details in a later rst commit for review.
09:30:26 if there are no more questions about this issue
09:30:33 I'm good
09:30:49 others?
09:31:31 I think we can switch to the next topic
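A minimal sketch of the restore-object idea discussed in the topic above, assuming hypothetical class and field names rather than Smaug's actual code: a restore object that carries its own status and error info, a liveness lease the protection service keeps renewing in the bank, and a refcount from restore to checkpoint so a referenced checkpoint is not deleted.

    # Illustrative sketch only; names are assumptions, not Smaug's real API.
    import time
    import uuid


    class RestoreRecord(object):
        """Restore object kept in the bank, similar to a checkpoint record."""

        def __init__(self, checkpoint_id):
            self.id = str(uuid.uuid4())
            self.checkpoint_id = checkpoint_id
            self.status = 'in_progress'      # in_progress / success / error
            self.error_info = None           # errors reported on the restore object
            self.lease_expires_at = 0.0      # liveness only, no resume for now


    class RestoreLease(object):
        """General lease kept by the protection service in the bank."""

        def __init__(self, lease_ttl=300):
            self.lease_ttl = lease_ttl

        def renew(self, record):
            # Renewed periodically while the owning protection service is alive.
            record.lease_expires_at = time.time() + self.lease_ttl

        def is_stale(self, record):
            # A restore stuck in 'in_progress' with an expired lease is stale.
            return (record.status == 'in_progress'
                    and record.lease_expires_at < time.time())


    def can_delete_checkpoint(checkpoint_id, restores):
        """Refcount from restore to checkpoint: delete only when nothing
        in progress still references the checkpoint."""
        return not any(r.checkpoint_id == checkpoint_id
                       and r.status == 'in_progress'
                       for r in restores)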
09:31:58 #topic yuval is going to China
09:32:05 $topic
09:32:13 wow, welcome!
09:32:17 is everyone properly excited?
09:32:23 ^^
09:32:56 Beijing and Chengdu are on your agenda?
09:33:00 yes, both
09:33:00 yuval: Could you say what you are planning on doing, for the record?
09:33:13 Of course
09:33:37 I'll be attending OpenStack Days China in Beijing, with zhonghua-lee
09:33:50 Hopefully I'll be part of his talk
09:34:08 Later I'll arrive in Chengdu
09:34:37 sorry, I left for a while...
09:34:41 yuval: welcome
09:34:50 you will give a speech in Beijing, yuval?
09:34:55 wish you a nice journey
09:35:20 yinwei_computer: zhonghua-lee is speaking, and hopefully I'll deliver a short talk about Smaug
09:35:34 Maybe zhonghua-lee can elaborate
09:36:00 I will try my best
09:37:05 have you two guys had a rehearsal?
09:37:28 not yet
09:37:37 zhonghua-lee, what are you going to talk about? An introduction or cooperation?
09:37:45 I am writing the presentation
09:37:47 now
09:38:11 ok
09:38:14 we plan to show a demo about DP
09:38:54 introduce all the related projects
09:38:59 oh, yes
09:39:09 e.g. Smaug, Cinder...
09:39:19 are you going to demo cross-region DP?
09:39:43 that's in our plan
09:40:28 but we've hit some problems right now; I am not sure if it will be finished before the summit starts
09:41:41 let's check the issues together
09:42:21 yinwei_computer: thank you
09:42:29 I hope you are going to show Yuval a good time
09:43:25 #topic open discussion
09:43:38 saggi: :)
09:43:46 Anything else anyone wants to talk about?
09:44:20 I talked with chenying before about supporting cross-region/OpenStack DP
09:44:30 what's the plan about this feature?
09:45:08 It should be implicitly supported in the right configuration.
09:45:11 do we plan to discuss the roadmap?
09:45:48 across regions, cinder/nova/glance don't share a db
09:46:11 For nova and glance it shouldn't matter.
09:46:11 so there will be issues protecting in region1 but restoring to region2
09:47:12 For cinder, jgriffith offered to make sure that we know if we can perform a cinder manage on a target. This is required for restore on a different site.
09:47:39 As in, make it something that a volume type will report
09:48:14 across regions, keystone is shared
09:48:35 so it's not a problem to manage cinder
09:48:46 cinder manage is an api call
09:48:53 to make a volume managed under cinder
09:49:06 so you copy it to the target and then add all the cinder metadata
09:49:48 will it generate a cinder volume for the data backend?
09:50:13 It's for cases where the data exists on the target but cinder doesn't know about it
09:50:26 so the procedure of restore is not to call cinder restore, but cinder manage, right?
09:50:46 restore only works for backups that were made on the same cinder instance
09:51:26 We need to move the data to the new site and then add the information to cinder
09:51:34 this is what cinder manage is for
09:51:52 but how do we notify the backend to restore from the backup data to a volume?
09:51:55 Then we should be able to restore
09:52:49 I'm not sure managing a volume also 'imports' its backup
09:53:28 say, we have a backend like ceph, which backs up rbd from site1 to site2. The backup data is in a snapshot diff format. We first need to ask the backend to restore the snapdiff to an rbd image. Then we call cinder manage to map a cinder volume to the backend (ceph) rbd image.
09:54:16 another way beyond 'manage', as yuval said, is 'import backup'.
09:54:41 Cinder manage is a way to set up the MD for data that is already on the target
09:54:48 import is for data that isn't on the target
09:54:53 then we can restore the imported backup to a volume; the backend will be notified in this way.
09:55:21 import backup, I mean
09:55:30 For volumes on swift, import should work
09:55:39 for replicated volumes we will need to use manage
09:55:51 Since they are already on the target
09:56:36 at least there should be a link so we know the volume backend id (swift object), and whether it is available (swift replication)
09:57:18 We're almost out of time. yinwei_computer, could you write a blueprint so we could start discussing this on gerrit?
09:57:36 this link is missing if we use the manage procedure: we need a link to check/restore the data of the volume on the target backend
09:57:45 np
09:57:46 sure
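A rough outline of the two restore paths discussed above, using python-cinderclient: 'cinder manage' for data that is already replicated to the target backend, and export/import of a backup record for data reachable via swift. The sessions, target host and backend reference below are placeholder assumptions, and exact client signatures may vary between releases, so this is a sketch of the discussion rather than Smaug's actual restore flow.

    # Sketch only; host/ref/session values are placeholders.
    from cinderclient import client as cinder_client


    def restore_replicated_volume(session, target_host, backend_ref):
        # Path 1: the data is already on the target backend (e.g. a ceph rbd
        # image rebuilt from a replicated snapdiff). 'cinder manage' just
        # creates the cinder metadata for that existing data.
        cinder = cinder_client.Client('3', session=session)
        return cinder.volumes.manage(
            host=target_host,                  # e.g. 'cinder-volume@rbd-backend' (placeholder)
            ref={'source-name': backend_ref},  # existing image/LUN on the backend (placeholder)
            name='smaug-restored-volume')


    def restore_from_swift_backup(source_session, target_session, backup_id):
        # Path 2: the backup data lives in swift and is reachable from the
        # target region. Export the backup record from the source cinder,
        # import it into the target cinder, then restore it into a volume,
        # which is how the target backend gets notified.
        src = cinder_client.Client('3', session=source_session)
        dst = cinder_client.Client('3', session=target_session)

        record = src.backups.export_record(backup_id)
        imported = dst.backups.import_record(record['backup_service'],
                                             record['backup_url'])
        return dst.restores.restore(imported['id'])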
09:59:02 OK. Thanks everybody
09:59:05 #endmeeting