*** zengyingzhe_ has joined #openstack-smaug | 00:05 | |
*** zengyingzhe_ has quit IRC | 00:19 | |
*** zengyingzhe has joined #openstack-smaug | 00:19 | |
*** saggi has quit IRC | 02:14 | |
openstackgerrit | Yingzhe Zeng proposed openstack/smaug: Proposed Smaug API v1.0 https://review.openstack.org/244756 | 02:17 |
---|---|---|
*** saggi has joined #openstack-smaug | 02:27 | |
*** zengyingzhe has quit IRC | 02:28 | |
openstackgerrit | zengchen proposed openstack/smaug: schedule service design https://review.openstack.org/262649 | 03:42 |
openstackgerrit | zengchen proposed openstack/smaug: operation engine design https://review.openstack.org/262649 | 03:44 |
*** WANG_Feng has quit IRC | 06:19 | |
*** WANG_Feng has joined #openstack-smaug | 06:19 | |
*** CrayZee has quit IRC | 07:43 | |
*** gampel has joined #openstack-smaug | 07:55 | |
*** zengyingzhe has joined #openstack-smaug | 08:14 | |
*** c00281451 has joined #openstack-smaug | 08:57 | |
*** c00281451 is now known as chenzeng | 08:58 | |
chenzeng | saggi:please review the bp:operation-engine-design at your free time. I hope for your feedback. thanks. https://review.openstack.org/#/c/262649 | 09:00 |
saggi | chenzeng: I will | 09:01 |
chenzeng | saggi:thanks. | 09:01 |
gampel | saggi: did you update everyone about the IRC meeting set for next week | 09:27 |
zengyingzhe | I've told the team in China. | 09:39 |
yinwei | the meeting is next week? | 09:40 |
yinwei | I thought it's tonight | 09:40 |
zengyingzhe | And update the meeting notice in https://wiki.openstack.org/wiki/Meetings/smaug | 09:42 |
zengyingzhe | yinwei, every even week. | 09:43 |
yinwei | saggi, I think about the lease scenario today. In fact, what we need solve are two cases:1. delete unfinished zombie checkpoints but no active under protecting checkpoint; 2. delete finished checkpoints but no under restoring checkpoints; | 09:44 |
yinwei | Both cases could be divided by two categories: delete/write or delete/read in same site or across different sites; | 09:45 |
yinwei | if they are from the same site, then lease should be enough to synchronize; | 09:45 |
yinwei | if they are initiated on paralle across different sites, delete based on lease may need wait enough time; | 09:47 |
yinwei | for the question who will do the clean up work, I suggest normally one site will have a single GC instance to check checkpoints created by this site, here we need build another index to list checkpoints of one site. | 09:50 |
saggi | yinwei, actually shortly after you left I found a very simple solution. When you delete something you need to make sure you only mark a section as deleted if the delete was complete if you have any issues you stop the delete. That way if we have two delete running at once, they will stop once they start handling the same resource. Than, in the next GC cycle. The will continue where they left off as if there were a crash. | 09:50 |
saggi | That way we don't need locking at all. We just need to make sure our algorithms are crash safe. Which we have to do anyway. | 09:51 |
yinwei | actually, we could make use of the feature that, in the same site, key is updated synchronously. So inside one site, each service instance could compete the root lease to run the GC, which only check checkpoints of this site. | 09:53 |
yinwei | When the site failed, admin notifies the other site that site A has failed, and GC all unfinished checkpoints left by siteA. | 09:54 |
yinwei | I don't think cross sites compete root lease will work, since this is the similar issue that the other site may see the root lease later. | 09:55 |
saggi | That is why I suggested abandoning root lease. Just allow delete wherever. We just fail if we detect a collision and restart some other time. | 09:57 |
yinwei | But inside one site, root lease compete works. This could ensure one GC one site, where we don't need think about GC workload partition... | 09:58 |
yinwei | across sites, where we don't have any strong consistency cluster to notify site failure, we could choose a manual way: let admin to notify all unfinished checkpoint left by failure site should be cleaned up | 09:59 |
saggi | how do you know which is the "master" site | 09:59 |
yinwei | there's no master site | 10:00 |
yinwei | I suppose smaug should have an alert mechanism or a manual mechanism that admin knows which site fails | 10:00 |
saggi | how would you implement this. | 10:00 |
saggi | ? | 10:00 |
saggi | yinwei, I got to go, be back in 45 minutes. | 10:01 |
yinwei | me too | 10:01 |
yinwei | ping you later at night | 10:01 |
yinwei | I have another question to ask you, shall we grant lease for read? like restore | 10:02 |
yinwei | so let's talk later | 10:02 |
gampel | Hi yinwei | 11:42 |
*** wei___ has joined #openstack-smaug | 13:18 | |
wei___ | hi, saggi | 13:18 |
saggi | wei___: hi\ | 13:18 |
saggi | wei___: hows it going | 13:19 |
wei___ | fine, thanks | 13:19 |
wei___ | are you free to talk? | 13:19 |
saggi | wei___: yes | 13:20 |
wei___ | so what I propose is simple, since we don't have centralized arbitration service, we let admin notify one of alive site that one site has failed. Here we provide an API, like notify_site_failure, then the site being notified will cleanup the garbage of the failure site. | 13:21 |
wei___ | for failure site, all unfinished checkpoints are garbage. | 13:21 |
saggi | That means the system is no longer self correcting | 13:21 |
saggi | you always need someone to take care of it | 13:22 |
wei___ | not really. for smaug, what it manages is separate openstacks which are only connected by geo-replicated bank | 13:23 |
wei___ | admin should be aware that how many sites are managed by smaug | 13:24 |
saggi | wei___: So what you are saying is that we have the admin force one site to do the GC and moving it safely is the admins responsibility. | 13:29 |
wei___ | think in detail, we have no control for behavior of plugin. So each plugin may backup to different sites, say, vol1 backup to site b, vol2 backup to site c. Even site b is able to delete checkpoint located in bank, it may not be able to delete backup of vol2 located in site c. I mean, if the two backup backends are different, you can't have backup driver to delete a resource not belong to its own backend. | 13:31 |
wei___ | It seems that admin or some other service should be aware of all sites(openstacks) managed by smaug, and notify all parties that one site failure and each one should delete the garbage produced by this failure site | 13:33 |
wei___ | It seems we need another component to manage add/delete site. I'm not sure if there're more scenarios require such service, but I feel like yes. | 13:34 |
saggi | A provider should be able to back up fully from the site you want to back up and restore fully from the site you want to restore. | 13:35 |
saggi | There is no other way to go about it | 13:35 |
wei___ | but different tenant could pick up different provider, right? | 13:36 |
saggi | yes | 13:36 |
saggi | of course | 13:36 |
wei___ | we only allow all providers backup to one site? | 13:36 |
wei___ | or smaug only manage two sites? | 13:37 |
saggi | We don't limit it. But it doesn't make any sense because if a site goes down you will only have partial data. I'd assume you would want to backup everything fully to one site or more. | 13:37 |
saggi | It's the providers job. If we backup the volumes to swift than swift does the geo replication. | 13:38 |
wei___ | but one site is shared by multi tenants, you can't limit tenant behavior | 13:38 |
wei___ | say, tenant a chose provider a, tenant b chose provider b, where provider a backup to site2, provider b backup to site3 | 13:39 |
saggi | But this is OK | 13:39 |
wei___ | when site1 fails, garbage are located in site b and site c | 13:39 |
saggi | you could also have a provider that backs up to both site2 and site3 at once | 13:40 |
saggi | depends on configuration | 13:40 |
wei___ | here garbage i don't mean checkpoint only, but also the backup data | 13:40 |
saggi | exactly | 13:40 |
wei___ | in this case, you couldn't have site b to delete garbage located in site c | 13:41 |
wei___ | you have to notify site b cleanup garbage left in its own site and notify site c to cleanup its own. site a may knows how to cleanup both, but it fails so you can't count on. actually when i worked in distributed storage system, we always have a central configure service to synchronize each site status, and even configure sites replication pairs. I'm not sure why smaug doesn't need this, since it works like a distributed system. | 13:46 |
saggi | The bank is supposed to store the configuration | 13:54 |
saggi | If some storage needs any synchronization it should be handled outside. | 13:55 |
saggi | wei___: | 13:55 |
saggi | wei___: This is because actual distributed storage has it's own configuration server. It can't use Smaug's. | 13:59 |
gampel | wei___: hi are you still here ? | 14:12 |
*** wei___ has quit IRC | 14:13 | |
*** wei__ has joined #openstack-smaug | 14:23 | |
wei__ | hi | 14:23 |
gampel | hi | 14:23 |
gampel | hi will be in shenzhen on the 18th | 14:24 |
wei__ | cool | 14:24 |
wei__ | then we can talk face to face | 14:24 |
wei__ | will saggi come with you? | 14:24 |
gampel | I will be there 18th and 19th | 14:24 |
saggi | wei__: no | 14:25 |
gampel | no ayal and me | 14:25 |
wei__ | ok | 14:25 |
wei__ | welcome! | 14:25 |
gampel | I will be in chengdu next week and then in shenzhen | 14:25 |
wei__ | saggi, welcome to china next time! | 14:26 |
wei__ | yes, i heard of the news so lili agreed that if you guys won't come to shenzhen, then we 3 persons would come to chengdu. | 14:26 |
gampel | So it up to you it seem that I will be in shenzhen so we could meet there | 14:27 |
wei__ | hmm, shenzhen is good to me | 14:28 |
wei__ | :) | 14:29 |
gampel | Ok great | 14:30 |
wei__ | so for my last question, another case is restore. If site a fails, tenant need restore one checkpoint created by a in all sites managed by smaug, since we suppose sites are independent. If tenant has to do such kind of work in restore, we could also have tenant to recycle garbage left by failure site in each site. | 14:32 |
gampel | wei_:do you think we can arrange a meet-up for Smaug in shenzhen | 14:32 |
wei__ | anyone familiar how to organize the meet-up in openstack? | 14:33 |
wei__ | I will ask my colleges tommoraw | 14:33 |
gampel | I am not sure how it is being done in shenzhen but I can ask the guys in hangzhou we are doing a meet-up for dragonflow there next week | 14:34 |
wei__ | to see if anyone here in shenzhen has such kind of experience. | 14:34 |
wei__ | ok, pls give me their name | 14:34 |
gampel | Ok i will | 14:34 |
wei__ | hmm, maybe chaoyi knows. Let me check it tomorrow. | 14:35 |
wei__ | time to go to bed | 14:35 |
wei__ | bye, guys | 14:35 |
gampel | wei__: bye talk to you tomorrow | 14:36 |
*** wei__ has quit IRC | 14:36 | |
*** gampel has quit IRC | 15:23 | |
openstackgerrit | Saggi Mizrahi proposed openstack/smaug: Pluggable protection provider doc https://review.openstack.org/262264 | 15:29 |
openstackgerrit | Saggi Mizrahi proposed openstack/smaug: Proposed Smaug API v1.0 https://review.openstack.org/244756 | 16:17 |
*** wei__ has joined #openstack-smaug | 17:37 | |
*** wei__ has quit IRC | 17:41 | |
*** gampel has joined #openstack-smaug | 19:12 | |
*** gampel1 has joined #openstack-smaug | 20:02 | |
*** gampel has quit IRC | 20:03 |
Generated by irclog2html.py 2.14.0 by Marius Gedminas - find it at mg.pov.lt!