*** drewn3ss has joined #openstack-swift | 00:01 | |
*** mikecmpbll has quit IRC | 00:02 | |
notmyname | mattoliverau: I'm not completely sure I understand your objections to https://review.openstack.org/#/c/541058/ | 00:12 |
patchbot | patch 541058 - swift - Add fallocate_reserve to account and container ser... | 00:12 |
mattoliverau | notmyname: the question is more: do we want to try and make it closer to fallocate_reserve on the object side? Account and container PUTs (obj updates) are small and not too big a deal, but it's replication of larger DBs that would really affect things. | 00:18 |
mattoliverau | on objects, we have the size, so we say: if "space left - obj size < fallocate reserve" then fail. Here we're saying: if "free space < fallocate reserve" then fail. | 00:20 |
mattoliverau | not totally against it, I just wanted to have the discussion, because there is slightly different behaviour, and a large db replication could still end up filling our disk | 00:21 |
mattoliverau | whereas on the object side we take the size into account | 00:21 |
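To make the difference concrete, here is a minimal sketch of the two conditions being compared; the function names and reserve value are illustrative assumptions, not Swift's actual API.

    # Minimal sketch of the two checks being compared; names and values are
    # illustrative, not Swift's actual API.
    FALLOCATE_RESERVE = 10 * 1024 * 1024  # e.g. reserve 10 MiB (could also be a percentage)

    def object_side_ok(free_bytes, obj_size):
        # object server: the incoming object's size is known up front, so it
        # is subtracted before comparing against the reserve
        return free_bytes - obj_size >= FALLOCATE_RESERVE

    def db_side_ok(free_bytes):
        # account/container server (as in patch 541058): only the current free
        # space is checked, because the size of the incoming data isn't known
        return free_bytes >= FALLOCATE_RESERVE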
notmyname | with PUT and POST, it doesn't matter too much, since those are merely adding rows to existing DBs. ie not much space | 00:22 |
mattoliverau | yeah, it's only REPLICATE I really care about. | 00:22 |
mattoliverau | we could argue, now with sharding REPLICATE isn't so much an issue if fallocate_reserve is big enough, which it probably would be in prod | 00:23 |
mattoliverau | ie. even 1% is pretty big | 00:23 |
mattoliverau | So like I mentioned (I think), maybe some documentation so there is a known difference between the object vs db fallocate reserves. | 00:24 |
mattoliverau | we have the rowid difference in the RPC of db replication, so if we know a "standard estimated" row size we could do a rough size check, and return a 507 | 00:26 |
notmyname | yeah, I was just looking through that part of the code to find where the rsync call on DBs is | 00:27 |
mattoliverau | but docs are probably good enough, as the "size" would be a very rough guess | 00:27 |
mattoliverau | from memory, sync (before we decide what to do) is where a check could be made | 00:27 |
notmyname | so you'd prefer, ideally, some measure like (free_space - "size of rows to add") < reserve returns a 507? | 00:28 |
mattoliverau | yeah, as that would align more with what we actually do on the object side | 00:28 |
mattoliverau | and might stop more full disk scenarios | 00:29 |
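A hypothetical sketch of the rough size check being floated here: estimate the incoming bytes from the rowid difference and a "standard" per-row size, and refuse with a 507 if accepting them would eat into the reserve. The function, its parameters and EST_ROW_SIZE are illustrative assumptions, not existing Swift code.

    # Hypothetical sketch of the rough size check discussed above; the names
    # and the 507 plumbing are assumptions, not existing Swift code.
    EST_ROW_SIZE = 1300  # rough upper bound per container DB row, in bytes

    def estimated_sync_fits(free_bytes, sender_max_rowid, local_max_rowid,
                            reserve_bytes):
        rows_to_sync = max(sender_max_rowid - local_max_rowid, 0)
        estimated_bytes = rows_to_sync * EST_ROW_SIZE
        # mirror the object-side behaviour: subtract the estimated incoming
        # size before comparing against the reserve
        return free_bytes - estimated_bytes >= reserve_bytes

    # If this returns False, the receiving RPC handler could answer the
    # REPLICATE request with a 507 instead of letting the sync proceed.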
notmyname | I think you're technically right. but I'm not sure there's a practical difference | 00:29 |
mattoliverau | or just shard everything so all our containers are small enough so the "next" "free space < fallocate reserve" check will work and there will still be space left on disk :P | 00:29 |
notmyname | a row is around 1300 bytes, max | 00:30 |
notmyname | like even if you've got a million rows to sync without doing an rsync, that's still just a MB | 00:30 |
mattoliverau | on an rsync of a whole db, it would be 1300 bytes * number of rows. | 00:31 |
notmyname | sorry, 1GB | 00:31 |
notmyname | and is the fallocate_reserve check going to be *that* sensitive? | 00:31 |
mattoliverau | so maybe ssync isn't as scary, but the check could also go in before rsync_then_merge or plain rsync as well, right, so there _could_ be 1 billion rows. | 00:32 |
notmyname | small DB drives are hundreds of GB. even if you have a 200GB drive, 1% is still more than a million rows | 00:32 |
mattoliverau | sorry, usync | 00:32 |
notmyname | in that case, wouldn't we be rsyncing the DB instead of doing a billion rows? (likely, at least) | 00:33 |
mattoliverau | yeah, but we can do a 507 before trying to rsync. though yeah, I guess rsync will fail before then. so maybe you're right.. maybe it doesn't matter | 00:33 |
mattoliverau | I assume rsync does a space check before sending data | 00:34 |
notmyname | you're more familiar with this part of the code than I am right now, so correct me where I'm getting it wrong :-) | 00:34 |
notmyname | well, I'm not going to assume rsync does a space check. (well, I'm sure it does one, but not with our buffer.) that's one of the main reasons to get to an ssync | 00:34 |
notmyname | but isn't the rsync triggered via the REPLICATE verb? | 00:35 |
notmyname | so the check there will give some protection even if a billion-row DB needs to be synced | 00:35 |
mattoliverau | after the space/rowid difference check. | 00:35 |
notmyname | (back of napkin, 1B rows = 1.25TB ish) | 00:35 |
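The back-of-napkin figures in this exchange, spelled out using the rough ~1300 bytes/row estimate mentioned above:

    # Back-of-napkin numbers from the discussion, using the rough ~1300
    # bytes/row maximum mentioned above.
    ROW_SIZE = 1300  # bytes, rough per-row maximum

    print(1_000_000 * ROW_SIZE / 1e9)       # ~1.3  -> a million rows is roughly 1.3 GB
    print(1_000_000_000 * ROW_SIZE / 1e12)  # ~1.3  -> a billion rows is roughly 1.3 TB
    print(0.01 * 200e9 / ROW_SIZE)          # ~1.5e6 -> 1% of a 200 GB drive covers >1M rows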
mattoliverau | so it calls the rpc sync, and replicate decides what to do after that | 00:35 |
mattoliverau | ie, to rsync, rsync_then_merge or usync | 00:36 |
notmyname | sure, but that's my point. it's after the check | 00:36 |
notmyname | which means we've got some protection, right? | 00:36 |
notmyname | I'm with you in that it would be great to calculate it exactly based on the size of the incoming data. but we can't do that with rsync | 00:37 |
notmyname | I'm not trying to get you to change your vote. just trying to understand the situation :-) | 00:39 |
notmyname | (and if you change your vote, that's ok too) ;-) | 00:39 |
mattoliverau | yes we can, we have the broker so we know the number of rows or the rowid difference before we tell it to do the rsync, so the rpc side can return a 507 like it does a 404 (or indicate the "rsync" mode) | 00:39 |
mattoliverau | so the new check will help, in the sense that it'll look to see if we are under fallocate_reserve, but if our free space is bigger by 1 byte than the current fallocate_reserve, we'll let the usync, rsync_then_merge or rsync through. if the db (in rsync and rsync_then_merge) is bigger than the free space then rsync should fail.. if it's less then we'll be letting it through. | 00:42 |
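Putting the flow just described in one place: a simplified, hypothetical sketch of how a sync strategy might be chosen after the REPLICATE 'sync' RPC. The function name, statuses and row threshold are illustrative, not the actual db_replicator code.

    # Simplified, hypothetical sketch of the decision made after the
    # REPLICATE 'sync' RPC; names and threshold are illustrative, not the
    # actual db_replicator code.
    def choose_sync_strategy(rpc_status, local_max_row, remote_max_row,
                             usync_row_limit=50000):
        if rpc_status == 507:
            return 'skip'              # remote reports insufficient storage
        if rpc_status == 404:
            return 'rsync'             # remote has no copy: ship the whole DB file
        row_diff = local_max_row - remote_max_row
        if row_diff > usync_row_limit:
            return 'rsync_then_merge'  # far behind: rsync the file, then merge rows
        return 'usync'                 # close enough: send just the missing rows

    print(choose_sync_strategy(200, local_max_row=120000, remote_max_row=110000))
    # -> 'usync'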
notmyname | yeah. so with the current patch the only ops way to prevent that is to have more fallocate_reserve | 00:43 |
mattoliverau | back of the napkin maths says maybe usync isn't much of a problem then. and maybe the rsync is just an edge case. But whenever we rebalance we would tend to do rsyncs from the handoff node. though I'd hope the rebalance means there is more space where it's moving :) | 00:43 |
mattoliverau | notmyname: yeah, and sharding ;) | 00:44 |
notmyname | sharding. of course | 00:44 |
notmyname | I'm *obviously* use sharding ;-) | 00:44 |
notmyname | *I mean | 00:44 |
notmyname | my opinion is that the current patch will help people prevent filling up DB drives. it's not perfect, though. and you've got some good ideas on how to make it better | 00:45 |
notmyname | which gets us to... fix the current patch or have a follow-on | 00:46 |
mattoliverau | +1. just wanted to make sure we had the discussion and maybe we can make a note (in a comment or feature request) to one day revisit to improve it and align it with the object side. | 00:47 |
mattoliverau | the patch makes the situation better so thats great :) | 00:47 |
notmyname | then how about this as a suggestion: you add some comments into the existing patch in the rpc handler(?) about how the row count could be used to do something a little smarter. then push that and land the patch | 00:48 |
* notmyname open to other suggestions too | 00:48 | |
mattoliverau | A follow-up would be adding an extra check (based on number of rows or row difference) in the rpc to make a better decision (507). | 00:49 |
*** gyee has quit IRC | 00:49 | |
mattoliverau | notmyname: yeah good idea :) | 00:50 |
notmyname | mattoliverau: thanks. and thanks for talking it through with me! | 00:50 |
mattoliverau | nps, thanks for finding the max size per row. That was the missing part that I hadn't got around to finding out :) | 00:51 |
notmyname | it came up a lot internally with sales guys when making recommendations for customers and when talking about sharding ;-) | 00:52 |
notmyname | 1300 isn't exact, and it's certainly towards the max size instead of what I'd expect to be "normal", but it's relatively useful for what it could be | 00:53 |
notmyname | I think the exact figure is closer to 1350. 1355 or 1356, something like that | 00:53 |
notmyname | I only got that by looking at the table definition. not counting any index overhead or anything like that | 00:54 |
mattoliverau | yeah, a ballpark is all we can really hope to achieve anyway | 00:54 |
mattoliverau | on another plus side, I seem to have a monasca plugin that does the clayg number of primary/handoffs check. Now just finalising the swift_recon one. | 00:56 |
notmyname | cool | 00:57 |
notmyname | ah. I'm getting summoned for dinner prep. talk to you later | 00:58 |
mattoliverau | notmyname: kk, thanks for the chat :) enjoy o/ | 00:58 |
*** drewn3ss has quit IRC | 02:21 | |
*** spsurya_ has joined #openstack-swift | 03:59 | |
*** links has joined #openstack-swift | 04:02 | |
*** ianychoi_ has joined #openstack-swift | 05:27 | |
*** ianychoi has quit IRC | 05:30 | |
*** cshastri has joined #openstack-swift | 05:49 | |
*** ccamacho has joined #openstack-swift | 06:12 | |
*** ccamacho has quit IRC | 06:12 | |
*** ccamacho has joined #openstack-swift | 06:13 | |
*** mikecmpbll has joined #openstack-swift | 06:21 | |
*** mikecmpbll has quit IRC | 06:23 | |
*** armaan has joined #openstack-swift | 06:23 | |
*** dr_gogeta86 has joined #openstack-swift | 06:24 | |
*** dr_gogeta86 has quit IRC | 06:24 | |
*** dr_gogeta86 has joined #openstack-swift | 06:24 | |
*** armaan has quit IRC | 06:31 | |
*** bharath1234 has joined #openstack-swift | 06:43 | |
*** armaan has joined #openstack-swift | 06:48 | |
*** armaan has quit IRC | 06:56 | |
*** armaan has joined #openstack-swift | 06:57 | |
*** armaan has quit IRC | 06:57 | |
*** gkadam has joined #openstack-swift | 07:01 | |
*** tesseract has joined #openstack-swift | 07:12 | |
*** DHE has quit IRC | 07:39 | |
*** DHE has joined #openstack-swift | 07:40 | |
*** drewn3ss has joined #openstack-swift | 08:46 | |
*** rcernin has quit IRC | 08:46 | |
*** armaan has joined #openstack-swift | 08:55 | |
*** psachin has joined #openstack-swift | 09:01 | |
*** armaan has quit IRC | 09:08 | |
*** armaan has joined #openstack-swift | 09:09 | |
*** armaan has quit IRC | 09:13 | |
*** armaan has joined #openstack-swift | 09:15 | |
*** armaan has quit IRC | 10:04 | |
*** armaan has joined #openstack-swift | 10:14 | |
*** armaan has quit IRC | 11:27 | |
acoles | zaitcev: git rebase -i and edit instruction can be useful | 12:23 |
acoles | zaitcev: BTW, I'm transitioning now and won't have much time for PUT+POST in immediate future but I will try to check back at some point. Please use/discard/improve any of my patches as you see fit. I hope some of it has helped. | 12:27 |
kota_ | acoles: thanks for spending so much time on Swift upstream. Your contributions have been excellent for all the progress Swift has made. | 12:28 |
kota_ | acoles: I've been on summer vacation since yesterday, but I wanted to say thank you before you leave this channel. | 12:29 |
acoles | kota_: thanks! I expect I will lurk in channel from time to time. Have a good vacation :) | 12:30 |
*** hoonetorg has quit IRC | 12:43 | |
*** links has quit IRC | 12:59 | |
mattoliverau | acoles: just wanted to let you know it's been an honour to work with you upstream! You'll be missed, and I hope you at least visit here often | 13:00 |
*** hoonetorg has joined #openstack-swift | 13:01 | |
*** armaan has joined #openstack-swift | 13:04 | |
*** armaan has quit IRC | 13:16 | |
*** armaan has joined #openstack-swift | 13:23 | |
*** kei_yama has quit IRC | 13:24 | |
*** armaan has quit IRC | 13:25 | |
*** armaan has joined #openstack-swift | 13:26 | |
*** armaan has quit IRC | 13:33 | |
*** armaan has joined #openstack-swift | 13:36 | |
*** armaan has quit IRC | 13:39 | |
*** armaan has joined #openstack-swift | 13:40 | |
*** armaan has quit IRC | 13:40 | |
*** psachin has quit IRC | 13:42 | |
*** cshastri has quit IRC | 13:55 | |
openstackgerrit | Nguyen Hai proposed openstack/swift master: add lower-constraints job https://review.openstack.org/556255 | 14:04 |
notmyname | good morning | 14:36 |
*** armaan has joined #openstack-swift | 15:07 | |
*** tesseract has quit IRC | 15:21 | |
*** mikecmpbll has joined #openstack-swift | 15:38 | |
*** ccamacho has quit IRC | 16:05 | |
*** armaan has quit IRC | 16:14 | |
*** mikecmpbll has quit IRC | 16:26 | |
*** mikecmpbll has joined #openstack-swift | 16:44 | |
*** mikecmpbll has quit IRC | 16:45 | |
*** mikecmpbll has joined #openstack-swift | 16:49 | |
*** mikecmpbll has quit IRC | 16:50 | |
*** mikecmpbll has joined #openstack-swift | 16:55 | |
*** gkadam has quit IRC | 16:55 | |
*** kei-ichi has quit IRC | 17:20 | |
*** kei-ichi has joined #openstack-swift | 17:23 | |
*** mikecmpbll has quit IRC | 17:34 | |
*** jistr has quit IRC | 18:11 | |
*** jistr has joined #openstack-swift | 18:11 | |
openstackgerrit | John Dickinson proposed openstack/python-swiftclient master: Add bash_completion to swiftclient https://review.openstack.org/579037 | 18:24 |
*** gkadam has joined #openstack-swift | 18:50 | |
*** gkadam has quit IRC | 19:17 | |
*** mikecmpbll has joined #openstack-swift | 20:02 | |
*** mikecmpbll has quit IRC | 20:04 | |
zaitcev | Oh god, John | 20:36 |
zaitcev | [zaitcev@lembas python-swiftclient-0]$ ls /etc/bash_completion.d/| wc -l | 20:37 |
zaitcev | 9 | 20:37 |
zaitcev | Actually, never mind. It's not as bad as I was afraid. | 20:37 |
*** gyee has joined #openstack-swift | 21:25 | |
*** nguyenhai93 has joined #openstack-swift | 21:26 | |
*** nguyenhai_ has quit IRC | 21:29 | |
*** zaitcev_ has joined #openstack-swift | 21:40 | |
*** ChanServ sets mode: +v zaitcev_ | 21:40 | |
*** zaitcev has quit IRC | 21:43 | |
-openstackstatus- NOTICE: logs.openstack.org is offline, causing POST_FAILURE results from Zuul. Cause and resolution timeframe currently unknown. | 21:52 | |
*** ChanServ changes topic to "logs.openstack.org is offline, causing POST_FAILURE results from Zuul. Cause and resolution timeframe currently unknown." | 21:52 | |
*** itlinux_ has joined #openstack-swift | 22:00 | |
*** itlinux_ has quit IRC | 22:49 | |
*** ChanServ changes topic to "OpenStack Swift object storage | Logs: http://eavesdrop.openstack.org/irclogs/%23openstack-swift/ | Meetings: https://wiki.openstack.org/wiki/Meetings/Swift | Review Dashboard: http://not.mn/reviews.html" | 23:36 | |
-openstackstatus- NOTICE: logs.openstack.org is back on-line. Changes with "POST_FAILURE" job results should be rechecked. | 23:36 |