*** jamielennox|away is now known as jamielennox | 00:06 | |
*** pberis has joined #openstack-swift | 00:18 | |
*** dmorita has joined #openstack-swift | 00:34 | |
*** kota_ has joined #openstack-swift | 00:39 | |
*** km has joined #openstack-swift | 00:41 | |
*** geaaru has quit IRC | 01:05 | |
*** fanyaohong has quit IRC | 01:09 | |
*** thumpba has quit IRC | 01:11 | |
*** pberis has quit IRC | 01:18 | |
*** kota_ has quit IRC | 02:03 | |
openstackgerrit | Merged openstack/swift: Set connection timeout in container sync https://review.openstack.org/156943 | 03:16 |
*** thumpba has joined #openstack-swift | 03:25 | |
*** geaaru has joined #openstack-swift | 03:31 | |
*** thumpba has quit IRC | 03:44 | |
*** thumpba has joined #openstack-swift | 03:53 | |
*** km_ has joined #openstack-swift | 04:01 | |
*** km has quit IRC | 04:03 | |
*** thumpba has quit IRC | 04:19 | |
*** kota_ has joined #openstack-swift | 04:27 | |
*** ppai has joined #openstack-swift | 04:50 | |
notmyname | I'd love to see the ec_review patches get finished up in the next 24 hours. | 04:54 |
notmyname | that will give the other pending-on-master patches a chance to land so we can make an RC on tuesday | 04:55 |
*** haomaiwang has joined #openstack-swift | 05:09 | |
kota_ | notmyname: ok, I'm going to hurry up and review the patches on the ec_review branch. | 05:11 |
*** geaaru has quit IRC | 05:17 | |
openstackgerrit | Pratik Mallya proposed openstack/python-swiftclient: Accept token and tenant_id auth https://review.openstack.org/172791 | 05:45 |
*** km_ has quit IRC | 05:48 | |
*** kota_ has quit IRC | 05:59 | |
*** km has joined #openstack-swift | 06:01 | |
cschwede | Good Morning! | 06:40 |
mattoliverau | cschwede: Guten Morgen, have a good weened? | 06:52 |
mattoliverau | *weekend | 06:52 |
cschwede | mattoliverau: Good Morning Matthew! Yes, thanks - finally spring is arriving over here thus enjoying the sun :D How about you? | 06:52 |
mattoliverau | Its Autumn, so cooling down.. but like you means its wonderful weather and enjoying actually being out in the sun :) | 06:53 |
mattoliverau | (without burning) | 06:54 |
*** nshaikh has joined #openstack-swift | 07:01 | |
mattoliverau | But yeah, had a good weekend :) | 07:03 |
*** jamielennox is now known as jamielennox|away | 07:11 | |
cschwede | i’m wondering about the test errors in the reconstructor (https://review.openstack.org/#/c/170339/) and asking myself if this is something we need to worry about. the tests pass locally on my VM though | 07:22 |
*** jistr has joined #openstack-swift | 07:24 | |
*** chlong has quit IRC | 07:25 | |
*** mmcardle has joined #openstack-swift | 07:34 | |
*** geaaru has joined #openstack-swift | 07:46 | |
*** krykowski has joined #openstack-swift | 07:49 | |
*** jordanP has joined #openstack-swift | 08:00 | |
*** ujjain has joined #openstack-swift | 08:28 | |
*** acoles_away is now known as acoles | 08:28 | |
*** ujjain has quit IRC | 08:28 | |
acoles | morning | 08:28 |
openstackgerrit | Lorcan Browne proposed openstack/swift: Add lowest option to swift-recon disk usage output https://review.openstack.org/167236 | 08:29 |
*** joeljwright has joined #openstack-swift | 08:32 | |
*** tanee has quit IRC | 08:40 | |
*** tanee has joined #openstack-swift | 08:41 | |
acoles | cschwede: i just took a look at the test_reconstructor errors, they all appear to be due to comparing values derived from lists that may not always have same order e.g. dict keys | 08:46 |
*** haigang has joined #openstack-swift | 08:46 | |
acoles | cschwede: so i think the problem is that the tests should sort before comparing, and not a fundamental problem with the unit under test | 08:46 |
cschwede | acoles: Morning! Well, if it is from a dict we should see these errors with a 50/50 chance, or not? because the dicts have only two entries | 08:47 |
acoles | cschwede: yes. it will need to be fixed. | 08:51 |
acoles | cschwede: or do you never see it locally? | 08:52 |
cschwede | acoles: no, not on my three tests, thus i’m wondering | 08:52 |
acoles | cschwede: i am just looping the tests to try to reproduce | 08:54 |
*** km has quit IRC | 08:56 | |
*** jamielennox|away is now known as jamielennox | 09:02 | |
acoles | cschwede: hmmm, i can't reproduce either. but the failures are due to misordering when comparing suffix lists having 'abc' and '123' | 09:07 |
cschwede | acoles: yes, i think sometimes there is a set() missing - while some tests already use it | 09:13 |
*** haigang has quit IRC | 09:13 | |
*** tanee has quit IRC | 09:14 | |
*** tanee has joined #openstack-swift | 09:14 | |
*** haigang has joined #openstack-swift | 09:15 | |
cschwede | oh wait, those are lists, not dicts. | 09:15 |
acoles | cschwede: the failed unit test report for the assertion at line 1500 test_reconstructor.py.test_build_jobs_handoff shows the expected value stub_hashes.keys() to be ['abc', '123'], but... | 09:21 |
cschwede | acoles: do we still rely on 2.6 for testing? i don’t think so, right? we could use https://docs.python.org/2/library/unittest.html#unittest.TestCase.assertItemsEqual then | 09:21 |
cschwede | which is basically assertEqual(sorted(expected), sorted(actual)) | 09:22 |
acoles | cschwede: locally if i construct same stub_hashes dict the key order is reversed, so the test passes | 09:22 |
acoles | cschwede: re py26 i'm not sure we have officially abandoned it - the swiftstack CI runs tox -e py26 | 09:22 |
cschwede | acoles: ah yeah, you’re right. thus simply throwing in a few sorted() should do it then | 09:23 |
*** aix has joined #openstack-swift | 09:23 | |
cschwede | well, a few more... | 09:24 |
cschwede | acoles: i create a diff | 09:24 |
acoles | cschwede: yes, it's weird though because the value being tested must also be a dict with same keys so on same machine similar dicts are sorting differently?? one list is based on dictA.items(), the other on dictB.keys(), both have same keys but we get different ordering. not sure if that should surprise me or not, i know dict ordering is arbitrary. | 09:25 |
acoles | cschwede: thanks for doing the diff | 09:26 |
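Editor's sketch of the fix being discussed: sort both sides before comparing, which works on Python 2.6 as well as later versions. The dict contents and the test class name are made up for illustration; only the `stub_hashes` name and the `'abc'`/`'123'` keys come from the conversation.

```python
import unittest


class OrderInsensitiveComparison(unittest.TestCase):
    def test_keys_match_regardless_of_order(self):
        # Hypothetical stand-in for the stub_hashes dict mentioned above.
        stub_hashes = {'abc': 'hash-1', '123': 'hash-2'}
        expected = ['abc', '123']
        # Sorting both sides removes any dependence on dict key order.
        self.assertEqual(sorted(expected), sorted(stub_hashes.keys()))
```

A plain `assertEqual(expected, stub_hashes.keys())` would depend on whatever key order the interpreter happens to produce, which matches the intermittent failures described here.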
*** kei_yama has joined #openstack-swift | 09:26 | |
*** haigang has quit IRC | 09:27 | |
*** haigang has joined #openstack-swift | 09:28 | |
*** ppai_ has joined #openstack-swift | 09:29 | |
*** ppai has quit IRC | 09:33 | |
acoles | clayg: ^^ fyi scrollback | 09:35 |
*** Kirgahn has joined #openstack-swift | 09:38 | |
Kirgahn | Hello everyone! | 09:39 |
*** theanalyst has quit IRC | 09:42 | |
Kirgahn | There's a problem I'm having with a single node swift deployment I made within a v.m. - I can create containers and upload/download objects just fine but, when I try to delete something, it refuses with an http 400 Object DELETE failed: Invalid path: /device0/3674/AUTH_8f63721ec8734d29adb22ce73cc | 09:43 |
Kirgahn | got any advice? | 09:43 |
*** theanalyst has joined #openstack-swift | 09:44 | |
ppai_ | Kirgahn, could you share the command you used to issue a delete request | 09:44 |
cschwede | acoles: clayg: diff that wraps some dicts in sorted(): http://paste.openstack.org/show/203488/ | 09:45 |
Kirgahn | via bash "swift delete SwiftContainer", via horizon i just select the object and delete it - same result | 09:46 |
Kirgahn | thnx | 09:46 |
acoles | cschwede: thanks! did you post that link on the gerrit review? | 09:46 |
cschwede | acoles: yes | 09:47 |
acoles | great | 09:47 |
ppai_ | Kirgahn, I hope the container is empty | 09:48 |
Kirgahn | the container is not empty - as I stated, when I try to delete the single object it contains I get the very same error | 09:49 |
Kirgahn | I'm aware of the fact that you can't delete containers that are not empty | 09:49 |
ppai_ | hmmm.interesting, did u take a look at the logs ? | 09:50 |
*** yuan has quit IRC | 09:52 | |
*** yuan has joined #openstack-swift | 09:53 | |
Kirgahn | yes, this is what I get: "[root@swift ~]# swift delete test 880568-sophie-howard--AhaWallpaper.com.jpg Object DELETE failed: http://192.168.124.133:8080/v1/AUTH_8f63721ec8734d29adb22ce73ccd0ac5/test/880568-sophie-howard--AhaWallpaper.com.jpg 400 Bad Request [first 60 chars of response] Invalid path: /device0/3154/AUTH_8f63721ec8734d29adb22ce73cc" | 09:53 |
Kirgahn | I can easily download that object | 09:53 |
*** haigang has quit IRC | 09:54 | |
*** haigang has joined #openstack-swift | 09:55 | |
Kirgahn | "[root@swift ~]# swift download test 880568-sophie-howard--AhaWallpaper.com.jpg 880568-sophie-howard--AhaWallpaper.com.jpg [auth 0.408s, headers 0.648s, total 0.658s, 1.556 MB/s]" | 09:56 |
ppai_ | If it's a saio vm, you'll be having access to swift logs | 09:57 |
Kirgahn | it's not, i actually manually deployed and connected it to an existing openstack deployment | 09:58 |
Kirgahn | I've already tried to increase verbosity with "log_name = swift log_facility = LOG_LOCAL0 log_level = DEBUG log_headers = false log_address = /dev/log" in each server conf file | 09:59 |
ppai_ | looking from the code, that message "Invalid path: ****" is thrown by split_path() which means there's something wrong with the path | 09:59 |
*** haigang has quit IRC | 10:00 | |
Kirgahn | can i do anything else to increase verbosity? the weird thing is that I can download the object easily but I can't delete it, as if the path would change | 10:05 |
acoles | Kirgahn: someone asked here with a similar problem recently and IIRC there was a misconfigured account or container or object server port in the configs which was causing maybe a container server to be listening on port that should be an object server | 10:18 |
Kirgahn | mmm, thanks for the pointer I'll triple check the config | 10:18 |
acoles | Kirgahn: double check your config file and ring file port numbers | 10:19 |
acoles | Kirgahn: its just that error (invalid path) is symptomatic of a request being sent to the wrong server type | 10:20 |
acoles | Kirgahn: but tbh i'm not sure how you would have uploaded the object in that case | 10:21 |
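Editor's sketch of why a request that reaches the wrong server type can come back as "Invalid path". This is a simplified, hypothetical path splitter, not Swift's actual `split_path()`; the segment limits and the `AUTH_test`/`photo.jpg` names are assumptions chosen for illustration.

```python
def split_path_sketch(path, minsegs, maxsegs):
    """Split '/a/b/c' into segments, rejecting paths outside the allowed range.

    Hypothetical simplification for illustration only.
    """
    segs = path.strip('/').split('/')
    if not minsegs <= len(segs) <= maxsegs or '' in segs:
        raise ValueError('Invalid path: %s' % path)
    return segs


# Placeholder object-style path: device/partition/account/container/object.
object_path = '/device0/3674/AUTH_test/test/photo.jpg'

# A server that expects five segments accepts the path.
print(split_path_sketch(object_path, 5, 5))

# A server that allows at most four segments rejects the same path.
try:
    split_path_sketch(object_path, 3, 4)
except ValueError as err:
    print(err)
```

Under these assumptions the path itself is fine; it is simply too long for the server that received it, which is why checking ring and config port numbers is the suggested next step.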
*** ppai_ has quit IRC | 10:26 | |
Kirgahn | Thanks acoles! I had a weird double entry in the object.builder | 10:31 |
Kirgahn | swift-ring-builder /etc/swift/object.builder /etc/swift//object.builder, build version 3 4096 partitions, 1.000000 replicas, 1 regions, 1 zones, 2 devices, 0.00 balance The minimum number of hours before a partition can be reassigned is 1 Devices: id region zone ip address port replication ip replication port name weight partitions balance meta 0 0 0 127.0.0.1 6002 127. | 10:32 |
Kirgahn | i removed the wrong entry, rebalanced and voilà! | 10:32 |
Kirgahn | still weird though | 10:32 |
Kirgahn | one entry was serving upload and download requests, the other one was handling deletes | 10:33 |
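Editor's sketch of the kind of check that would have flagged the stray entry. The device list is invented sample data (the real builder output above is truncated), and the expected port of 6000 is an assumption about how this object server was configured.

```python
def suspicious_devices(devices, expected_port):
    """Return the devices whose port differs from the one the ring should use."""
    return [dev for dev in devices if dev['port'] != expected_port]


# Hypothetical device entries, loosely shaped like ring builder output.
devices = [
    {'id': 0, 'ip': '127.0.0.1', 'port': 6002, 'device': 'device0'},
    {'id': 1, 'ip': '127.0.0.1', 'port': 6000, 'device': 'device0'},
]

# Assumed object-server port; use whatever bind_port your config sets.
for dev in suspicious_devices(devices, expected_port=6000):
    print('check device id %(id)s: port %(port)s' % dev)
```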
acoles | Kirgahn: ok, glad you found it. | 10:36 |
*** ppai_ has joined #openstack-swift | 10:38 | |
*** aix has quit IRC | 10:45 | |
tab___ | Which database is best to use/prefered one to use with Swift? Should I go with SQLite or MySQL/MariaDB? | 10:56 |
*** ppai_ has quit IRC | 11:19 | |
portante | tab___: what do you mean by "best to use/preferred"? Are you asking if there is an option to tell which DB to use? Or are you asking what to use in some larger system working with Swift? | 11:26 |
*** tab___ has quit IRC | 11:27 | |
*** ppai_ has joined #openstack-swift | 11:33 | |
*** ujjain has joined #openstack-swift | 11:34 | |
*** kei_yama has quit IRC | 11:43 | |
*** jamielennox is now known as jamielennox|away | 12:12 | |
*** EmilienM|afk is now known as EmilienM | 12:19 | |
*** ppai_ has quit IRC | 12:33 | |
*** PurpleJack has joined #openstack-swift | 12:38 | |
*** jroll has quit IRC | 12:50 | |
*** jroll has joined #openstack-swift | 12:50 | |
*** Kirgahn has quit IRC | 12:55 | |
*** openstackgerrit has quit IRC | 13:00 | |
*** openstackgerrit has joined #openstack-swift | 13:03 | |
*** krtaylor has quit IRC | 13:03 | |
*** erlon has joined #openstack-swift | 13:15 | |
*** ozialien has joined #openstack-swift | 13:16 | |
*** proteusguy has quit IRC | 13:19 | |
*** petertr7 has joined #openstack-swift | 13:20 | |
*** aix has joined #openstack-swift | 13:24 | |
*** dmorita has quit IRC | 13:28 | |
*** proteusguy has joined #openstack-swift | 13:31 | |
*** annegentle has joined #openstack-swift | 13:44 | |
*** Trixboxer has joined #openstack-swift | 14:04 | |
*** lpabon has joined #openstack-swift | 14:07 | |
*** nshaikh has quit IRC | 14:15 | |
*** tellesnobrega has quit IRC | 14:17 | |
*** vinsh has quit IRC | 14:18 | |
*** tellesnobrega has joined #openstack-swift | 14:19 | |
*** jistr is now known as jistr|mtg | 14:29 | |
*** vinsh has joined #openstack-swift | 14:33 | |
notmyname | good morning | 14:49 |
acoles | notmyname: good morning | 14:51 |
notmyname | big day today (I hope) :-) | 14:51 |
acoles | notmyname: what's your plan B? :P | 14:52 |
notmyname | do it the next day? ;-) | 14:53 |
*** proteusguy has quit IRC | 14:54 | |
notmyname | acoles: cschwede: I'm glad you were looking at the reconstructor error | 14:54 |
notmyname | looks like 2 ec_review patches need a 2nd +2. and 4 need 2 +2s | 14:56 |
notmyname | how up to date with issues is https://etherpad.openstack.org/p/swift_ec_triage | 14:57 |
*** welldannit has joined #openstack-swift | 14:57 | |
acoles | notmyname: oh i thought we had more double +2's | 14:58 |
acoles | notmyname: 9 reviews in total right? | 14:58 |
notmyname | some might have been lost with a recent push | 14:58 |
notmyname | ya, I'm looking at https://review.openstack.org/#/q/status:open+project:openstack/swift+branch:feature/ec_review+topic:bp/swift-ec,n,z | 14:59 |
acoles | notmyname: the -2 on patch 169985 is obscuring all the +2's there | 15:01 |
patchbot | acoles: https://review.openstack.org/#/c/169985/ | 15:01 |
notmyname | oh, yeah. but that one is a special case. yes. that one too is ready to go, it seems | 15:02 |
notmyname | ok, I added a +A there to help with tracking | 15:03 |
notmyname | ok, so 4 have +A and 5 don't | 15:04 |
acoles | notmyname: i'm close to +2 on patch 169989 but need some reassurance on a query there | 15:04 |
patchbot | acoles: https://review.openstack.org/#/c/169989/ | 15:04 |
acoles | or a slap round the head for being stupid | 15:04 |
acoles | and i am reviewing the reconstructor | 15:05 |
acoles | notmyname: it feels like it has been less painful than SP was in terms of stuff changing 'underneath' patches up the chain | 15:06 |
acoles | ...so far | 15:06 |
notmyname | :-) | 15:06 |
acoles | nice video (blog) btw, reminded me how nice the view is from your offices | 15:07 |
notmyname | one of the biggest differences, IMO, between SP and EC is that EC has been more of a whole-community effort from the beginning. whereas SP was more of a few people doing it and then the merge was everyone else coming up to speed | 15:07 |
*** jistr|mtg is now known as jistr | 15:07 | |
acoles | yup that ^^ was certainly the case for me | 15:07 |
notmyname | and, yes, this seems less painful than SP was | 15:07 |
notmyname | so THANK YOU! :-) | 15:07 |
notmyname | well, that's an interesting email post this morning: "eventlet 0.17.3 is now fully Python 3 compatible" | 15:08 |
notmyname | tdasilva: I know you aren't around, but congrats on the new baby! | 15:11 |
acoles | notmyname: oh wow, did he/she arrive early? | 15:12 |
*** annegentle has quit IRC | 15:12 | |
notmyname | yes he did. tdasilva sent me an email yesterday. | 15:12 |
notmyname | "...our baby Lucas arrived last night. It was a little earlier than expected but baby and mom are doing well." | 15:12 |
*** annegentle has joined #openstack-swift | 15:13 | |
acoles | excellent. so tdasilva may make vancouver after all :P :P | 15:14 |
notmyname | heh | 15:15 |
*** GlennS has left #openstack-swift | 15:20 | |
*** annegentle has quit IRC | 15:23 | |
*** zaitcev has joined #openstack-swift | 15:27 | |
*** ChanServ sets mode: +v zaitcev | 15:27 | |
notmyname | ok, the starred patches on the dashboard are: (1) ec_review patches, (2) stuff for master that already has 2 +2s (so I can remember to land them as soon as ec_Review lands), and (3) a couple of small nice-to-haves | 15:35 |
notmyname | eg, it's the 11th hour for https://review.openstack.org/#/c/166576/ | 15:36 |
*** gyee has joined #openstack-swift | 15:41 | |
*** jistr has quit IRC | 15:45 | |
*** baffle has joined #openstack-swift | 15:49 | |
*** vinsh has quit IRC | 15:57 | |
*** annegentle has joined #openstack-swift | 16:00 | |
notmyname | cschwede: "big thinks..." | 16:01 |
cschwede | notmyname: thanks a lot, the review process works well ;) | 16:04 |
* cschwede thinks more about other things now | 16:04 | |
notmyname | ;-) | 16:04 |
*** Fin1te has joined #openstack-swift | 16:19 | |
*** ozialien has quit IRC | 16:22 | |
openstackgerrit | John Dickinson proposed openstack/swift: 2.3.0 authors and changelog updates https://review.openstack.org/172573 | 16:22 |
notmyname | I included some EC notes in this new version ^^ | 16:23 |
notmyname | any comments and improvements are welcome | 16:23 |
*** jordanP has quit IRC | 16:24 | |
*** aerwin has joined #openstack-swift | 16:27 | |
notmyname | FYI, I'll be going down to santa clara in a couple of hours and be in various states of "online" this afternoon. I'll be fully back online once I'm home, and I expect to be up late finishing EC and the rest of it for tomorrow | 16:30 |
notmyname | I'm definitely available for anything that comes up (and nearly all of you have my cell if it's really important) | 16:31 |
acoles | notmyname: remind me, we need all ec reviews +2 by end of today, correct? | 16:32 |
notmyname | yes, that's the goal. obviously, good quality trumps "today", but I'd like to have everything submitted to the gate (to land on feature/ec_review) by the time I go to bed tonight | 16:33 |
notmyname | then tomorrow morning I'll propose and land the ec_review->master merge commit | 16:33 |
notmyname | then the other stuff that's pending on master with 2 +2s | 16:33 |
acoles | right so ideally clay pulls the corking -2 later today | 16:33 |
notmyname | right :-) | 16:33 |
*** sandywalsh has quit IRC | 16:34 | |
notmyname | then once all that's done, we've got a SHA for the RC and I'll send that on to ttx. | 16:34 |
notmyname | so that's my schedule for the next ~30-36 hours | 16:34 |
notmyname | if I'm lucky, I'll get a good night's sleep too! ;-) | 16:35 |
*** sandywalsh has joined #openstack-swift | 16:35 | |
*** krykowski has quit IRC | 16:36 | |
*** ujjain has quit IRC | 16:51 | |
*** vinsh has joined #openstack-swift | 16:54 | |
*** haomaiw__ has joined #openstack-swift | 17:02 | |
*** haomaiwang has quit IRC | 17:03 | |
*** annegentle has quit IRC | 17:10 | |
*** ozialien has joined #openstack-swift | 17:12 | |
*** geaaru has quit IRC | 17:16 | |
*** Fin1te has quit IRC | 17:18 | |
*** mmcardle has quit IRC | 17:19 | |
clayg | morning | 17:27 |
clayg | sounds like there's a few diffs already to apply - I'm inclined to take them now? | 17:29 |
acoles | clayg: morning. i think i have posted all diffs i have for the moment - i *think* all i have left to review is docs | 17:30 |
*** rdaly2 has joined #openstack-swift | 17:30 | |
*** ozialien has quit IRC | 17:31 | |
clayg | acoles: well some of the diffs are like really good right - fixing mis-spelled variable names and the ilk? | 17:31 |
acoles | clayg: there's one on per-policy-diskfile review for cleaning up old non-durable data that you should probably *review* before applying | 17:32 |
*** aix has quit IRC | 17:32 | |
clayg | non-durable data - waits reclaim age right? | 17:32 |
clayg | the only reason I didn't grab it on friday night was laziness on some tests that were coupled with the replicated behavior | 17:32 |
acoles | clayg: yeah. at least thats the intent :P | 17:32 |
clayg | yeah that one would be nice to verify functionally - i'm not entirely sure how much I tested .durable repair via reconstructor | 17:33 |
clayg | there was some notion that it was slow/wasteful - like it would rebuild the whole damn FA just to get the .durable to the remote - but as long as it works it's probably fine | 17:34 |
*** zhill has joined #openstack-swift | 17:34 | |
acoles | clayg: well this one i didn't write it til sunday :) https://gist.github.com/alistairncoles/a118e563495d9bd0903e | 17:34 |
clayg | so really it's more of a sanity check of the missing check to make sure the receiver really does ask for the hash/timestamp of the non-durable data | 17:34 |
acoles | clayg: so which diff are you referring to? | 17:35 |
clayg | acoles: as far as the race in purge - *I* think pushing the cleanup of the non-fi-indexed-durable out until the next pass should essentially make it a non-race - in that once things have settled down for a whole replication pass it's even less likely that suddenly some writes are going to start showing up again into a much smaller window - but maybe i'm being overly optimistic | 17:37 |
*** aix has joined #openstack-swift | 17:37 | |
clayg | I thought cschwede had a diff for me too - i was just reading email - i need to go look at the reviews on the patch sets | 17:38 |
notmyname | hi clayg | 17:38 |
clayg | ... but I'm inclined to apply whatever fixes we have written | 17:38 |
clayg | notmyname: good morning! | 17:38 |
notmyname | clayg: party day! | 17:38 |
acoles | yeah cschwede fixed up some tests on ec-recon, he's left a diff linked to the review | 17:38 |
notmyname | (and ec_review party) | 17:38 |
notmyname | *an | 17:38 |
clayg | acoles: yeah those! let's get 'em | 17:39 |
notmyname | maybe that's why I don't get asked to throw many parties | 17:39 |
clayg | notmyname: you and peluse and torgomatic and I can figuratively reapply acoles and cschwede's +2's later this morning right? | 17:39 |
notmyname | what do you mean? | 17:40 |
acoles | he means can you proxy vote for me :) | 17:40 |
clayg | notmyname: looks like mattoliverau might have a few nits as well | 17:40 |
clayg | notmyname: well for all the +2's and "this is good enough" - I think there's still a couple of opportunities to improve some | 17:40 |
clayg | I'm more inclined to make the fixes and approve them than to not make the fixes just to avoid any changes | 17:41 |
clayg | I *do* want to get stuff merged today tho because I really want to shift my focus to scale testing in the lab | 17:41 |
notmyname | oh, yeah. the way this whole thing is going, it seems like we're all in it together. so if acoles and cschwede go to bed and you and me and torgomatic and mattoliverau end up +2/+A stuff, then it's fine | 17:42 |
notmyname | I mean, I'm fine with acoles doing stuff when I'm asleep. I'm assuming that's transitive | 17:42 |
clayg | notmyname: cool - that's what I was thinking | 17:42 |
*** zhill has quit IRC | 17:44 | |
clayg | acoles: so that gist with the .data cleanup - is .data in the reclaim rules or not? | 17:44 |
clayg | acoles: yeah i'm totally confused why test_hash_cleanup_listdir_keep_single_old_data didn't fail with that diff | 17:46 |
clayg | acoles: and what about the quorum size stuff - torgomatic have you seen weekend comments on https://review.openstack.org/#/c/169989/ | 17:48 |
acoles | clayg: ok gimme a few mins to catch up - yeah, the quorum size is the one issue i'm not sure on hence no +2 yet on that patch | 17:48 |
*** aix has quit IRC | 17:49 | |
clayg | cschwede: some of the sorted(set( changes in http://paste.openstack.org/show/203488/ don't make sense to me - the set equality was meant to address the ordering - maybe one side forgot to get wrapped in a set? | 17:51 |
clayg | cschwede: the dict.keys == [] obviously needed to be sorted - thanks for those | 17:51 |
*** ozialien has joined #openstack-swift | 17:52 | |
*** rdaly2 has quit IRC | 17:53 | |
*** jkugel has joined #openstack-swift | 17:53 | |
torgomatic | clayg: yeah, I was looking at that quorummy stuff on the train | 17:54 |
torgomatic | and it confuses me | 17:54 |
*** krtaylor has joined #openstack-swift | 17:54 | |
clayg | torgomatic: well all the multi-phase PUT stuff confuses me | 17:56 |
clayg | torgomatic: so I'm sure I'm more confused than you are | 17:57 |
clayg | torgomatic: probably only acoles knows what to do - and he's claiming ignorance | 17:57 |
acoles | i could trump you both on confusion :) | 17:57 |
torgomatic | I think it's mostly a question of how much is a quorum | 17:58 |
notmyname | I'm driving to santa clara now. I'll be online as much as possible this afternoon | 17:58 |
clayg | notmyname: STOP GOING PLACES!? | 17:58 |
notmyname | lol | 17:58 |
clayg | notmyname: you can't have two number one priorities | 17:58 |
notmyname | clayg: talk to dana/manzoor ;-) | 17:58 |
clayg | I'll fucking break some heads - me getting fired won't help you | 17:58 |
notmyname | lol | 17:59 |
notmyname | now I know who to call | 17:59 |
clayg | torgomatic: so was it on purpose that DELETE/POST only need "most" of the replicas? | 17:59 |
*** annegentle has joined #openstack-swift | 17:59 | |
clayg | it makes sense to me - you only need one tombstone for eventual consistency to win out - the deleted-ness of an object is not erasure-coded it's replicated to high hell | 18:00 |
torgomatic | clayg: I guess so; really as long as you can write out a pair of tombstones, you're okay | 18:00 |
torgomatic | this is more on the PUT part though; like, how many still-working PUTs do you need at each step of the way? | 18:00 |
torgomatic | the code is not particularly clear on the matter | 18:00 |
clayg | bah :'( | 18:01 |
clayg | it's probably all my fault | 18:01 |
torgomatic | you are in the Co-Authored-By line ;) | 18:01 |
cschwede | clayg: most of the sorted() wraps was due to the errors reported here: http://logs.openstack.org/39/170339/6/check/gate-swift-python27/78c17d4/ | 18:02 |
cschwede | and i added a few more because i thought they might break as well | 18:02 |
cschwede | clayg: i can limit the diff to the reported errors if you think that’s more safe and helps | 18:03 |
clayg | cschwede: no I think adding all the sorted(dict.keys()) == sorted([]) is good - anywhere we're doing sorted(set()) == sorted(set()) is not needed as sets are already un-ordered | 18:05 |
clayg | i do the audit when I apply the diff | 18:05 |
clayg | cschwede: thanks! | 18:06 |
cschwede | clayg: ah, ok, got it. yes, makes no sense to sort a set | 18:06 |
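Editor's sketch of the distinction clayg draws: set equality already ignores order, while `sorted()` is what an ordered list (such as `dict.keys()`) needs when compared against a literal list. The sample values are taken from the suffix names mentioned earlier.

```python
expected = ['abc', '123']
actual = ['123', 'abc']

# Lists compare element by element, so order matters.
print(expected == actual)                          # False
# Sorting both lists makes the comparison order-insensitive.
print(sorted(expected) == sorted(actual))          # True
# Sets ignore order already, so wrapping them in sorted() adds nothing.
print(set(expected) == set(actual))                # True
# One difference worth knowing: sorted() keeps duplicates, set() drops them.
print(sorted(['abc', 'abc']) == sorted(['abc']))   # False
print(set(['abc', 'abc']) == set(['abc']))         # True
```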
cschwede | clayg: you’re welcome, glad i could help a bit | 18:06 |
clayg | ^ huge understatement! | 18:06 |
clayg | thanks a ton! | 18:06 |
clayg | ok i'm finished looking things over - i'm going to start applying diffs | 18:07 |
acoles | clayg: let me stew on the hash_cleanup_listdir diff a little longer. you're right that .data should be in reclaim_rules but i think i see a better way | 18:08 |
clayg | hopefully torgomatic will find his way out of his commute and show me how to beat all of the stupid out of my craptacular paraphrasing of his proxy work | 18:09 |
clayg | acoles: well.... OHHHHHHhhhhkay | 18:09 |
acoles | clayg: just need to stoke up on caffeine... | 18:09 |
*** Fin1te has joined #openstack-swift | 18:18 | |
*** rdaly2 has joined #openstack-swift | 18:22 | |
*** rdaly2 has quit IRC | 18:24 | |
*** annegentle has quit IRC | 18:33 | |
*** ozialien has quit IRC | 18:33 | |
*** Fin1te has quit IRC | 18:35 | |
*** joeljwright has quit IRC | 19:03 | |
*** joeljwright has joined #openstack-swift | 19:03 | |
clayg | acoles: ok, i'm skipping over the diskfile patch you posted anticipating even more awesomeness coming shortly | 19:08 |
acoles | clayg: k. so test_hash_cleanup_listdir_keep_single_old_data didn't fail with my diff because i'd only find fragments_without_durable if there was more than one (because of the wacky len(files) condition) | 19:10 |
acoles | but if i uncomment your reclaim_rule for .data then a bunch of stuff fails :( and i'm still working through those | 19:11 |
clayg | acoles: interesting | 19:12 |
clayg | acoles: fwiw, i'm pretty sure the only reason the len(files) == 1 check is there is because like with EC, in replication, the tombstone reclamation was added at the very end and the very special "only one file" condition was an easy way to ensure other code paths relating to suffix hashing wouldn't be affected | 19:13 |
acoles | clayg: some of it is just where tests create files at time 42 and the .data gets instantly reclaimed after its written :) because HCL is called after the .data put before the .durable is written (thats due to not touching legacy code and is on our trello todo to fix) | 19:14 |
clayg | but we already know that's all quite janky because of the object-server not plumbing in reclaim age through get_hashes | 19:14 |
acoles | clayg: yeah i'm sure its so | 19:14 |
clayg | acoles: oh yes, I see that would be a problem | 19:15 |
clayg | simple enough to make those tests use self.ts() instead of 42? | 19:15 |
acoles | clayg: yeah, for a minute i was wtf where did my diskfile go??? | 19:15 |
clayg | acoles: not a great experience even in tests i'm sure :\ | 19:16 |
acoles | clayg: yep done that just got some reconstructor tests failing :/ | 19:16 |
acoles | clayg: oh yeah, so i have self.ts() ! | 19:17 |
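Editor's sketch of why a fixture stamped at time 42 vanishes as soon as .data files become reclaimable. The function name and the one-week reclaim age are assumptions for illustration, not the diskfile code under review.

```python
import time


def is_reclaimable(file_timestamp, reclaim_age, now=None):
    """Return True when a file is older than reclaim_age seconds."""
    now = time.time() if now is None else now
    return now - file_timestamp > reclaim_age


ONE_WEEK = 7 * 24 * 60 * 60  # assumed reclaim age, in seconds

# A fixture stamped at epoch second 42 is decades old, so it is reclaimed.
print(is_reclaimable(42, ONE_WEEK))            # True
# A fixture stamped with the current time survives.
print(is_reclaimable(time.time(), ONE_WEEK))   # False
```

This is the reason generating current timestamps in the tests (the `self.ts()` helper mentioned above) avoids the surprise.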
*** thumpba has joined #openstack-swift | 19:17 | |
clayg | are the reconstructor tests the dict keys order thing cschwede found? | 19:17 |
acoles | clayg: no i have some left over status from the fake conn | 19:19 |
* acoles goes digging... | 19:19 | |
notmyname | . | 19:20 |
acoles | .. | 19:20 |
clayg | ../.. | 19:20 |
notmyname | permission denied | 19:21 |
clayg | cschwede: the literal path to the sample internal client config I think is sorta problematic | 19:21 |
clayg | I'm thinking I'll have to build it up from test.__file__ | 19:24 |
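Editor's sketch of building a path relative to a module's `__file__` instead of hard-coding it. The stdlib `json` package stands in for the `test` package, and `sample.conf` is a placeholder file name; neither is taken from the patch.

```python
import json  # stand-in for the package whose location anchors the path
import os


def path_next_to(module, *parts):
    """Join parts onto the directory that contains the given module."""
    module_dir = os.path.dirname(os.path.abspath(module.__file__))
    return os.path.join(module_dir, *parts)


# 'sample.conf' is a hypothetical file name used only for illustration.
print(path_next_to(json, 'sample.conf'))
```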
*** silor has joined #openstack-swift | 19:27 | |
peluse | wow, that's some serious scrollback | 19:27 |
*** silor has quit IRC | 19:36 | |
*** rdaly2 has joined #openstack-swift | 19:38 | |
peluse | acoles, you still there? | 19:42 |
acoles | peluse: i am | 19:43 |
peluse | was just looking at your quorum size comment in get_put_responses() sure seems like a bug to me | 19:43 |
peluse | did you do anything further with it or should I post a gist fix? | 19:43 |
acoles | yeah it felt like a _quorum_size() override method had gone AWOL | 19:43 |
clayg | peluse: acoles: wasn't that call for the .durable? | 19:44 |
acoles | peluse: no i have not taken it any further, i wasn't confident enough that i understood | 19:44 |
peluse | the .durable uses the minimum_responses thing | 19:44 |
clayg | peluse: acoles: feels like the standard/crazy "about half your nodes" rules apply | 19:44 |
*** Fin1te has joined #openstack-swift | 19:44 | |
peluse | so we need regular quorum before we issue the .durable and then minimum .durable responses to be done (and that min is 2 right now) | 19:45 |
clayg | peluse: so the PUT was only requring half instead of the ec policies rule - and tests were passing because why? Does our fake test ec policy just happen to have a quorum that is equal to the replication calculation? | 19:45 |
peluse | don't know that (yet) | 19:46 |
clayg | torgomatic: are you also looking at this? | 19:46 |
peluse | I'll post something in the patch shortly | 19:47 |
acoles | clayg: peluse see my comment on patchset 6, need_quorum is opposite to final_phase so self.have_quorum is called for intermediate (.data) phase | 19:48 |
peluse | OK | 19:49 |
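Editor's sketch of the question clayg raises: a majority-style quorum and an EC-style quorum can agree for a small test policy yet diverge for a larger one. Both formulas are assumptions for illustration (majority as n // 2 + 1, EC as data fragments plus one); the real policy code may compute these differently.

```python
def majority_quorum(total_nodes):
    """Assumed replication-style quorum: a simple majority."""
    return total_nodes // 2 + 1


def ec_quorum_sketch(ndata, extra=1):
    """Assumed EC-style quorum: all data fragments plus a small margin."""
    return ndata + extra


# (ndata, nparity) pairs chosen for illustration only.
for ndata, nparity in [(2, 2), (10, 4)]:
    total = ndata + nparity
    print(ndata, nparity, majority_quorum(total), ec_quorum_sketch(ndata))
```

With 2 data and 2 parity fragments both sketches give 3, so a test using such a policy would not notice which rule was applied. With 10 data and 4 parity fragments the majority is 8, which is fewer than the 10 data fragments, while the EC sketch gives 11.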
acoles | clayg: ok reconstructor tests passing. hardcoded time to create files :/ | 19:54 |
acoles | clayg: so shall we go the whole way and remove the len(files)=1 condition ??? | 19:55 |
clayg | acoles: I thought the only reason we had not is because of some overlap of tests in the base mixin? | 19:55 |
clayg | like we'd have to get rid of the requirement and fix the replicated case in order to have common tests pass on both diskfiles? | 19:55 |
acoles | clayg: yes tests and "remaining consistent with legacy" | 19:56 |
acoles | clayg: actually lets not | 19:56 |
acoles | clayg: its make work | 19:56 |
acoles | can fix it later and do other stuff now | 19:57 |
acoles | clayg: the important thing is that old .datas are now cleaned up | 19:57 |
clayg | yeah that's nice to have I suppose | 19:57 |
acoles | clayg: lets put perfectionism on hold | 19:57 |
* acoles is 12 hours into the day's shift | 19:57 | |
* acoles so feeling lazy | 19:58 | |
*** annegentle has joined #openstack-swift | 19:59 | |
*** annegentle has quit IRC | 20:10 | |
*** PurpleJack has quit IRC | 20:11 | |
clayg | so i like X-Backend-Ec-Archive-Index | 20:13 |
peluse | it likes you too | 20:13 |
clayg | acoles: I think there's a notion maybe that if we pick something more like "Node-Index" that we could reuse it on other diskfiles - just that we're maybe not gaining much by making it ec specific - but i'm not really sure how long the concept will directly map to the ring ordering of nodes | 20:14 |
acoles | clayg: ok well maybe its node-index, at least thats 'generic', but backend-frag-index is both ec specific and inconsistent | 20:16 |
clayg | lol | 20:17 |
*** Guest_ has joined #openstack-swift | 20:20 | |
*** rdaly2 has quit IRC | 20:22 | |
clayg | Anyway, i'm not sure how much sense it would make to try and change x-object-sysmeta-ec-archive-index - so we're really just talking about what header the receiver should use to get the value that it hands over to get_diskfile_from_hash | 20:23 |
*** rdaly2_ has joined #openstack-swift | 20:24 | |
clayg | I think x-backend-fragment-index may be close to ideal, i'm not sure exactly what the schema would be for a backend diskfile that could potentially return the wrong data for a given hash unless it is (like ec) chopping the data into fragments somehow - at least fragment-index would match the diskfile kwarg | 20:26 |
clayg | oh except there we used frag_index :'( | 20:27 |
clayg | peluse: hates variable names longer than four characters, if it were up to him and yuan it'd be all fi's and ic's everywhere | 20:27 |
acoles | clayg: heh. is that how we got a variable named 't' in test_reconstructor | 20:28 |
clayg | lol, so as not to confuse it with a ts | 20:28 |
clayg | i'm sure i've named timestamps t1 and t2 | 20:28 |
peluse | what wrong with "t"? | 20:28 |
acoles | clayg: yup. so self.ts() *does* return me a tombstone file right? :P | 20:29 |
clayg | i think the only thing wrong with that t was how the methods that close over that scope were defined before the variable was initialized - i meant to fix that :\ | 20:29 |
acoles | peluse: try searching for where its declared :) | 20:29 |
acoles | peluse: sorry no offence meant i'm sure i use i and x all over the place | 20:31 |
clayg | obviously -> /\<t\> = | 20:32 |
clayg | acoles: ok, so you have a diff for me? | 20:32 |
acoles | almost there running tests | 20:33 |
clayg | i'm still torn on x-backend-node-index | 20:33 |
acoles | i can hardly type 'diff' ! | 20:33 |
*** Guest_ has quit IRC | 20:33 | |
*** HenryG has quit IRC | 20:33 | |
*** aerwin has quit IRC | 20:33 | |
clayg | acoles: what was bad about x-backend-fragment-index? or data-fragment-index or object-fragment-index or diskfile-fragment-index or something like this? | 20:33 |
acoles | clayg: delegate up to notmyname for a PTL decision :P | 20:33 |
clayg | he's AWOL | 20:34 |
*** rdaly2_ has quit IRC | 20:34 | |
*** joeljwright has quit IRC | 20:34 | |
notmyname | I'm here | 20:34 |
acoles | . | 20:35 |
clayg | sweet! | 20:35 |
clayg | we have one of those hard computer science problems | 20:35 |
notmyname | naming things? | 20:35 |
*** welldannit has quit IRC | 20:36 | |
clayg | notmyname: ssync needs to send the receiver a hint as to what value it should send into the diskfile manager's frag_index kwarg | 20:36 |
peluse | and nobody likes the name I used... | 20:36 |
* peluse thinks same story different day | 20:36 | |
clayg | it's a common code path for replication and ec - but obviously the replicated diskfile ignores the kwarg and ssync receiver defaults to sending in None | 20:36 |
clayg | but in the EC case the current named header is "X-Backend-Ssync-Frag-Index" | 20:37 |
clayg | ... which I sorta think is not so terrible looking at it again | 20:37 |
acoles | so i'm being all OCD but fragment != archive (the sysmeta header) so just thought it should be consistent with the sysmeta thats all | 20:37 |
*** HenryG has joined #openstack-swift | 20:37 | |
acoles | its no big deal lets leave and move on... | 20:37 |
clayg | anyway on the PUT path the proxy sends this value to the object node as X-Object-Sysmeta-EC-Archive-Index | 20:38 |
peluse | clayg acoles, so I posted a change to patch 169989 to address the quorum size but it probably sucks. | 20:38 |
patchbot | peluse: https://review.openstack.org/#/c/169989/ | 20:38 |
clayg | torgomatic: ^ | 20:38 |
acoles | peluse: ok i'll take a look in a mo just getting this diff wrapped for clayg | 20:38 |
clayg | peluse: well but how come tests weren't failing? | 20:39 |
notmyname | clayg: ya, seems odd that PUT and ssync send different headers for the same thing | 20:40 |
clayg | notmyname: but the X-Object-Sysmeta prefix seems to make sense for the EC specific diskfile, the object server doesn't really look at it - it just passes sysmeta down to the diskfile and it does its thing | 20:40 |
notmyname | ya, sysmeta is for storing. x-backend- is for inter-cluster communication (timestamps, etc) | 20:41 |
clayg | notmyname: things are complicated by the fact that there's a coupling of fragment index and the node's offset in the ring (fragment index mostly == primary_node['index'] from the ring) | 20:41 |
clayg | and then there's the fact that diskfile has decided to call the kwarg frag_index - which maybe mostly doesn't matter - it's only the ec specific diskfile that cares about it and like I said the object server doesn't *currently* parse X-Object-Sysmeta-EC-Archive-Index; but it probably will at some point, then *all* diskfiles will receive *some* value for this kwarg, even if it's just a default "None" or something like that - a | 20:44 |
peluse | clayg, I'm looking at that next but suspect with 4+2 EC the quorum for repl being used would be 4 so we'd always get enough. have to go pick up daughter from school in a few, be back later | 20:44 |
clayg | peluse: ok, well I'd be inclined to fix the janky test/fake policy so that it's actually a good test for ec | 20:45 |
acoles | clayg: ok here it is https://gist.github.com/alistairncoles/0b8bddd06474f023ec21 | 20:47 |
acoles | clayg: thats a diff against patch 170339 the ec recon :/ because i had to fix up those tests | 20:48 |
patchbot | acoles: https://review.openstack.org/#/c/170339/ | 20:48 |
acoles | clayg: so idk maybe you can split across that review and the per-policy-diskfile | 20:49 |
acoles | clayg: sorry, i just realised i was working on top of chain for probe testing etc so that diff won't apply to patch 169987 | 20:49 |
patchbot | acoles: https://review.openstack.org/#/c/169987/ | 20:49 |
acoles | clayg: i undid your reclaim_age thing i'm afraid - the single .data should not be set in results['.data'] | 20:50 |
acoles | when its not being reclaimed | 20:50 |
*** rdaly2 has joined #openstack-swift | 20:56 | |
*** rdaly2 has quit IRC | 21:02 | |
*** ozialien has joined #openstack-swift | 21:04 | |
*** annegentle has joined #openstack-swift | 21:09 | |
clayg | acoles: I don't get it | 21:14 |
acoles | clayg: which bit | 21:14 |
acoles | ? | 21:14 |
clayg | acoles: so if I apply that on the end of the chain - which bits do i have to fix up to get all the diskfile changes in one review? | 21:15 |
clayg | oh, you got rid of "reclaim_rules" | 21:16 |
clayg | that's fine by me I think | 21:16 |
acoles | oh yeah sorry, i didn't get rid of reclaim_age ! oops | 21:16 |
clayg | acoles: I'm guessing you have noticed I hate variables named timestamp that are actually strings? | 21:17 |
acoles | clayg: i think its just the test_reconstructor changes that need to go at end of chain and the rest would go on the per-policy diskfile | 21:17 |
acoles | clayg: awww, but i did avoid a dangling elif just for you :P | 21:18 |
clayg | just the other day I was writing some code and the dangling elif seemed like the most obvious correct way to write it - i was so proud of myself - and then I looked at it and realized it was clearer to break it into if; if else | 21:19 |
notmyname | acoles: you know, I was writing some code this weekend and had a dangling elif and removed it because I thought "what would clayg do" | 21:19 |
notmyname | clayg: you've trained us all :-) | 21:19 |
* clayg is glad to be rubbing off on folks | 21:19 | |
clayg | acoles: oh i see the elif case statement code | 21:20 |
clayg | see - that's where I reach for a map! | 21:20 |
acoles | clayg: do you want me to split that diff into two? for each review? | 21:21 |
*** proteusguy has joined #openstack-swift | 21:21 | |
*** aix has joined #openstack-swift | 21:21 | |
clayg | no i can split it up - i'm still trying to decide if the increasingly long justification for why that code stinks is really the best that we want to strive for | 21:21 |
*** haomaiw__ has quit IRC | 21:23 | |
acoles | clayg: i know, it sucks. let me see how much grief i get just going the whole way. | 21:23 |
*** gyee has quit IRC | 21:32 | |
*** G________ has joined #openstack-swift | 21:34 | |
*** G________ has quit IRC | 21:43 | |
*** lpabon has quit IRC | 21:44 | |
*** MVenesio has joined #openstack-swift | 21:51 | |
*** Fin1te has quit IRC | 21:51 | |
*** welldannit has joined #openstack-swift | 21:51 | |
*** jkugel has quit IRC | 21:52 | |
*** PurpleJack has joined #openstack-swift | 21:56 | |
clayg | notmyname: so you were supposed to tell me what to name the x-backend-ssync-frag-index | 21:57 |
notmyname | heh | 21:57 |
notmyname | I thought you and acoles figured it out :-) | 21:57 |
notmyname | clayg: I like x-backend-* for ssync because it's a control falg for the process | 21:58 |
notmyname | *flag | 21:58 |
*** jkremer has joined #openstack-swift | 21:58 | |
notmyname | rather than sysmeta because that's what's stored | 21:58 |
notmyname | does that make sense or answer the question? | 21:58 |
notmyname | seemed to me that the question was about x-backend- vs x-sysmeta- | 21:59 |
*** ozialien has quit IRC | 21:59 | |
*** PurpleJack has quit IRC | 22:00 | |
peluse | clayg, build and process_jobs().... wow | 22:02 |
acoles | peluse just left a comment on the review re your gist | 22:07 |
peluse | acoles, it is, I just used the policy property because it read better I thought | 22:10 |
acoles | peluse: k | 22:11 |
acoles | peluse: and i guess the controller doesn't have a 'policy' attribute? cos of the problems with COPY etc | 22:11 |
peluse | nope | 22:13 |
peluse | the quorum stuff got a little twisted after the introduction of the EC obj controller | 22:14 |
peluse | clayg, it doesn't look like the ec recon is propagating a .durable if the .data exists (ie node A has both, node B only has .data, rerun EC recon and node B is unchanged; however if node B starts with no data it ends up with all the right stuff) | 22:15 |
peluse | clayg, but that was a manual test, I can write a probe for it to make sure. This is key to our use of "minimum of 2 durables written" | 22:17 |
*** petertr7 has quit IRC | 22:21 | |
*** itlinux_ has joined #openstack-swift | 22:28 | |
itlinux_ | hi all | 22:29 |
*** annegentle has quit IRC | 22:29 | |
*** cschwede has quit IRC | 22:30 | |
*** MVenesio has quit IRC | 22:30 | |
itlinux_ | quick question I want to ask. The hash on the ring, which is calculated at the creation of the ring, shows a size; if we are trying to push a new object which may be bigger than that size, does the object then fall into two locations? | 22:30 |
mattoliverau | Morning, wow that was a lot of scroll back to read.. Y'all have been busy | 22:30 |
mattoliverau | itlinux_: what do you mean by size, part power? | 22:32 |
*** cschwede has joined #openstack-swift | 22:33 | |
acoles | mattoliverau: good morning! | 22:34 |
mattoliverau | hey acoles, you're still up! Good evening to you sir. | 22:35 |
acoles | mattoliverau: leave it 25 mins and you can say good morning ;) | 22:35 |
*** gyee has joined #openstack-swift | 22:36 | |
mattoliverau | acoles: Right so getting late, thanks for your dedication! clayg and yourself are machines :p | 22:38 |
acoles | clayg: ok done it, fart removed! https://gist.github.com/alistairncoles/f54eff6d179af735c414 same as before it applies to patch 170339, hopefully everything but test_reconstructor changes can be applied to per-policy diskfiles patch | 22:43 |
patchbot | acoles: https://review.openstack.org/#/c/170339/ | 22:44 |
peluse | clayg, here's a propagate durable probe test that fails currently. https://gist.github.com/peluse/a38f8fb516425ef2a905 | 22:44 |
clayg | whoot whot! | 22:44 |
acoles | clayg: its a bit of a hack - some of the tests could probably do with per-policy mixins but lets not go churning those | 22:44 |
acoles | whoa, did i say mixin ? :/ | 22:45 |
clayg | peluse: wait - so why does it fail? :'( | 22:45 |
peluse | clayg, I dunno yet, wanted to make sure I wasn't seeing things first. If you want to take a quick look at the probe test I can dig in but not for another two hours (but then the rest of the night) | 22:46 |
*** jkremer has quit IRC | 22:46 | |
clayg | peluse: yeah idk, i wouldn't expect that test to pass if removing the .durable makes the object node 404? | 22:48 |
peluse | clayg, but running the Ec recon before the next GET should push the .durable over so the next get doesn't 404 | 22:50 |
peluse | right? | 22:50 |
clayg | oh yes of course | 22:50 |
acoles | peluse: does it fail with just a single missing durable? | 22:51 |
clayg | acoles: i think all of the test scenarios devolve to only a single missing durable in all cases | 22:51 |
peluse | yeah, that's odd | 22:51 |
peluse | oh, its a direct get | 22:52 |
peluse | one sec... | 22:52 |
peluse | OK, so the direct get was the reason for the 404, need to change the test to look for a .durable on the node where it was deleted before the run of the EC recon, not to do a get. | 22:54 |
clayg | peluse: are you sure - you sorta convinced me it was correct | 22:56 |
peluse | well, if we do a direct get and there's no .durable then the obj server should never give us back the data however a proxy get should because we're only killing one .durable so it can still decode | 22:57 |
clayg | but the direct_get is after the reconstructor? | 22:58 |
acoles | yeah test looks good to me | 22:58 |
peluse | well, yeah OK | 22:58 |
peluse | man, I'm just in a fog these daze | 22:58 |
clayg | peluse: acoles: so fwiw, torgomatic updated the fake ec policy/ring on the proxy PUT path tests and after cleaning up a bunch of unrelated test churn that was being too opinionated about the size and number of things - it seemed like the proxy was already doing the right thing :\ | 23:00 |
torgomatic | yeah, this is confusing as hell | 23:00 |
* torgomatic is running at about 5 WTF/min | 23:00 | |
clayg | torgomatic: well even if it turns out to be correct we probably want to make it less confusing :'( | 23:00 |
torgomatic | clayg: yeah, I'm trying that now | 23:00 |
*** zhill has joined #openstack-swift | 23:01 | |
clayg | peluse: you don't invalidate the hashes :\ | 23:01 |
peluse | ahhh | 23:02 |
peluse | yes, the test I stole from was killing the pkl file. Note that I did this manually first and I was doing that. One sec... | 23:02 |
torgomatic | half the confusingness is probably because we're half-done refactoring BaseObjectController, so we've got this ReplicatedObjectController that's got like 2% of the code for replicated objects, then BaseObjectController has the other 98% but ECObjectController overrides the methods to hide it | 23:04 |
peluse | OK, removing the hashes.pkl as well and it still fails. Will update the test on the gist | 23:04 |
peluse | OK, updated. I wonder if the .durable isn't being reflected properly in the hashes | 23:06 |
acoles | torgomatic: so according to my distilled piece of _get_put_responses http://paste.openstack.org/show/203778/ we *always* call _have_adequate_successes(statuses, min_responses) and i'm wondering if that turns out to be just as strict or stricter than the quorum test | 23:07 |
torgomatic | acoles: right, so that call to _have_adequate_successes uses min_responses, which has the correct quorum value (policy.quorum) | 23:09 |
clayg | peluse: ok, so I'm thinking the get_suffix_delta is only comparing the fragment indexes and not the None's | 23:10 |
clayg | i'm guessing if the suffixes made their way into ssync we'd get the right behavior | 23:10 |
peluse | OK, looks like that's OK. I think we're missing it in get_suffix_delta because we're not passing in None | 23:10 |
torgomatic | and then I guess the subsequent call to self.have_quorum will want a majority, but we've already got one because policy.quorum is bigger than a majority | 23:10 |
clayg | yay policy.quorum! | 23:10 |
peluse | yeah, that :) | 23:10 |
peluse | BTW, I just confirmed that (clayg) | 23:10 |
torgomatic | unless you do something asshatted like run a 10+20 scheme, and then we have trouble | 23:10 |
clayg | peluse: you're one step ahead of us as usual! | 23:11 |
peluse | sheeeeit | 23:11 |
peluse | usually 3 steps behind is more like it! | 23:11 |
clayg | torgomatic: what's wrong with a 10+20 schema! it's like even better than replication! | 23:11 |
peluse | OK, I have to do kid errands. I can post a gist fix when I get back if it's not already done (if I don't see any more chatter I'll do it) | 23:11 |
clayg | peluse: so you have a diff you want me to look at? | 23:11 |
peluse | clayg, BTW the EC recon job changes are really fantastic! | 23:12 |
torgomatic | clayg: well, at a minimum, our code will happily write down only 11 fragment archives, then blow up horribly when it can't write 16 .durable files ;) | 23:12 |
clayg | peluse: thanks for saying | 23:12 |
peluse | clayg, not yet, I just hacked in some prints to confirm what was going on. don't have a solution yet | 23:12 |
clayg | peluse: perfect, no problem | 23:12 |
torgomatic | HOWEVER | 23:12 |
clayg | peluse: i think i'll start with a unittest - thanks for writing up that probetest so quick - ttyl | 23:13 |
torgomatic | if your schema is not ridiculously over-parity-bitted, then our stuff works | 23:13 |
acoles | torgomatic: hmm, but _have_adequate_successes = True just breaks out of waiting for responses early, could still be <quorum good responses | 23:13 |
torgomatic | acoles: yes, but with sane numbers of things, you need M+1 things written to pass _have_adequate_successes(), and that's more than majority(M+K) | 23:14 |
torgomatic | it only falls over when K>M | 23:14 |
*** chlong has joined #openstack-swift | 23:14 | |
torgomatic | possibly K > M + 2 | 23:14 |
torgomatic | so your 10+4 scheme is just fine; your 4+10 is broken | 23:15 |
torgomatic | (occasionally and only if massive-but-not-*too*-massive failure occurs) | 23:15 |
acoles | torgomatic: ok i'm sure your brain is operating better than mine right now | 23:17 |
clayg | poor acoles :'( | 23:17 |
acoles | torgomatic: if not always in fact | 23:17 |
acoles | clayg: so where are we at? i'm wondering if i can go sleep? did you see the fart-removing-diff back up^^? | 23:18 |
clayg | acoles: yes i love it! you tap out man - it's all going to be beautiful in the morning! | 23:19 |
notmyname | thanks acoles! | 23:19 |
acoles | ok well good luck guys, i may catch you later i guess depending on how it goes :p | 23:21 |
*** fanyaohong has joined #openstack-swift | 23:21 | |
*** jamielennox|away is now known as jamielennox | 23:22 | |
torgomatic | if the fart-removing diff is delayed, is it flatu-late? | 23:22 |
acoles | torgomatic: lol | 23:22 |
torgomatic | I can make fart jokes again now that it's after Easter | 23:23 |
* torgomatic had given them up for flatu-Lent | 23:23 | |
clayg | ^ acoles this is what happens if you feed it | 23:24 |
acoles | torgomatic: is that like when you loan someone a bad tire - flat-u-lent | 23:24 |
mattoliverau | lol | 23:24 |
torgomatic | acoles: exactly! | 23:24 |
acoles | did is pell tire right? | 23:25 |
acoles | spell | 23:25 |
torgomatic | acoles: somewhere | 23:25 |
clayg | the ring has tiers - and it's round | 23:25 |
acoles | like a wedding cake | 23:25 |
*** acoles is now known as acoles_away | 23:28 | |
clayg | peluse: ok i have a failing unittest - we got this one | 23:29 |
* notmyname is about to drive home. will be on full time after | 23:30 | |
*** itlinux_ has quit IRC | 23:35 | |
*** kota_ has joined #openstack-swift | 23:42 | |
kota_ | morning | 23:42 |
*** PurpleJack has joined #openstack-swift | 23:44 | |
*** zhill has quit IRC | 23:49 | |
*** PurpleJack has quit IRC | 23:49 | |
clayg | kota_: good morning! | 23:50 |
*** ho has joined #openstack-swift | 23:52 | |
mattoliverau | kota_: morning | 23:54 |
ho | good morning! | 23:54 |
mattoliverau | ho: morning | 23:56 |
ho | mattoliverau: morning! | 23:57 |
kota_ | clayg, mattoliverau: good morning :) | 23:59 |
*** km has joined #openstack-swift | 23:59 |
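
The quorum arithmetic torgomatic walks through around 23:14 can be sketched as below. This is a minimal illustration, not Swift's code: the function names are hypothetical, the majority formula `n // 2 + 1` is an assumption (it matches peluse's "4" for a 4+2 policy and torgomatic's "16 .durable files" for 10+20), and the EC rule of M+1 successes is taken from torgomatic's description of `_have_adequate_successes()`.

```python
def replication_quorum(replicas):
    """Assumed simple-majority rule: more than half of the nodes."""
    return replicas // 2 + 1


def ec_quorum(ndata):
    """EC rule as described in the discussion: data fragments plus one."""
    return ndata + 1


def majority_covered(ndata, nparity):
    """True when meeting the EC quorum also guarantees a simple majority."""
    return ec_quorum(ndata) >= replication_quorum(ndata + nparity)


if __name__ == "__main__":
    # Schemes mentioned in the channel: 4+2, 10+4, 4+10 and 10+20.
    for ndata, nparity in [(4, 2), (10, 4), (4, 10), (10, 20)]:
        print(
            "%d+%d: ec quorum=%d, majority=%d, covered=%s" % (
                ndata, nparity,
                ec_quorum(ndata),
                replication_quorum(ndata + nparity),
                majority_covered(ndata, nparity),
            )
        )
```

Under these assumed formulas the two checks only disagree once the parity count reaches M+2 or more, which lines up with the "10+4 is fine, 4+10 is broken" summary in the log.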