Monday, 2017-01-30

openstackgerritMatthew Oliver proposed openstack/swift: Correct ringbuilder's set_weight usage string  https://review.openstack.org/426620 00:08
notmynametimburke: I'm not sure what this is in reference to, but it sounds related to the auth v1 keystone plugin you made http://lists.openstack.org/pipermail/openstack/2017-January/018472.html 00:12
*** NM has joined #openstack-swift01:24
*** foexle has quit IRC01:35
*** sams-gleb has joined #openstack-swift01:35
*** sams-gleb has quit IRC01:40
kota_hello world02:00
kota_back to work for a short while after being sick.02:00
mattoliveraukota_: morning, welcome back!02:08
kota_mattoliverau: thanks!02:15
kota_mattoliverau: however, i'm planning to work a short day today, I still don't seem to be completely fine yet.02:18
mattoliveraukota_: well it's good to see that you're starting to feel much better! but don't work too hard, it's better to rest and get fully better rather than be not quite 100% for a long time.02:19
*** tanee_away is now known as tanee02:31
*** NM has quit IRC02:31
*** rcernin has joined #openstack-swift03:11
*** rcernin has quit IRC03:13
*** rcernin has joined #openstack-swift03:14
*** rcernin has quit IRC03:25
*** sams-gleb has joined #openstack-swift03:38
*** sams-gleb has quit IRC03:42
*** SkyRocknRoll has joined #openstack-swift03:49
*** links has joined #openstack-swift03:56
*** links has quit IRC03:57
mahaticgood morning04:12
mahaticnotmyname: +1 on your comment on the US news04:14
mahatickota_: good to know you're feeling better04:15
kota_mahatic: good morning, and thanks04:15
*** maestropandy has joined #openstack-swift04:31
*** vinsh has joined #openstack-swift04:31
mattoliveraumahatic: morning04:32
mahaticmattoliverau: o/04:32
*** vinsh has quit IRC04:51
*** klrmn has quit IRC05:04
*** psachin has joined #openstack-swift05:28
*** maestropandy has left #openstack-swift05:33
*** sams-gleb has joined #openstack-swift05:40
*** sams-gleb has quit IRC05:45
*** takashi has joined #openstack-swift05:57
openstackgerritKota Tsuyuzaki proposed openstack/swift: Optimize ec duplication and its md5 hashing  https://review.openstack.org/421673 06:01
openstackgerritKota Tsuyuzaki proposed openstack/swift: EC Fragment Duplication - Foundational Global EC Cluster Support  https://review.openstack.org/219165 06:01
kota_Updated only my priority patches. I know I have a stack of incoming reviews but I'd like to take small steps, so I'll leave them pending for the rest of this week. Getting back home to rest and take care of my family.06:04
* kota_ is in baby steps06:04
*** ppai has joined #openstack-swift06:17
*** takashi has quit IRC07:02
*** rcernin has joined #openstack-swift07:11
*** rcernin has quit IRC07:12
*** rcernin has joined #openstack-swift07:12
mattoliveraukota_: yeah, don't rush. you'll be back to full speed in no time :)07:13
*** silor has joined #openstack-swift07:21
*** silor1 has joined #openstack-swift07:34
*** silor has quit IRC07:36
*** silor1 is now known as silor07:36
*** sams-gleb has joined #openstack-swift07:42
*** sams-gleb has quit IRC07:47
*** takashi has joined #openstack-swift07:47
*** tesseract has joined #openstack-swift07:50
*** ChubYann has quit IRC08:00
*** sams-gleb has joined #openstack-swift08:08
*** rledisez has joined #openstack-swift08:12
*** pcaruana has joined #openstack-swift08:15
*** oshritf has joined #openstack-swift08:27
*** geaaru has joined #openstack-swift08:34
*** rcernin has quit IRC09:31
*** rcernin has joined #openstack-swift09:33
*** oshritf has quit IRC09:36
*** oshritf has joined #openstack-swift09:40
*** jistr has quit IRC09:49
*** jistr has joined #openstack-swift09:51
*** maestropandy has joined #openstack-swift09:56
*** maestropandy has quit IRC10:00
*** oshritf has quit IRC10:02
*** takashi has quit IRC10:05
*** maestropandy has joined #openstack-swift10:05
*** maestropandy has left #openstack-swift10:05
*** oshritf has joined #openstack-swift10:08
*** sams-gleb has quit IRC10:21
*** sams-gleb has joined #openstack-swift10:21
*** sams-gleb has quit IRC10:25
*** mvk has quit IRC10:28
*** sams-gleb has joined #openstack-swift10:37
*** mvk has joined #openstack-swift10:57
*** kei_yama has quit IRC11:00
openstackgerritMatthew Oliver proposed openstack/swift: Correct ringbuilder's set_weight usage string  https://review.openstack.org/426620 11:19
*** jordanP has joined #openstack-swift11:23
*** dims has joined #openstack-swift11:26
*** oshritf_ has joined #openstack-swift11:31
*** psachin has quit IRC11:32
*** oshritf has quit IRC11:33
*** vint_bra has joined #openstack-swift11:54
*** bkopilov has quit IRC12:02
*** foexle has joined #openstack-swift12:19
*** catintheroof has joined #openstack-swift12:25
*** NM has joined #openstack-swift12:33
*** bkopilov has joined #openstack-swift12:41
*** wasmum has quit IRC13:03
*** oshritf__ has joined #openstack-swift13:15
*** oshritf_ has quit IRC13:18
*** caiobrentano has joined #openstack-swift14:16
*** caiobrentano has quit IRC14:23
*** vills_ has joined #openstack-swift14:26
openstackgerritMerged openstack/swift: Correct ringbuilder's set_weight usage string  https://review.openstack.org/426620 14:27
openstackgerritBéla Vancsics proposed openstack/swift: Reduced the complexity of the _response_iter method  https://review.openstack.org/426782 14:41
*** rcernin has quit IRC14:48
*** _JZ_ has joined #openstack-swift14:57
*** rcernin has joined #openstack-swift15:00
notmynametdasilva: cschwede: this is interesting http://martin.kleppmann.com/2017/01/26/data-loss-in-large-clusters.html 15:01
*** mvk has quit IRC15:02
cschwedenotmyname: Hi! Yes, but nothing new. Summarized: the bigger the cluster, the more likely it is (as an operator) that there is data loss. However, as a user my probability is still the same15:02
cschwedenotmyname: i was somewhat worried when i saw that on HN, took me a while to spot the difference15:03
notmynameoh yeah? I just woke up and read it :-)15:03
tdasilvagood morning15:03
cschwedenotmyname: if you have a VERY LARGE cluster, you would expect a data loss probability of close to 1, right? it's just the law of the big numbers IIRC15:04
notmynameoh so kinda like the birthday paradox. more people increases the chance that *some* pair has the same birthday, but there's always still the 1/365 chance that someone has the same birthday as *you*15:04
cschwedebut the probability to lose one of _my_ objects is still like 1*10^-9 or whatever my number is15:05
*** bkopilov has quit IRC15:05
notmynamethat's not very comforting, though.15:07
notmynamebecause we aren't supposed to protect for a given partition. we're supposed to protect for any data loss for all partitions15:09
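A minimal sketch of the distinction being discussed here, assuming each partition is lost independently with the same probability p (the function names, the 1e-9 figure, and the partition count are illustrative assumptions, not numbers from the article or from any real cluster). It contrasts the chance that *some* partition is lost with the chance that one *given* partition is lost, alongside the birthday comparison.

```python
import math


def p_any_loss(p, n):
    """Probability that at least one of n partitions is lost.

    Assumes independent losses with per-partition probability p:
    1 - (1 - p)**n, computed with log1p/expm1 to stay accurate for tiny p.
    """
    return -math.expm1(n * math.log1p(-p))


def p_some_pair_shares_birthday(k, days=365):
    """Probability that *some* pair among k people shares a birthday."""
    p_all_distinct = 1.0
    for i in range(k):
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct


def p_someone_shares_yours(k, days=365):
    """Probability that at least one of k other people shares *your* birthday."""
    return 1.0 - ((days - 1) / days) ** k


if __name__ == "__main__":
    p = 1e-9  # hypothetical per-partition loss probability
    for n in (1, 10**6, 10**10):
        print(n, p_any_loss(p, n))
    print(p_some_pair_shares_birthday(23), p_someone_shares_yours(23))
```

Under these assumptions the per-partition figure stays fixed while the cluster-wide figure climbs toward 1 as the partition count grows, which is the operator-versus-user difference raised above.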
*** bkopilov has joined #openstack-swift15:11
tdasilvamm..interesting, i guess like the author i'd be also interested in hearing what's happening in practice?15:13
*** sams-gleb has quit IRC15:16
*** caiobrentano has joined #openstack-swift15:16
*** sams-gleb has joined #openstack-swift15:17
* notmyname needs to continue to get ready for the day15:17
*** mvk has joined #openstack-swift15:18
*** vinsh has joined #openstack-swift15:20
*** sams-gleb has quit IRC15:21
*** sams-gleb has joined #openstack-swift15:33
*** mvk has quit IRC15:40
jordanPthe comments on that article are informative15:49
*** mvk has joined #openstack-swift15:53
*** acoles_ is now known as acoles15:54
*** SkyRocknRoll has quit IRC16:22
*** klrmn has joined #openstack-swift16:22
*** rcernin has quit IRC16:24
*** ppai has quit IRC16:28
*** tesseract has quit IRC16:58
*** garyj has joined #openstack-swift17:02
notmynamegood morning (for realz)17:05
*** jordanP has quit IRC17:16
*** chsc has joined #openstack-swift17:17
*** chsc has joined #openstack-swift17:17
*** klrmn has quit IRC17:18
*** mvk has quit IRC17:24
*** StraubTW has joined #openstack-swift17:37
*** garyj has quit IRC17:40
*** JimCheung has joined #openstack-swift17:41
*** rledisez has quit IRC17:45
*** garyj has joined #openstack-swift17:48
*** newmember has quit IRC17:54
jrichlinotmyname: thanks for pointing out the OS statement, and new priority reviews list.  bkeller` and I are at San Jose with clu and tqtran for the next few days.18:04
timburkegood morning18:04
jrichlitimburke: good morning18:05
acolesnotmyname: feeling better?18:06
*** mvk has joined #openstack-swift18:07
timburkenotmyname: that sounds more like keystone's discovery api -- i think he might want to look at https://github.com/openstack/keystoneauth/blob/master/keystoneauth1/discover.py ?18:09
timburkeiirc, the generic password plugin should already handle the v2/v3 split fairly seamlessly18:11
*** klrmn has joined #openstack-swift18:12
notmynameacoles: I am feeling much more normal. thank18:12
notmyname*thanks18:13
acolesnotmyname: good to hear18:13
acolesclayg: ping18:14
*** htruta` is now known as htruta18:23
claygacoles: pong18:23
acolesclayg: hi. I was just typing up some more comment on patch https://review.openstack.org/#/c/419787 18:24
patchbotpatch 419787 - swift - Better optimistic lock in get_hashes18:24
claygacoles: I played with a follow-up patch over the weekend - trying to figure out a better way to represent "None"/all-hashes-are-invalid *on-disk*18:24
acolesclayg: yes, that! just left a link to a gist, I wonder how similar our solutions are?18:24
claygbut I'm having a hard time dreaming up a test where {'updated': now()} is actually wrong - part of the "problem" seems to be the way that get_hashes passes in the do_listdir when it has to recurse because it loses the optimistic lock.18:25
claygacoles: my patch wrote {'updated': now(), 'valid': False} - what about yours?  I didn't really *like* mine.18:26
acolesclayg: hehe, at some point today i wrote exactly that!18:26
claygmy first attempt serialized it into a dict - but revivified as a SuffixHashes object18:26
clayghated that too18:26
acolesclayg: I ended up with 'created': time.time() when writing a hashes.pkl that does not have results of suffix calcs (i.e. a "fresh" hashes.pkl)18:27
claygacoles: nice - sounds like you might have gotten further than I did!  Do you love it!?!?18:27
acolesclayg: we have to (a) mark a hashes.pkl that is fresh and (b) do it differently in every process that writes a fresh pkl18:27
claygoh you have a test - you definitely got further18:27
*** zaitcev has joined #openstack-swift18:28
*** ChanServ sets mode: +v zaitcev18:28
claygacoles: essentially it seems yes18:28
acolesclayg: I try to stay dispassionate about my code ;)18:28
acolestoo many things i have loved have been torn down by greater minds ;)18:28
claygacoles: I hate 99% of my code - so the 1% that is not obvious crap I tend to get pretty excited about.18:28
*** tqtran has joined #openstack-swift18:29
acoleslol18:29
*** vills_ has quit IRC18:30
acolesclayg: what I do like is that once the follow up patch lands, consolidate_hashes either writes a pkl and returns a dict, no more None.18:30
acolesor raises exception18:31
acoles(with the gist that is)18:31
claygacoles: i'm still going over the comments - should I jump ahead?18:33
acolesclayg: so when I had just 'valid': False, IIRC the follow up patch https://review.openstack.org/#/c/426336/3 test change would then fail18:33
patchbotpatch 426336 - swift - Fix race when consolidating new partition18:33
acolesclayg: which drove me to created=time.time(), but with care about when that gets set - refresh page and see my last comment on gerrit18:36
acolesclayg: I'm not sure if I picked up all Pavel's and mahatic comments in my gist, so there may be other comments to pay attention to, but the gist is just about all I have to offer today18:37
claygacoles: ok, the test looks like something.  I didn't immediately fall in love with created key - i see a lot of handling for "if created in hashes" and "pop('created', None)" - but the test is a huge head start18:39
claygOne of the things I hated about my patch is where initialization was just "hashes = None" before I changed it to {'valid': False}18:40
acolesclayg: I tried and failed to overload 'updated' key - see my gerrit comments18:40
claygacoles: I can definitely vouch that overloading 'updated' is a path to madness18:41
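A rough sketch of the idea being compared here: mark a freshly written hashes.pkl so a reader can tell "no suffix hashes calculated yet" apart from a real result, without overloading 'updated'. This is an illustration under assumptions, not the code in either patch or in Swift's diskfile: the helper names are hypothetical, only the {'updated': ..., 'valid': False} shape comes from the discussion, and the 'created' variant mentioned above would swap the marker key.

```python
import os
import pickle
import time


def write_hashes(path, hashes):
    """Write a hashes dict atomically (temp file, then rename)."""
    tmp_path = path + '.tmp'
    with open(tmp_path, 'wb') as f:
        pickle.dump(hashes, f, protocol=2)
    os.replace(tmp_path, path)


def write_fresh_hashes(path, now=None):
    """Write a placeholder pickle that carries no suffix hash results yet."""
    now = time.time() if now is None else now
    # 'valid': False is the hypothetical marker for "all hashes are invalid".
    hashes = {'updated': now, 'valid': False}
    write_hashes(path, hashes)
    return hashes


def read_hashes(path):
    """Return (hashes, is_fresh); pickles without the marker count as valid."""
    with open(path, 'rb') as f:
        hashes = pickle.load(f)
    is_fresh = hashes.get('valid', True) is False
    return hashes, is_fresh
```

With a marker like this the caller always gets a dict back rather than None; the cost, as noted above, is extra handling wherever the marker key has to be checked or popped.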
claygacoles: ok, tactics18:41
claygacoles: if I manage to come up with something I like today - should I push over patch 419787?18:42
patchbothttps://review.openstack.org/#/c/419787/ - swift - Better optimistic lock in get_hashes18:42
acolesclayg: yes. I am dome for today18:42
acolesdone*18:42
claygacoles: does the test at https://gist.github.com/alistairncoles/648c0d8e7f35f1bc4d5cf8994d8a9ce0#file-gist-diff-L120 demonstrate a *regression* or just another failure?18:42
openstackgerritTim Burke proposed openstack/swift: Warn about using EC with isa_l_rs_vand and nparity >= 5  https://review.openstack.org/425496 18:43
claygacoles: what's the status on patch 426336 - that test works on master - but sadly only because of the double-rehash-everytime bug18:43
patchbothttps://review.openstack.org/#/c/426336/ - swift - Fix race when consolidating new partition18:43
claygwe have to make *some* kind of progress in this method at some point - i'm worried great is the enemy of good-enough?  (obviously I'm tired and have other patches I'm worried about too)18:44
claygif we have a clear path of what's left to fix and we think we can manage the diff - i'm all for it - but I'm slowly losing track of all the different problems we're fixing in this patch18:44
clayggot more races than a day at the tracks18:44
acolesclayg: the test in the gist passes on master (small sample)18:46
*** foexle has quit IRC18:46
acolesclayg: patch 426336 I think can stay as a follow on, to keep things simple.18:47
patchbothttps://review.openstack.org/#/c/426336/ - swift - Fix race when consolidating new partition18:47
claygacoles: ok, wfm18:47
claygI don't have anything about patch 334719 loaded into my head and no bandwidth to load it today I don't think18:49
patchbothttps://review.openstack.org/#/c/334719/ - swift - Preserve X-Static-Large-Object from .data file aft...18:49
claygthere's also a follow up patch apparently that needs to be squashed - if anyone knows what's going on there and wants to push over it'd be greatly appreciated18:49
claygif not - I'll try not to forget about it again18:49
acolesclayg: the change you made in diskfile on patch 426336 was exactly what I had in mind but before latest rev of patch 419787 it caused some errors, but now it looks good because we're always writing a hashes.pkl18:49
patchbothttps://review.openstack.org/#/c/426336/ - swift - Fix race when consolidating new partition18:49
patchbothttps://review.openstack.org/#/c/419787/ - swift - Better optimistic lock in get_hashes18:50
claygyeah, np18:50
acolesclayg: I need to go, sure you'll do the right thing18:51
*** acoles is now known as acoles_18:53
*** geaaru has quit IRC19:27
timburkenotmyname: fyi https://review.openstack.org/426894 19:35
patchbotpatch 426894 - releases - stable/ocata branch for python-swiftclient19:35
timburkenotmyname: oh yeah, did we ever get updates on the project mascot? is it actually a swift now?19:38
notmynameno word yet on mascot19:40
notmynametimburke: oh, right. we've got to specifically ask for that stable release now, right?19:40
notmynamethanks19:40
timburkeyup19:41
notmynameok, I'll add my + vote19:41
*** oshritf__ has quit IRC19:41
* notmyname goes to lunch19:42
openstackgerritTim Burke proposed openstack/swift: Warn about using EC with isa_l_rs_vand and nparity >= 5  https://review.openstack.org/425496 19:58
* timburke grumbles19:58
*** silor has quit IRC20:03
*** ChubYann has joined #openstack-swift20:06
*** garyj has quit IRC20:21
*** vills has joined #openstack-swift20:26
*** vills has quit IRC20:26
*** newmember has joined #openstack-swift20:30
*** NM has quit IRC20:34
notmynametimburke: looks like the branch already landed20:36
timburkeyup. so we can go land https://review.openstack.org/#/c/426902/ 20:40
patchbotpatch 426902 - python-swiftclient (stable/ocata) - Update .gitreview for stable/ocata20:40
notmynametimburke: cool. thanks for that too20:41
timburkenotmyname: well now it's bots all the way down20:41
mattoliverauMorning21:05
timburkeclayg: i'd feel a whole lot better about https://review.openstack.org/#/c/425441/ with something like http://paste.openstack.org/show/596937/ applied...21:12
patchbotpatch 425441 - swift - Do not revert fragments to handoffs21:12
timburkeas it is, i'm not sure i know enough about the reconstructor to say definitively that it's a good idea21:12
*** vint_bra has quit IRC21:23
claygtimburke: I think that mostly ends up moving it twice in practice - but possibly get it into the handoff chain sooner21:30
claygmight be attractive if you run a really low m21:30
claygdefinitely better than letting them flop around forever!21:30
timburkeclayg: again, it just seems like we're hurting availability without something like that. might want some constant (0 < C < 1?) in front of that replica_count, though?21:32
*** NM has joined #openstack-swift21:33
claygi think unless the primary disk is unmounted the absolute best thing to do is wait21:34
*** Jeffrey4l__ has joined #openstack-swift21:34
*** Jeffrey4l_ has quit IRC21:35
claygafter a rebalance you're used to an "availability issue" - the best way to avoid a *durability* issue is to get from "rebalancing" to "rebalanced" asap - that means only moving them once21:35
*** NM has quit IRC21:36
claygIME - some additional experimentation may be in order - but i'm sure the patch works better than master - maybe that's not good enough?21:36
*** catintheroof has quit IRC21:40
*** catintheroof has joined #openstack-swift21:42
*** NM has joined #openstack-swift21:42
*** newmember has quit IRC21:48
timburkeclayg: when we were talking about test_get_more_nodes, you said that we wanted to make sure the handoff iter stayed relatively stable through rebalances, particularly at the start. do we have any sort of feel for the probabilities involved here?21:49
timburkeseems like if the handoff list is probably mostly the same, then (1) you probably *won't* have an availability issue, (2) the reconstructor would most likely hit the i'm-a-handoff path and wait, and (3) having data that's not fairly high up in priority is rather bad21:49
*** NM has quit IRC21:52
timburke(...data *on a node* that's not fairly high up in priority...)21:52
*** catinthe_ has joined #openstack-swift21:52
*** catintheroof has quit IRC21:53
claygtimburke: I think maybe it depends on how many disks you have (or are adding?) - I don't think it's particularly common (although maybe not exceptionally rare depending on the # of disks) that an old primary becomes a handoff (all disks are handoffs at some point, it's just a matter of depth)21:54
claygtimburke: anyway - i'm not sure I follow the significance of that line of reasoning - I don't think I'm strongly against moving parts "between once and twice" and rebalance - it's close enough to "once" from "between once and infinity" that it's probably good enough for me!22:00
*** sams-gleb has quit IRC22:01
timburkeblerg, right, the majority of the time it'll be an old-primary rather than an old-handoff-still-within-request_node_count (see why i shorten that to handoff? maybe we need a separate term there...)22:02
timburkeok, i think i'm getting more on board with it as-is22:03
*** catintheroof has joined #openstack-swift22:04
*** catinthe_ has quit IRC22:07
*** foexle has joined #openstack-swift22:09
claygtimburke: I think the replicator does this cute trick where it has like "attempts" - then it += 1 when it sees a 507 response22:13
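A minimal sketch of the "attempts" trick as described in the message above, written from that description rather than from the replicator source: the attempt budget starts at the number of primaries and grows by one whenever a node answers 507 (Insufficient Storage), so one more handoff gets tried. The function name, the `push` callback, and the node lists are hypothetical.

```python
from itertools import chain


def sync_with_handoffs(primaries, handoffs, push):
    """Push to primaries, reaching into handoffs only to cover 507 responses.

    `push(node)` is a hypothetical callback returning an HTTP status code.
    """
    attempts_left = len(primaries)
    synced = []
    for node in chain(primaries, handoffs):
        if attempts_left <= 0:
            break
        attempts_left -= 1
        status = push(node)
        if status == 507:
            # Drive is unmounted/full: grant one extra attempt so the
            # next node in the handoff chain stands in for this one.
            attempts_left += 1
        elif 200 <= status < 300:
            synced.append(node)
    return synced


if __name__ == "__main__":
    statuses = {'a': 200, 'b': 507, 'c': 200, 'h1': 200, 'h2': 200}
    print(sync_with_handoffs(['a', 'b', 'c'], ['h1', 'h2'], statuses.get))
```

In this sketch handoffs are untouched when every primary responds normally, and each 507 extends the walk by exactly one node.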
claygidk, i'm still working on SuffixHashes - I'll keep thinking about when/how to revert to handoff22:15
claygtimburke: thanks!22:15
timburkehmmm... seems like "old primary becomes a handoff" would be a rather nice thing to have...22:15
*** vint_bra has joined #openstack-swift22:17
*** catinthe_ has joined #openstack-swift22:17
claygI haven't had any good thoughts on how to do it - it's more attractive on GET than PUT22:18
*** catinth__ has joined #openstack-swift22:18
*** catintheroof has quit IRC22:18
timburkeidk, seems like on PUT it'd still be nice to have that handoff get rid of any overwritten data quickly22:19
timburkeclayg: i think i'm on board, though. want me to just go change the bug to 1653169 and call it done? is it worth cleaning up those unnecessary breaks?22:19
timburkewe never create revert jobs on primaries, right? that seems like it'd be a silly thing to do22:20
*** catinthe_ has quit IRC22:22
claygtimburke: I think you raised some good points in comments - i'd like to give them some more thought - I can push something up by EOD - do you think you'd have time to give it another round tmrw?22:22
timburkeyeah, most likely. but if i'm just fixing the bug #, we keep cschwede's +2 and we can drop those breaks (turns out neither check is necessary) later22:26
claygtimburke: sorry - my head's not really in the game - i trust your judgement22:27
*** foexle has quit IRC22:27
*** catinth__ has quit IRC22:29
timburkeclayg: no worries; i know you've got hashes on the mind22:29
*** sams-gleb has joined #openstack-swift22:42
*** sams-gleb has quit IRC22:42
*** sams-gleb has joined #openstack-swift22:42
*** sams-gleb has quit IRC22:43
timburkei take back my remark on silliness -- it's a perfectly reasonable thing to do if the primary has multiple frags, in which case i don't like https://github.com/openstack/swift/blob/2.12.0/swift/obj/reconstructor.py#L659-L661 ...22:49
timburkei think i need to go read more about the reconstructor22:49
timburkelooking at http://docs.openstack.org/developer/swift/overview_erasure_code.html -- anyone know what's meant by "Additionally, its [sic] not always the case that the processing of a particular suffix directory means one or the other for the entire directory"? one or the other *what*?23:06
*** StraubTW has quit IRC23:06
claygone or the other "job types" - replicator has update and update_deleted - each part goes into one - reconstructor can have a part generate jobs of both types23:18
claygI think that was all it was getting at23:18
clayg^ from the last paragraph in http://docs.openstack.org/developer/swift/overview_erasure_code.html#the-reconstructor 23:19
clayghonestly I totally forgot that document went into that detail - some of those reconstructor changes could be introducing doc regressions - i should audit...23:20
*** kei_yama has joined #openstack-swift23:30
openstackgerritTim Burke proposed openstack/swift: Clean up EC overview docs a bit  https://review.openstack.org/426971 23:30
*** sileht has quit IRC23:37
*** chsc has quit IRC23:37
*** sileht has joined #openstack-swift23:41
*** sams-gleb has joined #openstack-swift23:43
*** sams-gleb has quit IRC23:48
*** catintheroof has joined #openstack-swift23:50
*** jamielennox is now known as jamielennox|away23:58
*** jamielennox|away is now known as jamielennox23:59
