21:00:51 <notmyname> #startmeeting swift
21:00:52 <openstack> Meeting started Wed Aug 16 21:00:51 2017 UTC and is due to finish in 60 minutes.  The chair is notmyname. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:00:54 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:00:56 <openstack> The meeting name has been set to 'swift'
21:01:00 <notmyname> who's here for the swift team meeting?
21:01:05 <mattoliverau> o/
21:01:12 <joeljwright> hello o/
21:01:15 <rledisez> hi o/
21:01:16 <mathiasb> o/
21:01:55 <torgomatic> hi
21:01:56 <acoles> hi
21:02:06 <notmyname> clayg: ping
21:02:06 <tdasilva> hello
21:02:19 <clayg> werd
21:02:56 <notmyname> all right, let's get started
21:02:58 <timburke> i'm here, i'm here!
21:02:59 <notmyname> #link https://wiki.openstack.org/wiki/Meetings/Swift
21:03:02 <notmyname> timburke: :-)
21:03:13 <notmyname> a couple of topics to go over this week
21:03:40 <notmyname> #topic pre/post amble patch
21:03:49 <notmyname> let me set some context here
21:04:19 <notmyname> we have https://review.openstack.org/#/c/365371/ from joeljwright that's been open a while
21:04:32 <notmyname> hmm.. patchbot isn't joining...
21:05:00 <timburke> "Add Preamble and Postamble to SLO and SegmentedIterable"
21:05:25 <clayg> i've seen more patchsets than that
21:05:32 <clayg> i'm only mildly impressed
21:05:32 <notmyname> ok, so joeljwright is asking for a reasonable thing: is this patch (and therefore the idea behind it) something that we will actually land, or should he investigate other options
21:05:44 <notmyname> a historic IRC conversation was at
21:05:46 <notmyname> #link http://eavesdrop.openstack.org/irclogs/%23openstack-swift/%23openstack-swift.2017-06-19.log.html#t2017-06-19T15:48:47
21:06:31 <notmyname> in my opinion, it seems like a relatively novel extension to the SLO manifest format that doesn't break anything existing and adds new functionality
21:07:02 <notmyname> but to quote the earlier conversation "there's such a huge difference between "this patch is good enough to land" and "this patch should land" " and "I'm a little worried about inventing something to solve a bunch of use cases that we make up"
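For readers following along: the sketch below shows one hypothetical shape such an extended manifest entry could take, with small inline byte strings wrapped around an existing segment. The field names ("preamble", "postamble") and the base64 encoding are assumptions for illustration only, not necessarily what patch 365371 actually implements; "path", "etag", and "size_bytes" are the existing SLO manifest keys.

```python
# Hypothetical extended SLO manifest entry: a normal object segment plus
# small inline bytes emitted before/after it on GET. The "preamble" and
# "postamble" keys are assumed names for illustration only.
import base64
import json

entry = {
    "path": "/tar_segments/photos/cat.jpg",      # existing large segment
    "etag": "d41d8cd98f00b204e9800998ecf8427e",  # still validated on download
    "size_bytes": 1000,
    "preamble": base64.b64encode(b"<512-byte tar header>").decode(),
    "postamble": base64.b64encode(b"\0" * 24).decode(),  # tar padding
}
print(json.dumps([entry], indent=2))  # body of a ?multipart-manifest=put
```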
21:07:19 <notmyname> so... joeljwright, where does that leave us?
21:07:33 <notmyname> share more about your use of this functionality at sohonet
21:07:37 <kota_> hello
21:07:51 * kota_ is late
21:08:00 <notmyname> kota_: welcome. we're just getting started (you didn't miss much)
21:08:08 <joeljwright> well, this all came about when we wanted to share multiple files using a tempurl and preserve some structure (so initially a tarball)
21:08:25 <joeljwright> but when we started the work we realised it could be so much more
21:08:31 <clayg> joeljwright: notmyname was highlighting that you're past "this is one way I can see to solve this" and all the way into "we're using this and it is working"
21:08:49 <clayg> for like ... business needs - not just like "i tested it and it does what it says on the tin"
21:08:54 <joeljwright> hence the generic extension to SLO
21:09:11 <joeljwright> yes, we're not using exactly this (because it's not released)
21:09:18 <notmyname> yeah, what clayg said is what i'm interested in
21:09:29 <joeljwright> but it makes a big difference to our use case of building tarballs using existing segments
21:09:49 <joeljwright> we don't want to store thousands of tiny objects
21:09:58 <joeljwright> because download performance is bad
21:10:03 <joeljwright> and we're not interested in that data
21:10:21 <notmyname> the tiny objects being the *ambles for tarfile segments?
21:10:26 <joeljwright> yes, sorry
21:10:45 <joeljwright> in order to build tarballs we have to generate 2 very small objects for every stored object
21:10:53 <joeljwright> a tar header and tar padding
21:11:02 <joeljwright> both in the 0-1024 byte size range
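To make those two tiny objects concrete: tar puts a 512-byte header block in front of each member's data and zero-pads the data out to the next 512-byte boundary. A minimal sketch of what has to be generated per stored object, using only the Python standard library (context for the discussion, not code from the patch):

```python
# Per-object tar framing: a 512-byte header block before the data and
# zero padding after it up to the next 512-byte boundary.
import tarfile

def tar_preamble_and_postamble(name, size):
    """Return (header_bytes, padding_bytes) for one archive member."""
    info = tarfile.TarInfo(name=name)
    info.size = size
    header = info.tobuf(format=tarfile.GNU_FORMAT)  # 512-byte header block
    padding = b"\0" * (-size % tarfile.BLOCKSIZE)   # pad data to a 512-byte multiple
    return header, padding

# e.g. a 1000-byte object needs a 512-byte header and 24 bytes of padding
hdr, pad = tar_preamble_and_postamble("photos/cat.jpg", 1000)
assert len(hdr) == 512 and len(pad) == 24
```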
21:11:46 <notmyname> ok. and you're using this today in prod?
21:11:49 <joeljwright> in exploring how to make it better we realised that with a fairly simple addition to SLO we could explore not only tar, but also building other container formats
21:12:12 <joeljwright> we're using a related version in testing
21:12:21 <clayg> yeah, I think doing the tar SLO client side is weird - I could see just *not* bothering with the tar and just making the client do GETs for the objects it wants but *shrug* in the tempurl case (before prefix tempurls)... there's some flexibility here to be really prescriptive
21:12:31 <joeljwright> we haven't deployed it to prod because of maintaining a patch
21:12:33 <notmyname> ok. did you find other non-tar formats that work here?
21:12:35 <clayg> you can download these 10K things, in a tarball, with this single tempurl
21:13:16 <joeljwright> clayg: to extend that "with these names, and fail if anything on disk changes"
21:13:21 <clayg> i do believe that is a thing that will be useful from time to time - if they can figure out how to solve that with this they probably could have figured out a way to solve it w/o this
21:13:29 <clayg> joeljwright: good point!
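For context on the tempurl side of the use case: a single signed URL against the SLO manifest is enough for a recipient with no Swift account to download the whole assembled archive. The snippet below is the standard, documented tempurl signing scheme (HMAC-SHA1 by default); the key, host, and path are made-up values.

```python
# Standard Swift tempurl signature over "METHOD\nexpires\npath".
import hmac
from hashlib import sha1
from time import time

key = b"account-temp-url-key"              # X-Account-Meta-Temp-URL-Key (made up)
path = "/v1/AUTH_demo/shares/package.tar"  # hypothetical SLO manifest path
expires = int(time()) + 86400              # valid for one day
sig = hmac.new(key, f"GET\n{expires}\n{path}".encode(), sha1).hexdigest()
url = f"https://swift.example.com{path}?temp_url_sig={sig}&temp_url_expires={expires}"
```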
21:13:40 <clayg> so... where are you at?
21:13:58 <clayg> I think a lot of people have got past "yeah that is probably a thing we could do"
21:14:07 <joeljwright> currently I have a test system that uses this patch and builds the preambles/postambles manually
21:14:31 <joeljwright> I also have an experimental middleware that's just a pre-SLO manifest transform to make the headers/padding
21:14:48 <clayg> like stub data - or using your data model?  Like your business use case?  with the existing segments and stuff?
21:15:21 <joeljwright> clayg: can you elaborate what you're after
21:15:23 <joeljwright> ?
21:15:24 <rledisez> joeljwright: is there any reason not to dynamically generate the tar headers on a request (eg: a POST req containing the list of objects to include in the archive)? i don't know the tar format, but i used to do that for zip. maybe it just does not match your use case
21:15:27 <clayg> the manifests are big right?  a few MB?  Can you share a real-world example from a production-like use case?
21:16:04 <joeljwright> yeah, our main product that uses Swift is a file transfer application
21:16:11 <clayg> rledisez: that cuts out the tempurl use case - or at least you still have to distribute a manifest thing that enumerates the things
21:16:20 <joeljwright> we build packages to be downloaded by recipients who do not have swift accounts
21:16:34 <joeljwright> we use SLO + tempurl to achieve this
21:16:41 <clayg> rledisez: I think it's related but distinct from joeljwright's use-case - but there is probably a use-case to make for more dynamic stuff too.
21:17:23 <joeljwright> rledisez: the reason not to do it dynamically is to get the SLO data validation check
21:17:30 <notmyname> one thing I like about joeljwright's patch is that it has the hooks for future stuff like dynamic *ambles, but it does not add that complexity now
21:17:39 <joeljwright> if it downloads/validates the data is what the sender wanted to share
21:17:46 <clayg> joeljwright: specific elaboration on what I'm after (not sure if this helps anyone else): a real-world example of a for-realzy-solving-a-customer-use-case segment with *ambles
21:18:30 <clayg> Like... in python stdlib folks always say "wouldn't it be great if this was in stdlib" and python core be like "yeah, probably, you should put it on pypi and after a bunch of people use it and you flesh out some of the edges where the rubber meets the road we'll bring it in"
21:18:31 <joeljwright> the only real-world, I-need-it-now use case is making tar-SLOs that don't cripple performance with loads of reads for tiny objects on the object servers
21:18:33 <timburke> i'm still a little up in the air about whether the inline data should be attached to a segment or to have a new "segment type" -- but i guess since we already have the *amble patch, may as well go with it? what i definitely *don't* want to do is change our minds on it after-the-fact
21:18:49 <joeljwright> I have other things I want to explore - zip, ISO, even mov
21:19:02 <clayg> are you *waiting* on upstream to enter "maintenance" mode before you *use* it?  Don't wait on us.  Show us how stupid we are for not having it work *exactly like this* already
21:19:34 <joeljwright> I have been waiting for upstream to validate that I'll get it eventually before I use it in production
21:19:59 <joeljwright> otherwise I have to maintain many different code bases
21:20:02 <clayg> i think that's created some of the tension we're experiencing
21:20:07 <notmyname> +1
21:20:27 <clayg> i mean middleware is middleware - i have middleware - lots of people that run swift have middleware - you're supposed to?
21:20:37 <clayg> rledisez: you have middleware?
21:20:41 <notmyname> joeljwright: ok, so to clarify, you've got a real problem, you've got a pretty cool idea on how it could be solved and patch for that, but you have *not* actually used this functionality yet to solve your problem in prod
21:20:48 <joeljwright> the problem with this approach is it requires patching SLO and helpers
21:20:48 <rledisez> clayg: sureā€¦
21:21:19 <joeljwright> notmyname: that's a fair assessment, I've held off pushing to get this in prod
21:21:26 * timburke has definitely *never* released a spike of a swift-provided middleware...
21:21:29 <joeljwright> I would deploy it tomorrow if I could though :)
21:21:46 <clayg> lol
21:22:04 <notmyname> yeah, I can sympathize with joeljwright's position, though, because he's not only maintaining swift in prod, AFAIK he's the only one doing it, and adding the differences from upstream (even just patching slo middleware locally) could be daunting
21:22:45 <clayg> ok, well I'm not saying that should change, and i'm *definitely* not saying that running hacks on upstream code is ideal (it definitely is crappy)
21:23:09 <notmyname> but it does create the tension of committing to support new SLO manifest formats forever in the hopes that we got the new stuff right before seeing it used
21:23:13 <notmyname> it's a tricky situation
21:23:18 <timburke> clayg: thinking of use-cases, imagine researchers wanting to share large data sets together with the scripts used to analyze them
21:23:26 <timburke> the datasets are large, they don't want to waste space by uploading a separate tar copy. the scripts are small, so even if we solve the de-dupe problem with small objects for headers/padding, we still have the ratelimiting problem
21:24:06 <notmyname> timburke: we can imagine a lot of places where new code could be useful... that's never been the problem, right?
21:24:26 <joeljwright> this all feels a bit catch-22
21:24:34 <notmyname> heh yes! totally unfair to joeljwright
21:24:49 <clayg> if I *had* such a use-case I would just see if joeljwright's existing solution works and then +2 (literally *works* for me!)
21:24:50 <timburke> notmyname: clayg was looking for "a real world example of a for-realzy-solving-a-customer-use-case segment with *ambles" -- i gave him something!
21:25:00 <torgomatic> timburke: +1
21:25:06 <joeljwright> :)
21:25:08 <clayg> timburke: nope - i don't need an *idea*
21:25:54 <notmyname> so that's the thing here. as a big community (with different employers, different use cases, etc), to some extent we can't only approve the stuff that we personally see use cases for
21:26:26 <notmyname> joeljwright actually has a for realzy use case here. the question is more about whether the given patch is maintainable by the community, *not* if it's actually useful functionality
21:26:31 <torgomatic> clayg: isn't that what joeljwright's users have? like, he's got SLOs with big segments and tiny segments, and they make a tarball, but performance sucks due to all the tiny segments
21:27:54 <joeljwright> torgomatic: not to mention all the extra time making 3* the number of SLO manifests I really need :)
21:28:32 <torgomatic> :)
21:28:38 <notmyname> so we get back to the basic question of "who's gonna review it?"
21:28:43 <notmyname> right? is that where we're at?
21:29:16 <notmyname> clayg: but I want to know your thoughts too. is joeljwright's existing use case ok? even though he admits he hasn't run this code in his prod yet?
21:29:34 <clayg> you can't say someone's use case is "not ok"
21:29:38 <notmyname> lol
21:29:41 <clayg> it'd be like saying "you're not mad at me"
21:29:50 <joeljwright> :)
21:30:20 <clayg> It's just every time I think about joel's use-case I write a TLO middleware - not add preamble to SLO
21:30:21 <notmyname> right. sorry. I didn't want to trap your answer there.
21:31:00 <notmyname> ok, yeah. so if one wanted to solve downloading .tar files, one would likely do something a lot simpler than *ambles in SLOs
21:31:01 <timburke> and how much of slo does that reinvent? what about when you realize you want cpio archives instead?
21:31:14 <joeljwright> timburke: you beat me to it
21:31:32 <notmyname> and the *amble idea is interesting because we see that it might be useful, but it's not actually been asked for, even by joeljwright's users
21:31:52 <notmyname> which makes it hard to justify the more complex solution
21:31:56 <notmyname> clayg: is that about right?
21:32:24 <timburke> i *am* a bit nervous about the number of variations we're adding to slo segments. that's one of the reasons i've been thinking about having "object" segments and "inline" segments -- i feel like it might make things easier to reason about
21:32:26 <clayg> idk, I think it's fine - if we have folks that "might" review it, I don't think that really answers joeljwright's problem very well
21:32:57 <joeljwright> timburke: that's a fair suggestion
21:33:22 <joeljwright> but we'd probably want to stop inline-only manifests
21:33:31 <joeljwright> which is why I shied away from that
21:33:47 <clayg> timburke: I have no strong intuition that I should prefer one solution over the other - it depends on the % of use-cases that need to envelop individual segments vs. totally independent in-between and around data
21:33:48 <joeljwright> I wanted to make it obvious that this is for making the data you store more useful
21:33:51 <clayg> both are complex and workable
21:33:52 <joeljwright> not storing data in manifests
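To contrast the two designs being weighed here: instead of attaching preamble/postamble fields to an object segment, a manifest could interleave standalone "inline data" entries between ordinary object segments. Again, the "data" key is an assumed name for illustration; this shape is also what raises joeljwright's concern about inline-only manifests.

```python
# Hypothetical alternative: standalone inline-data segments between object
# segments, rather than *ambles attached to a segment. The "data" key is assumed.
import base64

manifest = [
    {"data": base64.b64encode(b"<tar header for cat.jpg>").decode()},
    {"path": "/tar_segments/photos/cat.jpg",
     "etag": "d41d8cd98f00b204e9800998ecf8427e",
     "size_bytes": 1000},
    {"data": base64.b64encode(b"\0" * 24).decode()},  # tar padding
]
```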
21:34:03 <timburke> and i've certainly contributed! https://github.com/openstack/swift/commit/25d5e68 ... https://github.com/openstack/swift/commit/7fb102d ...
21:34:09 <notmyname> joeljwright: so what do you want to see? review the current patch with a goal of landing it? it sounds to me like timburke has some interesting alternative suggestions
21:34:17 <clayg> like - SLOs are almost Turing complete now
21:34:25 <joeljwright> clayg: :D
21:34:44 <timburke> joeljwright: on inline-only manifests... maybe just write the damn data?
21:35:17 <timburke> "look, you wanted this data written, i wrote it! what else do you want?"
21:35:27 <timburke> (as a non-slo, i mean)
21:35:29 <joeljwright> I'm really after some review that says (a) this is too complex, or (b) this looks useful, or (c) this would be better if you changed...
21:35:48 <timburke> definitely b, maybe a
21:35:58 <joeljwright> I'd like to know if this has any likelihood of landing
21:36:08 <joeljwright> because I need to solve this problem
21:36:14 <notmyname> acoles: kota_: tdasilva: you all are being quiet. what are your thoughts? what do you see as the best path forward?
21:36:20 <torgomatic> okay, I'll volunteer to take an in-depth look at it in the next few days
21:36:35 <joeljwright> torgomatic: thanks!
21:36:36 <notmyname> torgomatic: that's very helpful. thanks
21:38:19 <joeljwright> I'll be around on the swift channel if anyone wants to talk about this separately
21:38:29 <mattoliverau> Now that I've heard about the problems joeljwright's having and how he solves them, it has me sold... well, that we need a solution. I haven't looked at the patch so can't say it's the one, but I'm willing to go look at it in the mindset that we should solve this
21:38:31 <kota_> notmyname: still catching up on the history, but...
21:39:26 <mattoliverau> I'll find time to have an initial look today
21:39:41 <joeljwright> mattoliverau: kota_: thanks for looking!
21:39:47 <acoles> I'm torn - I'd prefer not to make SLO more complex but I can see the desire to re-use SLO by adding ambles
21:40:31 <kota_> primary question: i didn't find any reason why it is needed for the tar archive case (it might have been discussed before i joined here)
21:40:47 <notmyname> kota_: yeah, that's in some of the earlier links
21:41:03 <joeljwright> kota_: the pre/postambles avoid the need to store tiny objects for tar header/padding
21:41:19 <kota_> it looks like just a feature to add binary data to each segment via the manifest
21:41:26 <joeljwright> but preserve the SLO features of validating that the data you wanted in the tar is what you're downloading
21:41:41 <clayg> tick tock
21:41:43 <notmyname> ok, I want to move on the meeting, but thank you for your comments here. let's see if I can summarize where we are
21:41:51 <clayg> I don't feel we're in a drastically different place than we've been before
21:42:07 <joeljwright> yeah, but 40 minutes is enough!
21:42:07 <kota_> joeljwright: oic
21:42:19 <clayg> it is complex, it is worth solving, no one is *sure* what should change because we have no external pressure driving the design
21:42:40 <notmyname> heh, clayg just summarized it very well
21:42:52 <joeljwright> clayg: maybe we can catch up some time tomorrow to talk about alternatives?
21:43:04 <clayg> joeljwright: no, this code is already written
21:43:07 <notmyname> and torgomatic and mattoliverau offered to look at the patch
21:43:19 <joeljwright> kk
21:43:26 <joeljwright> I'll stop stressing now :)
21:43:30 <notmyname> :-)
21:43:33 <joeljwright> (until the reviews arrive!)
21:43:33 <clayg> we just have to have enough of a WAG at how to "qualify" it - and bandwidth to review
21:43:47 <notmyname> ok, let's move on. thank you joeljwright
21:43:56 <notmyname> #topic deadlock patch
21:43:59 <notmyname> https://review.openstack.org/#/c/493636/
21:43:59 <patchbot> patch 493636 - swift - Fix deadlock when logging from a tpool thread.
21:44:12 <notmyname> torgomatic did a great job finding and solving this issue
21:44:41 <mattoliverau> +100
21:44:45 <notmyname> we worried earlier this week that it would add some linux-only things, but that's been removed (so we still only test linux, but it might work somewhere else maybe)
21:45:16 <notmyname> clayg: you said we'd likely land this right after the meeting today :-)
21:45:33 <clayg> oh yeah, can I answer any questions?
21:45:41 <clayg> there's two I can NOT answer
21:46:02 <clayg> #1 how do we prevent this kind of lockup bug in future (no idea, it's a huge space, probably ripe)
21:46:38 <clayg> #2 is torgomatic some sort of super genius mutant (probably, we have some evidence of that, but one cannot be sure - he may just be a cyborg)
21:47:20 <mattoliverau> That would confirm my theory that SwiftStack devs don't sleep
21:47:46 <tdasilva> or are bots
21:47:49 <notmyname> torgomatic: you just pushed a new patch set during this meeting. is it all good from your perspective? timburke, clayg, acoles: are you planning on or expecting to +A it shortly?
21:48:01 <clayg> what?  new patch set!?
21:48:14 <acoles> notmyname: I was planning to sleep shortly :)
21:48:16 <torgomatic> notmyname: I think it's good, but then, I would ;)
21:48:33 <notmyname> acoles: proving mattoliverau and tdasilva wrong ;-)
21:48:41 <torgomatic> clayg: added a check to make sure you can't unlock someone else's mutex, like threading._RLock has
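A rough sketch of the kind of ownership check torgomatic describes, in the spirit of threading._RLock (illustration only; the actual patch's mutex is built differently so it works across green threads and real threads):

```python
# Illustration: a mutex that remembers which thread acquired it and refuses
# to be released by any other thread, similar to threading._RLock's check.
import threading

class OwnedMutex:
    def __init__(self):
        self._lock = threading.Lock()
        self._owner = None

    def acquire(self):
        self._lock.acquire()
        self._owner = threading.get_ident()

    def release(self):
        if self._owner != threading.get_ident():
            raise RuntimeError("cannot release un-acquired lock")
        self._owner = None
        self._lock.release()
```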
21:48:48 <mattoliverau> Lol
21:49:22 <clayg> slry, I will +A it so hard
21:49:38 <timburke> so...probably +A soon? might be worth having someone *outside* of swiftstack understand the problem and how this fixes it, though...
21:49:51 <clayg> I honestly need to package it today or in the am.. so.. my life is non-trivially better if we +A now - is that... workable for folks?
21:50:10 <clayg> I hate it when things happen too quickly - it's so foreign to how we normally work it makes me uneasy.
21:50:24 <notmyname> ok. mostly for this topic I just wanted to make sure people understand it and are aware of it
21:50:24 <tdasilva> lol
21:50:25 <clayg> Are we *supposed* to be this agile and responsive?!  Is this ok!?
21:50:46 <tdasilva> clayg: if a patch doesn't take 6 months then something feels wrong?
21:51:00 <joeljwright> tdasilva: :D
21:51:05 <notmyname> is there anyone who has *not* read the bug report and understood why this needs to be fixed ASAP?
21:51:59 <clayg> yeah I think understanding is useful-ish - because... and this came up this morning - this is so awesome we should probably backport it - it could be RC on lp bug #1575277 or lp bug #1659712
21:52:01 <openstack> Launchpad bug 1575277 in OpenStack Object Storage (swift) "object-replicator goes into bad state when lockup timeout < rsync timeout" [Medium,Confirmed] https://launchpad.net/bugs/1575277
21:52:02 <openstack> Launchpad bug 1659712 in OpenStack Object Storage (swift) "Object replicator hang" [Undecided,New] https://launchpad.net/bugs/1659712
21:52:15 <notmyname> if so, then read https://launchpad.net/bugs/1710328 soon
21:52:16 <openstack> Launchpad bug 1710328 in OpenStack Object Storage (swift) "object server deadlocks when a worker thread logs something" [High,In progress] - Assigned to Samuel Merritt (torgomatic)
21:52:35 <rledisez> not very helpful, but i need to say this patch is amazing. what used to never work (logging in diskfile) now works perfectly!
21:52:35 <clayg> ok, I'll probably kick it in ~1.5 hrs
21:52:54 <clayg> wooo!
21:52:56 <rledisez> we stressed it today with success
21:53:01 <notmyname> nice!!
21:53:02 <clayg> wooo!
21:53:11 <notmyname> rledisez: can you leave a short comment in gerrit saying that?
21:53:14 <mattoliverau> Awesome
21:53:17 <rledisez> sure
21:53:19 <notmyname> thanks
21:53:21 <clayg> rledisez: zomg rledisez doing *reviews*!?
21:53:36 <notmyname> ok. last thing (and related to this patch)...
21:53:40 <clayg> tdasilva: i won't even be able to keep up we move so fast
21:53:41 <timburke> rledisez: did that come up much for you before? i thought part of why it took so long to pin down was the fact that we *don't* do much logging in diskfile...
21:53:43 <notmyname> #topic swift 2.15.1 for pike
21:53:59 <clayg> 2.15.1 - best swift release YET
21:54:15 <notmyname> timing for the pike release is coming up quickly. what's on master at the end of the week is what I'll be tagging for pike as 2.15.1
21:54:19 <clayg> i have a hard stop at 3pm, i might leave 2 mins early if we're not done
21:54:21 <rledisez> timburke: we figured it was because of this bug. alecuyer has been working around it for so long; when we saw the patch coming, it was like "omg, we are so excited" :D
21:54:36 <timburke> great!
21:54:42 <notmyname> which includes torgomatic's patch for the deadlock
21:54:49 <notmyname> rledisez: that's great :-)
21:54:56 <acoles> timburke: IDK but I have a distant memory that whenever I tried to add debug logging to diskfile something broke ... and maybe this was it
21:55:11 <notmyname> any other questions/comments from anyone?
21:55:17 <clayg> three cheers for torgomatic
21:55:20 <clayg> hip hip!
21:55:23 <acoles> hip hip
21:55:26 <acoles> awww!
21:55:31 <clayg> hoooooRAY!
21:55:46 <joeljwright> \o/
21:55:49 <torgomatic> "hip hip hip hip awww!" is an appropriate set of cheers for a concurrency bug :)
21:55:51 <clayg> notmyname: sorry, you said tag on Friday yeah!?
21:55:57 <clayg> lolasd;lfjkasdf;lkjadsf
21:56:06 <mattoliverau> Lol
21:56:16 <notmyname> land on master by friday. I'll tag this weekend or monday
21:56:29 <notmyname> thank you, everyone, for coming
21:56:35 <tdasilva> do we care for patch 472659 being in the release?
21:56:36 <patchbot> https://review.openstack.org/#/c/472659/ - swift - Allow to rebuild a fragment of an expired object
21:56:38 <rledisez> for swift 2.15.1, i'd love to see this merged, i think it's ready now, last comment was about a header name, waiting for a +A ;) : https://review.openstack.org/#/c/472659/
21:56:38 <patchbot> patch 472659 - swift - Allow to rebuild a fragment of an expired object
21:56:41 <acoles> https://review.openstack.org/#/c/472659/ <-review
21:56:41 <patchbot> patch 472659 - swift - Allow to rebuild a fragment of an expired object
21:56:42 <rledisez> :)
21:56:47 <notmyname> thanks especially for going over the *amble patch with joeljwright, and joeljwright for sticking with us on it ;-)
21:56:58 <joeljwright> thanks everyone
21:57:00 <acoles> ok that's three request for same thing, it must happen!
21:57:06 <clayg> yes merge that one
21:57:12 <acoles> :)
21:57:13 <clayg> it must happen
21:57:20 <tdasilva> ok, i was reviewing that today, will continue tomorrow
21:57:24 <notmyname> tdasilva: thanks
21:57:26 <acoles> tdasilva: thank you
21:57:40 <notmyname> thanks for your work on swift
21:57:44 <notmyname> #endmeeting