21:00:37 <tdasilva> #startmeeting swift
21:00:38 <openstack> Meeting started Wed Sep 19 21:00:37 2018 UTC and is due to finish in 60 minutes. The chair is tdasilva. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:00:39 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:00:41 <openstack> The meeting name has been set to 'swift'
21:00:58 <timburke> i'm so excited! or maybe i just had too much coffee this morning
21:01:00 <tdasilva> hello, notmyname won't be available to run the meeting today, so he asked me to run it
21:01:10 <tdasilva> who is here for swift meeting?
21:01:22 <mattoliverau> o/
21:01:28 <kota_> hi
21:01:29 <m_kazuhiro> o/
21:01:29 <rledisez> hi o/
21:01:31 <fatema_> o/
21:02:25 <clayg> did not realize I was not in here
21:02:32 <tdasilva> hehe, no worries
21:02:39 <tdasilva> meeting agenda is here: https://wiki.openstack.org/wiki/Meetings/Swift
21:03:02 <tdasilva> #topic PTG summary and review
21:03:18 <tdasilva> hopefully everyone had a safe flight home
21:03:21 <mattoliverau> So how did the ptg go?
21:03:34 <tdasilva> it was really cool to see everyone there, and we missed those that couldn't make it
21:03:41 <mattoliverau> Give me all the goss :)
21:03:48 <timburke> mattoliverau: well, we missed you :-(
21:04:01 <tdasilva> i was looking at the etherpad earlier: https://etherpad.openstack.org/p/swift-ptg-planning-denver-2018
21:04:03 <mattoliverau> Sure sure :p
21:04:05 <timburke> but i *guess* you oughta spend time with the new baby...
21:04:16 <tdasilva> and I think we covered almost all topics there
21:04:50 <mattoliverau> Trip or marriage and i chose marriage, Zoe is very cute too which helps
21:05:02 <tdasilva> good choice
21:05:30 <timburke> :-)
21:05:33 <tdasilva> there was some discussion on PUT+POST, general task queue, multinode testing, py3, what else?
21:06:00 <kota_> sort of s3 things?
21:06:01 <tdasilva> we had some informal conversations about 1space
21:06:08 <kota_> and losfs.
21:06:54 <timburke> on s3 things -- including getting ceph tests going again! thanks, kota_!
21:06:57 <tdasilva> right, I saw a patch earlier today about s3api and keystone enhancements, i think it came out of conversations there, right
21:07:25 <timburke> yep, got it starred
21:07:31 <tdasilva> cool
21:07:46 <timburke> and i think i fixed up the gate check? tried, anyway
21:07:48 <kota_> yeah, it sounds like a great performance improvement.
21:08:16 <tdasilva> anything to share in terms of PTG process? do we need to do anything different next time?
21:08:54 <tdasilva> I thought it was quite the same as usual, a bit more focused on hacking on code directly...
21:09:29 <tdasilva> ok, if nothing else to share about PTG we can move on to next topic?
21:09:49 <tdasilva> #topic PUT+POST patch discovered issues
21:10:08 <clayg> Here's the 100ft view of where we're at: https://etherpad.openstack.org/p/swift-eventlet
21:10:10 <mattoliverau> there was an etherpad topic on sharding in prod and how that's going and no note, how'd it go?
21:10:44 <clayg> mattoliverau: I remember speaking briefly about container-sharding to the room
21:11:01 <timburke> sorry mattoliverau -- i think the gist of it is, "so far, so good -- we think?"
21:11:21 <clayg> mattoliverau: basically we're really happy with how it's going - but we have a ton of tooling downstream to make it manageable for our professional services team
21:11:51 <clayg> we're just using it when the alternative is "live with horribly horribly big and broken giant 100+ GB containers" and waiting for some issue to show up?
21:11:59 <timburke> like, containers get broken up, and (so far as i know) nobody's complaining about data going missing, so that's good
21:12:08 <clayg> Every day someone is sharding a container somewhere! No show stopping bugs found!
21:12:33 <mattoliverau> \o/
21:12:33 <clayg> From there it seemed no one else is using it yet (probably because the manual api/script stuff is non-trivial)
21:12:58 <tdasilva> zaitcev: just for context, on Friday morning, there were some more discussions on PUT+POST, so clayg wrote up the etherpad to summarize what we 'found out'
21:13:04 <clayg> we probably should have encouraged other operators to *at least* start running the sharding daemon and monitoring their "largest container" alerts
21:13:27 <zaitcev> tdasilva: is that the eventlet link above?
21:13:32 <tdasilva> correct
21:13:49 <mattoliverau> Well thanks for running it in prod and getting valuable feedback
21:14:08 <mattoliverau> Now we cam go fox things and make it better
21:14:40 <timburke> zaitcev: in particular, https://bugs.launchpad.net/swift/+bug/1792615 was some new information that fell out of trying to get latest eventlet installed for probe tests
21:14:40 <openstack> Launchpad bug 1792615 in OpenStack Object Storage (swift) "Eventlet 0.22.0+ changed how graceful shutdowns work" [Undecided,New]
21:14:52 <mattoliverau> Wow, thanks autocomplete on my tablet
21:15:51 <tdasilva> what are the next steps re: PUT+POST?
21:16:26 <timburke> that was coming up in https://review.openstack.org/#/c/564700/, and i kicked out https://review.openstack.org/#/c/602526/ to fix up the test but need to rebase it
21:16:27 <patchbot> patch 564700 - swift - Add ceph-s3 test non-voting job (MERGED) - 35 patch sets
21:16:28 <patchbot> patch 602526 - swift - Use latest eventlet in probe tests - 6 patch sets
21:16:53 <clayg> zaitcev: executive summary is "double continue garbage is not portable" so the thinking was "PUT+POST[+POST]" should be about the same on the wire and more portable!! But "on the wire" isn't the same as "semantically equivalent" and having the object-server carry state in memory between "PUT+<footers>" is a bad idea, because HTTP says you can close a pipelined request at any time.
21:17:34 <clayg> in fact eventlet recently decided it would start more aggressively kicking out clients - so it may not even be uncommon in production to have an object server go away between the data and the metadata - in which case the proxy is kinda screwed.
21:18:29 <clayg> Soo... next step could be "don't throw the baby out with the bath, PUT+second-http-request-for-footers may not work, but we could still fix the double continue madness as long as we were prepared to retry"
21:18:45 <clayg> which looks like a hybrid of MIMEPut+POST
21:18:48 <zaitcev> clayg: closing of the pipelined requests changes nothing for PUT+POST, except adding SYN/SYN+ACK/ACK packets, so performance is lower. But we aren't using the connection state for anything.
21:19:08 <clayg> or *maybe just maybe* - the whole idea was bonkers and we need to re-think our plan to get the hell off eventlet
21:19:21 <zaitcev> clayg: it's a bit hard to believe, but I forgot to filter out Connection: close at some point and everything worked
21:19:30 <clayg> zaitcev: yeah but if the process dies you lose the df cache
21:19:46 <clayg> we haven't committed anything to disk - there's no way to find the link_at file descriptor to add the metadata to
21:20:09 <zaitcev> clayg: True. And is this any different from the process dying that received the body but not MIME trailer?
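The exchange above about losing the "df cache" can be illustrated with a toy model. This is only a sketch: `Worker`, `put`, `post_footers` and the upload id are hypothetical names for illustration, not Swift's object-server code. It assumes the object server keeps a half-finished upload only in process memory between the PUT and the follow-up POST, and shows that a fresh process answering the POST cannot find it.

```python
class Worker:
    """Toy stand-in for one object-server process (hypothetical)."""

    def __init__(self):
        # Uploads that have a body but no footers yet. This lives only in
        # process memory, so it disappears when the process is replaced.
        self.pending = {}
        self.committed = {}

    def put(self, upload_id, body):
        # First request: stash the body and wait for the metadata.
        self.pending[upload_id] = body

    def post_footers(self, upload_id, footers):
        # Second request: depends on state left behind by put().
        if upload_id not in self.pending:
            raise LookupError("no pending upload %r" % upload_id)
        body = self.pending.pop(upload_id)
        self.committed[upload_id] = (body, footers)


def upload(put_worker, post_worker, upload_id="tx1"):
    """Send the body to one worker and the footers to another."""
    put_worker.put(upload_id, b"data")
    try:
        post_worker.post_footers(upload_id, {"etag": "abc"})
        return "committed"
    except LookupError:
        return "stranded"


same = Worker()
same_process_result = upload(same, same)
# A reload between the two requests means a fresh process gets the POST.
restarted_result = upload(Worker(), Worker())
print(same_process_result, restarted_result)
```

In this model the only ways out are the ones raised in the discussion: keep both steps inside one request, or make the first step durable so a later request can find it.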
21:20:23 <clayg> that's another option I considered - go ahead and fsync the .data.tmp file w/o the metadata so if we have to reconnect/recontinue we can "find it"
21:21:13 <clayg> zaitcev: not different than the process *dying* (which should be rare) - but unfortunately it *is* different from the process being asked to gracefully shut down - which will not be rare :'(
21:21:55 <clayg> there was some consideration if maybe we could patch eventlet - but then what happens if we switch to another web server that also likes to tell clients trying to do pipelined requests that they can't have their POST after PUT
21:23:10 <clayg> the reality is trying to do stateful stuff on a stateless protocol was a bad idea - I probably should have seen it ages ago - but the ramifications of HOW the object-server would track resources from one request to the next wasn't obvious to me until I was reviewing your code - and even then I wasn't sure it was wrong enough to need to abandon
21:23:35 <zaitcev> okay, I see where this is going
21:24:12 <zaitcev> next step, an RPC encoded with ASN.1 over WebSockets
21:24:25 <zaitcev> That should be stateful enough for you.
21:24:35 <clayg> I sure am getting tired of beating my head against HTTP!
21:24:41 <zaitcev> Okay
21:24:46 <clayg> all I want is some data with metadata afterwards!?
21:24:53 <zaitcev> Did you look at HTTP/2?
21:25:03 <clayg> zaitcev: see - now you're thinking!
21:25:07 <zaitcev> That one has channelized requests, maybe they can help.
21:25:16 <clayg> I just want off eventlet 😠
21:25:37 <zaitcev> Hey, no cheating. You said you wanted stateful, that's more than just "off eventlet".
21:25:55 <clayg> I think the protocol for http/2 is amazing absolutely! there's no good way to swap it into a functioning wsgi application implementation
21:26:23 <zaitcev> Yeah... That's why I wasn't joking about WebSockets
21:26:27 <clayg> that's the whole problem - we keep wanting to not have to "throw away all that stuff that already works" 🤮
21:27:00 <clayg> zaitcev: I'm going to push up what I have soon (today or tomorrow)
21:27:15 <clayg> zaitcev: I want to reconsider your change from a tactical level
21:27:55 <clayg> but I think they both get -1'd for the same issue "if you hup the object-server on latest eventlet the proxy can't finish the upload"
21:28:06 <clayg> so... we'll have to figure out where to go next
21:28:18 <clayg> I was cussin on Friday
21:29:18 <clayg> but I'm over it now - because I hate eventlet - and humming bird is free for the taking - and all I really want is data followed by metadata and nothing is really worse, my clusters are running the same today with MIME+eventlet that they were the week before
21:29:23 <zaitcev> I still don't understand how PUT+POST is any worse than MIME in this regard.
21:30:12 <clayg> the new eventlet behavior is that if you HUP a server (which we do plenty on config push etc) the object-server will let the PUT finish then kick the connection to the floor
21:30:23 <zaitcev> Ah, okay.
21:30:29 <clayg> with MIME it would let the PUT+footers+commit *all* finish - which is super good and exactly what we want
21:31:22 <clayg> I would be even more mad at eventlet (I still kinda am really) - but in this case they're right - what I had suggested was brain-dead and now I kinda hate http too
21:31:27 <clayg> but I'm over it!
21:31:54 <clayg> tdasilva: i think that's all I've got on that subject for now
21:32:02 <clayg> did everyone else doze off 😬
21:32:30 <tdasilva> zaitcev: anything else?
21:32:30 <zaitcev> OK, fine. So why don't you create a PUT2 that takes our good old MIME, only encoded in such a way that there aren't any zero-size bodies? Can eventlet take at least that much?
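The difference clayg draws between the MIME upload and PUT+POST under a HUP can be sketched as a small simulation. It is a simplified reading of the behavior described in this meeting, not eventlet's implementation: `DrainingServer` and both upload helpers are hypothetical, and the HUP is modeled as arriving while the first request is in flight.

```python
class DrainingServer:
    """Toy server: after a HUP it finishes the request in flight, then
    refuses any further request on the same connection."""

    def __init__(self):
        self.draining = False
        self.connection_open = True
        self.log = []

    def hup(self):
        self.draining = True

    def request(self, name):
        if not self.connection_open:
            raise ConnectionError("connection closed before %s" % name)
        self.log.append(name)  # the in-flight request always completes
        if self.draining:
            self.connection_open = False  # nothing more on this connection


def single_request_upload(server):
    # Data, footers and commit all travel inside one request.
    server.hup()  # reload arrives mid-upload
    server.request("PUT data+footers+commit")
    return server.log


def two_request_upload(server):
    # Data and footers travel in separate requests on one connection.
    server.hup()  # reload arrives during the PUT
    server.request("PUT data")
    try:
        server.request("POST footers")
    except ConnectionError:
        return server.log + ["POST failed"]
    return server.log


single_log = single_request_upload(DrainingServer())
split_log = two_request_upload(DrainingServer())
print(single_log, split_log)
```

Under these assumptions the single-request upload finishes cleanly, while the split upload loses its connection between the data and the footers and has to reconnect, at which point it may reach a process that never saw the data.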
21:32:54 <clayg> zaitcev: FWIW I don't have an obviously better plan - so if you want to high bandwidth about this later I'd be down for that. Or we can just let it stew and work on py3 or whatever and maybe a door will present itself
21:33:30 <zaitcev> I see. You want to keep the tree from accepting another undercooked solution like ssync.
21:34:42 <clayg> zaitcev: yeah, I think MIME/PUT + POST commit would work - and we end with something that is not intimately coupled with eventlet - but it still doesn't feel super portable to me - it's a hodge podge! and YES - EXACTLY - we still have to un$%^& ssync too!
21:35:04 <clayg> we need a better option than HTTP I'm afraid - our requirements don't match what it can offer us it seems!
21:35:04 <zaitcev> I'm willing to wait until the day we have an external event that forces us to accept something that is good enough, like PUT+POST or an equivalent STAGE+MERGE+COMMIT. Maybe py3, unless we find a way to make MIMEPutter just work there.
21:36:04 <tdasilva> zaitcev, clayg: maybe we could continue the conversation in either #openstack-swift or schedule a video call or something?
21:36:22 <clayg> tdasilva: yup, I think that's what we needed to get out today
21:36:44 <timburke> if we could find a way to get the zero-byte chunk out of the mime stuff, that might work. the trouble (IIRC) is correctly identifying end-of-data. i think we wouldn't even necessarily need to invent a new verb. but if we wanted to use another framework/language, i think we'd still have some degree of trouble with wanting to send/receive headers with the 100 Continue response, and wanting to send more than one 100 Continue
21:37:29 <clayg> YUP
21:37:33 <clayg> timburke: gets it
21:37:46 <zaitcev> One thing that makes me wary of rolling our own RPC (over WebSockets or even over ssh) is my experience with Ceph, where they do have their project-specific RPC. That thing is a disaster. Firstly, it's underdocumented, so writing implementations is next to impossible. Secondly, the strong binding seeps in, so now you MUST have proxies (monitors in Ceph) and storage (OSD in Ceph) with matching versions.
21:37:48 <clayg> HTTP is not the right protocol for what we're doing
21:38:03 <clayg> YUP
21:38:05 <zaitcev> Please be aware that RPC has these hazards
21:38:06 <clayg> zaitcev: gets it
21:38:23 <timburke> zaitcev: are we sure that MIMEPutter *doesn't* work with py3? i thought i'd gotten some EC stuff working on py3...
21:39:04 <zaitcev> timburke: it might. I dimly remember some problem with it, but maybe it's resolved. I had many problems in the beginning of py3.
21:39:08 <clayg> our requirements are that off the wall - there's a better foundation for what we need to do than HTTP - no one could have known our requirements would over time outgrow HTTP - is was is
21:39:22 <clayg> are *not* that off the wall
21:40:11 <tdasilva> cool...let's move on and we can continue on #openstack-swift
21:40:15 <zaitcev> ok
21:40:16 <clayg> the problem we've been scared of is how to move to something better w/o having to redo everything at once. that problem hasn't gone away.
21:40:46 <tdasilva> #topic Configure DiskFileClass
21:41:02 <tdasilva> I'm not sure if it was notmyname or kota_ that added that topic
21:41:08 <kota_> that's what i added
21:41:23 <tdasilva> kota_: what's up?
21:41:49 <kota_> just a reminder for mattoliverau and it looks like he noticed my irc message in the swift channel.
21:42:10 <mattoliverau> :)
21:42:12 <kota_> we had some discussion at the PTG with tdasilva then, we don't think any issues so for.
21:42:13 <tdasilva> #link https://review.openstack.org/#/c/447129/
21:42:14 <patchbot> patch 447129 - swift - Configure diskfile per storage policy - 20 patch sets
21:42:24 <kota_> so far.
21:42:36 <clayg> timburke: kota_: thanks for reviewing that
21:42:45 <clayg> and m_kazuhiro too!
21:42:48 <clayg> nice!
21:43:02 <tdasilva> and it already has one +2, so should be close
21:43:11 <kota_> but it's an incremental improvement, so i may want to wait for other eyes before landing it.
21:43:15 <timburke> clayg: i wouldn't say i "reviewed it" -- but i at least started thinking about it a bit
21:43:56 <clayg> all the tests seem to be passing - we know it's not horribly broken
21:44:10 <kota_> but also i know rledisez *really* wants to get it merged, so I'm now thinking about whether i can land it with a single +2 or not
21:44:27 <clayg> I don't think I'm going to be re-spinning kinetic anytime soon
21:45:00 <clayg> so unless I have some other use-case for an out-of-tree diskfile I'm happy to let the entrypoints in and if we find an issue later we'll resolve it
21:45:23 <clayg> probably with a bug like "can't use my out of tree disk file with new entrypoints because wah wah wah"
21:45:36 <timburke> tdasilva: so does it seem like swiftonfile would be able to use this instead of needing to provide its own object-server?
21:45:48 <clayg> that would be pretty cool
21:46:09 <mattoliverau> I'll try and get some kind of review in this week, but definitely a better one in before next meeting. (/me is kinda on leave this week)
21:46:10 <tdasilva> timburke: IIRC swiftonfile overwrites some methods of the object-server, need to reload why
21:46:37 <kota_> alright, thanks mattoliverau
21:46:48 <timburke> not that it *must* be done to get the patch landed -- just thinking it might make a good data point to think about whether it *could* be done
21:46:51 <clayg> mattoliverau: unless you'd be spinning testing with an out-of-tree disk file I'm not sure what we'd cover new beyond what kota_ has seen
21:47:08 <tdasilva> timburke: yep! agreed
21:47:17 <clayg> kota_: is there anything specifically about how it interacts with the *in-tree* diskfiles that gives you pause? Or your *only* reservation was out-of-tree diskfiles interop?
21:47:30 <mattoliverau> Yeah, i won't be doin that
21:48:29 <clayg> maybe another reviewer could leverage https://github.com/thiagodasilva/ansible-saio and give swift-on-file a whirl
21:48:30 <timburke> i still wanna know what happens when you mis-configure the entry points -- do a replicated diskfile with an ec policy, say
21:48:46 <clayg> should we do that before we merge? do we owe that to diskfile consumers (cc zaitcev tdasilva)
21:48:49 <tdasilva> timburke had a good question on whether the option should be in object-server
21:48:54 <rledisez> timburke: it should raise an error and the process won't start
21:49:04 <timburke> cool! thanks rledisez
21:49:06 <tdasilva> clayg: i'm really not concerned about swiftonfile in this case
21:49:08 <kota_> clayg: i think it works with in-tree
21:49:39 <clayg> kota_: then I think you did the right thing bringing to the meeting and I think we can merge it and probably should
21:49:47 <zaitcev> clayg: Sadly I'm not that in touch with Gluster people who run swift-on-file. Not heard from them in years.
21:50:15 <kota_> basically, my concern is with out-of-tree, but for sure, another pair of eyes is always good for finding something I missed.
21:50:17 <clayg> WFM! I'm not trying to break anybody on purpose - this is just an issue of resources and needing to move our ball forward
21:50:31 <clayg> I don't think anyone is going to hate on us too long - we're doing our best
21:51:00 <tdasilva> clayg: I can reach out to gluster folks and see if they want to have a say...
21:51:19 <kota_> yup, so that, i was thinking to do if we may have a long time.
21:51:33 <clayg> tdasilva: ok, I'd be willing to hold on that response until next week
21:51:43 <tdasilva> sounds good, i'll reach out to them tomorrow
21:51:44 <clayg> tdasilva: and that's probably pretty freaking amicable of you to offer that
21:52:24 <clayg> and if anyone did show up and ask if we could do xyz to make their lives easier - we'd be all for it - patches welcome!
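rledisez's answer above, that a mismatched diskfile should raise an error so the process won't start, can be sketched as a fail-fast check at startup. This is only an illustration: the registry, the diskfile names and `validate_policies` are hypothetical and are not taken from the patch under review.

```python
# Hypothetical registry: the policy types each diskfile implementation supports.
DISKFILE_POLICY_TYPES = {
    "replication.fs": {"replication"},
    "erasure_coding.fs": {"erasure_coding"},
}


def validate_policies(policies):
    """Raise ValueError if any policy names a diskfile that is unknown or
    that does not support the policy's type."""
    for name, conf in policies.items():
        supported = DISKFILE_POLICY_TYPES.get(conf["diskfile"])
        if supported is None:
            raise ValueError(
                "policy %s: unknown diskfile %r" % (name, conf["diskfile"]))
        if conf["policy_type"] not in supported:
            raise ValueError(
                "policy %s: diskfile %r does not support %s policies"
                % (name, conf["diskfile"], conf["policy_type"]))


good = {"gold": {"policy_type": "replication", "diskfile": "replication.fs"}}
bad = {"ec42": {"policy_type": "erasure_coding", "diskfile": "replication.fs"}}

validate_policies(good)  # passes silently
try:
    validate_policies(bad)
    startup_refused = False
except ValueError:
    startup_refused = True
print(startup_refused)
```

Checking at startup rather than on the first request means a bad configuration surfaces as a failed deploy instead of as failed uploads.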
21:52:45 <tdasilva> anything else on this topic?
21:52:51 <kota_> summarizing: mattoliverau will review it this week, tdasilva will ask the gluster folks, and probably we can land it next week
21:53:11 <timburke> rledisez: so with another diskfile implementation -- would you expect to switch over one policy at a time, or all policies for a single node?
21:54:04 <rledisez> timburke: as you prefer. we only manage one policy per node, so it's the same for us. but you can switch one policy at a time
21:55:13 <timburke> ok. just trying to get a model in my head for how things are connected
21:55:39 <kota_> thx timburke
21:55:58 <tdasilva> #topic open discussion
21:56:14 <tdasilva> anything else to discuss ?
21:56:38 <tdasilva> rledisez, m_kazuhiro anything on 'general task'?
21:57:36 <tdasilva> there are still some patches on https://wiki.openstack.org/wiki/Swift/PriorityReviews that need review, so I'd encourage everyone to look at that
21:58:22 <zaitcev> BTW, why are you running the meeting and what has happened to notmyname?
21:58:52 <tdasilva> notmyname had an appointment so he couldn't join
21:58:56 <kota_> oic, s3 and swift cross compatibilities, we should discuss, probably it's in the next meeting. it will take a longer time.
21:58:58 <tdasilva> so he asked me to run it today
21:59:04 <zaitcev> got it
21:59:09 <tdasilva> he should be back running it next week
21:59:17 <kota_> just a minute left.
21:59:19 <tdasilva> kota_: can you add to the agenda
21:59:23 <kota_> yup
21:59:49 <tdasilva> the rolling upgrade test job got merged on monday but there is still a ton of work/cleanup to do there, check out this etherpad if you would like to discuss or help: https://etherpad.openstack.org/p/swift-zuul-rolling-upgrade
22:00:11 <mattoliverau> Nice Thanks tdasilva
22:00:17 <kota_> great
22:00:21 <clayg> +1 tdasilva is awesome!!!
22:00:24 <tdasilva> ok, going to call it here and we can continue on #openstack-swift if there's anything else to discuss
22:00:34 <clayg> sorry I took up so much time on the PUT+POST MIME :'(
22:00:54 <m_kazuhiro> tdasilva: we discussed the updating process of general task, so I'm updating the patch for that now.
22:00:57 <tdasilva> clayg: no problem, thanks for sharing all that, and zaitcev thanks for your time on it too
22:00:57 <kota_> clayg: no worries, that's awesome.
22:01:14 <tdasilva> m_kazuhiro: awesome!
22:01:21 <tdasilva> #endmeeting