21:01:46 <notmyname> #startmeeting swift
21:01:47 <openstack> Meeting started Wed Apr 25 21:01:46 2018 UTC and is due to finish in 60 minutes. The chair is notmyname. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:01:48 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:01:50 <openstack> The meeting name has been set to 'swift'
21:01:51 <notmyname> who's here for the swift team meeting?
21:01:53 <m_kazuhiro> o/
21:01:56 <acoles> hi
21:02:15 <kota_> hello
21:02:21 <rledisez> hi
21:02:40 <cschwede> hello!
21:02:42 <notmyname> Link https://wiki.openstack.org/wiki/Meetings/Swift
21:02:47 <notmyname> oh my! it's cschwede!
21:02:56 <notmyname> #link https://wiki.openstack.org/wiki/Meetings/Swift
21:03:01 <cschwede> i heard there's a swift gathering here
21:03:50 <notmyname> tdasilva told me he's not here
21:04:08 <notmyname> and torgomatic and clayg and timburke are in a meeting (that I left)
21:04:12 <notmyname> soo...
21:04:15 <notmyname> let's get started!
21:04:15 <acoles> cschwede: hi!
21:04:51 <notmyname> here's my weekly reminder that https://review.openstack.org/#/c/555245/ is still open and needs to land before the next release
21:04:51 <patchbot> patch 555245 - swift - Fix versioned writes error with url-encoded object...
21:05:12 <notmyname> it fixes https://bugs.launchpad.net/swift/+bug/1755554
21:05:13 <openstack> Launchpad bug 1755554 in OpenStack Object Storage (swift) "Percent signs in object names cause trouble for versioned_writes" [High,In progress] - Assigned to Kota Tsuyuzaki (tsuyuzaki-kota)
21:05:39 <notmyname> let's talk about https://review.openstack.org/#/c/562943/
21:05:40 <patchbot> patch 562943 - swift - Import swift3 into swift repo as s3api middleware
21:05:48 <notmyname> kota_: how is the patch looking this week?
21:06:06 <mattoliverau> o/ - sorry was a holiday in Oz yesterday so got confused this morning
21:06:52 <notmyname> (did kota_ step away?)
21:07:12 <kota_> tdasilva, m_kazuhiro, and pete reviewed it
21:07:24 <kota_> notmyname: I'm here
21:07:29 <notmyname> that's good news
21:07:33 * cschwede is still reviewing it, but looks pretty good so far
21:07:35 <mattoliverau> And I'm in the middle of a review.. just haven't finished, I got busy
21:07:39 <notmyname> great
21:07:59 <kota_> then, yesterday, I acked all comments and fixed small things.
21:08:07 <notmyname> kota_: is there anything found that is bigger than expected?
21:08:25 <kota_> not so far imo
21:08:35 <notmyname> does it seem like the plan we talked about last week is still working? (having a smaller set of people review it and land when they say it's good)
21:08:48 <cschwede> notmyname: what was your schedule, ie required merge date?
21:09:01 <cschwede> i remember Thiago saying sth like today
21:09:07 <cschwede> or this week
21:09:11 <kota_> it seems tdasilva still wants to make it by default but I still hold my opinion it's not now because of gate issue.
21:09:24 <notmyname> kota_: what's the gate issue?
21:09:35 <notmyname> cschwede: I don't have a required merge date for the s3api, but asap. the main contention is the upcoming feature/deep branch
21:09:36 <kota_> basically, devstack
21:09:51 <kota_> we have devstack gate to run functests
21:10:08 <kota_> but devstack has custom proxy-server.conf that includes swift3
21:10:33 <kota_> then, s3api by default means double s3api compatible middleware in the pipeline
21:10:51 <kota_> then, almost all of the functests will fail
21:10:57 <notmyname> ah
21:10:58 <cschwede> kota_: can't we pin swift in devstack to a specific hash before the s3api merge, and then follow up, unpin it and at the same time disable old middleware?
21:11:19 <cschwede> iirc that is possible
21:11:39 <kota_> so my plan is 1. merge it to the master, 2. fix devstack to use s3api, then, s3api by default.
21:12:19 <notmyname> I agree that we should merge to master before turning it on by default.
(but I support getting to an on-by-default asap)
21:12:25 <kota_> cschwede: ah... i think it's possible
21:13:21 <kota_> i mean, 'we can pin the version'.
21:13:33 <notmyname> is swift3 added to the proxy pipeline for every devstack job?
21:13:45 <kota_> but still not sure it works well for... setting up the gate.
21:13:59 <kota_> 1 sec
21:14:34 <kota_> the related change is at https://review.openstack.org/#/c/548938/
21:14:34 <patchbot> patch 548938 - openstack-dev/devstack - Change devstack-swift to use keystone auth sample
21:15:39 <kota_> it doesn't solve the problem directly, but attempts to resolve devstack making handmade pipeline.
21:15:56 <kota_> it didn't get finished yet tho.
21:16:36 <notmyname> oh ok. so it's a swift3-specific dsvm job
21:16:43 <notmyname> that adds swift3 to the pipeline
21:16:58 <notmyname> which means we cannot add that test to the s3api branch and have it test s3api middleware
21:17:13 <kota_> yes, the is_service_enabled swift3 is true at L430
21:17:33 <notmyname> and we cannot turn on s3api middleware by default because *other* projects may have added swift3 to the pipeline and then will fail
21:17:53 <kota_> and the swift_pipeline is based on tempauth pipeline in proxy-server.conf
21:17:56 <notmyname> did I understand that correctly?
21:18:40 <kota_> yes, but only one project (i.e. devstack) i know exists as such a project, tho.
21:19:08 <kota_> idk for any other projects.
21:19:21 <notmyname> hmm
21:20:00 <notmyname> it sounds like there's some more investigation needed as to what is breaking when. (or to rephrase: I don't completely understand the problem yet)
21:20:01 <kota_> oh, just we can ask to cschwede how it is on triple-O
21:20:16 <notmyname> but it doesn't seem like this is blocking a merge to master, correct?
21:20:28 <cschwede> kota_: it's not enabled by default on OOO
21:20:34 <cschwede> not yet, that is :)
21:20:38 <kota_> if we can make it as optional by default.
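The "double S3 middleware in the pipeline" concern raised above can be sketched as a small check. This is only an illustration: the filter names swift3 and s3api come from the discussion, while the sample config text and the helper name s3_filters_in_pipeline are hypothetical and are not devstack's or swift's actual code.

```python
import configparser

# Filter names taken from the discussion; everything else here is hypothetical.
S3_COMPAT_FILTERS = {"swift3", "s3api"}


def s3_filters_in_pipeline(conf_text):
    """Return the S3-compatibility filters found in [pipeline:main], in order."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    pipeline = parser.get("pipeline:main", "pipeline").split()
    return [name for name in pipeline if name in S3_COMPAT_FILTERS]


# Made-up pipeline showing what "s3api on by default" plus a hand-added
# swift3 entry would look like.
SAMPLE_CONF = """
[pipeline:main]
pipeline = catch_errors cache swift3 s3api tempauth proxy-server
"""

found = s3_filters_in_pipeline(SAMPLE_CONF)
if len(found) > 1:
    print("more than one S3 middleware in pipeline:", found)
```

A check like this would only flag the situation; the ordering of fixes (merge as optional, update devstack, then enable by default) is the plan discussed in the meeting.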
21:20:56 <notmyname> the proposed code can land, as-is. the proposed code is tested. and the proposed code doesn't break other projects
21:21:13 <kota_> notmyname: if we can make it as optional by default, it doesn't block getting it merged.
21:21:26 <notmyname> it is currently proposed as optional, right?
21:21:42 <kota_> i think so
21:22:19 <notmyname> ok
21:22:22 <kota_> oh... i missed your meaning.
21:22:30 <kota_> absolutely true.
21:22:38 <notmyname> ok
21:22:43 <kota_> currently I proposed the patch to make s3api as optional
21:23:25 <notmyname> which means that it should be able to land on master as soon as it's reviewed.
21:24:37 <mattoliverau> Ok so definitely optional first, then turned up will be a follow up after investigation and once master
21:25:00 <mattoliverau> *turned on by default
21:25:06 <notmyname> so if that's true, and it sounds like only small stuff has been found so far, then I hope that after mattoliverau and cschwede finish their reviews, then we will be able to merge s3api to master
21:25:42 <kota_> +1 :)
21:25:54 <notmyname> good to hear
21:26:16 <notmyname> ok, let's talk about torgomatic's consistency engine patches.
21:26:21 <notmyname> https://review.openstack.org/#/c/555563/ landed this week!
21:26:21 <patchbot> patch 555563 - swift - Multiprocess object replicator (MERGED)
21:26:23 <notmyname> thank you
21:26:26 <kota_> nice
21:26:47 <mattoliverau> \o/
21:26:58 <notmyname> that unblocks a few others that he'd been working on too that are based on that patch
21:27:25 <notmyname> rledisez: thanks for helping out there. I saw your name on an email about one of those, I think
21:28:01 <notmyname> ok, let's talk about the really big one now
21:28:02 <notmyname> feature/deep
21:28:17 <notmyname> acoles: what's the good work?
21:28:19 <notmyname> *word
21:28:58 <acoles> based on timburke's experimentation last week we made some significant changes to the replication strategy
21:29:30 <acoles> that is all now on feature/deep and Tim is continuing to trial sharding a monster db
21:29:48 <notmyname> and current progress on that big DB seems to be going well
21:29:50 <mattoliverau> Nice work
21:29:50 <notmyname> or at least going
21:29:51 <acoles> ...which is progressing :) ... but a little slowly :( so
21:29:57 <notmyname> one important... yeah. it's slow
21:30:14 <notmyname> but making progress
21:30:36 <acoles> so we're now trying to understand why it is so slow. It's on HDDs for a start.
21:30:58 <mattoliverau> We knew monster dbs would be a problem... But once it's on by default (in future) we only have to deal with monsters initially
21:31:01 <notmyname> given the hardware that DB is deployed on, we are *not* going to wait for it to finish all the sharding work. at its current rate, it will take about 50 days to finish with 143M rows
21:31:32 <mattoliverau> Wow
21:31:53 <acoles> at the other end of the scale, a 1.2million row db on my laptop with no async pendings shards in about 3 mins!
21:32:05 <notmyname> wooo!
21:32:34 <notmyname> here's the real question: when can we propose feature/deep to master
21:32:49 <notmyname> I think we should definitely create a new feature/deep-review
21:32:50 <mattoliverau> acoles: cool
21:33:10 <mattoliverau> Yup, it touches a bunch of places
21:33:21 <cschwede> acoles: any chance to test the 143m row db on an ssd node?
21:33:38 <notmyname> cschwede: unfortunately no.
21:34:06 <mattoliverau> What part seems to be slow? Scanning, cleaving (locally), replication?
21:34:38 <notmyname> acoles: do you know?
timburke would be better able to answer that, I think
21:34:39 <acoles> scanning was less than an hour IIRC
21:34:45 <notmyname> yeah
21:34:46 <mattoliverau> Maybe we can profile it and see what area may need to be improved (for the future)
21:34:52 <cschwede> are we sure this scales linearly? ie 1.2 million = 3 minutes on ssd, 12 million at least in the range of a few hours max?
21:34:53 <rledisez> did you run iotop to check if an SSD would help?
21:35:16 <mattoliverau> Oh nice
21:35:23 <acoles> with the recent changes we have tried to stamp on replication to the db
21:35:32 <mattoliverau> (re: scanning time)
21:36:14 <acoles> cschwede: I'm going to make more tests on 2 million, 4 million etc
21:36:27 <notmyname> yeah... hang on. uploading a screenshot
21:36:55 <notmyname> rledisez: http://d.not.mn/node_stats_sharding.png
21:38:14 <notmyname> just really really busy drives. 50M async pendings in the cluster plus the sharding work
21:38:24 <notmyname> ie "just like a user would do"
21:38:52 <cschwede> well, on the other side - 1.2 mill = 3 min, 144mil should be around 360 min then = 6 hours, which means a factor of 200 between hdds and ssd. which sounds reasonable if only looking at iops?
21:39:56 <notmyname> assuming linear scaling etc etc etc
21:40:09 <cschwede> right
21:40:38 <notmyname> but we expect it to get faster over time. as shards are cleaved off, the workload should be distributed to other nodes. so it may be faster
21:40:45 <notmyname> but for now, it's slow
21:40:59 <notmyname> but it's making progress
21:41:08 <cschwede> i was just worried about the big gap between minutes and 50 days, but it might be totally reasonable on hdds
21:41:19 <notmyname> which again gets back to "when can we merge to master?"
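The back-of-the-envelope estimate above can be reproduced in a few lines. This is a sketch that assumes strictly linear scaling with row count, which the participants themselves flag as unverified; the input numbers are the ones quoted in the meeting, and the variable names are illustrative.

```python
# Figures quoted in the meeting.
laptop_rows = 1.2e6      # rows in the small test DB
laptop_minutes = 3       # time to shard it on the laptop
big_db_rows = 143e6      # rows in the large DB on HDDs
observed_days = 50       # projected time at the current rate


def linear_estimate_minutes(rows):
    """Scale the laptop timing linearly by row count (an assumption)."""
    return rows / laptop_rows * laptop_minutes


est_minutes = linear_estimate_minutes(big_db_rows)
est_hours = est_minutes / 60
slowdown = (observed_days * 24) / est_hours

print(f"linear estimate: {est_minutes:.1f} min (~{est_hours:.1f} h)")
print(f"projected vs. linear estimate: ~{slowdown:.0f}x slower")
```

With 143M rows this gives roughly six hours and a gap of about 200x, in line with the figures cschwede quotes using 144M.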
21:41:33 * cschwede is curious about acoles tests with 2 and 4 million rows
21:41:50 <notmyname> I've been talking with timburke about making some more DBs about that size too
21:44:00 <notmyname> rledisez: http://d.not.mn/iotop.png
21:44:30 <acoles> notmyname: re proposing to master, obviously we need to get to bottom of the speed issues, but at same time I am conscious that there will be bugs that will only be found by review
21:44:42 * acoles admits he may write bugs, sometimes
21:45:18 <cschwede> impossible!
21:45:46 <acoles> cschwede: I do it *deliberately* of course :)
21:45:56 <mattoliverau> Lol
21:45:57 <notmyname> here's a proposal. please tell me where it's wrong or how to make it better
21:47:40 <notmyname> (1) mattoliverau and cschwede finish a review of s3api. assuming it's good, it lands on master within the next 24-26 hours. (2) I create the feature/deep-review branch upstream (3) acoles prepares the patch chain to propose there (4) we keep monitoring the progress on the big DB (5) when feature/deep-review has the patch chain proposed, we soft-freeze master and all review
21:48:42 <acoles> +1 for soft freeze
21:50:17 <notmyname> everyone is amazed at that plan's brilliance and has nothing to say, or everyone is amazed at its terribleness and is at a loss for words
21:50:22 <mattoliverau> And others who reviewed s3api and have their comments addressed can also re-review
21:50:33 <kota_> i don't have any objections we go soft-freeze when we have feature/deep-review. it should help the branch maintainer to concentrate to fix/discuss issues on the branch.
21:50:44 <mattoliverau> Seeing as they might also already be close to a +2,
21:50:58 <mattoliverau> But I'll try and get the review done today
21:51:01 <mattoliverau> Ish
21:52:19 <kota_> still want to keep my eyes on (4) the progress on the big DB tho.
21:52:35 <notmyname> yes, definitely
21:54:55 <notmyname> (I really am waiting on any comments)
21:55:50 <acoles> kota_: +1 point 4 is important to us too
21:56:03 <mattoliverau> notmyname: so sounds like a good plane.. the prepare the patch and landing may depend on 4 a bit.. so the timeline might need to expand. But the plan is good
21:56:13 <mattoliverau> Plan
21:56:18 <cschwede> plan sounds good to me too
21:56:38 <notmyname> mattoliverau: well, yes, but I don't think we should wait for 50 days to see what happens before proposing something to master
21:56:46 <mattoliverau> oh no
21:56:52 <acoles> I am tentatively planning on a small chain of patches, hoping it will help reviewers grok the aspects of the change
21:57:11 <mattoliverau> We should totally make the steps happen. And go make a shard-review branch
21:57:16 <acoles> hard to know how that will work out until I start carving it up
21:57:19 <notmyname> acoles: how long will you need for that?
21:57:38 <acoles> notmyname: hard to know how that will work out until I start carving it up ;)
21:57:42 <notmyname> ok
21:57:43 <acoles> maybe a day?
21:58:22 <notmyname> ok, great. i'll propose the branch by today or tomorrow, so that should work well together
21:58:30 <acoles> I'll most likely be derailed by tests breaking as the changes are separated
21:59:25 <notmyname> we're nearly at full time
21:59:36 <notmyname> let's start working together on this plan
21:59:59 <notmyname> I'll talk to tdasilva about getting another video chat next week to do a feature/deep overview
22:00:08 <notmyname> and that should also help with us reviewing
22:00:40 <kota_> +1
22:00:46 <notmyname> and ask him to record it to make sure different timezones can see it not in the middle of the night
22:01:06 <kota_> nice
22:01:06 <notmyname> (assuming we can't schedule it at a good time)
22:01:53 <mattoliverau> Sounds good
22:01:58 <notmyname> thanks for working on all this.
s3api is really great for our users and feature/deep is probably the biggest/best/hardest big feature we've ever done
22:02:10 <notmyname> I'm excited to see them land
22:02:28 <notmyname> I'll put info about the video meeting in the -swift channel
22:02:34 <notmyname> thanks for coming today
22:02:38 <notmyname> #endmeeting