21:01:46 #startmeeting swift
21:01:47 Meeting started Wed Apr 25 21:01:46 2018 UTC and is due to finish in 60 minutes. The chair is notmyname. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:01:48 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:01:50 The meeting name has been set to 'swift'
21:01:51 who's here for the swift team meeting?
21:01:53 o/
21:01:56 hi
21:02:15 hello
21:02:21 hi
21:02:40 hello!
21:02:42 Link https://wiki.openstack.org/wiki/Meetings/Swift
21:02:47 oh my! it's cschwede!
21:02:56 #link https://wiki.openstack.org/wiki/Meetings/Swift
21:03:01 i heard there's a swift gathering here
21:03:50 tdasilva told me he's not here
21:04:08 and torgomatic and clayg and timburke are in a meeting (that I left)
21:04:12 soo...
21:04:15 let's get started!
21:04:15 cschwede: hi!
21:04:51 here's my weekly reminder that https://review.openstack.org/#/c/555245/ is still open and needs to land before the next release
21:04:51 patch 555245 - swift - Fix versioned writes error with url-encoded object...
21:05:12 it fixes https://bugs.launchpad.net/swift/+bug/1755554
21:05:13 Launchpad bug 1755554 in OpenStack Object Storage (swift) "Percent signs in object names cause trouble for versioned_writes" [High,In progress] - Assigned to Kota Tsuyuzaki (tsuyuzaki-kota)
21:05:39 let's talk about https://review.openstack.org/#/c/562943/
21:05:40 patch 562943 - swift - Import swift3 into swift repo as s3api middleware
21:05:48 kota_: how is the patch looking this week?
21:06:06 o/ - sorry, it was a holiday in Oz yesterday so I got confused this morning
21:06:52 (did kota_ step away?)
21:07:12 tdasilva, m_kazuhiro, and pete reviewed it
21:07:24 notmyname: I'm here
21:07:29 that's good news
21:07:33 * cschwede is still reviewing it, but looks pretty good so far
21:07:35 And I'm in the middle of a review.. just haven't finished, I got busy
21:07:39 great
21:07:59 then, yesterday, I acked all comments and fixed small things.
21:08:07 kota_: is there anything found that is bigger than expected?
21:08:25 not so far imo
21:08:35 does it seem like the plan we talked about last week is still working? (having a smaller set of people review it and land when they say it's good)
21:08:48 notmyname: what was your schedule, ie required merge date?
21:09:01 i remember Thiago saying something like today
21:09:07 or this week
21:09:11 it seems tdasilva still wants to make it the default, but I still hold my opinion that it shouldn't be right now, because of a gate issue.
21:09:24 kota_: what's the gate issue?
21:09:35 cschwede: I don't have a required merge date for s3api, but asap. the main contention is the upcoming feature/deep branch
21:09:36 basically, devstack
21:09:51 we have a devstack gate to run functests
21:10:08 but devstack has a custom proxy-server.conf that includes swift3
21:10:33 then, s3api by default means double S3-compatible middleware in the pipeline
21:10:51 then, almost all of the functests will fail
21:10:57 ah
21:10:58 kota_: can't we pin swift in devstack to a specific hash before the s3api merge, and then follow up, unpin it and at the same time disable the old middleware?
21:11:19 iirc that is possible
21:11:39 so my plan is 1. merge it to master, 2. fix devstack to use s3api, then, s3api by default.
21:12:19 I agree that we should merge to master before turning it on by default. (but I support getting to on-by-default asap)
21:12:25 cschwede: ah... i think it's possible
21:13:21 i mean, 'we can pin the version'.
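To make the gate issue concrete: the sketch below shows roughly what the proxy pipeline looks like with s3api kept as an opt-in middleware, and why devstack's handmade pipeline (which already injects swift3) would break if s3api were enabled by default. This is an illustration only; the pipeline ordering is simplified and the filter sections for the other middlewares are omitted.

    # proxy-server.conf -- simplified illustration, not a drop-in config
    [pipeline:main]
    # the operator opts in to s3api explicitly, placing it ahead of the auth middleware
    pipeline = catch_errors proxy-logging cache s3api tempauth proxy-logging proxy-server

    [filter:s3api]
    use = egg:swift#s3api

    # devstack's generated pipeline currently inserts swift3 in the same position;
    # turning s3api on by default would therefore stack two S3-compatibility
    # middlewares in one pipeline and most of the S3 functests would fail --
    # hence "merge as optional first, flip the default after devstack is fixed"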
21:13:33 is swift3 added to the proxy pipeline for every devstack job?
21:13:45 but still not sure it works well for... setting up the gate.
21:13:59 1 sec
21:14:34 the related change is at https://review.openstack.org/#/c/548938/
21:14:34 patch 548938 - openstack-dev/devstack - Change devstack-swift to use keystone auth sample
21:15:39 it doesn't solve the problem directly, but attempts to address devstack building a handmade pipeline.
21:15:56 it isn't finished yet tho.
21:16:36 oh ok. so it's a swift3-specific dsvm job
21:16:43 that adds swift3 to the pipeline
21:16:58 which means we cannot add that test to the s3api branch and have it test the s3api middleware
21:17:13 yes, the is_service_enabled swift3 check is true at L430
21:17:33 and we cannot turn on the s3api middleware by default because *other* projects may have added swift3 to the pipeline and then will fail
21:17:53 and the swift_pipeline is based on the tempauth pipeline in proxy-server.conf
21:17:56 did I understand that correctly?
21:18:40 yes, but devstack is the only such project that i know of, tho.
21:19:08 idk about any other projects.
21:19:21 hmm
21:20:00 it sounds like there's some more investigation needed as to what is breaking when. (or to rephrase: I don't completely understand the problem yet)
21:20:01 oh, we can just ask cschwede how it is on TripleO
21:20:16 but it doesn't seem like this is blocking a merge to master, correct?
21:20:28 kota_: it's not enabled by default on OOO
21:20:34 not yet, that is :)
21:20:38 if we can make it optional by default.
21:20:56 the proposed code can land as-is. the proposed code is tested. and the proposed code doesn't break other projects
21:21:13 notmyname: if we can make it optional by default, it doesn't block getting it merged.
21:21:26 it is currently proposed as optional, right?
21:21:42 i think so
21:22:19 ok
21:22:22 oh... i missed your meaning.
21:22:30 absolutely true.
21:22:38 ok
21:22:43 currently I've proposed the patch to make s3api optional
21:23:25 which means that it should be able to land on master as soon as it's reviewed.
21:24:37 Ok, so definitely optional first; then turning it on by default will be a follow-up after investigation and once it's on master
21:25:06 so if that's true, and it sounds like only small stuff has been found so far, then I hope that after mattoliverau and cschwede finish their reviews, we will be able to merge s3api to master
21:25:42 +1 :)
21:25:54 good to hear
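cschwede's pin-then-unpin suggestion, which the "optional first" plan leaves room for, would look something like the devstack override below. This is a hypothetical sketch: it assumes devstack's usual SWIFT_BRANCH override variable, and the commit hash is a placeholder rather than a real ref.

    # local.conf (devstack) -- hypothetical sketch of pinning swift ahead of the s3api merge
    [[local|localrc]]
    # hold swift at a known-good commit from before the s3api merge
    SWIFT_BRANCH=<known-good-commit-hash>
    # follow-up change: switch devstack's pipeline from swift3 to s3api,
    # then remove this pin so swift tracks master again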
21:26:16 ok, let's talk about torgomatic's consistency engine patches.
21:26:21 https://review.openstack.org/#/c/555563/ landed this week!
21:26:21 patch 555563 - swift - Multiprocess object replicator (MERGED)
21:26:23 thank you
21:26:26 nice
21:26:47 \o/
21:26:58 that unblocks a few others that he'd been working on too that are based on that patch
21:27:25 rledisez: thanks for helping out there. I saw your name on an email about one of those, I think
21:28:01 ok, let's talk about the really big one now
21:28:02 feature/deep
21:28:17 acoles: what's the good word?
21:28:58 based on timburke's experimentation last week we made some significant changes to the replication strategy
21:29:30 that is all now on feature/deep and Tim is continuing to trial sharding a monster db
21:29:48 and current progress on that big DB seems to be going well
21:29:50 Nice work
21:29:50 or at least going
21:29:51 ...which is progressing :) ... but a little slowly :( so
21:29:57 one important... yeah. it's slow
21:30:14 but making progress
21:30:36 so we're now trying to understand why it is so slow. It's on HDDs for a start.
21:30:58 We knew monster dbs would be a problem... But once it's on by default (in future) we only have to deal with monsters initially
21:31:01 given the hardware that DB is deployed on, we are *not* going to wait for it to finish all the sharding work. at its current rate, it will take about 50 days to finish with 143M rows
21:31:32 Wow
21:31:53 at the other end of the scale, a 1.2 million row db on my laptop with no async pendings shards in about 3 mins!
21:32:05 wooo!
21:32:34 here's the real question: when can we propose feature/deep to master
21:32:49 I think we should definitely create a new feature/deep-review
21:32:50 acoles: cool
21:33:10 Yup, it touches a bunch of places
21:33:21 acoles: any chance to test the 143m row db on an ssd node?
21:33:38 cschwede: unfortunately no.
21:34:06 What part seems to be slow? Scanning, cleaving (locally), replication?
21:34:38 acoles: do you know? timburke would be better able to answer that, I think
21:34:39 scanning was less than an hour IIRC
21:34:45 yeah
21:34:46 Maybe we can profile it and see what area may need to be improved (for the future)
21:34:52 are we sure this scales linearly? ie 1.2 million = 3 minutes on ssd, 12 million at least in the range of a few hours max?
21:34:53 did you run iotop to check if an SSD would help?
21:35:16 Oh nice
21:35:23 with the recent changes we have tried to stamp on replication to the db
21:35:32 (re: scanning time)
21:36:14 cschwede: I'm going to run more tests on 2 million, 4 million etc
21:36:27 yeah... hang on. uploading a screenshot
21:36:55 rledisez: http://d.not.mn/node_stats_sharding.png
21:38:14 just really really busy drives. 50M async pendings in the cluster plus the sharding work
21:38:24 ie "just like a user would do"
21:38:52 well, on the other side - 1.2 mill = 3 min, 144 mill should be around 360 min then = 6 hours, which means a factor of 200 between hdds and ssds. which sounds reasonable if only looking at iops?
21:39:56 assuming linear scaling etc etc etc
21:40:09 right
21:40:38 but we expect it to get faster over time. as shards are cleaved off, the workload should be distributed to other nodes. so it may get faster
21:40:45 but for now, it's slow
21:40:59 but it's making progress
21:41:08 i was just worried about the big gap between minutes and 50 days, but it might be totally reasonable on hdds
21:41:19 which again gets back to "when can we merge to master?"
21:41:33 * cschwede is curious about acoles' tests with 2 and 4 million rows
21:41:50 I've been talking with timburke about making some more DBs about that size too
21:44:00 rledisez: http://d.not.mn/iotop.png
21:44:30 notmyname: re proposing to master, obviously we need to get to the bottom of the speed issues, but at the same time I am conscious that there will be bugs that will only be found by review
21:44:42 * acoles admits he may write bugs, sometimes
21:45:18 impossible!
21:45:46 cschwede: I do it *deliberately* of course :)
21:45:56 Lol
21:45:57 here's a proposal. please tell me where it's wrong or how to make it better
21:47:40 (1) mattoliverau and cschwede finish a review of s3api. assuming it's good, it lands on master within the next 24-26 hours. (2) I create the feature/deep-review branch upstream (3) acoles prepares the patch chain to propose there (4) we keep monitoring the progress on the big DB (5) when feature/deep-review has the patch chain proposed, we soft-freeze master and we all review
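Point (4) of the plan is about watching the big DB, so for reference here is the back-of-envelope extrapolation from the numbers quoted earlier (1.2M rows in ~3 minutes on an SSD laptop, ~50 days projected for 143M rows on the busy HDD cluster). It is a sketch only, and it assumes linear scaling with row count, which cschwede and acoles explicitly flagged as unverified.

    # Back-of-envelope sharding-time extrapolation using the figures from the meeting.
    # Assumes linear scaling with row count (unverified), so treat the output as a
    # rough sanity check, not a prediction.

    def extrapolate_minutes(baseline_rows, baseline_minutes, target_rows):
        """Linearly scale a measured sharding time to a larger container DB."""
        return baseline_minutes * (target_rows / baseline_rows)

    # 1.2M rows sharded in ~3 minutes on an SSD laptop
    ssd_minutes = extrapolate_minutes(1.2e6, 3, 144e6)
    print("144M rows on SSD: ~%.0f min (~%.0f hours)" % (ssd_minutes, ssd_minutes / 60))
    # -> ~360 min, i.e. about 6 hours

    # the busy HDD cluster is tracking toward ~50 days for 143M rows
    hdd_minutes = 50 * 24 * 60
    print("HDD vs SSD gap: ~%.0fx" % (hdd_minutes / ssd_minutes))
    # -> roughly a factor of 200, consistent with an iops-bound workload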
21:48:42 +1 for soft freeze
21:50:17 everyone is amazed at that plan's brilliance and has nothing to say, or everyone is amazed at its terribleness and is at a loss for words
21:50:22 And others who reviewed s3api and have had their comments addressed can also re-review
21:50:33 i don't have any objections to going soft-freeze when we have feature/deep-review. it should help the branch maintainer concentrate on fixing/discussing issues on the branch.
21:50:44 Seeing as they might also already be close to a +2,
21:50:58 But I'll try and get the review done today
21:51:01 Ish
21:52:19 still want to keep my eyes on (4), the progress on the big DB, tho.
21:52:35 yes, definitely
21:54:55 (I really am waiting on any comments)
21:55:50 kota_: +1, point 4 is important to us too
21:56:03 notmyname: so it sounds like a good plan.. preparing the patches and landing may depend on (4) a bit.. so the timeline might need to expand. But the plan is good
21:56:18 plan sounds good to me too
21:56:38 mattoliverau: well, yes, but I don't think we should wait for 50 days to see what happens before proposing something to master
21:56:46 oh no
21:56:52 I am tentatively planning on a small chain of patches, hoping it will help reviewers grok the aspects of the change
21:57:11 We should totally make the steps happen. And go make a shard-review branch
21:57:16 hard to know how that will work out until I start carving it up
21:57:19 acoles: how long will you need for that?
21:57:38 notmyname: hard to know how that will work out until I start carving it up ;)
21:57:42 ok
21:57:43 maybe a day?
21:58:22 ok, great. i'll propose the branch by today or tomorrow, so that should work well together
21:58:30 I'll most likely be derailed by tests breaking as the changes are separated
21:59:25 we're nearly at full time
21:59:36 let's start working together on this plan
21:59:59 I'll talk to tdasilva about getting another video chat next week to do a feature/deep overview
22:00:08 and that should also help with us reviewing
22:00:40 +1
22:00:46 and ask him to record it, to make sure people in different timezones can watch it and not in the middle of the night
22:01:06 nice
22:01:06 (assuming we can't schedule it at a good time)
22:01:53 Sounds good
22:01:58 thanks for working on all this. s3api is really great for our users, and feature/deep is probably the biggest/best/hardest big feature we've ever done
22:02:10 I'm excited to see them land
22:02:28 I'll put info about the video meeting in the -swift channel
22:02:34 thanks for coming today
22:02:38 #endmeeting