21:01:46 <notmyname> #startmeeting swift
21:01:47 <openstack> Meeting started Wed Apr 25 21:01:46 2018 UTC and is due to finish in 60 minutes.  The chair is notmyname. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:01:48 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:01:50 <openstack> The meeting name has been set to 'swift'
21:01:51 <notmyname> who's here for the swift team meeting?
21:01:53 <m_kazuhiro> o/
21:01:56 <acoles> hi
21:02:15 <kota_> hello
21:02:21 <rledisez> hi
21:02:40 <cschwede> hello!
21:02:42 <notmyname> Link https://wiki.openstack.org/wiki/Meetings/Swift
21:02:47 <notmyname> oh my! it's cschwede!
21:02:56 <notmyname> #link https://wiki.openstack.org/wiki/Meetings/Swift
21:03:01 <cschwede> i heard there's a swift gathering here
21:03:50 <notmyname> tdasilva told me he's not here
21:04:08 <notmyname> and torgomatic and clayg and timburke are in a meeting (that I left)
21:04:12 <notmyname> soo...
21:04:15 <notmyname> let's get started!
21:04:15 <acoles> cschwede: hi!
21:04:51 <notmyname> here's my weekly reminder that https://review.openstack.org/#/c/555245/ is still open and needs to land before the next release
21:04:51 <patchbot> patch 555245 - swift - Fix versioned writes error with url-encoded object...
21:05:12 <notmyname> it fixes https://bugs.launchpad.net/swift/+bug/1755554
21:05:13 <openstack> Launchpad bug 1755554 in OpenStack Object Storage (swift) "Percent signs in object names cause trouble for versioned_writes" [High,In progress] - Assigned to Kota Tsuyuzaki (tsuyuzaki-kota)
21:05:39 <notmyname> let's talk about https://review.openstack.org/#/c/562943/
21:05:40 <patchbot> patch 562943 - swift - Import swift3 into swift repo as s3api middleware
21:05:48 <notmyname> kota_: how is the patch looking this week?
21:06:06 <mattoliverau> o/ - sorry, it was a holiday in Oz yesterday so I got confused this morning
21:06:52 <notmyname> (did kota_ step away?)
21:07:12 <kota_> tdasilva, m_kazuhiro, and pete reviewed it
21:07:24 <kota_> notmyname: I'm here
21:07:29 <notmyname> that's good news
21:07:33 * cschwede is still reviewing it, but looks pretty good so far
21:07:35 <mattoliverau> And I'm in the middle of a review.. just haven't finished, I got busy
21:07:39 <notmyname> great
21:07:59 <kota_> then, yesterday, I acked all comments and fixed small things.
21:08:07 <notmyname> kota_: has anything been found that is bigger than expected?
21:08:25 <kota_> not so far imo
21:08:35 <notmyname> does it seem like the plan we talked about last week is still working? (having a smaller set of people review it and land when they say it's good)
21:08:48 <cschwede> notmyname: what was your schedule, ie required merge date?
21:09:01 <cschwede> i remember Thiago saying sth like today
21:09:07 <cschwede> or this week
21:09:11 <kota_> it seems tdasilva still wants to make it enabled by default, but I still hold my opinion that it shouldn't be yet because of a gate issue.
21:09:24 <notmyname> kota_: what's the gate issue?
21:09:35 <notmyname> cschwede: I don't have a required merge date for s3api, but ASAP. the main scheduling pressure is the upcoming feature/deep branch
21:09:36 <kota_> basically, devstack
21:09:51 <kota_> we have a devstack gate job to run functests
21:10:08 <kota_> but devstack builds a custom proxy-server.conf that includes swift3
21:10:33 <kota_> then, s3api on by default means two S3-compatible middlewares in the pipeline
21:10:51 <kota_> then, almost all of the functests will fail
21:10:57 <notmyname> ah
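For context, the collision kota_ describes would look roughly like this in the proxy pipeline devstack generates. This is a hedged sketch: the filter entry points for swift3 and the proposed s3api follow their sample configs, but the surrounding pipeline members and ordering are illustrative only.

    [pipeline:main]
    # devstack already inserts swift3 ahead of auth; if swift also shipped s3api
    # enabled in its default pipeline, both S3 translators would be present
    pipeline = catch_errors proxy-logging cache swift3 s3api tempauth proxy-logging proxy-server

    [filter:swift3]
    use = egg:swift3#swift3

    [filter:s3api]
    use = egg:swift#s3api

Stacking two S3-compatible middlewares in one pipeline like this is what kota_ expects to break most of the functests.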
21:10:58 <cschwede> kota_: can't we pin swift in devstack to a specific hash before the s3api merge, and then follow up, unpin it and at the same time disable the old middleware?
21:11:19 <cschwede> iirc that is possible
21:11:39 <kota_> so my plan is: 1. merge it to master, 2. fix devstack to use s3api, 3. then make s3api the default.
21:12:19 <notmyname> I agree that we should merge to master before turning it on by default. (but I support getting to an on-by-default asap)
21:12:25 <kota_> cschwede: ah... i think it's possible
21:13:21 <kota_> i mean, 'we can pin the version'.
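If we went the pinning route, a minimal sketch of what that could look like for a local devstack is below; it assumes devstack's standard per-project SWIFT_BRANCH override accepts a fixed ref, and the sha shown is just a placeholder (the gate would set the equivalent in its job definition rather than in local.conf):

    [[local|localrc]]
    # hypothetical pin: keep devstack on the last pre-s3api swift commit
    # until devstack's generated pipeline is switched over to s3api
    SWIFT_BRANCH=<sha-of-last-pre-s3api-commit>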
21:13:33 <notmyname> is swift3 added to the proxy pipeline for every devstack job?
21:13:45 <kota_> but still not sure it works well for... setting up the gate.
21:13:59 <kota_> 1 sec
21:14:34 <kota_> the related change is at https://review.openstack.org/#/c/548938/
21:14:34 <patchbot> patch 548938 - openstack-dev/devstack - Change devstack-swift to use keystone auth sample
21:15:39 <kota_> it doesn't solve the problem directly, but it attempts to stop devstack from building a handmade pipeline.
21:15:56 <kota_> it hasn't been finished yet tho.
21:16:36 <notmyname> oh ok. so it's a swift3-specific dsvm job
21:16:43 <notmyname> that adds swift3 to the pipeline
21:16:58 <notmyname> which means we cannot add that test to the s3api branch and have it test the s3api middleware
21:17:13 <kota_> yes, the is_service_enabled swift3 is true at L430
21:17:33 <notmyname> and we cannot turn on s3api middleware by default because *other* projects may have added swift3 to the pipeline and then will fail
21:17:53 <kota_> and the swift_pipeline is based on tempauth pipeline in proxy-server.conf
21:17:56 <notmyname> did I understand that correctly?
21:18:40 <kota_> yes, but devstack is the only such project I know of, tho.
21:19:08 <kota_> idk for any other projects.
21:19:21 <notmyname> hmm
21:20:00 <notmyname> it sounds like there's some more investigation needed as to what is breaking when. (or to rephrase: I don't completely understand the problem yet)
21:20:01 <kota_> oh, we can just ask cschwede how it is on TripleO
21:20:16 <notmyname> but it doesn't seem like this is blocking a merge to master, correct?
21:20:28 <cschwede> kota_: it's not enabled by default on OOO
21:20:34 <cschwede> not yet, that is :)
21:20:38 <kota_> as long as we can keep it optional by default.
21:20:56 <notmyname> the proposed code can land as-is. the proposed code is tested. and the proposed code doesn't break other projects
21:21:13 <kota_> notmyname: if we can keep it optional by default, it doesn't block getting it merged.
21:21:26 <notmyname> it is currently proposed as optional, right?
21:21:42 <kota_> i think so
21:22:19 <notmyname> ok
21:22:22 <kota_> oh... i missed your meaning.
21:22:30 <kota_> absolutely true.
21:22:38 <notmyname> ok
21:22:43 <kota_> the patch as currently proposed makes s3api optional
21:23:25 <notmyname> which means that it should be able to land on master as soon as it's reviewed.
21:24:37 <mattoliverau> Ok so definitely optional first; turning it on by default will be a follow-up after investigation and once it's on master
21:25:06 <notmyname> so if that's true, and it sounds like only small stuff has been found so far, then I hope that after mattoliverau and cschwede finish their reviews we will be able to merge s3api to master
21:25:42 <kota_> +1 :)
21:25:54 <notmyname> good to hear
21:26:16 <notmyname> ok, let's talk about torgomatic's consistency engine patches.
21:26:21 <notmyname> https://review.openstack.org/#/c/555563/ landed this week!
21:26:21 <patchbot> patch 555563 - swift - Multiprocess object replicator (MERGED)
21:26:23 <notmyname> thank you
21:26:26 <kota_> nice
21:26:47 <mattoliverau> \o/
21:26:58 <notmyname> that unblocks a few others that he'd been working on too that are based on that patch
21:27:25 <notmyname> rledisez: thanks for helping out there. I saw your name on an email about one of those, I think
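For anyone who wants to try the newly merged multiprocess replicator: my understanding is that it stays single-process unless you opt in via a worker-count option in object-server.conf, roughly as sketched here (option name as I recall it from the patch; the value 4 is just an example):

    [object-replicator]
    # 0 (the default) keeps the old single-process behaviour; a positive N
    # runs up to N replicator worker processes, with the local disks
    # partitioned across them
    replicator_workers = 4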
21:28:01 <notmyname> ok, let's talk about the really big one now
21:28:02 <notmyname> feature/deep
21:28:17 <notmyname> acoles: what's the good word?
21:28:58 <acoles> based on timburke's experimentation last week we made some significant changes to the replication strategy
21:29:30 <acoles> that is all now on feature/deep and Tim is continuing to trial sharding a monster db
21:29:48 <notmyname> and current progress on that big DB seems to be going well
21:29:50 <mattoliverau> Nice work
21:29:50 <notmyname> or at least going
21:29:51 <acoles> ...which is progressing :) ... but a little slowly :( so
21:29:57 <notmyname> one important... yeah. it's slow
21:30:14 <notmyname> but making progress
21:30:36 <acoles> so we're now trying to understand why it is so slow. It's on HDDs for a start.
21:30:58 <mattoliverau> We knew monster dbs would be a problem... But once it's on by default (in future) we only have to deal with the monsters initially
21:31:01 <notmyname> given the hardware that DB is deployed on, we are *not* going to wait for it to finish all the sharding work. at its current rate, it will take about 50 days to finish with 143M rows
21:31:32 <mattoliverau> Wow
21:31:53 <acoles> at the other end of the scale, a 1.2 million row db on my laptop with no async pendings shards in about 3 mins!
21:32:05 <notmyname> wooo!
21:32:34 <notmyname> here's the real question: when can we propose feature/deep to master
21:32:49 <notmyname> I think we should definitely create a new feature/deep-review
21:32:50 <mattoliverau> acoles: cool
21:33:10 <mattoliverau> Yup, it touches a bunch of places
21:33:21 <cschwede> acoles: any chance to test the 143m row db on an ssd node?
21:33:38 <notmyname> cschwede: unfortunately no.
21:34:06 <mattoliverau> What part seems to be slow? Scanning, cleaving (locally), replication?
21:34:38 <notmyname> acoles: do you know? timburke would be better able to answer that, I think
21:34:39 <acoles> scanning was less than an hour IIRC
21:34:45 <notmyname> yeah
21:34:46 <mattoliverau> Maybe we can profile it and see what area may need to be improved (for the future)
21:34:52 <cschwede> are we sure this scales linearly? ie 1.2 million = 3 minutes on ssd, 12 million at least in the range of a few hours max?
21:34:53 <rledisez> did you run iotop to check if an SSD would help?
21:35:16 <mattoliverau> Oh nice
21:35:23 <acoles> with the recent changes we have tried to stamp on replication to the db
21:35:32 <mattoliverau> (re: scanning time)
21:36:14 <acoles> cschwede: I'm going to make more tests on 2 million, 4 million etc
21:36:27 <notmyname> yeah... hang on. uploading a screenshot
21:36:55 <notmyname> rledisez: http://d.not.mn/node_stats_sharding.png
21:38:14 <notmyname> just really really busy drives. 50M async pendings in the cluster plus the sharding work
21:38:24 <notmyname> ie "just like a user would do"
21:38:52 <cschwede> well, on the other hand - 1.2 mil = 3 min, so 144 mil should be around 360 min = 6 hours, which means a factor of 200 between hdds and ssd. which sounds reasonable if only looking at iops?
21:39:56 <notmyname> assuming linear scaling etc etc etc
21:40:09 <cschwede> right
21:40:38 <notmyname> but we expect it to get faster over time. as shards are cleaved off, the workload should be distributed to other nodes. so it may be faster
21:40:45 <notmyname> but for now, it's slow
21:40:59 <notmyname> but it's making progress
21:41:08 <cschwede> i was just worried about the big gap between minutes and 50 days, but it might be totally reasonable on hdds
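Spelling out cschwede's back-of-the-envelope estimate (assuming linear scaling with row count, which we haven't verified):

    144,000,000 rows / 1,200,000 rows ≈ 120
    120 × 3 min ≈ 360 min ≈ 6 hours        (projected time on the SSD laptop)
    50 days / 6 hours ≈ 200                (HDD node vs SSD, roughly the iops gap)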
21:41:19 <notmyname> which again gets back to "when can we merge to master?"
21:41:33 * cschwede is curious about acoles tests with 2 and 4 million rows
21:41:50 <notmyname> I've been talking with timburke about making some more DBs about that size too
21:44:00 <notmyname> rledisez: http://d.not.mn/iotop.png
21:44:30 <acoles> notmyname: re proposing to master, obviously we need to get to the bottom of the speed issues, but at the same time I am conscious that there will be bugs that will only be found by review
21:44:42 * acoles admits he may write bugs, sometimes
21:45:18 <cschwede> impossible!
21:45:46 <acoles> cschwede: I do it *deliberately* of course :)
21:45:56 <mattoliverau> Lol
21:45:57 <notmyname> here's a proposal. please tell me where it's wrong or how to make it better
21:47:40 <notmyname> (1) mattoliverau and cschwede finish a review of s3api. assuming it's good, it lands on master within the next 24-26 hours. (2) I create the feature/deep-review branch upstream (3) acoles prepares the patch chain to propose there (4) we keep monitoring the progress on the big DB (5) when feature/deep-review has the patch chain proposed, we soft-freeze master and all review
21:48:42 <acoles> +1 for soft freeze
21:50:17 <notmyname> everyone is amazed at that plan's brilliance and has nothing to say, or everyone is amazed at its terribleness and is at a loss for words
21:50:22 <mattoliverau> And others who reviewed s3api and have their comments addressed can also re-review
21:50:33 <kota_> i don't have any objections to going soft-freeze once we have feature/deep-review. it should help the branch maintainer concentrate on fixing/discussing issues on the branch.
21:50:44 <mattoliverau> Seeing as they might also already be close to a +2,
21:50:58 <mattoliverau> But I'll try and get the review done today
21:51:01 <mattoliverau> Ish
21:52:19 <kota_> still want to keep my eyes on (4), the progress on the big DB, tho.
21:52:35 <notmyname> yes, definitely
21:54:55 <notmyname> (I really am waiting on any comments)
21:55:50 <acoles> kota_: +1 point 4 is important to us too
21:56:03 <mattoliverau> notmyname: so sounds like a good plan... preparing the patches and landing may depend on (4) a bit, so the timeline might need to expand. But the plan is good
21:56:18 <cschwede> plan sounds good to me too
21:56:38 <notmyname> mattoliverau: well, yes, but I don't think we should wait for 50 days to see what happens before proposing something to master
21:56:46 <mattoliverau> oh no
21:56:52 <acoles> I am tentatively planning on a small chain of patches, hoping it will help reviewers grok the aspects of the change
21:57:11 <mattoliverau> We should totally make the steps happen. And go make a shard-review branch
21:57:16 <acoles> hard to know how that will work out until I start carving it up
21:57:19 <notmyname> acoles: how long will you need for that?
21:57:38 <acoles> notmyname: hard to know how that will work out until I start carving it up ;)
21:57:42 <notmyname> ok
21:57:43 <acoles> maybe a day?
21:58:22 <notmyname> ok, great. i'll propose the branch by today or tomorrow, so that should work well together
21:58:30 <acoles> I'll most likely be derailed by tests breaking as the changes are separated
21:59:25 <notmyname> we're nearly at full time
21:59:36 <notmyname> let's start working together on this plan
21:59:59 <notmyname> I'll talk to tdasilva about getting another video chat next week to do a feature/deep overview
22:00:08 <notmyname> and that should also help with us reviewing
22:00:40 <kota_> +1
22:00:46 <notmyname> and ask him to record it so that people in different timezones can watch it without being up in the middle of the night
22:01:06 <kota_> nice
22:01:06 <notmyname> (assuming we can't schedule it at a good time)
22:01:53 <mattoliverau> Sounds good
22:01:58 <notmyname> thanks for working on all this. s3api is really great for our users and feature/deep is probably the biggest/best/hardest feature we've ever done
22:02:10 <notmyname> I'm excited to see them land
22:02:28 <notmyname> I'll put info about the video meeting in the -swift channel
22:02:34 <notmyname> thanks for coming today
22:02:38 <notmyname> #endmeeting