21:00:04 <notmyname> #startmeeting swift
21:00:04 <openstack> Meeting started Wed Jan 31 21:00:04 2018 UTC and is due to finish in 60 minutes.  The chair is notmyname. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:00:05 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:00:07 <openstack> The meeting name has been set to 'swift'
21:00:19 <notmyname> who's here for the swift team meeting?
21:00:22 <m_kazuhiro> o/
21:00:25 <mattoliverau> o/
21:00:46 <acoles> hello
21:01:05 <rledisez> hi
21:01:23 <mattoliverau> could be a nice small one (in terms of people).
21:01:30 <notmyname> seems like it
21:01:34 <mattoliverau> but the important people are here ;)
21:02:20 <rledisez> there are only important people around swift ;)
21:02:29 <notmyname> ok, not a whole lot on the agenda this week
21:02:36 <notmyname> #link https://wiki.openstack.org/wiki/Meetings/Swift
21:02:44 <notmyname> #topic releases
21:02:48 <notmyname> first up, releases
21:03:02 <notmyname> we've done a swiftclient release (it took a long time in the gate, but it's done now)
21:03:16 <notmyname> and swift itself is getting ready for a 2.17.0 release
21:03:21 <mattoliverau> \o/
21:03:45 <notmyname> I know I may have given the impression of doing the 2.17.0 release sooner, but I was waiting on a patch and I was traveling :-)
21:04:02 <mattoliverau> next swift client release will need to support different tempurl hashes, cause that just landed in swift
21:04:05 <notmyname> and we arent under time pressure (yet) for it. we were on a schedule for the client release, so i did that one first
21:04:29 <notmyname> mattoliverau: good point
21:04:39 <notmyname> yeah, so a few last-minute things landing in swift for 2.17.0
21:04:47 <notmyname> the multiple hashes for tempurls is cool
21:04:52 <mattoliverau> yeah, in no hurry.. we might be able to land some more things so notmyname has to update the changelog again :P
21:04:54 <notmyname> and the data segments in SLOs are really nice :-)
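For context, a rough sketch of what an SLO manifest using the new data segments might look like (illustrative only; the exact key names and size limits should be checked against the merged patch and docs):

    [
        {"path": "/segments/part-000", "etag": "d41d8cd98f00b204e9800998ecf8427e", "size_bytes": 1048576},
        {"data": "PGh0bWw+aGVhZGVyPC9odG1sPg=="},
        {"path": "/segments/part-001"}
    ]

The "data" entries carry small base64-encoded chunks inline in the manifest, so a client can splice literal bytes between referenced segments without having to upload tiny objects for them.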
21:05:17 <notmyname> yeah, I'll get the changelog updated today or tomorrow and do the tag when it lands
21:05:30 <timburke> thanks again mattoliverau and cschwede, for looking at the tempurl stuff! i suppose i ought to start on a similar patch for formpost...
21:05:40 <notmyname> looking ahead to the official queens release...
21:06:00 <notmyname> I do not know yet if we'll do another 2.18. or 2.17.1 release before the queens deadline
21:06:02 <mattoliverau> timburke: multiple hashes in the client first, so people can easily use it
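As a rough illustration of what the client-side support might look like once it lands, here is a minimal signing sketch assuming the tempurl middleware is configured to accept sha256 signatures; the host and key are placeholders, and how a non-default digest is indicated in temp_url_sig should be checked against the merged swift patch and the eventual swiftclient change:

    import hmac
    from hashlib import sha256
    from time import time

    key = 'secret-temp-url-key'        # the account's Temp-URL-Key value (placeholder)
    method = 'GET'
    expires = int(time() + 3600)
    path = '/v1/AUTH_test/container/object'

    # standard tempurl HMAC body: method, expiry and path separated by newlines
    hmac_body = '%s\n%s\n%s' % (method, expires, path)
    sig = hmac.new(key.encode(), hmac_body.encode(), sha256).hexdigest()

    url = 'https://swift.example.com%s?temp_url_sig=%s&temp_url_expires=%s' % (
        path, sig, expires)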
21:06:08 <notmyname> to some extent, it depends on what lands
21:06:21 <notmyname> if we do anything, I suspect it would only be a 2.17.1
21:06:34 <notmyname> IIRC the deadline for that release is only a couple of weeks away anyway
21:06:53 <mattoliverau> I added a basic part diff tool.. though maybe it isn't useful to people. It was fun to write tho :P
21:06:56 <notmyname> so maybe we'll just call this one an early queens release :-)
21:07:23 <notmyname> mattoliverau: what's that?
21:07:37 <mattoliverau> i pushed it last night, let me find it
21:07:51 <timburke> i saw something about that... need to find time to take a look...
21:08:15 <mattoliverau> #link https://review.openstack.org/#/c/539466/
21:08:16 <patchbot> patch 539466 - swift - Add a basic partition diffing tool
21:08:31 <notmyname> oh, cool
21:08:53 <mattoliverau> Just from a discussion on channel the other week. so you can compare builders and rings to see what's different in the replica2part2dev tables
21:09:06 <notmyname> yeah, that could be quite useful
21:09:21 <mattoliverau> tho I don't know if the verbose option is useful.. but is fun :P
21:09:36 <notmyname> when `swift-recon --md5` gives an error, this tool could tell you how bad it is
21:10:01 <mattoliverau> might need to add a device diff too. but you can just use ring builder for at least that list
21:10:12 <mattoliverau> *to the tool
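For readers who want the flavour of what such a comparison does, here is a small illustrative snippet (this is not the code from patch 539466; it uses the private _replica2part2dev_id attribute of Ring purely for demonstration):

    # Count (replica, partition) slots whose device assignment differs
    # between two versions of an object ring.
    from swift.common.ring import Ring

    old_ring = Ring('/etc/swift/object.ring.gz')
    new_ring = Ring('/etc/swift/object.ring.new.gz')

    moved = 0
    for old_row, new_row in zip(old_ring._replica2part2dev_id,
                                new_ring._replica2part2dev_id):
        for old_dev, new_dev in zip(old_row, new_row):
            if old_dev != new_dev:
                moved += 1
    print('%d replica/partition assignments differ' % moved)

At a part power of 24 with 3 replicas that is roughly 50 million slots to walk, which is what the "then you cry" comment below is about.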
21:10:17 <notmyname> any question on releases, current or upcoming?
21:10:38 <timburke> then you realize that you had a part power of 24. then you cry...
21:10:45 <timburke> ;-)
21:11:03 <mattoliverau> yup.. but it looks pretty
21:11:21 <mattoliverau> could strip -v out
21:11:35 <notmyname> between now and the queens cutoff date, if you see something that's a big bug, that should take priority over new features
21:11:51 <notmyname> let's move on to the PTG
21:11:51 <mattoliverau> +100
21:11:57 <notmyname> #topic PTG planning
21:12:08 <notmyname> I made a thing
21:12:10 <notmyname> #link https://etherpad.openstack.org/p/Dublin_PTG_Swift
21:12:18 <acoles> notmyname: is the checkpoint question still alive?
21:12:21 <notmyname> an etherpad for gathering topics
21:12:51 <notmyname> acoles: IMO yes it is, but I'd like to have torgomatic around. well, more people actually. and I think it's something we should talk about at the PTG
21:13:10 <notmyname> rephrased, I do NOT think it's something we should agree to or decide quickly
21:13:11 <acoles> notmyname: ok, makes sense
21:13:30 <notmyname> add it to the etherpad! :-)
21:13:50 <acoles> done
21:14:12 <mattoliverau> acoles: I'm looking forward to being able to talk sharding at PTG. Sorry I've been a bit in and out regarding reviewing and working on it upstream over the last few weeks.
21:14:48 <acoles> mattoliverau: NP, I'm also looking forward to a face to face session on it!
21:16:13 <notmyname> the PTG is in Dublin Ireland starting on monday february 26
21:16:33 <notmyname> we'll have a room for wed-fri, but I'm sure we'll find a way to talk on monday and tuesday, too
21:16:52 <notmyname> and it's likely we may also try to find some time for some off-site social activities too
21:17:19 <mattoliverau> And I found pretty direct flights, so it'll only be 26 hours of transit (instead of 30+) so I'm happy :)
21:17:32 <notmyname> if you find someone who's going to the PTG, please encourage them to update the etherpad
21:18:11 <notmyname> here's all the links to etherpads for the whole event
21:18:12 <notmyname> #link https://wiki.openstack.org/wiki/PTG/Queens/Etherpads
21:18:44 <mattoliverau> we're using a different naming pattern.
21:19:07 <notmyname> that's because i created the etherpad before I added a link to it on the wiki page
21:19:26 <mattoliverau> oh it's fine, just my OCD kicking in
21:19:44 <timburke> https://wiki.openstack.org/wiki/PTG/Rocky/Etherpads you mean, right?
21:19:47 <notmyname> hang on. no we arent
21:20:03 <notmyname> oh yeah
21:20:09 <notmyname> r comes after q
21:20:16 <notmyname> #undo
21:20:17 <openstack> Removing item from minutes: #link https://wiki.openstack.org/wiki/PTG/Rocky/Etherpads
21:20:20 <notmyname> #link https://wiki.openstack.org/wiki/PTG/Rocky/Etherpads
21:20:23 <mattoliverau> yeah, https://wiki.openstack.org/wiki/PTG/Rocky/Etherpads (i've had that open so didn't follow the irc link)
21:21:07 <notmyname> timburke: thanks
21:21:18 <notmyname> any other questions about the PTG? everyone ok for now?
21:21:26 * tdasilva snicks in...
21:22:02 <notmyname> WELCOME tdasilva!!!!!!!
21:22:09 <mattoliverau> any questions tdasilva? now that you're here :P
21:22:11 <notmyname> (you can't "snick" in ;-) )
21:23:03 <notmyname> #topic task queue: upgrade impact
21:23:12 <notmyname> m_kazuhiro: this is a topic you added to the agenda
21:23:14 <notmyname> #link https://etherpad.openstack.org/p/swift_general_task_queue
21:23:22 <notmyname> m_kazuhiro: the floor is yours
21:23:30 <m_kazuhiro> notmyname: thank you
21:23:49 <m_kazuhiro> First, I will explain the background.
21:25:00 <m_kazuhiro> I'm implementing patch to update object-expirer now. https://review.openstack.org/#/c/517389/
21:25:01 <patchbot> patch 517389 - swift - Update object-expirer to use general task queue sy...
21:25:29 <m_kazuhiro> This patch makes the expirer use the general task queue.
21:25:46 <m_kazuhiro> With the general task queue feature, many object-expirers run on a swift cluster. An object-expirer runs on every object-server.
21:26:12 <m_kazuhiro> The expiration tasks are assigned to object-expirers according to the partition numbers their object-servers have.
21:26:30 <m_kazuhiro> To get the partition numbers of the local object-server, the object-expirer requires the ip and port information of that object-server. Therefore, in the patch, the object-expirer's config is moved from the special conf file 'object-expirer.conf' to a section in object-server.conf.
21:26:57 <m_kazuhiro> Hidden task account / container names are changed from the original task account / container (hereinafter we call it "legacy" style). In legacy style, there is only one task account / container for a swift cluster. With the general task queue feature, there will be many task accounts / containers.
21:27:19 <m_kazuhiro> To make expirers compatible with legacy style tasks, we can set an "execute_legacy_task" flag in the [object-expirer] section of object-server.conf. If the value is True, the object-expirer will execute legacy style tasks. So we can choose which object-expirers run legacy style tasks on their object-servers.
21:27:49 <m_kazuhiro> After the patch, no legacy style expiration tasks are created. Expiration tasks are created only in the general task queue.
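To make the upgrade impact concrete, a hedged sketch of the configuration move being described; option names other than execute_legacy_task, processes and process are shown for illustration, so see the patch for the real section contents:

    # before: standalone /etc/swift/object-expirer.conf
    [object-expirer]
    interval = 300
    processes = 3
    process = 0

    # after: an [object-expirer] section inside /etc/swift/object-server.conf
    # on every object server
    [object-expirer]
    interval = 300
    # opt this node in to also draining the old single-account ("legacy") queue
    execute_legacy_task = true
    # legacy-only coordination settings, same meaning as before
    processes = 3
    process = 0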
21:28:38 <m_kazuhiro> About this patch, the discussion points for now are...
21:29:33 <m_kazuhiro> 1: If swift operators ran object-expirers (for legacy style tasks) NOT on object-servers before the patch, they need to redesign where object-expirers for legacy style tasks should run. This is an impact for operators.
21:29:49 <m_kazuhiro> 2: If swift operators forget to update object-server's [object-expirer] section, there will be an error message 'Unable to find object-expirer config section in object-server.conf'. Then no object-expirers will run. Can we accept that behavior?
21:31:07 <m_kazuhiro> I want to discuss above points.
21:31:16 <mattoliverau> thanks m_kazuhiro
21:31:31 <notmyname> yes, thank you
21:31:38 <mattoliverau> So in the etherpad you talk about 2 choices.
21:31:54 <notmyname> so it's about what happens in a cluster that is using expiring objects after they upgrade to a release that has a task queue
21:32:22 <mattoliverau> make people migrate to the new expirer configuration, or run 2 sets of expirers (legacy and task queue), the former looking at the old config.
21:32:39 <kota_> morning
21:32:47 <kota_> sorry slept too much
21:32:53 <notmyname> kota_: no worries
21:33:20 <mattoliverau> Having 2 different expirers seems more confusing than just fixing it up on upgrade, to me.
21:33:54 <notmyname> but on the other hand "run this legacy one until the queue is gone" does sound simple
21:34:00 <mattoliverau> I mean what do we call them, and if people are using automated swift-init or their own systemd init scripts then they may all need to be renamed
21:34:06 <mattoliverau> yeah
21:34:11 <notmyname> I suppose the queue may never be gone, if someone has a 7 year expiry or something, though
21:34:28 <mattoliverau> So it really depends on where you're running expirers
21:35:08 <mattoliverau> it's pretty smooth if it's already on object servers. But obviously more effort for people running them elsewhere (like rledisez I believe)
21:35:37 <mattoliverau> I put this in the etherpad, while brain storming the steps involved (I might have missed some?)
21:35:57 <mattoliverau> - Move or create an '[object-expirer]' section in the object-server configuration (on all object server nodes)
21:35:57 <mattoliverau> - drop in a '/etc/swift/internal-client.conf' if one doesn't exist, or define an internal client location with 'internal_client_conf_path'.
21:35:57 <mattoliverau> - If you need legacy expirers (not green field and want to clean up old legacy locations):
21:35:57 <mattoliverau>   - if old expirers were on some of the object servers:
21:35:57 <mattoliverau>     - add 'execute_legacy_task = true' to only them so they will still work and take advantage of the old existing 'processes' and 'process' settings.
21:35:58 <mattoliverau>   - else:
21:36:00 <mattoliverau>     - pick the same number of expirers to do legacy work. Then you can use the same 'processes' and 'process' settings. Or just pick 1 to do legacy work and don't define 'processes' and 'process'.
21:36:02 <mattoliverau> - Optionally, if you wish to choose a different task queue account prefix or expirer task container prefix, do so now:
21:36:04 <mattoliverau>   task_account_prefix = ?  (default 'task')
21:36:06 <mattoliverau>   expirer_task_container_prefix = ? (default 'expirer')
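A hedged sketch of what the relevant part of object-server.conf might look like after following the steps above; the option names are taken from the steps themselves, and the defaults and exact spellings should be confirmed against the patch:

    [object-expirer]
    # point the expirer at an internal client config, or drop one at the default path
    internal_client_conf_path = /etc/swift/internal-client.conf
    # optional: change the hidden task account / container naming
    task_account_prefix = task
    expirer_task_container_prefix = expirer
    # only on the nodes chosen to drain the legacy queue
    execute_legacy_task = true
    processes = 2
    process = 0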
21:36:29 <acoles> is it ok for more than one new style expirer to have execute_legacy_task=true?
21:36:37 <mattoliverau> ok that didn't paste the best.. go look in the etherpad :P
21:37:03 <mattoliverau> yup, but if you do, you need to use the old legacy processes and process
21:37:04 <acoles> I mean, per server
21:37:13 <rledisez> acoles: i should be, I guess it's then needed to set the processes and process
21:37:20 <rledisez> *it
21:37:24 <mattoliverau> otherwise we could DoS ourselves (all hitting the legacy namespace at once)
21:38:27 <mattoliverau> tho the good news is, even if you decided to only have 1 in legacy mode, we aren't adding to the legacy queue so it'll eventually get through it all
21:38:54 <timburke> "eventually"
21:39:04 <mattoliverau> lol
21:39:06 <mattoliverau> yup
21:39:38 <mattoliverau> which is why processes and process haven't disappeared.. they're just legacy-only options now.
21:40:02 <rledisez> could we decide to have a fixed number of legacy processes (eg: 32)? if there are not 32 object servers, then the values for process/processes would adapt automatically
21:40:35 <rledisez> it would avoid overloading the legacy account/containers, while still having some parallelization
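A quick sketch of that idea, purely illustrative (nothing like this exists in the patch; the helper and its inputs are hypothetical):

    # Cap legacy-queue parallelism at a fixed worker count and derive each
    # node's legacy process/processes values automatically, instead of
    # hand-maintaining them in every object-server.conf.
    MAX_LEGACY_WORKERS = 32

    def legacy_settings(node_index, num_object_servers):
        processes = min(num_object_servers, MAX_LEGACY_WORKERS)
        if node_index >= processes:
            return None        # this node skips the legacy queue entirely
        return {'processes': processes, 'process': node_index}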
21:41:06 <timburke> maybe legacy processes could process *all* queue entries... if we haven't hit the expiration yet, add an entry to the *new* queue and pop from the old...
21:41:21 <acoles> anyone who ran an expirer before is going to want a legacy mode, correct? and potentially forever?
21:41:34 <rledisez> yes
21:41:41 <timburke> i hate that "potentially forever" part...
21:42:00 <acoles> timburke: right, I wondered if legacy tasks could be migrated somehow
21:42:37 <acoles> but it seems like having legacy mode automatically enabled would be nice
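A sketch of how that migration idea could work; the entry object, the queue objects and expire_object are hypothetical stand-ins for the real expirer internals, not actual swift APIs. The legacy expirer walks every entry in the old queue, executes the ones that are already due, and re-enqueues the not-yet-due ones into the new per-node task queue, so the legacy account drains even with far-future expirations:

    def handle_legacy_entry(entry, now, new_queue, legacy_queue, expire_object):
        # entry, new_queue, legacy_queue and expire_object are placeholders
        if entry.expire_at <= now:
            expire_object(entry)          # already due: expire as today
        else:
            new_queue.enqueue(entry)      # not yet due: move to the new queue
        legacy_queue.delete(entry)        # pop from the legacy queue either way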
21:43:31 <mattoliverau> but controlling some election of nodes to handle legacy starts getting complicated. Because we don't want too many to hit the same legacy namespace.
21:44:01 <mattoliverau> we could say replica 0 of each.. but at 24 part power that's still a lot of servers hitting
21:44:14 <notmyname> so how should we move forward with this? keep discussing in here? discuss in irc in -swift? keep it in the etherpad?
21:44:37 <mattoliverau> well firstly..
21:45:13 <mattoliverau> if legacy might be around "potentially" forever, then in the case of this discussion, I think we want option 2.
21:45:24 <mattoliverau> only 1 set of expirers (the new ones).
21:45:30 <mattoliverau> that need to handle legacy
21:45:58 <mattoliverau> otherwise we have 2 different expirer daemons and 2 sets of configs
21:46:17 <kota_> too bad :'(
21:46:30 <mattoliverau> I'd rather 1 and either configure them in the why people used to for legacy or some automatic way.
21:46:37 <mattoliverau> *way
21:47:48 <mattoliverau> It sounds like that's where the discussion was going.. so we might be able to progress the question at hand
21:48:15 <mattoliverau> but happy to take it offline, into etherpad, and definitely discussions at PTG :)
21:48:32 <rledisez> option 1 is my choice, and we will always have code related to legacy (either the legacy expirer or conversion code, because we will never know if everything is converted in all clusters over the world)
21:48:55 <notmyname> enter the checkpoint release conversation... :-)
21:49:00 <rledisez> :)
21:49:11 <mattoliverau> lol
21:49:34 <kota_> lol
21:49:36 <notmyname> ok, so mattoliverau and rledisez both say option 1 for now, based in part on the concerns from timburke and acoles. so that sounds like a good plan for now
21:49:37 <mattoliverau> well, offline/etherpad it is ;)
21:50:01 <notmyname> and we can readdress it at the PTL? (and of course in the etherpad or IRC before then)
21:50:06 <notmyname> *PTG
21:51:01 <m_kazuhiro> mattoliverau: Thank you for leading the discussion.
21:51:03 <acoles> I need to digest this some more, and we should definitely give it time at PTG if not before
21:51:12 <notmyname> I agree
21:51:28 <kota_> acoles: +1
21:51:37 <notmyname> #topic open discussion
21:51:49 <notmyname> is there more to bring up this week in the meeting? anyone else have something?
21:51:56 * acoles gets nervous about upgrades that break something that worked before
21:52:38 <mattoliverau> acoles: but it might work much better
21:53:09 <acoles> mattoliverau: yes, it's the getting to 'better' that worries me :)
21:53:18 <mattoliverau> :)
21:53:28 <notmyname> ok, nobody's jumping in with anything, so I think the meeting is done :-)
21:53:34 <notmyname> thanks for coming, everyone
21:53:36 <mattoliverau> only my random lunchtime hack part diff tool. but we briefly talked about it before
21:53:43 <notmyname> thank you for your contributions to swift
21:53:49 <notmyname> #endmeeting