Wednesday, 2020-09-30

21:00 <timburke> #startmeeting swift
21:00 <openstack> Meeting started Wed Sep 30 21:00:18 2020 UTC and is due to finish in 60 minutes.  The chair is timburke. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:00 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:00 *** openstack changes topic to " (Meeting topic: swift)"
21:00 <openstack> The meeting name has been set to 'swift'
21:00 <timburke> who's here for the swift meeting?
21:00 <seongsoocho> o/
21:00 <mattoliverau> o/
21:01 <alecuyer> o/
21:01 <rledisez> o/
21:01 <kota_> hi
21:02 <timburke> not a whole lot on the agenda
21:02 <timburke> #link https://wiki.openstack.org/wiki/Meetings/Swift
21:02 <timburke> #topic ptg
21:02 *** openstack changes topic to "ptg (Meeting topic: swift)"
21:02 <timburke> we're less than a month away!
21:03 <timburke> planning etherpad is at
21:03 <timburke> #link https://etherpad.opendev.org/p/swift-ptg-wallaby
21:03 <timburke> i'll plan on an ops-feedback-style session for one of our slots, and make another etherpad just for that
21:04 <timburke> moving on
21:04 <clayg> o/
21:05 <timburke> #topic replication lock
21:05 *** openstack changes topic to "replication lock (Meeting topic: swift)"
21:05 <timburke> #link https://review.opendev.org/#/c/754242/
21:05 <patchbot> patch 754242 - swift - Fix a race condition in case of cross-replication - 4 patch sets
21:05 <timburke> clayg, i think you added this, right?
21:05 <clayg> i forget why I wanted to talk about this - rledisez has been pushing up new patchsets daily and it just keeps getting better!
21:05 <clayg> rledisez: do you want some help with an ssync_receiver test?  I'm not sure I can write a probe test 🤔
21:06 <rledisez> I still have one (major) point to discuss. this patchset only fixes the issue for SSYNC. But rsync is (must be) impacted too
21:07 <rledisez> and yeah, I don't know how to write a probe test for this
21:08 <rledisez> I don't know how to fix rsync actually
21:08 <clayg> meh rsync
21:09 <timburke> part of me wants the excuse to change our "saio" to a "saif" or so...
21:09 <clayg> swift-all-in-fhawhatnow?
21:10 <timburke> four! so like, `vagrant up` and you get four vms, each independently configurable
21:10 <clayg> rledisez: maybe this is just another reason to get off rsync and start working more on tsync - do you think we could "catch" rsync doing this?  how did you find it with ssync?
21:10 <timburke> or i could learn the dockers 🤔
21:11 <rledisez> clayg: we wrote a tool to scan and discover disappearing fragments in near real time, and we started a rebalance. And we saw it happening
21:11 <clayg> I think we can say "closes-bug" with a caveat that it still exists in rsync and has forever, I think?  not being in the datapath takes so many good options off the table.
21:11 <rledisez> we have a cluster where it is (too) easily reproducible :(
21:12 <clayg> 😢
21:12 <timburke> that's a damn cool tool
21:12 <mattoliverau> could we add the current ring version in the REPLICATE verb call, and if I receive one newer I look for a new ring?
21:13 <clayg> so it's kinda like a dispersion report sort of thing?  like it checks on individual frags?
21:13 <mattoliverau> that plus a lock, cause there would still be overlap when things can happen
21:13 <clayg> mattoliverau: so the tricky thing with update_deleted is it doesn't check with the remote node via REPLICATE (hash recalc) before pushing its local data over rsync
21:13 <rledisez> alecuyer is the one writing it. it scans disks and pushes all the files into a single DB. then a tool scans the DB and lists objects that don't have the required number of frags or replicas.
21:14 <clayg> it just says "this isn't MY part; rsync away - 1, 2, 3 - everyone good?  BELETED"
21:14 <rledisez> mattoliverau: the object-server does not have the ring loaded so it can't know the ring version.
21:15 <rledisez> + what clayg just said
21:15 <mattoliverau> oh yes, the whole no-ring-in-the-storage-daemon thing.
21:16 <timburke> (not having the ring loaded isn't the end of the world; especially if we only care about some of the json metadata at the front and not the full set of assignments -- it's something solvable)
21:16 <clayg> I mean, there's some version of "load the ring" that could just pull the metadata like the ring version. I think it'd be useful to pursue, but the lock is better
21:17 <clayg> it's kinda weird that the replicator is so quick to pick up and notice a new part but so slow to realize it has new rings...
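
(A quick sketch of the "just pull the metadata" idea: ring files are gzipped blobs that begin with a small JSON header ahead of the packed assignment arrays, so a storage daemon could peek at the header without loading the whole ring. This assumes the v1 on-disk layout — magic, 2-byte format version, 4-byte JSON length, JSON blob — and the 'version' key the REPLICATE check would compare is hypothetical.)

    import gzip
    import json
    import struct

    def peek_ring_metadata(path):
        # Read only the JSON header of a ring file, skipping the
        # (potentially large) partition assignment arrays entirely.
        with gzip.open(path, 'rb') as fp:
            if fp.read(4) != b'R1NG':
                raise ValueError('%s is not a ring file' % path)
            fp.read(2)  # 2-byte format version, skipped here
            json_len, = struct.unpack('!I', fp.read(4))
            return json.loads(fp.read(json_len))

    # e.g. peek_ring_metadata('/etc/swift/object.ring.gz').get('version')
    # -- assuming rings grow an explicit version stamp to compare.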
21:18 <rledisez> to be clear, it happens because our ring deployment takes about 30 minutes. if the ring deployment was synchronised, we might never have hit that bug
21:18 <clayg> everything is probably worse with EC than replicated - just in that deleting a copy doesn't trigger a rebuild
21:18 <timburke> i think the trouble is that it may *not* have the new ring yet -- yeah, what rledisez said
21:19 <clayg> so someone pushed parts to it, and it doesn't believe they belong there... 🤔 lots more options with ssync than rsync
21:20 <clayg> anyway, I think rsync can suck a lemon and fixing ssync is going to be great!
21:20 <timburke> and much more costly to be wrong with ssync than rsync. i agree; we ought to fix ssync and footnote rsync
21:20 <clayg> I'm super open to saying we want to deprecate rsync replication too, like you can't complain about bugs in deprecated deployment options, right?
21:20 <clayg> timburke gets it
21:22 <rledisez> tbh, i'm not comfortable keeping rsync in the code if the bug is not fixed. it's pretty bad, it bit us pretty hard… so i can't imagine that kind of bug staying unfixed if we know it's there
21:23 <timburke> it's not that we *shouldn't* fix it, it's that we should land what fixes we have in hand, and shore up the rest when we can
21:24 <clayg> well we can't pull it out without deprecating - but it could be a strong warning - also I might be biased cause I'm not sure how to fix it... so maybe we can just keep thinking on that harder after we fix ssync (no reason to wait on fixing that, right?)
21:25 <clayg> rledisez: it's not a new bug - there may be something about rsync with replicated data that makes it less of a problem, for some reason we don't entirely understand yet 🤔
21:25 <rledisez> clayg: agree with that. we can fix ssync quickly and think about rsync at the same time
21:26 <timburke> so, wanting to work toward that rsync fix: what about a pre-rsync REPLICATE that grabs the partition lock with some (configurable) lease time and a post-rsync REPLICATE to release it again?
21:26 <clayg> timburke: 🤯
21:27 <rledisez> it's gonna be tricky to keep the lock without an active connection, but it's something we can try
21:27 <clayg> i'd hate to add additional suffix hashing that's not necessary - so it'd probably look more like a new "fool, i'm reverting this to you - don't delete it for X hours" kind of pre-flight message
21:27 <rledisez> i was thinking of an SSYNC connection that just "wraps" the rsync run
21:28 <rledisez> and keeping it active by sending "noop" actions
21:28 <clayg> rledisez: that sounds not terrible!
21:29 <timburke> either of those could work, too :-)
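
(For illustration, a minimal sketch of the lease idea: the receiver marks a partition as "leased" for a bounded time when the pre-rsync REPLICATE arrives, and the post-rsync REPLICATE releases it early. Everything here — the file name, layout, and API — is hypothetical, not what patch 754242 implements.)

    import os
    import time

    class PartitionLease(object):
        def __init__(self, part_path, lease_timeout=3600):
            # One lease file per partition; a bounded timeout means a
            # crashed sender can't wedge deletes on the receiver forever.
            self.lock_path = os.path.join(part_path, '.lease')
            self.lease_timeout = lease_timeout

        def acquire(self):
            # Record when the lease expires; stale files are just ignored.
            with open(self.lock_path, 'w') as f:
                f.write(str(time.time() + self.lease_timeout))

        def is_held(self):
            try:
                with open(self.lock_path) as f:
                    return float(f.read()) > time.time()
            except (IOError, ValueError):
                return False

        def release(self):
            try:
                os.unlink(self.lock_path)
            except OSError:
                pass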
21:29 <clayg> look at us acting like we know how to build software - thanks for finding the bug @rledisez !!!
21:29 <mattoliverau> +1
21:30 <kota_> +1
21:31 <rledisez> i'll finish the ssync patch and i'll do some tests on rsync. any idea how to test that in an automated way?
21:32 <timburke> i'll start trying to get a multi-vm dev setup going with an explicit goal of repro-ing this and let you know how it goes
21:32 <mattoliverau> I guess we need to have different rings on different "nodes", so maybe a ring override folder for the saio
21:32 <clayg> rledisez: i wouldn't block anything on a probe test - there are a lot of new requirements (timing, ring changes)
21:33 <rledisez> ok
21:33 <clayg> rledisez: there's a unittest that already has an ssync_receiver object server and reconstructor daemon - I'd start there
21:34 <timburke> anything else to bring up on this?
21:34 <rledisez> not on my side
21:34 <timburke> #topic async SLO segment deletion
21:35 *** openstack changes topic to "async SLO segment deletion (Meeting topic: swift)"
21:35 <timburke> #link https://review.opendev.org/#/c/733026/
21:35 <patchbot> patch 733026 - swift - Add a new URL parameter to allow for async cleanup... - 11 patch sets
21:36 <timburke> so clay pointed out that we should bring this up with a broader audience before merging as-is
21:36 <clayg> well, but also you already added the operator config value
21:37 <clayg> what do folks think about having users dump a lot of "to be expired *rfn*" objects into your .expiring_objects queue
21:37 <clayg> does anyone monitor that shit?
21:37 <timburke> still, seems worth asking whether i got the default for the config option right ;-)
21:38 <rledisez> i'd like to monitor it, but I never took the time to write that patch for the expirer that would send a metric of the queue size
21:38 <mattoliverau> that's what the .expiring_objects queue is for. async delete. it's a good idea.
21:38 <rledisez> we sometimes get late on deletions because of a burst, so it's useful to know when something is going wrong
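
(For context on what lands in that queue: expirer tasks are zero-byte objects in the hidden .expiring_objects account, bucketed into containers by rounded timestamp. Roughly, with the default divisor:)

    def expirer_task(delete_at, acct, cont, obj, divisor=86400):
        # One container per time bucket, one task object per delete;
        # the task name encodes when and what to delete.
        task_container = str(int(delete_at) // divisor * divisor)
        task_object = '%d-%s/%s/%s' % (int(delete_at), acct, cont, obj)
        return task_container, task_object

    # expirer_task(1601500000, 'AUTH_test', 'segments', 'slo/part1')
    # -> ('1601424000', '1601500000-AUTH_test/segments/slo/part1')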
21:39 <mattoliverau> maybe some tooling around viewing and maintaining the queue would be useful though
21:39 <clayg> rledisez: something like the "lag" metric for the container-updater?  I never quite understood how that metric works in a time series visualization...
21:39 <clayg> ... although I *do* have some better tools for development with metrics - maybe I just need to look at it again!
21:40 <clayg> mattoliverau: all I have is https://gist.github.com/clayg/7f66eab2a61c77869e1e84ac4ed6f1df (run it on any node every N minutes and plot the numbers over time)
21:41 <mattoliverau> I always thought expirer queue deletion was a soft schedule: it shouldn't be deleted until at least that time.
21:41 <clayg> stale entries should be zero-ish, but any time someone does an SLO delete it would spike, and depending on how aggressively your object-expirers are configured...
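
(Along the lines of clayg's gist, a minimal sketch of counting overdue queue entries by listing the hidden account through an internal client — assuming a typical internal-client config path; simple, though as noted below, listing the whole queue doesn't scale to huge backlogs:)

    import time
    from swift.common.internal_client import InternalClient

    def count_queue_entries(conf='/etc/swift/internal-client.conf'):
        swift = InternalClient(conf, 'expirer-monitor', 3)
        now, total, stale = time.time(), 0, 0
        for container in swift.iter_containers('.expiring_objects'):
            for obj in swift.iter_objects('.expiring_objects',
                                          container['name']):
                total += 1
                # Task names start with the scheduled delete time.
                if int(obj['name'].split('-', 1)[0]) <= now:
                    stale += 1
        return total, stale  # plot these over time, per the gist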
21:41 <timburke> speaking of lag metrics, there's https://review.opendev.org/#/c/735271/ ...
21:41 <patchbot> patch 735271 - swift - metrics: Add lag metric to expirer - 1 patch set
21:42 <clayg> timburke: yeah!  i'll take a look at that one again, maybe that's just what we need (or a version of that which is slo-async-delete aware)
21:42 <mattoliverau> clayg: cool, that's something, tracking the size is great :)
21:44 <clayg> i'm not even sure I could easily get all those lag metrics to even pop up in a dev env
21:44 <timburke> clayg, good point -- it should probably emit different metrics for expired vs. async deleted....
21:45 <clayg> but just poking at the queue listings doesn't scale horizontally, one of the container walkers could dump stats as it goes and then you just ... divide by replica count?  🤮
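
(The "lag" in question could be as simple as the age of the oldest overdue task — a hypothetical definition, not necessarily what patch 735271 emits:)

    def expirer_lag(task_timestamps, now):
        # How far behind schedule is the expirer? Zero when caught up.
        overdue = [t for t in task_timestamps if t <= now]
        return (now - min(overdue)) if overdue else 0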
claygok - so "monitor expiring queue" sounds like something everyone is keen on, but no objections to extending SLO delete in this fashion?21:47
timburkedoes anyone feel uncomfortable having the async-delete behavior as the default?21:48
clayghaving the objects in the segment container listing until their cleared seems reasonable - and it doesn't hurt anything if the client ends up deleting them on their own21:48
mattoliverauno I think it's a great idea because it means we use an existing mechanism that is ment to do it. I'll review the patch today21:48
claygonce the async delete successfully queues everything - the manifest is gone21:49
mattoliverautimburke: I'm warming to the idea of it being the default21:49
mattoliverauWill think about it while I review the patch.21:50
timburkecool, sounds like we're good to go then, thanks. between mattoliverau, clayg, and zaitcev, i'm sure this is gonna turn out great :-)21:50
timburke#topic open discussion21:50
*** openstack changes topic to "open discussion (Meeting topic: swift)"21:51
timburkelast few minutes: anything else to bring up?21:51
timburkei updated https://review.opendev.org/#/c/738959/ to have LIBERASURECODE_WRITE_LEGACY_CRC unset, set to "", and set to "0" all mean the same thing. i realized the commit message needs some clean up, but otherwise i think it's good to go21:53
patchbotpatch 738959 - liberasurecode - Be willing to write fragments with legacy crc - 3 patch sets21:53
clayg๐Ÿฅณ21:53
kota_ok, thx. I'll circle back to the patch around Friday.21:53
timburkethanks kota_!21:53
timburkethe swift side at https://review.opendev.org/#/c/739164/ should maybe be updated to set LIBERASURECODE_WRITE_LEGACY_CRC=1 instead of true, but that's pretty minor21:54
patchbotpatch 739164 - swift - ec: Add an option to write fragments with legacy crc - 2 patch sets21:54
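
(The agreed semantics, restated in Python terms just for clarity — the actual change is in liberasurecode's C: unset, empty, and "0" all disable the legacy CRC, anything else enables it:)

    import os

    def write_legacy_crc():
        value = os.environ.get('LIBERASURECODE_WRITE_LEGACY_CRC')
        return value not in (None, '', '0')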
21:55 <clayg> I was playing with p 749400 trying to preserve existing behavior - but it got wonky when I put p 749401 on top of it 😞
21:55 <patchbot> https://review.opendev.org/#/c/749400/ - swift - proxy: Put storage policy index in object responses - 3 patch sets
21:55 <patchbot> https://review.opendev.org/#/c/749401/ - swift - s3api: Ensure backend headers make it through s3api - 3 patch sets
21:57 <clayg> existing behavior for symlinks relies on the client request headers getting an x-backend-storage-policy-index - and the s3api bug is that proxy-logging doesn't see those swift_req headers
21:57 <timburke> yeah, i'm getting less and less certain that we want p 749400 :-/
21:57 <patchbot> https://review.opendev.org/#/c/749400/ - swift - proxy: Put storage policy index in object responses - 3 patch sets
21:57 <clayg> timburke: I tried changing it to x-backend-proxy-policy-index; but that was mostly because all the resp.headers.pop were making me nervous (i thought we'd leave 'em just in case)
21:58 <timburke> it *seems like* the sort of thing that's reasonable... until you start actually seeing what it does to logs :-(
21:58 <clayg> but it doesn't really matter - because if we're using the response headers it's whatever request was last - so the swift api would prefer the client req header, and s3api wouldn't have that - so it'd use the resp header; it just sucked
21:59 <clayg> ah yeah - so for *me* the bug is just s3api metrics not having the policy index, since we started actually logging the s3api requests (instead of just their underlying swift requests)
22:00 <clayg> putting the policy index in the response headers may not be the way to go - i'll try to refocus on just the s3api bug
22:00 <timburke> i think we end up needing to make sure that s3api copies some (request!) headers back into the provided env from a subrequest env -- yuk
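
(A sketch of that copy-back, since WSGI spells request headers as HTTP_* environ keys; the helper name and the header list here are hypothetical:)

    BACKEND_KEYS = ('HTTP_X_BACKEND_STORAGE_POLICY_INDEX',)

    def copy_subrequest_headers(sub_env, outer_env):
        # Propagate backend request headers from the s3api subrequest
        # environ back out so proxy-logging can see them.
        for key in BACKEND_KEYS:
            if key in sub_env:
                outer_env.setdefault(key, sub_env[key])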
22:00 <timburke> oh! one last thing: the election nomination period is over -- we should all read over TC candidacy messages so we're ready for the election
22:00 <timburke> #link http://lists.openstack.org/pipermail/openstack-discuss/2020-September/017668.html
22:01 <timburke> and #link https://governance.openstack.org/election/#wallaby-tc-candidates
22:01 <clayg> and i'm also struggling with making SLO responses that fail mid-transfer show up as 503 in logs p 752770 but that is ALSO a request/resp/environ/context mess 😡
22:01 <patchbot> https://review.opendev.org/#/c/752770/ - swift - Log error processing manifest as ServerError - 3 patch sets
22:01 <timburke> all right, we're about out of time
22:02 <timburke> but we can argue about the wisdom of WSGI back in -swift ;-)
22:02 <timburke> thank you all for coming, and thank you for working on swift!
22:02 <timburke> #endmeeting
22:02 *** openstack changes topic to "OpenStack Meetings || https://wiki.openstack.org/wiki/Meetings/"
22:02 <openstack> Meeting ended Wed Sep 30 22:02:42 2020 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)
22:02 <openstack> Minutes:        http://eavesdrop.openstack.org/meetings/swift/2020/swift.2020-09-30-21.00.html
22:02 <openstack> Minutes (text): http://eavesdrop.openstack.org/meetings/swift/2020/swift.2020-09-30-21.00.txt
22:02 <openstack> Log:            http://eavesdrop.openstack.org/meetings/swift/2020/swift.2020-09-30-21.00.log.html