*** gyee has quit IRC | 01:04 | |
*** rcernin has quit IRC | 02:26 | |
*** rcernin has joined #openstack-swift | 02:49 | |
*** evrardjp has quit IRC | 04:33 | |
*** evrardjp has joined #openstack-swift | 04:33 | |
*** mikecmpbll has quit IRC | 05:15 | |
*** mikecmpbll has joined #openstack-swift | 05:18 | |
*** dsariel has joined #openstack-swift | 05:56 | |
*** rpittau|afk is now known as rpittau | 07:22 | |
*** rcernin has quit IRC | 07:35 | |
*** rcernin_ has joined #openstack-swift | 07:35 | |
*** rcernin_ has quit IRC | 07:42 | |
*** MooingLemur has quit IRC | 07:45 | |
*** MooingLemur has joined #openstack-swift | 07:46 | |
*** mikecmpbll has quit IRC | 08:10 | |
*** tkajinam is now known as tkajinam|away | 09:17 | |
*** tkajinam|away is now known as tkajinam | 09:17 | |
*** rcernin_ has joined #openstack-swift | 09:56 | |
*** rcernin_ has quit IRC | 10:25 | |
*** rcernin_ has joined #openstack-swift | 10:42 | |
*** rcernin_ has quit IRC | 11:06 | |
*** fingo has quit IRC | 11:25 | |
*** rcernin_ has joined #openstack-swift | 11:59 | |
*** tkajinam has quit IRC | 14:20 | |
*** rcernin_ has quit IRC | 14:27 | |
*** dsariel has quit IRC | 15:11 | |
*** gyee has joined #openstack-swift | 15:39 | |
*** rpittau is now known as rpittau|afk | 15:52 | |
ormandj | is there a backport of https://opendev.org/openstack/swift/commit/754defc39c0ffd7d68c9913d4da1e38c503bf914 to ussuri? | 16:28 |
ormandj | with victoria being 20.04 only, and that being a critical issue for us, we're hoping it's possible ;) | 16:29 |
timburke | ormandj, not yet. i haven't checked how cleanly it would apply, but i can look into it. fwiw, though, i wholly expect victoria swift to work on older versions of ubuntu, and to play well with an otherwise-ussuri openstack install | 16:32 |
ormandj | timburke: i think ubuntu cloud archive is only building for >=20.04 | 16:40 |
ormandj | we're working on getting that all together because some of the fixes in victoria are pretty huge for the big ticket issues we have | 16:40 |
ormandj | but it's not an overnight process | 16:40 |
timburke | huh. http://ubuntu-cloud.archive.canonical.com/ubuntu/dists/focal-updates/victoria/main/binary-amd64/Packages lists swift 2.25.1... though 2.26.0 is in focal-proposed, so i guess everything's on-track | 16:52 |
timburke | i don't see any differences looking at the package dependencies (which makes sense; we make a point of not bumping deps unnecessarily) so i think you might be good to just pull down the victoria swift packages and install them on bionic | 16:54 |
timburke | (fair warning: i've never tried it. indeed, i usually don't use distro packages at all -- i'm usually working from source, being a dev and all, and when we need to upgrade swift on our clusters, we build our own packages) | 16:56 |
ormandj | timburke: yeah, we'll try to figure it out | 17:11 |
ormandj | second one, testing a mass rebalance on dev nodes, some data has gone 404. using swift-get-nodes to get the location then checking each of the primary/handoffs, the data isn't there | 17:12 |
ormandj | by mass rebalance i mean adding a new node into the ring and putting weight at 100% effectively at once | 17:12 |
ormandj | is that expected due to partition location changes and self-rectifying as rebalancing completes? | 17:12 |
ormandj | i didn't expect that, and i haven't used swift-get-nodes on the old ring files to see if the 'old' data still exists | 17:13 |
ormandj | but we're definitely serving 404s now for data that used to be there | 17:13 |
timburke | ormandj, triple replica, right? how quickly did you rebalance the ring? ever since https://github.com/openstack/swift/commit/ce26e789 only one assignment should change per rebalance, so i would've expected the other two primary locations to still have it... | 17:20 |
timburke | how quickly *and how many times* | 17:20 |
ormandj | yes, triple replica | 17:22 |
timburke | you had three beefy nodes before, right? is the new one roughly the same size as the others, or considerably larger? | 17:23 |
ormandj | larger | 17:23 |
ormandj | one ring change to add it, at full capacity, then a month later, one more ring change, then a week later one more | 17:24 |
timburke | that seems perfectly reasonable -- do we know whether the object was still accessible at the intermediary stages? | 17:25 |
ormandj | no, 404ing almost at the very beginning | 17:26 |
ormandj | unfortunately don't have logs going back far enough to determine if a DELETE went through | 17:26 |
timburke | i was just about to ask about a sanity check there :-) | 17:27 |
ormandj | yeah, it's still in the container db | 17:27 |
ormandj | but it _is_ possible a DELETE went through for it, and the container db just didn't get the update | 17:27 |
ormandj | but i'd expect that to eventually have caught up, too | 17:27 |
ormandj | if all the data itself is actually purged | 17:27 |
timburke | you have this habit of answering my next question before i ask it :P | 17:27 |
ormandj | i just looked at the old rings (backups) and the locations the data 'should' be, checked those locations, it's definitely not there | 17:28 |
ormandj | based on the old rings | 17:28 |
ormandj | it's actually the same location as the 'new' rings show it should exist | 17:28 |
timburke | when you deliver new rings, do you have a feel for how long the gap is between the first node getting the new ring and the last one getting it? rledisez had a ~30min window that led him to observe https://bugs.launchpad.net/swift/+bug/1897177 | 17:30 |
openstack | Launchpad bug 1897177 in OpenStack Object Storage (swift) "Race condition in replication/reconstruction can lead to loss of datafile" [High,In progress] | 17:30 |
ormandj | about 5 seconds | 17:30 |
timburke | yeah, negligible. good | 17:30 |
timburke | check quarantine dirs? | 17:31 |
ormandj | hm, protip on doing that? | 17:31 |
ormandj | i did check the handoff nodes fwiw | 17:32 |
ormandj | not seeing a quarantined dir in the /srv/node/driveID | 17:34 |
timburke | hrm. yeah, i would've checked with something like `find /srv/node*/*/quarantined` | 17:35 |
ormandj | yeah, no such directory | 17:36 |
ormandj | the async_pending dir is big, though | 17:36 |
timburke | might have a delete record to send to the container | 17:40 |
timburke | how big is the container? | 17:40 |
ormandj | i'm sure huge | 17:41 |
ormandj | is there a way to look for a delete record pending? | 17:41 |
ormandj | we want to make sure this is a result of a client operation | 17:41 |
ormandj | not the server(s) | 17:41 |
ormandj | but we don't have client logs going back 2 months | 17:41 |
timburke | each file is just a pickled dict iirc -- https://review.opendev.org/#/c/725429/1/swift/cli/async_inspector.py almost seems too simple for me to bother pushing on ;-) | 17:43 |
patchbot | patch 725429 - swift - Add a tool to peek inside async_pendings - 1 patch set | 17:43 |
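A minimal sketch of peeking inside one async_pending by hand, assuming (per the message above) that each file really is just a pickled dict; the key names in the final comment ('op', 'account', 'container', 'obj', 'headers') and the latin1 fallback are from memory and may differ between Swift versions, so treat this as illustrative rather than as the linked async_inspector.py:

```python
import pickle
import pprint
import sys

# async_pendings live under /srv/node/<device>/async_pending*/<suffix>/<hash>-<timestamp>
path = sys.argv[1]

with open(path, 'rb') as fp:
    try:
        data = pickle.load(fp)
    except UnicodeDecodeError:
        # files written under py2 may need latin1 to unpickle under py3
        fp.seek(0)
        data = pickle.load(fp, encoding='latin1')

pprint.pprint(data)
# a pending DELETE would typically look something like
# {'op': 'DELETE', 'account': 'AUTH_...', 'container': '...', 'obj': '...', 'headers': {...}}
```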
timburke | how's your reclaim age? might want to bump it up if we're worried about how big the async pile is getting... | 17:44 |
timburke | https://gist.github.com/clayg/249c5d3ff580032a0d40751fc3f9f24b may be useful (both to get a feel for the state of the system, and as a starting point to go looking for a specific async) | 17:47 |
timburke | though given the suffix/hash from swift-get-nodes, it probably wouldn't be so hard to find it anyway... | 17:49 |
timburke | something like `find /srv/node*/*/async*/${SUFFIX}/${HASH}*` | 17:50 |
cwright | timburke: reclaim_age is set to 2592000 | 17:51 |
timburke | then, assuming you find something, crack it open and make sure it really was for a delete | 17:51 |
timburke | cwright, so 30 days -- that might not be long enough, if we're legit worried that this was deleted a couple months ago and never made it back to the container server... | 17:52 |
*** whmcr has joined #openstack-swift | 17:55 | |
timburke | :-/ i should help clayg make that async stats script work on py3 | 17:56 |
whmcr | @timburke we're assuming that suffix & hash are the parts of the filepath that come after the partition, i.e. /srv/node/DRIVEID/objects/PARTITIONID/SUFFIX/HASH; if so, no dice on that | 18:03 |
timburke | yup, that was the idea | 18:04 |
timburke | :-( | 18:04 |
*** djhankb has quit IRC | 18:07 | |
timburke | ok, so https://gist.github.com/tipabu/abf38940d49d67d33fe98b957f9306a6 should work on py3 and i taught it to report the age of the oldest async it can find | 18:17 |
whmcr | running it now | 18:20 |
whmcr | count is already >70k, oldest (so far at least) is from July | 18:21 |
timburke | 😳 | 18:23 |
*** dsariel has joined #openstack-swift | 18:23 | |
timburke | 70k may or may not be something to worry about, but we definitely need to bring that reclaim age up | 18:23 |
timburke | at whatever point the updater gets around to that async from July, it's not making any container requests; it's just going to delete it | 18:24 |
timburke | how's the updater tuned? in particular, what've you got for workers, concurrency, objects_per_second? | 18:25 |
timburke | probably also want to get a feel for your success rate for processing updates | 18:29 |
cwright | timburke: those three settings are still using defaults: concurrency = 8, updater_workers = 1, objects_per_second = 50 | 18:30 |
timburke | my gut says you probably want to turn up workers, considering how dense your chassis are | 18:33 |
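For reference, a hedged sketch of where those knobs live in a stock object-server.conf; the values are purely illustrative, not a recommendation from this conversation, and the exact semantics of concurrency vs. updater_workers vary by release (see the doc patch further down):

```ini
[object-updater]
# defaults quoted above: updater_workers = 1, concurrency = 8, objects_per_second = 50
updater_workers = 4        # more worker processes for a dense chassis (illustrative value)
concurrency = 8            # per-worker concurrency; check your release's docs for its exact meaning
objects_per_second = 50    # per-worker rate limit on update attempts
```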
timburke | ormandj, just had a thought, since you mentioned still having backups of the old rings. have you compared the results you get from swift-get-nodes between them? i would expect there to be at least one disk assignment that didn't change, though i suppose i could be wrong... | 18:36 |
whmcr | yup, we've done that against pre-adding the node | 18:36 |
whmcr | all match up | 18:36 |
timburke | wait, so *none* of the assignments changed with the rebalance? this seems increasingly like it *must've* been deleted, but who knows when... | 18:37 |
timburke | fwiw, one of the tricks we've got for server logs is to push them back into the cluster as part of log rotation, under a .logs account that only reseller admins can access | 18:40 |
*** djhankb has joined #openstack-swift | 18:40 | |
timburke | might want to check recon cache, looking for object_updater_sweep time | 18:43 |
whmcr | sorry, the drive assignments do change, but the files are not there on any of the versions we've checked | 18:44 |
openstackgerrit | Tim Burke proposed openstack/python-swiftclient master: Remove some py38 job cruft https://review.opendev.org/758479 | 18:47 |
timburke | whmcr, did all of the assignments change? just one? two? | 18:48 |
whmcr | looks like one of the non-[Handoff] ones changes, and then all of the [Handoff]s change | 18:49 |
timburke | so the primaries (non-handoffs) that *didn't* change should be pretty authoritative -- if they don't have it (either in objects/PARTITIONID/SUFFIX/HASH or quarantined/objects/HASH) it was most likely deleted | 19:02 |
timburke | when you found it in listings, was that from just one replica of the container DB, or looking across all of them? | 19:03 |
whmcr | listing was from an s3 client doing a GET on the container for an object listing | 19:04 |
timburke | might be worth doing direct listings to each container server with limit=1 and prefix=<object>, see if they seem to agree that it should exist | 19:08 |
timburke | or even drop into sqlite3 and query for it directly. if you go that route, note that you'll probably want to include a 'AND deleted in (0, 1)' clause to take advantage of the deleted,name index | 19:09 |
timburke | but then you can also see the tombstone row (if it exists) | 19:10 |
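A hedged sketch of the per-replica listing check; the helper names (swift.common.ring.Ring, swift.common.direct_client.direct_get_container) are recalled from memory and the account/container/object values are made up, so verify both against your installed version before leaning on the output:

```python
from swift.common.ring import Ring
from swift.common.direct_client import direct_get_container

# hypothetical names -- substitute the real account/container/object
account, container, obj = 'AUTH_example', 'example-container', 'path/to/missing/object'

ring = Ring('/etc/swift', ring_name='container')
part, nodes = ring.get_nodes(account, container)

for node in nodes:
    # ask each container replica directly whether it has a row for this name
    # (a replica whose DB is missing will raise a ClientException here)
    headers, listing = direct_get_container(
        node, part, account, container, prefix=obj, limit=1)
    print(node['ip'], node['device'], [row['name'] for row in listing])
```

And a sketch of the sqlite3 route, assuming the standard container schema (an object table with name/created_at/deleted columns) and an obviously made-up DB path:

```python
import sqlite3

db_path = '/srv/node/DRIVEID/containers/PART/SUFFIX/HASH/HASH.db'  # hypothetical path
obj_name = 'path/to/missing/object'                                # hypothetical name

with sqlite3.connect(db_path) as conn:
    # 'AND deleted IN (0, 1)' lets sqlite use the (deleted, name) index,
    # and also surfaces the tombstone row if one exists
    rows = conn.execute(
        "SELECT name, created_at, deleted FROM object "
        "WHERE name = ? AND deleted IN (0, 1)", (obj_name,)).fetchall()

for name, created_at, deleted in rows:
    print(name, created_at, 'DELETED' if deleted else 'live')
```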
openstackgerrit | Tim Burke proposed openstack/swift master: Clarify some object-updater settings https://review.opendev.org/758488 | 19:23 |
*** gregwork has quit IRC | 19:26 | |
ormandj | timburke: for clarity, reclaim_age being too short when asyncs haven't updated container.db means data that _should_ be purged, won't be, if the async container update doesn't go through prior? | 20:00 |
ormandj | ie: dark data will be left on system that shouldn't be there | 20:01 |
timburke | so reclaim_age can bite you two ways at the object-layer if it's too short: you might reap some object-server tombstones (*.ts files) before all of the *.data have had a chance to get cleaned up, leading to dark data -- OR you might give up on ever getting an async pending through to the container layer, leading to either dark data (if the async was for a PUT) or ghost listings (if the async was for a DELETE) | 20:06 |
ormandj | copy. we'll set it really large then until all this is caught up ;) the key is making sure it wouldn't result in objects going missing that shouldn't be | 20:06 |
timburke | at the container layer, a too-short reclaim age pretty much always leads to ghost listings, where one replica goes offline for a while then comes back and syncs with other copies that had & reclaimed a deleted row for some of the objects | 20:07 |
ormandj | we'll get that out of the way first, then crank up the updater workers, some of these container.dbs are showing an update time from july | 20:07 |
ormandj | with lots of asyncs pending for them | 20:08 |
timburke | nope -- having it too high just means you're using up some inodes "unnecessarily" -- i'd definitely err on the side of too high rather than too low | 20:08 |
ormandj | perfect | 20:08 |
ormandj | timburke: updating the worker count, anything else we can do to push these asyncs through? | 20:17 |
ormandj | containerdbs are on ssds | 20:17 |
timburke | ormandj, might check to see if you've got https://review.opendev.org/#/c/741753/ in your swift -- if not, you can kick up your container replicator interval to like 48hrs or something until asyncs settle down | 20:21 |
patchbot | patch 741753 - swift (stable/ussuri) - Breakup reclaim into batches (MERGED) - 1 patch set | 20:21 |
ormandj | looking | 20:23 |
ormandj | timburke: unfortunately, i don't think that's in the ussuri cloud packages we have | 20:25 |
ormandj | don't see the other_stuff function in the db.py | 20:25 |
timburke | the fix should be in 2.26.0, 2.25.1, and 2.23.2 | 20:29 |
timburke | again, you can work around it by temporarily prolonging your container-replicator cycle time -- it's just a thing we've seen where the replicator may hold a long lock while reclaiming deleted rows | 20:31 |
ormandj | yeah, those releases didn't get built in ubuntu cloud archive | 20:36 |
ormandj | 2.25.1 that is | 20:36 |
ormandj | just checked, latest is still 2.25.0 | 20:36 |
ormandj | we'll set the replication interval to 48 hours for the container replicator, update the updater_workers to 4, and set reclaim_age to 120 days | 20:37 |
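For the record, a hedged sketch of what that plan might look like in the config files, assuming stock section names; the numbers are just the seconds equivalents of the figures above (48 hours and 120 days), not tested recommendations:

```ini
# container-server.conf
[container-replicator]
interval = 172800        # 48 hours, to keep reclaim from holding long DB locks while asyncs drain
reclaim_age = 10368000   # 120 days

# object-server.conf
[object-replicator]
reclaim_age = 10368000   # keep reclaim_age consistent across object and container services

[object-updater]
updater_workers = 4
```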
ormandj | we're hoping that's enough to munch through these asyncs, we are way behind in this cluster | 20:39 |
ormandj | last-modified on the container db is something like july 06 heh on this one container | 20:39 |
ormandj | we stopped that script and it was over 3 million asyncs | 20:40 |
timburke | certainly a bunch, but with ssds and a tuned-down container-replicator it should be quite manageable. you've got this! | 20:44 |
openstackgerrit | Tim Burke proposed openstack/python-swiftclient master: Allow tempurl times to have units https://review.opendev.org/758500 | 21:10 |
klamath_atx | @timburke I upgraded our lab, the only weirdness i'm seeing right now is the container-reconciler having issues connecting to remote memcache servers, is that a known upgrade issue? | 21:26 |
timburke | klamath_atx, i've not seen that before :-/ | 21:29 |
klamath_atx | gotcha, just wanted to check in before i start spinning wheels | 21:33 |
mattoliverau | morning | 21:52 |
*** rcernin_ has joined #openstack-swift | 22:03 | |
openstackgerrit | Tim Burke proposed openstack/swift master: Optimize swift-recon-cron a bit https://review.opendev.org/758505 | 22:06 |
*** rcernin_ has quit IRC | 22:19 | |
openstackgerrit | Merged openstack/python-swiftclient master: Close connections created when calling module-level functions https://review.opendev.org/721051 | 22:41 |
*** tkajinam has joined #openstack-swift | 22:59 | |
zaitcev | "Firefox can’t establish a connection to the server at review.opendev.org." | 23:58 |