*** dviroel|out is now known as dviroel | 11:29 | |
timss | Hi, my object nodes have a 2xSSD (RAID1 w/LVM for OS) and 64xHDD setup and I'm looking into possibly separating account/container onto leftover space of the SSD RAID for extra performance. With our previous growth the space requirement for containers seems to be fine (~85G per server), and I've been able to configure the rings accordingly. | 13:10 |
timss | Question is: with only 1 device (logical volume) per object node (at least 7), will that be enough devices for a healthy distribution, and how would one go about deciding part power etc.? Wasn't able to find any references online, although I think I've heard people run similar setups (albeit maybe with a more significant number of devices for the account/container rings) | 13:10 |
opendevreview | Alistair Coles proposed openstack/swift master: relinker: tolerate existing tombstone with same X-Timestamp https://review.opendev.org/c/openstack/swift/+/798849 | 13:10 |
DHE | as long as you have more devices than you have distributed copies you're fine. the concern comes when you add additional redundancy into the system, like multiple failure zones (racks) | 13:42 |
DHE | at 7 servers I'm guessing they're all in the same rack connected to the same switch? | 13:42 |
timss | As of this time unfortunately yes, all in the same rack | 13:42 |
timss | At least there's redundant networking and power, but it's not optimal for sure | 13:43 |
DHE | so redundancy concerns where you're giving some topology information to swift become more serious with only 7 copies and, say, 2 failure zones | 13:45 |
DHE | *7 servers | 13:45 |
timss | In this scenario there's no real difference between the servers inside the same rack, so even defining a clear failure domain is a bit tricky. From my understanding, even with 1 region and 1 zone, Swift would at least ensure all 3 replicas are spread across different servers (and their partitions). Not sure if splitting it up would help much? | 13:53 |
zaitcev | Swift spreads partitions in tiers. First to each region, then to each zone, then to each node, and finally to each device. | 14:23 |
zaitcev | This allows you to assign zones to natural failure boundaries, such as racks. | 14:24 |
zaitcev | But each tier can be degenerate: 1 region total, 1 zone total, etc. | 14:24 |
zaitcev | So, 7 nodes for replication factor 3 sounds fine to me. | 14:25 |
zaitcev | It gives room for handoff nodes beyond the strictly necessary 3 in the node tier. | 14:27 |
timss | Cheers to both. I feel like I can live with this setup, and if growth continues I would perhaps introduce another zone or region at some point, but for now it is what it is. The application of this cluster should be fine with the level of redundancy set; I was more worried about the very low number of devices than anything | 14:29 |
zaitcev | Is your replication factor 3 for the container and account rings? | 14:30 |
timss | That's the plan | 14:30 |
zaitcev | Sounds adequate to me. | 14:31 |
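To make the tier description above concrete, here is a minimal sketch of building such a degenerate-tier account ring: one region, one zone, seven nodes with a single account/container LV each, replication factor 3. The IP addresses, port, device name, weight, and the part power of 13 are assumed values for illustration, not figures from the discussion.

```python
# Minimal sketch, assuming hypothetical addresses and a part power of 13.
from swift.common.ring import RingBuilder

builder = RingBuilder(part_power=13, replicas=3, min_part_hours=1)
for i in range(7):
    builder.add_dev({
        'region': 1, 'zone': 1,          # degenerate tiers: 1 region, 1 zone
        'ip': '10.0.0.%d' % (11 + i),    # assumed node addresses
        'port': 6202,                    # default account-server port
        'device': 'd0',                  # the LV carved from the SSD RAID
        'weight': 100,
    })
builder.rebalance()
builder.save('account.builder')
builder.get_ring().save('account.ring.gz')
```

With only one region and one zone, the placement guarantee described above degenerates to "unique nodes", so the seven servers still keep the three replicas apart from one another.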
timss | Next up would be to decide the partition power for the account/container rings. I've usually seen at least 100-1000 partitions per device recommended, but clayg's pessimistic part power calculation even recommends as much as ~7k at a part power of 14. Perhaps the general recommendations don't play well with a very low number of devices, but dunno | 14:33 |
clayg | the recommendation was more "on the order of 1K" - so 2-3, maybe 5-6 is fine but >10 starts to look sketchy even if you are "planning" for some growth | 14:35 |
clayg | now that I have more experience with part power increase I wonder if my recommendations about picking a part power may have changed (for objects at least; AFAIK no one has attempted an a/c PPI) | 14:36 |
zaitcev | I never understood economizing on partitions. The more, the better for replication. The biggest clusters can have issues like having too many inodes, which auditors and replicators constantly refresh in the kernel. If you have adequate RAM to hold the inodes and the rings, what's the downside? | 14:37 |
zaitcev | Is there a problem with replicator passes taking too long? | 14:38 |
timss | oh, seems I summoned the man himself involuntarily :D | 14:38 |
timss | back to object rings primarily now, but I'm curious what made some folks over at Rackspace recommend more on the scale of ~200 partitions per drive in their ring calculation tool https://rackerlabs.github.io/swift-ppc/ | 14:46 |
timss | I've been running ~6k partitions per device (pp 19) on a previous installation for years which has been going ok, but replication performance isn't the best (probably more factors to it, it hasn't gotten that much love) | 14:48 |
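For reference, here is the back-of-the-envelope arithmetic behind those rules of thumb applied to a 7-device ring; the per-device targets are just the figures mentioned above (the Rackspace tool's ~200 per drive and the "on the order of 1K" recommendation), nothing cluster-specific.

```python
# Worked example: part power sizing for 7 devices at two per-device targets.
import math

devices = 7
for target_per_dev in (200, 1000):
    part_power = math.ceil(math.log2(devices * target_per_dev))
    parts_per_dev = 2 ** part_power / devices
    print(target_per_dev, part_power, round(parts_per_dev))
# 200  -> part power 11 (~293 partitions per device)
# 1000 -> part power 13 (~1170 partitions per device)
```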
opendevreview | Alistair Coles proposed openstack/swift master: relinker: don't bother checking for previous tombstone links https://review.opendev.org/c/openstack/swift/+/798914 | 15:10 |
opendevreview | Hitesh Kumar proposed openstack/swift-bench master: Migrate from testr to stestr https://review.opendev.org/c/openstack/swift-bench/+/798941 | 18:13 |
timburke | anybody care much about swift-bench? looks like ~a year ago i proposed we drop py2 for it: https://review.opendev.org/c/openstack/swift-bench/+/741553 | 19:24 |
*** dviroel is now known as dviroel|out | 20:41 | |
zaitcev | I would not mind. It's a client, isn't it? Surely new test runs for it run on new installs. No data gravity. | 20:45 |
opendevreview | Tim Burke proposed openstack/swift master: reconciler: Tolerate 503s on HEAD https://review.opendev.org/c/openstack/swift/+/796538 | 20:45 |
zaitcev | Well I can imagine benching from an ancient kernel in case there's an anomaly in a new one. | 20:46 |
zaitcev | But frankly I suspect the time for that is in the past. | 20:46 |
kota | good morning | 20:56 |
timburke | o/ | 20:57 |
kota | timburke: o/ | 20:58 |
timburke | #startmeeting swift | 21:00 |
opendevmeet | Meeting started Wed Jun 30 21:00:37 2021 UTC and is due to finish in 60 minutes. The chair is timburke. Information about MeetBot at http://wiki.debian.org/MeetBot. | 21:00 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 21:00 |
opendevmeet | The meeting name has been set to 'swift' | 21:00 |
timburke | who's here for the swift meeting? | 21:00 |
kota | o/ | 21:01 |
acoles | o/ | 21:01 |
timburke | pretty sure mattoliver is out sick -- we'll see if clayg and zaitcev end up chiming in later ;-) | 21:03 |
zaitcev | o/ | 21:04 |
timburke | as usual, the agenda's at https://wiki.openstack.org/wiki/Meetings/Swift | 21:04 |
timburke | #topic swift-bench and py2 | 21:04 |
timburke | so a while back i proposed that we drop py2 support from swift-bench: https://review.opendev.org/c/openstack/swift-bench/+/741553 | 21:05 |
timburke | ...and then i promptly forgot to push on getting it merged at all :P | 21:05 |
timburke | i saw that there's a new patch up for swift-bench (https://review.opendev.org/c/openstack/swift-bench/+/798941) -- and the py2 job seems broken | 21:06 |
kota | i see. it's updated in Jul 2020 | 21:06 |
timburke | so i thought i'd check in to see whether anyone objects to dropping support there | 21:07 |
timburke | sounds like i'm good to merge it :-) | 21:09 |
kota | +1 | 21:10 |
timburke | on to updates! | 21:10 |
timburke | #topic sharding | 21:10 |
timburke | it seems like acoles and i are getting close to agreement on https://review.opendev.org/c/openstack/swift/+/794582 to prevent small tail shards | 21:11 |
timburke | were there any other follow-ups to that work we should be paying attention to? or other streams of work related to sharding? | 21:11 |
acoles | IIRC mattoliver had some follow up patch(es) for tiny tails but I don't recall exactly what | 21:13 |
acoles | maybe to add an 'auto' option, IDK | 21:14 |
timburke | sounds about right. and there's the increased validation on sharder config options -- https://review.opendev.org/c/openstack/swift/+/797961 | 21:16 |
timburke | i think that's about it for sharding -- looking forward to avoiding those tail shards :-) | 21:17 |
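For context, the "small tail shard" problem comes from the remainder left over when a container's rows are carved into shard ranges of roughly rows_per_shard each. A rough sketch of the general idea only, not the actual patch under review; minimum_shard_size is an assumed knob name for illustration:

```python
# Sketch: fold an undersized final remainder into the previous shard range
# instead of emitting a tiny trailing shard.
def plan_shard_sizes(total_rows, rows_per_shard, minimum_shard_size):
    sizes = []
    remaining = total_rows
    while remaining > 0:
        if remaining <= rows_per_shard:
            if sizes and remaining < minimum_shard_size:
                sizes[-1] += remaining      # merge tiny tail into last shard
            else:
                sizes.append(remaining)
            break
        sizes.append(rows_per_shard)
        remaining -= rows_per_shard
    return sizes

# e.g. plan_shard_sizes(2_050_000, 1_000_000, 100_000) -> [1000000, 1050000]
```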
timburke | #topic relinker | 21:17 |
timburke | we (nvidia) are currently mid part-power increase | 21:18 |
timburke | and acoles wrote up https://bugs.launchpad.net/swift/+bug/1934142 while investigating some issues we saw | 21:18 |
timburke | basically, the reconciler has been busy writing out tombstones everywhere, which can cause some relinking errors as multiple reconcilers can try to write the same tombstone at the same time | 21:20 |
acoles | we're fortunate that the issue has only manifested with tombstones, as a result of the circumstances of the reconciler workload we had and the policy for which we were doing part power increase | 21:21 |
zaitcev | Oh I see. I was just thinking about it. | 21:21 |
acoles | it's relatively easy to reason about tolerating a tombstone with a different inode; data files would probably require more validation than just 'same filename' | 21:22 |
timburke | a fix is currently up at https://review.opendev.org/c/openstack/swift/+/798849 that seems reasonable, with a follow-up to remove some now-redundant checks at https://review.opendev.org/c/openstack/swift/+/798914 | 21:22 |
acoles | timburke: if we feel happy about the follow up I reckon I should squash the two | 21:23 |
acoles | we're basically relaxing the previous checks rather than adding another | 21:24 |
timburke | i think i am, at any rate. i also think i'd be content to skip getting the timestamp out of metadata | 21:24 |
acoles | yeah, that was my usual belt n braces :) | 21:25 |
timburke | surely the auditor includes a timestamp-from-metadata vs timestamp-from-file-name check, right? | 21:26 |
acoles | idk | 21:26 |
acoles | ok i'll rip out the metadata check and squash the two | 21:27 |
timburke | 👍 | 21:28 |
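For readers following along, the shape of the relaxed check is roughly as sketched below. This is not the code in the patch, just an illustration: an EEXIST from os.link() during relinking already implies a file with the same name, and hence the same X-Timestamp, is in place, so for tombstones that can be treated as success rather than requiring the same inode.

```python
# Rough illustration only; not the patch under review.
import errno
import os


def relink_file(old_path, new_path):
    """Hard-link old_path into its new partition location."""
    try:
        os.link(old_path, new_path)
        return True
    except OSError as err:
        if err.errno != errno.EEXIST:
            raise
        if new_path.endswith('.ts'):
            # A file with the same name (same X-Timestamp) already exists;
            # for tombstones a different inode is tolerated, since
            # concurrent reconcilers can race to write the same one.
            return True
        if os.path.samefile(old_path, new_path):
            return True    # already hard-linked to the same inode
        raise
```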
timburke | #topic dark data watcher | 21:28 |
timburke | i saw acoles did some reviews! | 21:28 |
zaitcev | Yes | 21:28 |
timburke | thanks :-) | 21:28 |
acoles | yes! | 21:28 |
zaitcev | Indeed. | 21:28 |
acoles | well just one | 21:28 |
acoles | iirc i was happy apart from some minor fixes | 21:29 |
zaitcev | I squashed that already but now I'm looking at remaining comments, like the one about when X-Timestamp is present and if an object can exist without one. | 21:30 |
acoles | zaitcev: i think its ok, the x-timestamp should be there if the auditor passes the diskfile to watcher | 21:31 |
timburke | and if the auditor *doesn't* check for it, it *should* and idk that the watcher necessarily needs to be defensive against it being missing | 21:32 |
zaitcev | ok | 21:33 |
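The idea being discussed, roughly sketched below. grace_age is an assumed option name, and a real implementation would use Swift's timestamp helpers rather than a bare float(); the point is only that the watcher can skip very recent objects whose container listings may not have caught up yet.

```python
# Concept sketch only: the auditor is expected to hand over diskfile
# metadata that already carries X-Timestamp, per the discussion above.
import time


def too_young_to_judge(metadata, grace_age=3600):
    x_timestamp = float(metadata['X-Timestamp'])
    return time.time() - x_timestamp < grace_age
```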
timburke | all right, that's all i had to bring up | 21:34 |
timburke | #topic open discussion | 21:34 |
timburke | what else should we be talking about? | 21:34 |
zaitcev | Hackathon :-) | 21:35 |
timburke | i love that idea -- unfortunately, i don't think it's something we can do yet | 21:38 |
timburke | short of a virtual one, at any rate | 21:38 |
kota | exactly | 21:39 |
opendevreview | Pete Zaitcev proposed openstack/swift master: Make dark data watcher ignore the newly updated objects https://review.opendev.org/c/openstack/swift/+/788398 | 21:39 |
timburke | speaking of -- looks like we've got dates for the next PTG: http://lists.openstack.org/pipermail/openstack-discuss/2021-June/023370.html | 21:39 |
timburke | Oct 18-22, still all-virtual | 21:39 |
acoles | ack | 21:40 |
* kota will register it | 21:41 | |
zaitcev | I'm just back from a mini vacation at South Padre. Seen a few people in masks. Maybe one in 20. | 21:41 |
timburke | yeah, but you're in TX ;-) | 21:42 |
zaitcev | The island is overflowing. I guess international vacationing is still not working. People even try to surf, although obviously the waves are pitiful in the Gulf absent a storm. | 21:42 |
timburke | i just checked; my company's guidelines for travel are currently matching their guidelines for office re-opening, which is "not yet" | 21:42 |
zaitcev | ok | 21:43 |
timburke | all right, let's let kota get on with his morning :-) | 21:43 |
acoles | is the us even allowing aliens in ? without quarantine? | 21:43 |
timburke | thank you all for coming, and thank you for working on swift! | 21:44 |
timburke | #endmeeting | 21:44 |
opendevmeet | Meeting ended Wed Jun 30 21:44:23 2021 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 21:44 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/swift/2021/swift.2021-06-30-21.00.html | 21:44 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/swift/2021/swift.2021-06-30-21.00.txt | 21:44 |
opendevmeet | Log: https://meetings.opendev.org/meetings/swift/2021/swift.2021-06-30-21.00.log.html | 21:44 |
timburke | acoles, it looks like they probably wouldn't let you in: https://www.cdc.gov/coronavirus/2019-ncov/travelers/from-other-countries.html :-( | 21:48 |
clayg | sorry i missed the meeting; scrollback all looks good 👍 | 21:58 |
opendevreview | Merged openstack/swift-bench master: Drop testing for py27 https://review.opendev.org/c/openstack/swift-bench/+/741553 | 23:54 |
opendevreview | Tim Burke proposed openstack/swift-bench master: Switch to xena jobs https://review.opendev.org/c/openstack/swift-bench/+/741554 | 23:56 |