*** diablo_rojo has quit IRC | 00:21 | |
openstackgerrit | Tim Burke proposed openstack/python-swiftclient master: Actually run tempauth tests in swiftclient dsvm jobs https://review.opendev.org/687773 | 00:23 |
openstackgerrit | Tim Burke proposed openstack/python-swiftclient master: v1auth: support endpoint_data_for() api https://review.opendev.org/687774 | 00:23 |
*** gyee has quit IRC | 00:40 | |
openstackgerrit | Tim Burke proposed openstack/swift master: Fix misleading error msg if swift.conf unreadable https://review.opendev.org/581280 | 00:55 |
openstackgerrit | Tim Burke proposed openstack/python-swiftclient master: v1auth: support endpoint_data_for() api https://review.opendev.org/687774 | 01:13 |
*** diablo_rojo has joined #openstack-swift | 02:59 | |
*** lbragstad_ has joined #openstack-swift | 03:31 | |
*** lbragstad has quit IRC | 03:31 | |
*** lbragstad has joined #openstack-swift | 04:25 | |
*** diablo_rojo has quit IRC | 04:26 | |
*** lbragstad_ has quit IRC | 04:28 | |
*** lbragstad_ has joined #openstack-swift | 04:33 | |
*** lbragstad has quit IRC | 04:34 | |
*** lbragstad has joined #openstack-swift | 04:40 | |
*** lbragstad_ has quit IRC | 04:41 | |
*** pcaruana has joined #openstack-swift | 04:55 | |
*** tkajinam has quit IRC | 05:01 | |
*** tkajinam has joined #openstack-swift | 05:02 | |
*** tkajinam has quit IRC | 05:23 | |
*** tkajinam has joined #openstack-swift | 05:23 | |
viks___ | Hi, in my swift cluster storage node, a disk had errors and so drive-audit tried to unmount it. But the unmount hung, which drove the load average way up and the node became unresponsive. Later, after restarting that storage node and replacing the drive, everything was back to normal. But how do I handle such a case without restarting? Why did the unmount hang? Does anyone have any idea/solution? | 05:42 |
*** evrardjp_ has joined #openstack-swift | 05:49 | |
*** ktsuyuzaki has joined #openstack-swift | 05:53 | |
*** ChanServ sets mode: +v ktsuyuzaki | 05:53 | |
*** timss- has joined #openstack-swift | 05:54 | |
*** evrardjp has quit IRC | 05:55 | |
*** kota_ has quit IRC | 05:55 | |
*** timss has quit IRC | 05:55 | |
*** irclogbot_2 has quit IRC | 05:56 | |
*** irclogbot_0 has joined #openstack-swift | 05:58 | |
*** lbragstad_ has joined #openstack-swift | 06:00 | |
*** lbragstad has quit IRC | 06:01 | |
*** early has quit IRC | 06:10 | |
*** rdejoux has quit IRC | 06:11 | |
*** early has joined #openstack-swift | 06:11 | |
*** baojg has quit IRC | 06:31 | |
*** rcernin has quit IRC | 07:03 | |
*** tesseract has joined #openstack-swift | 07:03 | |
*** ccamacho has joined #openstack-swift | 07:04 | |
*** rdejoux has joined #openstack-swift | 07:09 | |
*** rpittau|afk is now known as rpittau | 07:53 | |
*** tkajinam has quit IRC | 07:58 | |
openstackgerrit | Merged openstack/swift master: Fix misleading error msg if swift.conf unreadable https://review.opendev.org/581280 | 08:07 |
*** mikecmpbll has joined #openstack-swift | 08:07 | |
alecuyer | viks___: If the disk was failing and unresponsive, that would explain both the high load average and the inability to unmount it (you may have seen processes in "D" state?). When that happens on Linux, I don't know of a solution other than a reboot | 08:33 |
viks___ | alecuyer: Thanks... I did not check for "D" state, which I was unaware of... Next time I'll check for it.. | 08:39 |
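alecuyer's tip about "D" (uninterruptible sleep) processes can be checked with `ps -eo pid,stat,comm`, or, as a minimal sketch, by scanning /proc directly (the function name is illustrative):

```python
import os

def d_state_pids(proc="/proc"):
    """Return (pid, comm) pairs for processes in uninterruptible sleep ('D').

    Processes stuck in 'D' state are usually blocked on I/O to a failing
    device; they cannot be killed, which is why a hung unmount often needs
    a reboot to clear.
    """
    stuck = []
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "stat")) as f:
                stat = f.read()
        except OSError:
            continue  # the process exited while we were scanning
        # Field 2 is "(comm)", which may itself contain spaces, so locate
        # the one-character state field relative to the last ')'.
        state = stat[stat.rindex(")") + 2]
        if state == "D":
            comm = stat[stat.index("(") + 1:stat.rindex(")")]
            stuck.append((int(entry), comm))
    return stuck
```

On a healthy box this returns an empty list; during a hung unmount you would expect to see the umount process (and likely swift object workers) pinned here.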
*** takamatsu has joined #openstack-swift | 08:49 | |
*** e0ne has joined #openstack-swift | 09:09 | |
*** mvkr has quit IRC | 09:56 | |
*** rcernin has joined #openstack-swift | 10:08 | |
*** mvkr has joined #openstack-swift | 10:09 | |
*** rpittau is now known as rpittau|bbl | 10:15 | |
*** rcernin has quit IRC | 10:21 | |
*** rdejoux has quit IRC | 11:12 | |
*** mikecmpbll has quit IRC | 11:17 | |
*** mikecmpbll has joined #openstack-swift | 11:26 | |
*** rdejoux has joined #openstack-swift | 11:40 | |
zigo | http://paste.openstack.org/show/783003/ <--- Big Badaboum ... | 11:48 |
openstackgerrit | Thomas Goirand proposed openstack/swift master: Fix on-disk encryption under Python 3 https://review.opendev.org/688113 | 12:13 |
*** rpittau|bbl is now known as rpittau | 12:33 | |
openstackgerrit | Thomas Goirand proposed openstack/swift master: Fix on-disk encryption under Python 3 https://review.opendev.org/688113 | 12:46 |
*** BjoernT has joined #openstack-swift | 13:16 | |
*** lbragstad_ is now known as lbragstad | 13:18 | |
*** BjoernT_ has joined #openstack-swift | 13:21 | |
*** BjoernT has quit IRC | 13:24 | |
*** lbragstad has quit IRC | 13:26 | |
*** BjoernT_ is now known as BjoernT | 14:25 | |
*** diablo_rojo has joined #openstack-swift | 14:40 | |
*** rdejoux has quit IRC | 14:43 | |
*** e0ne has quit IRC | 14:53 | |
*** FlorianFa has quit IRC | 15:03 | |
*** rpittau is now known as rpittau|afk | 15:44 | |
*** mikecmpbll has quit IRC | 16:04 | |
*** e0ne has joined #openstack-swift | 16:23 | |
*** e0ne has quit IRC | 16:26 | |
openstackgerrit | Thomas Goirand proposed openstack/swift master: Fix on-disk encryption under Python 3 https://review.opendev.org/688113 | 16:33 |
*** BjoernT_ has joined #openstack-swift | 16:39 | |
*** BjoernT has quit IRC | 16:41 | |
*** paladox has quit IRC | 16:49 | |
*** gyee has joined #openstack-swift | 16:53 | |
*** paladox has joined #openstack-swift | 16:57 | |
*** BjoernT_ has quit IRC | 17:00 | |
timburke | zigo, on the py3 encryption bug -- which keymaster are you using? the simple one, kmip, barbican? | 17:28 |
*** e0ne has joined #openstack-swift | 17:59 | |
*** e0ne has quit IRC | 18:25 | |
*** e0ne has joined #openstack-swift | 18:35 | |
*** e0ne has quit IRC | 18:53 | |
*** pcaruana has quit IRC | 18:55 | |
*** tomha has joined #openstack-swift | 20:04 | |
*** tomha has quit IRC | 20:05 | |
zigo | timburke: Hi there! | 20:23 |
zigo | timburke: Barbican. | 20:23 |
zigo | timburke: What do you think of the patch? | 20:23 |
timburke | o/ | 20:24 |
timburke | oh good, that explains why i hadn't repro'd with the simple or pykmip ones ;-) | 20:24 |
zigo | timburke: I didn't know there was another way to run encryption! :) | 20:24 |
zigo | timburke: How do you do it the other way, with kmip? Is it documented somewhere? | 20:25 |
timburke | seems pretty straight-forward -- i might make it a little more localized to the barbican keymaster, and try to sort out if i should be proposing a patch to castellan and/or barbicanclient | 20:25 |
timburke | ...sort of? there's another middleware: https://github.com/openstack/swift/blob/2.23.0/etc/proxy-server.conf-sample#L1138-L1155 | 20:26 |
timburke | works similarly to the barbican one; there's a preference for putting the config in a separate file (so you can have separate permissions), and that'll look like https://github.com/openstack/swift/blob/2.23.0/etc/keymaster.conf-sample#L78-L96 | 20:27 |
timburke | there's some actual docs at https://docs.openstack.org/swift/latest/overview_encryption.html#encryption-root-secret-in-a-kmip-service | 20:29 |
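Based on the sample files timburke links, the split looks roughly like this (host, paths, credentials and the key id below are all placeholders; the authoritative option names are in the linked keymaster.conf-sample):

```ini
# proxy-server.conf -- kmip_keymaster goes in the pipeline where the
# keymaster/kms_keymaster middleware would otherwise sit
[filter:kmip_keymaster]
use = egg:swift#kmip_keymaster
# keep the KMIP credentials in a separate, more tightly permissioned file
keymaster_config_path = /etc/swift/keymaster.conf

# /etc/swift/keymaster.conf
[kmip_keymaster]
key_id = <uid-of-secret-in-kmip-service>
host = kmip.example.com
port = 5696
certfile = /etc/swift/kmip_client.crt
keyfile = /etc/swift/kmip_client.key
ca_certs = /etc/swift/kmip_ca.crt
username = swift
password = changeme
```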
timburke | anyway, i'll get back to getting barbican running locally so i can repro and test :-) | 20:31 |
*** BjoernT has joined #openstack-swift | 20:31 | |
timburke | i should really also try to figure out how to use barbican in our dsvm jobs... | 20:33 |
zigo | That'd be helpful indeed. | 20:37 |
zigo | I'm about to push to production a cluster which may grow quickly, and I hope to be able to upgrade swift to Train before it's really in use. | 20:38 |
zigo | It's currently running Stein, so Py2; the Train release of Swift in Debian is Py3-only. | 20:38 |
zigo | Which is why I discovered this. | 20:38 |
timburke | right -- i'll be real interested in knowing how it goes :-) | 20:39 |
zigo | timburke: Well, so far, everything works well in my virtualized PoC ... | 20:39 |
zigo | (ie: 16 virtual machines consisting of 3 controllers, 3 proxies, and 10 swiftstores) | 20:39 |
timburke | was it an all-new deployment, or was there data from py2-swift? | 20:40 |
zigo | All new. | 20:40 |
zigo | timburke: Do you fear there would be issues when I upgrade from py2 to py3? | 20:41 |
zigo | I haven't tested that just yet ... | 20:41 |
zigo | I'll probably do when I come back from the Debian cloud sprint in Boston next week. | 20:41 |
zigo | Basically, for me, it should just be apt-get dist-upgrade and re-run puppet ... | 20:41 |
timburke | i've done everything i can to ensure that the py2-py3 upgrade will be smooth -- but i also have to admit that i haven't tested it as thoroughly as i would have liked | 20:42 |
zigo | We do have a few thousand objects in that cluster already (basically, our internal tests...). | 20:43 |
timburke | thankfully zaitcev_ has been testing it too -- he spotted https://bugs.launchpad.net/swift/+bug/1837805 between 2.22.0 and 2.23.0, for example | 20:43 |
openstack | Launchpad bug 1837805 in OpenStack Object Storage (swift) "py3: account metadata handling is busted" [High,Fix released] | 20:43 |
*** BjoernT_ has joined #openstack-swift | 20:43 | |
zigo | We're building a dropbox-like drive using swift as back-end. :) | 20:43 |
timburke | love it! | 20:44 |
zigo | I'm also building a 3rd cluster from scratch, because the 1st one is already too big ... | 20:44 |
zigo | We got nearly 50 storage nodes, each with 12x 12TB spinning disks. | 20:44 |
zigo | Rebalance are becoming painful, even if we have 2x 10Gbits/s network on each nodes. | 20:44 |
zigo | So, building a 2nd cluster... :) | 20:45 |
zaitcev_ | My only objection is the lack of consistency. The code and comments must match. If you change the invariants, change the comments too. | 20:45 |
timburke | 7PB raw... not too shabby | 20:45 |
*** BjoernT has quit IRC | 20:45 | |
zigo | The funny bit is that the controllers (running keystone) are doing almost nothing ... | 20:46 |
zigo | So for the next cluster, we'll be using old re-purposed hardware. :) | 20:46 |
timburke | zaitcev_, zigo: i think if we do the type-coercion in kms_keymaster as we receive the secret, we wouldn't need to touch the comments | 20:47 |
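The fix direction timburke sketches here (coerce the secret to bytes at the point the keymaster receives it from the KMS, rather than changing the crypto code and its comments) might look roughly like this; the helper name is hypothetical:

```python
def ensure_bytes(secret, encoding="utf-8"):
    """Coerce a KMS-returned root secret to bytes.

    Under Python 3, a secret fetched via castellan/barbicanclient may come
    back as str; the encryption middleware expects raw bytes, so normalize
    at the boundary where the keymaster receives it.
    """
    if isinstance(secret, bytes):
        return secret
    if isinstance(secret, str):
        return secret.encode(encoding)
    raise TypeError("expected the secret as str or bytes, got %s" % type(secret))
```

Doing the coercion in one place keeps the invariant ("the root secret is bytes") true everywhere downstream, which addresses zaitcev's code/comment consistency objection.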
zigo | I hope I'm not bothering you too much with my use case and that it's entertaining. | 20:47 |
timburke | absolutely! i love hearing about how people are using swift :-) | 20:48 |
zigo | :) | 20:48 |
timburke | how are rebalances getting painful? what are the symptoms, and what's triggering the rebalances? how much is changing at once? | 20:48 |
*** zaitcev_ is now known as zaitcev | 20:51 | |
zigo | timburke: The 1st cluster is always getting full; when it reaches 15%, we add new nodes (one per swift zone), and doing that just creates a storm of network traffic. | 20:51 |
zigo | As our deployment has 1 region in one DC and 2 regions in another, the traffic between DCs is costly. | 20:52 |
zigo | We'll soon upgrade that line; when it gets to 100 Gbit/s, it will be less of a problem. | 20:52 |
zaitcev | But you're familiar with the method where you start with a small weight and ramp it gradually as replicators digest it? | 20:52 |
zigo | Currently, at 20 Gbit/s shared with other services from my company, I have to carefully tweak the object-replicator and rsync. | 20:53 |
zigo | zaitcev: Well, I do, but I also don't want to babysit the rebalance for too long. | 20:54 |
zigo | I usually push the weight up to 100% in something like 6 or 7 weight increments. | 20:54 |
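The gradual-ramp method zaitcev mentions (weights pushed to 100% over 6 or 7 increments, letting the replicators digest each step) can be sketched as a simple schedule; each value would then be applied with `swift-ring-builder ... set_weight` plus a rebalance, and the helper itself is purely illustrative:

```python
def weight_schedule(target_weight, steps=7):
    """Evenly spaced weight increments ending at the device's full weight.

    Apply one step, wait for the replication traffic to settle, then apply
    the next -- spreading the rebalance storm over time instead of moving
    every reassigned partition at once.
    """
    if steps < 1:
        raise ValueError("steps must be >= 1")
    return [round(target_weight * i / steps, 2) for i in range(1, steps + 1)]
```

For a 12 TB drive weighted 6000, six steps would give 1000, 2000, ... 6000.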
timburke | i wonder if handoffs_first might be useful... try to prioritize the intra-DC movement | 20:54 |
timburke | is it mostly triple replica? | 20:55 |
zigo | Well, what I'd like is to have the LEAST possible traffic between regions. | 20:55 |
zigo | Yeah, 3 replicas. | 20:55 |
zigo | Which is why we have 3 regions. | 20:56 |
zigo | We want one replica in each... | 20:56 |
zigo | Each region is divided in 2 swift zones. | 20:56 |
zigo | Zones are physically in different racks. | 20:56 |
zigo | Oh, one more piece of advice I'd love to have ... | 20:57 |
zigo | How many swift-proxies should I set up per core? | 20:57 |
zigo | Is it one per core? Or more? | 20:57 |
zigo | One per thread, I mean... | 20:57 |
zigo | That's currently what we more or less do... | 20:58 |
timburke | https://github.com/openstack/swift/blob/2.23.0/etc/proxy-server.conf-sample#L26-L30 makes it seem like 1 per core would be about right -- but i must admit, i haven't really played with that kind of tuning much. rledisez or clayg might have some insights | 21:00 |
zaitcev | I'd be amazed if a geo-replicated cluster was stuck on proxies. | 21:00 |
zaitcev | Especially since you don't have quorum in any 1 DC | 21:02 |
zigo | zaitcev: My thinking is just about getting the best performance out of the proxies, as we do have HUGE traffic ... | 21:04 |
zigo | It's a backup solution, so most clients are doing thousands of HEAD requests to check whether objects are saved. | 21:04 |
zigo | It doesn't look like we're having any performance issue though! :) | 21:05 |
rledisez | we actually leave it at auto, as it matches the number of cores; it seems a good fit. i'm currently more interested in the way traffic is distributed across the workers | 21:05 |
rledisez | at some point we were also pinning the workers to cores. we saw a small gain, but nothing notable; I think it disappeared during one of our upgrades and nobody cared to get it back | 21:07 |
zigo | One more thing: is it ok to upgrade a cluster directly from Rocky to Train? | 21:13 |
rledisez | zigo: yes, we just did it a few weeks ago (from 2.18 to 2.22). nothing special to note. the recommendation is to upgrade object-server first, then account/container, and finally proxy | 21:15 |
rledisez | always the proxy last, so that if a new feature is presented through the API, all account/container/object servers are already up-to-date and can implement it | 21:16 |
zigo | Thanks for the tip. | 21:17 |
*** tesseract has quit IRC | 21:18 | |
*** BjoernT_ has quit IRC | 21:42 | |
*** BjoernT has joined #openstack-swift | 21:43 | |
*** diablo_rojo has quit IRC | 22:03 | |
*** MooingLe1ur is now known as MooingLemur | 22:13 | |
*** diablo_rojo has joined #openstack-swift | 22:29 | |
*** rcernin has joined #openstack-swift | 22:42 | |
*** BjoernT has quit IRC | 23:01 | |
*** BjoernT has joined #openstack-swift | 23:01 | |
*** BjoernT has quit IRC | 23:15 | |
*** BjoernT has joined #openstack-swift | 23:16 | |
*** BjoernT has quit IRC | 23:17 | |
*** BjoernT has joined #openstack-swift | 23:18 | |
*** BjoernT has quit IRC | 23:19 | |
*** BjoernT has joined #openstack-swift | 23:21 | |
*** BjoernT has quit IRC | 23:36 |
Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!