Friday, 2022-08-05

*** mhen_ is now known as mhen		01:17
*** ykulkarn is now known as yadnesh		06:45
*** yadnesh is now known as ykulkarn		08:51
*** rlandy\|out is now known as rlandy		10:27
mhen	Hello, I've got a question about usage of token authentication in the openstackclient via Keystone.	13:44
mhen	If I do "openstack --os-auth-type=token --os-token=... --os-auth-url https://keystone:5000/v3 --os-identity-api-version=3 image list" while passing a token previously acquired via "openstack token issue", I get "The service catalog is empty.".	13:44
mhen	I did unset all OS_* shell variables beforehand.	13:45
mhen	Any idea what I might be missing? Generic password authentication using an openrc file works fine.	13:45
jeeva	Anyone have any idea?	14:02
jeeva	"Insufficient free space for share creation on host" in /var/log/kolla/manila	14:02
jeeva	when I do "manila extend 'name' newsize"	14:03
jeeva	and I have 2.6 PB free	14:03
jeeva	Insufficient free space for share creation on host B-03-37-openstack-ctl@cephfsnative1#cephfs (requested / avail): 2560000/2492466	14:03
jeeva	trying to extend it to 2600000	14:04
lowercase	jeeva: is this a case where someone sells a 1TB drive but you get home and it's only 960gb cause of the 2^ math?	14:09
lowercase	what i am saying is it might be 2.6P you are typing in, but that might be getting translated to bytes and you might need to go over some.	14:10
jeeva	i dont know where it gets the idea that it is low on diskspace	14:11
jeeva	currently it is "2560000"	14:11
lowercase	what metric is this number in? 2492466	14:12
lowercase	is that 2MB	14:12
jeeva	2.49 PB	14:12
lowercase	2 gigabytes, 2 terabytes	14:12
lowercase	lol	14:12
lowercase	jeeva: i know you think it is, but i'm asking for you to prove it	14:12
jeeva	i dont know where it get that value from though	14:13
lowercase	one sec	14:13
jeeva	if you take the set value " 2560000" that is 2.56 PB	14:13
jeeva	which the share is currently	14:14
lowercase	and if you do a ceph df, does that number match up?	14:14
lowercase	switching vpns, im gonna drop	14:14
jeeva	2.5P 2.4P 94T 97% /share	14:15
lowercase	back	14:15
jeeva	2.5P 2.4P 94T 97% /share	14:16
jeeva	wb	14:16
jeeva	so 2560000 = 2.5P	14:16
lowercase	do that on the ceph cluster, do you have more capacity to give? i.e. does the ceph pool have a limit placed on it	14:17
jeeva	haven't actually checked that ... mmmm	14:18
jeeva	but i have 1.3 PB free, but let me check pool values	14:18
jeeva	max bytes : N/A	14:19
lowercase	i like to use the command, ceph osd pool autoscale-status	14:19
jeeva	so i guess no quota set on cephfs	14:19
lowercase	and look at TARGET-SIZE	14:19
jeeva	no TARGET-SIZE value next to any pools	14:19
lowercase	that's good.	14:20
lowercase	one sec, im looking at my own manila config	14:21
jeeva	maybe i should have a check at setfattr as well	14:21
jeeva	maybe a limit on setfattr -n ceph.quota.max_bytes	14:21
jeeva	ceph.quota.max_bytes="2748779069440000"	14:22
lowercase	check `manila absolute-limits`	14:24
jeeva	2748779069440000 is 2.74 PB, but 2.44PiB	14:24
jeeva	lowercase, https://prnt.sc/gOGMceflXI9l looks pretty vanilla	14:25
jeeva	maxTotalShareGigabytes = 1000 Gb	14:25
lowercase	prnt.sc is blocked by my work	14:25
jeeva	https://zerobin.net/?0f9ab0e20a6c06c6#EV1Ers1Q4Xfq7JMZRMoboAKAIc7UDPBgXPSc8OFV3ew=	14:26
jeeva	1000 gb is 0.001 PB though, and im far past that	14:27
lowercase	jeeva: ceph.quota.max_bytes="2748779069440000"	14:28
lowercase	manila extend is in gigabytes	14:28
lowercase	2600000 gigabytes is ....	14:28
jeeva	yeah, but ceph output is bytes	14:28
lowercase	2.791729 petabytes	14:28
jeeva	yeah but my manila share is in gigabytes "2560000"	14:29
jeeva	and trying to extend to 2644000-ish	14:30
jeeva	which is below 2.79 Pb	14:30
lowercase	currently, but you are trying to extend it to 2600000 gigabytes; plug that into a calculator and it is 2791728742400000 bytes, which exceeds ceph.quota.max_bytes="2748779069440000"	14:30
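The unit mismatch discussed here (manila sizes in gigabytes, ceph.quota.max_bytes in bytes) can be checked with a few lines of Python. This is a minimal sketch, assuming manila's "gigabytes" are binary GiB (2^30 bytes), which is what the byte values quoted in the log imply; the function names are hypothetical and only for illustration.

```python
GIB = 1024 ** 3  # assumption: manila sizes are binary gigabytes (GiB)


def gib_to_bytes(size_gib: int) -> int:
    """Convert a manila share size in GiB to bytes."""
    return size_gib * GIB


def fits_quota(requested_gib: int, quota_bytes: int) -> bool:
    """True if the requested share size does not exceed the byte quota."""
    return gib_to_bytes(requested_gib) <= quota_bytes


QUOTA_BYTES = 2748779069440000  # ceph.quota.max_bytes as quoted in the log
CURRENT_GIB = 2560000           # current share size
REQUESTED_GIB = 2600000         # size requested via "manila extend"

print(gib_to_bytes(CURRENT_GIB))               # the quota equals the current size
print(gib_to_bytes(REQUESTED_GIB))             # the requested size in bytes
print(fits_quota(REQUESTED_GIB, QUOTA_BYTES))  # whether the request fits
```

Under that assumption the existing quota is exactly the current share size, so any extension exceeds it.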
jeeva	thanks, so my suspicioun in this calculator is correct	14:32
jeeva	suspicion*	14:32
jeeva	lowercase, thanks for your time & input	14:32
lowercase	anytime.	14:33
jeeva	this is my scratch storage, that is supposed to be ephemeral, but no, "don't delete files on scratch"	14:33
jeeva	lowercase, one last thing	14:34
jeeva	do you think i can increase the ceph quota, and it will not break manila ?	14:34
jeeva	since doesn't the manila command update the ceph quota ?	14:34
lowercase	Honestly, my largest cluster is 2.4PB so i haven't needed to do this.	14:34
jeeva	what is your largest single file ?	14:35
lowercase	I am strictly prohibited from looking at the data on my clusters, so I can honestly say that i don't know.	14:35
jeeva	seriously ?	14:36
jeeva	so you cant even check a single item ?	14:37
jeeva	top secret kek ?	14:37
jeeva	hehe	14:37
lowercase	I do have a MDS cache problem related to cephfs that i haven't dove into. Do you have a way that i can look at sizes of data but not the data itself?	14:37
jeeva	what is your mds_cache_memory_limit	14:39
lowercase	let me get into that cluster, one sec	14:40
jeeva	do you have slow / trim MDS issue ?	14:40
lowercase	1 clients failing to respond to cache pressure	14:42
jeeva	that is a common thing for us in HPC	14:43
lowercase	MDS_CLIENT_RECALL: 1 clients failing to respond to cache pressure	14:43
lowercase	mds.alplpmultceph03(mds.0): **** failing to respond to cache pressure client_id:	14:43
jeeva	we normally just let the job continue to run, and then set the client (compute node) to reboot when the job is done	14:43
lowercase	ah shit, i censored the other stuff	14:43
lowercase	nah, this is constant	14:44
jeeva	yeah so you need to increase your mds_cache_memory_limit	14:44
jeeva	i.e. have more memory on the controller	14:44
jeeva	which is one of the issues	14:44
jeeva	alternatively check this doc that i sometimes reference	14:44
jeeva	https://indico.cern.ch/event/588794/contributions/2374222/attachments/1383112/2103509/Configuring_Ceph.pdf	14:45
jeeva	i actually had that issue last week again, but it was a job that was running that wasn't properly coded	14:45
jeeva	so it messed around with the storage	14:45
jeeva	25 Gb/s	14:45
jeeva	i was playing around with my MDS cache pressure issues last week again, had another doc, just have to find it now	14:46
jeeva	btw my mds cache memory limit is currently on 100 GB	14:48
lowercase	oh wow, i just bumped it from the default of 4Gi to 12Gi.	14:48
lowercase	the warning went away but time will tell now	14:48
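mds_cache_memory_limit is set in bytes, so the 4Gi to 12Gi bump mentioned above works out as follows. A small sketch only: the helper name is hypothetical, and the printed `ceph config set` line assumes a Ceph release with the centralized config database; adapt it to however the cluster is actually configured.

```python
def gib(n: int) -> int:
    """Convert GiB to bytes."""
    return n * 1024 ** 3


DEFAULT_LIMIT = gib(4)  # the default mentioned in the log
NEW_LIMIT = gib(12)     # the value lowercase bumped it to

# Assumption: the limit is applied through the centralized config store.
print(f"ceph config set mds mds_cache_memory_limit {NEW_LIMIT}")
```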
jeeva	oh wow, 4 GB that is way under specced	14:49
jeeva	for like 3 basic osd nodes	14:49
lowercase	this is an 18 node cluster, with 2.4PB of space, all the drives are behind a raid controller at raid 5.	14:50
lowercase	so, 65 osds	14:50
lowercase	756 gigs of memory each node.	14:50
jeeva	running 36 node cluster, 24 x 16 TB , 48 core x 256 gb ram each	14:52
jeeva	no RAID	14:52
jeeva	~12 Pib Raw	14:52
jeeva	each node has NVMe for rockswal/db partition, and 2x 500 GB SSD for cephfs_metadata	14:52
lowercase	you got nvme	14:53
lowercase	how is the performance on those	14:53
lowercase	we are just about to buy a few racks of them.	14:53
jeeva	read somewhere people say the difference is minimal but for me it was a massive change	14:53
jeeva	POC cluster was exactly the same, but the NVMes weren't configured	14:53
jeeva	got like 45% throughput of the existing one with RockswalDB	14:54
jeeva	on the bluestore (collocated)	14:54
lowercase	I'm not familiar with RockswalDB	14:55
jeeva	ag that is what i call it	14:56
jeeva	let me get the correct name	14:56
jeeva	`bluestore_wal_devices`	14:56
jeeva	https://github.com/facebook/rocksdb	14:56
jeeva	you configure it in your ceph inventory file	14:57
jeeva	B-02-40-cephosd.maas osd_objectstore=bluestore devices="[ '/dev/sda', '/dev/sdb', '/dev/sdc', '/dev/sdd', '/dev/sde', '/dev/sdf', '/dev/sdg', '/dev/sdh', '/dev/sdi', '/dev/sdj', '/dev/sdk', '/dev/sdl', '/dev/sdm', '/dev/sdn', '/dev/sdo', '/dev/sdp', '/dev/sdq', '/dev/sdr', '/dev/sds', '/dev/sdt', '/dev/sdu', '/dev/sdv', '/dev/sdw', '/dev/sdx' ]" dedicated_devices="[ '/dev/nvme0n1', '/dev/nvme0n1',	14:58
jeeva	'/dev/nvme0n1', '/dev/nvme0n1', '/dev/nvme0n1', '/dev/nvme0n1', '/dev/nvme0n1', '/dev/nvme0n1', '/dev/nvme0n1', '/dev/nvme0n1', '/dev/nvme0n1', '/dev/nvme0n1', '/dev/nvme0n1', '/dev/nvme0n1', '/dev/nvme0n1', '/dev/nvme0n1', '/dev/nvme0n1', '/dev/nvme0n1', '/dev/nvme0n1', '/dev/nvme0n1', '/dev/nvme0n1', '/dev/nvme0n1', '/dev/nvme0n1', '/dev/nvme0n1' ]"	14:58
lowercase	oh! you're using a device to store the wal and block.db?	14:59
lowercase	I'm here: https://docs.ceph.com/en/octopus/rados/configuration/bluestore-config-ref/	14:59
lowercase	ill switch over to your doc now	15:00
jeeva	yeah, so it creates increments of 3 GB , 30 GB, 300 GB partitions to offload it	15:01
jeeva	so just make sure you buy the correct size NVMe	15:01
jeeva	since you can't set that value	15:01
lowercase	Do you have any nvme's as an osd?	15:01
jeeva	nope, only SSDs	15:01
lowercase	have you tested it?	15:02
lowercase	cause this might change how i approach the whole new cluster.	15:02
jeeva	nope, but i dont think i see that as something i would consider at the time being	15:02
jeeva	if it was VMware vSAN i would consider it	15:02
jeeva	but not with ceph	15:02
lowercase	I'm heavily concerned about the durability of nvme's as an osd.	15:03
jeeva	i have a class SSD2 for our "fast" pool & separate SSD class pool for cephfs_metadata	15:03
lowercase	yeah same, we got spinners for our slow pool and ssds for our fast pool	15:04
jeeva	we have big files, so nvme wouldn't be feasible	15:04
jeeva	we got users with like 10 TB single files	15:04
lowercase	that's very large.	15:04
lowercase	oh you said you work in hpc.	15:04
jeeva	ya well, they take pictures of blackholes with 64 dishes	15:04
jeeva	data intensive astronomy	15:05
lowercase	you one of those cern guys	15:05
jeeva	no no, im in south africa	15:05
jeeva	Inter-university Institute for Data Intensive Astronomy	15:05
lowercase	https://www.bbc.com/news/science-environment-47891902	15:06
lowercase	okay, so that one isn't you guys	15:06
jeeva	no as far as i know our system did play a part in it	15:07
jeeva	well not the dishes	15:08
jeeva	but the HPC part	15:08
jeeva	lowercase,: thanks for the mental jousting, ceph fs subvolume resize cephfs 5334a96f-3cbc-4447-8187-7e61219a243f 2858730232217600	15:11
jeeva	was the fix	15:11
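The byte value passed to `ceph fs subvolume resize` above can be decoded to confirm it leaves headroom over the extension that originally failed. A minimal sketch, assuming binary units (GiB, TiB, PiB) as used elsewhere in the log; `describe_bytes` is a hypothetical helper, not part of any Ceph or manila tooling.

```python
def describe_bytes(n: int) -> dict:
    """Express a byte count in binary and decimal units."""
    return {
        "GiB": n / 1024 ** 3,
        "TiB": n / 1024 ** 4,
        "PiB": n / 1024 ** 5,
        "PB": n / 1000 ** 5,
    }


NEW_QUOTA = 2858730232217600   # value passed to the resize command in the log
FAILED_REQUEST_GIB = 2600000   # the manila extend size that was rejected

sizes = describe_bytes(NEW_QUOTA)
for unit, value in sizes.items():
    print(f"{value:.4f} {unit}")

# The new quota should be at least as large as the rejected request.
print(NEW_QUOTA >= FAILED_REQUEST_GIB * 1024 ** 3)
```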
lowercase	glad i was able to help and meet a cool new friend in the process.	15:12
jeeva	ditto	15:12
jeeva	:)	15:12
jeeva	now these researchers can go apeshit over the weekend, since it's a long weekend and i dont have to worry about it running full before then	15:13
jeeva	lowercase, how much memory do your controllers have? and do you run mgr/mds/mon on them ?	15:15
jeeva	should probably have moved this to #ceph	15:15
lowercase	i get banned from there frequently because i join/part too often	15:15
lowercase	let's see if the bot allows me in	15:16
jeeva	run a BNC :P	15:16
lowercase	i could... but i already work enough hours doing this stuff.	15:16
jeeva	anyway, after 5PM, home time!	15:19
jeeva	or rather, i WFH, time to move away from the desk	15:20
lowercase	welcome to your weekend.	15:20
lowercase	have a good one	15:20
