*** rbudden has quit IRC | 02:52 | |
*** paken has joined #scientific-wg | 08:04 | |
*** priteau has joined #scientific-wg | 08:27 | |
*** paken has quit IRC | 09:11 | |
*** paken has joined #scientific-wg | 09:25 | |
*** rbudden has joined #scientific-wg | 12:35 | |
*** KurtB has joined #scientific-wg | 13:42 | |
KurtB | Good morning. Is anyone here running openstack controllers and/or compute on AMD EPYC? | 13:44 |
KurtB | That, plus trying to figure out how to size ceph, is making my head hurt. :-) | 13:44 |
*** ildikov has quit IRC | 14:08 | |
*** ildikov has joined #scientific-wg | 14:09 | |
*** paken has quit IRC | 14:30 | |
jmlowe_ | I'm not running on amd | 14:39 |
jmlowe_ | I've got a few years of ceph experience, maybe I can help? | 14:40 |
jmlowe_ | KurtB: ^^^^ | 14:40 |
KurtB | jmlowe_: Hey. Yeah. I'm trying not to bother you. | 16:33 |
KurtB | I have some (six) disk trays that I inherited from a failed experiment, 256G RAM, 52 4TB drives, 4 800G SSDs. I want to turn that into a ceph cluster. If I set those up as 3x replication, I'm wondering how to size the metadata servers. | 17:01 |
KurtB | Probably run MONs on the storage servers. | 17:09 |
jmlowe_ | I'm not sure about metadata server sizing; they need very little storage, but I'd put the metadata pool on the ssd's | 17:23 |
jmlowe_ | you should be ok in terms of memory per osd | 17:24 |
jmlowe_ | thinking about this a little more, I'd use lvm and put all of the ssd's into one vg, slice that into 52 50GB partitions to use as --block.db, then that leaves you with 600GB left over to back an ssd-based osd | 18:07 |
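(A rough sketch of the carve-up jmlowe_ describes: one volume group across the four 800G SSDs, 52 x 50G logical volumes for --block.db, and the ~600G remainder as an SSD-only OSD. Device paths and the "ceph-db" names are made-up placeholders, and the script only prints the LVM/ceph-volume commands rather than running anything.)

    # Sketch only: device paths and VG/LV names are hypothetical; review the
    # printed commands before running any of them for real.
    ssds = ["/dev/ssd%d" % i for i in range(4)]      # the four 800G SSDs
    hdds = ["/dev/hdd%02d" % i for i in range(52)]   # the 52 x 4TB spinners

    cmds = ["pvcreate %s" % d for d in ssds]
    cmds.append("vgcreate ceph-db %s" % " ".join(ssds))

    # one 50G block.db LV per spinning OSD (52 x 50G = 2600G of the 3200G)
    for i in range(len(hdds)):
        cmds.append("lvcreate -L 50G -n db-%02d ceph-db" % i)

    # the ~600G left over backs a small SSD-only OSD
    cmds.append("lvcreate -l 100%FREE -n ssd-osd ceph-db")

    # pair each HDD with its block.db LV via ceph-volume
    for i, hdd in enumerate(hdds):
        cmds.append("ceph-volume lvm create --bluestore --data %s"
                    " --block.db ceph-db/db-%02d" % (hdd, i))
    cmds.append("ceph-volume lvm create --bluestore --data ceph-db/ssd-osd")

    print("\n".join(cmds))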
jmlowe_ | your mention of metadata makes me think you are wanting to play with cephfs? | 18:08 |
*** paken has joined #scientific-wg | 18:18 | |
KurtB | cephfs... That's an option. | 19:03 |
KurtB | Do you use ceph as strictly block storage? | 19:03 |
jmlowe_ | I've started dabbling with cephfs via manila with ganesha nfs exports | 19:04 |
jmlowe_ | but for the most part block storage, we have a user or two that do s3/swift via radosgw | 19:05 |
KurtB | A buddy of mine has been struggling with getting a ceph cluster running reliably for a couple of months. He's doing cephfs on top of an erasure-coded cluster. | 19:05 |
jmlowe_ | https://zonca.github.io/2018/03/zarr-on-jetstream.html | 19:05 |
jmlowe_ | that's an interesting thing a user is doing | 19:06 |
jmlowe_ | so far so good with me testing cephfs on an erasure coded pool with metadata on a nvme pool | 19:06 |
jmlowe_ | single mds though | 19:06 |
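(A hedged sketch of that layout: CephFS data on an erasure-coded pool, metadata on a replicated pool pinned to fast devices. Pool names, PG counts, k/m, and the "nvme" device class are illustrative guesses; only the command shapes are the stock Luminous-era ones.)

    # Illustrative command sequence, printed rather than executed; names,
    # PG counts and the EC profile values are placeholders.
    cmds = [
        "ceph osd erasure-code-profile set ec42 k=4 m=2",
        "ceph osd pool create cephfs_data 1024 1024 erasure ec42",
        "ceph osd pool set cephfs_data allow_ec_overwrites true",  # needed for CephFS on EC
        "ceph osd crush rule create-replicated fast default host nvme",
        "ceph osd pool create cephfs_metadata 64 64",
        "ceph osd pool set cephfs_metadata crush_rule fast",
        "ceph fs new cephfs cephfs_metadata cephfs_data",
    ]
    print("\n".join(cmds))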
jmlowe_ | The hdf guys have gotten their thingy working on our object store so you can consume hdf directly from the object store | 19:07 |
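(jmlowe_ doesn't name the tool, but this sounds like the HDF Group's HSDS service and its h5pyd client; if so, reading HDF straight from the object store looks roughly h5py-like. The domain path, endpoint URL, and dataset name below are invented for illustration.)

    # Guessed HSDS/h5pyd example; the domain, endpoint and dataset name are
    # all made up, but the File/dataset access mirrors plain h5py.
    import h5pyd

    f = h5pyd.File("/shared/example.h5", "r",
                   endpoint="http://hsds.example.org:5101")
    dset = f["temperature"]          # behaves like an h5py dataset
    print(dset.shape, dset[:10])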
KurtB | He's been trying to run the MDSs on the storage servers, and doesn't have enough RAM. 36 8TB drives each, with only 64G RAM | 19:08 |
jmlowe_ | 99.9% block usage I'd say | 19:08 |
KurtB | Oh, that's cool! | 19:08 |
jmlowe_ | I've pressed a compute node or two into service for mds, they can be kind of memory pigs | 19:08 |
KurtB | Are you exporting block to anything outside of OpenStack? | 19:09 |
jmlowe_ | I am not | 19:09 |
jmlowe_ | I don't think cephfs w/ manila will be ready for prime time until Queens and Ganesha 2.7; I'm of the opinion: all HA, all the time | 19:10 |
KurtB | There are a couple of 32G servers I inherited with those storage trays. | 19:10 |
KurtB | Not sure if that's enough for an MDS | 19:11 |
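(One knob relevant to that question: since Luminous the MDS cache can be capped with mds_cache_memory_limit, though the daemon's resident memory runs somewhat above the cap, so leave headroom. The 16 GiB figure below is only a guess for a 32G box.)

    # Illustrative ceph.conf [mds] fragment; the 16 GiB cache cap is a
    # guessed value for a 32G server, not a recommendation from the log.
    lines = ["[mds]",
             "mds_cache_memory_limit = %d" % (16 * 1024 ** 3)]
    print("\n".join(lines))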
jmlowe_ | my plan is ganesha/manila agent/radosgw/mds, on 3 x 24 core 128G nodes | 19:11 |
KurtB | Nice! | 19:12 |
jmlowe_ | so far I only have 2 x radosgw and everything else is single instance | 19:12 |
KurtB | Seems there are a lot of knobs to turn to get ceph happy... but I think my buddy is under-provisioned for RAM. | 19:13 |
jmlowe_ | bollig: that dask thingy might be interesting to you guys | 19:13 |
jmlowe_ | or I should say Zarr for Dask | 19:14 |
* KurtB is looking at dask now | 19:19 | |
KurtB | zarr looks interesting | 19:20 |
KurtB | lemme get ceph working right, and then I'll hack on that! :-) | 19:21 |
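(A tiny sketch of the Zarr-for-Dask pattern from the post linked above: write a chunked dask array to a zarr store and read it back lazily. The local "example.zarr" path stands in for the object-store location the blog actually targets.)

    # Minimal zarr + dask sketch; "example.zarr" is a local stand-in for an
    # object-store URL, and the array sizes are arbitrary.
    import dask.array as da

    x = da.random.random((10000, 10000), chunks=(1000, 1000))
    x.to_zarr("example.zarr", overwrite=True)     # one object/file per chunk

    y = da.from_zarr("example.zarr")              # lazy; chunks load on demand
    print(y.mean().compute())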
jmlowe_ | Do I remember correctly that you are at NIST in Boulder? | 19:22 |
KurtB | I'm at NREL (National Renewable Energy Laboratory) in Golden, CO. | 19:28 |
KurtB | I don't have a big enough trust fund to live in Boulder! :-) | 19:28 |
jmlowe_ | ah, ok, I was just out there for the SEA conference at UCAR | 19:28 |
KurtB | I'm like 20 miles south of UCAR. Really close. | 19:29 |
*** paken has quit IRC | 19:46 | |
bollig | jmlowe_: cool. thanks for the tip | 19:47 |
bollig | KurtB: I’m also interested in how well epyc performs as a hypervisor | 19:57 |
KurtB | bollig: One of my cohorts just gave me an EPYC to test. I'm going to build a test cluster and shove it in as a compute node. I'll let you know. | 20:05 |
bollig | jmlowe_: has zonca or anyone else used dask with kubernetes on jetstream? | 20:28 |
jmlowe_ | We don't think so | 20:30 |
jmlowe_ | There is a guy working on it | 20:30 |
jmlowe_ | Kevin Paul from Pangeo | 20:30 |
bollig | ok great to know | 20:31 |
jmlowe_ | Did you say you had barbican working? | 20:32 |
bollig | no, not yet. | 20:34 |
*** priteau has quit IRC | 20:59 | |
*** priteau has joined #scientific-wg | 21:00 | |
*** priteau has quit IRC | 21:04 | |
*** priteau has joined #scientific-wg | 21:58 | |
*** priteau has quit IRC | 22:08 | |
*** rbudden has quit IRC | 23:50 |