*** VW has joined #openstack-operators | 00:05 | |
*** emagana has quit IRC | 00:14 | |
*** emagana has joined #openstack-operators | 00:28 | |
*** Apoorva_ has joined #openstack-operators | 00:30 | |
*** rebase has quit IRC | 00:31 | |
*** emagana has quit IRC | 00:32 | |
*** Apoorva has quit IRC | 00:34 | |
*** Apoorva_ has quit IRC | 00:35 | |
*** VW has quit IRC | 00:36 | |
*** catintheroof has quit IRC | 00:52 | |
*** gyee has quit IRC | 01:03 | |
*** jamesden_ has joined #openstack-operators | 01:16 | |
*** jamesden_ has quit IRC | 01:20 | |
*** mriedem has quit IRC | 01:22 | |
*** jamesden_ has joined #openstack-operators | 01:23 | |
*** jamesden_ has quit IRC | 01:31 | |
*** VW has joined #openstack-operators | 01:39 | |
*** cemason1 has joined #openstack-operators | 03:07 | |
*** cemason has quit IRC | 03:07 | |
*** emagana has joined #openstack-operators | 03:15 | |
*** emagana has quit IRC | 03:20 | |
*** emagana has joined #openstack-operators | 03:28 | |
*** emagana has quit IRC | 03:32 | |
*** fandi has joined #openstack-operators | 03:55 | |
*** fandi has quit IRC | 03:56 | |
*** fandi has joined #openstack-operators | 03:57 | |
*** fandi has quit IRC | 03:59 | |
*** fandi has joined #openstack-operators | 04:00 | |
*** Rockyg has quit IRC | 04:01 | |
*** fandi has quit IRC | 04:01 | |
*** fandi has joined #openstack-operators | 04:03 | |
*** fragatin_ has joined #openstack-operators | 04:15 | |
*** fragatina has quit IRC | 04:19 | |
*** fragatin_ has quit IRC | 04:20 | |
*** fragatina has joined #openstack-operators | 04:35 | |
*** fragatina has quit IRC | 04:39 | |
*** fragatina has joined #openstack-operators | 04:55 | |
*** fragatina has quit IRC | 04:57 | |
*** fragatina has joined #openstack-operators | 04:58 | |
*** simon-AS5591 has joined #openstack-operators | 05:07 | |
*** fragatina has quit IRC | 05:17 | |
*** aojea has joined #openstack-operators | 05:21 | |
*** jbadiapa has quit IRC | 05:34 | |
*** yprokule has joined #openstack-operators | 05:39 | |
*** aojea has quit IRC | 05:39 | |
*** emagana has joined #openstack-operators | 05:40 | |
*** emagana has quit IRC | 05:45 | |
*** fandi has quit IRC | 05:45 | |
*** emagana has joined #openstack-operators | 05:56 | |
*** simon-AS5591 has quit IRC | 05:58 | |
*** rarcea has joined #openstack-operators | 05:59 | |
*** Oku_OS-away is now known as Oku_OS | 06:04 | |
*** cemason1 is now known as cemason | 06:06 | |
*** rcernin has joined #openstack-operators | 06:28 | |
*** slaweq has joined #openstack-operators | 06:37 | |
*** pcaruana has joined #openstack-operators | 06:40 | |
*** jbadiapa has joined #openstack-operators | 06:52 | |
*** tesseract has joined #openstack-operators | 06:56 | |
*** d0ugal has quit IRC | 06:56 | |
*** aojea has joined #openstack-operators | 07:16 | |
*** aojea has quit IRC | 07:18 | |
*** aojea has joined #openstack-operators | 07:18 | |
*** manheim has joined #openstack-operators | 07:20 | |
*** arnewiebalck_ has joined #openstack-operators | 07:35 | |
*** manheim has quit IRC | 07:37 | |
*** manheim has joined #openstack-operators | 07:38 | |
*** arnewiebalck_ has quit IRC | 07:44 | |
*** slaweq has quit IRC | 07:48 | |
*** slaweq has joined #openstack-operators | 07:49 | |
*** bjolo has joined #openstack-operators | 08:00 | |
*** racedo has joined #openstack-operators | 08:01 | |
*** arnewiebalck_ has joined #openstack-operators | 08:02 | |
*** slaweq has quit IRC | 08:08 | |
*** slaweq has joined #openstack-operators | 08:09 | |
*** paramite has joined #openstack-operators | 08:15 | |
*** vinsh_ has quit IRC | 08:22 | |
*** vinsh has joined #openstack-operators | 08:27 | |
*** arnewiebalck_ has quit IRC | 08:36 | |
*** cartik has joined #openstack-operators | 08:59 | |
*** cartik has quit IRC | 09:07 | |
*** dbecker has quit IRC | 09:07 | |
*** dbecker has joined #openstack-operators | 09:08 | |
*** electrofelix has joined #openstack-operators | 09:20 | |
*** derekh has joined #openstack-operators | 09:33 | |
*** cartik has joined #openstack-operators | 09:39 | |
*** rmart04 has joined #openstack-operators | 09:51 | |
*** electrofelix has quit IRC | 10:05 | |
*** electrofelix has joined #openstack-operators | 10:08 | |
*** aojea has quit IRC | 10:08 | |
*** slaweq has quit IRC | 10:19 | |
*** aojea has joined #openstack-operators | 10:22 | |
*** aojea has quit IRC | 10:39 | |
*** rmart04 has quit IRC | 10:45 | |
*** fragatina has joined #openstack-operators | 10:47 | |
*** aojea has joined #openstack-operators | 10:47 | |
*** rmart04 has joined #openstack-operators | 10:50 | |
*** slaweq has joined #openstack-operators | 11:00 | |
*** fragatina has quit IRC | 11:03 | |
*** fragatina has joined #openstack-operators | 11:03 | |
*** markvoelker has quit IRC | 11:06 | |
*** markvoelker has joined #openstack-operators | 11:06 | |
*** markvoelker has quit IRC | 11:11 | |
*** rarcea has quit IRC | 11:11 | |
*** alexpilotti has quit IRC | 11:26 | |
*** alexpilotti has joined #openstack-operators | 11:27 | |
*** dalees has quit IRC | 11:30 | |
*** alexpilotti has quit IRC | 11:31 | |
*** dalees has joined #openstack-operators | 11:33 | |
*** Miouge has joined #openstack-operators | 11:37 | |
*** alexpilotti has joined #openstack-operators | 11:43 | |
*** alexpilotti has quit IRC | 11:47 | |
*** emagana has quit IRC | 11:54 | |
*** dalees has quit IRC | 11:57 | |
*** benj_ has quit IRC | 12:00 | |
*** alexpilotti has joined #openstack-operators | 12:00 | |
*** cartik has quit IRC | 12:01 | |
*** alexpilotti has quit IRC | 12:04 | |
*** alexpilotti has joined #openstack-operators | 12:05 | |
*** dalees has joined #openstack-operators | 12:08 | |
*** alexpilotti has quit IRC | 12:09 | |
*** zenirc369 has joined #openstack-operators | 12:19 | |
*** fragatina has quit IRC | 12:22 | |
*** fragatina has joined #openstack-operators | 12:23 | |
*** catintheroof has joined #openstack-operators | 12:27 | |
*** catintheroof has quit IRC | 12:30 | |
*** catintheroof has joined #openstack-operators | 12:30 | |
*** benj_ has joined #openstack-operators | 12:32 | |
*** markvoelker has joined #openstack-operators | 12:39 | |
*** catintheroof has quit IRC | 12:40 | |
*** catintheroof has joined #openstack-operators | 12:41 | |
*** catintheroof has quit IRC | 12:41 | |
*** catintheroof has joined #openstack-operators | 12:41 | |
*** catintheroof has quit IRC | 12:43 | |
*** catintheroof has joined #openstack-operators | 12:43 | |
*** pontusf3 has quit IRC | 12:45 | |
*** pontusf3 has joined #openstack-operators | 12:45 | |
*** liverpooler has joined #openstack-operators | 12:46 | |
*** zenirc369 has quit IRC | 12:55 | |
*** alexpilotti has joined #openstack-operators | 13:01 | |
*** slaweq has quit IRC | 13:03 | |
*** bjolo has quit IRC | 13:15 | |
*** alexpilo_ has joined #openstack-operators | 13:19 | |
*** mriedem has joined #openstack-operators | 13:20 | |
*** dalees has quit IRC | 13:21 | |
*** slaweq has joined #openstack-operators | 13:22 | |
*** slaweq has quit IRC | 13:22 | |
*** manheim has quit IRC | 13:22 | |
*** alexpilotti has quit IRC | 13:22 | |
*** erhudy has joined #openstack-operators | 13:23 | |
*** alexpilotti has joined #openstack-operators | 13:24 | |
*** VW_ has joined #openstack-operators | 13:24 | |
*** alexpilo_ has quit IRC | 13:26 | |
*** VW has quit IRC | 13:28 | |
*** dalees has joined #openstack-operators | 13:29 | |
*** VW_ has quit IRC | 13:29 | |
*** ig0r_ has joined #openstack-operators | 13:34 | |
*** cartik has joined #openstack-operators | 13:42 | |
erhudy | anyone have any feedback (good/bad) on block live migration these days? | 13:43 |
*** jamesden_ has joined #openstack-operators | 13:44 | |
erhudy | as of right now i can't use it because in liberty it doesn't seem to work with LVM, but i'm interested to hear if people have positive experiences with it now in m/n/o | 13:45 |
*** cartik has quit IRC | 13:48 | |
*** manheim has joined #openstack-operators | 13:48 | |
*** chlong has joined #openstack-operators | 13:51 | |
*** maishsk has joined #openstack-operators | 14:03 | |
*** manheim_ has joined #openstack-operators | 14:14 | |
*** manheim has quit IRC | 14:17 | |
mnaser | how much memory does everyone reserve for compute nodes these days | 14:17 |
mnaser | we have 8 reserved and we're seeing some VMs OOM :\ | 14:18 |
*** slaweq has joined #openstack-operators | 14:22 | |
*** fragatina has quit IRC | 14:23 | |
*** alexpilotti has quit IRC | 14:25 | |
*** alexpilotti has joined #openstack-operators | 14:26 | |
*** slaweq has quit IRC | 14:27 | |
mrhillsman | mnaser depends on what you are running on the compute node | 14:28 |
mdorman | mnaser: i think we do 3 or 4 GB. i’m surprised 8 isn’t enough, what else are you running? | 14:28 |
mrhillsman | i would think 2 is probably good | 14:28 |
mnaser | mrhillsman / mdorman 2 ceph osd processes, but they are only consuming 1gb of ram each (hence the 8) | 14:29 |
mrhillsman | but depends on a few factors indeed | 14:29 |
mnaser | do you have swap on your compute nodes | 14:29 |
mnaser | wonder if that might help | 14:29 |
cnf | maybe something is leaking | 14:29 |
mnaser | the other thing is we have rbd cache.. i wonder if the rbd cache on each instance is increasing the space used by each vm | 14:30 |
cnf | cache should be marked as releasable | 14:30 |
*** jbadiapa has quit IRC | 14:30 | |
mnaser | i haven't yet done *heavy* investigating | 14:30 |
mnaser | but ive seen two cases of oom's | 14:30 |
cnf | out of mana sucks :/ | 14:31 |
mnaser | the only pattern is that they are high mem instance (one was 96, one was 64g) | 14:31 |
mnaser | do you run swap compute nodes? i wonder if that would help | 14:32 |
mnaser | [12842635.802674] Killed process 25433 (qemu-kvm) total-vm:68897952kB, anon-rss:67541888kB, file-rss:0kB | 14:33 |
mnaser | so that one had around 4gb more | 14:35 |
mdorman | mnaser: one thing we’ve thought about doing is adjusting the oom_score_adj setting for qemu-kvm processes, to hopefully prevent vms from getting oom killed. haven’t actually set that up yet, but the theory is that vms would normally not be the cause of an oom situation (assume you’re not doing crazy oversubscription, etc.) and it’s usually something else on the HVs that’s ballooning memory. so by forcing the oom killer to never kill | 14:45 |
mdorman | qemu-kvm processes, theoretically it would choose to kill the thing that’s actually the source of the problem. http://www.oracle.com/technetwork/articles/servers-storage-dev/oom-killer-1911807.html is one article that kind of explains how. | 14:45 |
mdorman | we have thought about doing that for rmq as well, b/c we’ve had situations when rmq gets oom killed, too. we’ve seen a handful of vms get oom killed as well, but it’s not a systematic problem for us. | 14:45 |
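A minimal sketch of the oom_score_adj idea mdorman describes above, assuming the process name qemu-kvm and a full-exemption value of -1000 (neither is something mdorman said they actually deployed). The kernel never selects a process whose oom_score_adj is -1000 as an OOM victim:

```python
#!/usr/bin/env python3
# Sketch: exempt qemu-kvm processes from the OOM killer by writing -1000
# to /proc/<pid>/oom_score_adj (valid range -1000..1000; -1000 means
# "never pick this process"). Run as root, e.g. periodically from cron.
import os

TARGET = "qemu-kvm"   # process name to protect (assumption)
SCORE = "-1000"       # full exemption; a milder negative value also works

for pid in filter(str.isdigit, os.listdir("/proc")):
    try:
        with open(f"/proc/{pid}/comm") as f:
            if f.read().strip() != TARGET:
                continue
        with open(f"/proc/{pid}/oom_score_adj", "w") as f:
            f.write(SCORE)
    except (FileNotFoundError, PermissionError, ProcessLookupError):
        continue  # process exited between listing and writing, or not root
```

As mnaser points out below, the trade-off is that the OOM killer will then pick something else on the hypervisor (an OSD, openvswitch, ...) instead.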
*** marst has quit IRC | 14:48 | |
*** marst has joined #openstack-operators | 14:55 | |
*** maishsk has quit IRC | 14:58 | |
*** rcernin has quit IRC | 15:01 | |
klindgren | mnaser, how much space are you giving the HV? IE how much are you reserving in your nova configs? | 15:02 |
*** jamemxx has joined #openstack-operators | 15:03 | |
*** JillS has joined #openstack-operators | 15:03 | |
jamemxx | Hi | 15:03 |
klindgren | hi | 15:04 |
mrhillsman | hi | 15:04 |
*** cemason has quit IRC | 15:04 | |
jamemxx | I'm here for the LDT meeting. But I can see from the ML that Matt could not attend | 15:05 |
jamemxx | He suggested next Thursday, 4/27, same time, so I advocate for that as well. | 15:07 |
*** Oku_OS is now known as Oku_OS-away | 15:07 | |
*** cemason has joined #openstack-operators | 15:07 | |
*** zenirc369 has joined #openstack-operators | 15:07 | |
jamemxx | I'll respond in the ML. | 15:08 |
mnaser | mdorman that worries me; instead, a ceph osd process will be killed, or openvswitch... so im not sure what would be better to kill | 15:10 |
mnaser | klindgren reserving 8gb in nova.conf | 15:10 |
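For reference, the nova.conf knob being discussed is reserved_host_memory_mb; a minimal illustrative snippet with the 8 GB figure mnaser mentions (not his actual config):

```ini
# /etc/nova/nova.conf on the compute node (illustrative values)
[DEFAULT]
# Memory (MB) withheld from what nova advertises to the scheduler,
# left for the host OS, ceph OSDs, openvswitch, caches, etc.
reserved_host_memory_mb = 8192
```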
mdorman | mnaser: yeah damned if you do, damned if you don’t, hah. | 15:11 |
mnaser | mdorman do you have swap setup on your compute nodes? | 15:11 |
mnaser | i wonder if that can alleviate some of the mess | 15:11 |
mdorman | jamemxx: next week is better for me too | 15:11 |
mdorman | mnaser: yes, but we do our best to make sure it’s not used | 15:11 |
mdorman | mnaser: does seem like that could help your situation a little. at least you’d swap instead of oomkill | 15:12 |
*** alexpilo_ has joined #openstack-operators | 15:14 | |
*** alexpilotti has quit IRC | 15:14 | |
*** pcaruana has quit IRC | 15:14 | |
mnaser | mdorman this looks really weird in the oom kill (unless im misunderstanding things) | 15:16 |
mnaser | 359055 total pagecache pages, 0 pages in swap cache | 15:16 |
mnaser | an oom kill with that much page cache? | 15:16 |
mnaser | getconf PAGESIZE => 4096 bytes .. 359055 * 4096 = 1470689280 bytes => 1.47gb of page cache | 15:18 |
mnaser | i guess thats reasonable during an oom kill | 15:18 |
*** dminer has joined #openstack-operators | 15:21 | |
*** alexpilotti has joined #openstack-operators | 15:23 | |
mnaser | so based on my simple math: i'm seeing an average of 1~2gb to 4gb extra memory per vm | 15:24 |
mnaser | with 4gb reserved i can imagine how that can easily oom | 15:24 |
*** rmart04 has quit IRC | 15:24 | |
*** alexpilo_ has quit IRC | 15:26 | |
klindgren | thought swapping would easily kill the box as well | 15:27 |
zioproto | hey folks, for those of you testing the upgrade to Newton... have a look at this https://bugs.launchpad.net/nova/+bug/1684861 | 15:27 |
openstack | Launchpad bug 1684861 in OpenStack Compute (nova) "Database online_data_migrations in newton fail due to missing keypairs" [Undecided,New] | 15:27 |
*** simon-AS559 has joined #openstack-operators | 15:28 | |
mnaser | klindgren the way im thinking is that unused memory would be swapped out | 15:28 |
mnaser | giving us more memory to work with | 15:29 |
mnaser | im pretty sure in linux swap doesnt mean use when you're out of memory but rather "work with this extra scratch space" | 15:29 |
*** slaweq has joined #openstack-operators | 15:29 | |
*** simon-AS5591 has joined #openstack-operators | 15:32 | |
*** simon-AS559 has quit IRC | 15:32 | |
klindgren | the issue with a situation in which oomkiller gets triggered is that you can easily run into a problem where the server is paging processes into/out of swap | 15:33 |
klindgren | which causes the box to slowdown due to iowait | 15:33 |
mnaser | klindgren agreed, that's valid as well | 15:34 |
mnaser | maybe start by figuring out why there is such a huge overhead | 15:34 |
mnaser | i really wonder if its the rbd cache | 15:34 |
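If the librbd cache is the culprit, its per-volume footprint is bounded by the client-side cache settings; a hedged ceph.conf sketch showing where that knob lives (the sizes below are the upstream defaults, not a recommendation):

```ini
# ceph.conf on the hypervisor (illustrative; sizes are the defaults)
[client]
rbd cache = true
# writeback cache allocated per attached RBD image, in bytes
rbd cache size = 33554432         # 32 MiB
rbd cache max dirty = 25165824    # 24 MiB
```

Note that at the default ~32 MiB per volume the cache alone would account for only a small part of the 1-4 GB per-VM overhead discussed above, so it is worth measuring rather than assuming.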
*** alexpilo_ has joined #openstack-operators | 15:36 | |
*** VW has joined #openstack-operators | 15:36 | |
*** alexpilo_ has quit IRC | 15:39 | |
*** alexpilotti has quit IRC | 15:39 | |
*** alexpilotti has joined #openstack-operators | 15:39 | |
*** VW has quit IRC | 15:40 | |
*** VW has joined #openstack-operators | 15:40 | |
*** VW has quit IRC | 15:45 | |
*** Miouge has quit IRC | 15:45 | |
*** aojea has quit IRC | 15:52 | |
*** chyka has joined #openstack-operators | 15:54 | |
erhudy | mnaser: we do 16 but with converged ceph, 5 was not enough and 16 has given us a little more headroom | 15:57 |
erhudy | we also don't oversub memory | 15:57 |
*** ig0r_ has quit IRC | 16:01 | |
*** liverpooler has quit IRC | 16:01 | |
*** newmember has joined #openstack-operators | 16:03 | |
*** liverpooler has joined #openstack-operators | 16:03 | |
*** VW has joined #openstack-operators | 16:06 | |
*** manheim_ has quit IRC | 16:06 | |
*** zenirc369 has quit IRC | 16:07 | |
*** VW has quit IRC | 16:09 | |
*** VW has joined #openstack-operators | 16:09 | |
*** rebase has joined #openstack-operators | 16:09 | |
*** VW has quit IRC | 16:13 | |
erhudy | that's with no swap, we turn swap off on HVs | 16:15 |
erhudy | never had an oomkill | 16:15 |
*** newmember has quit IRC | 16:22 | |
*** newmember has joined #openstack-operators | 16:23 | |
*** yprokule has quit IRC | 16:24 | |
*** paramite has quit IRC | 16:24 | |
logan- | mnaser: no swap here but like yourself we are heavy ceph users and rbd cache is turned on. also I have seen some ram overconsumption memory heavy instances. that's an interesting suspicion though with rbd cache maybe causing the over consumption. | 16:24 |
logan- | we run converged, 1-2 OSD per blade, and reserve 8 i believe | 16:26 |
logan- | erhudy: regarding the block migrates, I don't have any bad stories on xenial/newton w/ libvirt 1.3.1 yet. but we've only been on it a month or two. also I think LVM block migration is not a thing yet even in Pike | 16:30 |
*** chyka has quit IRC | 16:30 | |
*** chyka has joined #openstack-operators | 16:31 | |
logan- | that's been the big show stopper for me to even look seriously at replacing file-based disks with LVM. from the performance side it seems like a no brainer | 16:32 |
*** tesseract has quit IRC | 16:34 | |
*** chyka_ has joined #openstack-operators | 16:35 | |
*** chyka has quit IRC | 16:35 | |
erhudy | yeah, i'm trying to strategize up how to offer people local disks while still having some instance mobility if we need to evac | 16:37 |
logan- | yep- sparsity and live migrate are the big reqs for me. file backed is the only way I know of to get that currently. but the performance is atrocious | 16:39 |
erhudy | no preallocate? | 16:39 |
logan- | nope | 16:40 |
*** manheim has joined #openstack-operators | 16:41 | |
*** zenirc369 has joined #openstack-operators | 16:43 | |
logan- | this is some testing I did a while back https://docs.google.com/spreadsheets/d/1_A3SYBvUObZS4ZYaU0gGkF2j-zeynLhGXP4yGKw1vC4/edit?usp=sharing | 16:43 |
logan- | I have not tested preallocate=full | 16:43 |
logan- | the LVM/qcow2 tests were performed on instances backed by the local storage used for the "metal" tests | 16:44 |
*** fragatina has joined #openstack-operators | 16:45 | |
erhudy | seems very sensitive to block sizes | 16:45 |
erhudy | were you setting the block size inside the instance or on the HV? | 16:45 |
*** manheim has quit IRC | 16:45 | |
erhudy | er, sorry, i'm conflating that with readahead | 16:46 |
erhudy | mixing it up with some testing of a similar nature i did a while back | 16:46 |
*** dminer has quit IRC | 16:46 | |
*** derekh has quit IRC | 16:48 | |
*** zenirc369 has quit IRC | 16:49 | |
*** zenirc369 has joined #openstack-operators | 16:49 | |
*** manheim has joined #openstack-operators | 16:49 | |
*** newmember has quit IRC | 16:56 | |
*** alexpilo_ has joined #openstack-operators | 16:57 | |
dmsimard | mnaser: I've dealt with ram overhead issues before. i.e, 32GB VMs really using 35ish | 16:58 |
*** simon-AS5591 has quit IRC | 16:58 | |
erhudy | doing it with preallocate=full would be interesting | 16:59 |
dmsimard | It's been a while though... IIRC there was no easy solution and we ended up reducing the amount of RAM in the flavors to be less "pretty" but account for a small amount of overhead | 16:59 |
dmsimard | KSM is not super reliable either | 16:59 |
*** vinsh has quit IRC | 17:00 | |
*** alexpilotti has quit IRC | 17:01 | |
erhudy | we have RBD caching on and i can't say we've ever had memory issues, but that might be because of page merging balancing the overhead out | 17:01 |
*** alexpilo_ has quit IRC | 17:01 | |
*** vinsh has joined #openstack-operators | 17:02 | |
*** liverpooler has quit IRC | 17:03 | |
*** liverpooler has joined #openstack-operators | 17:06 | |
erhudy | on my hardest-working HV with 420GB of instances scheduled, about 300GB of memory are actually wired, and if my napkin math is right KSM has merged about 30GB of pages together on that system | 17:06 |
erhudy | so that seems like a pretty good rate of recovery to me | 17:06 |
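erhudy's napkin math can be reproduced from the counters KSM exposes in sysfs; a small sketch, assuming KSM is enabled and using the kernel-documented meaning of pages_sharing ("how many more sites are sharing them, i.e. how much saved"):

```python
#!/usr/bin/env python3
# Rough estimate of how much memory KSM is currently saving on a hypervisor.
import os

def read_int(path):
    with open(path) as f:
        return int(f.read().strip())

page_size = os.sysconf("SC_PAGE_SIZE")  # typically 4096 bytes
pages_sharing = read_int("/sys/kernel/mm/ksm/pages_sharing")

# pages_sharing counts duplicate pages that now resolve to a merged page,
# so pages_sharing * page_size approximates the memory recovered.
print(f"KSM is saving roughly {pages_sharing * page_size / 2**30:.1f} GiB")
```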
*** dminer has joined #openstack-operators | 17:08 | |
mnaser | looks like a large # of operators that do converged :) | 17:10 |
logan- | i didnt notice the cell notes didn't copy correctly but I just added them all back. the test cmd lines and raw results are in there now | 17:10 |
erhudy | i am trying to deconverge because it's a pain operationally | 17:10 |
mnaser | but, thanks for the comments everyone.. i still haven't done a thorough investigation but 4gb of ram on top of a 64gb instance is a lot, 2gb on top of 16gb is even more | 17:11 |
mnaser | erhudy same here to be honest, it's not the #1 priority on the list | 17:11 |
logan- | mnaser: would be really interested to hear of any developments if you research it further. | 17:11 |
mnaser | we used to have 32gb memory reserved but that was a lot of wasted space | 17:11 |
mnaser | given that we run 2x 1TB OSDs (SSDs) | 17:12 |
mnaser | so we dropped it to 8 as a start and then these issues started happening | 17:12 |
erhudy | what kernel are you running? | 17:12 |
mnaser | stock centos | 17:13 |
erhudy | anecdotally at the same time we changed from 5 to 16 GB reserved, we also moved from 3.13 to 4.4 on trusty | 17:13 |
erhudy | and 4.4 made _such_ a difference | 17:13 |
*** racedo has quit IRC | 17:13 | |
erhudy | if you're on centos i have no idea | 17:13 |
mnaser | i hear new kernels are a nice luxury but | 17:13 |
mnaser | we like to stick with rdo's packaging | 17:13 |
mnaser | and we're not about to start rolling out our own kernels, seems like a crazy path | 17:13 |
logan- | the nodes where i'm seeing similar overconsumption are 4.4 | 17:13 |
logan- | ubuntu xenial | 17:13 |
erhudy | we used to have a constant stream of blocked ops warnings from ceph that totally disappeared with 4.4 | 17:13 |
logan- | ceph jewel | 17:14 |
erhudy | better network performance, etc | 17:14 |
erhudy | jewel? lucky you | 17:14 |
erhudy | we're still on hammer | 17:14 |
logan- | :( | 17:14 |
mnaser | we're on jewel but one thing that we did that uncovered a lot of issues | 17:14 |
mnaser | changing the blocked ops timeout | 17:14 |
mnaser | 32 seconds is a really really REALLY long time for an i/o to complete | 17:14 |
mnaser | so we dropped it down to 2 seconds and hello ssd hell | 17:15 |
mnaser | helped us identify a lot of bad drives | 17:15 |
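The "blocked ops timeout" mnaser describes lowering maps to ceph's osd op complaint time (ops slower than this are reported as slow/blocked requests in ceph health); a hedged sketch of the change, not their exact config:

```ini
# ceph.conf (illustrative) -- complain about ops slower than 2 seconds
# instead of the much larger default
[osd]
osd op complaint time = 2
```

The same setting can also be injected at runtime with ceph tell osd.* injectargs rather than restarting OSDs.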
logan- | do you monitor ceph osd perf at all? I wonder if high perf readings would correlate to the bad drives you found | 17:16 |
mnaser | logan- we tried. but there's just SO much data and so many numbers | 17:16 |
mnaser | we couldn't make anything useful out of it :\ | 17:16 |
logan- | yeah that's a struggle i've had with ceph metrics too | 17:16 |
mnaser | ceph -s / ceph health detail was the most productive useful thing for us at the end of the day | 17:16 |
logan- | 'ceph osd perf' is pretty concise though | 17:16 |
mnaser | oh thats an interesting one | 17:17 |
*** aojea has joined #openstack-operators | 17:17 | |
*** manheim has quit IRC | 17:18 | |
mnaser | i completely forgot about this command, thanks for bringing it back to mind logan- | 17:18 |
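For anyone who, like mnaser, had forgotten it: ceph osd perf prints one row per OSD with its recent commit/apply latency, which is why it is handy for spotting a slow drive. A small sketch that flags outliers; the JSON key names below are assumptions from memory of a Jewel-era cluster and the threshold is arbitrary, so check ceph osd perf --format json on your own cluster and adjust:

```python
#!/usr/bin/env python3
# Flag OSDs whose recent latencies exceed a (made-up) threshold using
# "ceph osd perf --format json". Key names are assumptions; verify them
# against your cluster's actual JSON output.
import json
import subprocess

THRESHOLD_MS = 50  # arbitrary

raw = subprocess.check_output(["ceph", "osd", "perf", "--format", "json"])
for info in json.loads(raw).get("osd_perf_infos", []):
    stats = info.get("perf_stats", {})
    commit = stats.get("commit_latency_ms", 0)
    apply_ms = stats.get("apply_latency_ms", 0)
    if max(commit, apply_ms) >= THRESHOLD_MS:
        print(f"osd.{info.get('id')}: commit={commit}ms apply={apply_ms}ms")
```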
*** alexpilotti has joined #openstack-operators | 17:18 | |
*** manheim has joined #openstack-operators | 17:19 | |
*** jamemxx has quit IRC | 17:21 | |
mnaser | anyone has recommendations on ssds for ceph that they've been using? the ones we usually get (S3520 960GB) are sold out everywhere :\ | 17:21 |
*** aojea has quit IRC | 17:22 | |
logan- | newer deploys use 3520's, older ones have some micron 510dc's | 17:22 |
mnaser | yeah our usual vendor and the few other ones seem to be sold out on the s3520s :< | 17:23 |
*** manheim has quit IRC | 17:25 | |
*** ig0r_ has joined #openstack-operators | 17:27 | |
*** manheim has joined #openstack-operators | 17:28 | |
*** Apoorva has joined #openstack-operators | 17:28 | |
*** rebase has quit IRC | 17:31 | |
*** rebase has joined #openstack-operators | 17:32 | |
*** electrofelix has quit IRC | 17:37 | |
*** aojea has joined #openstack-operators | 17:37 | |
*** rebase has quit IRC | 17:40 | |
*** aojea has quit IRC | 17:42 | |
*** dtrainor has quit IRC | 17:48 | |
*** dtrainor has joined #openstack-operators | 17:49 | |
*** VW has joined #openstack-operators | 17:51 | |
*** alexpilotti has quit IRC | 17:53 | |
*** VW has quit IRC | 17:56 | |
*** VW has joined #openstack-operators | 17:57 | |
*** fragatina has quit IRC | 17:58 | |
*** Caterpillar has joined #openstack-operators | 18:05 | |
erhudy | i think we're doing 3610s or 3710s now | 18:08 |
erhudy | next gen stuff we want to do on NVMe | 18:08 |
*** VW has quit IRC | 18:16 | |
yankcrime | mnaser: we managed to secure some stock of SM863's, but yeah - availability is generally poor and the price just keeps on going up | 18:21 |
yankcrime | 3.13 was a _terrible_ kernel for us, fwiw | 18:22 |
erhudy | 3.13 got the job done but now that we're on 4.4 i can see the places where 3.13 was holding us back | 18:23 |
*** eqvist has joined #openstack-operators | 18:47 | |
*** alexpilotti has joined #openstack-operators | 18:52 | |
*** simon-AS559 has joined #openstack-operators | 18:56 | |
*** alexpilotti has quit IRC | 18:57 | |
*** manheim has quit IRC | 19:04 | |
*** fragatina has joined #openstack-operators | 19:16 | |
*** eqvist1 has joined #openstack-operators | 19:20 | |
*** eqvist has quit IRC | 19:22 | |
*** eqvist1 has left #openstack-operators | 19:22 | |
*** rarcea has joined #openstack-operators | 19:22 | |
*** newmember has joined #openstack-operators | 19:23 | |
*** newmember has quit IRC | 19:27 | |
*** newmember has joined #openstack-operators | 19:28 | |
erhudy | question for the other ceph operators: do any of you have cinder deployed with multiple independent RBD AZs? | 19:30 |
erhudy | i'm trying to work out how COW images would work in this setup, because right now the COW between glance/cinder happens across pools in the same ceph cluster | 19:31 |
erhudy | but if glance and cinder were in different ceph clusters entirely, i don't know how that would work, e.g. if glance can cache images from a master ceph cluster in other clusters and then have cinder shallow clone from the cached copies | 19:32 |
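Not an answer to the cross-cluster COW question, but for context: per-AZ RBD backends in cinder are normally expressed as separate backend sections, each pointing at its own ceph.conf/cluster. A hedged sketch; the section names, pools, paths, and users are all made up:

```ini
# cinder.conf (illustrative) -- one RBD backend per AZ / ceph cluster
[DEFAULT]
enabled_backends = rbd-az1,rbd-az2

[rbd-az1]
volume_backend_name = rbd-az1
volume_driver = cinder.volume.drivers.rbd.RBDDriver
rbd_ceph_conf = /etc/ceph/az1.conf
rbd_pool = volumes
rbd_user = cinder
rbd_secret_uuid = <libvirt secret uuid for the az1 cluster>

[rbd-az2]
volume_backend_name = rbd-az2
volume_driver = cinder.volume.drivers.rbd.RBDDriver
rbd_ceph_conf = /etc/ceph/az2.conf
rbd_pool = volumes
rbd_user = cinder
rbd_secret_uuid = <libvirt secret uuid for the az2 cluster>
```

Whether glance images living in a different cluster can still be COW-cloned from is exactly the open question here; see logan-'s note below about the fsid check falling back to a full image copy.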
*** newmember has quit IRC | 19:34 | |
*** newmember has joined #openstack-operators | 19:35 | |
*** shasha___ has joined #openstack-operators | 19:45 | |
*** bollig has quit IRC | 19:47 | |
*** fragatina has quit IRC | 19:55 | |
*** chyka_ has quit IRC | 19:55 | |
*** chyka has joined #openstack-operators | 19:57 | |
*** bollig has joined #openstack-operators | 19:59 | |
*** aojea has joined #openstack-operators | 20:06 | |
*** aojea has quit IRC | 20:10 | |
logan- | erhudy: i haven't tried it but I think there is a check based on the cluster fsid that will fall back from cow clones to a straight image copy when the glance fsid doesn't match the target | 20:11 |
erhudy | ideally it would copy the image into a cache pool on the target ceph cluster and then COW from that, but i suspect that is hoping for Too Much | 20:11 |
logan- | the same way it'll always do an image copy when you don't have the rbd:// path exposed in glance | 20:11 |
*** chyka has quit IRC | 20:16 | |
*** liverpooler has quit IRC | 20:16 | |
*** chyka has joined #openstack-operators | 20:17 | |
erhudy | right now on all our existing clusters glance's pool is in RBD and images are either COW cloned to another ceph pool or copied to disk for local storage, i have no experience with glance and cinder being in different ceph clusters entirely | 20:17 |
*** chyka has quit IRC | 20:22 | |
*** simon-AS559 has quit IRC | 20:24 | |
*** chyka has joined #openstack-operators | 20:27 | |
*** aojea_ has joined #openstack-operators | 20:27 | |
*** simon-AS559 has joined #openstack-operators | 20:28 | |
*** aojea_ has quit IRC | 20:31 | |
*** simon-AS5591 has joined #openstack-operators | 20:32 | |
*** simon-AS559 has quit IRC | 20:32 | |
*** rarcea has quit IRC | 20:32 | |
*** simon-AS559 has joined #openstack-operators | 20:34 | |
*** shasha___ has quit IRC | 20:34 | |
*** simon-AS5591 has quit IRC | 20:35 | |
*** simon-AS5591 has joined #openstack-operators | 20:40 | |
*** simon-AS559 has quit IRC | 20:40 | |
dmsimard | In a past life where ceph cache layer was a thing, I meant to have a 16x 3.5" chassis with spindle drives in pairs of raid 0's (8 OSDs) with hardware raid cache/writeback and then have those nice intel NVME drives (P3600's?) for the cache pool - it felt like a nice balance of price/cost/performance -- you don't need as much ram/cpu to handle 8 OSDs as you'd need for 16, etc. | 20:42 |
dmsimard | er, not price/cost/performance -- I meant price/storage/performance | 20:42 |
dmsimard | but I haven't been directly involved in operating ceph for a while, I hear the cache pool thing got axed | 20:43 |
*** simon-AS559 has joined #openstack-operators | 20:43 | |
*** simon-AS5591 has quit IRC | 20:43 | |
*** simon-AS5591 has joined #openstack-operators | 20:46 | |
*** simon-AS559 has quit IRC | 20:46 | |
*** aojea has joined #openstack-operators | 20:46 | |
*** aojea has quit IRC | 20:51 | |
*** fragatina has joined #openstack-operators | 21:01 | |
*** dminer has quit IRC | 21:14 | |
*** dminer has joined #openstack-operators | 21:15 | |
*** catintheroof has quit IRC | 21:34 | |
*** manheim has joined #openstack-operators | 21:35 | |
*** manheim has quit IRC | 21:36 | |
*** manheim has joined #openstack-operators | 21:41 | |
*** zenirc369 has quit IRC | 21:45 | |
*** jamesden_ has quit IRC | 21:46 | |
*** Apoorva_ has joined #openstack-operators | 22:08 | |
*** Caterpillar has quit IRC | 22:09 | |
*** Apoorva has quit IRC | 22:11 | |
*** marst_ has joined #openstack-operators | 22:17 | |
*** marst_ has quit IRC | 22:18 | |
*** marst_ has joined #openstack-operators | 22:18 | |
*** dminer has quit IRC | 22:20 | |
*** marst has quit IRC | 22:21 | |
*** VW has joined #openstack-operators | 22:22 | |
*** marst_ has quit IRC | 22:25 | |
*** VW has quit IRC | 22:31 | |
*** VW has joined #openstack-operators | 22:32 | |
*** Apoorva_ has quit IRC | 22:41 | |
*** Apoorva has joined #openstack-operators | 22:41 | |
*** manheim has quit IRC | 22:48 | |
*** vinsh has quit IRC | 23:07 | |
*** slaweq has quit IRC | 23:20 | |
*** slaweq has joined #openstack-operators | 23:20 | |
*** simon-AS5591 has quit IRC | 23:21 | |
*** simon-AS559 has joined #openstack-operators | 23:22 | |
*** slaweq has quit IRC | 23:24 | |
*** chyka has quit IRC | 23:33 | |
*** markvoelker has quit IRC | 23:39 | |
*** simon-AS5591 has joined #openstack-operators | 23:49 | |
*** simon-AS5591 has quit IRC | 23:49 | |
*** simon-AS559 has quit IRC | 23:53 |