*** thorst has quit IRC | 00:06 | |
*** Jack_Iv has joined #openstack-telemetry | 00:35 | |
*** Jack_Iv has quit IRC | 00:40 | |
*** liusheng has quit IRC | 00:51 | |
*** thorst has joined #openstack-telemetry | 01:03 | |
*** thorst has quit IRC | 01:11 | |
*** liusheng has joined #openstack-telemetry | 01:15 | |
*** Jack_Iv has joined #openstack-telemetry | 01:29 | |
*** Jack_Iv has quit IRC | 01:33 | |
*** thorst has joined #openstack-telemetry | 01:57 | |
*** thorst has quit IRC | 01:58 | |
*** lhx_ has joined #openstack-telemetry | 02:04 | |
*** gongysh has joined #openstack-telemetry | 02:07 | |
*** zhangguoqing has joined #openstack-telemetry | 02:10 | |
*** Jack_Iv has joined #openstack-telemetry | 02:20 | |
*** Jack_Iv has quit IRC | 02:24 | |
*** catintheroof has quit IRC | 02:28 | |
*** zhangguoqing has quit IRC | 02:29 | |
*** catintheroof has joined #openstack-telemetry | 02:30 | |
*** catintheroof has quit IRC | 02:34 | |
*** nokes has joined #openstack-telemetry | 03:01 | |
*** noshankus has quit IRC | 03:02 | |
*** nokes is now known as noshankus | 03:02 | |
llu | seems we're hitting the vine.five import error again with new kombu 4.0.2 version. pls see https://bugs.launchpad.net/oslo.messaging/+bug/1638263/comments/15 | 03:03 |
openstack | Launchpad bug 1638263 in OpenStack Global Requirements "Unit tests failing on vine.five import" [Medium,Confirmed] | 03:03 |
*** thorst has joined #openstack-telemetry | 03:03 | |
*** thorst has quit IRC | 03:12 | |
*** frickler_ has joined #openstack-telemetry | 03:20 | |
*** frickler has quit IRC | 03:21 | |
*** Jack_Iv has joined #openstack-telemetry | 03:28 | |
*** Jack_Iv has quit IRC | 03:32 | |
*** r-mibu has quit IRC | 04:02 | |
*** thorst has joined #openstack-telemetry | 04:09 | |
*** r-mibu has joined #openstack-telemetry | 04:12 | |
*** thorst has quit IRC | 04:18 | |
*** donghao has quit IRC | 04:40 | |
*** donghao has joined #openstack-telemetry | 04:41 | |
*** adriant has quit IRC | 04:43 | |
*** lhx_ has quit IRC | 05:08 | |
*** thorst has joined #openstack-telemetry | 05:15 | |
*** thorst has quit IRC | 05:22 | |
*** hfu has joined #openstack-telemetry | 05:24 | |
*** Jack_Iv has joined #openstack-telemetry | 05:46 | |
*** Jack_Iv has quit IRC | 05:50 | |
*** lhx_ has joined #openstack-telemetry | 05:52 | |
*** Jack_Iv has joined #openstack-telemetry | 05:53 | |
*** nadya has joined #openstack-telemetry | 06:01 | |
*** thorst has joined #openstack-telemetry | 06:19 | |
*** nadya has quit IRC | 06:20 | |
*** thorst has quit IRC | 06:28 | |
*** Jack_Iv has quit IRC | 06:31 | |
*** Jack_Iv has joined #openstack-telemetry | 06:32 | |
*** Jack_Iv has quit IRC | 06:32 | |
*** Jack_Iv has joined #openstack-telemetry | 06:33 | |
openstackgerrit | Hanxi Liu proposed openstack/ceilometer: Change YAML file directory https://review.openstack.org/412309 | 06:47 |
*** Jack_Iv has quit IRC | 07:03 | |
*** Jack_Iv has joined #openstack-telemetry | 07:04 | |
*** tesseract has joined #openstack-telemetry | 07:04 | |
*** tesseract is now known as Guest33254 | 07:05 | |
*** Jack_Iv has quit IRC | 07:07 | |
*** nadya has joined #openstack-telemetry | 07:22 | |
*** thorst has joined #openstack-telemetry | 07:25 | |
lhx_ | sileht, if you aren't going to restore and change this, I would propose another one to copy the method and apply to the direct publisher | 07:26 |
lhx_ | https://review.openstack.org/#/c/397930/2 | 07:26 |
*** thorst has quit IRC | 07:33 | |
*** pcaruana has joined #openstack-telemetry | 07:33 | |
openstackgerrit | Hanxi Liu proposed openstack/ceilometer: Fix publisher comment https://review.openstack.org/412336 | 07:48 |
*** shardy has joined #openstack-telemetry | 08:02 | |
*** gongysh has quit IRC | 08:10 | |
*** yprokule has joined #openstack-telemetry | 08:14 | |
*** Jack_Iv_ has joined #openstack-telemetry | 08:27 | |
*** thorst has joined #openstack-telemetry | 08:29 | |
*** thorst has quit IRC | 08:38 | |
*** masber has joined #openstack-telemetry | 08:39 | |
lhx_ | jd__, sileht, why do we only handle mongodb and es? | 09:16
lhx_ | https://github.com/openstack/ceilometer/blob/master/devstack/plugin.sh#L204 | 09:16 |
jd__ | lhx_: because SQL is handled by devstack itself, IIRC | 09:18 |
lhx_ | jd__, oh, devstack default to SQL | 09:20 |
jd__ | yeah and it knows how to handle that | 09:21 |
*** yassine has joined #openstack-telemetry | 09:23 | |
*** yassine is now known as Guest24205 | 09:23 | |
*** gongysh has joined #openstack-telemetry | 09:31 | |
*** thorst has joined #openstack-telemetry | 09:35 | |
*** thorst has quit IRC | 09:42 | |
*** frickler_ is now known as frickler | 09:44 | |
*** Adri2000 has quit IRC | 09:54 | |
*** gongysh has quit IRC | 10:07 | |
*** hfu has quit IRC | 10:15 | |
*** lhx_ has quit IRC | 10:16 | |
openstackgerrit | Julien Danjou proposed openstack/gnocchi: Merge project and user id in a creator field https://review.openstack.org/408740 | 10:20 |
openstackgerrit | Julien Danjou proposed openstack/gnocchi: rest: add auth_mode to pick authentication mode https://review.openstack.org/402068 | 10:20 |
openstackgerrit | Julien Danjou proposed openstack/gnocchi: rest: introduce auth_helper to filter resources https://review.openstack.org/402069 | 10:20 |
openstackgerrit | Julien Danjou proposed openstack/gnocchi: Introduce "basic" authentication mechanism https://review.openstack.org/412387 | 10:20 |
*** lhx_ has joined #openstack-telemetry | 10:25 | |
*** Jack_Iv_ has quit IRC | 10:39 | |
*** thorst has joined #openstack-telemetry | 10:40 | |
*** thorst has quit IRC | 10:48 | |
*** Jack_Iv_ has joined #openstack-telemetry | 10:49 | |
*** Jack_Iv_ has quit IRC | 10:53 | |
*** amoralej is now known as amoralej|brb | 11:05 | |
*** cdent has joined #openstack-telemetry | 11:11 | |
*** dave-mccowan has joined #openstack-telemetry | 11:26 | |
*** jefrite has joined #openstack-telemetry | 11:32 | |
*** Jack_Iv_ has joined #openstack-telemetry | 11:33 | |
*** lhx_ has quit IRC | 11:41 | |
*** thorst has joined #openstack-telemetry | 11:45 | |
*** dave-mcc_ has joined #openstack-telemetry | 11:45 | |
*** dave-mccowan has quit IRC | 11:48 | |
*** Guest24205 has quit IRC | 11:52 | |
*** thorst has quit IRC | 11:52 | |
*** hfu has joined #openstack-telemetry | 12:00 | |
*** lhx_ has joined #openstack-telemetry | 12:06 | |
*** shardy is now known as shardy_lunch | 12:15 | |
*** thorst has joined #openstack-telemetry | 12:15 | |
*** amoralej|brb is now known as amoralej | 12:15 | |
*** catintheroof has joined #openstack-telemetry | 12:18 | |
*** dave-mcc_ has quit IRC | 12:18 | |
EmilienM | sileht: do you know if ceilo gate is broken? https://review.openstack.org/#/c/411393/ | 12:29 |
*** hfu has quit IRC | 12:31 | |
*** lhx_ has quit IRC | 12:32 | |
*** hfu has joined #openstack-telemetry | 12:33 | |
*** vint_bra has joined #openstack-telemetry | 12:39 | |
*** hfu has quit IRC | 12:40 | |
*** lhx_ has joined #openstack-telemetry | 12:49 | |
*** Jack_Iv_ has quit IRC | 12:50 | |
*** gordc has joined #openstack-telemetry | 12:59 | |
*** lhx_ has quit IRC | 12:59 | |
*** llu has quit IRC | 13:01 | |
*** yassine has joined #openstack-telemetry | 13:05 | |
*** yassine is now known as Guest95167 | 13:06 | |
*** hfu has joined #openstack-telemetry | 13:08 | |
gordc | sileht: how come we run tempest tests in integration gate? | 13:12 |
*** nadya has quit IRC | 13:13 | |
*** Jack_Iv_ has joined #openstack-telemetry | 13:16 | |
*** Jack_Iv_ has quit IRC | 13:16 | |
*** lionel has quit IRC | 13:22 | |
*** jwcroppe_ has quit IRC | 13:26 | |
*** pradk has joined #openstack-telemetry | 13:31 | |
*** leitan has joined #openstack-telemetry | 13:31 | |
*** zaneb has quit IRC | 13:36 | |
*** lhx_ has joined #openstack-telemetry | 13:46 | |
*** jwcroppe has joined #openstack-telemetry | 13:51 | |
*** hfu has quit IRC | 14:06 | |
*** amoralej is now known as amoralej|lunch | 14:10 | |
*** Jack_Iv_ has joined #openstack-telemetry | 14:13 | |
*** chlong has joined #openstack-telemetry | 14:14 | |
*** cdent has quit IRC | 14:16 | |
*** shardy_lunch is now known as shardy | 14:16 | |
*** yprokule has quit IRC | 14:21 | |
*** pradk has quit IRC | 14:25 | |
*** ryanpetrello has joined #openstack-telemetry | 14:26 | |
*** fguillot has joined #openstack-telemetry | 14:29 | |
ryanpetrello | jd__ ceilometer gabbi tests seem to be failing for me again; `$ tox -e py27 -- gabbi` | 14:30 |
ryanpetrello | https://travis-ci.org/pecan/pecan/jobs/185146311 | 14:30 |
ryanpetrello | ... | 14:32 |
ryanpetrello | File "/home/travis/build/pecan/pecan/.tox/ceilometer-tip/src/ceilometer/.tox/gabbi/lib/python2.7/site-packages/kombu/five.py", line 6, in <module> | 14:32 |
ryanpetrello | import vine.five | 14:32 |
ryanpetrello | ImportError: No module named vine.five | 14:32 |
gordc | ryanpetrello: we need a new oslo.messaging release | 14:33 |
gordc | same reason as last week... new kombu release breaks oslo.messaging and requirements doesn't let us upper cap things | 14:34 |
ryanpetrello | gotcha | 14:38 |
ryanpetrello | just saw that this had been broken for some time | 14:38 |
ryanpetrello | and wasn't sure if I was missing something | 14:38 |
ryanpetrello | but sounds like you all are already aware and working on it | 14:39 |
gordc | yeah. well we fixed it. but the amount of time it took us to update requirements, create release, pick up release, etc... there was another kombu release that broke us again. | 14:41 |
gordc | fun stuff | 14:41 |
gordc | if it's an internal CI, i imagine you could just cap kombu <4.0.0 as that's what it should be. | 14:42 |
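gordc's suggestion above (for an internal CI, cap kombu below the 4.x series) could be sketched like this; the file name and install workflow here are illustrative, not a project convention:

```shell
# Sketch: pin kombu below the 4.x series whose five.py -> vine.five
# import broke oslo.messaging. File name is illustrative.
echo 'kombu<4.0.0' > local-constraints.txt

# An internal CI job would then install with the constraint applied, e.g.:
#   pip install -c local-constraints.txt -r requirements.txt
cat local-constraints.txt
```

The `-c` flag makes pip treat the file as constraints (caps applied only to packages that are actually installed), which is the same mechanism OpenStack's global upper-constraints uses.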
ryanpetrello | yea, not a big deal | 14:44 |
ryanpetrello | I just gate pecan against upstream OpenStack projects that use it | 14:44 |
ryanpetrello | and I noticed that ceilometer has been failing for some time | 14:44 |
ryanpetrello | wanted to make sure I wasn't doing something wrong | 14:44 |
gordc | cool cool. | 14:45 |
jd__ | gordc: this time I hope they will | 14:50 |
jd__ | fix it once and for all | 14:50 |
jd__ | ffs | 14:50 |
*** cdent has joined #openstack-telemetry | 14:55 | |
*** amoralej|lunch is now known as amoralej | 14:56 | |
*** sudipto has joined #openstack-telemetry | 15:01 | |
*** sudipto_ has joined #openstack-telemetry | 15:01 | |
sudipto_ | Hi, is it possible to do h/w metric collection via Ceilometer today? | 15:02 |
*** pradk has joined #openstack-telemetry | 15:04 | |
*** Jack_Iv_ has quit IRC | 15:05 | |
*** Jack_Iv_ has joined #openstack-telemetry | 15:06 | |
yarkot | have you looked at http://docs.openstack.org/admin-guide/telemetry-measurements.html | 15:15 |
openstackgerrit | Merged openstack/gnocchi: rest: catch create_metric duplicate https://review.openstack.org/411792 | 15:16 |
*** Jack_Iv_ has quit IRC | 15:18 | |
*** Jack_Iv_ has joined #openstack-telemetry | 15:19 | |
openstackgerrit | Hanxi Liu proposed openstack/ceilometer: Change YAML file directory https://review.openstack.org/412309 | 15:32 |
*** Jack_Iv_ has quit IRC | 15:39 | |
*** Guest33254 has quit IRC | 16:02 | |
openstackgerrit | Hanxi Liu proposed openstack/ceilometer: devstack configure when no collector https://review.openstack.org/412508 | 16:05 |
*** pcaruana has quit IRC | 16:15 | |
*** cdent has quit IRC | 16:26 | |
*** Guest95167 has quit IRC | 16:50 | |
*** yassine has joined #openstack-telemetry | 16:50 | |
*** yassine is now known as Guest45867 | 16:51 | |
*** nadya has joined #openstack-telemetry | 16:54 | |
*** cdent has joined #openstack-telemetry | 16:56 | |
*** nadya has quit IRC | 17:02 | |
*** Guest45867 has quit IRC | 17:08 | |
*** rwsu has joined #openstack-telemetry | 17:22 | |
*** rwsu has quit IRC | 17:23 | |
*** sudipto has quit IRC | 17:25 | |
*** sudipto_ has quit IRC | 17:25 | |
*** sudipto_ has joined #openstack-telemetry | 17:25 | |
*** sudipto has joined #openstack-telemetry | 17:25 | |
*** lhx_ has quit IRC | 17:27 | |
*** rwsu has joined #openstack-telemetry | 17:36 | |
*** Jack_Iv__ has joined #openstack-telemetry | 17:44 | |
*** nadya has joined #openstack-telemetry | 17:49 | |
*** sudipto_ has quit IRC | 17:51 | |
*** sudipto has quit IRC | 17:51 | |
*** nadya has quit IRC | 18:04 | |
*** shardy has quit IRC | 18:08 | |
*** Jack_Iv__ has quit IRC | 18:34 | |
*** Jack_Iv_ has joined #openstack-telemetry | 18:43 | |
*** Jack_Iv_ has quit IRC | 18:48 | |
*** nadya has joined #openstack-telemetry | 18:53 | |
*** Jack_Iv_ has joined #openstack-telemetry | 19:06 | |
*** Jack_Iv_ has quit IRC | 19:07 | |
*** david-lyle_ has joined #openstack-telemetry | 19:13 | |
*** openstackstatus has quit IRC | 19:13 | |
*** david-lyle has quit IRC | 19:13 | |
*** openstack has joined #openstack-telemetry | 19:15 | |
*** Jack_Iv_ has joined #openstack-telemetry | 19:17 | |
*** Jack_Iv_ has quit IRC | 19:21 | |
*** nadya has quit IRC | 19:27 | |
*** chlong has quit IRC | 19:34 | |
*** chlong has joined #openstack-telemetry | 19:36 | |
*** lionel has joined #openstack-telemetry | 19:42 | |
*** rcernin has joined #openstack-telemetry | 19:56 | |
openstackgerrit | Merged openstack/gnocchi: rest: remove user_id and project_id from metric schema https://review.openstack.org/407103 | 20:04 |
akrzos | ok | 20:11 |
akrzos | I'm seeing some of these messages in my metricd.log | 20:11 |
akrzos | "Metric processing lagging scheduling rate. ..." | 20:11 |
akrzos | "increase the number of workers or to lengthen processing interval." | 20:11 |
akrzos | by "lengthen processing interval" does that mean decrease the "metric_processing_delay"? | 20:12
gordc | akrzos: increase it | 20:13 |
akrzos | but then metricd fires up less often to actually process measures | 20:15 |
akrzos | before the next polling interval | 20:15 |
gordc | right | 20:15 |
gordc | we haven't started work on improving that part of scheduling just yet | 20:15
akrzos | ok | 20:15 |
gordc | akrzos: you can also just use the 'refresh' option which basically forces aggregation at query time | 20:16
akrzos | understood | 20:16 |
akrzos | but if my measures backlog just continues to grow | 20:16 |
akrzos | everything will need --refresh | 20:16 |
*** amoralej is now known as amoralej|off | 20:17 | |
gordc | right. in theory only --refresh guarantees you're working against all known data | 20:17 |
gordc | if you don't use it, the assumption is there may be some datapoints that are unaggregated and ignored (at time of query) | 20:18 |
akrzos | so let me explain what i'm doing and maybe it will be easier to understand why i'm thinking of reducing the delay | 20:18 |
akrzos | i'm trying to get gnocchi to saturate my ceph backend | 20:18 |
akrzos | basic scale test with an openstack cloud | 20:18 |
akrzos | with gnocchi configured with ceph driver | 20:19 |
akrzos | i have 4 ceph nodes | 20:19 |
akrzos | 3 controller nodes | 20:19 |
akrzos | 10 computes to host instances to drive a workload for telemetry | 20:19 |
akrzos | i booted 1k instances | 20:19 |
akrzos | measure backlog was growing | 20:20 |
gordc | i see, reducing that delay won't necessarily help you test load. | 20:20 |
gordc | what that delay does is basically the rate it schedules unaggregated data to be processed. | 20:20 |
akrzos | this is with 48 metricd processing workers per controller (144 workers total) | 20:20 |
akrzos | 60s delay | 20:21 |
gordc | you can make it as small as you want but if you don't have enough workers processing the scheduled items, it won't improve anything | 20:21 |
akrzos | the measures still backing up | 20:21 |
akrzos | low archival policy | 20:21 |
akrzos | as soon as i reduced the delay to 30s | 20:21
akrzos | the entire backlog was consumed in like 40 minutes (100-88K measures) | 20:22
akrzos | and no more metrics lagging :) | 20:22 |
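As a rough sanity check of the figures above (a backlog on the order of 88-100K measures drained in about 40 minutes), the implied processing rate works out to a few dozen measures per second; the inputs are approximate readings from monitoring, so treat this as order-of-magnitude only:

```python
# Back-of-the-envelope drain rate for the backlog described above.
# The 88K-100K figures are approximate readings from the log.
minutes = 40
for backlog in (88_000, 100_000):
    per_second = backlog / (minutes * 60)
    print(f"{backlog:>7} measures / {minutes} min ≈ {per_second:.0f} measures/s")
```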
gordc | hmm. 144 workers is a lot. lol | 20:22
akrzos | ceph osds are not saturated either | 20:22 |
akrzos | let me pull the gnocchi version | 20:22 |
akrzos | i know you added a bunch of improvements | 20:23 |
akrzos | 3.0.2 | 20:23 |
akrzos | cpus on the metricd machines (OpenStack Controllers) are not saturated either they still have idle % available | 20:24 |
gordc | yeah, that's latest. | 20:24 |
gordc | so is the issue you want to make the scheduling rate quicker? | 20:24 |
akrzos | so i wanted to get more instances cause well everyone loves scale and 1k across only 10 computes seems low | 20:25 |
gordc | basically, that warning means you're scheduling much faster than metricd workers are processing | 20:25
akrzos | so i booted more until i saw the backlog grow again | 20:25 |
gordc | it might happen sometimes... if it happens a lot then it's because scheduling is too frequent | 20:26
gordc | regarding the growing backlog: it's not necessarily a bad thing. basically, if you want to save IO/CPU, letting your backlog grow enables you to process in batches and thus more efficiently, but you have to accept the delay | 20:28
akrzos | ok | 20:28 |
akrzos | agreed | 20:29 |
gordc | but yeah, in your saturation test, i imagine you just want to see 'max' scenario | 20:29 |
akrzos | the concern here is consumers of telemetry are going to want to understand the scale limits of gnocchi | 20:29 |
akrzos | so i'm trying to best determine that | 20:29
openstackgerrit | Merged openstack/aodh: Remove notes about MongoDB https://review.openstack.org/411333 | 20:29 |
gordc | fair enough. | 20:30 |
akrzos | the selling point as i understood it was that hey we are gnocchi and we compute your metrics before you ask for them | 20:30 |
gordc | akrzos: just curious, how many osds do you have? | 20:30 |
akrzos | 36 osds | 20:30 |
akrzos | so the current scale i have right now is | 20:31 |
akrzos | 1.8k instances | 20:31 |
akrzos | 48 metricd workers on each controller | 20:31 |
akrzos | 15s processing delay | 20:31 |
akrzos | 4 ceph nodes with 9 osds each | 20:31 |
akrzos | osds are @ ~60% disk io util | 20:32 |
akrzos | and archival policy is "low" for all resources | 20:32 |
akrzos | i guess my concern is adjusting that to high | 20:32 |
akrzos | and the number of instances we could handle | 20:32 |
akrzos | would be less | 20:32 |
akrzos | and this is applying a pretty heavy utilization on the controllers where i am running the metricd processing workers | 20:33
akrzos | also the os_workers default configures you with a pretty low worker count relative to the cpus available | 20:33
akrzos | granted it's easy for me to override that default | 20:33 |
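The tuning described above (48 metricd workers per controller, a 15s processing delay) would land in gnocchi.conf along these lines; a sketch only, since option names and section placement can differ between gnocchi releases, so check your release's sample config before copying:

```ini
# Hypothetical gnocchi.conf fragment matching the tuning discussed above.
# Verify option names/sections against your gnocchi release.
[metricd]
workers = 48

[storage]
# Seconds between scheduling passes. Lowering it drains the backlog
# faster, but only if the workers can keep up with the scheduled items.
metric_processing_delay = 15
```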
gordc | yeah, that's a larger environment than what i tested with | 20:33
akrzos | so measures doesn't appear to be growing much right now | 20:34 |
akrzos | seems after adjusting to 15s delay it sunk a lot | 20:34 |
gordc | i had ~1/3 your workers. | 20:34 |
akrzos | btw i ended up just using collectd to query for measures backlog | 20:34 |
akrzos | your osds were ssds though | 20:34 |
gordc | but i had basically 100% CPU. | 20:34 |
akrzos | mine are 10k sas drives | 20:34 |
akrzos | also i'm seeing this in my metricd.log https://bugs.launchpad.net/gnocchi/+bug/1557593 | 20:35 |
openstack | Launchpad bug 1557593 in tooz "Unable to extend Redis lock in certain cases" [Medium,Fix released] - Assigned to Julien Danjou (jdanjou) | 20:35 |
gordc | i had a ssd journal. the storage was not. | 20:35 |
akrzos | not sure that fix made it in what i'm running | 20:35 |
akrzos | oh ok | 20:35 |
gordc | i think it's still a bug | 20:35 |
akrzos | so my journal is co-located | 20:35 |
akrzos | i don't have ssds right now for the journal | 20:36 |
gordc | i know i still see it when i put heavy load | 20:36 |
*** adriant has joined #openstack-telemetry | 20:36 | |
gordc | akrzos: i see... i didn't actually test how much it improves with ssd journal (or i don't recall) | 20:36 |
*** cdent has quit IRC | 20:36 | |
gordc | regarding archive policy, it actually might be better at high | 20:37 |
akrzos | interesting | 20:37 |
akrzos | fyi here is what i'm using to monitor the measures backlog - https://review.openstack.org/#/c/411030/4/ansible/install/roles/collectd-openstack/files/collectd_gnocchi_status.py | 20:37 |
gordc | weird, but the reasoning is that the low policy has a 1 day granularity so it constantly has to aggregate (up to) a day's worth of data | 20:37
gordc | the high policy maxes out at 1 hr granularity. | 20:37 |
akrzos | obviously that will introduce some additional load but i query this every 30s, not my standard 10s with the rest of my collectd metrics | 20:38
akrzos | OH | 20:39 |
akrzos | i see | 20:39 |
akrzos | so less re-calculating what the metric is across each aggregation | 20:39 |
gordc | that's more elegant than my solution. i just put random log messages and do calculations based on that. | 20:39
gordc | akrzos: right | 20:39 |
gordc | it just has a higher storage footprint | 20:39 |
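gordc's reasoning above can be made concrete: the wider the policy's largest granularity window, the more points fall into the window that gets re-aggregated on each update. A sketch assuming a 10-second sampling interval and the 1-day vs 1-hour maximum granularities mentioned in the discussion:

```python
# Points falling inside one aggregation window, per maximum granularity.
# Assumes a 10s sampling interval (an assumption for illustration);
# the granularities are the maxima gordc mentions above.
sample_interval_s = 10
policies = [("low (1 day max)", 86_400), ("high (1 hour max)", 3_600)]
for name, granularity_s in policies:
    points = granularity_s // sample_interval_s
    print(f"{name}: up to {points} points re-aggregated per update")
```

So at this sampling rate the "low" policy's widest window holds roughly 24x more points to churn through than "high"'s, which is why "high" can be cheaper to keep up to date despite its larger storage footprint.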
akrzos | also i noticed you suggested aggregation worker threads | 20:39 |
akrzos | the default is 1 i believe | 20:40
akrzos | wondering if we should make that equal to the aggregations that are default for the default archival policy (8 aggregations) | 20:40 |
gordc | right. we defaulted to 1 because the code wasn't very cpu efficient before so threading didn't help... but yeah, it might be a good idea to change the default | 20:41
akrzos | *nod* python gil | 20:41 |
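Raising the aggregation thread count discussed above would be another gnocchi.conf knob; a sketch, where the exact option name (`aggregation_workers_number`) and section are assumptions to verify against your release's configuration reference:

```ini
# Hypothetical fragment raising the per-worker aggregation thread count
# from its default of 1. Option name/section are assumptions; check
# your gnocchi release's configuration reference before using.
[storage]
aggregation_workers_number = 8
```

The value 8 mirrors akrzos's suggestion of matching the number of aggregation methods in the default archive policy.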
gordc | did enabling aggregation_workers make things better for you? | 20:41 |
gordc | not sure if it was just my test scenario | 20:42 |
akrzos | i need to try it again | 20:42 |
akrzos | so first my worker count was just entirely too low for metricd | 20:42 |
akrzos | from 6 workers per controller to 48 | 20:43 |
akrzos | which is surprising since these are 24 logical cpu cores (12 cores, 24 hyperthreads) | 20:43
gordc | i was going to ask how many cpu cores you had. i have 24 physical cores and only ran 32 because metricd+ceph osd max'd everything out | 20:45
akrzos | so the ceph osd processes are on the ceph machines | 20:45 |
akrzos | so they aren't competing for cpu | 20:45 |
gordc | ah i see. good call. :) | 20:46 |
gordc | i had to cram everything onto 3 machines. | 20:46 |
akrzos | if i only had actual ssds though for the journal i'd have "more proper" ceph nodes | 20:47 |
akrzos | the other thing missing for a "proper" ceph setup | 20:47 |
akrzos | is a separate 10g storage network for storage traffic | 20:48 |
gordc | good luck getting approval for hardware. lol | 20:48
akrzos | I'll need it | 20:49 |
akrzos | (the luck) | 20:49 |
akrzos | (and hardware too) | 20:49 |
gordc | :) | 20:49 |
akrzos | so maybe instead of dropping the delay interval | 20:50 |
akrzos | i should just continue to increase metricd workers | 20:50 |
akrzos | until i saturate the ceph disks | 20:50 |
akrzos | then i can retry again with a higher archival policy | 20:50 |
akrzos | won't 1s samples cause a lot of writes too or i suppose that will be batched at the scheduling interval? | 20:50 |
akrzos | also curious to the 16 tasks per worker i saw in the code | 20:51 |
gordc | it'll only generate based on incoming points. so if you're using ceilometer and polling every 10s, it'll only have the point every 10s... it won't backfill 9s | 20:52 |
gordc | (iiuc) | 20:53 |
gordc | the 16 tasks per worker is basically because we don't have a good solution to scheduling unprocessed measures yet | 20:53 |
gordc | it relates to: http://lists.openstack.org/pipermail/openstack-dev/2016-November/107284.html | 20:54 |
akrzos | okay so i'd have to decrease the polling interval to get anything with a granularity lower than 10 minutes since our default is 600s | 20:54
gordc | akrzos: right | 20:54 |
akrzos | that includes samples from notification agents? | 20:54 |
gordc | samples from notification are irregular... based on when events happen | 20:55 |
gordc | nova and some other projects have a periodic notification but that's every hour i think (at most) | 20:55 |
akrzos | ok so anything i would absolutely expect to be measured regularly occurs by the polling then? (cpu_util, memory usage, disk write/reads) | 20:59
gordc | disk read/writes, network in/out, cpu, cpu_util, i don't know which memory ones are periodic | 21:00
gordc | memory.usage apparently according to code | 21:01 |
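To get finer granularity than 10 minutes, the exchange above implies lowering the ceilometer polling interval from its 600s default. A sketch of a polling/pipeline source with a shorter interval; the keys follow the source format of this era and the meter names are illustrative, so verify against your ceilometer release's pipeline documentation:

```yaml
# Sketch of a ceilometer polling source with the interval lowered from
# the 600s default discussed above. Meter names are illustrative.
sources:
    - name: cpu_and_disk_source
      interval: 60
      meters:
          - cpu
          - cpu_util
          - disk.read.bytes
          - disk.write.bytes
```

Note that, per gordc's caveat above, a shorter interval means proportionally more incoming points for metricd to aggregate, so worker counts and the processing delay would need to keep pace.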
*** tlian has joined #openstack-telemetry | 21:01 | |
akrzos | ok well i think i have a few actions i can take based on this | 21:01 |
akrzos | i intend on posting results from this into performance-docs here (http://docs.openstack.org/developer/performance-docs/) | 21:02 |
akrzos | thanks for your help gordc! | 21:02 |
gordc | that'd be great. we might bug you for more details once you're done so we can improve stuff | 21:02 |
gordc | thanks to you as well. | 21:02 |
akrzos | no problem!! | 21:06 |
*** Jack_Iv_ has joined #openstack-telemetry | 21:44 | |
*** thorst has quit IRC | 21:46 | |
*** thorst has joined #openstack-telemetry | 21:46 | |
*** Jack_Iv_ has quit IRC | 21:49 | |
*** thorst has quit IRC | 21:55 | |
*** fguillot has quit IRC | 21:56 | |
*** Jack_Iv_ has joined #openstack-telemetry | 22:07 | |
*** Jack_Iv_ has quit IRC | 22:08 | |
*** Jack_Iv_ has joined #openstack-telemetry | 22:09 | |
*** Jack_Iv_ has quit IRC | 22:09 | |
*** Jack_Iv_ has joined #openstack-telemetry | 22:09 | |
*** dave-mccowan has joined #openstack-telemetry | 22:10 | |
*** Jack_Iv_ has quit IRC | 22:14 | |
*** Jack_Iv_ has joined #openstack-telemetry | 22:15 | |
*** Jack_Iv_ has quit IRC | 22:16 | |
*** Jack_Iv_ has joined #openstack-telemetry | 22:18 | |
*** Jack_Iv_ has quit IRC | 22:20 | |
*** Jack_Iv has joined #openstack-telemetry | 22:35 | |
*** Jack_Iv has quit IRC | 22:38 | |
*** Jack_Iv has joined #openstack-telemetry | 22:38 | |
*** thorst has joined #openstack-telemetry | 22:52 | |
*** vint_bra has quit IRC | 22:56 | |
*** thorst has quit IRC | 23:00 | |
*** pradk has quit IRC | 23:01 | |
*** Jack_Iv has quit IRC | 23:11 | |
openstackgerrit | Merged openstack/ceilometer: Fix publisher comment https://review.openstack.org/412336 | 23:27 |
*** tlian has quit IRC | 23:35 | |
*** gordc has quit IRC | 23:37 | |
*** dave-mccowan has quit IRC | 23:50 | |
*** thorst has joined #openstack-telemetry | 23:57 | |
*** jwcroppe has quit IRC | 23:58 | |
*** jwcroppe has joined #openstack-telemetry | 23:59 | |
*** tlian has joined #openstack-telemetry | 23:59 |
Generated by irclog2html.py 2.14.0 by Marius Gedminas - find it at mg.pov.lt!