*** vishalmanchanda has joined #openstack-monasca | 06:59 | |
*** mensis has joined #openstack-monasca | 07:13 | |
*** witek has joined #openstack-monasca | 07:31 | |
AJaeger | witek, can you help with 717469, please ? ^ | 08:40 |
---|---|---|
AJaeger | Also, please review the monasca changes in https://review.opendev.org/#/q/status:open++topic:cleanup-py27-support - those should go in before the release | 08:40 |
AJaeger | witek, could you help, please? ^ | 08:40 |
witek | AJaeger: on it, I've put my -1 on the release change until then | 08:54 |
AJaeger | thanks, witek ! | 09:01 |
*** gokhani has joined #openstack-monasca | 09:08 | |
*** k_mouza has joined #openstack-monasca | 09:59 | |
k_mouza | hello all. I'm installing Monasca stable/stein with kolla-ansible and I'm having a problem with monasca-persister on all the controller nodes. I'm getting InfluxDBClient errors ( http://paste.openstack.org/show/791856 ) and the persister container constantly restarts, without sending metrics to InfluxDB. Anyone seen this before? Thanks a lot! | 10:14 |
witek | the metric comes apparently from the log-metrics component and cannot be parsed correctly for some reason | 10:26 |
witek | please set repositories.ignore_parse_point_error to True in monasca-persister to drop these | 10:26 |
witek | https://opendev.org/openstack/monasca-persister/src/branch/master/monasca_persister/conf/repositories.py#L37 | 10:27 |
witek | brtknr, Wasaac: any hints on this? ^^ | 10:29 |
*** k_mouza has quit IRC | 10:35 | |
*** k_mouza has joined #openstack-monasca | 10:40 | |
*** k_mouza has quit IRC | 11:28 | |
*** k_mouza has joined #openstack-monasca | 11:28 | |
*** k_mouza has quit IRC | 11:33 | |
*** k_mouza has joined #openstack-monasca | 11:51 | |
*** k_mouza has joined #openstack-monasca | 11:52 | |
*** ierdem has joined #openstack-monasca | 12:22 | |
*** dougsz has joined #openstack-monasca | 13:49 | |
k_mouza | witek: thanks a lot, didn't know that configuration option. I did enable it but I'm still getting these errors about partial writes because it's not able to parse some measurements... | 13:51 |
k_mouza | is there an option to completely drop "log." metrics? | 13:52 |
witek | stop log-metrics component | 13:53 |
witek | dougsz: any idea why this is happening: http://paste.openstack.org/show/791856/ | 13:54 |
witek | ? | 13:54 |
witek | that's on stable/stein | 13:54 |
dougsz | This is Monasca deployed with Kolla Ansible? | 13:59 |
dougsz | k_mouza ^ ? | 14:00 |
k_mouza | dougsz: yup, correct | 14:00 |
k_mouza | kolla-ansible stable/stein | 14:00 |
dougsz | I think it is a bug with the supplied log metrics configuration. | 14:00 |
dougsz | Too much information is being included with a metric generated from a log file. | 14:01 |
dougsz | Unfortunately the log-metrics service is dangerous like that, because currently it doesn't post anything back via the Monasca API for validation. So the logstash config which drives it can generate invalid metrics. | 14:02 |
dougsz | To fix it, you need to override the log-metrics config file (and file a bug in KA please) | 14:04 |
dougsz | Basically you don't want any highly variable dimensions like log message snippets going into InfluxDB, because it will kill the performance. | 14:06 |
k_mouza | hmm ok, I can have a go at this. Thanks for the info. One more thing. Let's say I want to completely drop the log metrics in monasca. I tried stopping all the monasca-log containers (monasca_log_metrics, monasca_log_persister, monasca_log_transformer and monasca_log_api) and then tried to manually drop the "log.nova-compute.warning" measurement in Influxdb. That measurement came back after a minute even with all the monasca-log services being down | 14:07 |
dougsz | Cool, I think it is buffered in Kafka | 14:08 |
dougsz | In your monasca persister config, do you have ignore_parse_point_error enabled? | 14:09 |
dougsz | Example: https://github.com/SKA-ScienceDataProcessor/alaska-kayobe-config/blob/alaska-prod/etc/kayobe/kolla/config/monasca/persister.conf#L10 | 14:09 |
dougsz | That should allow the failed write to be ignored and unblock your pipeline. | 14:10 |
k_mouza | yup, I've set that to "true" after witek suggested it above, but still having same errors | 14:11 |
k_mouza | I haven't tried restarting/redeploying kafka though after I've added this in the persister config file | 14:11 |
dougsz | Thanks witek, sorry I missed it. Need to set up a bouncer. | 14:11 |
dougsz | You've restarted all of the persisters though? | 14:12 |
dougsz | monasca_persister specifically | 14:12 |
dougsz | Kafka should not require restarting | 14:12 |
k_mouza | yeah, all monasca components were restarted | 14:13 |
dougsz | Big hammer in Kolla Ansible is to delete the Kafka + Zookeeper containers, delete the Kafka + Zookeeper Docker volumes and then redeploy Kafka + Zookeeper. You will lose data buffered in Kafka, and break Monasca, but it can be useful sometimes in dev envs. | 14:13 |
dougsz | Monasca should recover after the redeploy | 14:14 |
k_mouza | yeah, that's what I'm thinking of doing as my next step | 14:14 |
k_mouza | hammer time then ! | 14:14 |
dougsz | Not something for production :) | 14:15 |
dougsz | Just checking, you definitely included the right section header for ignore_parse_point_error? I'm surprised it didn't help | 14:16 |
*** gokhani has quit IRC | 14:16 | |
k_mouza | yeah, the [repositories] one right? I had high hopes for that as well! | 14:16 |
dougsz | yeah :( | 14:17 |
dougsz | https://pastebin.com/Zk60nemR | 14:20 |
dougsz | k_mouza ^ I've been meaning to tweak the supplied log-metrics config to something more like that, haven't had a chance to formalise it yet | 14:21 |
dougsz | Note that includes some parsing of HAProxy logs for monitoring response time, you will likely want to strip that out, unless you want to do that, and then i can supply the required fluentd config. | 14:21 |
k_mouza | thanks a lot dougsz! I'll give that a spin after the deploy that's running now | 14:24 |
dougsz | you're welcome, reach out anytime | 14:24 |
k_mouza | dougsz: by the way. After deleting kafka, zookeeper and influxdb volumes and redeploying, metrics are pushed in the db fine and I can see their measurements with monasca cli :) | 14:36 |
dougsz | Good to hear, k_mouza, hopefully you won't need to do that again! | 14:37 |
dougsz | We run the stein release (deployed via Kayobe + Kolla Ansible) in quite a few production environments and it's generally pretty stable. | 14:38 |
k_mouza | good to know! I'm guessing you're using the ignore config in the persister.conf right? I'll try your log metric config later as well | 14:39 |
dougsz | Yeah, generally I don't see many metrics dropped, but without that setting there is a risk of the pipeline freezing as you found out. | 14:53 |
k_mouza | yup, makes sense! | 14:54 |
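
Editor's note: the exchange above turns on whether `ignore_parse_point_error` is set under the `[repositories]` section of the persister config. The following is a minimal sketch of how that placement could be checked. It uses Python's stdlib `configparser` as a stand-in for the oslo.config loading the service itself performs; the config text, the helper name `ignores_parse_point_errors` and the `[influxdb]` counter-example are illustrative assumptions, and only the option and section names come from the discussion.

```python
import configparser

# Hypothetical persister.conf fragment. Only the option name and the
# [repositories] section are taken from the conversation above.
PERSISTER_CONF = """
[repositories]
ignore_parse_point_error = True
"""

# Same option placed under a different section (illustrative mistake).
MISPLACED_CONF = """
[influxdb]
ignore_parse_point_error = True
"""


def ignores_parse_point_errors(conf_text: str) -> bool:
    """Return True only if the option is enabled under [repositories]."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    # fallback=False covers both a missing section and a missing option.
    return parser.getboolean(
        "repositories", "ignore_parse_point_error", fallback=False
    )


print(ignores_parse_point_errors(PERSISTER_CONF))   # True
print(ignores_parse_point_errors(MISPLACED_CONF))   # False
```

A check like this only confirms what is in the file; the persister still has to be restarted to pick the setting up, as dougsz asks at 14:12.
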
*** vishalmanchanda has quit IRC | 15:09 | |
*** ierdem has quit IRC | 15:24 | |
*** witek has quit IRC | 16:10 | |
*** k_mouza has quit IRC | 16:49 | |
*** dougsz has quit IRC | 17:07 | |
*** dougsz has joined #openstack-monasca | 17:07 | |
*** dougsz has quit IRC | 17:12 | |
*** k_mouza has joined #openstack-monasca | 18:05 | |
*** dougsz has joined #openstack-monasca | 18:53 | |
*** spsurya_ has quit IRC | 19:08 | |
*** dougsz has quit IRC | 20:10 | |
*** dougsz has joined #openstack-monasca | 20:22 | |
*** dougsz has quit IRC | 20:27 | |
*** mensis has quit IRC | 20:47 | |
*** k_mouza has joined #openstack-monasca | 22:33 | |
*** k_mouza has quit IRC | 23:12 |
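
Editor's note: dougsz's remark at 14:06 about highly variable dimensions can be illustrated with a small sketch. It assumes that each distinct combination of metric name and dimension values ends up as its own series in the time-series store, which is why a dimension carrying log message snippets is costly. The `message` and `hostname` dimensions, the sample values and both helper functions are made up for illustration; only the metric name `log.nova-compute.warning` appears in the log.

```python
def series_count(metrics):
    """Count distinct (name, dimension set) pairs, one per series."""
    return len(
        {(m["name"], frozenset(m["dimensions"].items())) for m in metrics}
    )


def drop_dimensions(metric, unwanted):
    """Return a copy of the metric without the named dimensions."""
    kept = {k: v for k, v in metric["dimensions"].items() if k not in unwanted}
    return {"name": metric["name"], "dimensions": kept}


# Made-up sample: 1000 warnings, each with a unique message snippet.
raw = [
    {
        "name": "log.nova-compute.warning",
        "dimensions": {
            "hostname": "ctrl1",
            "message": f"request {i} took too long",
        },
    }
    for i in range(1000)
]

cleaned = [drop_dimensions(m, {"message"}) for m in raw]

print(series_count(raw))      # 1000
print(series_count(cleaned))  # 1
```

In a real deployment the equivalent filtering would live in the log-metrics pipeline configuration rather than in Python; the sketch only shows the effect on series count.
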