Wednesday, 2019-04-10

*** openstackgerrit has quit IRC01:30
*** openstackgerrit has joined #openstack-monasca01:33
openstackgerritjianweizhang proposed openstack/monasca-api master: Docker support keystone insecure option  https://review.openstack.org/65141801:33
*** openstackstatus has quit IRC04:35
*** openstackstatus has joined #openstack-monasca04:36
*** ChanServ sets mode: +v openstackstatus04:36
*** pcaruana has joined #openstack-monasca05:06
*** witek has joined #openstack-monasca07:23
*** pcaruana has quit IRC07:34
*** pcaruana has joined #openstack-monasca07:35
*** mohankumar has joined #openstack-monasca07:44
*** bandorf has joined #openstack-monasca08:07
*** bandorf has quit IRC08:08
*** dougsz has joined #openstack-monasca08:09
*** bandorf has joined #openstack-monasca08:14
openstackgerritMichał Piotrowski proposed openstack/monasca-thresh master: Create Docker image and build in Zuul  https://review.openstack.org/64929808:28
openstackgerritAdrian Czarnecki proposed openstack/monasca-api master: [WIP] Merge log-api and api  https://review.openstack.org/65124908:31
openstackgerritAdrian Czarnecki proposed openstack/monasca-api master: [WIP] Merge log-api and api  https://review.openstack.org/65124908:34
*** chaconpiza has quit IRC08:35
*** oneswig has joined #openstack-monasca08:40
*** chaconpiza has joined #openstack-monasca08:44
oneswigA quick question arising from the discussion here http://eavesdrop.openstack.org/irclogs/%23openstack-telemetry/%23openstack-telemetry.2019-04-09.log.html#t2019-04-09T06:36:15 - wondered if the Monasca team had considered Gnocchi as an HA-capable free alternative to InfluxDB?  Apparently Gnocchi can speak InfluxDB protocol, so it may not be too hard to achieve.08:54
oneswigI guess given the discussion was driven by concerns over Gnocchi's ongoing development, it might not be so wise...08:55
witekoneswig: we did discuss it during the PTG in fall 201708:58
witekbut nobody was willing to invest time and implement it08:59
oneswigThanks witek - so many features, so little time...08:59
witekI think the project is not actively developed anymore09:00
witekoneswig: have you considered using Kafka consumer groups to replicate measurements between InfluxDB instances?09:09
openstackgerritMerged openstack/monasca-common master: Use proper naming for docker services image zuul jobs  https://review.openstack.org/65001109:33
oneswigwitek: no, I don't think so09:40
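[Editor's sketch] The consumer-group idea witek raises can be illustrated with a small in-memory model: Kafka tracks a separate read offset per consumer group, so two persisters in different groups would each receive every measurement and could each write to their own InfluxDB instance. The `Topic` class and the group names `persister-a` and `persister-b` are hypothetical and only model the offset behaviour; this is not Kafka client code or Monasca persister code.

```python
from collections import defaultdict
from typing import Dict, List


class Topic:
    """Append-only log with an independent read offset per consumer group."""

    def __init__(self) -> None:
        self.log: List[dict] = []
        self.offsets: Dict[str, int] = defaultdict(int)

    def publish(self, message: dict) -> None:
        self.log.append(message)

    def poll(self, group: str) -> List[dict]:
        """Return the messages this group has not seen yet and advance its offset."""
        start = self.offsets[group]
        self.offsets[group] = len(self.log)
        return self.log[start:]


topic = Topic()
topic.publish({"name": "cpu.idle_perc", "value": 12.5})
topic.publish({"name": "mem.free_mb", "value": 2048})

# Each group stands for one persister feeding one InfluxDB instance (assumed setup).
for_influx_a = topic.poll("persister-a")
for_influx_b = topic.poll("persister-b")
print(len(for_influx_a), len(for_influx_b))
```

Because the offsets are kept per group, both polls return both measurements, which is what would make the two databases replicas of each other.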
*** oneswig has quit IRC09:40
openstackgerritMerged openstack/monasca-api master: Use proper naming for docker service image zuul job  https://review.openstack.org/65088509:46
openstackgerritMerged openstack/monasca-log-api master: Use proper naming for docker service image zuul job  https://review.openstack.org/65088709:46
openstackgerritMerged openstack/monasca-persister master: Add coverage report display  https://review.openstack.org/65026409:57
*** mohankumar has quit IRC11:16
openstackgerritMerged openstack/monasca-tempest-plugin master: Use proper naming for docker service image zuul job  https://review.openstack.org/65088611:21
*** mohankumar has joined #openstack-monasca11:36
*** bobh has joined #openstack-monasca11:55
openstackgerritMichał Piotrowski proposed openstack/monasca-ui master: Unit tests fail  https://review.openstack.org/65151112:07
*** haru5ny has joined #openstack-monasca12:09
*** haru5ny has quit IRC12:11
openstackgerritMichał Piotrowski proposed openstack/monasca-ui master: Unit tests fail  https://review.openstack.org/65151212:12
*** haru5ny has joined #openstack-monasca12:12
*** haru5ny has quit IRC12:13
*** haru5ny has joined #openstack-monasca12:14
*** haru5ny has quit IRC12:14
*** bobh has quit IRC12:18
openstackgerritAdrian Czarnecki proposed openstack/monasca-api master: [WIP] Merge log-api and api  https://review.openstack.org/65124912:46
*** mohankumar has quit IRC12:58
*** irclogbot_1 has joined #openstack-monasca13:03
*** altlogbot_2 has joined #openstack-monasca13:07
openstackgerritMerged openstack/monasca-agent master: Use proper naming for docker service image zuul job  https://review.openstack.org/65088213:14
openstackgerritMerged openstack/monasca-notification master: Use proper naming for docker service image zuul job  https://review.openstack.org/65088313:23
*** openstackgerrit has quit IRC14:14
witekCourtesy Monasca meeting reminder in #openstack-monasca: witek, jayahn,iurygregory,ezpz,igorn,haad,sc,joadavis, akiraY,tobiajo,dougsz_,fouadben, amofakhar, aagate, haruki,kaiokmo,pandiyan,charana,guilhermesp,chaconpiza,toabctl15:00
witek#startmeeting monasca15:00
dougszhello all15:00
openstackMeeting started Wed Apr 10 15:00:52 2019 UTC and is due to finish in 60 minutes.  The chair is witek. Information about MeetBot at http://wiki.debian.org/MeetBot.15:00
openstackUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.15:00
*** openstack changes topic to " (Meeting topic: monasca)"15:00
openstackThe meeting name has been set to 'monasca'15:00
witekhi dougsz15:00
chaconpizaHi15:01
witekhi chaconpiza15:01
bandorfhi, everybody15:01
witekhi15:01
witekagenda for today:15:02
witekhttps://etherpad.openstack.org/p/monasca-team-meeting-agenda15:02
witekI don15:02
witeksorry, let's start15:02
witek#topic monasca-thresh replacement15:02
*** openstack changes topic to "monasca-thresh replacement (Meeting topic: monasca)"15:02
witekI started thinking about how we can replace monasca-thresh15:03
witekas we urgently do need to replace it15:03
witekand so I looked how Prometheus or Aodh are doing this15:04
witekand they both don't work on streams but query from the DB15:04
witekwhich is much easier to implement15:04
witekand then I thought we could actually try to use what Prometheus offers15:05
witekand came up with this document15:05
witekhttps://docs.google.com/presentation/d/1tvllnWaridOG-t-qj9D2brddeQXsYNyZwoYUfby_3Ns/edit?usp=sharing15:05
witekI've seen your first comments, thanks a lot for that15:05
witekI'd like to start discussion, what do you think of that approach? is that plausible?15:06
bandorfmaybe we can discuss smaller topics first? and then conclude whether it's plausible?15:07
*** haru5ny has joined #openstack-monasca15:07
witekright, do we have to discuss if monasca-thresh should be replaced?15:08
chaconpizaWhat about the upgrade from current solution to the new one using Prometheus for current clients?15:09
Dobroslawhi15:09
joadavishi Dobroslaw15:09
witekchaconpiza: you mean, what operator would have to do to upgrade from one Monasca version to another?15:09
chaconpizayes15:10
bandorfI propose to discuss this (migration) later, when a decision has been taken15:10
witekthe measurement schema would change, so although saved in InfluxDB, some data migration would have to happen if new functionality would be required15:11
joadaviswell, if we keep the monasca api and just use prometheus for the thresholding and alarming, it might not be much change for a current client15:11
bandorfRegarding your problem statement, Witek: I agree with topic 1,2 and 5.15:11
bandorftopic 4 (complex cluster): I can't really judge15:11
bandorftopic 3: High resource consumption: This is certainly true. However, I'm not sure if this is caused by monasca itself or by storm15:12
DobroslawI'm not sure if Prometheus actually will be lighter than storm...15:12
joadavisyes, would definitely want to qualify performance15:13
joadavisand footprint15:13
Dobroslawwould be nice if we found someone using prometheus in production who could tell us how many resources it's using, on average and with data spikes15:13
DobroslawI found quite a few people complaining about memory usage15:14
dougsz Dobroslaw: We're using it, I haven't benchmarked it yet, but I've frequently seen it at the top of `top`15:14
bandorfUsing "remote read" from influx causes some further overhead - don't know, to what extent15:15
Dobroslawand it doesn't have built-in max memory tuning options15:15
dougszIn addition to extending alarm expression language (#2) we also have a requirement to include metadata with alarms15:15
DobroslawI think I linked to a discussion, something like using 10x more memory per measurement...15:15
witekdougsz: where does the metadata come from, and can that requirement be addressed with Prometheus?15:17
joadavisI've talked to a few people who have the impression that Prometheus has a smaller footprint than Monasca, but I suspect that is relative to their install (or just marketing speak)15:17
dougszwitek: For example, we want to create a Jira ticket for every log error message. The metadata would include a snippet of the error message. Not sure if it can be done with Prometheus either. I think the approach would be to use something like mtail to make logs scrapable.15:18
Dobroslawit's an invasive change, HA will need to be handled differently, not sure how to test it quickly with monasca15:18
witekDobroslaw: what would be an alternative?15:20
Dobroslawunfortunately I don't have an alternative, just raising an important point: monasca would most likely be installed on the same machine as prometheus15:21
Dobroslawand sharing resources with it15:21
joadaviswe may need a POC to show it can be done...15:22
witekremote read is for sure an important aspect, Prometheus normally makes use of built-in aggregations and in proposed setup, the calculation would have to be done on the complete dataset15:22
witekcomplete dataset for a given alerting rule only of course, normally the last 10 minutes of data or so15:24
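[Editor's sketch] The query-based evaluation described here can be illustrated as follows: for each alerting rule, read the recent window of points (10 minutes in this example) and compute the aggregate over the complete set. The `Rule` class, the `evaluate` function and the in-memory `points` list, which stands in for the result of a remote read, are all hypothetical; this is not the Prometheus or Monasca API.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Tuple

Point = Tuple[float, float]  # (unix timestamp in seconds, value)


@dataclass
class Rule:
    name: str
    window_s: float   # how far back the rule looks
    threshold: float  # fire when the window average exceeds this


def evaluate(rule: Rule, points: List[Point], now: float) -> bool:
    """Return True if the average over the rule's window exceeds the threshold."""
    window = [v for (t, v) in points if now - rule.window_s <= t <= now]
    if not window:
        return False  # no data in the window: do not fire
    return mean(window) > rule.threshold


now = 1_000_000.0
# Stand-in for the data a remote read would return for this rule.
points = [(now - 900, 10.0), (now - 300, 95.0), (now - 60, 97.0)]
rule = Rule(name="cpu_high", window_s=600, threshold=90.0)
print(evaluate(rule, points, now))
```

Every evaluation has to fetch and scan the whole window again, which is the cost being discussed, but no state needs to be kept between evaluations.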
witekdougsz: how do you use Prometheus, do you have many alerting rules? how much data?15:25
dougszWe aren't using it at scale yet and we don't have a large number of alerting rules.15:27
dougszWe've combined it with mtail to generate metrics from log messages15:27
dougszCurrently we use Prometheus as the TSDB, no Influx yet15:28
dougszWe use kolla-ansible for the deployment - there are quite a few exporters included in that out of the box15:30
witekyes, for the collector part we should advertise the monasca-agent Prometheus plugin better15:31
witekthanks dougsz15:32
dougsz+1 - I think that's a big win - Prometheus exporters are generally pretty up-to-date and it's great we can take advantage of them from the Monasca Agent.15:32
witekbandorf has commented on the delay until the alarm gets triggered15:32
witekis that an issue?15:32
dougszI think it's a good point.15:33
witekis it a requirement for anyone?15:33
bandorfI had a brief discussion with Cristiano (Product Management) about this. His opinion was: In a typical OpenStack environment, it should be OK. In other scenarios (IoT-demo-fire alarm) it is not.15:34
dougszGenerally we haven't used the buffering capabilities of Kafka too much, but it's slightly concerning that alarms could stop working if there was a large burst of metrics.15:34
joadavismay depend on use case.  Some of the auto-scaling/self healing may want faster alarming15:34
joadavisto reduce downtimes and interruptions15:36
witekI think the streaming based implementation would be much more complicated, requiring knowledge of Kafka Streams or Apache Storm15:37
witekor not scalable, like monasca-aggregator15:38
witekthe only way to scale aggregator is to shard the data and consume from different Kafka topics15:39
witekwhich is also a valid approach after all15:39
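[Editor's sketch] The sharding approach mentioned for the aggregator could look roughly like this: a stable hash of the metric name selects one of N topics, and each aggregator instance consumes a single topic. The function `topic_for`, the `metrics-<n>` topic naming and the choice to shard by metric name are assumptions made for illustration; none of this is monasca-aggregator code.

```python
import zlib


def topic_for(metric_name: str, num_shards: int, prefix: str = "metrics") -> str:
    """Map a metric name to one of num_shards topic names using a stable hash."""
    # crc32 is deterministic across processes, unlike Python's built-in hash().
    shard = zlib.crc32(metric_name.encode("utf-8")) % num_shards
    return f"{prefix}-{shard}"


for name in ["cpu.idle_perc", "mem.free_mb", "disk.space_used_perc"]:
    print(name, "->", topic_for(name, 4))
```

Because the same metric name always maps to the same topic, each aggregator instance sees the complete series for the metrics it owns.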
witekI have one other concern about the Prometheus-based setup15:40
witekPrometheus defines all its alerting rules and notification via config files15:41
witekthere is no API for setting them15:41
witekonly query API to get the current configuration15:41
joadavisyeah, that is a concern especially if we do an HA setup (keeping the config files in sync)15:42
joadavisdoes changing a rule then require restarting the Prometheus service?15:42
witekreloading15:43
witekok, let's sum up what we have on advantages:15:44
witek* great community eco-system with many integrations15:45
witek* very flexible alerting rules15:45
witek* and query language for visualisations15:45
witek* easy deployment15:46
witekanything else?15:47
witekdisadvantages:15:48
dougsz* could also monitor the monasca components directly? eg. alert if influxdb goes down15:48
witekyes, I'm not sure if that's Prometheus specific15:49
joadavisdisadvantage: * potentially large footprint and resource usage15:49
joadavisdisadvantage: * no guaranteed delivery of metrics (requirement for billing systems, not as much a concern for alerting)15:50
witek* remote read requires getting complete data chunks from InfluxDB for every evaluation15:50
joadavisdisadvantage: * no native HA support, requires work to design15:51
dougszdisadvantage: * HA model for Prometheus server isn't totally clear (to me at least)15:51
witekjoadavis: well, with Kafka and InfluxDB we do get guaranteed delivery15:51
dougszdisadvantage: * Alerting chain is even more complex. Eg. Monasca API -> Kafka -> Persister -> Influx -> Prometheus -> Alert manager15:52
bandorfdisadvantage: * longer latency time until alarm gets fired15:52
bandorfunknown: * impact of 'remote read to influxdb'15:53
witekI would also argue with HA model, it's the same model as for InfluxDB, and we can use API and Kafka to help make it better15:53
witekdisadvantage: no API for alerting rules and notifications, config based operation15:54
joadavisI have a question about whether this puts Cassandra out of our design, but we are short on time so we can save that for another day15:54
witekfor this set up, we could not use Cassandra, it does not have remote read15:55
witekOK, let's cut it here for now15:56
witeklet's quickly go through the other topics:15:56
witek#topic Retirement of Openstack Ansible Monasca roles15:56
*** openstack changes topic to "Retirement of Openstack Ansible Monasca roles (Meeting topic: monasca)"15:56
witekhttp://lists.openstack.org/pipermail/openstack-discuss/2019-April/004610.html15:56
witekguimaluf: are you around?15:57
witekunfortunately I don't know anyone using OSA15:57
witek#topic Telemetry discussion15:58
*** openstack changes topic to "Telemetry discussion (Meeting topic: monasca)"15:58
witekhttp://lists.openstack.org/pipermail/openstack-discuss/2019-April/004851.html15:58
witekthere was a kick-off meeting for the Telemetry project yesterday15:58
witekwith the new PTL15:58
witekafter nobody ran for PTL in Train15:59
witekanyway, they have considered whether they should continue to rely on Gnocchi or search for alternatives16:00
joadavisI want us to have a good response for that16:00
joadavisI need to write a thoughtful email back and recommend monasca-ceilometer :)16:00
witekas Mark has written in his email, it would be good to maintain just one monitoring project in OpenStack16:01
dougszwas just thinking about ceilosca16:01
joadavisbut we could also have larger discussions about where the monasca agent and ceilometer agent overlap and how to make mon-agent cover all16:01
witekjoadavis: do we want to sync about the answer to the mailing list?16:03
joadavissure. I can write a draft and send it to you, or you can16:03
witekOK, ping you offline16:03
joadaviswith these kinds of questions I start thinking in pictures, but that is hard to do in text emails16:04
witek#topic PTG16:04
*** openstack changes topic to "PTG (Meeting topic: monasca)"16:04
witekwe have a conflict with self-healing session on the first day, Thursday16:04
witekshould we start our sessions on Friday?16:04
witekand free the slot?16:05
dougszsounds sensible16:05
chaconpiza+116:05
witekjoadavis: chaconpiza ?16:05
DobroslawI'm not sure if chaconpiza will be returning on Friday16:05
chaconpizaI will come back on Saturday, I found a good connection flight :)16:06
Dobroslawoh, great16:06
joadavisI'm ok with that. I think one of our goals for this PTG should be working with other projects and SIGs16:06
witekOK, thanks for joining today16:07
witekand for good discussion16:07
witeknext week I'm on vacation16:07
witekso could someone else please start the meeting16:07
witekall from me, bye16:08
dougszThanks all, and have a good vacation16:08
Dobroslawbye16:08
joadavisbye16:08
haru5nythank you, bye.16:08
chaconpizaOk, enjoy the vacation. Bye.16:08
witek#endmeeting16:08
*** openstack changes topic to "OpenStack Monitoring as a Service | https://wiki.openstack.org/wiki/Monasca"16:08
openstackMeeting ended Wed Apr 10 16:08:38 2019 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)16:08
openstackMinutes:        http://eavesdrop.openstack.org/meetings/monasca/2019/monasca.2019-04-10-15.00.html16:08
openstackMinutes (text): http://eavesdrop.openstack.org/meetings/monasca/2019/monasca.2019-04-10-15.00.txt16:08
openstackLog:            http://eavesdrop.openstack.org/meetings/monasca/2019/monasca.2019-04-10-15.00.log.html16:08
*** haru5ny has quit IRC16:08
joadavisOne topic we didn't get to in the meeting - is anyone working on python3 support?  I suspect we have some check tests running in Zuul which are labeled py3 but aren't actually executing the unit tests (at least for monasca-agent).16:21
*** altlogbot_2 has quit IRC16:45
*** dougsz has quit IRC17:00
*** witek has quit IRC17:44
-openstackstatus- NOTICE: Restarting Gerrit on review.openstack.org to pick up new configuration for the replication plugin19:05
*** bobh has joined #openstack-monasca21:39
*** bobh has quit IRC21:47
*** bobh has joined #openstack-monasca21:48
*** bobh has quit IRC22:14
