13:00:49 <witek> #startmeeting monasca
13:00:49 <openstack> Meeting started Tue Dec 10 13:00:49 2019 UTC and is due to finish in 60 minutes. The chair is witek. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:00:50 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:00:52 <openstack> The meeting name has been set to 'monasca'
13:01:00 <witek> hello everyone
13:05:11 <witek> anyone around?
13:08:58 <chaconpiza> Hi, sorry for coming late.
13:09:12 <witek> hi Martin
13:09:19 <chaconpiza> Dobek took a day-off
13:09:46 <witek> let's start then
13:09:49 <chaconpiza> sure
13:09:52 <witek> agenda:
13:09:56 <witek> https://etherpad.openstack.org/p/monasca-team-meeting-agenda
13:10:20 <witek> #topic Promethues plugin update
13:10:28 <witek> https://github.com/stackhpc/stackhpc-monasca-agent-plugins
13:11:21 <witek> dougsz has published his extension for Prometheus plugin
13:11:37 <witek> I went through the readme today
13:11:45 <dougsz> hey, sorry i'm late
13:11:51 <witek> hi Doug
13:11:57 <witek> nice work
13:12:13 <dougsz> ah, thanks, just a test really at the moment, but it seems feasible
13:12:59 <dougsz> the main motivation was to use the Ceph Prometheus endpoint to replace the Monasca Agent Ceph collector
13:13:14 <dougsz> but should hopefully work for everything
13:13:41 <dougsz> I was wondering about the best approach to merge it into Monasca Agent
13:13:42 <witek> right, Ceph has added Prometheus instrumentation
13:14:02 <dougsz> The current Prometheus Monasca Agent plugin is heavily geared to k8s
13:14:11 <dougsz> which is probably not what a lot of people want
13:14:44 <witek> but it also allows static configuration, right?
13:14:55 <dougsz> yeah, in a very basic sense
13:15:09 <dougsz> I almost wonder if it should be the prometheus-k8s plugin
13:15:33 <dougsz> and then have a vanilla prometheus plugin like the prototype
13:15:42 <dougsz> Alternative is to have all functionality in one plugin
13:16:00 <witek> is there that many conflicting points?
13:16:52 <dougsz> There is some whitelisting capability but only for the k8s thing
13:16:59 <dougsz> which would need unifying
13:17:45 <witek> I see
13:17:46 <dougsz> `monasca.io/whitelist`
13:17:51 <dougsz> slightly weird naming
13:19:06 <dougsz> and the metrics types thing, is for k8s
13:19:16 <dougsz> but could be useful for general endpoints
13:19:19 <witek> I thought, `auto_detect_endpoints` could control if we use static endpoints configuration or K8s auto detection
13:19:52 <dougsz> yeah - I mean it works
13:20:38 <dougsz> but it seems like a chunk of the k8s specific config needs to be pulled down and made available to the non-k8s bit
13:20:59 <dougsz> and you have to worry about backwards compatible option naming etc.
13:21:24 <dougsz> hence my thoughts about making a new plugin
13:21:47 <witek> so you'd prefer to rename the existing one to prometheus-k8s, and publish the prototype based as prometheus?
13:22:13 <dougsz> something like that
13:22:50 <dougsz> probably best not to rename the existing one though for backwards compatibility
13:23:05 <witek> fine for me, if that's much easier to implement
13:23:37 <dougsz> thanks - it's much easier for me too, given limited time
13:25:34 <witek> the plugin name is not that important after all, we could document both options in the same section and point to two different configuration files
13:26:07 <dougsz> that's true - I can push something to that effect
13:26:18 <witek> chaconpiza: any thoughts?
13:27:12 <chaconpiza> to be in the context... the goal of this new plugin is to handle the new version of Ceph?
13:28:29 <witek> new Ceph versions offer Prometheus instrumentation, so yes, we can monitor Ceph with this new Prometheus plugin
13:28:53 <witek> but also any other Prometheus endpoints
13:29:15 <dougsz> chaconpiza: Ceph was the motivation (because the existing plugin parses the Ceph CLI which changes from release to release)
13:29:36 <chaconpiza> yes, I remember.
13:29:46 <dougsz> but yeah, as witek says, to make it easier to use Prometheus endpoints in general without running another Monasca service to compute rates etc.
13:30:04 <chaconpiza> then it sound good like a long term solution
13:31:21 <witek> I'm a little worried about performance with larger setups, where we'd like to scrape several endpoints and define multiple aggregations
13:31:59 <witek> in the long term aggregation on Monasca server would be better
13:32:36 <witek> but that's more work, so I'm happy with this plugin
13:33:08 <dougsz> One advantage of this is that the whitelist can greatly reduce the amount of data going into Monasca
13:33:31 <witek> right, that's very useful
13:33:37 <dougsz> Many Prometheus endpoints produce vast amounts of data at scale (eg. Ceph, 1M scrapes for 10 node cluster)
13:34:33 <dougsz> But I agree, people may want to the agent to be lightweight and shift the compute burden to the central Monasca deployment
13:34:46 <dougsz> *want the
13:34:52 <witek> do you remember this component?
13:34:56 <witek> https://github.com/monasca/monasca-aggregator
13:35:07 <dougsz> yeah
13:35:31 <witek> we could implement it scalable with Faust
13:36:03 <dougsz> That's a good idea
13:36:45 <witek> so many nice things we can do :)
13:37:12 <witek> I was reading through the doc of your Prom plugin
13:37:29 <witek> the `counter` section is somewhat missleading
13:38:06 <dougsz> any feedback much appreciated :)
13:38:36 <witek> will you be proposing it upstream in the next time?
13:39:00 <dougsz> yeah, I will do that, probably the best place to dicuss
13:39:17 <witek> very nice, thanks a lot!
13:39:44 <witek> can we move on?
13:39:56 <dougsz> please do, thanks
13:40:02 <witek> #topic review
13:40:18 <witek> we've made some progress on reviews this week
13:40:41 <witek> the merging of DevStack plugin landed
13:41:10 <witek> also updating ELK change has been updated
13:41:48 <witek> here our board:
13:41:51 <witek> https://storyboard.openstack.org/#!/board/190
13:42:14 <witek> I started looking at periodic notifications
13:42:25 <witek> https://storyboard.openstack.org/#!/story/2006837
13:42:38 <witek> the changes have been up for much too long already
13:43:11 <witek> other one needing attention is IPv6 support:
13:43:17 <witek> https://review.opendev.org/673274
13:43:46 <witek> Adrian has submitted new change deleting the old plugin from monasca-log-api:
13:43:52 <witek> https://review.opendev.org/690527
13:44:23 <witek> do you have some more reviews you'd like to mention?
13:45:07 <dougsz> None from me
13:45:10 <witek> #topic new bugs
13:45:20 <witek> we have one new bug report this week
13:45:25 <witek> https://storyboard.openstack.org/#!/story/2006984
13:45:37 <witek> it's about upgrading the DB schema
13:46:04 <dougsz> yes, that's mine, I will push a fix soon
13:46:23 <witek> nice, thank dougsz
13:46:42 <dougsz> I should have spotted that really - I think the alembic step just needs to query existing plugins which are configured and skip deleting the associated types
13:47:23 <dougsz> Worth knowing about if anyone is upgrading anytime soon
13:48:01 <witek> that's from Queens to any other version?
13:48:19 <dougsz> yeah, we did Queens -> Rocky -> Stein and hit it then
13:48:56 <witek> do you know at which step?
13:49:13 <witek> I mean -> R or -> S
13:49:37 <dougsz> I *think* it is in Rocky where we got rid of built in notification types
13:50:34 <dougsz> No stein actually
13:51:25 <witek> ok, thanks, we're upgrading to Rocky but I think we didn't hit it
13:52:06 <witek> #topic AOB
13:52:16 <chaconpiza> We had in a pre-production-env suddenly a big increase of memory consuption from Influxdb 1.3.4
13:52:31 <chaconpiza> We noticed that the monasca-metric-agent was wrongly configured in the url of Nova for the metric "http_status".
13:53:05 <chaconpiza> So it was producing the metrics with a long string in the "value_meta" besides of having "1" as the metric "value".
13:53:21 <chaconpiza> we are wondering whether Influxdb has troubles to process a big amount of points with this "long value_meta"
13:53:34 <dougsz> chaconpiza: Issue with the detection plugin not configuring the URL correctly?
13:54:09 <chaconpiza> In our devs machines the detection plugin works well and it end up with correct URL for keystone, nova, cinder, etc
13:54:30 <dougsz> value_meta limit is ~2kb right? the main issue I have seen is with two many unique dimensions
13:54:31 <chaconpiza> I am not sure how the pre-production team got it
13:54:41 <dougsz> *s/two/too
13:54:58 <chaconpiza> because of the cardinality?
13:55:01 <dougsz> yeah
13:55:31 <chaconpiza> We are simulating the metric-agent with: https://github.com/monasca/monasca-perf/blob/master/scale_perf
13:55:38 <chaconpiza> agent_simulator.py but setting a long string in the "value_meta".
13:55:52 <chaconpiza> in order to reproduce the issue
13:58:12 <chaconpiza> we will keep you informed in case we can break influxdb because of big "value_meta"
13:58:44 <dougsz> thanks, it's probably worth investigating InfluxDB 1.7 as 1.3 has a security issue where dimension values are leaked across tenants
13:59:11 <witek> I think it would be better to improve the auto-detection script to configure the agent as needed
14:00:04 <chaconpiza> so far all metric-agent's configuration were fixed manually and restarted
14:00:56 <chaconpiza> in pre-prod a single tenant is being used
14:01:13 <witek> OK, please keep us updated
14:01:22 <witek> the time is over
14:01:30 <witek> thanks for the discussions
14:01:43 <dougsz> thanks all, bye
14:01:48 <chaconpiza> thanks
14:01:52 <witek> thanks, bye
14:01:55 <witek> #endmeeting
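
Note on the Prometheus plugin topic: the log mentions whitelisting metrics from a Prometheus endpoint and computing rates in the agent rather than in a separate Monasca service. The sketch below illustrates that idea only; it is not the code of the existing Monasca Agent plugin or of the stackhpc prototype. All function names, the metric names, and the simplified parsing (label values containing `}` are not handled) are assumptions made for illustration.

```python
import re

# Hypothetical helpers for illustration; not taken from monasca-agent
# or stackhpc-monasca-agent-plugins.
# Matches "name{labels} value" or "name value" in the text exposition format.
# Simplification: label values containing "}" are not supported.
_SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(\{[^}]*\})?\s+(\S+)')


def parse_samples(text):
    """Parse scraped text into a dict keyed by (metric name, label string)."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # skip blank lines and HELP/TYPE comments
        match = _SAMPLE_RE.match(line)
        if match is None:
            continue  # ignore lines this simplified parser cannot read
        name, labels, value = match.groups()
        samples[(name, labels or '')] = float(value)
    return samples


def filter_whitelist(samples, whitelist):
    """Keep only samples whose metric name is in the whitelist."""
    return {key: value for key, value in samples.items() if key[0] in whitelist}


def counter_rates(previous, current, interval_seconds):
    """Per-second rate of counters between two scrapes.

    Series missing from the previous scrape, or whose value decreased
    (counter reset), are skipped rather than guessed at.
    """
    rates = {}
    for key, value in current.items():
        if key in previous and value >= previous[key]:
            rates[key] = (value - previous[key]) / interval_seconds
    return rates


if __name__ == '__main__':
    # Two made-up scrapes taken 30 seconds apart.
    first = 'example_ops_total{daemon="a"} 100\nexample_other_total 5\n'
    second = 'example_ops_total{daemon="a"} 160\nexample_other_total 9\n'
    whitelist = {'example_ops_total'}
    before = filter_whitelist(parse_samples(first), whitelist)
    after = filter_whitelist(parse_samples(second), whitelist)
    print(counter_rates(before, after, 30))
```

Filtering before computing rates keeps the per-scrape work proportional to the whitelist size, which is the data-reduction benefit raised in the discussion; aggregation on the Monasca server side, as also suggested there, would move this work out of the agent.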