15:00:43 <witek_> #startmeeting monasca
15:00:43 <openstack> Meeting started Wed Sep 11 15:00:43 2019 UTC and is due to finish in 60 minutes. The chair is witek_. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:44 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:46 <openstack> The meeting name has been set to 'monasca'
15:01:04 <witek_> hello everyone
15:01:06 <joadavis> hello
15:01:52 <witek_> anyone else around?
15:01:59 <hosanai> hi
15:02:14 <witek_> hi hosanai, good to see you
15:03:28 <witek_> small group but let's start
15:03:36 <witek_> #topic monasca.io
15:04:07 <witek_> thanks to Roland for extending the domain registration
15:04:16 <witek_> and Tim for fixing the website
15:04:34 <witek_> so we have monasca.io up and running again
15:05:09 <witek_> we have also agreed we can use it as the community landing page
15:05:21 <witek_> we'll have to update the content
15:05:54 <witek_> after completing this we
15:06:29 <witek_> the content is maintained in GitHub
15:06:49 <witek_> https://github.com/monasca/monasca.github.io
15:07:17 <witek_> the same rules as for other repos are valid
15:07:42 <witek_> contributors should create PRs and sign their commits
15:07:58 <witek_> these will be reviewed by core reviewers
15:08:15 <witek_> thanks again to Roland and Tim
15:08:38 <witek_> hi dougsz
15:08:48 <Wasaac> o/ Sorry I'm late
15:09:02 <witek_> hi Wasaac
15:09:02 <dougsz> hi, sorry same as ^
15:09:12 <witek_> just shortly updated on monasca.io page
15:09:43 <witek_> it's up again and we have approval to use it as the community landing page
15:10:08 <joadavis> Now that it is back, do we need to let any users know it is operational again? I think it was a user that first pointed out it was down
15:10:44 <Wasaac> Where would one announce to users if not on that page?
15:10:59 <joadavis> I'm sure they could figure it out themselves if they try deploying a helm chart with dependency
15:11:28 <witek_> joadavis: do you think about openstack-discuss?
15:12:22 <dougsz> Might be nice to update it a little before promoting it? I recall one of our customers got annoyed at the "Documentation, coming soon" page! (http://monasca.io/docs/index.html)
15:12:29 <joadavis> I was trying to remember which user it was that first reported the problem. Was it t-mobile?
15:12:57 <joadavis> I was thinking a more direct contact than a broadcast message
15:13:14 <dougsz> 👍
15:13:40 <witek_> I'll send an email to the reporter, and after updating the content we can promote at openstack-discuss
15:14:27 <witek_> #topic Review Priority flag
15:14:43 <witek_> http://lists.openstack.org/pipermail/openstack-discuss/2019-September/009162.html
15:15:14 <witek_> I've suggested a simple process for setting review priority flag
15:15:59 <witek_> any change could be proposed to get prioritized by anyone in the list or in the meeting
15:16:25 <witek_> Any core reviewer, preferably from a different company, could confirm such proposed change by setting RV +1.
15:16:42 <witek_> what do you think?
15:16:47 <dougsz> I think it's a good idea - we can focus review efforts
15:17:23 <witek_> we also give more visibility to important changes
15:19:02 <witek_> additionally, Doug suggested publishing Gerrit Dashboard with the list of prioritized changes as one of the items
15:19:06 <witek_> http://www.tinyurl.com/monasca
15:20:19 <joadavis> cool
15:21:23 <witek_> I guess we can push a change to our docs and see in the review if folks like it
15:22:56 <dougsz> Sounds like a good plan
15:22:58 <Wasaac> +1
15:23:32 <witek_> cool, thanks
15:23:49 <witek_> #topic Kafka client status update
15:24:25 <witek_> when testing Kafka publisher in API with tempest test for Python 2 I
15:25:11 <witek_> I've found a bug in monasca-common
15:25:11 <witek_> https://review.opendev.org/680653
15:26:27 <witek_> with this bugfix all tempest tests for API Kafka client upgrade pass
15:27:59 <witek_> with the changes in API and notification (persister already merged) we have all python components updated
15:28:31 <witek_> so the goal is getting close to completion
15:29:10 <witek_> I'll appreciate your reviews
15:30:12 <witek_> that's all from me, do you have other topics?
15:31:01 <dougsz> brtknr asked me to give an update about his work on improving InfluxDB performance
15:31:20 <joadavis> I probably should, but don't currently. If anyone is interested in discussing Monasca Events I did think about it a week ago. :P
15:31:25 <witek_> #topic improving InfluxDB performance
15:32:08 <dougsz> The first change is to support scoping dimension queries by time to speed up Grafana dashboard loads: https://review.opendev.org/#/c/670318
15:32:51 <dougsz> The problem we have hit is that InfluxDB cannot accurately return dimension values within a specified time window. It can return them within the shard duration, which can vary between ~days and ~weeks.
15:33:16 <dougsz> The current situation is that you load a DB with a dynamic query to fetch all hostnames for the time series.
15:33:44 <dougsz> With this patch you would load a subset of hostnames within the time window on the dashboard. This is a lot faster.
15:34:01 <dougsz> However, what we actually get are hostnames from outside the time window on the dashboard.
15:34:21 <dougsz> The query is a lot faster, but the API doesn't really do what it says.
15:34:34 <witek_> oops, any idea why?
15:34:44 <joadavis> sorry, I'm trying not to laugh out loud at that
15:34:57 <dougsz> :) It's a fundamental limitation with Influx
15:35:34 <Wasaac> So to confirm, the list of hosts returned will contain all hosts that did post data in the time window
15:35:36 <joadavis> If the return is a superset of what you are looking for, it would be tempting to do some post-processing filtering to remove hostnames you don't want
15:36:00 <dougsz> Wasaac: yes, that's correct
15:36:00 <Wasaac> But will also include those that posted outside the time window, but within the shard window
15:36:29 <dougsz> joadavis: yes, the problem is the dimension values are not returned with timestamps, so no post-processing is possible
15:36:38 <joadavis> yuck
15:37:17 <dougsz> brtknr tried alternative queries, which work, but the performance is worse than just getting all values
15:37:35 <witek_> and right now (without scoping) we get all dimension values which were ever written?
15:38:06 <dougsz> yeah, and in large DBs, that query can take 30mins + and lock up the InfluxDB instance while it runs
15:38:43 <Wasaac> Sounds to me that getting a faster return is worth more than having an accurate list
15:38:50 <Wasaac> Provided that's understood by users
15:38:53 <witek_> Wasaac: +1
15:38:57 <dougsz> That is the question
15:39:36 <witek_> the list will get much shorter and eventually updated after the shard "expires" completely
15:39:46 <witek_> seems to be good enough to me
15:40:05 <dougsz> ok, thanks, we will press on with that change then
15:40:21 <dougsz> If anyone thinks of any comments please add them to: https://review.opendev.org/#/c/670318
15:40:37 <dougsz> Second change to improve performance, is to use an InfluxDB per tenant
15:40:56 <dougsz> So that queries run against smaller datasets
15:41:33 <dougsz> This one seems to work nicely, but brtknr needs to investigate tempest failures
15:41:57 <witek_> so it requires changes both in API and persister?
15:42:19 <dougsz> That's correct, and a migration script to move data to the new layout
15:42:22 <dougsz> https://review.opendev.org/#/q/topic:story/2006331+(status:open+OR+status:merged)
15:42:31 <dougsz> ^ If anyone is interested
15:42:56 <dougsz> That's it from me on these two.
15:43:24 <witek_> that's the first step in implementing scalable InfluxDB setup
15:43:40 <witek_> very nice, thanks
15:44:43 <witek_> do you think automatic partitioning (not only based on tenant) would be possible in future?
15:45:54 <dougsz> Hmm, I suppose it should be with the right adapter
15:46:28 <dougsz> It would be nice to investigate TimescaleDB at that stage since it provides clustering out of the box
15:47:13 <witek_> I'm not sure about their licensing
15:48:06 <witek_> https://www.timescale.com/products
15:48:38 <witek_> some features are limited to enterprise version
15:48:46 <witek_> but I haven't looked in detail
15:49:37 <dougsz> Hmm, yes, automated data retention policies are missing from open-source
15:49:42 <dougsz> there is also https://eng.uber.com/m3/
15:49:56 <witek_> with only Go client
15:50:20 <witek_> which is not a blocker though
15:50:37 <dougsz> Yeah, but good point - it's clearly not a task for this cycle
15:53:09 <witek_> hosanai: do you have any update on monasca-analytics?
15:53:42 <dougsz> 500 million metrics per second, Uber claim, I wonder if anyone actually deployed it outside of Uber
15:54:01 <witek_> :)
15:54:12 <hosanai> witek_: now i'm working on python3.6/3.7 support.
15:54:22 <dougsz> * oops, aggregates
15:55:07 <witek_> hosanai: any idea about the time frame? the community goal has the deadline in 3 days
15:55:55 <hosanai> witek_: do my best :-)
15:56:12 <joadavis> Glad hosanai mentioned py3. A documentation question came up - we should be sure that documentation reflects any python 3 changes to match up with any changes we made in code or cli
15:56:56 <joadavis> likely not much difference, but I hadn't thought of that aspect when I looked at py3 gate tests in the past
15:57:41 <hosanai> thanks for the info.
15:58:51 <witek_> any last comment before I wrap up?
15:59:20 <witek_> thanks for joining today
15:59:42 <witek_> let's make some progress on reviews
15:59:43 <joadavis> thanks all
15:59:47 <hosanai> thx!
15:59:50 <dougsz> thanks all, bye
15:59:51 <witek_> and see you next week
15:59:53 <witek_> bye
15:59:57 <witek_> #endmeeting
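
Editor's note on the dimension scoping discussion (15:32 to 15:39): the behaviour dougsz and Wasaac describe can be pictured with a small toy model. This is not Monasca or InfluxDB code and it makes no InfluxDB calls. The Shard type, the one-week shard length, the host names and all three function names are assumptions made up for illustration. The sketch only shows why a time-scoped lookup that works at shard granularity returns a superset of the hosts that actually wrote in the window, while still being smaller than the full list.

```python
from typing import List, NamedTuple, Set, Tuple


class Shard(NamedTuple):
    """Toy stand-in for one shard: a time range plus the points written to it."""
    start_day: int                 # inclusive
    end_day: int                   # exclusive
    writes: List[Tuple[int, str]]  # (day, hostname) pairs; hypothetical layout


def all_values(shards: List[Shard]) -> Set[str]:
    """Unscoped lookup: every hostname ever written."""
    return {host for shard in shards for _, host in shard.writes}


def shard_scoped_values(shards: List[Shard], start: int, end: int) -> Set[str]:
    """Scoped lookup at shard granularity.

    Every shard that overlaps [start, end) contributes all of its hostnames,
    because in this model values come back without timestamps.
    """
    result: Set[str] = set()
    for shard in shards:
        if shard.start_day < end and start < shard.end_day:
            result.update(host for _, host in shard.writes)
    return result


def exact_values(shards: List[Shard], start: int, end: int) -> Set[str]:
    """What the caller would ideally get: hosts that wrote inside the window."""
    return {
        host
        for shard in shards
        for day, host in shard.writes
        if start <= day < end
    }


# Two assumed one-week shards with made-up hosts.
SHARDS = [
    Shard(0, 7, [(1, "host-a"), (5, "host-b")]),
    Shard(7, 14, [(8, "host-b"), (12, "host-c")]),
]

if __name__ == "__main__":
    # Dashboard window covering days 4 and 5 only.
    print(sorted(all_values(SHARDS)))                 # ['host-a', 'host-b', 'host-c']
    print(sorted(shard_scoped_values(SHARDS, 4, 6)))  # ['host-a', 'host-b']
    print(sorted(exact_values(SHARDS, 4, 6)))         # ['host-b']
```

In this model host-a shows up in the scoped result although it last wrote on day 1, which is the "outside the time window, but within the shard window" case from the log. Since the scoped result carries no timestamps, nothing in it distinguishes host-a from host-b, which matches the point that post-processing is not possible.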