15:01:48 <rhochmuth> #startmeeting monasca
15:01:49 <openstack> Meeting started Wed Nov  9 15:01:48 2016 UTC and is due to finish in 60 minutes.  The chair is rhochmuth. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:01:50 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:01:53 <openstack> The meeting name has been set to 'monasca'
15:01:57 <bklei> o/
15:02:03 <rhochmuth> am i in the right time-zone
15:02:17 <witek> you must know :)
15:02:20 <rbak> o/
15:02:20 <rhochmuth> bklei: you are here, so maybe
15:02:24 <shinya_kwbt> o/
15:02:26 <bklei> i believe so
15:02:34 <rhochmuth> looks like i made it on time
15:02:39 <shinya_kwbt> rhochmuth: You are right
15:02:49 <rhochmuth> we just switched the clocks in the usa
15:02:56 <rhochmuth> daylight savings time
15:03:14 <rhochmuth> maybe trump can get rid of daylight savings time
15:03:36 <rhochmuth> sorry meant "the trump"
15:03:41 <bklei> :)
15:03:58 <rhochmuth> So our agenda is at, https://etherpad.openstack.org/p/monasca-team-meeting-agenda
15:04:11 <rhochmuth> and there is nothing in it
15:04:16 <rhochmuth> for today
15:04:34 <rhochmuth> anyone have any topics to discuss
15:04:45 <rhochmuth> reviews
15:04:47 <rhochmuth> ...
15:05:04 <witek> haad1 asked for grafana-datasource working with stable/newton
15:05:09 <bklei> how's cassandra?
15:05:27 <rhochmuth> so, let's take grafana-datasource first
15:05:41 <rbak> what's not working with newton?
15:05:49 <rhochmuth> at one time grafana-datasource did work on stable/newton
15:06:03 <rhochmuth> see https://review.openstack.org/#/c/395041/
15:06:09 <witek> stable/newton does not have new endpoints
15:06:24 <rhochmuth> so, it looks like grafana datasource was updated to use the new dimensions names/values endpoints
15:06:28 <witek> but older commit should work, I think
15:06:33 <rhochmuth> but that wasn't on newton
15:06:51 <rhochmuth> i don't think i can merge adam's change to newton
15:07:03 <witek> no, I don't think either
15:07:06 <rhochmuth> that is new functionality that wasn't in newton
15:07:22 <rbak> So switching the datasource to the new dimensions api broke newton?
15:07:24 <rhochmuth> we tried to get it in, but missed the window
15:07:30 <witek> we should tag the grafana-datasource though
15:07:34 <rhochmuth> rbak: correct
15:07:52 <rbak> Well we can't exactly roll that back now.  We've got it in production here.
15:08:10 <rhochmuth> no, i don't think we should roll it back
15:08:14 <rbak> Any suggestions on how to handle that?
15:08:37 <rhochmuth> so we can't take adam's review and merge it
15:08:38 <witek> in the perfect world we should have stable/newton for grafana-datasource
15:08:44 <rhochmuth> correct
15:08:56 <witek> but I think it's enough to make a tag
15:09:06 <witek> https://review.openstack.org/#/c/395595/
15:09:34 <rhochmuth> yes, that seems like the best alternative
15:09:47 <rbak> that works for me.
15:09:55 <rbak> this won't be a perfect solution though
15:10:16 <witek> rbak: would you prefer the branch?
15:10:28 <rbak> Grafana is also changing versions, so the datasource version will need to roughly match a version of grafana or some features may be broken
15:10:42 <rhochmuth> that commit hash says "switch templating to use dimension values endpoint"
15:10:45 <rhochmuth> and it is for ocata
15:10:51 <rhochmuth> that doesn't help newton
15:11:14 <rhochmuth> so for newton we would have to do something similar, with the commit tag one commit/review earlier
15:11:28 <rbak> witek: I think the tag idea is fine, but any solution won't be perfect since Grafana isn't using Openstack versions
15:11:44 <rhochmuth> yeah, that too
15:12:33 <rhochmuth> seems like the best we can do is to release the grafana datasource for the newton release
15:12:58 <rhochmuth> would you want to use a 1.0.0 version number for ocata then?
15:13:14 <rhochmuth> should we have 1.0.0 for newton, and 2.0.0 for ocata
15:13:20 <rhochmuth> since they are incompatible
15:13:55 <rbak> I think 1.0.0 for Newton is fine.  That was the first release to include the grafana datasource.
15:14:06 <rhochmuth> right
15:14:23 <rhochmuth> then 2.0.0 for Ocata, since it is a incompatible change
15:14:29 <rbak> works for me
15:14:59 <witek> we still have to tag on master first, right?
15:15:10 <rhochmuth> i don't think so
15:15:45 <rhochmuth> ocata is master right now
15:15:50 <witek> right
15:16:07 <rhochmuth> so, your review, https://review.openstack.org/#/c/395595/, would be modified to be 2.0.0
15:16:12 <witek> and we don't have any other branch
15:16:27 <rhochmuth> then we would create a release for newton, if they let us, that is 1.0.0
15:16:46 <rhochmuth> with a commit tag prior to the one in your current review
15:17:12 <rhochmuth> basically a commit tag that doesn't have the new dimensions and names changes
15:17:12 <witek> my review puts the tag with the code for newton, so it should stay 1.0 in my opinion
15:17:39 <rhochmuth> Release monasca-grafana-datasource 1.0.0 for Ocata
15:17:50 <rhochmuth> it says ocata and is in the ocata directory
15:18:13 <witek> yes, I don't have other place for it at the moment
15:18:39 <rhochmuth> i think it needs to be in the newton directory
15:18:43 <rhochmuth> and say newton
15:18:51 <witek> ok
15:19:14 <witek> I'll move it then
15:19:32 <rhochmuth> well, i think what you want is two reviews
15:19:56 <witek> another one with HEAD, you mean?
15:20:16 <rhochmuth> the current one is valid for ocata, it just needs to be for a 2.0.0 version number
15:20:29 <rhochmuth> then, on the newton release
15:20:35 <rhochmuth> a similar review
15:20:42 <rhochmuth> but it needs to be 1.0.0
15:20:56 <rhochmuth> and it needs to use a commit tag earlier in time
15:20:58 <witek> for Ocata I would take the newer state then
15:21:10 <rhochmuth> at some point that didn't have the names/dimensions changes
15:21:14 <rhochmuth> correct
15:21:23 <rhochmuth> ocata will pull top of master branch
15:21:29 <witek> ok
15:21:45 <witek> also, jenkins complains "no release job specified for openstack/monasca-grafana-datasource, should be one of ['nodejs4-publish-to-npm', 'openstack-server-release-jobs', 'publish-to-pypi', 'puppet-tarball-jobs'] or no release will be published"
15:22:05 <witek> is it OK to add 'nodejs4-publish-to-npm'?
15:22:27 <rhochmuth> i guess
15:22:35 <witek> rbak: ?
15:22:40 <rbak> Does that really make sense though?
15:22:56 <rbak> The only place it needs to be published is the grafana site
15:23:08 <rbak> And installing it is just git cloning it to the right place
15:23:45 <rbak> It won't hurt to specify a release job, I'm just not sure it does any good either.
15:24:29 <witek> Jenkins returns error otherwise
15:25:00 <rbak> Alright, if it's required then add it.
15:25:49 <witek> ok, I think that's all for this topic
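[editor's note] The two release reviews agreed on above would each be a deliverable file in the openstack/releases repo. This is only a sketch of the newton one under that assumption; the commit hash is a placeholder for the commit before the dimension-values endpoint change, and the exact fields may differ from the 2016 layout:

```yaml
# deliverables/newton/monasca-grafana-datasource.yaml (sketch, hash is a placeholder)
launchpad: monasca
releases:
  - version: 1.0.0
    projects:
      - repo: openstack/monasca-grafana-datasource
        hash: <commit before the dimension-values endpoint change>
```

The ocata counterpart would carry version 2.0.0 and point at the head of master.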
15:26:25 <rhochmuth> #topic cassandra
15:26:53 <bklei> just wondering how that effort is going
15:28:18 <rhochmuth> so, we haven't completely given up on cassandra
15:28:33 <rhochmuth> but, it is not looking good
15:28:47 <witek> rhochmuth: do you have some news?
15:28:52 <rhochmuth> you can look at, https://etherpad.openstack.org/p/monasca_cassandra
15:29:27 <bklei> that's a big list of problems
15:29:47 <rhochmuth> yeah, it might be difficult piecing together from all the notes
15:29:56 <rhochmuth> that were starting to get a little haphazard
15:30:34 <rhochmuth> i would start at Issue 6.2
15:30:52 <rhochmuth> but i'll try and summarize
15:31:06 <witek> Languages, Python vs Java and insert rates ?
15:31:14 <rhochmuth> cassandra has a 2B row limit, but the
15:31:18 <rhochmuth> sorry
15:31:21 <rhochmuth> yes witek
15:31:36 <rhochmuth> cassandra has a 2B row limit, but the effective limit is around 100K
15:31:49 <rhochmuth> this means that series will have to be split across multiple partitions
15:32:07 <rhochmuth> which implies that you can't do in-database user-defined functions/aggregations
15:32:31 <rhochmuth> which implies you have to query everything into the API, then do statistics functions, ...
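[editor's note] The consequence rhochmuth describes can be sketched in plain Python: once one series is split across time-bucketed partitions, an aggregation such as an average has to read every bucket back into the API process. All names and the bucket width here are hypothetical, and the dict stands in for the Cassandra table:

```python
from collections import defaultdict

BUCKET_SECONDS = 3600  # hypothetical bucket width chosen to cap rows per partition

def partition_key(metric_id, timestamp):
    """Partition key that splits one series across time buckets."""
    return (metric_id, timestamp - timestamp % BUCKET_SECONDS)

# in-memory stand-in for the measurements table, keyed by partition
table = defaultdict(list)

def insert(metric_id, timestamp, value):
    table[partition_key(metric_id, timestamp)].append((timestamp, value))

def average(metric_id, start, end):
    # the series spans several partitions, so the API must read them
    # all back and compute the statistic client-side
    points = []
    bucket = start - start % BUCKET_SECONDS
    while bucket <= end:
        points.extend(p for p in table[(metric_id, bucket)] if start <= p[0] <= end)
        bucket += BUCKET_SECONDS
    return sum(v for _, v in points) / len(points)

insert("cpu", 0, 1.0)
insert("cpu", 3600, 3.0)
print(average("cpu", 0, 7200))  # → 2.0
```

In a database with no per-partition row pressure, the same average could be a single in-database aggregation; here the fan-out over buckets is unavoidable.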
15:32:51 <rhochmuth> the other main problem is in the areas of secondary indices
15:33:22 <rhochmuth> basically inverted index tables to get a metric ID from a metric name and dimensions
15:33:35 <rhochmuth> you end up creating a lot of partitions
15:33:44 <rhochmuth> which you can't easily search across
15:34:01 <rhochmuth> insert performance was not very good either
15:34:14 <witek> other solutions seem to overcome that problem by indexing in a different DB
15:34:31 <witek> relational or ES
15:35:33 <rhochmuth> that is an option we considered
15:35:44 <rhochmuth> i'm not keen on it, as it adds a lot of extra complexity
15:35:55 <rhochmuth> in addition, the other problems aren't addressed
15:36:22 <rhochmuth> all the stitching together of time-series in the API would be a major problem
15:36:56 <rhochmuth> additionally, for every query, you need to first query the DB with the indices, get the response, then query cassandra for the time series
15:37:10 <rhochmuth> so, there is a bit of latency in the data path
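[editor's note] The two round trips rhochmuth mentions can be sketched with toy in-memory stores; the names, ids, and data are all hypothetical:

```python
# toy inverted index: (metric name, sorted dimensions tuple) -> metric id
index_db = {
    ("cpu.idle", (("host", "host-1"),)): "id-1",
}
# toy time-series store keyed by metric id
series_db = {
    "id-1": [(0, 99.0), (60, 97.5)],
}

def query(name, dimensions, start, end):
    # phase 1: resolve the metric id from the index store
    metric_id = index_db.get((name, tuple(sorted(dimensions.items()))))
    if metric_id is None:
        return []
    # phase 2: a second round trip for the actual measurements,
    # which is where the extra read-path latency comes from
    return [p for p in series_db[metric_id] if start <= p[0] <= end]

print(query("cpu.idle", {"host": "host-1"}, 0, 100))  # → [(0, 99.0), (60, 97.5)]
```

With the index in a separate database (relational or ES, as witek suggests), each of these phases is a network hop to a different service.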
15:38:07 <rhochmuth> we could also store indices in other places too
15:38:43 <rhochmuth> but the show-stopper for me is the limits on row sizes, which leads to all the "client" side processing
15:39:24 <witek> I think KairosDB should have this implemented, I'm not sure about performance though
15:40:07 <rhochmuth> not sure what else to say
15:40:14 <rhochmuth> hpe spent a huge amount of time on this
15:40:29 <rhochmuth> both deklan and i spent about 5 weeks of late nights and weekends looking into this
15:40:40 <rhochmuth> and we had others of the hpe monasca team involved too
15:40:44 <rhochmuth> please, prove us wrong
15:40:53 <bklei> :)
15:41:00 <rhochmuth> i don't see anyone else stepping up to do the grunt work
15:41:23 <rhochmuth> this is after there were a lot of statements that folks would be pitching in
15:41:39 <rhochmuth> as far as i'm concerned kairosdb is built on cassandra
15:41:51 <rhochmuth> if you understand cassandra then there is no way kairosdb could solve the problems
15:41:58 <rhochmuth> but please, do the work
15:42:03 <rhochmuth> set-up a three node cluster
15:42:08 <rhochmuth> create the schemas
15:42:12 <rhochmuth> write the benchmarks
15:42:14 <rhochmuth> do the analysis
15:42:55 <bklei> we at charter are grateful for the work hpe has done on this, for sure
15:43:20 <rhochmuth> in the near future i will be removing all the cassandra stuff from monasca
15:43:41 <bklei> we don't have the resource to pick this up though -- just wanted an update
15:43:43 <rhochmuth> i get commits from folks that have never looked at performance
15:44:31 <bklei> so hpe's strategy is continue to ship w/vertica then?
15:44:39 <rhochmuth> no
15:44:49 <rhochmuth> for the time being we will use vertica
15:44:49 <witek> we did look at performance, not with kairosDB though
15:45:04 <rhochmuth> what did you do
15:45:50 <witek> we tested the complete monasca installation and compared influx and cassandra rates
15:45:53 <rhochmuth> we are looking at alternatives to vertica which include influxdb, ES, and others
15:47:06 <rhochmuth> bklei: the problem for vertica is that it will be going to microfocus
15:47:25 <rhochmuth> so, at some point in time, we expect that we will require a license too
15:47:26 <bklei> aah.  losing your baby.
15:47:59 <rhochmuth> the details aren't completely known, but that is a potential outcome, and probably the more likely
15:50:16 <bklei> well, thx for the update
15:50:25 <rhochmuth> welcome
15:50:34 <rhochmuth> not sure it has been a great update
15:50:41 <rhochmuth> i was really hopeful when starting
15:50:56 <rhochmuth> we actually did hit 150K inserts per second on a single node
15:51:12 <rhochmuth> but then when we went to a 3 node cluster, the performance remained at 150K
15:51:24 <rhochmuth> and inserts per second don't translate into metrics per second
15:51:40 <rhochmuth> some of the schemas require multiple tables, and therefore multiple inserts
15:51:51 <bklei> as an operator, we are more concerned about getting data out, than in.  there's so much buffering anyway on the write path...
15:51:58 <rhochmuth> you can try and be smart about inserts then, but that adds a lot of complexity
15:52:03 <rhochmuth> which i didn't mind
15:52:16 <rhochmuth> because it is only software and presented some challenges
15:52:24 <rhochmuth> but, in the end the performance was not acceptable
15:52:34 <rhochmuth> InfluxDB can hit 300K metrics/sec
15:52:57 <rhochmuth> i don't really want to spend time developing something that is not competitive
15:53:18 <rhochmuth> also, the influxdb team have done their analysis
15:53:23 <rhochmuth> and come to a similar conclusion
15:53:36 <rhochmuth> so, we've validated some of their competitive analysis
15:53:58 <rhochmuth> i pretty much spent 5 weeks trying to prove them wrong!
15:54:06 <rhochmuth> or prove we could do it
15:54:33 <rhochmuth> that doesn't align with a lot of the rest of the cassandra community using it for time-series
15:54:41 <rhochmuth> but, if you dive into the examples that are shown
15:54:51 <rhochmuth> sensors, weather stations, iot, ...
15:55:05 <witek> so you want to go with InfluxDB, what about clustering?
15:55:14 <rhochmuth> they are a lot more specific and they don't have the same problems we do
15:55:26 <rhochmuth> influxdb is an option
15:55:35 <rhochmuth> there are several options
15:56:44 <shinya_kwbt> So MonfluxDB would be forked from a past InfluxDB version.
15:56:45 <rhochmuth> how about we have another meeting on the influxdb question and alternatives
15:57:05 <witek> yes, I think we need it
15:57:57 <rhochmuth> ok, i'll setup something ASAP
15:58:26 <rhochmuth> shinya_kwbt: possibly
15:59:12 <rhochmuth> ok, i need to end the meeting
15:59:19 <rhochmuth> bye everyone
15:59:26 <kamil___> bye
15:59:27 <witek> bye
15:59:49 <rhochmuth> #endmeeting