14:00:22 <rhochmuth> #startmeeting monasca
14:00:23 <openstack> Meeting started Wed Aug 16 14:00:22 2017 UTC and is due to finish in 60 minutes.  The chair is rhochmuth. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:24 <rhochmuth> 0/
14:00:24 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:26 <openstack> The meeting name has been set to 'monasca'
14:00:27 <rhochmuth> o/
14:01:31 <rhochmuth> is anyone here for the weekly monasca meeting?
14:01:47 <sgrasley1> Yep
14:02:22 <rhochmuth> hi sgrasley1
14:02:51 <rhochmuth> well, it looks like a pretty small group this week
14:02:57 <rhochmuth> just you and i
14:03:15 <Christian_> Hi Roland!
14:03:16 <sgrasley1> Well a short meeting then.
14:03:30 <rhochmuth> hi Christian_
14:03:41 <yavor_OP5> Hi Roland, Yavor here from OP5, thanks for the invite
14:03:47 <rhochmuth> so, i think some folks are out this week on vacation
14:03:51 <rhochmuth> i know witek is out
14:04:03 <rhochmuth> hi yavor_OP5
14:04:16 <rhochmuth> glad you guys made it
14:05:14 <yavor_OP5> trip down memory lane, this... must be 20 yrs since I last used irc ;)
14:05:39 <rhochmuth> lol
14:05:52 <rhochmuth> you'll love it
14:07:04 <rhochmuth> ok, the agenda doesn't have anything in it
14:07:10 <rhochmuth> https://etherpad.openstack.org/p/monasca-team-meeting-agenda
14:08:02 <rhochmuth> i think we can go with an open floor
14:08:13 <rhochmuth> so do folks have any topics to discuss this week?
14:08:55 <rhochmuth> we have a nice backlog on reviews
14:08:58 <rhochmuth> https://review.openstack.org/#/q/status:open+monasca,n,z
14:10:20 <rhochmuth> OP5 folks, have you had a chance to review some of the areas of development and come to any direction?
14:10:32 <rhochmuth> or conclusion
14:10:50 <yavor_OP5> we've started investigating "poor man's clustering" on influxdb on our side...
14:11:24 <rhochmuth> we had a number of strategies that we had considered that would involve some changes to monasca
14:11:52 <rhochmuth> one idea is that if an influxdb node/container goes down
14:12:30 <rhochmuth> when it is brought back on-line it would have to catch-up
14:12:42 <yavor_OP5> yup, we thought about that
14:13:12 <rhochmuth> therefore, queries from the monasca-api to that influxdb node would not occur until the consumer offset of the persister for that node was close to the producer offset
14:13:43 <rhochmuth> this assumes that persister storage of some type is being used, such as EBS in an AWS environment
14:13:57 <rhochmuth> or that the influxdb node is the same node as before with the same disks available
14:14:32 <rhochmuth> if the node is down for a while, then it could take a while to catch up via kafka and the persister
14:14:53 <rhochmuth> so another option is to copy the data from a working node to the new node
14:14:57 <rhochmuth> that is more involved
14:15:19 <rhochmuth> and at some point i think data to both nodes would need to stop
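The gating idea above (hold off monasca-api queries to a recovering influxdb node until its persister's consumer offset is close to the producer offset) can be sketched roughly as follows. This is only an illustration under stated assumptions: `NodeLag`, `queryable_nodes`, and the `max_lag` threshold are hypothetical names, not part of monasca, and the offsets would in practice have to be read from Kafka.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class NodeLag:
    """Offsets for one influxdb node's persister (hypothetical structure)."""
    producer_offset: int  # latest offset written to the metrics topic
    consumer_offset: int  # offset committed by this node's persister

    @property
    def lag(self) -> int:
        # Number of messages the persister still has to replay.
        return max(self.producer_offset - self.consumer_offset, 0)


def queryable_nodes(nodes: Dict[str, NodeLag], max_lag: int) -> List[str]:
    """Return the nodes whose persister lag is small enough to serve queries."""
    return sorted(name for name, node in nodes.items() if node.lag <= max_lag)


# Illustrative values only: node "b" has just come back and is far behind.
nodes = {
    "influxdb-a": NodeLag(producer_offset=1_000_000, consumer_offset=999_950),
    "influxdb-b": NodeLag(producer_offset=1_000_000, consumer_offset=400_000),
}
print(queryable_nodes(nodes, max_lag=1_000))
```

With these made-up numbers only influxdb-a would be queried; influxdb-b rejoins once its lag drops below the chosen threshold.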
14:16:00 <yavor_OP5> would it be OK to assume that the kafka cluster would provide storage needed for catch-up?
14:16:15 <rhochmuth> yes
14:16:25 <yavor_OP5> since it is already a part of the reference architecture
14:16:29 <rhochmuth> you could store data in kafka for several days or a week
14:16:42 <yavor_OP5> otherwise we put constraints on the influx implementation
14:16:51 <yavor_OP5> which may vary from instance to instance
14:17:08 <rhochmuth> the data is uncompressed in kafka, so the amount of storage is possibly the biggest issue
14:17:18 <sgrasley1> What are options for persisting to 2 influxdb instances
14:17:22 <rhochmuth> you would need to size capacity appropriately to handle your ingest rate
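A back-of-the-envelope way to size the Kafka storage mentioned above is ingest rate x message size x retention period x replication factor. The sketch below assumes uncompressed messages, as stated in the discussion; the 200 bytes per metric and the replication factor of 3 are illustrative assumptions, not measured monasca values.

```python
def kafka_storage_bytes(ingest_rate: int, bytes_per_metric: int,
                        retention_seconds: int, replication_factor: int = 1) -> int:
    """Estimate the raw bytes Kafka must hold to retain metrics for the given period.

    ingest_rate is in metrics/sec; bytes_per_metric is the average
    uncompressed message size (an assumption that has to be measured).
    """
    return ingest_rate * bytes_per_metric * retention_seconds * replication_factor


ONE_WEEK = 7 * 24 * 3600  # retention of one week, in seconds

# Assumed inputs: 50K metrics/sec, 200 bytes each, 3 replicas.
total = kafka_storage_bytes(50_000, 200, ONE_WEEK, replication_factor=3)
print(f"{total / 1e12:.1f} TB")
```

Under those assumptions a week of retention comes to roughly 18 TB across the cluster, which is why storage is likely the limiting factor.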
14:18:10 <rhochmuth> sgrasley1: you can replicate influxdb using the open-source code
14:18:16 <yavor_OP5> what do you believe is a reasonable expectation of time to catch up with no loss? 2hrs? 2days? We have to have an idea about some sort of duration
14:18:23 <rhochmuth> or you can use influxdb's proprietary clustering solution
14:19:21 <rhochmuth> i think the catch-up time is related to your amount of downtime, ingest rate of metrics, and the throughput to the influxdb node
14:19:54 <rhochmuth> if your ingest rate is 50K metrics/sec and you can only write 50K metrics/sec to influxdb, you would never catch up
14:20:07 <yavor_OP5> sure..
14:20:27 <rhochmuth> so, you would have to measure that as it is dependent on systems, number of persisters
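The relationship described above can be written as a simple estimate: the backlog built up during the outage is downtime x ingest rate, and it drains at (write throughput - ingest rate), since new metrics keep arriving while the node catches up. A minimal sketch, assuming constant rates; the function name and the 75K metrics/sec write rate are illustrative, and real throughput has to be measured.

```python
import math


def catch_up_seconds(downtime_s: float, ingest_rate: float, write_rate: float) -> float:
    """Estimate how long a node needs to catch up after being down.

    Returns math.inf when the write throughput does not exceed the ingest
    rate, because the backlog then never shrinks.
    """
    if write_rate <= ingest_rate:
        return math.inf
    backlog = downtime_s * ingest_rate          # metrics missed during the outage
    return backlog / (write_rate - ingest_rate)  # spare capacity drains the backlog


# One hour of downtime at 50K metrics/sec, with an assumed 75K metrics/sec write capacity.
print(catch_up_seconds(3600, 50_000, 75_000) / 3600, "hours")
# Write capacity equal to ingest rate: never catches up.
print(catch_up_seconds(3600, 50_000, 50_000))
```

In the first case the node needs about two hours to catch up from a one-hour outage; in the second the estimate is infinite, matching the 50K/50K example in the discussion.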
14:20:57 <yavor_OP5> but I still think that one needs to have a reasonable expectation of how long an influx outage can last without losing data...
14:21:44 <rhochmuth> in a kubernetes environment recovery times can be very quick
14:22:09 <yavor_OP5> to exemplify: from our company's view, if we can sustain a 6 hr influx outage at a given ingestion rate (i.e. buffer 6 hrs x ingestion rate) and then catch up, then this is a perfectly acceptable clustering solution
14:22:33 <yavor_OP5> which obviously leads to a question of how you monitor your Monasca instance ;)
14:23:36 <yavor_OP5> rhochmuth: how would you like us to structure our findings?
14:24:13 <rhochmuth> good question
14:24:38 <rhochmuth> i think an overview of the various alternatives, pros/cons of each one
14:24:49 <rhochmuth> the usual place where this ends-up in openstack is an etherpad
14:25:16 <rhochmuth> For example, the monasca team meeting is at, https://etherpad.openstack.org/p/monasca-team-meeting-agenda
14:25:32 <rhochmuth> but you could easily add an etherpad for influxdb
14:25:46 <rhochmuth> monasca-influxdb
14:25:46 <yavor_OP5> ok, will look into it
14:26:10 <rhochmuth> and, if you are done earlier than the midcycle, we could review at one of the weekly meetings
14:26:23 <rhochmuth> but covering this at the mid-cycle would be best as that is just a few weeks away
14:26:32 <rhochmuth> witek will be organizing that
14:26:52 <rhochmuth> but, due to his vacation, it will take him another week to cover that
14:28:22 <rhochmuth> so, should we wrap-up a little early this week?
14:28:37 <rhochmuth> unless there are other topics/questions to cover
14:28:51 <Christian_> I'm good
14:29:02 <rhochmuth> thx Christian_
14:29:09 <Christian_> Yavor was the active one this time :-)
14:29:41 <rhochmuth> thanks for getting involved in the project, we are looking forward to the involvement
14:29:49 <rhochmuth> and sorry about the small group today
14:30:06 <yavor_OP5> thanks for the update!
14:30:07 <rhochmuth> but it is that time of year when folks are taking vacations
14:30:18 <yavor_OP5> yeah, anything for you?
14:30:31 <yavor_OP5> in terms of vacation that is
14:30:49 <rhochmuth> i have a backlog, but probably in september
14:31:19 <rhochmuth> OK everyone, thanks for the meeting today
14:31:23 <rhochmuth> see you all next week
14:31:29 <rhochmuth> #endmeeting