14:00:22 <rhochmuth> #startmeeting monasca
14:00:23 <openstack> Meeting started Wed Aug 16 14:00:22 2017 UTC and is due to finish in 60 minutes. The chair is rhochmuth. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:24 <rhochmuth> 0/
14:00:24 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:26 <openstack> The meeting name has been set to 'monasca'
14:00:27 <rhochmuth> o/
14:01:31 <rhochmuth> is anyone here for the weekly monasca meeting?
14:01:47 <sgrasley1> Yep
14:02:22 <rhochmuth> hi sgrasley1
14:02:51 <rhochmuth> well, it looks like a pretty small group this week
14:02:57 <rhochmuth> just you and i
14:03:15 <Christian_> Hi Roland!
14:03:16 <sgrasley1> Well, a short meeting then.
14:03:30 <rhochmuth> hi Shristian_
14:03:34 <rhochmuth> Christian_
14:03:41 <yavor_OP5> Hi Roland, Yavor here from OP5, thanks for the invite
14:03:47 <rhochmuth> so, i think some folks are out this week on vacation
14:03:51 <rhochmuth> i know witek is out
14:04:03 <rhochmuth> hi yavor_OP5
14:04:16 <rhochmuth> glad you guys made it
14:05:14 <yavor_OP5> trip down memory lane, this... must be 20 yrs since I last used irc ;)
14:05:39 <rhochmuth> lol
14:05:52 <rhochmuth> you'll love it
14:07:04 <rhochmuth> ok, the agenda doesn't have anything in it
14:07:10 <rhochmuth> https://etherpad.openstack.org/p/monasca-team-meeting-agenda
14:08:02 <rhochmuth> i think we can go with an open floor
14:08:13 <rhochmuth> so do folks have any topics to discuss this week?
14:08:55 <rhochmuth> we have a nice backlog on reviews
14:08:58 <rhochmuth> https://review.openstack.org/#/q/status:open+monasca,n,z
14:10:20 <rhochmuth> OP5 folks, have you had a chance to review some of the areas of development and come to any direction?
14:10:32 <rhochmuth> or conclusion
14:10:50 <yavor_OP5> we've started investigating "poor man's clustering" on influxdb on our side...
14:11:24 <rhochmuth> we had a number of strategies that we had considered that would involve some changes to monasca
14:11:52 <rhochmuth> one idea is that if an influxdb node/container goes down
14:12:30 <rhochmuth> when it is brought back on-line it would have to catch up
14:12:42 <yavor_OP5> yup, we thought about that
14:13:12 <rhochmuth> therefore, queries from the monasca-api to that influxdb node would not occur until the consumer offset of the persister for that node was close to the producer offset
14:13:43 <rhochmuth> this assumes that persistent storage of some type is being used, such as EBS in an AWS environment
14:13:57 <rhochmuth> or that the influxdb node is the same node as before with the same disks available
14:14:32 <rhochmuth> if the node is down for a while, then it could take a while to catch up via kafka and the persister
14:14:53 <rhochmuth> so another option is to copy the data from a working node to the new node
14:14:57 <rhochmuth> that is more involved
14:15:19 <rhochmuth> and at some point i think writing data to both nodes would need to stop
14:16:00 <yavor_OP5> would it be OK to assume that the kafka cluster would provide storage needed for catch-up?
14:16:15 <rhochmuth> yes
14:16:25 <yavor_OP5> since it is already a part of the reference architecture
14:16:29 <rhochmuth> you could store data in kafka for several days or a week
14:16:42 <yavor_OP5> otherwise we put constraints on the influx implementation
14:16:51 <yavor_OP5> which may vary from instance to instance
14:17:08 <rhochmuth> the data is uncompressed in kafka, so the amount of storage is possibly the biggest issue
14:17:18 <sgrasley1> What are options for persisting to 2 influxdb instances?
14:17:22 <rhochmuth> you would need to size capacity appropriately to handle your ingest rate
14:18:10 <rhochmuth> sgrasley1: you can replicate influxdb using the open-source code
14:18:16 <yavor_OP5> what do you believe is a reasonable expectation of time to catch up with no loss? 2hrs? 2days? We have to have an idea about some sort of duration
14:18:23 <rhochmuth> or you can use influxdb's proprietary clustering solution
14:19:21 <rhochmuth> i think the catch-up time is related to your amount of downtime, ingest rate of metrics, and the throughput to the influxdb node
14:19:54 <rhochmuth> if your ingest rate is 50K metrics/sec and you can only write 50K metrics/sec to influxdb, you would never catch up
14:20:07 <yavor_OP5> sure..
14:20:27 <rhochmuth> so, you would have to measure that as it is dependent on systems, number of persisters
14:20:57 <yavor_OP5> but I still think that one needs to have a reasonable expectation of how long an influx outage can last without losing data...
14:21:44 <rhochmuth> in a kubernetes environment recovery times can be very quick
14:22:09 <yavor_OP5> to exemplify: from our company view, if we can sustain an influx outage of 6 hrs at a particular ingestion rate (i.e. buffer 6 hrs x ingestion rate) and then catch up, then this is a perfectly acceptable clustering solution
14:22:33 <yavor_OP5> which obviously leads to a question of how you monitor your Monasca instance ;)
14:23:36 <yavor_OP5> rhochmuth: how would you like us to structure our findings?
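[Editor's note] The catch-up reasoning from 14:19:21 to 14:22:09 can be put into rough arithmetic. The sketch below is only an illustration, not Monasca code: it assumes constant rates, and the function names and the 75K metrics/sec write throughput are hypothetical. Only the 50K metrics/sec ingest rate and the 6 hr outage come from the discussion.

```python
import math


def backlog_metrics(downtime_s: float, ingest_rate: float) -> float:
    """Metrics that pile up in Kafka while the influxdb node is down."""
    return downtime_s * ingest_rate


def catch_up_seconds(downtime_s: float, ingest_rate: float, write_rate: float) -> float:
    """Rough time to drain the backlog once the node is back.

    New metrics keep arriving at ingest_rate while the persister writes at
    write_rate, so the backlog only shrinks at (write_rate - ingest_rate).
    Assumes both rates are constant, which real systems will not be.
    """
    headroom = write_rate - ingest_rate
    if headroom <= 0:
        # Same case as in the discussion: writing no faster than the
        # ingest rate means the node never catches up.
        return math.inf
    return backlog_metrics(downtime_s, ingest_rate) / headroom


if __name__ == "__main__":
    six_hours = 6 * 3600
    # Hypothetical write throughput of 75K metrics/sec.
    print(catch_up_seconds(six_hours, 50_000, 75_000) / 3600, "hours to catch up")
    print(catch_up_seconds(six_hours, 50_000, 50_000), "(never catches up)")
```

Under these assumptions the oldest unread message is never older than the outage itself, so Kafka retention has to exceed the longest outage to be survived, plus a safety margin. The write throughput would have to be measured, as it depends on the systems and the number of persisters.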
14:24:13 <rhochmuth> good question
14:24:38 <rhochmuth> i think an overview of the various alternatives, pros/cons of each one
14:24:49 <rhochmuth> the usual place where this ends up in openstack is an etherpad
14:25:16 <rhochmuth> For example, the monasca team meeting agenda is at https://etherpad.openstack.org/p/monasca-team-meeting-agenda
14:25:32 <rhochmuth> but you could easily add an etherpad for influxdb
14:25:46 <rhochmuth> monasca-influxdb
14:25:46 <yavor_OP5> ok, will look into it
14:26:10 <rhochmuth> and, unless you are done earlier than the midcycle, we could review at one of the weekly meetings
14:26:23 <rhochmuth> but covering this at the mid-cycle would be best as that is just a few weeks away
14:26:32 <rhochmuth> witek will be organizing that
14:26:52 <rhochmuth> but, due to his vacation, it will take him another week to cover that
14:28:22 <rhochmuth> so, should we wrap up a little early this week?
14:28:37 <rhochmuth> unless there are other topics/questions to cover
14:28:51 <Christian_> I'm good
14:29:02 <rhochmuth> thx Christian_
14:29:09 <Christian_> Yavor was the active one this time :-)
14:29:41 <rhochmuth> thanks for getting involved in the project, we are looking forward to the involvement
14:29:49 <rhochmuth> and sorry about the small group today
14:30:06 <yavor_OP5> thanks for the update!
14:30:07 <rhochmuth> but it is that time of year when folks are taking vacations
14:30:18 <yavor_OP5> yeah, anything for you?
14:30:31 <yavor_OP5> in terms of vacation that is
14:30:49 <rhochmuth> i have a backlog, but probably in september
14:31:19 <rhochmuth> OK everyone, thanks for the meeting today
14:31:23 <rhochmuth> see you all next week
14:31:29 <rhochmuth> #endmeeting
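
[Editor's note] The idea raised at 14:13:12, holding back monasca-api queries to a recovering influxdb node until its persister's consumer offset is close to the producer offset, could be sketched roughly as below. This is a hypothetical illustration, not existing Monasca behaviour: the function names, the per-partition offset maps, and the `max_lag` threshold are all assumptions, and fetching the offsets from Kafka is deliberately left out.

```python
from typing import Mapping


def total_lag(end_offsets: Mapping[int, int], committed: Mapping[int, int]) -> int:
    """Sum, over partitions, of how far the persister is behind the producer.

    end_offsets: latest produced offset per partition (producer side).
    committed:   last offset the node's persister has consumed per partition.
    A partition with no committed offset is treated as unread from offset 0.
    """
    lag = 0
    for partition, end in end_offsets.items():
        consumed = committed.get(partition, 0)
        lag += max(end - consumed, 0)
    return lag


def node_is_queryable(end_offsets: Mapping[int, int],
                      committed: Mapping[int, int],
                      max_lag: int = 10_000) -> bool:
    """True once the node has (nearly) caught up; max_lag is an arbitrary example."""
    return total_lag(end_offsets, committed) <= max_lag


if __name__ == "__main__":
    produced = {0: 1_000_000, 1: 2_000_000}
    recovering = {0: 400_000, 1: 1_900_000}
    print(total_lag(produced, recovering))          # messages still to be persisted
    print(node_is_queryable(produced, recovering))  # still catching up
```

What counts as "close" is a tuning choice. A threshold in messages is the simplest option; a threshold in seconds of lag would need message timestamps as well.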