14:04:09 <tobberydberg> #startmeeting publiccloud_wg
14:04:09 <openstack> Meeting started Thu Jul 18 14:04:09 2019 UTC and is due to finish in 60 minutes. The chair is tobberydberg. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:04:11 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:04:14 <openstack> The meeting name has been set to 'publiccloud_wg'
14:04:33 <tobberydberg> Agenda found at https://etherpad.openstack.org/p/publiccloud-wg
14:04:47 <tobberydberg> Simple agenda today, one topic :-)
14:06:00 <witek> hi
14:06:17 <ncastele> Hi
14:06:44 <tobberydberg> welcome folks
14:07:04 <tobberydberg> meeting notes for this topic are located here: https://etherpad.openstack.org/p/publiccloud-sig-billing-implementation-proposal
14:07:25 <tobberydberg> Short recap first maybe ...
14:07:40 <ncastele> yep, needed, long time no see :D
14:07:50 <peschk_l> would be great :)
14:07:51 <tobberydberg> What we said last time was to do a metric / collection method mapping
14:08:34 <tobberydberg> We have identified some metrics that we need to collect, as well as a few different suggestions of collection methods
14:09:46 <tobberydberg> Just created a google spreadsheet for that purpose
14:09:47 <tobberydberg> https://docs.google.com/spreadsheets/d/15HtA15Lrf8UhkPqSTzM4Nan08aEUDQmHUl8XS7T2KO0/edit#gid=0
14:10:00 <tobberydberg> haven't had the time to fill in any data, unfortunately
14:11:30 <tobberydberg> The thought was to complete this mapping so we get a clear overview of which metric can be collected with which method ... this to be able to make some good decisions for implementation
14:12:23 <tobberydberg> (you correct me if I'm out flying here :-) )
14:13:51 <tobberydberg> So, we took one step back to analyze the situation a little bit
14:13:56 <ncastele> so the spreadsheet is here to name all the metrics we want, and to check if we can have them regarding each collector ?
14:14:16 <tobberydberg> yes, that was my thought
14:15:02 <tobberydberg> To find a solution that will cover them all, I'm pretty sure we will end up with more than one collector tool/collection method
14:15:31 <peschk_l> definitely
14:15:36 <tobberydberg> I might be totally off in my thinking though, so you correct me if I'm wrong :-)
14:16:47 <peschk_l> I think an "SQL collection" column may also be needed :) several cloudkitty users did that
14:17:31 <tobberydberg> Feel free to add
14:17:38 <peschk_l> for example, for octavia loadbalancers, they directly scraped the SQL database and pushed metrics into gnocchi
14:18:18 <tobberydberg> Personally, IF it is possible to avoid that and collect in any other way, that is preferable
14:18:53 <peschk_l> of course, adding these metrics to ceilometer or monasca would be way better, but it was the fastest solution for them
14:18:56 <tobberydberg> I'm happy if we can avoid direct db queries ... but maybe that isn't possible
14:19:08 <tobberydberg> understood
14:19:42 <ncastele> +1, scraping the DB is a way to go but not the best one, even if it can be handled by querying slaves to avoid impacting production
14:21:38 <peschk_l> ncastele: agreed. For pure-openstack, I like how ceilometer listens to rabbitMQ notifications
14:21:47 <tobberydberg> Do you all feel this is the way to go initially?
14:21:53 <tobberydberg> the mapping stuff, that is
14:22:23 <peschk_l> tobberydberg: yes
14:22:27 <ncastele> yes
14:22:50 <jferrieu> yes
14:23:05 <tobberydberg> Ok, cool
14:23:36 <ncastele> scraping the DB is the simplest/quickest way to go, but if we can rely on other components (by relying I mean being 100% sure it does the job and we do not risk losing data), then that's a better way to go
14:23:43 <tobberydberg> If we all help out with this mapping we can probably have a good initial view by next meeting
14:24:26 <peschk_l> tobberydberg: I'll add stuff on Monday, some customers sent us very detailed lists of the metrics they wanted to rate
14:24:30 <tobberydberg> reliability is important I would say :-)
14:24:38 <tobberydberg> cool peschk_l
14:25:09 <tobberydberg> I mean, neutron deletes everything from the db after existence ... hard to rely on that :-)
14:26:10 <peschk_l> I believe pretty much every "core" project does this, except for nova and maybe keystone
14:26:30 <tobberydberg> You are probably right about that
14:27:16 <peschk_l> that's a big plus for the notification system, even resources that are only up for a few seconds are taken into account
14:27:31 <tobberydberg> +1
14:28:16 <tobberydberg> What are the next steps after that mapping? Don't want to rush, but maybe good to be able to start thinking about that as well...
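[Editor's note: the point above — that notification-based collection captures even resources that live only a few seconds, which a periodic DB scrape would miss — can be illustrated with a minimal pure-Python sketch. The event names mimic nova's notification topics, but the in-memory store and function names are hypothetical; a real collector would consume oslo.messaging notifications from RabbitMQ.]

```python
from datetime import datetime, timedelta

# Hypothetical in-memory event store standing in for a real notification
# consumer listening on the RabbitMQ notification bus.
events = []

def on_notification(event_type, resource_id, timestamp):
    """Record a lifecycle notification (e.g. 'compute.instance.create.end')."""
    events.append({"type": event_type, "resource_id": resource_id, "ts": timestamp})

def billable_seconds(resource_id):
    """Billable lifetime derived from the create/delete event pair."""
    created = next(e["ts"] for e in events
                   if e["resource_id"] == resource_id and e["type"].endswith("create.end"))
    deleted = next(e["ts"] for e in events
                   if e["resource_id"] == resource_id and e["type"].endswith("delete.end"))
    return (deleted - created).total_seconds()

# A VM that existed for only 42 seconds is still billed, even though it is
# long gone from the service database by the time an invoice is produced.
t0 = datetime(2019, 7, 18, 14, 0, 0)
on_notification("compute.instance.create.end", "vm-1", t0)
on_notification("compute.instance.delete.end", "vm-1", t0 + timedelta(seconds=42))
print(billable_seconds("vm-1"))  # → 42.0
```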
14:29:02 <tobberydberg> collection method/s will be a big key, but also backend storage for this and how to be able to query it in a real-time fashion
14:30:26 <peschk_l> For storage: cloudkitty uses influxDB, and I'm working on an ElasticSearch driver for ck
14:31:25 <peschk_l> from my experience, a flexible backend with support for complex aggregations is important
14:31:34 <tobberydberg> I'll add a section for that in the etherpad
14:31:40 <ncastele> +1
14:32:15 <ncastele> we are running thousands of VMs here, we need the backend to be extensible
14:32:27 <peschk_l> we used SQL or gnocchi in the past, and both caused problems
14:33:39 <peschk_l> and I believe ceilometer used MongoDB as a backend several releases ago, but it had terrible performance
14:34:59 <peschk_l> ncastele: agreed, extensibility is also a key point
14:35:43 <peschk_l> in some european countries, companies may be required to store unaggregated billing data for up to 10 years
14:36:42 <tobberydberg> yea, I think the raw data will be important for some companies as well
14:36:57 <tobberydberg> yea, mongo was not super :-)
14:37:36 <ncastele> raw data in the past can be stored in cold storage, I don't think we need it to be quickly queryable after a couple of months
14:38:06 <peschk_l> +1
14:38:23 <tobberydberg> totally agree
14:38:50 <peschk_l> but even a few months can be a huge amount of data, especially if you want real-time precision
14:38:56 <ncastele> yup
14:40:00 <peschk_l> speaking of that, does anyone have an idea of a data model for that ?
14:41:17 <peschk_l> ck has a collect period which can be configured (1 hour by default), but when this period becomes too small, there is just too much data
14:41:34 <peschk_l> (we're storing one point per resource per collect period)
14:42:08 <tobberydberg> Haven't thought about the details around that
14:42:55 <tobberydberg> BUT .... per-second precision (for resources measured in time) will be a requirement
14:43:01 <peschk_l> gnocchi's model is great for this kind of issue because metadata is stored only once (+ updates), but aggregation queries are not very flexible
14:43:54 <peschk_l> tobberydberg: yes, that's the impression we had at the summit
14:44:06 <tobberydberg> that doesn't mean that data must be stored every second though
14:44:35 <peschk_l> everybody wants to do FaaS, so at least second-precision is required
14:45:01 <tobberydberg> +1
14:45:27 <ncastele> This is why we need to scope the metrics we need, for which purpose, and challenge the collect period and how we store them
14:47:28 <peschk_l> maybe a model similar to gnocchi's, but stored in a document-oriented DB ? it would avoid metadata duplication, and we could store exact creation, update and deletion timestamps
14:47:51 <peschk_l> (or maybe it is too soon to think about this)
14:48:16 <tobberydberg> would be good to avoid duplication of that, agreed, and that sounds reasonable to me
14:48:56 <tobberydberg> hehe, maybe ... but good to have something to think about a little in the back of the head ;-)
14:49:44 <tobberydberg> mnaser might have had some thoughts around this earlier ... you correct me if I'm wrong
14:52:23 <peschk_l> if mnaser's thoughts are still available on eavesdrop or somewhere, I'd love to read them :)
14:52:39 <peschk_l> oh, something else that caused us a few headaches: users WILL want/need to modify existing rating rules
14:52:56 <tobberydberg> All meetings have been recorded ...
14:53:37 <tobberydberg> I might be wrong there though, been some weeks since the first meetings :-)
14:53:54 <peschk_l> the problem is that you can't just compute the final price on-the-fly when an API request is made
14:54:04 <tobberydberg> right ... I'm pretty sure you are right about that
14:54:09 <peschk_l> (all right, I'll try to find them then :) )
14:54:53 <tobberydberg> what do you mean by that peschk_l ?
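[Editor's note: the collect-period trade-off peschk_l describes (one stored point per resource per collect period, so shrinking the period multiplies the data volume) is simple to put numbers on. The 10,000-resource fleet size below is an assumption loosely based on ncastele's "thousands of VMs"; everything else follows from the storage model stated above.]

```python
# cloudkitty's storage model: one point per resource per collect period.
resources = 10_000              # assumed fleet size ("thousands of VMs")
seconds_per_month = 30 * 24 * 3600

def points_per_month(collect_period_s):
    """Stored data points per month for a given collect period (seconds)."""
    return resources * seconds_per_month // collect_period_s

print(points_per_month(3600))   # 1-hour period (ck default): 7,200,000 points/month
print(points_per_month(1))      # 1-second precision: 25,920,000,000 points/month
```

The ratio is linear in the period, which is why per-second precision for FaaS-style billing pushes toward a backend that aggregates well and an archival/cold-storage tier for older raw data.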
14:54:57 <tobberydberg> In what way?
14:56:15 <peschk_l> for example, if a company decides that for internal billing, metric X should be taken into account for the current month
14:56:37 <peschk_l> or that the price for metric Y was not right
14:57:16 <peschk_l> the data collected since the beginning of the month will have to be modified
14:57:26 <tobberydberg> aha, ok ok
14:58:33 <tobberydberg> Time flies... almost up for today
14:58:57 <ncastele> yes
14:59:07 <peschk_l> so there are two possibilities: either the price is calculated on collection, and stored along with the qty (that's what ck does). In that case, you can have exact invoices in a few seconds through the API, but you need to plan for data updates
14:59:11 <ncastele> do we agree we'll have a look at the spreadsheet to fill in the metrics?
14:59:28 <peschk_l> ncastele: +1
14:59:31 <jferrieu> +1
14:59:35 <tobberydberg> Short summary ... if we all can help out identifying the metrics and collection methods, as well as suggestions for backend storage, before the next meeting, that would be super
14:59:42 <ncastele> price should be computed through another brick/layer
14:59:59 <jferrieu> tobberydberg: agreed
15:00:09 <peschk_l> agreed
15:00:40 <tobberydberg> sounds good!
15:01:11 <tobberydberg> Thanks a lot for today folks! Happy we are moving forward in these discussions :-)
15:01:54 <peschk_l> tobberydberg: thanks for organizing :-)
15:01:56 <tobberydberg> Leaving for vacation here tomorrow, but I hope that I will be able to make every other Thursday for this meeting to keep this going
15:02:11 <tobberydberg> See you all in 2 weeks :-)
15:02:21 <jferrieu> see you, thanks again for the meeting
15:02:41 <tobberydberg> #endmeeting
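[Editor's note: the re-rating headache discussed above — price computed and stored at collection time (as cloudkitty does), then a rating rule is changed retroactively — can be sketched as follows. The record layout and rule table are hypothetical, not cloudkitty's actual schema; prices are in integer cents to keep the arithmetic exact.]

```python
# Option 1 from the discussion: rate at collection time and store the
# computed price alongside the quantity. Invoices are then a fast sum,
# but a retroactive rule change forces an update of stored data.
rates = {"instance_hours": 5}   # hypothetical rating rule: cents per unit

def collect(metric, qty):
    """Store a data point with its price computed at collection time."""
    return {"metric": metric, "qty": qty, "price_cents": qty * rates[metric]}

points = [collect("instance_hours", 10), collect("instance_hours", 4)]
print(sum(p["price_cents"] for p in points))  # → 70

# "the price for metric Y was not right": fix the rule, then re-rate
# everything collected since the beginning of the month.
rates["instance_hours"] = 8

for p in points:
    p["price_cents"] = p["qty"] * rates[p["metric"]]

print(sum(p["price_cents"] for p in points))  # → 112
```

The alternative ncastele raises (computing price in a separate layer on top of raw quantities) avoids the bulk update but makes invoice queries recompute prices on the fly, which is what peschk_l warns gets too slow at scale.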