#openstack-meeting log

16:01:02 <jd___> #startmeeting
16:01:03 <jd___> #meetingname ceilometer
16:01:03 <openstack> Meeting started Thu Jun  7 16:01:02 2012 UTC.  The chair is jd___. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:01:03 <jd___> #link https://lists.launchpad.net/openstack/msg12851.html
16:01:04 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:01:05 <openstack> The meeting name has been set to 'ceilometer'
16:01:16 <jd___> #chair nijaba dachary
16:01:17 <openstack> Current chairs: dachary jd___ nijaba
16:01:28 <jd___> #topic actions from previous meetings
16:01:49 <jd___> dhellmann: so how was your demo? :)
16:02:00 <dhellmann> :-)
16:02:02 <dhellmann> it went well
16:02:24 <dhellmann> I was able to show messages passing from the agent to the collector and being logged, both from notifications and polling
16:02:51 <dachary> dhellmann: congratulations
16:03:36 <dhellmann> thank you all, again, for all of the work you've put in. I wouldn't be nearly this far if I was doing it all myself.
16:03:42 <nijaba> dhellmann: impressive!
16:04:24 <jd___> great dhellmann :)
16:04:26 <jd___> nijaba: anything to report since you had a couple of action items?
16:05:03 <nijaba> Well, I commented on the bug and I started the thread on configuration handling
16:05:40 <jd___> fair enough :)
16:06:14 <jd___> #topic Storage backend (high availability, SPOF etc.)
16:07:15 <jd___> now it's time to discuss how to store the collected data :)
16:07:39 <dhellmann> our ops team is recommending postgres, in part because of familiarity and in part because they think it will handle the scale
16:07:46 <jd___> dhellmann wrote something cf https://lists.launchpad.net/openstack/msg12884.html
16:07:48 <jd___> #link https://lists.launchpad.net/openstack/msg12884.html
16:07:54 <dhellmann> they have specifically warned me off of mongodb
16:08:12 <dhellmann> but I don't expect everyone to want to use the same tool, which is why I proposed a plugin api
16:08:18 <dhellmann> (thanks, jd___)
16:08:19 <jd___> dhellmann: I agree with your ops, but I'd suggest using sqlalchemy as an abstraction, just because OS already made this choice
16:08:25 <dhellmann> yes, absolutely
16:08:39 <dhellmann> I intend to have an "sql" or "rdbms" plugin that uses sqlalchemy
16:08:45 <jd___> but as you stated, this can be pluggable in our case, as you proved and wrote so we can start with one plugin
16:09:11 <dhellmann> it may be simplest to create a mongo plugin for testing and experimentation, but we wouldn't use it in production at dreamhost
16:09:25 <nijaba> please, do NOT use SQL Alchemy.  That would prevent us from using any noSQL db.  You may want to use it in one of the plugins, but not as the plugin method
16:09:45 <jd___> nijaba: it's not the plugin method
16:09:55 <nijaba> jd___: then that's fine
16:09:59 <jd___> :)
16:10:22 * nijaba thinks that's one of the bad elements remaining in OpenStack at the moment
16:10:23 <jd___> dhellmann: well, if we both want to use SQL, I think it's likely we can work on this plugin first
16:10:27 <dhellmann> right, there would be a single plugin with a name like "rdbms" that uses sqlalchemy to talk to your DB of choice, but the plugin API would be a higher level thing
16:10:40 <dhellmann> agreed, jd___
16:11:05 <nijaba> that sounds good to me
16:11:06 <dhellmann> the plugin will need more methods so the API server can use it to query the database, too, but I haven't given those any thought
16:11:18 <dhellmann> I would expect them to map pretty closely to the queries in the API itself, though
16:11:19 <jd___> #agreed jd and dhellmann to focus on an SQL plugin storage
16:11:35 <nijaba> well, looks like we only have the storage part so far
16:11:44 <jd___> dhellmann: indeed
16:11:58 <nijaba> we need to define the mapping to the API we defined earlier
16:12:03 <dhellmann> jd___, do you know anything about how the other OS components handle database migrations?
16:12:22 <jd___> dhellmann: I know they use 'migrate'
16:12:34 <jd___> I even wrote one or two migrations in the last few months
16:12:49 <dhellmann> right, nijaba. We should at least add a method to retrieve the raw data so we can test getting data in and out
16:13:05 <dhellmann> jd___: oh, good, so you can do that part! :-)
16:13:13 <nijaba> dhellmann: that would sound like a nice first step
16:13:19 <jd___> dhellmann: it will only be needed for version 2! :-P
16:13:32 <dhellmann> well, we have to have something to initialize the database, right?
16:13:50 <jd___> right, but sqlalchemy does that for us AFAIK
16:14:15 <dhellmann> I know it can be used to create the schema, but I think you have to tell it to do that explicitly
16:14:21 <dhellmann> we can take that part of the discussion to the mailing list, though
16:14:36 <jd___> otherwise OS uses this to upgrade between releases: http://code.google.com/p/sqlalchemy-migrate/
16:14:41 <jd___> IIRC
16:14:48 <jd___> dhellmann: sure :)
16:15:01 <nijaba> so the plugin should abstract a migrate function...
16:15:04 <jd___> so everybody agrees on the plugin system proposed by dhellmann ?
16:15:17 <nijaba> it's a good start, I think
16:15:39 <dhellmann> +1
16:15:41 <jd___> nijaba: if it's needed, each plugin handles its migration; i think mongodb migration can be easy to do since you don't have to do anything to add fields ;)
16:15:56 <dachary> +1
16:16:23 <jd___> #agreed use the plugin system proposed by dhellmann at https://lists.launchpad.net/openstack/msg12884.html
16:16:36 <dhellmann> #action dhellmann: submit plugin branch for review and merging
16:16:40 <nijaba> as long as we agree to always go through the abstraction to talk to the db, I think we are fine
16:16:53 <dhellmann> nijaba, agreed
16:17:43 <jd___> sure
16:18:01 <dachary> dhellmann: could you point me to the abstraction related to how the plugin is used to query the database ?
16:18:17 <dhellmann> dachary, there aren't any query methods, yet
16:18:36 <dhellmann> just like there's a method to store a new event, there would be one or more methods to ask for event data
16:18:55 <dhellmann> one would ask for all of the raw events, filtered by account, user, etc -- whatever the API args are
16:19:03 <nijaba> in fact, I think we will have one method per API call type
16:19:09 <dachary> that's non trivial to abstract. Or do you have an abstract model in mind already ?
16:19:10 <dhellmann> yeah, probably
16:19:21 <dhellmann> no, I haven't gotten that far
16:19:25 <nijaba> the API is the abstraction...
16:19:47 <dachary> nijaba: then the database plugin will be in charge of interpreting the API calls. That works for me.
16:20:01 <dhellmann> we may be able to build the API server using fewer plugin API calls (too many APIs…) but they will map closely
16:20:12 <nijaba> agreed
16:20:22 <dhellmann> the API service will do some parameter validation, call the plugin to get the data, then format it for return
16:20:43 <nijaba> sounds like MVC applied to DB...
16:20:51 <dhellmann> something like that :-)
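The split described above (the API service validates parameters, calls the storage plugin, then formats the result) might look roughly like the following. This is only a sketch under assumptions: no query methods existed at the time of the meeting, so `StorageEngine`, `record_event`, `get_raw_events`, `api_list_events`, and the filter names are all hypothetical, and the in-memory engine stands in for a real "rdbms" or mongodb backend.

```python
import abc


class StorageEngine(abc.ABC):
    """Hypothetical storage abstraction; not the actual ceilometer API."""

    @abc.abstractmethod
    def record_event(self, event):
        """Store one metering event (a dict)."""

    @abc.abstractmethod
    def get_raw_events(self, **filters):
        """Return stored events whose fields match all given filters."""


class InMemoryEngine(StorageEngine):
    """Toy engine; an 'rdbms' or 'mongodb' engine would implement the same methods."""

    def __init__(self):
        self._events = []

    def record_event(self, event):
        # Copy so later changes by the caller do not alter stored data.
        self._events.append(dict(event))

    def get_raw_events(self, **filters):
        return [
            dict(e) for e in self._events
            if all(e.get(k) == v for k, v in filters.items())
        ]


# Assumed API arguments, for illustration only.
ALLOWED_FILTERS = {"account", "user", "resource"}


def api_list_events(engine, **params):
    """API-side handler: validate parameters, query the engine, format the result."""
    unknown = set(params) - ALLOWED_FILTERS
    if unknown:
        raise ValueError("unknown filter(s): %s" % ", ".join(sorted(unknown)))
    return {"events": engine.get_raw_events(**params)}
```

Because the API handler only ever talks to `StorageEngine`, swapping the backend should not require touching the API code, which is the point nijaba raises about always going through the abstraction.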
16:21:49 <nijaba> jd___: what do you think?
16:22:07 <jd___> I think like dhellmann :)
16:22:54 <nijaba> jd___: should we capture an action on building a few example API calls to the plugin?
16:22:59 <jd___> we restrict the use of a storage plugin to one?
16:23:26 <nijaba> jd___: one at a time? yes
16:23:34 <jd___> nijaba: not sure it's that useful since we have nothing (no code) related to API for now
16:23:46 <dhellmann> jd___, yes, each collector instance would be using only one storage plugin but you could have multiple clusters writing the data to different databases if you wanted
16:23:46 <nijaba> so maybe we should call it an Engine rather than a plugin
16:23:49 <jd___> yeah I meant one at time at runtime :)
16:23:59 <jd___> nijaba: +1
16:24:20 <nijaba> let's learn from quantum's mistake here ;)
16:24:26 <jd___> lol
16:24:40 * dhellmann shakes head
16:24:53 <dhellmann> "engine" it is
16:24:59 <jd___> #action dhellmann rename plugin to engine for storage backend ex-plugin-now-engine system
16:25:09 <jd___> I hope that's clear
16:25:24 <dhellmann> I will do that before submitting the code for review
16:25:34 <nijaba> thanks dhellmann
16:25:59 <dachary> how do we address SPOF ?
16:26:10 <dhellmann> with the database?
16:26:22 <nijaba> well, that's why I wanted us to be able to support NOSQL dbs
16:26:38 <dhellmann> how does nosql relate to spof?
16:26:48 <dachary> dhellmann: you mean by using a postgresql setup with no spof ?
16:26:51 <dachary> for instance
16:26:53 <jd___> there's HA in SQL too
16:27:24 <nijaba> true, but it is a bit easier to set up multiple concurrent masters with some NoSQL than with postgres
16:27:26 <dhellmann> I'm not an ops expert, but yes, my ops team didn't seem concerned about postgresql as a SPOF so I assume they are planning to cluster it
16:27:53 <dhellmann> nijaba, that can be true
16:28:09 <jd___> good ops :)
16:28:11 <dhellmann> the feedback I was getting was that mongo might fall over if we push too much data in
16:28:24 <dhellmann> that's anecdotal, but I have to trust my ops team, don't I?
16:28:32 <dachary> in practice when you need to follow a method to implement "no SPOF", it usually does not happen.
16:28:46 * jd___ thinks it's a religious war we don't want to get into
16:28:46 <nijaba> dhellmann: that's not the view from our ops here, but used to be until 6 months ago
16:28:59 * nijaba agrees with jd___
16:29:03 <jd___> (flat file anyone?)
16:29:10 <dhellmann> pickle ftw
16:29:19 <dachary> I think the idea of introducing a "no SPOF" in the definition of the database was to make it a default instead of a possibility that needs to be implemented on top.
16:29:48 <jd___> I think it was nijaba wanting to push nosql :)
16:29:53 <dhellmann> do any of the SPOF solutions for databases require application code changes?
16:29:54 <nijaba> dachary: how would you do this?  Write to 2 dbs at once?
16:31:01 <nijaba> my point was: let's not get stuck with an SQL only implementation.  If we have an abstraction layer, then we can let the community play around and come up with solutions
16:31:07 <dhellmann> I don't think we want the app to be responsible for database reliability. All of the "real" solutions I've seen support some sort of clustering
16:31:21 <dhellmann> exactly, we can push that responsibility down into the plugin and not worry about it in the core app
16:31:22 <dachary> I'd use mongodb because it has this concern built in from the start. Otherwise I'd follow Florian Haas's advice regarding HA ;-)
16:32:00 <dhellmann> has anyone done any work to estimate the amount of data they will be generating?
16:32:07 <nijaba> dachary: I am hoping to get some resources soon to work on a mongodb engine...
16:32:49 <dachary> nijaba: great ;-)
16:32:59 <dachary> I'm just stating a concern but I don't see this as a blocker.
16:33:08 <nijaba> dhellmann: I can take the action to build a calculator
16:33:23 <dhellmann> nijaba, excellent, that would be a real help
16:33:40 * nijaba thinks about a google spreadsheet, if that suits everyone
16:33:48 <dhellmann> I have some estimates for the number of VMs we expect to have, but have not had time to do the math on data size, yet
16:33:49 <jd___> np
16:34:02 <jd___> think about Swift too
16:34:32 <nijaba> swift as a source for metering messages, or as a storage engine?
16:34:39 <jd___> I meant source for metering
16:34:45 <nijaba> k
16:36:35 <nijaba> #action nijaba to propose a google spreadsheet calculator to estimate volume of metering message (including nova, swift, cinder, quantum)
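The volume estimate in this action item comes down to simple arithmetic, which could be prototyped before building the spreadsheet. The sketch below is only an assumption about how such a calculator might work: the function names, the per-source tuple layout, and every number in the usage note are hypothetical placeholders, not estimates given in the meeting.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def messages_per_day(resource_count, meters_per_resource, poll_interval_s):
    """Messages emitted per day if every meter reports once per interval."""
    polls_per_day = SECONDS_PER_DAY // poll_interval_s
    return resource_count * meters_per_resource * polls_per_day


def bytes_per_day(sources, avg_message_bytes):
    """Total daily volume across sources.

    `sources` maps a source name (e.g. "nova") to a tuple of
    (resource_count, meters_per_resource, poll_interval_s).
    """
    total_messages = sum(messages_per_day(*spec) for spec in sources.values())
    return total_messages * avg_message_bytes
```

For example, with made-up inputs of 1000 nova instances reporting 5 meters every 60 seconds, `messages_per_day(1000, 5, 60)` gives 7,200,000 messages per day; real figures would have to come from each deployment.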
16:37:02 <jd___> anything else?
16:37:32 <dhellmann> I think that covers everything I had related to storage
16:37:55 <dachary> was there a agree on the fact that the database engine has a function to interpret the API queries in addition to the function to store the data ?
16:38:11 <dachary> a agree => a "dash agree" ;-)
16:38:26 <nijaba> I think there was.
16:38:29 <dhellmann> dachary, I think we agreed there would likely be several methods related to querying
16:38:42 <dhellmann> and that we still need to define them
16:39:07 <dachary> I dont see the dash agree matching this
16:39:21 <dhellmann> oh, we may not have recorded it that way
16:39:39 <dhellmann> I just meant we seemed to come to consensus :-)
16:39:49 <dachary> absolutely, I got that too ;-)
16:39:58 <dhellmann> ok
16:40:26 <jd___> #agreed a database engine has a function to interpret the API queries in addition to the function to store the data
16:40:47 <dachary> jd___: thanks
16:40:49 <dhellmann> #action dhellmann: start mapping API queries to database engine methods
16:41:04 <dhellmann> I'll put together a wiki page with some proposals and we can discuss on the list
16:41:30 <nijaba> dhellmann: if you prime me with a first example, I can take care of the declinations
16:41:47 <dhellmann> nijaba, sounds good
16:43:06 <dhellmann> was there something else on the agenda for today?
16:43:26 <nijaba> Agent configuration mechanism?
16:43:28 <jd___> yep
16:43:32 <jd___> moving on then
16:43:36 <jd___> #topic Agent configuration mechanism
16:43:56 <dhellmann> did we agree on the list that we would use text config files and leave it up to ops to manage them, as with the other components?
16:44:08 <jd___> that's my point of view at least
16:44:22 <nijaba> so, I must say that I would not be happy with this for meter configuration
16:44:37 <nijaba> I am fine with this being used for the global agent config
16:44:38 <dachary> I think there are merits on complementing the configurations mechanisms.
16:45:15 <nijaba> but I think we risk having unusable data captured if not all meters for a given value are set to report the same way, or report at all
16:45:16 <dachary> Configuration engines like puppet or chef have limitations.
16:45:26 <dhellmann> nijaba, I think I understand, but how often do you see the agent/meter configurations changing?
16:45:28 <nijaba> it is a real data consistency problem
16:45:52 <nijaba> dhellmann: as often as marketing will ask for something new ;)
16:46:35 <dhellmann> in that case, wouldn't it make sense to just collect as much data as possible?
16:46:55 <nijaba> that would be the brute force approach
16:47:02 <dhellmann> well, yeah :-)
16:47:12 <dachary> I've worked with puppet recently and synchronisation with nagios : it's not pretty. We would be *much* better off using direct connections with the nagios plugins, if it was possible, instead of going thru the puppet database.
16:48:03 <jd___> I don't even see what could be configurable and go wrong with plugins for now
16:48:10 <nijaba> how hard do you think it is to have the meter configuration stored and retrieved by the agents?
16:48:15 <jd___> except time sync problem but that we won't solve :)
16:48:39 <dhellmann> time sync? isn't that solved by ntp?
16:48:42 <nijaba> jd___: imagine that you want to capture cpu, but for some reason, only half of your hosts get that
16:48:51 <jd___> dhellmann: I hope so :)
16:48:58 <nijaba> dhellmann: I think he meant frequency
16:49:18 <dhellmann> ah
16:49:54 <dhellmann> nijaba, how would that happen? the ops configuration management tool should detect that a config is out of date and fix it, no?
16:50:20 <nijaba> so my proposal is: agents are configured through traditional means, but agents get meter config from the central collector
16:50:44 <nijaba> dhellmann: in theory, yes, but practice has shown this to not always be so true
16:51:06 <dachary> very much in the same way a mysql database has communication with its slave for internal purposes, even when it's configured at install time using puppet
16:51:21 <nijaba> and since this causes a real data consistency issue, this is why I am a bit pushy here
16:51:55 <dachary> dhellmann: I would not trust puppet or chef to handle every use case
16:52:36 <dhellmann> dachary, OK. Well, I trust my ops team to figure out how to make that work, but let's assume we need to have this feature for now and discuss what it might look like.
16:52:43 <nijaba> dhellmann: ask your ops if they trust puppet to set up a drbd cluster...
16:52:46 <dhellmann> The proposed API seemed more complicated than necessary.
16:52:54 <dhellmann> we use chef, but OK :-)
16:53:07 <dachary> nijaba: another good example, yes.
16:53:08 <nijaba> dhellmann: I am very open to changes
16:53:36 <dhellmann> I propose a 2 step system.
16:53:53 <dhellmann> On startup, the agent "checks in" with the collector to retrieve its configuration
16:54:19 <dhellmann> At any other time, when the configuration is changed, the collector sends the new configuration to the agent. The agent discards its existing configuration and replaces it with the new settings.
16:54:33 <dachary> warning : 6 minutes left ;-)
16:54:47 <dhellmann> we probably need to move this discussion back to the list, then
16:54:51 <nijaba> dhellmann: do we use a cast in that case, or directed message (which implies maintaining a list of gents?)
16:55:01 <nijaba> s/gents/agents/
16:55:09 <dhellmann> nijaba, cast (assuming all agents are configured the same way)
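The two-step scheme proposed above (check in on startup, then replace the whole configuration whenever the collector casts a new one) can be sketched as follows. This is an illustration under assumptions, not a design from the meeting: `Bus`, `Collector`, `Agent`, and their methods are hypothetical names, and the in-process `Bus` merely stands in for a real message bus cast, assuming all agents share the same configuration.

```python
class Bus:
    """Stand-in for the message bus; cast() delivers to every subscriber."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def cast(self, message):
        for callback in self._subscribers:
            callback(message)


class Collector:
    """Holds the meter configuration; keeps no list of agents."""

    def __init__(self, bus, meter_config):
        self._bus = bus
        self._meter_config = dict(meter_config)

    def check_in(self):
        # Step 1: an agent asks for the current configuration at startup.
        return dict(self._meter_config)

    def update_config(self, new_config):
        # Step 2: on any change, cast the full new configuration to all agents.
        self._meter_config = dict(new_config)
        self._bus.cast(dict(self._meter_config))


class Agent:
    def __init__(self, bus, collector):
        self.meter_config = collector.check_in()
        bus.subscribe(self.apply_config)

    def apply_config(self, config):
        # Discard the existing configuration entirely; no merging.
        self.meter_config = dict(config)
```

Replacing the configuration wholesale, rather than patching it, keeps every agent on exactly the same settings after each cast, which addresses the consistency concern raised earlier in this topic.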
16:55:30 <dachary> I propose we move to a vote on the principle and move the discussion on the implementation to the list.
16:55:42 <dhellmann> ok
16:55:46 <nijaba> dhellmann: that sounds good.  Do you want me to rework my proposal, or do you want to have a stab at it?
16:55:59 <dhellmann> nijaba, I don't expect to use this feature so maybe you should do it? :-)
16:56:16 <nijaba> dhellmann: k, fair enough
16:56:30 <dhellmann> dachary, we should also discuss/vote on whether this is a Folsom feature or a G feature
16:57:14 <nijaba> dhellmann: time box: features which don't make it for a release are pushed to the next one....
16:57:21 <dachary> dhellmann: is there a blocker for it to be a Folsom if someone works on it ?
16:57:30 <dachary> ah
16:57:33 <dachary> timebox of course ;-)
16:58:06 <dhellmann> I wouldn't block someone else from working on it, but I don't think it should have a high priority given all of the other things we have to do for Folsom
16:58:16 <dachary> makes perfect sense to me
16:58:20 <dhellmann> I would rather see people working on pollsters and notification collection
16:58:21 <nijaba> dhellmann: here you find me in agreement
16:58:54 <dachary> who is in agreement to the proposal that agent are configured through traditional means, but agent get meter config from the central collector ?
16:59:05 * nijaba really hopes to be able to put some effort where his mouth is
16:59:15 <dachary> nijaba: you'll have to ;-)
16:59:18 <nijaba> +1
16:59:19 <dachary> +1
16:59:23 <dhellmann> -1
16:59:32 <jd___> 0
17:00:08 <dachary> #agreed agent are configured through traditional means, but agent get meter config from the central collector ?
17:00:15 <dachary> nijaba: you take action ?
17:00:41 <nijaba> yes, my action is to rework my proposal on the basis of what was proposed by dhellmann
17:00:53 <dachary> nijaba: dash action please ;-)
17:01:19 <nijaba> #action nijaba to rework meter configuration proposal on the basis of discussion
17:01:27 <jd___> end time guys
17:01:38 <dachary> yes
17:01:40 <nijaba> thanks *
17:01:42 <jd___> I can hear dhellmann's stomach
17:01:44 <dachary> thanks
17:01:52 <jd___> #endmeeting