15:00:34 <jd__> #startmeeting ceilometer
15:00:34 <openstack> Meeting started Thu May  2 15:00:34 2013 UTC.  The chair is jd__. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:35 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:37 <openstack> The meeting name has been set to 'ceilometer'
15:00:49 <jd__> #link https://wiki.openstack.org/wiki/Meetings/MeteringAgenda
15:00:55 <nijaba> o/
15:00:57 <jd__> hi everyone
15:01:00 <jd__> hey nijaba
15:01:05 <n0ano> o/
15:01:05 <flwang> hi jd__
15:01:05 <danspraggins> o/
15:01:09 <thomasm> hello
15:01:11 <jd__> hi flwang
15:01:12 <apmelton> 0/
15:01:12 <sandywalsh> o/
15:01:15 <llu-laptop> o/
15:01:33 <dhellmann> o/
15:01:47 <epende> o/
15:01:50 <dragondm> o/
15:02:07 <jd__> #topic Havana blueprint assignment
15:02:27 <jd__> I've finished organizing blueprints for havana, I hope
15:02:30 <jd__> #link https://blueprints.launchpad.net/ceilometer/havana
15:02:45 <jd__> and we're supposed to have an assignee for each one
15:03:08 <jd__> so if one of them looks attractive to you, go ahead
15:03:12 <sandywalsh> I think we have some overlap between https://blueprints.launchpad.net/ceilometer/+spec/add-event-table and https://blueprints.launchpad.net/ceilometer/+spec/sqlalchemy-metadata-query (as discussed at the summit). I think the same structure can be used for both.
15:03:44 <jd__> sandywalsh: I don't think so, at least for now
15:03:49 <sandywalsh> (on the sql side anyway)
15:03:58 <gordc> anyone allowed to sign up for an unassigned bp?
15:03:59 <jd__> sqlalchemy-metadata-query is going to be implemented on top of what we have *now*
15:04:13 <jd__> gordc: I think so; otherwise tell me and I'll assign it
15:04:31 <sandywalsh> jd__, k
15:04:35 <dhellmann> the metadata query also needs to support samples that don't come from notifications
15:04:42 <llu-laptop> I can take https://blueprints.launchpad.net/ceilometer/+spec/paginate-db-search, though I might need some help from minjie_shen on HBase.
15:04:43 <gordc> cool, I'll have a look at them and see if anything catches my eye. :)
15:04:49 <jd__> I've assigned myself to it because I don't think anyone is going to take it but I'll be happy to give it away
15:04:58 <jd__> llu-laptop: fair enough
15:05:41 <sandywalsh> dhellmann, I see the link from sample metadata -> event as being a weak link anyway (if there is an underlying event, great, otherwise, Null is valid)
15:06:14 <sandywalsh> and, was there a reason all the dependent bp's on https://blueprints.launchpad.net/ceilometer/+spec/stacktach-integration were left unassigned for havana? Is it just because the umbrella BP is sufficient?
15:06:16 <dhellmann> sandywalsh: yep
15:06:41 <sandywalsh> (yet some other sub-bp's for that umbrella bp were "approved")
15:06:52 <jd__> sandywalsh: no, they should be set on havana
15:07:03 <jd__> sandywalsh: but some are from nova, and I don't touch nova bps
15:07:15 <jd__> sandywalsh: if I missed one about Ceilometer, I'd be glad to fix it :)
15:07:18 <sandywalsh> jd__, cool, yes, 1 or 2 were nova/oslo
15:07:33 <jd__> sandywalsh: yeah so you want to ask nova/oslo guys to change that :)
15:07:33 <sandywalsh> jd__, thanks, I'll have a quick peek at them again
15:07:44 <jd__> sandywalsh: otherwise ttx will not be happy! :-)
15:07:49 <sandywalsh> jd__, I think they're all approved, but I'll double check
15:08:17 <jd__> ack
15:09:13 <jd__> well anyway feel free to ask me about blueprints if needed
15:09:20 <jd__> and to take any of them :)
15:09:27 <ttx> I'm always happy.
15:09:57 <jd__> ttx: let me use you as leverage ;-)
15:10:08 <jd__> #topic Chair for next week meeting
15:10:19 <jd__> I think I won't be there for the next meeting
15:10:36 <jd__> so I'd prefer to delegate running the meeting to be sure it happens
15:10:43 <jd__> anyone up for the task?
15:11:19 <sandywalsh> jd__, only one missing that I can see https://blueprints.launchpad.net/ceilometer/+spec/add-event-table
15:11:35 <jd__> sandywalsh: fixing, thanks
15:11:38 <flwang> @jd__, i have a question
15:11:40 <dhellmann> jd__: I should be able to do it
15:12:03 <sandywalsh> ttx should be dipped in bronze and placed at the entrance doors of the next summit :)
15:12:05 <flwang> is it possible to add a new blueprint to havana after it's almost settled down?
15:12:20 <jd__> dhellmann: thanks!
15:12:55 <jd__> flwang: yes, but likely only if you already have something nearly-implemented I'd say
15:13:21 <jd__> #topic Open discussion
15:13:32 <flwang> got it, thanks
15:13:32 <jd__> I'm out of topics, so we can go back to bps or whatever you want
15:13:51 <dhellmann> there was a ML thread on rpc/messaging security that may have an impact on ceilometer
15:13:53 <dhellmann> #link http://lists.openstack.org/pipermail/openstack-dev/2013-April/007916.html
15:14:17 <jd__> I tried to read the wiki page but I got lost
15:14:28 <dhellmann> I haven't been able to read all of it yet, but it was flagged by my "ceilometer" mail filter when mark mentioned us
15:14:49 <dhellmann> if anyone knows about pki & ceilometer, maybe they can chime in?
15:15:45 <dhellmann> also we have close to 20 changes in the queue right now, so we could use some reviews
15:15:46 * nijaba would be happy to followup on this
15:15:49 <jd__> yeah, I'm definitely not an expert on this
15:16:01 <sandywalsh> it's an interesting problem ... we're soon going to be running into resellers that we need to audit for revenue purposes. It's likely we'll be a downstream consumer of Ceilometer billing-related events, so we need to be able to prevent forgeries.
15:16:04 <dhellmann> nijaba: you'd be perfect, since the question of repudiability (sp?) came up
15:16:10 <jd__> yeah, I'm back on reviews
15:16:21 <sandywalsh> dragondm, how's your pki?
15:16:22 <jd__> dhellmann: lol@reputhingy
15:16:45 <nijaba> #action nijaba to follow on message security thread and report next week
15:16:47 <dragondm> I know a bit about it. I'll look through the proposal
15:16:51 <dhellmann> jd__: is there a french word for that? maybe it's easier to spell, even with the extra vowels?
15:17:09 <nijaba> dhellmann: spelling is right :)
15:17:23 * dhellmann will not tempt fate by attempting *that* again
15:17:24 <nijaba> dhellmann: répudiabilité
15:17:32 <jd__> dhellmann: non-répudiation
15:18:05 <jd__> nijaba: I'm not sure répudiabilité exists actually in French?
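    [Illustration of the forgery concern raised above: a minimal sketch of HMAC-signing a metering payload so a downstream consumer can detect tampering. The field names and the shared "metering-secret" are assumptions for the example, not Ceilometer's actual signing code. Note that an HMAC with a shared secret gives integrity but not non-repudiation (anyone holding the secret could have produced the signature), which is why the thread keeps coming back to PKI.]

        import hashlib
        import hmac

        def compute_signature(message, secret):
            """Hex digest over the sorted key/value pairs of the message."""
            digest = hmac.new(secret.encode('utf-8'), digestmod=hashlib.sha256)
            for name, value in sorted(message.items()):
                if name != 'message_signature':
                    digest.update(('%s=%s' % (name, value)).encode('utf-8'))
            return digest.hexdigest()

        def verify_signature(message, secret):
            expected = compute_signature(message, secret)
            return hmac.compare_digest(expected, message.get('message_signature', ''))

        # illustrative metering payload
        sample = {'counter_name': 'cpu', 'counter_volume': 42, 'resource_id': 'instance-1'}
        sample['message_signature'] = compute_signature(sample, 'metering-secret')
        assert verify_signature(sample, 'metering-secret')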
15:18:50 <jd__> dhellmann: the RPC thread seems to be turning out quite well too, right?
15:18:59 <llu-laptop> nova scheduler had a meeting several days ago, http://eavesdrop.openstack.org/meetings/scheduler/2013/scheduler.2013-04-30-15.04.html, they're planning to expand the hostState and add some kind of plugin mechanism to do the polling.
15:19:19 <llu-laptop> Do you guys think we should ask them to publish the hostState so we can collect those polled data?
15:19:22 <dhellmann> jd__: I think we're making progress. I feel like we at least understand the requirements.
15:19:23 <sandywalsh> nijaba, jd__ that's impressive spelling
15:19:45 <dhellmann> llu-laptop: "them"? "hostState"?
15:19:50 <jd__> dhellmann: yeah, I'm kinda thrilled about the URL approach so far
15:19:58 <dhellmann> llu-laptop:  oops, sorry, missed a message
15:20:14 <jd__> llu-laptop: I don't know, what would be in the host state we would care about?
15:20:27 <dhellmann> "collect all the data?"
15:20:35 <flwang> dhellmann, are you talking about the host metrics?
15:20:35 <eglynn> sorry, late arrival
15:20:36 <llu-laptop> one obvious thing is the CPU utilization data
15:20:48 <n0ano> not so much state as usage data, that's what might make sense to send to CM
15:21:05 <sandywalsh> llu-laptop, the compute node should publish this on a fanout queue and let the scheduler / CM consume it as needed (that was my initial design on that mechanism)
15:21:08 <dhellmann> jd__: I'm not sure the url thing is actually going to help us, but we'll see.
15:21:16 <flwang> @dhellmann, if yes, I have discussed that with jd__
15:21:30 <n0ano> sandywalsh, there was a suggestion to use something like pub-sub for just such a thing
15:21:42 <sandywalsh> llu-laptop, otherwise, what's the advantage of getting it from the scheduler vs. the horse's mouth
15:22:06 <sandywalsh> n0ano, that's what the queues and event notifications are for. No need to reinvent the wheel.
15:22:27 <jd__> dhellmann: what would be your main concern about it?
15:22:34 <llu-laptop> sandywalsh: RPC queue is what i'm talking about for 'publish'
15:22:34 <n0ano> sandywalsh, WFM, I don't care how we send the data just that it goes out from the host and others can pick it up
15:22:52 <sandywalsh> there can be downstream pubsub mechanisms to distribute (like AtomHopper or anything)
15:23:03 <jd__> llu-laptop: I agree these data are useful, not sure it's the best place to retrieve them though
15:23:15 <dhellmann> jd__: I missed the first part of the thread, and didn't get the reference. I think we should get that data if we can, although sandywalsh has a point about the source
15:23:31 <sandywalsh> llu-laptop, n0ano agreed
15:24:01 <flwang> dhellmann, we can also get them from the host directly by implementing some new pollsters
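    [As a rough illustration of flwang's suggestion, a host-level pollster could poll the compute node directly and emit a measurement. The counter name, the use of psutil, and the plain-dict sample are assumptions for the sketch; Ceilometer's real pollster plugin interface is not reproduced here.]

        import datetime
        import socket

        import psutil  # assumed to be available on the compute host

        def poll_host_cpu():
            """Yield one CPU-utilisation measurement for this host."""
            yield {
                'name': 'compute.node.cpu.percent',   # illustrative counter name
                'volume': psutil.cpu_percent(interval=1),
                'unit': '%',
                'resource_id': socket.gethostname(),
                'timestamp': datetime.datetime.utcnow().isoformat(),
            }

        for sample in poll_host_cpu():
            print(sample)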
15:24:01 <sandywalsh> now, what would be interesting for us is to see the top-10 Weighted Host options that are determined before a scheduling choice is made
15:24:20 <sandywalsh> that could be some *really* valuable information for making better decisions down the road
15:24:28 <jd__> flwang: yeah, but we kinda don't want more pollsters :]
15:24:32 <flwang> sandywalsh, yep, I have a simple list
15:24:39 <dhellmann> flwang: true, though there's also the question of why collect the same data twice
15:24:57 <sandywalsh> but trying to second guess the scheduler in CM doesn't make sense (since all the weighing and filtering functions are complex and live in the scheduler already)
15:25:04 <n0ano> there was some discussion that data might be obtained from CM since CM is collecting data already but we don't want to make the scheduler dependent upon CM.
15:25:14 <dhellmann> the scheduler folks don't want to depend on our agent (understandably), but maybe we can get them to send messages we can collect
15:25:18 <sandywalsh> n0ano, +1
15:25:27 <flwang> dhellmann, yep, getting them from the MQ is another option
15:25:39 <n0ano> consensus seemed to be that we can potentially utilize CM as an option if it's available.
15:25:58 <n0ano> dhellmann, +1
15:26:06 <sandywalsh> what would be valuable to the scheduler is some idea of what's coming ... are there 100 instances about to be built, etc.
15:26:14 <sandywalsh> CM could be useful for that (with Heat)
15:26:22 <dhellmann> that would mean having them use "messages" and not RPC calls or casts
15:26:38 <n0ano> sandywalsh, ?? - how does CM know that, I thought scheduler would know that before CM does
15:26:56 <eglynn> notifying intent to spin up instances, e.g. via autoscaling?
15:27:05 <dhellmann> heat might know that, but CM doesn't -- we will know an alarm tripped, but not what action that will cause
15:27:08 <eglynn> CM wouldn't know that
15:27:10 <eglynn> yep
15:27:18 <sandywalsh> n0ano, well, heat would know that there are more requests coming I would imagine ("things are heating up and I'm going to need more instances soon", etc)
15:27:21 <n0ano> dhellmann, indeed, the model for this data is more UDP than TCP, dropping a message here and there is acceptable.
15:27:43 <sandywalsh> eglynn, yes, notifying intent
15:27:44 <dhellmann> n0ano: real UDP, or message bus?
15:28:10 <n0ano> dhellmann, just an analogy, an unreliable message system is OK
15:28:16 <sandywalsh> scheduler already listens for fanout messages from the compute nodes. We could publish "intent" on a fanout queue too. Consume if desired.
15:28:21 <dhellmann> n0ano: got it.
15:28:34 <sandywalsh> n0ano, fan out is TTL, which would work well
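    [To make the fanout mechanism sandywalsh describes concrete: a hedged sketch of a compute node publishing usage (or "intent") on a fanout exchange, so the scheduler and CM can each bind their own queue and consume it only if they care. The exchange name, broker URL and payload fields are assumptions, and this uses kombu directly rather than the project's RPC layer. Because each consumer owns its own queue, a dropped or expired message only affects that consumer, which fits the "more UDP than TCP" tolerance mentioned above.]

        import kombu

        connection = kombu.Connection('amqp://guest:guest@localhost//')
        exchange = kombu.Exchange('compute_node_stats', type='fanout')

        with connection as conn:
            producer = conn.Producer(serializer='json')
            producer.publish(
                {'host': 'compute-1', 'cpu_util': 12.5, 'free_ram_mb': 2048},
                exchange=exchange,
                declare=[exchange],  # ensure the fanout exchange exists
                retry=True,
            )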
15:28:42 <eglynn> sandywalsh: so autoscaling in heat would more likely scale up in dribs and drabs, as opposed to one big bang of 100 instances
15:29:03 <eglynn> sandywalsh: e.g. continue to scale up if CPU util alarm hasn't cleared
15:29:12 <n0ano> sandywalsh, a possibility but I would imagine we're looking at a second order effect at best, not something to make major scheduling decisions on
15:29:39 <sandywalsh> eglynn, gotcha ... I'm just spitballing here. Other than "how many scheduling requests do I have in my queue" ... I can't imagine getting much valuable data from the scheduler.
15:30:03 <eglynn> yeah (for each individual autoscaling group I meant, but I guess maybe a big bang would be possible in the aggregate ...)
15:30:17 <sandywalsh> eglynn, yes, indeed
15:30:31 <sandywalsh> and since the scheduler is single threaded currently, the queue size would be valuable
15:30:41 <sandywalsh> but we could get that from amqp directly
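    [A quick sketch of "get that from amqp directly": ask the broker for the depth of the scheduler's queue instead of asking the scheduler. It assumes a RabbitMQ-style broker reachable with pika, and the queue name 'scheduler' is a placeholder.]

        import pika

        connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
        channel = connection.channel()

        # passive=True only inspects the queue; it will not create it.
        result = channel.queue_declare(queue='scheduler', passive=True)
        print('scheduler queue depth: %d messages' % result.method.message_count)

        connection.close()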
15:31:32 <sandywalsh> Heat: "the scheduler is busy, so a request for new instances are taking 10s before scheduling occurs" etc
15:31:47 <sandywalsh> s/are/is
15:32:13 <sandywalsh> again, thinking out loud here :)
15:32:14 <eglynn> yes that could be useful perhaps to feed into Heat's cooldown backoff
15:32:19 * jd__ just understood this discussion after realizing what "horse's mouth" from sandywalsh meant
15:32:27 <sandywalsh> jd__, haha, sorry
15:32:32 <jd__> :-)
15:33:02 <eglynn> i.e. backoff less aggressively if there's an appreciable delay on the build queue
15:33:05 <jd__> and I agree with what has been said :)
15:33:15 <sandywalsh> eglynn, +1
15:33:53 <sandywalsh> eglynn, otherwise what will happen is Heat will see the need for more and keep throwing resource requests at it, but the delay is in the scheduler
15:34:09 <eglynn> sandywalsh: true that
15:34:59 <eglynn> not sure how sophisticated/adaptive the AS cooldown logic is currently in Heat
15:35:20 <eglynn> but it certainly sounds like plausibly useful info to have on hand ...
15:38:25 <jd__> well, closing the meeting in a minute if nobody has anything to add today :)
15:38:38 <nijaba> 45
15:38:49 <nijaba> 30
15:39:02 <jd__> ah I though you wanted to add 45
15:39:04 <jd__> :-)
15:39:12 <jd__> +t
15:39:12 <nijaba> just a countdown
15:39:14 <eglynn> PTL privilege ;)
15:39:20 <dragondm> "Forty-Two!"
15:39:35 <jd__> #endmeeting kthxbye!