15:00:34 <jd__> #startmeeting ceilometer
15:00:34 <openstack> Meeting started Thu May 2 15:00:34 2013 UTC. The chair is jd__. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:35 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:37 <openstack> The meeting name has been set to 'ceilometer'
15:00:49 <jd__> #link https://wiki.openstack.org/wiki/Meetings/MeteringAgenda
15:00:55 <nijaba> o/
15:00:57 <jd__> hi everyone
15:01:00 <jd__> hey nijaba
15:01:05 <n0ano> o/
15:01:05 <flwang> hi jd__
15:01:05 <danspraggins> o/
15:01:09 <thomasm> hello
15:01:11 <jd__> hi flwang
15:01:12 <apmelton> 0/
15:01:12 <sandywalsh> o/
15:01:15 <llu-laptop> o/
15:01:33 <dhellmann> o/
15:01:47 <epende> o/
15:01:50 <dragondm> o/
15:02:07 <jd__> #topic Havana blueprint assignment
15:02:27 <jd__> I've finished organizing blueprints for havana, I hope
15:02:30 <jd__> #link https://blueprints.launchpad.net/ceilometer/havana
15:02:45 <jd__> and we're supposed to have an assignee for each one
15:03:08 <jd__> so if you find one of them attractive, go ahead
15:03:12 <sandywalsh> I think we have some overlap between https://blueprints.launchpad.net/ceilometer/+spec/add-event-table and https://blueprints.launchpad.net/ceilometer/+spec/sqlalchemy-metadata-query (as discussed at the summit). I think the same structure can be used for both.
15:03:44 <jd__> sandywalsh: I don't think so, at least for now
15:03:49 <sandywalsh> (on the sql side anyway)
15:03:58 <gordc> anyone allowed to sign up for an unassigned bp?
15:03:59 <jd__> sqlalchemy-metadata-query is going to be implemented on top of what we have *now*
15:04:13 <jd__> gordc: I think so, otherwise tell me and I'll assign
15:04:31 <sandywalsh> jd__, k
15:04:35 <dhellmann> the metadata query also needs to support samples that don't come from notifications
15:04:42 <llu-laptop> I can take https://blueprints.launchpad.net/ceilometer/+spec/paginate-db-search, though I might need some help from minjie_shen on HBase.
15:04:43 <gordc> cool, i'll have a look at them and see if anything catches my eye. :)
15:04:49 <jd__> I've assigned myself to it because I don't think anyone is going to take it but I'll be happy to give it away
15:04:58 <jd__> llu-laptop: fair enough
15:05:41 <sandywalsh> dhellmann, I see the link from sample metadata -> event as being a weak link anyway (if there is an underlying event, great, otherwise, Null is valid)
15:06:14 <sandywalsh> and, was there a reason all the dependent bp's on https://blueprints.launchpad.net/ceilometer/+spec/stacktach-integration were left unassigned for havana? Is it just because the umbrella BP is sufficient?
15:06:16 <dhellmann> sandywalsh: yep
15:06:41 <sandywalsh> (yet some other sub-bp's for that umbrella bp were "approved")
15:06:52 <jd__> sandywalsh: no, they should be set on havana
15:07:03 <jd__> sandywalsh: but some are from nova, and I don't touch nova bps
15:07:15 <jd__> sandywalsh: if I missed one about Ceilometer, I'd be glad to fix it :)
15:07:18 <sandywalsh> jd__, cool, yes, 1 or 2 were nova/oslo
15:07:33 <jd__> sandywalsh: yeah so you want to ask the nova/oslo guys to change that :)
15:07:33 <sandywalsh> jd__, thanks, I'll have a quick peek at them again
15:07:44 <jd__> sandywalsh: otherwise ttx will not be happy! :-)
15:07:49 <sandywalsh> jd__, I think they're all approved, but I'll double check
15:08:17 <jd__> ack
15:09:13 <jd__> well anyway feel free to ask me about blueprints if needed
15:09:20 <jd__> and to take any of them :)
15:09:27 <ttx> I'm always happy.
15:09:57 <jd__> ttx: let me use you as leverage ;-)
15:10:08 <jd__> #topic Chair for next week meeting
15:10:19 <jd__> I think I won't be there for the next meeting
15:10:36 <jd__> so I'd prefer to delegate running the meeting to be sure it happens
15:10:43 <jd__> anyone up for the task?
15:11:19 <sandywalsh> jd__, only one missing that I can see: https://blueprints.launchpad.net/ceilometer/+spec/add-event-table
15:11:35 <jd__> sandywalsh: fixing, thanks
15:11:38 <flwang> @jd__, i have a question
15:11:40 <dhellmann> jd__: I should be able to do it
15:12:03 <sandywalsh> ttx should be dipped in bronze and placed at the entrance doors of the next summit :)
15:12:05 <flwang> is it possible to add a new blueprint into havana after it's almost settled down
15:12:20 <jd__> dhellmann: thanks!
15:12:55 <jd__> flwang: yes, but likely only if you already have something nearly-implemented I'd say
15:13:21 <jd__> #topic Open discussion
15:13:32 <flwang> got it, thanks
15:13:32 <jd__> I'm out of topics so we can go back to bps or whatever you want
15:13:51 <dhellmann> there was a ML thread on rpc/messaging security that may have an impact on ceilometer
15:13:53 <dhellmann> #link http://lists.openstack.org/pipermail/openstack-dev/2013-April/007916.html
15:14:17 <jd__> I tried to read the wiki page but I got lost
15:14:28 <dhellmann> I haven't been able to read all of it yet, but it was flagged by my "ceilometer" mail filter when mark mentioned us
15:14:49 <dhellmann> if anyone knows about pki & ceilometer, maybe they can chime in?
15:15:45 <dhellmann> also we have close to 20 changes in the queue right now, so we could use some reviews
15:15:46 * nijaba would be happy to follow up on this
15:15:49 <jd__> yeah, I'm definitely not an expert on this
15:16:01 <sandywalsh> it's an interesting problem ... we're soon going to be running into resellers that we need to audit for revenue purposes. It's like we'll be a downstream consumer of Ceilometer billing-related events. Need to be able to prevent forgeries.
15:16:04 <dhellmann> nijaba: you'd be perfect, since the question of repudiability (sp?) came up
15:16:10 <jd__> yeah, I'm back on reviews
15:16:21 <sandywalsh> dragondm, how's your pki?
15:16:22 <jd__> dhellmann: lol@reputhingy
15:16:45 <nijaba> #action nijaba to follow up on the message security thread and report next week
15:16:47 <dragondm> I know a bit about it. I'll look through the proposal
15:16:51 <dhellmann> jd__: is there a french word for that? maybe it's easier to spell, even with the extra vowels?
15:17:09 <nijaba> dhellmann: spelling is right :)
15:17:23 * dhellmann will not tempt fate by attempting *that* again
15:17:24 <nijaba> dhellmann: répudiabilité
15:17:32 <jd__> dhellmann: non-répudiation
15:18:05 <jd__> nijaba: I'm not sure répudiabilité actually exists in French?
15:18:50 <jd__> dhellmann: the RPC thread seems to be turning out quite well too, right?
15:18:59 <llu-laptop> nova scheduler had a meeting several days ago, http://eavesdrop.openstack.org/meetings/scheduler/2013/scheduler.2013-04-30-15.04.html, they're planning to expand the hostState and add some kind of plugin mechanism to do the polling.
15:19:19 <llu-laptop> Do you guys think we should ask them to publish the hostState so we can collect that polled data?
15:19:22 <dhellmann> jd__: I think we're making progress. I feel like we at least understand the requirements.
15:19:23 <sandywalsh> nijaba, jd__ that's impressive spelling
15:19:45 <dhellmann> llu-laptop: "them"? "hostState"?
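A minimal sketch of the kind of symmetric signing sandywalsh and nijaba discuss above for keeping billing-related samples forgery-resistant. The field names and the shared "metering_secret" are illustrative assumptions, not ceilometer's actual wire format; note also that a shared-secret HMAC only protects against third-party tampering, while true non-repudiation (dhellmann's point) would need asymmetric signatures, which is what the PKI angle in the thread is about.

    # Illustrative only: field names and the shared "metering_secret" are
    # assumptions for this sketch, not ceilometer's actual message format.
    import hashlib
    import hmac

    def compute_signature(message, secret):
        """HMAC-SHA256 over the sorted message fields, excluding the signature."""
        digest = hmac.new(secret.encode('utf-8'), digestmod=hashlib.sha256)
        for name, value in sorted(message.items()):
            if name == 'message_signature':
                continue  # never sign the signature field itself
            digest.update(('%s=%s' % (name, value)).encode('utf-8'))
        return digest.hexdigest()

    def verify_signature(message, secret):
        """Recompute the signature and compare it to the one on the message."""
        expected = compute_signature(message, secret)
        return hmac.compare_digest(expected, message.get('message_signature', ''))

    sample = {'counter_name': 'cpu', 'counter_volume': 42, 'resource_id': 'inst-0001'}
    sample['message_signature'] = compute_signature(sample, 'metering_secret')
    assert verify_signature(sample, 'metering_secret')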
15:19:50 <jd__> dhellmann: yeah, I'm kinda thrilled about the URL approach so far
15:19:58 <dhellmann> llu-laptop: oops, sorry, missed a message
15:20:14 <jd__> llu-laptop: I don't know, what would be in the host state that we would care about?
15:20:27 <dhellmann> "collect all the data?"
15:20:35 <flwang> dhellmann, are you talking about the host metrics?
15:20:35 <eglynn> sorry, late arrival
15:20:36 <llu-laptop> one obvious thing is the CPU utilization data
15:20:48 <n0ano> not so much state as usage data, that's what might make sense to send to CM
15:21:05 <sandywalsh> llu-laptop, the compute node should publish this on a fanout queue and let the scheduler / CM consume it as needed (that was my initial design on that mechanism)
15:21:08 <dhellmann> jd__: I'm not sure the url thing is actually going to help us, but we'll see.
15:21:16 <flwang> @dhellmann, if yes, I have discussed that with jd__
15:21:30 <n0ano> sandywalsh, there was a suggestion to use something like pub-sub for just such a thing
15:21:42 <sandywalsh> llu-laptop, otherwise, what's the advantage of getting it from the scheduler vs. the horse's mouth?
15:22:06 <sandywalsh> n0ano, that's what the queues and event notifications are for. No need to reinvent the wheel.
15:22:27 <jd__> dhellmann: what would be your main concern about it?
15:22:34 <llu-laptop> sandywalsh: the RPC queue is what I'm talking about for 'publish'
15:22:34 <n0ano> sandywalsh, WFM, I don't care how we send the data just that it goes out from the host and others can pick it up
15:22:52 <sandywalsh> there can be downstream pubsub mechanisms to distribute (like AtomHopper or anything)
15:23:03 <jd__> llu-laptop: I agree these data are useful, not sure it's the best place to retrieve them though
15:23:15 <dhellmann> jd__: I missed the first part of the thread, and didn't get the reference. I think we should get that data if we can, although sandywalsh has a point about the source
15:23:31 <sandywalsh> llu-laptop, n0ano agreed
15:24:01 <flwang> dhellmann, we can also get them from the host directly by implementing some new pollsters
15:24:01 <sandywalsh> now, what would be interesting for us is to see the top-10 Weighted Host options that are determined before a scheduling choice is made
15:24:20 <sandywalsh> that could be some *really* valuable information for making better decisions down the road
15:24:28 <jd__> flwang: yeah, but we kinda don't want more pollsters :]
15:24:32 <flwang> sandywalsh, yep, I have a simple list
15:24:39 <dhellmann> flwang: true, though there's also the question of why collect the same data twice
15:24:57 <sandywalsh> but trying to second-guess the scheduler in CM doesn't make sense (since all the weighing and filtering functions are complex and live in the scheduler already)
15:25:04 <n0ano> there was some discussion that data might be obtained from CM since CM is collecting data already but we don't want to make the scheduler dependent upon CM.
15:25:14 <dhellmann> the scheduler folks don't want to depend on our agent (understandably), but maybe we can get them to send messages we can collect
15:25:18 <sandywalsh> n0ano, +1
15:25:27 <flwang> dhellmann, yep, getting them from the MQ is another option
15:25:39 <n0ano> consensus seemed to be that we can potentially utilize CM as an option if it's available.
15:25:58 <n0ano> dhellmann, +1
15:26:06 <sandywalsh> what would be valuable to the scheduler is some idea of what's coming ... are there 100 instances about to be built, etc.
15:26:14 <sandywalsh> CM could be useful for that (with Heat)
15:26:22 <dhellmann> that would mean having them use "messages" and not RPC calls or casts
15:26:38 <n0ano> sandywalsh, ?? - how does CM know that, I thought the scheduler would know that before CM does
15:26:56 <eglynn> notifying intent to spin up instances, e.g. via autoscaling?
15:27:05 <dhellmann> heat might know that, but CM doesn't -- we will know an alarm tripped, but not what action that will cause
15:27:08 <eglynn> CM wouldn't know that
15:27:10 <eglynn> yep
15:27:18 <sandywalsh> n0ano, well, heat would know that there are more requests coming I would imagine ("things are heating up and I'm going to need more instances soon", etc.)
15:27:21 <n0ano> dhellmann, indeed, the model for this data is more UDP than TCP, dropping a message here and there is acceptable.
15:27:43 <sandywalsh> eglynn, yes, notifying intent
15:27:44 <dhellmann> n0ano: real UDP, or message bus?
15:28:10 <n0ano> dhellmann, just an analogy, an unreliable message system is OK
15:28:16 <sandywalsh> the scheduler already listens for fanout messages from the compute nodes. We could publish "intent" on a fanout queue too. Consume if desired.
15:28:21 <dhellmann> n0ano: got it.
15:28:34 <sandywalsh> n0ano, fanout is TTL, which would work well
15:28:42 <eglynn> sandywalsh: so autoscaling in heat would more likely scale up in dribs and drabs, as opposed to one big bang of 100 instances
15:29:03 <eglynn> sandywalsh: e.g. continue to scale up if CPU util alarm hasn't cleared
15:29:12 <n0ano> sandywalsh, a possibility but I would imagine we're looking at a second-order effect at best, not something to make major scheduling decisions on
15:29:39 <sandywalsh> eglynn, gotcha ... I'm just spitballing here. Other than "how many scheduling requests do I have in my queue" ... I can't imagine getting much valuable data from the scheduler.
15:30:03 <eglynn> yeah (for each individual autoscaling group I meant, but I guess maybe a big bang would be possible in the aggregate ...)
15:30:17 <sandywalsh> eglynn, yes, indeed
15:30:31 <sandywalsh> and since the scheduler is single-threaded currently, the queue size would be valuable
15:30:41 <sandywalsh> but we could get that from amqp directly
15:31:32 <sandywalsh> Heat: "the scheduler is busy, so a request for new instances are taking 10s before scheduling occurs" etc
15:31:47 <sandywalsh> s/are/is
15:32:13 <sandywalsh> again, thinking out loud here :)
15:32:14 <eglynn> yes that could be useful perhaps to feed into Heat's cooldown backoff
15:32:19 * jd__ just understood this discussion after realizing what "horse's mouth" from sandywalsh meant
15:32:27 <sandywalsh> jd__, haha, sorry
15:32:32 <jd__> :-)
15:33:02 <eglynn> i.e. back off less aggressively if there's an appreciable delay on the build queue
15:33:05 <jd__> and I agree with what has been said :)
15:33:15 <sandywalsh> eglynn, +1
15:33:53 <sandywalsh> eglynn, otherwise what will happen is Heat will see the need for more and keep throwing resource requests at it, but the delay is in the scheduler
15:34:09 <eglynn> sandywalsh: true that
15:34:59 <eglynn> not sure how sophisticated/adaptive the AS cooldown logic is currently in Heat
15:35:20 <eglynn> but it certainly sounds like plausibly useful info to have on hand ...
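To make the fanout idea above concrete, a rough sketch of how a consumer on the ceilometer side could pick up per-host utilization data published by the compute nodes on a fanout exchange. The exchange and queue names are made up for illustration, and raw kombu is used only to keep the example self-contained; real code would go through the project's common rpc/notification layer rather than talking to the broker directly.

    # Sketch only: exchange/queue names are assumptions for this example.
    import kombu

    connection = kombu.Connection('amqp://guest:guest@localhost:5672//')
    exchange = kombu.Exchange('compute_utilization_fanout', type='fanout')
    queue = kombu.Queue('ceilometer_utilization', exchange=exchange)

    def on_utilization(body, message):
        # body would carry the per-host usage data, e.g. CPU utilization
        print('host=%s cpu_util=%s' % (body.get('host'), body.get('cpu_util')))
        message.ack()

    with connection.Consumer(queue, callbacks=[on_utilization]):
        while True:
            connection.drain_events()

The scheduler queue depth sandywalsh mentions can similarly be read from the broker itself, e.g. with `rabbitmqctl list_queues name messages` on a RabbitMQ node, without involving the scheduler at all.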
15:38:25 <jd__> well, closing the meeting in a minute if nobody has anything to add today :)
15:38:38 <nijaba> 45
15:38:49 <nijaba> 30
15:39:02 <jd__> ah I though you wanted to add 45
15:39:04 <jd__> :-)
15:39:12 <jd__> +t
15:39:12 <nijaba> just a countdown
15:39:14 <eglynn> PTL privilege ;)
15:39:20 <dragondm> "Forty-Two!"
15:39:35 <jd__> #endmeeting kthxbye!