15:03:44 <jd__> #startmeeting ceilometer
15:03:45 <openstack> Meeting started Thu Jan 23 15:03:44 2014 UTC and is due to finish in 60 minutes. The chair is jd__. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:03:46 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:03:48 <openstack> The meeting name has been set to 'ceilometer'
15:04:08 <sileht> o/
15:04:10 <eglynn> o/
15:04:11 <ildikov_> o/
15:04:13 <llu-laptop> o/
15:04:39 <lsmola_> o/
15:04:40 <nadya_> o/
15:04:45 <scroiset> o/
15:05:27 <jd__> #topic Milestone status icehouse-2 / icehouse-3
15:05:41 <jd__> so icehouse-2 is being released right now
15:05:47 <eglynn> a tad disappointing how things panned out overall with icehouse-2
15:05:52 <tongli> o/
15:05:53 * jd__ nods
15:06:01 <eglynn> ... i.e. chronic problems in the gate caused stuff that was ready to go to be bumped to i-3 :(
15:06:18 <eglynn> ... /me would have preferred if the milestone had been delayed by a few days to allow the verification queue to drain
15:06:25 <nadya_> my fix got merged 2 hours after i-2 was released :(
15:06:42 <eglynn> yep, I had something similar with a BP
15:06:56 <jd__> :(
15:07:08 <ildikov_> my fix also failed on the gate...
15:07:27 <ildikov_> so it is at the end of the waiting list right now
15:07:41 <eglynn> upside: it gives us a head-start on our icehouse-3 slate
15:07:45 <dhellmann> o/
15:07:47 <jd__> we also need to triage icehouse-3
15:08:05 <eglynn> note we've only 5 full person-weeks of dev time remaining before the freeze for icehouse-3
15:08:13 <eglynn> #link https://wiki.openstack.org/wiki/Icehouse_Release_Schedule
15:08:25 <eglynn> ... gotta put my hand up, I'm usually one of the worst offenders for landing stuff late in the cycle
15:08:39 <eglynn> ... however for i-3 let's try to aim for a more even flow-rate on BP patches
15:08:56 <eglynn> (... as there's gonna be an even bigger rush on the gate for i-3)
15:08:59 <jd__> if you have blueprints for icehouse-3, please update them
15:09:19 <eglynn> (... given the amount of stuff that was bumped off i-2, plus the natural backloading onto the 3rd milestone that always happens in every OS release)
15:09:39 <ildikov_> jd__: I have a bp for i-3, which depends on the complex query one
15:10:10 <jd__> hm
15:10:48 <ildikov_> jd__: it would be good to get the complex query discussion settled in the early phase of i-3, as the feature is ready to fly already
15:11:36 <jd__> we'll try to do that indeed
15:11:44 <nadya_> we need to hurry up right now :) a lot of things to discuss
15:11:59 <ildikov_> jd__: tnx
15:12:39 <eglynn> over the final 2-3 weeks of the icehouse-3 lead-in, one possible strategy is to prioritize BPs over bugfixes
15:12:58 <eglynn> (... much easier to get a bugfix landed post i-3 than to beg an FFE on an unfinished BP)
15:13:10 <eglynn> ... justsayin
15:13:19 <llu-laptop> eglynn: +1
15:14:15 <dhellmann> eglynn: prioritizing reviews on anything associated with a bp or bug report between now and then seems like a good idea, too -- encourage people to help with project tracking by filing the right "paperwork"
15:14:21 <jd__> I think we'll rediscuss this next week too anyway
15:14:25 <jd__> just update your bp in the meantime
15:14:31 <jd__> dhellmann: +1
15:14:35 <eglynn> dhellmann: absolutely
15:14:49 <jd__> moving on as we have a lot of topics
15:14:50 <jd__> #topic Tempest integration
15:14:57 <jd__> nadya_: a word on that?
15:15:05 <nadya_> yep!
15:15:07 <nadya_> Our client is approved!
Congrats :)
15:15:31 <nadya_> But we are still facing problems. There is a bug in devstack (and tempest) in the handling of the trailing '/' in the URL. #link https://bugs.launchpad.net/tempest/+bug/1271556
15:15:34 <jd__> clap clap clap
15:15:39 <eglynn> \o/
15:15:47 <dhellmann> nice work!
15:15:57 <ildikov_> nadya_: +1
15:16:01 <nadya_> because of this we have a -1 from Jenkins here #link https://review.openstack.org/#/c/67164/
15:16:30 <nadya_> but we are working on a fix
15:16:40 <jd__> great
15:16:42 <nadya_> Besides, we have two nova-related patches: #link https://review.openstack.org/#/c/46744/ and #link https://review.openstack.org/#/c/64136/ . yassine, do you have any news? :)
15:17:01 <nadya_> if you are here...
15:17:21 <nadya_> We are still working on pollster testing. The Tempest part is started, but we need #link https://review.openstack.org/#/c/66551/ for testing. JFYI
15:18:06 <nadya_> and the last point is alarm testing. It got lost: #link https://review.openstack.org/#/c/39237/ . Nick, nsaje, are you here?
15:18:37 <jd__> I'd suggest stealing the patch if it's not restored
15:18:48 <eglynn> +1
15:19:11 <jd__> otherwise that looks really good, thanks for taking care of it nadya_
15:19:14 <nadya_> yep, actually Vadim has started to fix the alarm patch
15:19:27 <eglynn> nadya_: great! thanks
15:19:45 <nadya_> So I think that's all from my side on this topic
15:19:54 <nadya_> ur welcome :)
15:19:56 <jd__> great :)
15:19:58 <jd__> #topic Release python-ceilometerclient?
15:20:07 <eglynn> I need this patch to land asap: https://review.openstack.org/68637
15:20:20 <jd__> I'll review
15:20:24 <eglynn> a super-simple fix
15:20:35 <eglynn> but the alarm-threshold-create verb is pretty much borked without it
15:20:43 <eglynn> (the legacy alarm-create verb works fine tho')
15:20:47 <ildikov_> eglynn: it looks good on the gate now
15:21:00 <eglynn> ildikov_: cool, thanks
15:21:26 <jd__> #topic Discussion of the resource loader support patch
15:21:27 <eglynn> once that's landed I'll cut a new client
15:21:36 <jd__> eglynn: ack :)
15:21:38 <jd__> #link http://lists.openstack.org/pipermail/openstack-dev/2014-January/024837.html
15:21:52 <jd__> I still haven't managed to read that thread
15:22:06 <llu-laptop> jd__: what's your concern about the 'not generic enough'?
15:22:39 <jd__> my main concern is that we externalized resources into a separate file with a separate module to handle it
15:22:48 <jd__> whereas I see no reason not to have it in the pipeline definition
15:23:09 <jd__> and problems like caching are not to be solved at that level
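For concreteness, jd__'s suggestion amounts to something like the following: the resource list declared inline in pipeline.yaml, rather than in a separate file with its own loader module. This is a minimal sketch assuming the Icehouse-era flat pipeline layout; the `resources` key and the SNMP endpoints are illustrative, not a merged syntax.

```yaml
# Sketch only: resource list inline in the pipeline definition
# (the "resources" key and endpoint values here are illustrative).
-
    name: snmp_pipeline
    interval: 600
    meters:
        - "hardware.cpu.load.*"
    # explicit endpoints for this pipeline's pollsters; if the key is
    # omitted, discovery would be left to the pollster itself
    resources:
        - snmp://10.0.0.11
        - snmp://10.0.0.12
    transformers:
    publishers:
        - rpc://
```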
15:24:12 <lsmola_> jd__: hmm, what about resources that we want to have automatically retrieved?
15:24:29 <llu-laptop> i just want to give the admin a way to get resource endpoints without restarting the agent
15:24:35 <jd__> lsmola_: could you elaborate?
15:24:37 <lsmola_> jd__: for example for tripleo, we will ask nova to give us a list of IPs
15:25:00 <lsmola_> jd__: should we implement that as part of inspector logic that can be turned on?
15:25:02 <llu-laptop> lsmola_: i think we can have a restful resource loader then
15:25:14 <jd__> llu-laptop: reloading the file is a different problem, don't try to solve two problems at the same time
15:25:19 <lsmola_> jd__: or should this reside somewhere apart from ceilometer, as a plugin?
15:25:30 <jd__> llu-laptop: if we want automatic reloading of config files, we'll do that in a generic way for all files, for example
15:26:09 <jd__> lsmola_: that would be a sort of extension for getting the resource list, we don't have that yet, but we could build something around that ultimately
15:26:36 <lsmola_> jd__: ok, rbrady will probably do it once the patches are in
15:26:57 <lsmola_> jd__: I just wanted to make sure that it makes sense to have it in the ceilometer codebase
15:27:04 <jd__> lsmola_: basically we already list resources when we poll for things like instances (we list instances)
15:27:16 <lsmola_> jd__: given that you can use it only when you deploy openstack via tripleo
15:27:54 <jd__> llu-laptop: does that help you?
15:28:04 <llu-laptop> jd__: so your suggestion is to drop the resource loader idea, and leave it to the pollster or inspector themselves?
15:28:13 <lsmola_> jd__: ok then, we will send the patch, thank you
15:28:28 <jd__> llu-laptop: no, have them be part of the pipeline definition
15:29:00 <llu-laptop> jd__: add another pipeline definition 'resourceloader'?
15:29:52 <jd__> llu-laptop: "resources" for the resource list you want, and if you don't know the resource list, then it's up to the pollster to be able to build one I guess
15:30:23 <jd__> if there can be different types of resource list for a pollster, yeah, having a resourceloader parameter for a pipeline would make sense
15:30:47 <jd__> so far we have an implicit resourceloader for all pollsters
15:31:20 <llu-laptop> jd__: ok, this is just what I mean by saying 'leave it to the pollster'
15:31:35 <jd__> llu-laptop: understood :)
15:31:43 <eglynn> e.g. a resource loader for the compute agent that polls nova-api?
15:31:52 <jd__> llu-laptop: I'm ok with 'leave it to the pollster' if there is no corner case, at least for now maybe
15:32:00 <eglynn> (... to discover the local virts)
15:32:13 <jd__> eglynn: yeah, we have that already, but it's used implicitly anyway
15:32:23 <llu-laptop> eglynn: I think the compute agent pollsters already do that, don't they?
15:32:49 <jd__> I don't know if we need to make it explicit in the pipeline – I don't know if there are cases where it might be useful to be able to change it
15:32:50 <eglynn> yep, but just wondering, for consistency, would that be moved into the new resourceloader abstraction?
15:32:58 <jd__> eglynn: it should, if we go down that road
15:33:04 <eglynn> cool
15:33:24 <llu-laptop> so currently, we don't see any imediate need for a resource loader
15:33:25 <llu-laptop> ?
15:33:43 <jd__> llu-laptop: I don't, but I am not the sacred holder of All The Use-Cases
15:33:44 <llu-laptop> s/imediate/immediate/
15:33:53 <jd__> so if you see use-cases, go ahead
15:34:11 <dhellmann> I may be missing some context, but I think this is part of what the "cache" argument passed to the pollsters was supposed to help with.
15:34:21 <eglynn> on a side-note ... I was a little stumped also by the concept that the baremetal hosts had no user or project ID being metered
15:34:22 <jd__> OTOH let's not implement YACO (Yet Another Config Option) just for the sake of having one
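To make 'leave it to the pollster' concrete, here is a rough sketch of the pattern dhellmann alludes to: a pollster that discovers its own resources when the pipeline supplies none, sharing the discovered list through the per-cycle cache dict so sibling pollsters don't repeat the lookup. The class, cache key, and get_samples signature are only loosely modelled on ceilometer's pollster plugin interface of the time; treat all names as illustrative, not authoritative.

```python
# Illustrative sketch only -- not ceilometer's actual pollster API.
class SNMPHostPollster(object):

    CACHE_KEY = 'snmp.hosts'  # hypothetical per-cycle cache key

    def _discover_hosts(self):
        # e.g. ask nova for the list of IPs, as lsmola_ describes for tripleo
        return ['10.0.0.11', '10.0.0.12']

    def _poll_one(self, host):
        # placeholder: a real pollster would build a ceilometer Sample here
        return {'resource': host, 'meter': 'hardware.cpu.load.1min'}

    def get_samples(self, manager, cache, resources=None):
        # an explicit resource list from the pipeline wins; otherwise
        # discover one per polling cycle and share it via the cache dict
        if not resources:
            if self.CACHE_KEY not in cache:
                cache[self.CACHE_KEY] = self._discover_hosts()
            resources = cache[self.CACHE_KEY]
        for host in resources:
            yield self._poll_one(host)
```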
15:34:46 <eglynn> lsmola_: does tripleo and/or ironic surface the "identity" associated with baremetal hosts?
15:34:50 <jd__> dhellmann: the question is about what's put inside the cache and by whom :)
15:34:51 <eglynn> (... even if it's always an administrative user/tenant)
15:34:52 <dhellmann> if there was a set of pollsters that needed some data (the resources), a base class could load them and store them in the cache on each iteration
15:34:59 <dhellmann> jd__: ah
15:35:18 <lsmola_> eglynn: I believe an IP address is enough for SNMP, right?
15:35:22 <jd__> dhellmann: i.e. the list of resources is cached, but the question is how you get that list of resources in the first place (that's why we talk about a resourceloader)
15:35:23 <llu-laptop> eglynn: this is not for baremetal only
15:35:30 <lsmola_> eglynn: that should be stored in the Undercloud nova
15:35:50 <dhellmann> jd__: ok, got it
15:35:52 <eglynn> llu-laptop: sure, but at least *some* of the hosts would generally be baremetal, right?
15:36:37 <eglynn> ... /me just thinking in terms of capturing "owner" identity where it makes sense
15:37:12 <llu-laptop> eglynn: yes, but from the snmp point of view there is no way to know the undercloud project-id, because it doesn't require the undercloud in order to work
15:37:26 <eglynn> llu-laptop: a-ha, I see ...
15:37:37 * jd__ stares at 'undercloud'
15:37:38 <lsmola_> eglynn: not sure what the 'owner' of baremetal is :-)
15:38:19 <eglynn> lsmola_: the user that "registered" the host, if such a concept even exists (?)
15:38:21 <lsmola_> eglynn: I believe we don't work with projects/tenants in the Undercloud
15:38:30 <eglynn> lsmola_: k
15:38:32 <lsmola_> eglynn: or users
15:38:37 <jd__> shall we move on, gentlemen?
15:38:43 <eglynn> sure
15:38:44 <lsmola_> eglynn: it might appear some day
15:38:48 <jd__> we've got an overcloud topic next
15:39:02 <jd__> #topic Handle Heat notifications: new meters?, targeted to I3?
15:39:13 <jd__> scroiset: around?
15:39:18 <scroiset> it's me, yeah
15:39:19 <jd__> #link https://blueprints.launchpad.net/ceilometer/+spec/handle-heat-notifications
15:39:32 <scroiset> I would like to share, and agree with you on, the resulting meters we will generate from heat notifications
15:39:34 <jd__> please enlighten us about your evil plan
15:39:44 <scroiset> I propose those described in the BP
15:40:17 <scroiset> It's to be able to bill on stack CRUD, firstly
15:40:18 <jd__> what's in the whiteboard looks more like notifications than samples
15:40:33 <jd__> but we can map them to samples for sure
15:41:07 <scroiset> notifications are described here #link https://wiki.openstack.org/wiki/SystemUsageData#orchestration.stack..7Bcreate.2Cupdate.2Cdelete.2Csuspend.2Cresume.7D..7Bstart.2Cerror.2Cend.7D:
15:41:14 <eglynn> the autoscaling aspect struck me as being a bit circular
15:41:22 <scroiset> the samples proposed differ from the notifications
15:41:32 <jd__> eglynn: you want to autoscale on the autoscaling meters?
15:41:53 <jd__> also known as übercloud
15:41:54 <scroiset> jd__: no
15:42:19 <eglynn> well, would the flow be something like ... ceilo compute stats -> fire autoscale alarm -> heat launches instance -> heat notification of scale-up -> more ceilo samples
15:42:20 <scroiset> 1/ I want to be able to bill on stack CRUD
15:42:38 <scroiset> 2/ I want to be notified when an autoscaling is done
15:42:58 <jd__> scroiset: you may want to bill on the number of stacks too, I think that one's missing
15:43:00 <scroiset> the bp is for 1/
15:43:22 <jd__> otherwise I see no objection
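A sketch of the mapping scroiset is proposing: turning the orchestration.stack.{create,update,delete,suspend,resume}.{start,error,end} notifications from the SystemUsageData page into billable samples. The handler shape loosely follows ceilometer's notification plugins; the meter names and payload fields here are assumptions, not the merged implementation.

```python
# Illustrative only -- loosely modelled on ceilometer's notification
# handlers; meter names and payload fields are assumptions.
STACK_OPS = ('create', 'update', 'delete', 'suspend', 'resume')

class StackCRUDHandler(object):
    # one wildcard per CRUD verb, matching
    # orchestration.stack.{create,...}.{start,error,end}
    event_types = ['orchestration.stack.%s.*' % op for op in STACK_OPS]

    def process_notification(self, message):
        # e.g. event_type == 'orchestration.stack.create.end'
        _, _, op, phase = message['event_type'].split('.')
        if phase != 'end':
            return  # bill only on completed operations
        yield {
            'name': 'stack.%s' % op,   # assumed meter name, e.g. stack.create
            'type': 'delta',
            'unit': 'stack',
            'volume': 1,
            'resource_id': message['payload']['stack_identity'],
            'project_id': message['payload']['tenant_id'],
        }
```

Counting the stack.create samples over a period would then give the "number of stacks" billing jd__ mentions.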
15:43:28 <eglynn> ... yeah, now that I've written a flow down, it doesn't seem that unnatural
15:43:34 <scroiset> jd__: yes, I can do it by counting the stack.create samples
15:43:51 <llu-laptop> eglynn: I don't see why the last 'more ceilo samples' would definitely trigger another 'fire autoscale alarm'
15:44:10 <eglynn> llu-laptop: yep, it wouldn't
15:44:29 <scroiset> llu-laptop: no, it wouldn't indeed
15:44:33 <eglynn> (different meter in general, of course)
15:45:01 <eglynn> ... just me thinking aloud, ignore & carry on :)
15:45:53 <scroiset> ... so, for the new meters/samples, do you see the need?
15:46:56 <scroiset> ... for billing purposes only.
15:47:15 <scroiset> my point 2/ is another BP #link https://blueprints.launchpad.net/ceilometer/+spec/alarm-on-notification
15:48:51 <scroiset> I'm feeling alone here, I'm surely not being clear... am I?
15:49:05 <nadya_> we are here :)
15:49:07 <eglynn> well, that other BP is intended to allow alarming on notifications (as opposed to allowing an operator to bill on the number of notifications over a period)
15:49:20 <tongli> @scroiset, I am working on that now.
15:49:37 <scroiset> tongli: I saw you're the owner
15:49:40 <tongli> @scroiset, not exactly sure what your concern is.
15:50:01 <tongli> @scroiset, planning to submit the patch later today or tomorrow.
15:50:04 <jd__> re
15:50:05 <eglynn> (presuming that you want to be able to bill on the number of autoscale events, say, not to generate an alarm on a single autoscale notification being received)
15:50:05 <jd__> sorry, I got kicked out by a local BOFH
15:50:18 <scroiset> tongli: I would like to create an alarm on the event orchestration.autoscaling.end, to be alerted when it occurs
15:50:38 <jd__> yeah, tongli is working on that
15:50:42 <scroiset> tongli: cool
15:50:50 <jd__> let's circle back to that at the end if we have time
15:50:58 <jd__> I think scroiset's concerns are covered now
15:51:08 <tongli> @scroiset, yeah, you will be able to do that when the patch gets merged, I've been working with jd__ and eglynn on it.
15:51:08 <jd__> #topic Should I proceed with aggregation?
15:51:18 <jd__> #link https://blueprints.launchpad.net/ceilometer/+spec/aggregation-and-rolling-up
15:51:27 <eglynn> I left some comments on https://etherpad.openstack.org/p/ceilometer-aggregation
15:51:36 <nadya_> So guys, I created a bp and have started implementation
15:51:44 <jd__> I'm not really opinionated yet on that one
15:51:46 <eglynn> IIUC only stats queries that have periods that actually line up with wall-clock boundaries will benefit from the pre-aggregation
15:51:59 <eglynn> nadya_: is that correct? ... or a gross simplification?
15:52:42 <eglynn> (... in practice, I'm not sure these wallclock-clamped queries will be the majority)
15:53:03 <jaypipes> eglynn: as opposed to what exactly?
15:53:04 <eglynn> as I think alarming, charting applications etc. would tend to use NOW as their baseline for queries
15:53:08 <nadya_> there should be a mechanism for merging old data and online data
15:53:13 <eglynn> ... not NOW-(minutes past the hour)
15:53:29 * dhellmann apologizes for having to leave early
15:53:55 <eglynn> jaypipes: say a stats query with a period of one hour, but with start and end timestamps decoupled from an hour boundary
15:54:07 <nadya_> NOW is not a problem. You may use the cache for 10 hours before NOW and get other data from the db directly
15:54:09 <eglynn> jaypipes: ... I put a worked example in nadya_'s etherpad
15:54:15 <jaypipes> eglynn: ah, yes, agree completely.
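To pin down the alignment concern (the same shape as the worked example eglynn mentions putting in the etherpad): hourly pre-aggregates can only serve a stats query whose window edges fall exactly on the hour. A minimal illustration; the function name is hypothetical.

```python
# Illustrative: hourly pre-aggregates only help when the query window
# lines up with wall-clock hour boundaries.
from datetime import datetime

def _on_the_hour(t):
    return (t.minute, t.second, t.microsecond) == (0, 0, 0)

def servable_from_hourly_cache(start, end):
    # a [start, end) stats window can be answered purely from hourly
    # buckets only if both edges sit on wall-clock hour boundaries
    return _on_the_hour(start) and _on_the_hour(end)

# query window 10:07-11:06 vs. cached buckets 10:00-10:59, 11:00-11:59
print(servable_from_hourly_cache(datetime(2014, 1, 23, 10, 7),
                                 datetime(2014, 1, 23, 11, 6)))  # False: total cache miss
print(servable_from_hourly_cache(datetime(2014, 1, 23, 10, 0),
                                 datetime(2014, 1, 23, 11, 0)))  # True: served from cache
```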
15:54:41 <jaypipes> eglynn: I'm still not sold on the idea that the aggregate table has value over just a simple caching layer for the statistics table.
15:55:50 <eglynn> nadya_: but if the query periods are 10:07-11:06, 11:07-12:06, ... and the cache periods are 10:00-10:59, 11:00-11:59, ... then it's a total cache-miss, or?
15:56:32 <nadya_> eglynn, we just do not have a cache for this
15:57:21 <nadya_> eglynn, if I create half-hour aggregates it will work as well
15:57:43 <nadya_> it may be configurable. I think an hour is ok for now
15:57:43 <eglynn> nadya_: yep, so I'm wondering, if such queries are not in the majority, would the cache give that much benefit?
15:58:30 <nadya_> eglynn, the hour-cache is for long queries by definition
15:58:31 <jaypipes> eglynn: the only time I can see those queries being in the majority is in a graphical user interface that shows a graph of meters on hourly intervals...
15:59:24 <jaypipes> eglynn: but I'm not sold that such a use case is not better implemented as a simple memcache caching layer that saves the results of a SQL query against the main meter table...
15:59:29 <lsmola_> jaypipes: something like that is being implemented in Horizon
15:59:59 <eglynn> lsmola_: the current horizon metering dashboard doesn't use time-clamped queries, or?
16:00:09 * jd__ is not that sold on caching either
16:00:11 <jaypipes> lsmola_: sure, understood. but the architecture/design of a backend server should never be dictated by the needs of a front-end UI.
16:00:44 <lsmola_> eglynn: you mean with use of the period parameter?
16:01:43 <eglynn> lsmola_: I mean [(start, start+period), (start+period, start+2*period), ..., (end-period, end)]
16:01:46 <jd__> we need to wrap up now, guys
16:01:54 <jd__> it may be better to continue this on the list as needed
16:01:57 <nadya_> I'm afraid we're out of time. To sum up, this functionality is to make long-range queries faster
16:02:21 <eglynn> lsmola_: where start % 1hour != 0 even if period == 1hour
16:02:22 <lsmola_> eglynn: well, that is shown in the timeseries line chart
16:02:28 <cody-somerville> :)
16:02:30 <jaypipes> continue discussion in #openstack-ceilometer?
16:02:38 <eglynn> jaypipes: sure
16:02:42 <jd__> #endmeeting