15:00:18 <jd__> #startmeeting ceilometer
15:00:19 <openstack> Meeting started Thu Feb 6 15:00:18 2014 UTC and is due to finish in 60 minutes. The chair is jd__. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:20 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:22 <openstack> The meeting name has been set to 'ceilometer'
15:00:40 <eglynn> o/
15:00:41 <jd__> #link https://wiki.openstack.org/wiki/Meetings/Ceilometer
15:00:42 <tongli> hi, @jd__
15:00:44 <tongli> o/
15:00:47 <jd__> hi everyone
15:00:50 <gordc> o/
15:00:53 <ildikov_> o/
15:02:04 <nprivalova> o/
15:02:24 <jd__> #topic Milestone status icehouse-3
15:02:39 <jd__> #link https://launchpad.net/ceilometer/+milestone/icehouse-3
15:02:53 <jd__> so a lot of things are started, but it'd be great to finish ASAP
15:02:57 <ildikov_> we still need approval for this patch: https://review.openstack.org/#/c/62157/
15:03:03 <jd__> otherwise we'll be caught in the gate storm
15:03:33 <jd__> ildikov_: yeah I'll try to take a look at it
15:03:40 <gordc> ildikov_: i may have time to review tomorrow as well.
15:03:50 <ildikov_> jd__: thanks
15:04:13 <jd__> otherwise not much to add on my part yet
15:04:35 <ildikov_> thanks guys, it would be really good if we could go on with the statistics bp and also have the patch sets of the complex query landed in i-3
15:04:38 <jd__> anything else about one of your blueprints?
15:04:59 <nprivalova> I'm still confused about aggregation
15:05:17 <nprivalova> not sure whether I should continue or not
15:05:52 <jd__> nprivalova: do you have a requirement on that?
15:05:52 <eglynn> nprivalova: did we come to any conclusion on the overlapping periods issue I raised?
15:05:52 <sileht> o/
15:05:59 <ityaptin> o/
15:06:30 <eglynn> nprivalova: ... i.e. the question of whether aggregation can be helpful in the common case of periods that overlap
15:06:31 * jd__ dodges the issue
15:06:46 <nprivalova> eglynn: we agreed that it is not for alarming
15:07:17 <nprivalova> #link https://blueprints.launchpad.net/ceilometer/+spec/base-aggregation
15:07:45 <eglynn> nprivalova: k, then the question really is the potential benefit for the other common cases of recurring statistics queries
15:08:10 <eglynn> nprivalova: ... if we can detect when the same query constraints recur
15:08:13 <nprivalova> yep, I agree. I saw a comment about the billing use case
15:08:29 <eglynn> nprivalova: ... and match the actual query constraints to the pre-aggregated values
15:08:49 <nprivalova> anyway, I think we may continue with the meeting :)
15:08:53 <jd__> ok
15:09:00 <jd__> #topic Tempest integration
15:09:06 <jd__> wassup on that?
15:09:23 <nprivalova> we have the following
15:09:25 <nprivalova> https://review.openstack.org/#/q/status:open+project:openstack/tempest+branch:master+topic:bp/add-basic-ceilometer-tests,n,z
15:09:46 <nprivalova> so the notifications part is done
15:09:52 <nprivalova> but we have a bug :)
15:10:29 <nprivalova> #link https://bugs.launchpad.net/ceilometer/+bug/1274607
15:10:31 <uvirtbot> Launchpad bug 1274607 in ceilometer "ceilometer-agent-notification is broken without eventlet monkey patching" [Critical,In progress]
15:11:01 <nprivalova> yep, so that's why we have only -1 from Jenkins
15:11:16 <jd__> fair enough, that one should be resolved soon fortunately
15:11:18 <nprivalova> I'm testing the fix
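For context, a minimal sketch of the class of fix bug 1274607 calls for; this assumes the standard eventlet pattern and is not the actual patch under review:

```python
# Minimal sketch, not the actual fix under review: an eventlet-based agent
# entry point has to monkey-patch the standard library before any other
# import pulls in un-patched socket/threading modules, otherwise blocking
# I/O calls hang the green-threaded notification agent.
import eventlet
eventlet.monkey_patch()

# ... only after this point should the agent's real imports and main() run.
```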
15:11:48 <jd__> #topic Release python-ceilometerclient?
15:11:57 <eglynn> no need for this AFAIK
15:12:12 <jd__> ok :)
15:12:15 <jd__> #topic Polling-on-demand discussion (ityaptin)
15:12:28 <jd__> ityaptin: enlighten us
15:12:46 <ityaptin> about pollsters on demand. The use cases for this feature are tests and debugging.
15:13:06 <nprivalova> #link https://review.openstack.org/#/c/66551/
15:13:12 <jd__> (nprivalova: the fix works if you have https://review.openstack.org/#/c/71124/)
15:13:34 <eglynn> so the purpose of this is to trigger polling for tests ... could the same be achieved by simply configuring the test with a v. short pipeline interval?
15:13:37 * dhellmann apologizes for being late
15:14:13 <gordc> https://blueprints.launchpad.net/ceilometer/+spec/run-all-pollsters-on-demand
15:14:29 <ityaptin> There is also a proposal to turn on this feature only with a 'debug' flag, because somebody could DoS ceilometer by triggering polling.
15:14:31 <jd__> dhellmann: you're… not fired!
15:14:37 * dhellmann whew!
15:14:50 <eglynn> i.e. the test needs to precipitate events that happen relatively infrequently (i.e. polling cycles with the boilerplate pipeline.yaml)
15:15:10 <eglynn> ... so one approach would be simply to make these events more frequent in the test scenario
15:15:11 <gordc> ityaptin: how does the flag get set? the DoS issue was a concern when i read the bp
15:15:12 <sileht> fyi: I have added this to devstack: CEILOMETER_PIPELINE_INTERVAL=10
15:15:22 <jd__> the problem is that polling != having samples anyway, there's no guarantee that samples are going to be available N seconds after being polled
15:15:27 <dhellmann> is this for tempest tests or unit tests?
15:15:28 <jd__> nothing's synchronous
15:15:31 <jd__> dhellmann: tempest
15:15:36 <tongli> @eglynn, that still won't be the same, I would think.
15:15:40 <sileht> perhaps we can just set a different value for devstack-gate
15:15:57 <jd__> DoS concern? I doubt that, it's a feature available on RPC
15:16:07 <jd__> sure the admin can DoS himself, but well.. he's admin
15:16:11 <eglynn> tongli: not exactly equivalent, but perhaps a close enough analogue?
15:16:31 <nprivalova> I think it's not only for tempest. When I install devstack it is useful just to check that the pollsters work ok, without waiting for the interval
15:16:47 <jd__> nprivalova: agreed
15:16:50 <ityaptin> gordc: For example - debug option
15:16:56 <eglynn> nprivalova: ... but the test has to wait anyway for some "ingestion" lag
15:17:05 <tongli> @eglynn, I think it will be nice to hit the enter key and then expect the code to hit the break point.
15:17:38 <sileht> eglynn, agree, for example the swift account size is done by an async swift task
15:17:48 <tongli> @eglynn, @ityaptin, or you use the new notification alarm.
15:18:01 <sileht> eglynn, so you have to wait until swift has updated the value before ceilometer polls it
15:18:15 <tongli> which will simply trigger it as soon as a notification is present on the bus.
15:18:16 * jd__ has no problem with that feature
15:18:30 <dhellmann> if we're going to have a special test mode, it seems like it makes the most sense to make that a separate executable that runs the polling one time and exits
15:18:41 <nprivalova> actually the question was about the default value for the debug flag :)
15:18:42 <dhellmann> rather than adding a test mode to the main service
15:19:03 <gordc> dhellmann: agreed
15:19:05 <eglynn> dhellmann: ... that sounds reasonable to me
15:19:05 <jd__> dhellmann: if that were synchronous, that'd be better
15:19:09 <ityaptin> tongli: If we want to test pollsters, that is not suitable
15:19:23 <jd__> dhellmann FTW
15:19:24 <tongli> @ityaptin, true.
15:19:29 <dhellmann> jd__: yeah, just refactor the code that runs the polling pipelines so it can be called from a console script
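A rough sketch of the separate one-shot executable dhellmann suggests here; every name in it (load_pipelines, poll_and_publish, run_once) is a placeholder for illustration, not an existing Ceilometer API:

```python
# Hypothetical sketch of a "run the pollsters once and exit" console script.
# In a real patch the two helpers would be the refactored pipeline/polling
# code shared with the periodic agent; here they are stubs so the sketch is
# self-contained.
import sys


def load_pipelines():
    # Placeholder: the real version would parse pipeline.yaml and return the
    # configured polling pipelines.
    return []


def poll_and_publish(pipeline):
    # Placeholder: the real version would run each pollster in the pipeline
    # once and publish the resulting samples.
    pass


def run_once(argv=None):
    """Run every configured polling pipeline a single time, then exit."""
    for pipeline in load_pipelines():
        poll_and_publish(pipeline)
    return 0


if __name__ == '__main__':
    sys.exit(run_once(sys.argv[1:]))
```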
15:19:45 <jd__> dhellmann: I vote for that definitely, because that would be much better for Tempest
15:19:47 <dhellmann> that wouldn't do anything for testing the collector
15:20:04 <dhellmann> do we need some way to have the collector notify tests when data is available?
15:20:15 <jd__> what I don't know is if it's reasonable to use that trick in tempest?
15:20:28 <jd__> dhellmann: that'd be great
15:20:41 <dhellmann> jd__: good point, we wouldn't really be testing the polling service
15:20:48 <dhellmann> but we could have separate tests for that
15:21:18 <dhellmann> if the point is to show that the service works and the pollsters work, do they have to run in the same test to know they work?
15:21:22 <jd__> or we can also use the API method as in the current patch if it's synchronous, i.e. the GET /pollsters returns only when all pollsters are run
15:21:24 <nprivalova> we may set configs only in devstack
15:21:37 <nprivalova> there is no way to hack something in tempest
15:21:48 <jd__> having a callback on the collector is another issue, I don't have a solution yet but we can think about something else later I guess
15:21:49 <eglynn> in general I wonder how tempest handles asserting that other asynchronous tasks have completed?
15:21:50 <dhellmann> nprivalova: ah, so we have to set devstack to configure ceilometer for tempest?
15:21:57 <eglynn> such as spinning up an instance
15:22:04 <eglynn> or a volume becoming available?
15:22:07 <nprivalova> dhellmann: AFAIK, yes
15:22:09 <dhellmann> eglynn: I see a lot of polling for status and timing out in the errors in the recheck list
15:22:16 <dhellmann> nprivalova: ok
15:22:18 <jd__> eglynn: by waiting and timing out, which has the potential to make Ceilometer the new Neutron :/
15:23:08 <eglynn> yeah I guess
15:23:22 <nprivalova> maybe we should move it to the mailing list?
15:23:32 <dhellmann> are we emitting any sort of notification of having received data?
15:23:38 <eglynn> ... /me is made a bit nervous by making big changes to the ceilo execution path for testing
15:23:43 <dhellmann> could we write the test to watch the log for a specific message, or to listen for a notification?
15:23:52 <dhellmann> eglynn: yeah
15:24:01 <eglynn> ... in the sense that we end up testing something other than what actually runs in prod
15:24:02 <jd__> sending notifications when we receive notifications?
15:24:28 <dhellmann> jd__: otherwise I guess the test would call the api over and over until it got the data it wanted?
15:24:32 <jd__> Ceilometer inception
15:24:41 <nprivalova> oh no :)
15:24:53 <jd__> dhellmann: yeah… polling and timing out :(
15:24:53 <eglynn> yeah so in prod it's not the extra notification being emitted that has value, it's the data being visible in the API
15:25:29 <dhellmann> eglynn: sure, I'm just trying to figure out how to write the test with the least polling
15:25:36 <dhellmann> maybe polling is the best thing we can do
15:25:41 <eglynn> ... I dunno, suppose we did something funky with mongo replication
15:25:44 <jd__> I think so for now
15:25:54 <dhellmann> polling would certainly be simplest
15:25:55 <eglynn> ... and the data stopped being visible from a secondary replica
15:26:01 <eglynn> ... but the tests still pass
15:26:03 <jd__> now the question is, is it acceptable to have a different path for polling (a request to the API) rather than the regular timer, in terms of testing
15:26:17 <dhellmann> eglynn: but our tests aren't for mongo, they're for our code
15:26:24 <nprivalova> notifications are another question. Now we are speaking only about polling
15:26:51 <eglynn> dhellmann: I'm thinking of our mongo storage driver doing some replication-aware logic that has the potential to be broken
15:26:53 <dhellmann> nprivalova: what I was hinting at was having ceilometer send a notification that the test could listen for to know when data had arrived, instead of polling the API
15:27:14 <dhellmann> eglynn: if we have to put replication logic in our driver, then we'd have to test for it -- we don't have anything like that now, right?
15:27:49 <eglynn> dhellmann: nope, we don't ... that was just an off-the-cuff example of something that could break
15:27:56 <dhellmann> eglynn: ok
15:28:01 <jd__> I think this is going too far?
15:28:14 <tongli> @dhellmann, I am working on the notification alarm, if that is what you asked.
15:28:24 <jd__> I think my previous question is a good one, can I haz a cheese^W^Wyour opinion?
15:28:27 <eglynn> dhellmann: and might not be caught by a test that just asserted for a special notification that the notification agent had seen the incoming metering message
15:28:45 <tongli> @dhellmann, when a notification appears, you can make something happen.
15:28:45 <dhellmann> jd__: ok, I think we're talking about 2 different things
15:28:56 <dhellmann> tongli: good point, just a sec
15:29:09 <dhellmann> jd__: I was talking about how the test would know when ceilometer's collector had received data
15:29:19 <nprivalova> let us write to the mailing list again because honestly I don't see any solution now
15:29:48 <dhellmann> nprivalova: good idea
15:29:51 <jd__> dhellmann: I know, but that's a different topic than the one we're discussing
15:29:59 <dhellmann> sorry, I thought we had moved on
15:30:02 <jd__> dhellmann: so I'd like to have an answer on the first point, first :)
15:30:16 * dhellmann wonders when jd__ became such a stickler ;-)
15:30:17 <jd__> which is having a different path used to poll the data
15:30:20 <nprivalova> and please take a look into the notification tests in tempest, because we need to be sure that the tests are correct
15:30:26 <jd__> lol
15:30:35 <dhellmann> I think it's a mistake to build something in for testing that is too different from something that would be useful in production
15:30:49 <dhellmann> we have a periodic polling loop, so we need a test that shows that we poll periodically
15:30:49 <eglynn> dhellmann: +1
15:30:57 <jd__> agreed
15:31:02 <dhellmann> if we have an API to trigger polling, then we need a *separate* test to show that the api triggers polling
15:31:16 * jd__ hits the channel with his mallet
15:31:19 <dhellmann> so we might as well just test for the code we have now, since we can't avoid it
15:31:51 <dhellmann> if, as nprivalova says, we have to use the devstack configuration, then we will need to adjust the polling interval there to something relatively small and use that for the test
15:32:06 * jd__ nods
15:32:12 <eglynn> yep ... my suggestion exactly
15:32:17 <jd__> as far as the notification of notifications received is concerned, I think it's something we should think about
15:32:19 <dhellmann> alternately, if we could have the test adjust that interval -- maybe by starting a second copy of the service? -- then we could do all of this in tempest
15:32:25 <jd__> but probably not here and now :)
15:33:15 <dhellmann> jd__: for notification of notifications, we might be able to use the alarm trigger feature, but that is using some production code to test other production code
15:33:25 <jd__> indeed
15:33:28 <dhellmann> so it might be better conceptually to just have the test poll the API looking for the data
15:33:47 <eglynn> as long as the polling is "smart" enough, is that approach really that bad?
15:33:50 <jd__> that would be good enough for now anyway
15:33:54 <dhellmann> which is less elegant, in some sense, but more "correct" from a testing standpoint
15:33:57 <jd__> eglynn: we'll see?
15:34:04 <dhellmann> eglynn: nah, it just feels a little heavy-handed
15:34:28 <eglynn> by "smart" I mean say using a reasonably adaptive/backed-off intra-poll delay
15:34:43 <jd__> it's tempest, you can hammer the API
15:34:43 <dhellmann> eglynn: right
15:34:47 <dhellmann> haha
15:34:56 <jd__> "adaptive", tsss :)
15:35:01 <eglynn> LOL :)
15:35:04 <jd__> GIVE ME THE DAMN DATA YOU API
15:35:12 <jd__> that's how we should do it
15:35:29 <jd__> shall we move on gentlemen?
15:35:29 * dhellmann opens a blueprint to change the API to allow queries in all caps
15:35:39 <nprivalova> unfortunately we should commit it to devstack first :)
15:35:41 * jd__ puts his mallet away
15:35:54 <sileht> devstack already has the CEILOMETER_PIPELINE_INTERVAL configuration variable, so we just have to set it in gate-devstack
15:36:10 <jd__> (and gentlewomen)
15:36:26 <jd__> nprivalova: would that be a problem?
15:36:35 <nprivalova> sileht: I will work on this
15:36:39 <jd__> good point sileht
15:36:53 <dhellmann> sileht saves us from over-engineering
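A small sketch of the "smart" polling eglynn describes, written as a tempest-style helper; fetch_samples is an assumed stand-in for whatever client call the real test would use, and the timeout and back-off values are arbitrary:

```python
# Illustration only: poll the API for expected data with an exponential
# back-off and an overall deadline, instead of a fixed-interval busy loop.
import time


def wait_for_samples(fetch_samples, meter_name, timeout=120, initial_delay=1):
    """Return samples for meter_name once they appear, or raise on timeout."""
    deadline = time.time() + timeout
    delay = initial_delay
    while time.time() < deadline:
        samples = fetch_samples(meter_name)
        if samples:
            return samples
        time.sleep(delay)
        delay = min(delay * 2, 15)  # back off, but cap the inter-poll delay
    raise AssertionError('no %s samples within %d seconds' % (meter_name, timeout))
```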
15:37:18 <jd__> #topic Work with metadata discussion
15:37:18 <nprivalova> jd__, I don't know :) maybe you have the power to commit everything everywhere
15:37:31 <nprivalova> it's me again
15:37:41 <jd__> nprivalova: I may or may not have some super power :D
15:38:08 <nprivalova> Long story short:
15:38:17 <nprivalova> When a user requests meters or resources, their metadata is flattened.
15:38:21 <nprivalova> On the other hand, when a meter or resource is stored to the db, its metadata is flattened too.
15:38:28 <nprivalova> These two processes are independent and now two different flatten functions exist.
15:38:33 <nprivalova> We decided to keep only one of them (related bug #link https://bugs.launchpad.net/ceilometer/+bug/1268618).
15:38:35 <uvirtbot> Launchpad bug 1268618 in ceilometer "similar flatten dict methods exists" [Medium,In progress]
15:38:45 <nprivalova> After some discussions with the team I decided to use dict_to_keyval everywhere. The reason is that this func allows the user to create queries on lists and doesn't contain bugs.
15:38:55 <nprivalova> So the question: the API layer is the only place where recursive_keypairs is used, and this function contains a bug.
15:39:16 <nprivalova> The perfect solution is to change recursive_keypairs=>dict_to_keyval in the API, but the output of these funcs is different
15:39:20 <nprivalova> You may take a look here #link https://review.openstack.org/#/c/67704/4/ceilometer/api/controllers/v2.py
15:39:29 <nprivalova> Is it absolutely forbidden to make any changes in API output? We may postpone changing recursive_keypairs=>dict_to_keyval in the API, but maybe we could fix the bug in recursive_keypairs and fix all our wrong tests?
15:40:05 <dhellmann> nprivalova: what's the bug in recursive_keypairs?
15:40:20 <eglynn> well it would be forbidden I'd say to make changes that could break existing API callers
15:40:37 <dhellmann> yes, changing the return format would require an API version bump
15:40:45 <nprivalova> should I fix the bug but simulate it again in the API to keep the behaviour?
15:40:50 <eglynn> #link https://wiki.openstack.org/wiki/APIChangeGuidelines
15:40:53 <dhellmann> which isn't out of the question, but is probably not something we want to do at this point in the cycle
15:41:18 * jd__ shakes in fear of APIv3
15:41:35 <nprivalova> #link https://bugs.launchpad.net/ceilometer/+bug/1268628
15:41:37 <uvirtbot> Launchpad bug 1268628 in ceilometer "recursive_keypairs doesn't throw 'separator' param to next iteration" [Undecided,In progress]
15:41:50 <gordc> nprivalova: i guess your fix is good then. i actually don't like how we're outputting some odd formatting... but it will change output to fix it.
15:41:50 <dhellmann> nprivalova: ah
15:42:39 <gordc> since the consensus is to not change output i think we need to keep your patch in to keep output consistent as before.
15:42:59 <nprivalova> yep, just wanted to clear that up
15:43:20 <jd__> cool
15:43:28 <jd__> I like it when we all agree
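For context, an illustration (not the actual Ceilometer utility code) of a dict_to_keyval-style flattener, and of the separator-propagation mistake described in bug 1268628 above:

```python
# Illustration only: turn nested metadata into (flattened.key, value) pairs,
# including positional keys for list items. The separator must be passed
# through on every recursive call; bug 1268628 describes exactly this class
# of mistake (a custom separator being dropped after the first level).
def flatten(metadata, parent='', separator='.'):
    for key, value in sorted(metadata.items()):
        name = '%s%s%s' % (parent, separator, key) if parent else key
        if isinstance(value, dict):
            # Forgetting to forward `separator` here would silently fall back
            # to the default on nested levels.
            for pair in flatten(value, name, separator):
                yield pair
        elif isinstance(value, (list, tuple)):
            for index, item in enumerate(value):
                yield ('%s%s%d' % (name, separator, index), item)
        else:
            yield (name, value)


# Example: dict(flatten({'disk': {'ephemeral': 0}, 'tags': ['a', 'b']}))
# -> {'disk.ephemeral': 0, 'tags.0': 'a', 'tags.1': 'b'}
```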
15:43:46 <jd__> #topic Open discussion
15:43:48 <nprivalova> and one more cr https://review.openstack.org/#/c/68583/
15:43:53 <gordc> nprivalova: you've no idea how much it bothers me seeing 'a.b:c:d' keys. lol i'll review the patch again
15:44:15 <nprivalova> gordc: cool :)
15:44:41 <tongli> anyone know if the ctrl+c problem was fixed or not?
15:44:59 <nprivalova> tongli: where? in devstack?
15:45:06 <tongli> yes.
15:45:20 <tongli> I do not think that is specific to devstack though.
15:45:33 <sileht> tongli, https://review.openstack.org/#/c/70338/ this one is missing I think for the CTRL+C issue
15:45:51 <nprivalova> tongli: oh, I'm not alone :) Today I faced it several times with devstack-master
15:46:04 <gordc> tongli: it's patched. the oslo sync code just got merged.
15:46:20 <tongli> ok. good.
15:47:11 <nprivalova> do you have some bug-scrub procedure?
15:48:10 <eglynn> nprivalova: ... do you mean triaging the bug queue?
15:48:17 <eglynn> ... or squashing the actual bugs with a concerted effort at fixing?
15:48:33 <eglynn> ... such as a "bug-squashing day"
15:49:04 <nprivalova> I meant clean-up
15:49:18 <nprivalova> and triaging, yes
15:49:40 <tongli> nova had a few of these days in the past
15:49:53 <eglynn> a clean-up that ends with a neater, prioritized queue ... but not necessarily with fixed bugs, right?
15:50:16 <ddutta> Hi, I am a noob in ceilometer .... was reading code ..... any place I can help to start learning about the code?
15:50:33 <dhellmann> hi, ddutta!
15:50:39 <jd__> ddutta: try fixing a bug?
15:50:46 <nprivalova> and the same with bps. I just found a bug in the 'Confirmed' state that was fixed half a year ago :)
15:50:59 <ddutta> btw I found a trivial typo too :) https://review.openstack.org/#/c/71431/
15:51:37 <ddutta> dhellmann: hi ... would love to do something here as my interests are in streaming data mining and machine learning :) ...
15:52:22 <dhellmann> ddutta: you've seen http://docs.openstack.org/developer/ceilometer/ right?
15:52:22 <eglynn> nprivalova: ... the newer bugs seem to be triaged fairly rapidly in general, but it seems like we may need to do a periodic trawl of the older ones for dupes/stales etc.
15:52:32 <ddutta> will take on some simple bugs for starters to get more code and design insight ......
15:52:36 <gordc> nprivalova: which bug was that? i occasionally run through bugs to clean them up a bit... i tend to let jenkins switch bug status so i guess it missed it in this case.
15:52:53 <ddutta> dhellmann: yes I started to read those
15:53:52 <gordc> ddutta: i tend to throw breakpoints in code i'm interested in and step through... probably doesn't work for everyone but works for me.
15:54:13 <dhellmann> ddutta: +2 on that patch, good eye
15:54:20 <nprivalova> gordc: ah, ok. it was https://bugs.launchpad.net/ceilometer/+bug/1217412 . We've changed the status
15:54:21 <uvirtbot> Launchpad bug 1217412 in ceilometer "HBase DB driver losing historical resource metadata" [Medium,Fix released]
15:55:18 <ddutta> dhellmann: thx .... on to the bugs now
15:55:27 <ddutta> gordc: good idea ....
15:55:57 <gordc> nprivalova: ah, yeah. that status wasn't updated by the build... i guess anyone can change the status, so if you notice anything feel free to make updates.
15:57:28 <jd__> time to wrap up guys
15:57:41 <jd__> feel free to continue in #openstack-ceilometer :)
15:57:48 <jd__> happy hacking!
15:57:50 <jd__> #endmeeting