Tuesday, 2014-11-11

openstackgerritKai Qiang Wu proposed a change to openstack/ceilometer: Initializing a longer resource id in DB2 nosql backend  https://review.openstack.org/131947 01:24
mc__For faster measuring, I changed the standard polling interval from 600 to 60s in polling.yaml. After restarting the services, I receive no measurements at all. Ceilometer-agent-compute.log says "Obtaining CPU Util is not implemented for LibvirtInspector." What happened?06:21
mc__Did I switch the gathering method accidentally?06:22
*** ddieterly has joined #openstack-ceilometer06:26
mc__ DEBUG ceilometer.compute.pollsters.cpu   [-] Obtaining CPU Util is not implemented for LibvirtInspector get_samples /usr/lib/python2.7/dist-packages/ceilometer/compute/pollsters/cpu.py:89, ... DEBUG ceilometer.compute.pollsters.net [-] LibvirtInspector does not provide data for  OutgoingBytesRatePollster get_samples /usr/lib/python2.7/dist-packages/ceilometer/compute/pollsters/net.py:110 06:26
mc__But shortly afterwards I get another measurement from cpu, seems to be good. anyhow nothing is listed in alarmhistory. why ?  14:13:48.165 28947 INFO ceilometer.compute.pollsters.cpu [-] CPUTIME USAGE: {'OS-EXT-STS:task_state': None, 'addresses': {u'openstack-net': [{u'OS-EXT-IPS-MAC:mac_addr': u'fa:16:3e:7f:aa:a5', u'version': 4, u'addr': u'', u'OS-EXT-IPS:type': u'fixed'}]}, 'links': [{u'href': u'http://10.236. 06:38
openstackgerritAjaya Agrawal proposed a change to openstack/ceilometer-specs: Add spec for MagnetoDB metering support  https://review.openstack.org/126335 06:40
cmystermc__: I'd open a bug here with more info like cpu info from /proc and OS + all ^06:55
_elena_eglynn, sileht, gordc, ildikov, jd__, could you take a look at my changes https://review.openstack.org/#/c/130572/ and https://review.openstack.org/#/c/131147/ ?11:03
openstackgerritZhiQiang Fan proposed a change to openstack/python-ceilometerclient: Fix wrong initialization of AuthPlugin for keystone v3  https://review.openstack.org/133593 11:44
nealphsileht: ping?14:59
eglynn_nealph: hello sir!15:26
nealphwelcome back! I trust the time in Paris was fun-filled and requiring much recovery. :D15:27
nealphsleep-wise at least.15:27
eglynn_thank you! ... yeah, I pretty much crashed the whole weekend15:27
*** k4n0 has quit IRC15:27
eglynn_... recovery after a touch of man-flu contracted in Paris, as opposed to the usual summit excuse of excessive partying15:28
eglynn_I heard the HP party was a classic of the genre15:29
nealphyeah, now that I think of it, the conference is just about the perfect petri dish: close proximity, foreign bugs.15:29
nealphI've heard few details of the party yet, which means it must've been good . ;)15:30
nealphand my teammates are just sparing me.15:30
nealphquestion for you: I'm in the midst of a befuddling error on one of our CM deploys and looking for some fresh thoughts...15:31
nealphwe have an instance where we see that the notifier is actively picking up events and calling the publishing method, but collector is not seeing them.15:31
nealphthis is for events only....polled samples pass through without a hitch.15:32
eglynn_by the notifier, you mean the notification agent?15:32
nealphI pinged sileht above, thinking this is related to the refactoring to use oslo notifications instead of rpc, but that was just a shot in the dark really...15:33
nealphyes, notification agent.15:33
eglynn_is the same ceilometer.conf & pipeline.yaml being used by all the agents? (i.e. compute/central and notification)15:33
eglynn_e.g. could the old pipeline RPC sink be still used in the notification agent case?15:34
eglynn_or vice versa?15:34
*** _nadya_ has quit IRC15:34
eglynn_(ie. rpc:// used in the one and notifier:// used in the other pipeline.yaml)15:35
eglynn_just thinking that one of the pipeline.yamls could have been left behind after this change https://github.com/openstack/ceilometer/commit/fec77dbc 15:37
eglynn_(though both styles of publishing were intended to continue working)15:37
nealphyes, just confirmed only one pipeline.yaml. I'm suspicious of that change being our root cause, but can't seem to pin it down.15:39
eglynn_nealph: yeah, the confusing aspect is that there should be no difference between the usage of rpc:// versus notifier:// for polled versus published samples15:41
nealphwell, that was my understanding, so thanks for reinforcing it.15:41
eglynn_nealph: yeah ... polled versus *notification-driven samples I meant15:41
eglynn_nealph: are you using a stock pipeline.yaml, or a handcrafted one?15:41
nealpha handcrafted one, let me pastebin.15:42
nealph(drastically simplified)15:42
eglynn_nealph: are you sure from the logging that the notification-originated samples are actually getting to the pipeline sink?15:43
eglynn_nealph: e.g. not discarded earlier, say here https://github.com/openstack/ceilometer/blob/master/ceilometer/pipeline.py#L366 15:44
nealphI've instrumented the notification agent's plugin.py heavily, and it definitely calls the publish method.15:44
nealphi.e. the event is in the 'handled' list, https://github.com/openstack/ceilometer/blob/master/ceilometer/plugin.py#L123 15:45
nealphBut that's a thought.15:45
eglynn_nealph: hmmm, there's no "catch-all" source in that pipeline15:46
eglynn_nealph: just a source explicitly filtered on "storage.objects"15:46
eglynn_nealph: which is a polling-originated meter15:47
eglynn_nealph: so none of the notification-originated meters will match a filter on a pipeline source15:47
eglynn_nealph: so wouldn't they be discarded by this check in publish_samples ...15:47
nealphhmm, had it in my head that the sink only impacted polled data...15:48
eglynn_nealph: yeah we discussed that very point at the summit contributors meetup with Fabio on Friday15:49
eglynn_nealph: I believe Fabio and gordc did some experimentation to prove that notifications were indirectly impacted by the source filtering15:50
eglynn_nealph: the conclusion was to make this more explicit in future with separate source config for polled meters (with an interval specified) and notified meters (interval is not relevant)15:51
nealph*sigh* yeah, I think this is the tangible proof.15:52
eglynn_nealph: yeah ... can you try loosening the meter filter on line 6 of that pipeline.yaml?15:52
nealphsure, but then that opens polling up further.15:52
nealphSo, as noted, there's not a way to constrain polling without impacting notifications, yes?15:53
eglynn_nealph: well you could just explicitly list the notification-originated meters you want to collect15:53
eglynn_nealph: in the style of https://github.com/openstack/ceilometer/blob/master/etc/ceilometer/pipeline.yaml#L17 15:54
eglynn_nealph: (except that you'd be listing only the notified meters that you were interested in)15:55
nealpheglynn_: so could I conceivably leave meter_source as minimally defined? i.e. how I currently have it, but add an additional source section with the notified meters of interest?15:59
eglynn_nealph: well the most minimal change would be to list all in a single source16:03
eglynn_nealph: e.g. http://pastebin.com/HxswL6wu 16:03
eglynn_nealph: or using a second source would work also if preferred16:04
eglynn_nealph: just note that this second source would need to have a meaningless interval defined for it16:04
eglynn_nealph: since interval is a required field, yet has no impact on the collection of notified samples16:05
eglynn_nealph: (since the cadence is controlled by the emitting side)16:05
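[Editor's sketch] The source-filtering effect discussed above can be illustrated with a few lines of Python. This is a simplified model and not Ceilometer's pipeline code: the function names `source_accepts` and `publish_samples`, the source name `notified_source`, and the choice of `image.upload` as the notified meter of interest are all assumptions made for illustration, as are the assumed filter semantics ("*" wildcards, leading "!" to exclude).

```python
from fnmatch import fnmatchcase


def source_accepts(meter, patterns):
    """Return True if a source's meter list accepts `meter`.

    Assumed semantics (simplified): "*" wildcards are allowed and a
    leading "!" marks an exclusion.
    """
    excluded = [p[1:] for p in patterns if p.startswith("!")]
    included = [p for p in patterns if not p.startswith("!")]
    if any(fnmatchcase(meter, p) for p in excluded):
        return False
    if not included:
        # Exclusion-only list: anything not excluded is accepted.
        return bool(excluded)
    return any(fnmatchcase(meter, p) for p in included)


def publish_samples(meters, sources):
    """Keep only the meters that at least one source accepts."""
    return [m for m in meters
            if any(source_accepts(m, pats) for pats in sources.values())]


# Hand-crafted pipeline with a single, narrowly filtered source.
narrow = {"meter_source": ["storage.objects"]}

# Same pipeline plus a second (hypothetical) source listing the
# notification-originated meters of interest.
widened = {"meter_source": ["storage.objects"],
           "notified_source": ["image.upload"]}

incoming = ["storage.objects", "image.upload"]
print(publish_samples(incoming, narrow))   # notified meter is dropped
print(publish_samples(incoming, widened))  # both meters pass
```

With only the narrow source, the notified sample matches no source and is discarded, which is the symptom nealph describes; listing it explicitly in either the same source or a second one lets it through.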
nealpheglynn_: my goal here is to minimalize the polling activity, so methinks a second source would make sense...that is unless the polling mechanism evaluates all defined sources?16:07
eglynn_nealph: it does evaluate all sources, that's a good point16:08
eglynn_nealph: but only creates polling tasks when there's a matching pollster16:09
eglynn_nealph: so the only potentially problematic case is when the *same* meter is fed by *both* polling and notifications16:09
eglynn_nealph: this occurs for the few cases marked "both" in http://docs.openstack.org/developer/ceilometer/measurements.html 16:10
nealpheglynn_: I'm thinking of meters such as "image"...which have both a polling and notification origin.16:10
nealphha, you beat me to it.16:10
eglynn_yep, exactly16:10
eglynn_nealph: so one hacky way would be to specify a very large polling interval for the second source16:11
nealphYeah, that's a thought. Okay, I think I've got my head around it. thanks...ironic that Fabio raised this too. I assume it was in the context of whitelisting.16:11
eglynn_nealph: it was in the context of item #3 on this list http://bit.ly/kilo-ceilometer-contributor-meetup 16:12
eglynn_Create a pipeline like (or extend it) to explicitly state a filter to store only relevant16:13
nealpheglynn_: ahh, yes. okay, thanks again.16:14
eglynn_nealph: np! :)16:14
nealpheglynn_: separate question: who has a working knowledge of tooz? The documentation on the agent scale-out is pretty sparse and I'm looking to pick someone's brain.16:18
eglynn_nealph: jd__, sileht & nsaje would be most familiar of the ceilometer team16:19
eglynn_nealph: cdent & I also to a lesser extent16:19
nealpheglynn_: perfect...will follow on with them. And lament the Euro bias in time zones. :D16:20
eglynn_nealph: is it tooz itself you're interested in finding out more on, or the ceilometer usage of tooz for agent partitioning?16:21
eglynn_nealph: if the latter, then nsaje, cdent or I16:21
* cdent waves16:21
eglynn_nealph: if the former, then another option would be Josh Harlow16:21
* cdent catches up16:22
nealpheglynn_: the latter, thanks for clarifying.16:22
nealph eglynn_: I've found some notes around your use of memcached in https://bugzilla.redhat.com/show_bug.cgi?id=1130372. I think that will transfer?16:26
nealph(beyond alarm evaluators that is)16:26
cdentclearly, nealph, since they didn't see fit to send us to Paris we need to have a hackathon for us and related unblessed, somewhere nice and sunny16:26
* nealph nods vigorously16:27
* cdent starts a kickstarter campaign16:29
eglynn_nealph: in the meantime Josh has written a redis driver for tooz, released in tooz 0.8.016:29
eglynn_nealph: ... with some bugfixes and improvements to follow in the imminent tooz 0.9.0 release16:29
eglynn_nealph: so the key issue with memcache is how to make it fault resilient16:30
eglynn_nealph: if you scale it out horizontally, there is no replication/sync-up/cross-talk between the memcached servers16:30
eglynn_nealph: after all, it's just an in-memory cache16:31
eglynn_nealph: intended for usecases where the application layer is tolerant of cache-misses16:31
eglynn_nealph: in our case, a cache miss would involve mistakenly assuming an agent is no longer a member of the relevant tooz group16:31
eglynn_nealph: so the conclusion was to treat the memcached driver as only suitable for test/PoC as opposed to prod16:32
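[Editor's sketch] To see why a lost membership record matters, here is a small Python model of hash-based work partitioning among agents. It is an assumption-laden illustration, not Ceilometer's partitioning code and not the tooz API: `my_subset`, the agent names, and the resource names are hypothetical.

```python
import hashlib


def my_subset(me, members, resources):
    """Return the resources that agent `me` should handle.

    Each resource is hashed into one of len(members) buckets and an
    agent takes the bucket matching its position in the sorted list.
    """
    members = sorted(members)
    my_index = members.index(me)

    def bucket(resource):
        digest = hashlib.sha256(resource.encode("utf-8")).hexdigest()
        return int(digest, 16) % len(members)

    return [r for r in resources if bucket(r) == my_index]


resources = ["instance-%d" % i for i in range(8)]
group = ["agent-a", "agent-b"]

# Consistent view: both agents see the same membership list.
a_view = my_subset("agent-a", group, resources)
b_view = my_subset("agent-b", group, resources)

# Stale view: agent-b no longer sees agent-a (e.g. its record was
# evicted from the cache), so agent-b claims every resource.
b_stale = my_subset("agent-b", ["agent-b"], resources)

print(a_view, b_view, b_stale)
```

Under a consistent view the two subsets are disjoint and cover everything; with the stale view agent-b also takes agent-a's share, so those resources would be handled twice.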
nealphwell, then it's redis or zookeeper?16:32
eglynn_nealph: yeap, moving upscale there's the zookeeper option16:33
eglynn_nealph: this is used successfully at scale by Yahoo16:33
eglynn_nealph: however it being java-based, it's awkward for some distros to package16:33
eglynn_nealph: so for one well-known distributor at least, who shall remain nameless ;) ... redis hits a nice sweet-spot between memcache and ZK16:34
eglynn_nealph: redis having master/slave goodness via redis-sentinel, and options to persist the data so as to survive restarts16:35
eglynn_nealph: does that all make sense?16:36
nealpheglynn_: hadn't considered the java dependency, that's a bit of a hump to get over indeed. yes, it makes sense...I am on the hook for spinning this up in the next week, so I'm sure I will have follow-on questions.16:37
eglynn_nealph: cool16:39
*** Longgeek has joined #openstack-ceilometer16:40
eglynn_nealph: so nsaje was the original author, though I'm not sure if he's online this week (he was at summit and also partially on vacation last week)16:40
eglynn_nealph: but cdent and I should be able to help you in any case16:40
cdentFrom thailand16:40
cdentOr maybe the Bahamas16:41
nealpheglynn_: cool, thanks.16:41
nealphcdent: I hear St. Lucia is nice this time of year.16:42
*** pradysam has joined #openstack-ceilometer16:53
pradysamzqfan: hi16:56
zqfanpradysam, good evening16:57
pradysamzqfan: are you located in china ?16:57
zqfanyes, it's midnight here17:00
pradysamzqfan: oh ok. I just saw your review comment. My first reaction was "doh !" :-)17:01
pradysamzqfan: I could make the msg more generic, with 'Invalid alarm %s' % self._name. But can you explain what you meant when you said I should use the comparison_operator ?17:04
*** _nadya_ has quit IRC17:06
zqfanpradysam: with your patch, I got error message "faultstring": "Invalid input for field/attribute state. Value: 'xx'. Invalid alarm state, should be one of ['ok', 'alarm', 'insufficient data']", when I test with wrong command: ceilometer --debug alarm-threshold-create --name x --meter-name cpu --threshold 1 --comparison-operator xx17:14
pradysamThat's not good17:14
pradysamzqfan: let me test that again17:15
zqfanpradysam: As I mentioned in review comment, not only alarm state uses AdvEnum, other attributes use it too, we should not return 'alarm state something' when the user puts a wrong value in another field17:15
pradysamzqfan: Yes, you are right !17:16
pradysamzqfan: I'll push another patch set with the correction17:16
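[Editor's sketch] The point about a shared enum type needing a field-neutral error message can be shown in plain Python. This is not the WSME/AdvEnum code under review: `validate_enum` and the two constant tuples are hypothetical names for illustration. The state values are the ones quoted in the error above, and the operator list is an assumption.

```python
def validate_enum(field, value, allowed):
    """Validate `value` for `field`, naming the field in any error.

    The message is built from the field name and its own allowed
    values, so one validator can serve every enum-typed attribute.
    """
    if value not in allowed:
        raise ValueError(
            "Invalid input for field/attribute %s. Value: %r. "
            "Should be one of %s" % (field, value, list(allowed)))
    return value


# Assumed operator values; the states are taken from the quoted error.
OPERATORS = ("lt", "le", "eq", "ne", "ge", "gt")
STATES = ("ok", "alarm", "insufficient data")

try:
    validate_enum("comparison_operator", "xx", OPERATORS)
except ValueError as exc:
    message = str(exc)
    print(message)
```

Because the field name is a parameter, a bad comparison operator is reported against `comparison_operator` and its own choices rather than against the alarm state.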
*** _nadya_ has joined #openstack-ceilometer17:20
*** yatin has quit IRC17:21
*** _nadya_ has joined #openstack-ceilometer18:57
*** Dafna has quit IRC18:59
*** Viswanath has joined #openstack-ceilometer21:09
*** Viswanath has quit IRC21:12
*** asalkeld__ has joined #openstack-ceilometer23:20
