15:04:37 <jd__> #startmeeting ceilometer
15:04:37 <openstack> Meeting started Thu Mar 20 15:04:37 2014 UTC and is due to finish in 60 minutes. The chair is jd__. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:04:38 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:04:40 <openstack> The meeting name has been set to 'ceilometer'
15:04:42 <jd__> hey
15:04:49 <ildikov_> o/
15:04:52 <llu-laptop> o/
15:04:54 <jd__> ok it's just lag :)
15:04:57 <jd__> hi everyone
15:05:08 <terriyu> o/
15:05:08 <gibi> o/
15:05:11 <nprivalova> o/
15:05:12 <gordc> o/
15:06:02 <Alexei_9871> o/
15:06:07 * eglynn lurks ...
15:06:16 <sileht> o/
15:07:19 <jd__> #topic Milestone status icehouse-rc1
15:07:39 <jd__> #link https://launchpad.net/ceilometer/+milestone/icehouse-rc1
15:07:49 <jd__> so we have a bunch of bugs to fix, we probably need to focus a lot on that
15:08:02 <jd__> I've started to go through the entire bug list to triage the bugs and target them as needed
15:08:04 <jd__> help welcome :)
15:08:11 <jd__> anything to discuss on that otherwise?
15:08:54 <gordc> trying to find a link.
15:08:57 <nprivalova> yep
15:09:13 <nprivalova> I've started on the critical bug about logging
15:09:17 <gordc> nm. i should really read #link. lol
15:09:29 <gordc> nprivalova: what are your plans for addressing that bug?
15:10:07 <nprivalova> gordc: am I right that the ceilometer pipeline is running inside swift to publish messages?
15:10:33 <jd__> nprivalova: yes
15:10:35 <gordc> yep. it's used in middleware.
15:10:45 <jd__> I saw that bug and I'm not sure how to fix it
15:10:52 <nprivalova> actually syslog and s-proxy are almost the same
15:11:34 <nprivalova> I guess it's not a ceilometer problem
15:11:45 <gordc> is there no way to filter logs from 'external' libraries? like how we define here: https://github.com/openstack/ceilometer/blob/master/ceilometer/service.py#L117
15:11:47 <nprivalova> looks like swift writes to 2 log files
15:12:36 <jd__> I don't know, from what I saw in the bug the problem is that the logs are set to debug
15:12:51 <jd__> I don't see how we are supposed to change that?
15:13:21 <gordc> jd__: yeah, our logs should be debug ... (some of them are audit level right now)
15:13:35 <jd__> the ones in the bug report are debug as far as I saw
15:13:36 <nprivalova> the problem is not in 'debug'
15:13:59 <terriyu> nprivalova: do you have a link to the bug you're discussing?
15:13:59 <nprivalova> ok, let's proceed in the local channel
15:14:12 <nprivalova> #link https://bugs.launchpad.net/ceilometer/+bug/1294789
15:14:14 <uvirtbot> Launchpad bug 1294789 in ceilometer "ceilometer swift module is spamming rsyslog with useless logs" [Critical,Triaged]
15:14:25 <terriyu> nprivalova: thanks
15:14:50 <gordc> nprivalova: i'm ok with discussing this after the other topics.
15:14:56 <gordc> seems like it'll take a while.
15:15:00 <nprivalova> yep
15:15:04 <jd__> me too
15:15:07 <jd__> is there any other topic?
15:17:36 <nprivalova> looks like no
15:17:38 <jd__> I guess not :)
15:17:46 <jd__> #topic Tempest integration
15:18:15 <nprivalova> does everybody know the whole story about tempest? :)
15:18:27 <jd__> what whole story?
15:18:48 <jd__> or what do you call the whole story? :)
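
[Editor's note on the bug 1294789 discussion above] The root issue is that ceilometer's publishing code runs as middleware inside the swift proxy and its DEBUG messages end up in swift's proxy log and rsyslog. As a hedged illustration of the per-logger filtering gordc alludes to (a sketch using only the standard library logging module, not the code at ceilometer/service.py#L117 and not necessarily the fix that was eventually adopted; the logger names are assumptions):

    import logging

    # Sketch: raise the level of the ceilometer loggers inside the swift
    # proxy process so their DEBUG output no longer reaches the proxy log.
    # The logger names below are assumptions about what getLogger() is
    # called with in the middleware; adjust them to the real names.
    for name in ('ceilometer', 'ceilometer.publisher'):
        logging.getLogger(name).setLevel(logging.WARNING)
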
15:19:10 <terriyu> nprivalova: I'd like to hear stories :)
15:19:11 <nprivalova> I've started a thread: [Ceilometer][QA][Tempest][Infra] Ceilometer tempest testing in gate
15:19:28 <jd__> ok I saw that one I think but I didn't read everything yet
15:19:53 <nprivalova> so we cannot run tempest tests in the gate as it is
15:20:06 <nprivalova> there are at least 2 blockers
15:20:13 <nprivalova> 1. DBDeadlock
15:20:26 <nprivalova> 2. One collector
15:21:06 <nprivalova> but Sean noticed that cpu is very high even with 1 collector
15:21:26 <gordc> nprivalova: i strung together all my patches to see if my multi-collector and dbdeadlock patches work well
15:21:49 <jd__> cool
15:21:58 <nprivalova> so my plan is to continue the investigation and do some profiling
15:22:01 <jd__> clearly we need to improve that so all efforts should be oriented on that now
15:22:16 <jd__> future kudos to gordc and nprivalova then :)
15:22:17 <gordc> nprivalova: i'd imagine the cpu would be high with one collector. we're essentially passing hundreds/thousands of messages to one source and letting it churn away.
15:23:15 <Alexei_9871> gordc: it can't be the reason for high cpu
15:23:34 <Alexei_9871> hundreds/thousands of messages would cause high io latency, not cpu
15:24:09 <Alexei_9871> it seems that we are creating/processing lots of objects somehow
15:24:32 <gordc> Alexei_9871: sure... that and the load will always be at 100%.
15:24:51 <jd__> Alexei_9871: that's a good hypothesis I think
15:24:55 <jd__> doing profiling would help
15:24:57 <Alexei_9871> gordc: yes, but multiple collectors on the same server won't help us
15:25:08 <Alexei_9871> jd__: nprivalova and I are now working on it
15:25:17 <nprivalova> #action provide profiling results
15:25:41 <nprivalova> doesn't work :)
15:25:55 <Alexei_9871> jd__: we discussed the one collector issue a little bit and a better solution would be to use several native threads instead of workers
15:26:19 <Alexei_9871> jd__: what do you think?
15:26:42 <jd__> Alexei_9871: nothing is thread safe
15:27:05 <jd__> I'm not against it but you won't manage to do that until 2016 ;)
15:27:07 <Alexei_9871> jd__: we still have the GIL :)
15:27:23 <jd__> that too, so the perf would be less than having several workers anyway
15:27:29 <jd__> but I think it's out of scope here
15:27:34 <jd__> anything else on testing?
15:27:40 <Alexei_9871> jd__: the problem with several workers is huge memory consumption
15:27:56 <nprivalova> yep, we've created tests for pollsters
15:28:16 <llu-laptop> Alexei: how huge on the collector?
15:28:29 <gordc> llu-laptop: same question
15:29:25 <nprivalova> I think we will get this info after profiling
15:29:32 <jd__> ok, moving on then
15:29:40 <jd__> let's come back to that at the end or after the meeting if you want
15:29:43 <jd__> #topic Release python-ceilometerclient?
15:29:48 <jd__> eglynn-afk: around maybe? :)
15:29:54 <jd__> I think we still have patches in the queue
15:30:06 <ildikov_> jd__: you're right, we still have some left
15:30:22 <jd__> ok and we still have time to release, so let's wait a bit :)
15:30:28 <jd__> #topic Open discussion
15:30:39 <ildikov_> jd__: there was a gate issue with the clients' pypy gate, it's temporarily fixed, I hope it will help in the process
15:31:29 <Alexei_9871> llu-laptop: around 50Mb * num_workers
15:31:46 <jd__> not that terrible
15:31:55 <jd__> ildikov_: ok
15:36:18 <jd__> nothing else? closing in a minute then
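
[Editor's note on the profiling action item above] A minimal cProfile sketch for measuring where the collector spends its CPU time; profile_call is a hypothetical helper, and the callable passed to it stands in for whichever collector code path is under investigation:

    import cProfile
    import pstats

    def profile_call(func, *args, **kwargs):
        # Profile one call and print the 20 most expensive entries by
        # cumulative time; wrap whatever collector entry point is suspected
        # of burning CPU (the choice of target is up to the investigator).
        profiler = cProfile.Profile()
        profiler.enable()
        try:
            return func(*args, **kwargs)
        finally:
            profiler.disable()
            pstats.Stats(profiler).sort_stats('cumulative').print_stats(20)
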
15:36:41 <terriyu> thanks to everyone who helped anamalagon with her OPW application
15:36:49 <terriyu> We'll find out April 21 if she got into the program
15:36:59 <jd__> cool
15:37:32 <tongli> hi, can I ask a question regarding _ThreadPoolWithWait?
15:37:49 <tongli> I asked the question earlier in the ceilometer channel, but got no response.
15:37:54 <ildikov_> terriyu: np, she's very active, so it's easy to help her :)
15:38:19 <tongli> anybody know the reason why we want the threadPool to wait?
15:38:25 <jd__> tongli: maybe nobody has the answer :)
15:38:57 <tongli> I noticed some rather strange behavior.
15:39:11 <terriyu> ildikov_: awesome :) I'll pass on the compliment
15:39:25 <tongli> when a message arrives in the collector, if we have like 64 threads in the pool (default), then the thread will
15:39:45 <tongli> hold for a long time, until all 64 threads get some tasks, then all the threads start execution.
15:40:03 <tongli> took a long time to figure this out, wonder if anyone knows anything about this.
15:40:57 <ildikov_> terriyu: cool, thanks :)
15:42:40 <jd__> wrapping up then, let's go on #openstack-ceilometer if needed :)
15:42:52 <jd__> #endmeeting
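
[Editor's note on the _ThreadPoolWithWait question above] The behavior tongli describes is consistent with eventlet's cooperative scheduling: greenthreads spawned into a pool do not start running until the spawning code yields, and with spawn_n that yield typically only happens once the pool is full and the internal semaphore blocks. A minimal sketch of that general GreenPool behavior (an illustration only, not the actual _ThreadPoolWithWait implementation):

    import eventlet
    eventlet.monkey_patch()

    pool = eventlet.GreenPool(64)

    def handle(msg):
        # Stand-in for the collector's per-message processing.
        print('processing %s' % msg)

    for i in range(200):
        # spawn_n only schedules the greenthread; nothing runs until the
        # current greenthread yields. Without an explicit eventlet.sleep(0)
        # after each spawn, handlers tend to start only once the pool fills
        # up and spawn_n blocks -- similar to the behavior described above.
        pool.spawn_n(handle, i)

    pool.waitall()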