15:01:00 <eglynn> #startmeeting ceilometer
15:01:01 <openstack> Meeting started Thu Apr 24 15:01:00 2014 UTC and is due to finish in 60 minutes. The chair is eglynn. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:01:02 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:01:04 <openstack> The meeting name has been set to 'ceilometer'
15:01:09 <gordc> o/
15:01:11 <DinaBelova> o/
15:01:15 <ildikov> o/
15:01:15 <llu-laptop> o/
15:01:17 <fabio> Hello!
15:01:18 <ityaptin> o/
15:01:28 <nealph> o/
15:01:41 <nsaje> o/
15:01:56 <sileht> o/
15:02:03 <jd__> o/
15:02:11 <eglynn> welcome all! ... the new timeslot seems to suit more peeps at a 1st glance
15:02:36 <ildikov> eglynn: +1 :)
15:02:42 <_nadya_> o/
15:02:47 <eglynn> #topic summit scheduling
15:02:55 <dhellmann> o/
15:02:57 <eglynn> so the ceilometer track is gonna have 10 slots this time round
15:03:05 <eglynn> (one less than either HK or Portland)
15:03:19 <eglynn> all day Wed plus morning slots on the Thurs
15:03:28 <prad> o/
15:03:36 <eglynn> (clashing with Heat sessions unfortunately, but c'est la vie ...)
15:03:46 <nealph> I count 18 proposals at this point...
15:04:08 <eglynn> nealph: yeap, we're over-subscribed to the tune of circa 2:1
15:04:19 * nealph smells a consolidation coming
15:04:22 <eglynn> ... which is *good* as it shows interest in the project
15:04:28 <eglynn> nealph: :)
15:04:31 <dhellmann> nealph: those decisions are why the PTLs get paid the big bucks ;-)
15:04:45 <nealph> riiiiight. :)
15:04:48 <ildikov> we had several shared slots in HK too, as I remember
15:04:54 <ildikov> dhellmann: LOL :)
15:05:12 <eglynn> dhellmann: I only accept payment in the form of the finest craft beer ;)
15:05:35 <eglynn> yeah so the downside is the high contention for the available slots
15:05:57 <eglynn> the ceilo core team has been working on a "collaborative scheduling" exercise
15:06:17 <eglynn> I'll translate the output of that into something vaguely coherent on sched.org by EoW
15:06:34 <eglynn> ideally all these discussions would be done on gerrit in the open
15:06:43 <eglynn> maybe we'll get to that for Paris ...
15:07:14 <eglynn> also new for this summit ... there'll be a dedicated space for overflow sessions from each project track
15:07:20 <dhellmann> eglynn: the source for the current proposal system is in git somewhere, so we could possibly add features
15:07:47 <eglynn> dhellmann: cool, let's think about that far enough in advance of the K* summit
15:07:53 * dhellmann nods
15:08:12 <eglynn> see https://wiki.openstack.org/wiki/Summit under "Program pods" for a short blurb on the pod idea
15:08:41 <eglynn> ... inevitably the contention will result in us punting some of the session proposals to the pod
15:09:00 <eglynn> ... we've also identified a bunch of candidate sessions for merging
15:10:08 <nealph> eglynn: looking at the pod description ... seems like some conversations would fit there well and others not so much
15:10:29 <nealph> i.e. collaboration within ceilometer team yes, cross-team no
15:10:40 <eglynn> nealph: yeap, that is true
15:10:54 <nealph> okay ... guessing the core team has that in mind. :)
15:10:58 <eglynn> nealph: cross-team also more likely to suffer from scheduling conflicts
15:12:00 <eglynn> nealph: ... yeah IIRC none of the punts are obviously cross-team proposals, so I think we're good on that point
15:12:01 * nealph sighs
15:12:26 <ildikov> eglynn: I think so too
15:12:46 <eglynn> nealph: sigh == "scheduling is hard" ?
15:13:11 <nealph> cool ... appreciate the work the core team is doing. excited to see the schedule. sighing because we always seem to conflict with heat. :)
15:13:30 <nealph> (and was hoping to talk about tripleo) :)
15:13:42 <eglynn> nealph: yeah I wasn't filled with joy about that conflict either
15:14:15 <eglynn> nealph: ... in previous summits heat and ceilo were kept apart because we had some pressing cross-project issues to discuss
15:14:29 <eglynn> (i.e. autoscaling/alarming)
15:15:42 <nealph> perhaps I'm remembering wrong then ... regardless, will be good sessions I'm sure.
15:15:43 <eglynn> nealph: ... but yeah we've conflicted before also, I guess only so many ways to slice and dice the 4 days
15:16:59 <eglynn> BTW any cores who haven't fully filled in their prefs on the proposals (and want to get their say in), pls do so by EoD today
15:17:13 <eglynn> move on folks?
15:17:25 <DinaBelova> +1
15:17:31 <eglynn> #topic update on f20-based gating
15:17:42 <eglynn> we discussed this last week
15:17:52 <eglynn> BTW https://review.openstack.org/86842 hasn't landed yet
15:18:29 <eglynn> I suspect because the reviewer pulling the trigger on the +1 becomes unofficial "nursemaid" for the new job
15:18:34 <ildikov> eglynn: at least it seems close to it
15:18:36 <DinaBelova> so many +'s :)
15:19:00 <_nadya_> HP doesn't have the image, as I understood
15:19:08 <DinaBelova> only rackspace
15:19:13 <DinaBelova> possibly that's the reason
15:19:22 <_nadya_> yep
15:19:26 <eglynn> _nadya_: yeah, there was push-back from the infra guys on that redundancy issue
15:20:10 <eglynn> _nadya_: at the infra meeting, when you brought it up ... pushback away from f20, towards trusty, right?
15:20:28 <_nadya_> I guess the only thing we may do is to wait
15:20:58 <DinaBelova> possibly after summit we'll have ubuntu14/f20 - both will be cool for mongo
15:21:02 * eglynn not entirely understanding the infra team's logic TBH
15:21:21 <eglynn> not on the image redundancy across HP & RAX clouds
15:21:39 <eglynn> ... more on the point that the TC decides distro policy for projects, infra just implements it
15:21:52 <eglynn> TC policy is ...
15:21:53 <_nadya_> eglynn: jobs may run on hp-cloud or rackspace. it's not determined as I understand (maybe wrong)
15:22:10 <eglynn> #link http://eavesdrop.openstack.org/meetings/tc/2013/tc.2013-01-08-20.02.html
15:22:26 <DinaBelova> _nadya_, yes, that's true
15:22:32 <eglynn> _nadya_: jobs need to be runnable on *both* clouds, or?
15:22:38 <DinaBelova> that's not determined
15:22:40 <_nadya_> eglynn: yep
15:23:00 <DinaBelova> eglynn - if there is no image on HP -> jobs going to it will fail
15:23:11 <DinaBelova> and be ok on rackspace
15:23:12 <eglynn> DinaBelova: yep, agreed
15:23:38 <eglynn> but on the TC policy point, pasting here for reference ...
15:23:51 <eglynn> "OpenStack will target its development efforts to latest Ubuntu/Fedora, but will not introduce any changes that would make it impossible to run on the latest Ubuntu LTS or latest RHEL."
15:24:13 <eglynn> infra interpretation is ...
15:24:18 <eglynn> "basic functionality really ought to be done in the context of one of the long-term distros"
15:25:19 <eglynn> sounds like a tension between target-to-latest and gate-on-long-term-distros
15:25:51 <eglynn> dhellmann: ... were you around on the TC when that distro policy ^^^ was agreed?
15:27:13 <eglynn> ... k, I'll take that discussion on the distro policy off-line
15:27:17 <eglynn> ... moving on
15:27:22 <dhellmann> eglynn: no, I think that predates me
15:27:28 <DinaBelova> :)
15:27:49 <dhellmann> we might have inherited that from the ppb
15:28:14 <eglynn> dhellmann: cool, I'll see if I can clarify with the infra folks at their next meeting
15:28:25 <dhellmann> eglynn: good idea
15:29:05 * eglynn doesn't want to be caught in the mongo-less gate situation again, so having alternative gate distros with the latest versions is goodness IMO
15:29:10 <eglynn> #topic launchpad housekeeping
15:29:25 <eglynn> DinaBelova has identified a bunch of bugs & BPs in LP that need attention
15:29:35 <eglynn> (in terms of status reflecting reality)
15:29:44 <DinaBelova> #link https://etherpad.openstack.org/p/ceilometer-launchpad-cleaning
15:29:49 <DinaBelova> well, yes
15:30:04 <ildikov> maybe we could have some periodic rounds on LP
15:30:21 <DinaBelova> while surfing launchpad it turned out that there are some things that should be fixed I guess
15:30:25 <eglynn> yeah if anyone wants to pitch in and help with the bureaucracy ... we could divide and conquer
15:30:36 <DinaBelova> ildikov, well, there are triage-days in some of the OS projects
15:30:47 <eglynn> maybe use DinaBelova's etherpad as a mutex?
15:30:50 <ildikov> DinaBelova: sounds good
15:31:01 <ildikov> DinaBelova: I thought something similar
15:31:26 <eglynn> i.e. if you're gonna attack a section, mark it on the etherpad so that your effort isn't duplicated?
15:31:29 <DinaBelova> well, they run 1-2 days a month if there's not much load
15:31:39 <DinaBelova> triage-days I mean
15:32:02 <eglynn> yeah, I'd be on for regular triage days in general
15:32:19 <ildikov> eglynn, DinaBelova: or having a round robin schedule for cores and whoever wants to join for checking LP periodically
15:32:29 <ildikov> eglynn: it was an issue earlier too
15:32:41 <ildikov> eglynn: and I think it will be a continuous issue TBH
15:32:45 <DinaBelova> ildikov, eglynn - it's up to you))
15:32:56 <DinaBelova> the solution I mean
15:32:56 <DinaBelova> :)
15:33:02 <DinaBelova> and other core team members, sure)
15:33:07 <ildikov> so I'm definitely on for triage-days or anything similar
15:33:21 <ildikov> DinaBelova: thanks for the etherpad BTW
15:33:31 <eglynn> my preference is to avoid too much heavy-weight scheduling of the core team's time
15:33:35 <DinaBelova> eglynn, please mark this as info - decision about triage days
15:33:40 <DinaBelova> ildikov, np
15:33:41 <ildikov> DinaBelova: in the long term it won't be effective, but it will be good for now as a heads up, for sure
15:34:04 <DinaBelova> ildikov, I guess yes
15:34:16 <eglynn> ... as everyone has chaotic demands on their schedules, hard to make round-robin scheduling stick
15:34:18 <ildikov> eglynn: sure, that's true also
15:34:20 * jd__ used to triage NEW once a week at least :(
15:34:30 <DinaBelova> for now the mess is huge :(
15:34:47 <ildikov> eglynn: it is a painful process anyway I think, no one likes administration...
15:35:00 <DinaBelova> jd__, well, I guess it'll be great - as now there is a traffic jam really))
15:35:07 <eglynn> jd__: right, I'll follow your lead and go with a once-a-week trawl
15:35:15 <gordc> DinaBelova: i'll take a quick look through the list. thanks for building it.
15:35:22 <DinaBelova> gordc, np
15:35:42 <eglynn> ... and if anyone wants to also pitch in on a best-effort basis, that would be welcome also
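(A note on the once-a-week trawl agreed above: the triage query itself is easy to script. The sketch below uses launchpadlib, the standard Launchpad API client; the consumer name and the exact status filter are illustrative, not something the team agreed to.)

    # Minimal sketch of a weekly triage query against Launchpad,
    # assuming launchpadlib is installed (pip install launchpadlib).
    from launchpadlib.launchpad import Launchpad

    # Anonymous login suffices for read-only queries.
    lp = Launchpad.login_anonymously('ceilometer-triage', 'production')
    project = lp.projects['ceilometer']

    # List bugs still in the NEW (untriaged) state.
    for task in project.searchTasks(status=['New']):
        print(task.web_link, '-', task.bug.title)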
15:36:02 <DinaBelova> gordc, the main problem here is that there is also a huge list of completely new bugs/bps
15:36:11 <DinaBelova> and I did not mention them here
15:36:34 <jd__> you can't really clean BPs I think because you can't even delete them, they just rot
15:37:01 <DinaBelova> well, at least we may set priorities for the almost merged things
15:37:12 <llu-laptop> does launchpad have any advanced feature to help with this kind of work?
15:37:14 <DinaBelova> as there are lots of them here too
15:37:32 <DinaBelova> llu-laptop, don't think so
15:37:43 <llu-laptop> :(
15:38:21 <eglynn> what's the thought on moving to gerrit for blueprint review?
15:38:37 <eglynn> ... as was recently discussed for nova on the ML?
15:38:41 <ildikov> eglynn: like nova-specs?
15:38:50 <dhellmann> +1
15:39:03 <eglynn> ... not a solution for existing BP cruft, but might prevent the accretion in the future
15:39:36 <ildikov> eglynn: +1 from me too
15:39:58 <dhellmann> iiuc, those nova specs all have a blueprint, too, and the reviews are only used to track changes to the implementation plan but not the status of the blueprint's implementation or schedule
15:39:59 <eglynn> #action eglynn look into the mechanics of reviewing new BPs in gerrit
15:40:08 <llu-laptop> +1, this
15:40:13 <DinaBelova> ildikov, eglynn but please notice that that will also increase the load on the core reviewers even more
15:40:15 <ildikov> on LP the outdated ones can be set to an invalid state or something like this
15:40:20 <DinaBelova> ... than it is now
15:40:39 <ildikov> DinaBelova: sure, but at least the BPs will be finally reviewed
15:40:45 <DinaBelova> ildikov, sure
15:40:46 <dhellmann> DinaBelova: yes, true, but the tradeoff is that code reviews should be easier, because we would have agreed to the design in advance
15:40:55 <DinaBelova> and they won't appear on LP without need
15:41:03 <DinaBelova> dhellmann +1
15:41:14 <eglynn> DinaBelova: ... true enough, but we'd have to look at that extra upfront workload as an investment for the future
15:41:16 <ildikov> dhellmann: +1
15:41:53 <eglynn> k, we're up against the shot-clock here so better move on
15:41:56 <llu-laptop> b.t.w. will the approval of a ceilometer-specs patch be reflected on the launchpad blueprint?
15:42:31 <eglynn> llu-laptop: my understanding was that "approved" status would be gated on the gerrit review of the BP
15:42:48 <eglynn> llu-laptop: ... as opposed to just being set on an ad-hoc basis
15:43:16 <llu-laptop> eglynn: got that, please move on
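(On the #action above: once blueprints are reviewed as spec patches, the backlog becomes queryable like any other gerrit project. A rough sketch against gerrit's standard REST API follows; the 'openstack/ceilometer-specs' project name is hypothetical, since no such repo existed yet at the time of the meeting.)

    # Sketch: list open spec reviews via gerrit's REST API.
    # 'openstack/ceilometer-specs' is hypothetical; substitute any real
    # project, e.g. 'openstack/tempest'.
    import json
    import requests

    resp = requests.get(
        'https://review.openstack.org/changes/',
        params={'q': 'project:openstack/ceilometer-specs status:open'})
    # Gerrit prepends ")]}'" to JSON responses to defeat XSSI; strip
    # that first line before parsing.
    changes = json.loads(resp.text.split('\n', 1)[1])
    for change in changes:
        print(change['_number'], change['subject'])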
15:43:23 <eglynn> #topic Tempest integration
15:43:31 <eglynn> _nadya_: anything new to report?
15:44:11 <DinaBelova> btw these patches are the following
15:44:12 <_nadya_> no valuable updates. Postgres doesn't work quickly enough either :(
15:44:19 <DinaBelova> #link https://review.openstack.org/#/q/status:open+project:openstack/tempest+owner:vrovachev,n,z
15:44:45 <_nadya_> DinaBelova: some got abandoned today or yesterday
15:44:55 <eglynn> DinaBelova: so those patches are still effectively blocked from landing by the sqla performance issues?
15:44:56 <DinaBelova> already restored)
15:45:02 <DinaBelova> eglynn, yes
15:45:13 <DinaBelova> we're blocked on the f20/ubuntu14
15:45:21 <DinaBelova> with mongo
15:45:31 <DinaBelova> which works about 30x faster
15:45:33 <_nadya_> we may move on to 'performance tests' :)
15:45:43 <eglynn> k, so no change then until we sort out the sqla issues and/or gate the longer-running tests on mongo
15:45:48 <DinaBelova> _nadya_, yes)
15:46:04 <_nadya_> eglynn: yep
15:46:04 <DinaBelova> eglynn, yes, it looks so
15:46:23 <eglynn> fair enough, so that dovetails nicely with the next topic
15:46:30 <eglynn> #topic Performance testing
15:46:41 <eglynn> ityaptin: the floor is yours, sir!
15:47:10 <ityaptin> as you know, we started performance testing
15:47:54 <DinaBelova> #link https://docs.google.com/document/d/1ARpKiYW2WN94JloG0prNcLjMeom-ySVhe8fvjXG_uRU/edit?usp=sharing
15:47:57 <ityaptin> we're testing mysql, mongo, and hbase - standalone backends, plus an hbase cluster on VMs
15:48:01 <eglynn> ityaptin: it was on the meeting agenda and discussed on the IRC channel yesterday
15:48:56 <ityaptin> and I'd ask all core reviewers to take a look at this document
15:49:19 <DinaBelova> as currently we've got feedback from you, eglynn, I guess, and that's it
15:49:22 <DinaBelova> I mean from the core team
15:49:43 <ityaptin> test results show mysql working more than 30 times slower than hbase or mongo
15:49:47 <DinaBelova> dhellmann, ildikov, jd__, gordc ^^
15:49:50 <ildikov> ityaptin: I think it will be very useful after having the revised SQL models, etc
15:50:08 <eglynn> DinaBelova, ityaptin: so I was also wondering if the test harness used to generate the load was up on github or somewhere similar?
15:50:14 <dhellmann> does that include any tuning of mysql itself? or changes to our indexes?
15:50:29 <ityaptin> ildikov, yes)
15:50:38 <ildikov> so we can compare how much better the situation is
15:50:53 <_nadya_> to evaluate the "new model" we need results for the old one, to compare
15:51:02 <eglynn> ityaptin: ... having the loadgen logic in the public domain would be really useful for anyone following up on your results
15:51:05 <ityaptin> eglynn, not yet) but I can, if you want)
15:51:15 <_nadya_> dhellmann: no, no tuning
15:51:21 <eglynn> ildikov: excellent, that would be great!
15:51:23 <DinaBelova> ityaptin, I guess it'll be better to share it anyway
15:51:29 <eglynn> ityaptin: ^^^
15:51:47 <eglynn> darned tab completion of irc nicks!
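(The load-generation harness discussed above wasn't public at this point. For readers of the results doc, a minimal sketch of the general time-per-message measurement shape follows; record_sample and make_sample are placeholders for whatever backend write and sample factory are under test, not actual ceilometer code.)

    # Illustrative time-per-message harness; record_sample stands in for
    # the backend write under test (mysql/mongo/hbase driver call).
    import time

    def measure(record_sample, make_sample, n_messages=100000, bucket=1000):
        """Print mean seconds-per-message over each bucket of writes."""
        for start in range(0, n_messages, bucket):
            t0 = time.time()
            for i in range(start, start + bucket):
                record_sample(make_sample(i))
            elapsed = time.time() - t0
            print('%6d..%6d: %.6f s/msg' % (start, start + bucket - 1,
                                            elapsed / bucket))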
15:51:57 <DinaBelova> okay, and one more time - core team - please take a look at the results
15:52:05 <DinaBelova> and please propose your own cases/ideas
15:52:18 <ildikov> eglynn: np ;)
15:52:20 <DinaBelova> as ityaptin is going to continue this work
15:52:21 <dhellmann> _nadya_: it would be useful to have some profiling info about what part of the sql driver is so slow, to see if tuning helps
15:52:26 <_nadya_> so the main plea is "please tell us what results you want to see"
15:52:30 <dhellmann> mongo was really slow at one point, until we improved the indexes there
15:53:17 <ityaptin> dhellmann, I'll try to find the bottleneck in mysql
15:53:22 <_nadya_> dhellmann: ok, will take that into account
15:53:23 <gordc> DinaBelova: this is with a single worker?... i get the feeling sql will never work unless with multiple workers (it was consistently 10x slower for me)
15:53:46 <DinaBelova> gordc, yes, a single one for now
15:54:09 <llu-laptop> gordc: there is a 3-collector setup
15:54:13 <DinaBelova> gordc, ityaptin is planning to try other variants too
15:54:17 <ildikov> dhellmann: did you mean SQLAlchemy as the driver there?
15:54:17 <ityaptin> dhellmann, if we have some results, they will be shown
15:54:47 <eglynn> DinaBelova, ityaptin, _nadya_: any ideas on the cause of the saw-tooth pattern observed in time-per-message for hbase as the load is scaled up in the 3-collector case?
15:55:13 <ityaptin> we had 2 ideas:
15:55:15 <dhellmann> I wonder if there's a more efficient way to implement _create_or_update() in that sql driver
15:55:26 <eglynn> ... i.e. the semi-regular spikes in the time-per-message
15:55:46 <ityaptin> 1) it's the connection pool size in hbase and mongo - that turned out not to be true
15:55:50 <gordc> the api response times are concerning. especially for only 100k samples... i was testing with 1 million and it was comparable/better than that on sql.
15:56:09 <ildikov> _create_or_update was changed not so long ago, or was it just planned to be changed?
15:56:13 <ityaptin> we tested different pool sizes in hbase and all results show the same pattern
15:56:38 <gordc> dhellmann: i'd hope to drop the _create_or_update logic... the update option is a bottleneck.
15:56:39 <_nadya_> eglynn: ityaptin is speaking about the 1-collector case
15:56:40 <ityaptin> 2) greenpool size
15:56:48 <dhellmann> gordc: yeah
15:57:32 <ityaptin> green pool size is not tested yet.
15:57:36 <_nadya_> eglynn: there is a peak there with 400 m/s. Regarding the 3-collectors case I don't know the answer yet
15:57:49 <gordc> fyi, this is the etherpad i created for the ceilometer reschema session: https://etherpad.openstack.org/p/ceilometer-schema
15:57:54 <eglynn> _nadya_: k
15:58:25 <_nadya_> eglynn: actually we haven't tuned hbase yet. only schema optimization
15:58:39 <DinaBelova> ok, any questions here? as we're close to the end of the meeting
15:58:58 <eglynn> gordc: yeah re. API responsiveness, the "was 141.458 a typo or in seconds?" comment is revealing
15:59:23 <eglynn> ... that sounds unusable as an API
15:59:31 <ityaptin> :(
15:59:38 <DinaBelova> eglynn, yeah...
15:59:39 <gordc> eglynn: agreed.. especially for such a small set.
15:59:55 <eglynn> ... would cause an LB or haproxy to drop the connection long before the API call completes :(
16:00:17 <eglynn> right, we have a lot of work to do on performance
16:00:24 <eglynn> but we're outta time now
16:00:31 <DinaBelova> eglynn, +1
16:00:40 <eglynn> let's continue the discussion in ATL
16:00:53 <eglynn> thanks folks! ... let's close now
16:01:01 <DinaBelova> bye
16:01:01 <eglynn> #endmeeting ceilometer
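(A closing note on the _create_or_update() exchange above: the usual alternative to a read-then-write round trip is to insert optimistically and fall back to a select when a unique constraint fires. The SQLAlchemy sketch below shows that generic pattern only; it is not the actual ceilometer driver code, and it assumes the model carries a unique constraint over the key columns.)

    # Generic "insert first, select on conflict" pattern in SQLAlchemy;
    # avoids the read-modify-write round trip that makes update a bottleneck.
    from sqlalchemy.exc import IntegrityError

    def get_or_create(session, model, **keys):
        try:
            # Savepoint, so a duplicate insert doesn't poison the
            # enclosing transaction.
            with session.begin_nested():
                obj = model(**keys)
                session.add(obj)
            return obj
        except IntegrityError:
            # Row already existed; the unique constraint rejected the insert.
            return session.query(model).filter_by(**keys).one()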