15:01:00 <eglynn> #startmeeting ceilometer
15:01:01 <openstack> Meeting started Thu Apr 24 15:01:00 2014 UTC and is due to finish in 60 minutes. The chair is eglynn. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:01:02 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:01:04 <openstack> The meeting name has been set to 'ceilometer'
15:01:09 <gordc> o/
15:01:11 <DinaBelova> o/
15:01:15 <ildikov> o/
15:01:15 <llu-laptop> o/
15:01:17 <fabio> Hello!
15:01:18 <ityaptin> o/
15:01:28 <nealph> o/
15:01:41 <nsaje> o/
15:01:56 <sileht> o/
15:02:03 <jd__> o/
15:02:11 <eglynn> welcome all! ... the new timeslot seems to suit more peeps at a 1st glance
15:02:36 <ildikov> eglynn: +1 :)
15:02:42 <_nadya_> o/
15:02:47 <eglynn> #topic summit scheduling
15:02:55 <dhellmann> o/
15:02:57 <eglynn> so the ceilometer track is gonna have 10 slots this time round
15:03:05 <eglynn> (one less than either HK or Portland)
15:03:19 <eglynn> all day Wed plus morning slots on the Thurs
15:03:28 <prad> o/
15:03:36 <eglynn> (clashing with Heat sessions unfortunately, but c'est la vie ...)
15:03:46 <nealph> I count 18 proposals at this point...
15:04:08 <eglynn> nealph: yeap, we're over-subscribed to the tune of circa 2:1
15:04:19 * nealph smells a consolidation coming
15:04:22 <eglynn> ... which is *good* as it shows interest in the project
15:04:28 <eglynn> nealph: :)
15:04:31 <dhellmann> nealph: those decisions are why the PTLs get paid the big bucks ;-)
15:04:45 <nealph> riiiiight. :)
15:04:48 <ildikov> we had several shared slots in HK too, as I remember
15:04:54 <ildikov> dhellmann: LOL :)
15:05:12 <eglynn> dhellmann: I only accept payment in the form of the finest craft beer ;)
15:05:35 <eglynn> yeah so the downside is the high contention for the available slots
15:05:57 <eglynn> the ceilo core team has been working on a "collaborative scheduling" exercise
15:06:17 <eglynn> I'll translate the output of that into something vaguely coherent on sched.org by EoW
15:06:34 <eglynn> ideally all these discussions would be done on gerrit in the open
15:06:43 <eglynn> maybe we'll get to that for Paris ...
15:07:14 <eglynn> also new for this summit ... there'll be a dedicated space for overflow sessions from each project track
15:07:20 <dhellmann> eglynn: the source for the current proposal system is in git somewhere, so we could possibly add features
15:07:47 <eglynn> dhellmann: cool, let's think about that far enough in advance of the K* summit
15:07:53 * dhellmann nods
15:08:12 <eglynn> see https://wiki.openstack.org/wiki/Summit under "Program pods" for a short blurb on the pod idea
15:08:41 <eglynn> ... inevitably the contention will result in us punting some of the session proposals to the pod
15:09:00 <eglynn> ... we've also identified a bunch of candidate sessions for merging
15:10:08 <nealph> eglynn: looking at the pod description ... seems like some conversations would fit there well and others not so much
15:10:29 <nealph> i.e. collaboration within ceilometer team yes, cross-team no
15:10:40 <eglynn> nealph: yeap, that is true
15:10:54 <nealph> okay ... guessing the core team has that in mind. :)
15:10:58 <eglynn> nealph: cross-team also more likely to suffer from scheduling conflicts
15:12:00 <eglynn> nealph: ... yeah IIRC none of the punts are obviously cross-team proposals, so I think we're good on that point
15:12:01 * nealph sighs
15:12:26 <ildikov> eglynn: I think so too
15:12:46 <eglynn> nealph: sigh == "scheduling is hard" ?
15:13:11 <nealph> cool ... appreciate the work the core team is doing. excited to see the schedule. sighing because we always seem to conflict with heat. :)
15:13:30 <nealph> (and was hoping to talk about tripleo) :)
15:13:42 <eglynn> nealph: yeah I wasn't filled with joy about that conflict either
15:14:15 <eglynn> nealph: ... in previous summits heat and ceilo were kept apart because we had some pressing cross-project issues to discuss
15:14:29 <eglynn> (i.e. autoscaling/alarming)
15:15:42 <nealph> perhaps I'm remembering wrong then ... regardless, will be good sessions I'm sure.
15:15:43 <eglynn> nealph: ... but yeah we've conflicted before also, I guess only so many ways to slice and dice the 4 days
15:16:59 <eglynn> BTW any cores who haven't fully filled in their prefs on the proposals (and want to get their say in), pls do so by EoD today
15:17:13 <eglynn> move on folks?
15:17:25 <DinaBelova> +1
15:17:31 <eglynn> #topic update on f20-based gating
15:17:42 <eglynn> we discussed this last week
15:17:52 <eglynn> BTW https://review.openstack.org/86842 hasn't landed yet
15:18:29 <eglynn> I suspect because the reviewer pulling the trigger on the +1 becomes unofficial "nursemaid" for the new job
15:18:34 <ildikov> eglynn: at least it seems close to it
15:18:36 <DinaBelova> so many +'s :)
15:19:00 <_nadya_> HP doesn't have the image, as I understood
15:19:08 <DinaBelova> only rackspace
15:19:13 <DinaBelova> possibly that's the reason
15:19:22 <_nadya_> yep
15:19:26 <eglynn> _nadya_: yeah, there was push-back from the infra guys on that redundancy issue
15:20:10 <eglynn> _nadya_: at the infra meeting, when you brought it up ... pushback away from f20, towards trusty, right?
15:20:28 <_nadya_> I guess the only thing we may do is to wait
15:20:58 <DinaBelova> possibly after summit we'll have ubuntu14/f20 - both will be cool for mongo
15:21:02 * eglynn not entirely understanding the infra team's logic TBH
15:21:21 <eglynn> not on the image redundancy across HP & RAX clouds
15:21:39 <eglynn> ... more on the point that the TC decides distro policy for projects, infra just implements it
15:21:52 <eglynn> TC policy is ...
15:21:53 <_nadya_> eglynn: jobs may run on hp-cloud or rackspace. it's not determined as I understand (maybe wrong)
15:22:10 <eglynn> #link http://eavesdrop.openstack.org/meetings/tc/2013/tc.2013-01-08-20.02.html
15:22:26 <DinaBelova> _nadya_, yes, that's true
15:22:32 <eglynn> _nadya_: jobs need to be runnable on *both* clouds, or?
15:22:38 <DinaBelova> that's not determined
15:22:40 <_nadya_> eglynn: yep
15:23:00 <DinaBelova> eglynn - if there is no image on HP -> jobs going to it will fail
15:23:11 <DinaBelova> and be ok on rackspace
15:23:12 <eglynn> DinaBelova: yep, agreed
15:23:38 <eglynn> but on the TC policy point, pasting here for reference ...
15:23:51 <eglynn> "OpenStack will target its development efforts to latest Ubuntu/Fedora, but will not introduce any changes that would make it impossible to run on the latest Ubuntu LTS or latest RHEL."
15:24:13 <eglynn> infra interpretation is ...
15:24:18 <eglynn> "basic functionality really ought to be done in the context of one of the long-term distros"
15:25:19 <eglynn> sounds like a tension between target-to-latest and gate-on-long-term-distros
15:25:51 <eglynn> dhellmann: ... were you around on the TC when that distro policy ^^^ was agreed?
15:27:13 <eglynn> ... k, I'll take that discussion on the distro policy off-line
15:27:17 <eglynn> ... moving on
15:27:22 <dhellmann> eglynn: no, I think that predates me
15:27:28 <DinaBelova> :)
15:27:49 <dhellmann> we might have inherited that from the ppb
15:28:14 <eglynn> dhellmann: cool, I'll see if I can clarify with the infra folks at their next meeting
15:28:25 <dhellmann> eglynn: good idea
15:29:05 * eglynn doesn't want to be caught in the mongo-less gate situation again, so having alternative gate distros with the latest versions is goodness IMO
15:29:10 <eglynn> #topic launchpad housekeeping
15:29:25 <eglynn> DinaBelova has identified a bunch of bugs & BPs in LP that need attention
15:29:35 <eglynn> (in terms of status reflecting reality)
15:29:44 <DinaBelova> #link https://etherpad.openstack.org/p/ceilometer-launchpad-cleaning
15:29:49 <DinaBelova> well, yes
15:30:04 <ildikov> maybe we could have some periodic rounds on LP
15:30:21 <DinaBelova> while surfing launchpad it turned out that there are some things that should be fixed I guess
15:30:25 <eglynn> yeah if anyone wants to pitch in and help with the bureaucracy ... we could divide and conquer
15:30:36 <DinaBelova> ildikov, well, there are triage-days in some of the OS projects
15:30:47 <eglynn> maybe use DinaBelova's etherpad as a mutex?
15:30:50 <ildikov> DinaBelova: sounds good
15:31:01 <ildikov> DinaBelova: I thought something similar
15:31:26 <eglynn> i.e. if you're gonna attack a section, mark it on the etherpad so that your effort isn't duplicated?
15:31:29 <DinaBelova> well, they run 1-2 days a month if there's not much load
15:31:39 <DinaBelova> triage-days I mean
15:32:02 <eglynn> yeah, I'd be on for regular triage days in general
15:32:19 <ildikov> eglynn, DinaBelova: or having a round robin schedule for cores and whoever wants to join for checking LP periodically
15:32:29 <ildikov> eglynn: it was an issue earlier too
15:32:41 <ildikov> eglynn: and I think it will be a continuous issue TBH
15:32:45 <DinaBelova> ildikov, eglynn - it's up to you))
15:32:56 <DinaBelova> the solution I mean
15:32:56 <DinaBelova> :)
15:33:02 <DinaBelova> and other core team members, sure)
15:33:07 <ildikov> so I'm definitely on for triage-days or anything similar
15:33:21 <ildikov> DinaBelova: thanks for the etherpad BTW
15:33:31 <eglynn> my preference is to avoid too much heavy-weight scheduling of the core team's time
15:33:35 <DinaBelova> eglynn, please mark this as info - decision about triage days
15:33:40 <DinaBelova> ildikov, np
15:33:41 <ildikov> DinaBelova: in the long term it won't be effective, but it will be good for now as a heads up, for sure
15:34:04 <DinaBelova> ildikov, I guess yes
15:34:16 <eglynn> ... as everyone has chaotic demands on their schedules, hard to make round-robin scheduling stick
15:34:18 <ildikov> eglynn: sure, that's true also
15:34:20 * jd__ used to triage NEW once a week at least :(
15:34:30 <DinaBelova> for now the mess is huge :(
15:34:47 <ildikov> eglynn: it is a painful process anyway I think, no one likes administration...
15:35:00 <DinaBelova> jd__, well, I guess it'll be great - as now there is a traffic jam really))
15:35:07 <eglynn> jd__: right, I'll follow your lead and go with a once-a-week trawl
15:35:15 <gordc> DinaBelova: i'll take a quick look through the list. thanks for building it.
15:35:22 <DinaBelova> gordc, np
15:35:42 <eglynn> ... and if anyone wants to also pitch in on a best-effort basis, that would be welcome also
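(A note on the once-a-week trawl agreed above: the triage query itself is easy to script. The sketch below uses launchpadlib, the standard Launchpad API client; the consumer name and the exact status filter are illustrative, not something the team agreed to.)

    # Minimal sketch of a weekly triage query against Launchpad,
    # assuming launchpadlib is installed (pip install launchpadlib).
    from launchpadlib.launchpad import Launchpad

    # Anonymous login suffices for read-only queries.
    lp = Launchpad.login_anonymously('ceilometer-triage', 'production')
    project = lp.projects['ceilometer']

    # List bugs still in the NEW (untriaged) state.
    for task in project.searchTasks(status=['New']):
        print(task.web_link, '-', task.bug.title)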
15:36:02 <DinaBelova> gordc, the main problem here is that there is also a huge list of completely new bugs/bps
15:36:11 <DinaBelova> and I did not mention them here
15:36:34 <jd__> you can't really clean BPs I think because you can't even delete them, they just rot
15:37:01 <DinaBelova> well, at least we may set priorities for the almost merged things
15:37:12 <llu-laptop> does launchpad have any advanced feature to help with this kind of work?
15:37:14 <DinaBelova> as there are lots of them here too
15:37:32 <DinaBelova> llu-laptop, don't think so
15:37:43 <llu-laptop> :(
15:38:21 <eglynn> what's the thought on moving to gerrit for blueprint review?
15:38:37 <eglynn> ... as was recently discussed for nova on the ML?
15:38:41 <ildikov> eglynn: like nova-specs?
15:38:50 <dhellmann> +1
15:39:03 <eglynn> ... not a solution for existing BP cruft, but might prevent the accretion in the future
15:39:36 <ildikov> eglynn: +1 from me too
15:39:58 <dhellmann> iiuc, those nova specs all have a blueprint, too, and the reviews are only used to track changes to the implementation plan but not the status of the blueprint's implementation or schedule
15:39:59 <eglynn> #action eglynn look into the mechanics of reviewing new BPs in gerrit
15:40:08 <llu-laptop> +1, this
15:40:13 <DinaBelova> ildikov, eglynn but please notice that that will also increase the load on the core reviewers even more
15:40:15 <ildikov> on LP the outdated ones can be set to an invalid state or something like this
15:40:20 <DinaBelova> ... than it is now
15:40:39 <ildikov> DinaBelova: sure, but at least the BPs will be finally reviewed
15:40:45 <DinaBelova> ildikov, sure
15:40:46 <dhellmann> DinaBelova: yes, true, but the tradeoff is that code reviews should be easier, because we would have agreed to the design in advance
15:40:55 <DinaBelova> and they won't appear on LP without need
15:41:03 <DinaBelova> dhellmann +1
15:41:14 <eglynn> DinaBelova: ... true enough, but we'd have to look at that extra upfront workload as an investment for the future
15:41:16 <ildikov> dhellmann: +1
15:41:53 <eglynn> k, we're up against the shot-clock here so better move on
15:41:56 <llu-laptop> b.t.w. will the approval of a ceilometer-specs patch be reflected on the launchpad blueprint?
15:42:31 <eglynn> llu-laptop: my understanding was that "approved" status would be gated on the gerrit review of the BP
15:42:48 <eglynn> llu-laptop: ... as opposed to just being set on an ad-hoc basis
15:43:16 <llu-laptop> eglynn: got that, please move on
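(On the #action above: once blueprints are reviewed as spec patches, the backlog becomes queryable like any other gerrit project. A rough sketch against gerrit's standard REST API follows; the 'openstack/ceilometer-specs' project name is hypothetical, since no such repo existed yet at the time of the meeting.)

    # Sketch: list open spec reviews via gerrit's REST API.
    # 'openstack/ceilometer-specs' is hypothetical; substitute any real
    # project, e.g. 'openstack/tempest'.
    import json
    import requests

    resp = requests.get(
        'https://review.openstack.org/changes/',
        params={'q': 'project:openstack/ceilometer-specs status:open'})
    # Gerrit prepends ")]}'" to JSON responses to defeat XSSI; strip
    # that first line before parsing.
    changes = json.loads(resp.text.split('\n', 1)[1])
    for change in changes:
        print(change['_number'], change['subject'])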
15:43:23 <eglynn> #topic Tempest integration
15:43:31 <eglynn> _nadya_: anything new to report?
15:44:11 <DinaBelova> btw these patches are the following
15:44:12 <_nadya_> no valuable updates. Postgres doesn't work quickly enough either :(
15:44:19 <DinaBelova> #link https://review.openstack.org/#/q/status:open+project:openstack/tempest+owner:vrovachev,n,z
15:44:45 <_nadya_> DinaBelova: some got abandoned today or yesterday
15:44:55 <eglynn> DinaBelova: so those patches are still effectively blocked from landing by the sqla performance issues?
15:44:56 <DinaBelova> already restored)
15:45:02 <DinaBelova> eglynn, yes
15:45:13 <DinaBelova> we're blocked on the f20/ubuntu14
15:45:21 <DinaBelova> with mongo
15:45:31 <DinaBelova> which works about 30x faster
15:45:33 <_nadya_> we may move on to 'performance tests' :)
15:45:43 <eglynn> k, so no change then until we sort out the sqla issues and/or gate the longer-running tests on mongo
15:45:48 <DinaBelova> _nadya_, yes)
15:46:04 <_nadya_> eglynn: yep
15:46:04 <DinaBelova> eglynn, yes, it looks so
15:46:23 <eglynn> fair enough, so that dovetails nicely with the next topic
15:46:30 <eglynn> #topic Performance testing
15:46:41 <eglynn> ityaptin: the floor is yours, sir!
15:47:10 <ityaptin> as you know, we started performance testing
15:47:54 <DinaBelova> #link https://docs.google.com/document/d/1ARpKiYW2WN94JloG0prNcLjMeom-ySVhe8fvjXG_uRU/edit?usp=sharing
15:47:57 <ityaptin> we're testing mysql, mongo, and hbase - standalone backends, plus an hbase cluster on VMs
15:48:01 <eglynn> ityaptin: it was on the meeting agenda and discussed on the IRC channel yesterday
15:48:56 <ityaptin> and I'd ask all core reviewers to take a look at this document
15:49:19 <DinaBelova> as currently we've got feedback from you, eglynn, I guess, and that's it
15:49:22 <DinaBelova> I mean from the core team
15:49:43 <ityaptin> test results show mysql working more than 30 times slower than hbase or mongo
15:49:47 <DinaBelova> dhellmann, ildikov, jd__, gordc ^^
15:49:50 <ildikov> ityaptin: I think it will be very useful after having the revised SQL models, etc
15:50:08 <eglynn> DinaBelova, ityaptin: so I was also wondering if the test harness used to generate the load was up on github or somewhere similar?
15:50:14 <dhellmann> does that include any tuning of mysql itself? or changes to our indexes?
15:50:29 <ityaptin> ildikov, yes)
15:50:38 <ildikov> so we can compare how much better the situation is
15:50:53 <_nadya_> to evaluate the "new model" we need results for the old one, to compare
15:51:02 <eglynn> ityaptin: ... having the loadgen logic in the public domain would be really useful for anyone following up on your results
15:51:05 <ityaptin> eglynn, not yet) but I can, if you want)
15:51:15 <_nadya_> dhellmann: no, no tuning
15:51:21 <eglynn> ildikov: excellent, that would be great!
15:51:23 <DinaBelova> ityaptin, I guess it'll be better to share it anyway
15:51:29 <eglynn> ityaptin: ^^^
15:51:47 <eglynn> darned tab completion of irc nicks!
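(The load-generation harness discussed above wasn't public at this point. For readers of the results doc, a minimal sketch of the general time-per-message measurement shape follows; record_sample and make_sample are placeholders for whatever backend write and sample factory are under test, not actual ceilometer code.)

    # Illustrative time-per-message harness; record_sample stands in for
    # the backend write under test (mysql/mongo/hbase driver call).
    import time

    def measure(record_sample, make_sample, n_messages=100000, bucket=1000):
        """Print mean seconds-per-message over each bucket of writes."""
        for start in range(0, n_messages, bucket):
            t0 = time.time()
            for i in range(start, start + bucket):
                record_sample(make_sample(i))
            elapsed = time.time() - t0
            print('%6d..%6d: %.6f s/msg' % (start, start + bucket - 1,
                                            elapsed / bucket))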
15:51:57 <DinaBelova> okay, and one more time - core team - please take a look at the results
15:52:05 <DinaBelova> and please propose your own cases/ideas
15:52:18 <ildikov> eglynn: np ;)
15:52:20 <DinaBelova> as ityaptin is going to continue this work
15:52:21 <dhellmann> _nadya_: it would be useful to have some profiling info about what part of the sql driver is so slow, to see if tuning helps
15:52:26 <_nadya_> so the main plea is "please tell us what results you want to see"
15:52:30 <dhellmann> mongo was really slow at one point, until we improved the indexes there
15:53:17 <ityaptin> dhellmann, I'll try to find the bottleneck in mysql
15:53:22 <_nadya_> dhellmann: ok, will take that into account
15:53:23 <gordc> DinaBelova: this is with a single worker?... i get the feeling sql will never work unless with multiple workers (it was consistently 10x slower for me)
15:53:46 <DinaBelova> gordc, yes, a single one for now
15:54:09 <llu-laptop> gordc: there is a 3-collector setup
15:54:13 <DinaBelova> gordc, ityaptin is planning to try other variants too
15:54:17 <ildikov> dhellmann: did you mean SQLAlchemy as the driver there?
15:54:17 <ityaptin> dhellmann, if we have some results, they will be shown
15:54:47 <eglynn> DinaBelova, ityaptin, _nadya_: any ideas on the cause of the saw-tooth pattern observed in time-per-message for hbase as the load is scaled up in the 3-collector case?
15:55:13 <ityaptin> we had 2 ideas:
15:55:15 <dhellmann> I wonder if there's a more efficient way to implement _create_or_update() in that sql driver
15:55:26 <eglynn> ... i.e. the semi-regular spikes in the time-per-message
15:55:46 <ityaptin> 1) it's the connection pool size in hbase and mongo - that turned out not to be true
15:55:50 <gordc> the api response times are concerning. especially for only 100k samples... i was testing with 1 million and it was comparable/better than that on sql.
15:56:09 <ildikov> _create_or_update was changed not so long ago, or was it just planned to be changed?
15:56:13 <ityaptin> we tested different pool sizes in hbase and all results show the same pattern
15:56:38 <gordc> dhellmann: i'd hope to drop the _create_or_update logic... the update option is a bottleneck.
15:56:39 <_nadya_> eglynn: ityaptin is speaking about the 1-collector case
15:56:40 <ityaptin> 2) greenpool size
15:56:48 <dhellmann> gordc: yeah
15:57:32 <ityaptin> green pool size is not tested yet.
15:57:36 <_nadya_> eglynn: there is a peak there with 400 m/s. Regarding the 3-collectors case I don't know the answer yet
15:57:49 <gordc> fyi, this is the etherpad i created for the ceilometer reschema session: https://etherpad.openstack.org/p/ceilometer-schema
15:57:54 <eglynn> _nadya_: k
15:58:25 <_nadya_> eglynn: actually we haven't tuned hbase yet. only schema optimization
15:58:39 <DinaBelova> ok, any questions here? as we're close to the end of the meeting
15:58:58 <eglynn> gordc: yeah re. API responsiveness, the "was 141.458 a typo or in seconds?" comment is revealing
15:59:23 <eglynn> ... that sounds unusable as an API
15:59:31 <ityaptin> :(
15:59:38 <DinaBelova> eglynn, yeah...
15:59:39 <gordc> eglynn: agreed.. especially for such a small set.
15:59:55 <eglynn> ... would cause an LB or haproxy to drop the connection long before the API call completes :(
16:00:17 <eglynn> right, we have a lot of work to do on performance
16:00:24 <eglynn> but we're outta time now
16:00:31 <DinaBelova> eglynn, +1
16:00:40 <eglynn> let's continue the discussion in ATL
16:00:53 <eglynn> thanks folks! ... let's close now
16:01:01 <DinaBelova> bye
16:01:01 <eglynn> #endmeeting ceilometer
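(A closing note on the _create_or_update() exchange above: the usual alternative to a read-then-write round trip is to insert optimistically and fall back to a select when a unique constraint fires. The SQLAlchemy sketch below shows that generic pattern only; it is not the actual ceilometer driver code, and it assumes the model carries a unique constraint over the key columns.)

    # Generic "insert first, select on conflict" pattern in SQLAlchemy;
    # avoids the read-modify-write round trip that makes update a bottleneck.
    from sqlalchemy.exc import IntegrityError

    def get_or_create(session, model, **keys):
        try:
            # Savepoint, so a duplicate insert doesn't poison the
            # enclosing transaction.
            with session.begin_nested():
                obj = model(**keys)
                session.add(obj)
            return obj
        except IntegrityError:
            # Row already existed; the unique constraint rejected the insert.
            return session.query(model).filter_by(**keys).one()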