15:00:07 #startmeeting ceilometer
15:00:08 Meeting started Thu Aug 7 15:00:07 2014 UTC and is due to finish in 60 minutes. The chair is eglynn. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:09 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:12 The meeting name has been set to 'ceilometer'
15:00:20 hey y'all!
15:00:27 hello.
15:00:29 o/
15:00:33 o/
15:00:46 hola
15:00:49 o/
15:01:01 <_nadya_> hi!
15:01:16 #topic juno-3 planning
15:01:23 #link https://launchpad.net/ceilometer/+milestone/juno-3
15:01:32 let's focus in on the 2 blocked BPs
15:01:43 first the bigger-data-sql ... seems there's a new patchset pending, but converging on agreement?
15:01:50 #link https://review.openstack.org/#/c/101009
15:01:53 o/
15:02:16 gordc, cdent: does it feel like convergence?
15:02:40 o/
15:02:56 gordc just posted some new comment saying it seems pretty okay, but may have some issues with meter reads, so a bit more poking to come, but it is heading in the right direction
15:03:11 (specially meter-list performance)
15:03:25 #link https://etherpad.openstack.org/p/ceilometer-big-data-pt2
15:03:26 o.
15:03:37 that's the latest proposal basically ^^^ ?
15:03:52 there's active code here: https://review.openstack.org/111313
15:04:25 the decision was that there was no way to reach the ideal schema by thinking along, real code was required
15:04:41 what's implemented there is very close to what's in the etherpad
15:04:49 cool, so we need to get the spec lined up the etherpad and the PoC code in order to get it landed, amiright?
15:05:06 reckon so, yes
15:06:04 cdent: cool, I'll follow up with gordc after the meeting
15:06:13 next central-agent-partitioning ... we have too alternative proposals
15:06:24 we've done a preso/hangout on each this week
15:06:28 so it really is decision time now
15:06:34 * eglynn pastes links for each approach ...
15:06:39 (prepare for flood!)
15:06:49 #link https://review.openstack.org/101282
15:06:50 #link http://www.slideshare.net/FabioGiannetti/ceilometer-central-agent-activeactive-ha-proposal
15:06:52 #link https://review.openstack.org/111978
15:06:53 #link http://www.slideshare.net/EoghanGlynn/hash-based-central-agent-workload-partitioning-37760440
15:07:26 eglynn: can we have some clarifications on the Tooz future support?
15:07:33 can everyone who cares about this feature (e.g. attended the hangouts or participated in the reviews already) please "nail their colours to the mast" on gerrit before EoD today?
15:07:57 eglynn, ok, will be done :)
15:08:14 eglynn, actually I wonder what will be the answer on the fabiog_ quesiton here
15:08:15 eglynn: I'm kinda waiting for the answer of that question that Fabio has just asked and also asked by you on gerrit
15:08:16 fabiog_: since you posed the question on gerrit already, I'll ask jd__ to respond there
15:08:23 ildikov, ++
15:08:32 eglynn, thank you sir
15:08:33 o/
15:08:41 jd__, good evening :)
15:09:00 jd__: question on https://review.openstack.org/101282 about future maintainership of tooz
15:09:19 jd__: sorry wrong link! ... https://review.openstack.org/111978
15:10:11 let's complete that discussion on gerrit
15:10:12 it has support for harlowj and it's going to be put under the Oslo maintenance
15:10:18 s/for/from/
15:10:45 same as stevedore, etc…
15:11:09 jd__: a-ha, cool, that's good news about the oslo aspect
15:11:11 jd__, thanks for the clarification
15:11:26 we're just waiting for some projects to use it before moving it under Oslo
15:11:29 jd__: thanks, this helps
15:11:31 and I'm working as we speak as using it with Nova
15:11:43 (to replace their custome implementation of memcached/zookeeper)
15:11:46 <_nadya_> jd__: what projects? :)
15:12:03 _nadya_: Nova and Neutron are first in sight last time I checked
15:12:04 _nadya_: nova
15:12:10 jd__, I remember kind of this mailing thread - am I right it was already started?
15:12:16 there might be more I'm not aware of
15:12:29 <_nadya_> cool
15:12:42 it's likely that a lot of project have crafted custom solutions that are not so robust and would have a better usage of tooz, but we'll see
15:12:55 and Ceilometer obviously :)
15:13:32 jd__: and on the cards also, possibly new tooz drivers? (e.g. based on olso-messaging)
15:13:47 sure thing, we just need someone to write them
15:13:56 it's not on my priority list, though if I get bored, you never know…
15:14:01 :)
15:14:31 fabiog_: it that sufficient information about tooz future status?
15:14:58 eglynn: sure. I think it responds to my concern
15:15:06 fabiog_: excellent!
15:15:11 jd__: thank you sir!
15:15:29 you're welcome ladies and gentlmen
15:15:58 ok, apart from those 2 blocked BPs, does anyone have any concerns about any other stuff target'd for juno-3?
15:16:14 Gnocchi dispatcher?
15:16:20 * jd__ whispers
15:16:42 jd__: I have several excuses, none of them is good enough for mentioning it here...
15:16:53 :-)
15:17:22 ildikov: are you still confident that it's do-able in the j-3 timeframe?
15:17:30 if anyone would like to jump in, I will not kill him/her, otherwise, I'm on it, just had some environmental issues to fix
15:17:42 ildikov: cool, understood! :)
15:18:03 eglynn: it has to be regardless my concerns...
15:18:17 eglynn: still 4 weeks, right?
15:18:28 ildikov: yep 4 weeks from today
15:18:34 eglynn: cool, tnx
15:18:46 OK, shall we move on?
15:18:55 eglynn: I responded already in Gerrit ;-)
15:19:06 fabiog_: thank you sir!
15:19:10 #topic Ironic conductor ipmi -> ceilometer saga (punted from last week)
15:19:19 cdent: the floor is your's sir!
15:20:03 I think we can punt this one further down, unless any is gasping for it.
15:20:13 cdent: k ... Haomeng's patch has now landed amiright?
15:20:19 yes
15:20:21 #link https://blueprints.launchpad.net/ironic/+spec/send-data-to-ceilometer
15:20:30 cool, let's return to that so
15:20:39 #topic Tempest status
15:21:10 DinaBelova: your mongodb experimental job landed, nice work!
15:21:17 thank you sir :)
15:21:25 eglynn, it behaves quite nice :)
15:21:46 I encourage everyone who have ceilo changes to comment "check experimental" on your patches
15:21:52 and see what will happen
15:22:14 currently this job has experimental status and needs to be run manually
15:22:21 great! ... so maybe we could consider switching the main tempest job to mongodb if it continues to behave nicely for the next say week or so?
15:22:34 eglynn, I hope so, yes
15:22:55 DinaBelova: so the magic cast is 'check experimental', literally?
15:23:02 llu-laptop, yes
15:23:05 good
15:23:16 DinaBelova: so se could potentially swap the status of sql-a and mongodb in the gate?
15:23:35 eglynn, well, if TC and over community won't be agians - why not, actually
15:23:46 that will work faster and more stable I guess
15:23:51 i.e. mongodb becomes the main "blessed" job while sql-a becomes the second-tier
15:24:09 since that the reality in terms of what is recommended by distros for production
15:24:16 eglynn, yeah, I got your idea - that lgtm actually, but we need QA team blessing here :)
15:24:34 DinaBelova: cool :) ... I'll raise at the project meeting next Tuesday
15:24:57 eglynnm thank you sir - I'll try to attend - but it's late a little bit for me, so can't promise :)
15:24:59 #action eglynn raise blessing the mongodb-based tempest variant at PTL meeting
15:25:09 DinaBelova: np!
15:25:18 :)
15:25:27 cool, this feels like real progress :)
15:25:40 k, anything else on Tempest?
15:25:45 also I've raised this question yesterday, but not all of us was aware, so I'll say one more thing here
15:25:54 shoot!
15:26:08 we have notification tempest patches -2'ed - https://review.openstack.org/67164 and https://review.openstack.org/70998
15:26:10 that
15:26:23 that is because of unstable gate tests passing here
15:26:38 <_nadya_> nice :)
15:26:39 sometimes notifications are just not comint
15:26:42 coming*
15:26:47 _nadya_, hehe
15:27:06 eglynn - as I remember, you even disabled nova notifications now to keep gate more or less stable
15:27:11 DinaBelova: yes, I think these will have to be moved to the new in-tree functional tests
15:27:16 <_nadya_> not comming or not sent?
15:27:32 _nadya_, that actually is a good quesiton
15:27:45 vrovachev could not reprouce it locally
15:27:47 _nadya_: not arriving and persistent in the metering store in time I thought?
15:28:03 eglynn, yes, at least it looks so after the logs surfing
15:28:38 but we have no actual data to say - 'yes, it is so' - as it was not actually reproduced
15:28:43 locally
15:29:15 k, I think we're kind of blocked on that one unless we it's simply collector latency that would not be such an impact if the main tempest job were re-based on mongodb?
15:29:29 eglynn, yes, exactly
15:29:58 sounds more like an environmental or load related issue tho'
15:30:33 DinaBelova: have you tried just bumping up the number of collector workers?
15:30:36 it's very hard to know without fully reproducing the Jenkins slave
15:31:03 gordc, don't remember actually - I thought Vadim had tried different combinations...
15:31:06 <_nadya_> ildikov: yep
15:31:09 gordc: that could be tried, tho' the tests would also have to unskipped to confirm
15:31:10 eglynn: I know that is why I said sounds like :(
15:31:13 ildikov, +1
15:31:17 in that case, should we postpone the patch https://review.openstack.org/#/c/80225/ until we're sure that the notification won't get lost, at least not change the default publisher until then
15:31:59 eglynn: yeah, i'm not sure if its' helpful much either... might just be masking issue by incresaing workers.
15:32:02 llu-laptop: can't we verify that wil local tempest runs?
15:32:08 would be good if there were logs for mq
15:32:23 gordc: k, worth experimenting with in any case I think
15:32:23 gordc, YES, for sure
15:32:40 k, let's move on with the agenda
15:32:45 eglynn, ok
15:32:46 #topic TSDaaS/gnocchi status
15:33:06 * eglynn passes the conch to jd__ ...
15:33:36 nothing really knew on that side for this week
15:33:40 s/knew/new/
15:33:51 * DinaBelova whispers that initial python-opentsdbclient version was merged
15:33:57 I've started to work on the archiving policy we discussed at mid-cycle
15:34:04 cool DinaBelova
15:34:04 jd__: cool
15:34:16 and I'm also working on tooz wrt the file driver we have in Gnocchi
15:34:20 DinaBelova: I also had a quick peek at the opentsdb driver
15:34:22 so I rewritten opentsdb driver change to fit it
15:34:28 eglynn, oh, cool
15:34:43 finger crossed I'll have a first implementation of the archive policy patch for next week
15:34:52 s/finger/fingers/
15:34:59 jd__: excellent!
15:35:27 DinaBelova: so one thought that occurred is that we should think again out about what we consider the set of "standard aggregates" to be
15:35:41 eglynn, also I found one really interesting guy who used opentsdb much with ceilo - they have rewritten ceilo actually - so I'm trying to reach him for some experience here
15:35:42 DinaBelova: (given that opentsdb doesn't support median etc.)
15:35:52 eglynn, yeah, I found this moment too
15:35:53 cool
15:36:32 FYI on a more general point, I was asked to explain the background on gnocchi to the TC
15:36:38 ... this ML thread is a precursor to that conversation
15:36:46 #link http://lists.openstack.org/pipermail/openstack-dev/2014-August/042080.html
15:36:47 eglynn, yes, I saw the e-mail
15:37:24 that'll prolly be at the TC meeting next week, if there's space on the agenda
15:38:07 k, anything else to discuss on gnocchi folks?
15:38:45 cdent: shall we un-punt IPMI in that case?
15:39:02 #topic Ironic conductor ipmi -> ceilometer saga (punted from earlier in the meeting)
15:39:26 * eglynn passes conch to cdent ...
15:40:05 The basic gist there is looking for advice on what to do when code on the ceilo side has to work with code on another project's side and that other side has insufficient testing such that simple runtime bugs are being shown up by my own manual testing.
15:40:18 I wonder if people have good ways of "managing" that.
15:40:39 (that's runtime bugs in the other side's code)
15:40:51 so I guess the obvious one is to draw attention to this testing deficit on gerrit
15:40:59 (which I think you've already done)
15:41:30 the less obvious approach is to jump in and start the ball rolling on the other project
15:41:47 i.e. by raising it on their meeting agenda
15:42:02 or even by authoring and proposing patches with tests
15:42:20 though that involves obviously a big investment of time
15:42:21 send patches with unit tests + fixes to the other project, pointing how they suck? ;)
15:42:36 you're so graceful jd__ :)
15:42:52 at least reporting a bug with a basic test in it on how to reproduce is a good thing
15:43:00 because sometimes you don't have the time or knowledge to fix it
15:43:22 but most devs will feel the shame and will try to fix it since you provided a good way to reproduce it
15:43:29 :)
15:44:03 so I think the telling them, and then telling them again, will get you so far ...
15:44:10 ... but jumping in and actually leading by example can be a more powerful motivator for the other project
15:44:21 yes
15:44:23 ✓
15:44:32 * cdent worries about spread
15:44:33 the good strategy is to give everything you can and terminate by "how can I help more?"
15:44:49 that pushes the other dev to do something so you can help, and things move on :)
15:45:43 jd__: good point ...
a helpful tone tends to be welcomed, whereas snark tends to be counter-productive
15:45:52 * jd__ nods
15:46:18 cdent: by "spread", you mean spreading cdent too thinly?
15:46:33 cdent: yeah, I hear ya ... it's a balancing act to be sure
15:46:48 (yes, on spread) In the particular case the response I got was the it was basically impossible to unit test the particular path that was (multiple times) causing problems.
15:47:16 Which suggests there's some kind of infrastructure problem that perhaps in-tree functional testing could help with, eventually.
15:47:23 cdent: you were unsatisified with that answer? ... i.e. felt it would/might be possible?
15:47:33 cdent: a-ha, yes, got it
15:48:29 * eglynn hopes we as a community are not starting to cargo-cult the new in-tree functional testing ...
15:48:36 * eglynn meant that half in jest :)
15:48:49 it's a mythical savior, coming soon, probably around christmas
15:49:16 cdent, :D
15:49:32 yeah that looks quite mystical :) and mythical :)
15:49:34 cdent: let's codename the project "santa" :)
15:49:39 eglynn, lol
15:51:50 cdent: do you feel you've enough to go on with the above discussion?
15:52:14 yup, just wanted to get a feel for people's approaches
15:52:27 cool
15:52:31 #topic open discussion
15:52:52 Hi! One little update for performance tests.
15:52:58 ityaptin: cool
15:52:59 I'm prepare doc for getting start tests with oslo messaging
15:53:04 https://docs.google.com/document/d/1MXhZRXm8UoEN1kYt2NdZe7qyMb7pM9aW1kq5TuGz5lc/edit?usp=sharing
15:53:36 In doc there is plan for prepare for test and repository url
15:53:57 ityaptin: a-ha, so this would it make it possible for anyone to run that load test?
15:54:05 (as requested by gordc last week)
15:54:16 If someone wants to trying perf tests and have questions or run into issues - please comment or edit)
15:54:25 threoreticall yes)
15:54:29 ityaptin: excellent, thanks!
15:54:43 it would be great if someone had time to try that
15:54:58 once it's proven out, maybe we could put the content up on the wiki?
15:56:03 eglynn: i'm going to try to run it locally. of course it can't be compared to previous numbers since it's a diferent machine.
15:56:25 ityaptin: i may ping you tomorrow. haven't tried to run it yet.
15:56:33 yes, we can) Also we can add profiling graphs to wiki. They are very nice)
15:56:42 gordc - ok)
15:56:48 gordc: excellent! :)
15:57:27 wow, we've 3 whole minutes left in our slot
15:57:31 ... that's unusual :)
15:57:34 ityaptin: in step2, which file to create?
15:58:13 file with db urls, like hbase://localhost:9090
15:58:16 one at line
15:58:47 It's used for configuration backend runtime)
15:58:55 ityaptin: got that, thanks
15:59:43 ok, thanks folks as always for a productive meeting
15:59:49 ... let's call it a wrap!
15:59:54 #endmeeting