15:01:17 #startmeeting openstack-ceilometer
15:01:18 Meeting started Thu Mar 12 15:01:17 2015 UTC and is due to finish in 60 minutes. The chair is eglynn. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:01:19 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:01:23 The meeting name has been set to 'openstack_ceilometer'
15:01:28 who's all around for the ceilo meeting?
15:01:29 o/
15:01:34 hi
15:01:43 o/
15:01:47 o/
15:02:12 o/
15:02:16 o/
15:02:17 o/
15:02:26 <_elena_> o/
15:02:29 #topic kilo3 status
15:02:46 o/
15:02:48 o/
15:02:57 fabiog: I think we can roll this in with the "ConfigDB land or burn?" topic
15:03:00 <_nadya_> o/
15:03:11 eglynn: yep
15:03:18 #link https://launchpad.net/ceilometer/+milestone/kilo-3
15:03:18 is it a vote?
15:03:44 so the main issue is go/no-go on DB config
15:04:12 fabiog: just an fyi, i marked patch wip for now.
15:04:19 fabiog: part of my reluctance here is that the series of BPs seems like it's gonna be incomplete for kilo
15:04:20 jd__: I think we need to take a decision if we want to merge with the current design and then re-factor later or go back to the whiteboard for Liberty
15:04:50 fabiog: i'd be more confident with the latter.
15:04:51 fabiog: i.e. we have 3 BPs, but the most advance by far is the initial one (IIUC)
15:05:01 most *advanced
15:05:13 postpone
15:05:51 eglynn: I think with all the discussion that went on the other patches got delayed. Personally I think we could still make it. But if the consensus is to re-propose it in Liberty we can go with that
15:06:15 fabiog: say we were to land ceilometer-configuration-via-data-store ... the mechanism would be effectively unused in kilo without the API and agent BPs?
15:06:58 eglynn: I think we could achieve the very basic func. store and retrieve the config
15:07:06 <_nadya_> eglynn: I think we have found one more use case for config-in-db. idegtiarov has describe it
15:07:14 eglynn: we may then thing about how to connect the agents
15:07:21 fabiog: can you confirm conf-datastore-agents is still not started?
15:07:26 eglynn: I think that is the biggest part in contentious
15:07:37 eglynn: that's probably my reason for postponing, i don't think the idea to utilise feature is fully realised yet.
15:07:45 eglynn: no the agent work is started
15:07:57 eglynn: but we need the api to be finished
15:07:59 gordc: yes, my instinct is similar
15:08:15 eglynn: so basically you need the db part first tobe in, then api and finally agent
15:08:39 I like the DB based config idea, but I also share the others' opinion regarding to include full features in main releases
15:09:00 I don't want to ship "dead" code
15:09:09 * linuxhermit agrees
15:09:33 fabiog: would we loose much by postponing to liberty-1?
15:09:42 well, then. I think it will be wise to have a session (again) at the summit on this and nail it once for all :-)
15:09:46 I think it would be nice to have config related session(s) on the design summit
15:10:01 fabiog: i.e. the foundational patches would be effectively unused anyway in kilo, so no real loss
15:10:01 eglynn: no I think we can survive with it
15:10:06 I think fabiog already mentioned it on the review of the db patch
15:10:21 ugh, got distracted, I'm here
15:11:15 are we edging towards rough consensus here?
15:11:24 eglynn: so, since there is no consensus on merging it in Kilo we will propose an updated spec in Liberty and discuss it at the summit, is that a good plan?
15:11:39 fabiog: yes, I think that's the wisest course
15:11:44 fabiog: thank you sir!
15:11:45 fabiog: *1
15:11:51 config in DB seems to be very useful in HA mode and that part also could be discussed on summit
15:12:24 eglynn: do we have a session proposal page for the summit?
15:12:29 idegtiarov: it is also useful for lifecycle management ...
15:12:33 otherwise the outstanding BP is the thermal data for Edwin, amiright?
15:12:46 I think it is implemented
15:12:50 fabiog: it also useful...
15:12:52 eglynn: ^^^
15:13:07 <_nadya_> fabiog: +1, thanks for patience
15:13:08 eglynn: I mean I have a doc bug for it in OS Manuals... :)
15:13:18 ildikov_: WRT design session page ... not yet, that usually happens much closer to summit
15:13:46 ildikov_: yes, manual needs a patch for the new measurements
15:13:47 eglynn: np, I wasn't sure when we usually open it that is why I asked
15:14:15 ildikov_: what's the bug number?
15:14:29 llu-laptop: I'm on it, but of course if anyone else would volunteer the stage is open ;)
15:14:52 llu-laptop: #link https://bugs.launchpad.net/openstack-manuals/+bug/1430138
15:14:53 Launchpad bug 1430138 in openstack-manuals " Add more power and thermal data" [Medium,Confirmed]
15:15:04 ildikov_: well it's up to us when we start thinking about liberty summit sessions
15:15:22 ildikov_: so if you'd like to start collecting proposal early, please create a googledocs spreadsheet and share the link :)
15:15:44 ... would be handy to capture fabiog's session at least while the ideas are still fresh
15:16:08 eglynn: cool, I will discuss with fabiog if we need that this early, or prepare a bit and then propose session(s)
15:16:17 ildikov_: thanks!
15:16:46 fabiog: is your update related?
15:17:08 eglynn: I will be leaving HP (after 15 years) next week
15:17:17 dank_: wow! OMG
15:17:23 fabiog: ^^^
15:17:34 dank_: sorry, bad tab completion
15:17:45 eglynn: the good news though is that I will join Cisco
15:17:52 fabiog: well, congratulations!
15:18:01 eglynn: and I will be involved in ceilometer
15:18:09 fabiog: excellent :)
15:18:14 eglynn: and I will attend the summit
15:18:53 fabiog: we'll look forward to seeing you there :)
15:19:12 eglynn: me too
15:19:31 fabiog: I don't mean to pry, but is it Cisco Cloud Services division you'll be working for?
15:19:40 eglynn: yes
15:20:08 fabiog: a-ha, cool :) ... in which we may cross paths from a RH vendor PoV also as well as upstream
15:20:08 eglynn: to be precise infrastructure
15:21:02 eglynn: yep, cloud is a small word ;-)
15:21:09 world
15:21:30 fabiog: well congratulations again :)
15:21:37 eglynn: thanks
15:21:51 gordc: no more mesmorizing slides though :-)
15:22:36 fabiog: :( just copy background and whiteout hp logo.
15:23:18 no legal issues there.
15:23:33 just use the upstream preso template and save yourself the logo change :)
15:24:50 k, let's move onto the delicious pasta-making
15:25:16 #topic gnocchi
15:26:47 i've never eaten gnocchi, is it any good? seems like it would be...thick, pasty
15:27:11 cdent: very fattening I hear ;)
15:27:18 cdent: congrats on joining gnocchi-core :)
15:27:23 it's good!
15:27:28 #link https://review.openstack.org/#/admin/groups/353,members
15:27:42 I think we implemented a bunch of interesting things recently
15:27:43 thanks
15:27:50 * cdent agrees
15:27:53 we have the equivalent of Ceilometer complex query on resources
15:28:01 so you can look for anything in resources :)
15:28:08 we have put that into the metric aggregation code too
15:28:19 so you can do aggregation on metric from resources filtered using this kind of query
15:28:39 jd__: excellent!
15:28:51 now I'm adding measure search to finally have something like "give me instance with > 80 % CPU" or the like, I hope that'll be in time for K
15:29:10 for reference, see here for the latest landings ... https://review.openstack.org/#/q/status:merged+project:stackforge/gnocchi,n,z
15:29:12 I have started to write a new alarm rule kind to use this new format
15:29:14 and sileht is updating alarm support following these changes
15:29:38 so the definitive gnocchi alarm code is now the copy in the ceilo repo?
15:29:50 eglynn, yes
15:29:57 coolness
15:30:05 I _may_ be picking up influxdb where eglynn left off if I can eke out the time. emphasis on the "may".
15:30:25 cdent: that would be great
15:30:58 so the data migration task
15:31:08 sounds like we'll have to punt that to liberty timeframe?
15:31:40 eglynn: I think it's a safe bet
15:31:46 it's likely than nobody will really want to migrate
15:31:51 as they'll start fresh
15:31:56 cdent: I'm interested in that too
15:32:15 i.e. migration from "classic" datastores to the featherlight gnochi equivalent
15:32:23 yeah some may want to start afresh
15:32:35 esp. if just doing a PoC with gnocchi
15:32:51 but others won't want to discard old data
15:33:08 I guess co-existence is the other option
15:33:11 I doubt we really have a lot of "others"
15:33:16 but we'll see I guess
15:33:26 i.e. query v2 API for your old data, gnocchi API for the newer
15:33:26 if ops scream "how do I migrate my data" we'll do something
15:33:34 ok
15:33:48 eglynn: with Ceilo it is usually suggested to use ttl with small inetrval, so it might be the case that it is easier for users to have a fresh start
15:33:54 jd__ that will happen
15:33:55 If people really get anxious we can make an outside the normal release cycle migrator
15:33:59 in fact we should make most things that way ;)
15:34:10 eglynn: as who would like to store data longer should use an external data warehouse anyway
15:34:32 speaking of ops screaming ... is that a good segue-way into https://etherpad.openstack.org/p/PHL-ops-burning-issues ?
15:34:43 cdent: we won't limit Gnocchi releases to the 6 months timeframe I think
15:34:56 *claps* kudos on the segue.
15:36:13 "If ceilometer dies other services stop (for example Glance will not serve images to spin up a new VM)" <- WAT
15:36:26 That's a new one to me. Really?
15:36:26 cdent: that puzzles me too
15:36:39 we've seen that
15:36:56 is that because of the message queue under, or?
15:36:57 linuxhermit: what's the situation?
15:36:58 linuxhermit: is it a ceilometer->heat->glance connection?
15:37:05 cdent: I couldn't see that happening in reality either
15:37:07 I don't know why, but I know it happens, and I don't have all the details
15:37:14 * eglynn belatedly changes topic
15:37:17 linuxhermit: yes, it is related to rabbitMQ
15:37:20 #topic ops summit feedback
15:37:34 I do remember rabbit being part of it as well
15:37:42 I think the biggest issue we need to resolve is that the level of communication between us and the people using what we make is really really poor.
15:37:46 I guess the queue gets filled up
15:37:46 fabiog, linuxhermit: backpressure from rabbit as notifications are being consumed
15:37:48 ?
15:38:00 *aren't being consumed?
15:38:06 eglynn I'll ask ops for more details
15:38:25 linuxhermit: and eglynn well, especially polling can slow down rabbit to a grinding halt
15:38:26 linuxhermit: that would be excellent, as otherwise I'm finding that hard to fathom
15:38:28 linuxhermit: that'd be awesome... do we want a bug in meantime to track?
15:38:38 gordc: yes please
15:38:50 gordc I was hesitant to create a bug until I had the details
15:39:05 I'm in the camp that many thing getted blames on ceilo that might not be just that
15:39:06 Before Juno this is possible that the notifier can block but not more now
15:39:21 cdent: the docco is still behind, we would need more hands there too
15:39:34 realize many of the ops are still on Icehouse and a few on Juno etc
15:39:49 also interesting ... "Customers more scared of new time series databases than mongo"
15:39:53 cdent: I mean to identify what would be needed besides the already opened bugs and then fix the list of missing items
15:39:55 ildikov_: that's a good point too, but I'm mean general conversation/interaction
15:40:18 ideally these things on this etherpad would not be surprises
15:40:24 eglynn: I think we should have an ops meeting at the summit to collect their complains and feature requests
15:40:43 cdent: sure, the marketing what linuxhermit mentioned earlier is also important
15:40:45 I was in the room, but I didn't feel comfortable acting like a rep
15:40:46 new TSDBs in that context == specifically carbonara I wonder? or == to influxdb/opentsdb/whatever
15:41:08 eglynn: not worth speculating we should just ask
15:41:10 TSDB was gnocchi etc and influx
15:41:14 fabiog: yes, if we could find a way to frame feedback in a constructive way
15:41:19 it's not something that operators have had to manage that before
15:41:27 cdent: fair point
15:41:28 cdent: but finally, when someone uses a product the end user docco should be enough to show the right way to use it
15:41:39 eglynn The Nova PTL ran a great one of those at the summit
15:41:45 ildikov_++
15:41:47 fabiog: +1
15:41:54 and people left feeling much better and happier
15:42:03 ildikov_ ++
15:42:30 We shouldn't fear the negative unconstructive feedback. It's like therapy: once it comes of the chest at least some small number of the people will become useful allies in the future.
15:42:38 s/of/off/
15:42:49 cdent no also you can respond and dig at the root cause
15:42:58 we know not doing mongo right is a problem
15:43:00 cdent, linuxhermit: yes and yes
15:43:05 but we don't have great guidance on that
15:43:19 and operators aren't used to running mongo
15:43:38 most of these people had vmware, oracle etc until they stood up open stack
15:43:43 sileht had a good blog post I seem to remember with deployment guidance for mongo
15:44:12 we should find that and have a twitter/blog and repost it or link to it something
15:44:15 +1 to sileht post.
15:44:16 Just an example of how we do sharding with mongo
15:44:24 and configure ceilometer
15:44:34 sileht those things are super handy
15:44:40 * gordc has heard to many i have a single mongo node with no sharding scenarios
15:44:51 gordc exactly
15:44:53 sileht: got a link handy for reference?
15:45:15 blog.sileht.net/using-a-shardingreplicaset-mongodb-with-ceilometer
15:45:18 The real issue is which key to use for splitting the db, and that depends of what you have decide to store with ceilometer
15:45:24 gordc, thx
15:46:37 \o/ that felt vaguely like progress
15:46:39 sileht those things are super good knowledge to share
15:46:52 so my thoughts after going to the operators summit
15:47:12 were that we need to communicate often with that group
15:47:24 and that we need to publish and share as much as possible
15:47:31 linuxhermit: +1
15:47:36 SO MANY operators are looking for ANYTHING to replace ceilo
15:47:46 and often it's not ceilo to blame
15:48:01 but some other relyed upon resource
15:48:27 let's start trying to gather the "tribal knowledge" in one easily discoverable place
15:48:33 StackTach was pushed BIG time af the event
15:48:46 ... e.g. linking blog posts like sileht's off the wiki
15:49:03 eglynn do we have a ceilometer blog/twitter etc?
15:49:13 linuxhermit: well stackstach pre-dates ceilo and has always been pushed hard IME
15:49:50 eglynn perhaps, but our team had been to several ops summits, and this was the first HUGE push of it they had seen
15:50:15 linuxhermit: re twitter, that would a nope ... is there a twitter account for stacktach?
15:50:28 linuxhermit: push for stacktack standalone, or within monasca?
15:50:37 linuxhermit: I guess that was triggered by the negative feedbacks on the MLs
15:51:03 ildikov_ that and basically every carrier had a terrible story in the burning issues talk
15:51:23 part of Cisco and HP had positive rebuttals etc
15:51:32 linuxhermit: sounds like we should have had more of a presence there to counter
15:51:43 agreed
15:51:44 linuxhermit: sure, I can imagine :(
15:51:48 It's obvious from the way stacktach presents itself on its web page that it is going for a full court marketing press on version 3
15:51:54 <_nadya_> linuxhermit: I like the idea about blog/twitter. It would be really cool
15:52:08 I want to help, and I'm just sharing what I saw
15:52:21 which is cool for them
15:52:31 * gordc wonders if i should mention i finished off some of stacktachs original integration bps... will wait after performance testing.
15:52:35 they move quite outside the usual openstack circles though
15:53:01 I was thinking about having a session on the summit about how to organize our marketing and have more eyes on feedbacks etc, but maybe that would be too late...
15:53:03 cdent they are a redhat project
15:53:07 s/redhat/rackspace
15:53:34 so they got quite a bit of operator attention at their session
15:54:03 but it's not about us vs them
15:54:11 it's about us + community = awesome
15:54:14 linuxhermit: yep, the RAX angle has always carried a lot of inbuilt operator kudos
15:54:24 linuxhermit: agreed
15:54:30 and it seems like the community needs some guidance help and education
15:54:33 linuxhermit: yeap, agreed
15:54:37 linuxhermit: though it not always easy to harness that community
15:55:00 agreed
15:55:14 but I'm not sure what I've seen to foster that
15:55:21 linuxhermit: OK, that's a very useful steer as to how we should organize ourselves for the Vancouver summit
15:55:32 yeah linuxhermit++
15:55:44 also my boss would totally send several of us to a ceilo midcycle
15:55:46 linuxhermit: e.g. ops-oriented sessions, more active outreach etc.
15:55:59 and they are making a bigger investment in ceilo atm
15:56:11 eglynn: +1, that was my thought also
15:56:21 eglynn completely agree
15:56:22 linuxhermit: good to hear :)
15:56:22 linuxhermit: yes, the cancellation of the midcycle was deflationary in retrospect
15:56:30 linuxhermit: it's very good to hear!
15:57:35 before we run out of time I wanted to mention that I've started the process of moving ceilometerclient functional tests out of tempest and into the repo
15:57:47 cdent: good, good
15:57:52 cdent nice
15:57:59 it's the-thing-to-do these days, and is pretty straightforward
15:58:09 coolness
15:58:30 shotclock is against us
15:58:36 shall we call it a wrap?
15:58:50 just one more tiny thing
15:58:58 ildikov_: shoot
15:59:09 it would be nice to get rid of the "please rebase" comments on gerrit
15:59:33 review etiquette 101 :)
15:59:35 <_nadya_> ildikov_: ++ :)
16:00:13 sorry, we had a lot nowadays, so I just wanted make a heads up :)
16:00:23 :)
16:00:52 and we're done i guess?
16:01:04 yeap :)
16:01:26 #endmeeting ceilometer