15:00:54 #startmeeting ceilometer
15:00:54 Meeting started Thu Feb 26 15:00:54 2015 UTC and is due to finish in 60 minutes. The chair is eglynn. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:56 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:58 The meeting name has been set to 'ceilometer'
15:01:06 o/
15:01:06 o/
15:01:09 who's all on for the ceilo meeting today?
15:01:13 o/
15:01:14 o/
15:01:17 o/
15:01:19 o/
15:01:19 <_elena_> o/
15:01:34 o/
15:01:39 #topic kilo-3
15:02:06 #link https://launchpad.net/ceilometer/+milestone/kilo-3
15:02:20 eglynn: Rohit and I prepared a deck to explain in detail how the pipeline db works across the board
15:02:21 looks good in terms of BP progress
15:02:30 fabiog: that would be excellent
15:02:35 eglynn: here is the link, and I hope it helps reviewers: http://www.slideshare.net/FabioGiannetti/ceilometer-pipeline-db-storage
15:02:54 fabiog: is that linked in a gerrit comment
15:02:55 ?
15:03:00 llu-laptop: we can probably close edwin's story, right?
15:03:15 eglynn: it's not, but I can definitely add it, let me do it now
15:03:24 eglynn: could you add https://blueprints.launchpad.net/ceilometer/+spec/power-thermal-data into kilo-3? looks like the patch is under review
15:03:28 fabiog: pls do, thanks!
15:03:45 llu-laptop: targeting now
15:03:55 gordc: yeah, self-disabled-pollster should be completed
15:04:08 llu-laptop: cool cool
15:04:11 are we tracking gnocchi bps as part of k3?
15:04:31 eglynn: thx
15:04:43 prad: not on ceilometer launchpad, on the separate gnocchi LP
15:05:14 ok, thx
15:05:24 llu-laptop: so https://blueprints.launchpad.net/ceilometer/+spec/self-disabled-pollster is completed?
15:05:45 * eglynn marks as implemented
15:06:31 eglynn: I think so, at least for kilo-3
15:06:33 DinaBelova: do you know if igor/ilya are planning to put up code/spec for event/sample generator/bracketer work?
15:06:35 ok, the biggest outstanding chunk is the pipeline config in DB ... that slidedeck should help expedite reviewing
15:07:29 * gordc is too mesmerised by how pretty the hp intro slide template is.
15:08:33 fabiog: is the slideshare link correct? looks like I can't open the link
15:08:49 llu-laptop: worked for me
15:09:09 i will follow up with them later i guess.
15:09:23 ok, maybe my isp issue. will try again tomorrow in the office
15:09:26 <_elena_> from Dina Belova: yep, they were going to publish it this week, but probably they'll do that early next week
15:09:50 _elena_: awesome! thanks for the update.
15:10:01 <_elena_> Dina's laptop is broken, so she's fighting with a temporary one
15:10:41 _elena_, DinaBelova: the feature proposal freeze is supposed to be Mar 5th, so the spec review would need to be followed pretty quickly by actual code if it's to make kilo-3
15:11:50 move on from k3?
15:12:17 #topic revisit asyncio/trollius discussion
15:12:36 <_elena_> eglynn, yes, the guys know this and will hurry
15:13:36 the idea was to revisit this discussion from last week, as some of the folks likely in favor weren't present
15:13:56 so I've discussed this a bit with haypo also off-line
15:14:36 haypo has started working on a fork https://github.com/haypo/ceilometer
15:15:19 also I had misunderstood a crucial point last week
15:15:25 you can take a look at the commits if you would like an idea of the changes required to switch from eventlet to trollius
15:15:33 threads and asyncio are not intended to be mutually exclusive
15:15:39 but the final code may be simpler, currently it's a mix between eventlet and trollius :)
15:16:20 the choice to replace eventlet is rather between threads alone (as advocated by Josh Harlow) or asyncio+threads (as advocated by haypo)
15:16:38 eglynn: currently, i'm running the RPC listener and the notification listener in threads, because oslo messaging still uses blocking calls (sock.recv)
15:17:18 haypo: so we can also continue to use threads, for example for the DB access, which was one concern last week
15:17:25 it's simple to call a blocking function in a thread: loop.run_in_executor(None, func)
15:18:09 (the concern about DB access led to the suggestion that the notification-agent, as opposed to the collector, would be the best place to start)
15:18:09 eglynn: you may start with threads, and switch to fully asynchronous code a few years later, if it's simpler and/or faster
15:18:43 yep, so no need to jump straight to asyncio-aware mongo/redis/sql-a client libs, right?
15:19:04 eglynn: yes
15:19:33 and that seems to mitigate the risk/impact of this change
15:19:44 eglynn: i would like to add a new storage implementation with the asyncio client for redis, to compare performance, but i'm not sure that i will have time for that ;)
15:20:10 hum, mongo, not redis
15:20:26 I still have this vague feeling that we've never really established fully why people went with eventlet in the first place. That is: why wasn't horizontal scaling by process (and things like wsgi containers) the way things went?
15:20:34 (i understood that redis uses tooz, but it may require changing tooz too, i didn't check)
15:20:35 haypo: cool ... BTW the redis usage is in oslo/tooz, as opposed to ceilometer proper
15:21:12 cdent: in the wiki fossil record there's some justification for the original change from twisted to eventlet
15:21:32 yes, but using twisted in the first place was...twisted ;)
15:21:51 cdent: twisted/eventlet are used for performance when most of the time is spent waiting for I/O
15:22:01 #link https://wiki.openstack.org/wiki/UnifiedServiceArchitecture#eventlet_vs_Twisted
15:22:14 cdent: eventlet is easier to use than twisted, twisted is mostly based on callbacks, which is not very fun
15:22:18 Yes, I know haypo, and the evidence is thin on the ground that a lot of waiting on I/O is happening
15:22:33 it may be true, but I haven't seen the _evidence_
15:23:10 nor have I seen the evidence that operating-system-based methods for handling concurrency are inferior to those provided by python
15:23:37 we've seen that keystone's api server is switching off eventlet
15:23:52 because they realized that if they just hork the thing under mod_wsgi or uwsgi it "just works"
15:23:52 cdent: for the specific case of python, threads are not as fast as expected, because of the GIL
15:24:00 * cdent hasn't mentioned threads
15:24:56 Look, let me reset: I have nothing against exploring asyncio, or even choosing to use it. What I'm worried about is that it seems to be a drive to solve a problem that hasn't been well explored/defined.
15:25:08 We're _assuming_ concurrency problems, as far as I can tell.
15:25:08 cdent: usually, horizontal scaling in a single host means multiple processes and multiple threads
15:25:25 cdent: I'm not sure that keystone is directly comparable to say nova, i.e. does it do much apart from the synchronous API request/response call pattern?
15:25:25 multiple threads, yes, but not within the python code
15:25:37 cdent: i.e. no async workflows based on RPC
15:25:45 (IIUC)
15:25:55 I'm not sure either eglynn. I'm saying: Let's be more sure.
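haypo's `loop.run_in_executor(None, func)` suggestion above can be sketched with stdlib asyncio — a minimal illustration in which `slow_db_query` is a hypothetical stand-in for a blocking driver call, not ceilometer code:

```python
import asyncio
import time

def slow_db_query(n):
    """Hypothetical stand-in for a blocking call (e.g. a synchronous DB client)."""
    time.sleep(0.01)
    return n * 2

async def main():
    loop = asyncio.get_running_loop()
    # Offload the blocking calls to the default thread pool executor;
    # the event loop stays free to service other coroutines meanwhile.
    results = await asyncio.gather(
        loop.run_in_executor(None, slow_db_query, 1),
        loop.run_in_executor(None, slow_db_query, 2),
    )
    return results

print(asyncio.run(main()))  # → [2, 4]
```

This is why the DB-access concern from last week doesn't require asyncio-aware client libraries up front: blocking drivers keep running in threads behind the loop.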
15:26:04 that's _all_
15:26:06 i read that keystone is mostly CPU-bound because it uses a lot of crypto functions
15:26:17 yep, that too
15:26:48 cdent: for API daemons, i suggested relying on web servers to spawn processes/threads/whatever, and just writing regular "blocking" code
15:27:13 * linuxhermit agrees
15:27:32 i'm more interested in the other daemons ;)
15:27:53 yes, me too, show me the profiling data
15:28:31 (or at least be prepared to show comparative benchmarks later)
15:28:48 I think it would be better to investigate first, though.
15:29:28 cdent: can we get realistic profiling data as things currently stand, seeing as eventlet monkey-patching arranges for the services not to be blocked waiting on I/O in general?
15:29:30 I have to say that I'm really surprised that I'm getting resistance on this issue. Isn't this the sort of thing we're supposed to do to do our work well?
15:30:01 I don't know, but I haven't heard of anyone trying.
15:30:09 i.e. would we need to revert to a completely non-async approach to know how I/O-bound these processes would be in the absence of eventlet/asyncio etc.
15:30:27 cdent: my plan is to port ceilometer to asyncio. then we will have two implementations of ceilometer, and it will be easier to compare performance and estimate the number of modified lines (just compute the diff)
15:31:03 is this being applied to oslo.messaging as well?
15:31:19 Yes, haypo, and as I've said elsewhere, that's a fine plan as long as it has checkpoints along the way: "I did service X and it went well, see this data, so I'm continuing on to service Y"
15:31:22 gordc: yep, executor already landed there IIUC
15:31:34 gordc: oslo messaging already has an "aioeventlet" executor. basically, it means that you can already play with asyncio and oslo messaging
15:31:50 I'm not saying "don't do this", I'm saying "do this with data and analysis"
15:32:08 If you're planning to do the data and analysis, more power to you, please carry on, let us know how it goes, godspeed.
15:32:33 eglynn: i see. is there any way to test oslo.messaging? i would think a lot of the actual work in ceilometer is really handled by oslo.messaging.
15:33:01 I suspect you are right gordc
15:33:03 or is it not a simple "switch oslo.messaging executor"
15:33:16 cdent: i don't really care about performance. i expect "good" performance, but maybe not the best performance. my first concern is to get rid of all the issues related to monkey patching
15:33:50 haypo: comparable performance at least, surely?
15:33:52 cdent: but i understand that nobody wants a slower ceilometer (because "it just works" currently, no change is needed ;))
15:34:13 It doesn't work all that well now.
15:34:32 well, i don't see why asyncio would be X times slower than eventlet. it has a very similar design (event loop)
15:35:04 haypo: can you give gordc some pointers on testing oslo-messaging with the asyncio executor? (in isolation from your other changes)
15:35:10 And I'm not strictly concerned with performance either. I'm worried about replacing annoying eventlet idioms with annoying asyncio idioms when we haven't got good data on the necessity of non-blocking io.
15:35:54 * cdent suspects we can achieve asynchrony at process boundaries between services
15:36:04 gordc, eglynn: here is my commit to switch to the aioeventlet executor, https://github.com/haypo/ceilometer/commit/c552d98508ee4ce9678ac581d579aea05cb798f5
15:36:12 haypo: is step one to read your mailing list post? i put it aside and forgot to start reading again.
15:36:33 haypo: oh... seems simple enough :)
15:36:45 (i expect aioeventlet to be a temporary solution, because it still uses eventlet. i would like to replace it with a regular asyncio executor later)
15:37:06 gordc: yeah, you see, it doesn't mean "rewriting ceilometer from scratch" :-)
15:37:31 cdent: how would "asynchrony at process boundaries" map onto the ceilometer architecture?
15:37:35 gordc: it looks like ceilometer-collector still works in my branch, even though i disabled monkey-patching!
15:37:54 cdent: you cannot get concurrency for free
15:38:12 cdent: concurrency is a complex problem, especially if you expect performance ;)
15:38:30 haypo: I know. We've all got some expertise here.
15:39:20 eglynn: the collector is a good example: we run multiples of those listening on the bus
15:39:35 the fact that we have a bus means that we can have many producers and consumers on it
15:40:28 gordc: the asyncio branch of ceilometer is already the second step of my overall plan, search for "second part" in https://review.openstack.org/#/c/153298/3/specs/asyncio.rst
15:40:43 gordc: the last part should be quick
15:41:41 I think we've belabored the point enough. There's no reason for the work not to go on and for us to see how it works out. I think it is a worthwhile exploration.
15:41:49 replication at the process level with effectively single-threaded collectors feels very old school
15:41:50 haypo: cool cool. i don't really have a strong opinion either way right now. i'll take a read through
15:42:14 cdent: i agree that the spec was discussed enough :) now it's time to review the code, but please wait until i write it :-D
15:42:14 If we think it is crap we can always just not merge it. If it is awesome, then great.
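The many-consumers-on-the-bus model cdent and eglynn are debating can be sketched with an asyncio queue standing in for the AMQP bus — all names here are illustrative, not ceilometer's actual API:

```python
import asyncio

async def collector(name, bus, sink):
    # Each collector coroutine consumes from the shared "bus";
    # multiple collectors interleave cooperatively on one event loop.
    while True:
        sample = await bus.get()
        if sample is None:          # sentinel: shut this collector down
            return
        sink.append((name, sample))

async def main():
    bus = asyncio.Queue()
    sink = []
    workers = [asyncio.create_task(collector(f"c{i}", bus, sink))
               for i in range(3)]
    for sample in range(6):         # producers publish onto the bus
        bus.put_nowait(sample)
    for _ in workers:               # one sentinel per collector
        bus.put_nowait(None)
    await asyncio.gather(*workers)
    return sorted(s for _, s in sink)

print(asyncio.run(main()))  # → [0, 1, 2, 3, 4, 5]
```

In production the same fan-out also happens at the process level: several collector processes share one AMQP queue, which is the "replication at the process level" eglynn mentions.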
15:42:51 so it seems we're slowly approaching a better understanding of what haypo is trying to achieve here and how intrusive the change will be
15:42:54 cdent: yeah, that's why i chose to develop in a new branch: just ignore it if you don't like it ;)
15:42:59 eglynn: yep
15:43:23 eglynn: it may be old school but it results in python code that is easy to maintain, the complexity is kept elsewhere
15:43:27 don't hesitate to ask me more specific questions directly if you want ;)
15:44:18 cdent: concurrency is a hot topic :) joshua proposed to use threads. someone else proposes go coroutines :-p
15:44:44 the fact that this work is being done on a branch means low risk of ongoing disruption to the core ... plus it will provide a working illustration/proof-point before anything is merged
15:44:50 both plus-points
15:45:52 so the conclusion is guarded support for haypo to proceed on his fork?
15:45:59 (with the option not to merge, if it looks like a big perf or code-complexity penalty)
15:46:10 cool with me.
15:46:12 * cdent agrees
15:46:35 coolness, thanks
15:46:36 this is for the L* cycle?
15:46:41 my branch may also help to identify the code where concurrency matters
15:46:46 gordc: yes
15:46:58 gordc: it's probably too late for Kilo
15:47:13 haypo: cool cool. just wanted to confirm how i should schedule review time.
15:47:43 thanks for the input haypo
15:48:01 agreed, thanks for the knowledge drop
15:48:15 you're welcome
15:48:16 #topic gnocchi status
15:49:15 recent patches landed https://review.openstack.org/#/q/status:merged+project:stackforge/gnocchi,n,z
15:49:50 no owner yet for the data store migration story
15:50:55 OK, nothing else to report?
15:51:12 #topic open discussion
15:51:40 one thing worth mentioning is that /em broke the world with that 1.0.13 python-ceiloclient release
15:51:52 :)
15:52:14 heat wanted a new release so I went ahead and cut it without thinking about SEMVER
15:52:41 ah, that was you, huh?
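For contrast with the asyncio sketches above, the threads-only alternative that Josh Harlow advocated (mentioned earlier in the discussion) would look roughly like this — stdlib primitives only, with an in-process queue as an illustrative stand-in for the bus:

```python
import queue
import threading

def collector(bus, sink, lock):
    # Plain-threads model: a blocking get() on a shared queue;
    # the OS scheduler, not an event loop, interleaves the workers.
    while True:
        sample = bus.get()
        if sample is None:          # sentinel: stop this worker
            return
        with lock:                  # guard the shared result list
            sink.append(sample)

bus, sink, lock = queue.Queue(), [], threading.Lock()
threads = [threading.Thread(target=collector, args=(bus, sink, lock))
           for _ in range(3)]
for t in threads:
    t.start()
for s in range(6):
    bus.put(s)
for _ in threads:
    bus.put(None)                   # one sentinel per worker
for t in threads:
    t.join()
print(sorted(sink))  # → [0, 1, 2, 3, 4, 5]
```

The trade-off raised in the meeting applies here: this style needs no monkey-patching and no event loop, but the GIL limits it for CPU-bound work, which is part of why haypo prefers asyncio+threads.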
15:52:42 i think we were just the first ones to release a client with the updated keystoneclient req
15:52:45 in future we'll need to reserve z-bumps for truly compat changes only
15:52:52 cdent: yeap, mea culpa
15:53:25 * eglynn hangs head ...
15:53:37 eglynn: Errare humanum est ("to err is human")
15:53:42 :-)
15:53:46 * gordc breaks gate on every pycadf release. keeps qa/infra team on their toes.
15:54:14 :)
15:54:17 fabiog: my Latin is rusty, but I get the gist :)
15:55:39 anything else to discuss?
15:56:19 k, let's call it a wrap ... thanks folks for your time!
15:56:22 cya
15:56:27 peace
15:56:29 #endmeeting ceilometer