09:00:22 #startmeeting scientific_wg
09:00:23 Meeting started Wed Dec 7 09:00:22 2016 UTC and is due to finish in 60 minutes. The chair is oneswig. Information about MeetBot at http://wiki.debian.org/MeetBot.
09:00:24 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
09:00:27 The meeting name has been set to 'scientific_wg'
09:00:34 Greetings!
09:00:37 Good morning
09:00:44 Morning.
09:00:48 hi priteau zioproto verdurin
09:00:59 hello all
09:01:08 Blair here?
09:01:18 Hello!
09:01:24 Hi dariov
09:01:45 hi oneswig!
09:01:48 Morning
09:01:49 I'm here (Sam NeCTAR) blair should be, he got me along
09:01:51 #link today's agenda https://wiki.openstack.org/wiki/Scientific_working_group#IRC_Meeting_December_7th_2016
09:01:59 Hi stevejims sorrison_
09:03:05 o/
09:03:08 #chair b1airo
09:03:09 Current chairs: b1airo oneswig
09:03:12 Hi b1airo
09:03:18 g'day oneswig
09:03:34 OK - today's agenda was for monitoring, telemetry, tracing and so on
09:03:55 #topic Monitoring - following on from last week
09:04:12 #link Here's the etherpad https://etherpad.openstack.org/p/scientific-wg-telemetry-and-monitoring
09:04:14 fyi - things still a bit rowdy in my house so if i go quiet it's because a child needs rounding up
09:04:37 no problem! Calmer here - kids have gone to school. Any rowdiness is my own
09:05:16 lol
09:05:21 looks like sorrison_ has joined us..?
09:05:33 yip I'm here
09:05:45 This etherpad was gathered together last week, it would be great to flesh it out with more use cases and experiences
09:06:41 oneswig, do we want this to be use cases that are specifically different from just general IaaS cloud metrics?
09:07:01 Can we add links to solutions that we haven't used but are relevant? For example, for alerting in OpenStack there is Aodh
09:07:46 I think it makes most sense to focus on our specific use cases, but to focus exclusively on those doesn't make sense - prioritise those use cases but take all into account
09:08:04 just read the etherpad, surprised ELK wasn't mentioned
09:08:07 priteau: please do
09:08:23 is this meant for HPC-like deployments only, or for general stuff?
09:08:31 priteau: what is Aodh? link?
09:08:51 found https://github.com/openstack/aodh
09:09:07 yeah i suppose ELK can do almost real time monitoring and alerting sorrison_
09:09:11 #link Aodh: Provide alarms and notifications based on metrics http://docs.openstack.org/developer/aodh/
09:09:11 dariov: both I think - using the general to enable the specific ideally
09:09:12 Ceilometer alarms were ripped out into their own project
09:09:17 depends what you push into it
09:09:25 oneswig, cool, thanks
09:09:59 sorrison_ b1airo: how much of Ceilometer is used if you use Gnocchi for telemetry, and how does it function for you at nectar?
09:10:15 #link Full set of Telemetry services in OpenStack https://wiki.openstack.org/wiki/Telemetry#Services
09:10:33 I didn't know about Panko
09:10:41 oneswig we use the ceilometer compute agents to collect the data
09:10:45 There is one thing that is not clear to me. Why do we think Telemetry and Monitoring is a use case specific to research computing and not a general use case of large installations?
09:10:53 Isn't there a wider question of the extent to which we can reuse existing monitoring, for non-OpenStack resources?
09:10:59 and use agent-notification and collector to then send it to gnocchi
09:11:13 sorrison_, you actually made a diagram of our ceilometer -> gnocchi pipeline didn't you?
09:11:28 zioproto: I see additional use cases - added a section for covering that
09:11:31 the ceilometer-api is deprecated
09:11:36 or was that just on the whiteboard at some point?
09:11:41 hmm yeah I'll try find it
09:12:10 the other thing i think oneswig was particularly interested in was using influx with gnocchi
09:12:22 i wasn't sure what the state of your resurrection work was...?
09:12:24 It's at https://wiki.rc.nectar.org.au/images/a/a2/Ceilometer.png
09:12:53 I heard some bad experiences with influx (data corruption)
09:13:02 Can we add these details to the etherpad?
09:13:04 The influxDB driver for gnocchi is now passing all tests. So it is ready for review https://review.openstack.org/#/c/390260/
09:13:14 priteau what version of influx?
09:13:30 priteau: what were the reproducer conditions?
09:13:32 I would consider anything less than 1.1.0 as unusable at scale
09:13:32 sorrison_: I don't know, just random tweets here and there
09:13:38 not very scientific of me ;-)
09:13:55 Isn't the problem with InfluxDB that clustering isn't open source anymore?
09:13:58 gossip!
09:14:01 can anyone else access that image or should i stick it in a paste somewhere?
09:14:14 I found the most recent tweet I saw, it was using 0.13
09:14:16 yeah not sure, does it need login?
09:14:20 I saw it - had no idea IRC client could do that...
09:14:20 lol
09:14:24 https://twitter.com/punkgode/status/804852123544461312
09:15:02 That's only 4 days ago
09:15:11 The one thing about that diagram that will change is that ceilometer-api will be replaced by panko
09:15:13 ah yeah it is publicly accessible, was able to wget it
09:15:29 oneswig, depends on the irc client i guess - mine didn't display anything
09:15:55 yeah they're using an old version of influxDB
09:18:19 and there is an open-source clustering option isn't there sorrison_?
09:18:40 b1airo: there is for the older versions
09:19:02 there is an open source relay so you can run multiple influxes
09:19:03 Documentation for 1.1 says: "Open-source InfluxDB does not support clustering. For high availability or horizontal scaling of InfluxDB, please investigate our commercial clustered offering, InfluxEnterprise."
09:19:12 sorrison_: at what scale of metric throughput is clustering needed/advised, or is it purely for HA?
09:19:14 https://docs.influxdata.com/influxdb/v1.1/high_availability/clusters/
09:19:34 The relay is purely for HA, I believe.
09:19:51 I think clustering would only be for redundancy
09:20:27 and you can get HA with the relay. The catch is it doesn't auto-heal - you are basically paying for the auto-heal
09:20:36 you can manually heal without too much drama
09:20:40 OK thanks sorrison_ - do you use particular hardware for influx, and what throughput can it sustain?
09:20:59 we are pumping in around 30k points/second at peak on 1 server
09:21:22 that's ceilometer data from about 1000 compute nodes
09:21:38 how much storage does this consume per day?
09:21:48 about to ask the same :-)
09:21:51 That's not hugely taxing then for influx I guess
09:22:06 we currently have 51G of data and that is from 2 months
09:22:19 na, the influx host is hardly doing a thing
09:22:32 load average of about 1.5
09:22:39 sorrison_: the influx host is SSD-backed, yes?
09:22:48 sorrison_: Is that from a single OpenStack instance or are you able to operate a single monitoring infrastructure for multiple deployments?
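(For anyone following the log later: a rough sketch of what writing points into InfluxDB from Python looks like at rates of this order. The influxdb client package is the standard Python binding; the database name, measurement, tag layout and values below are illustrative placeholders, not the layout the Gnocchi InfluxDB driver under review actually uses.)

```python
# Sketch only: push a point into InfluxDB with the influxdb Python client.
# Database name, measurement and tags here are made up for illustration.
from datetime import datetime, timezone

from influxdb import InfluxDBClient

client = InfluxDBClient(host="influx.example.org", port=8086,
                        database="telemetry")

points = [
    {
        "measurement": "cpu_util",
        "tags": {"resource_id": "f9a6bb2c-0000-4f6e-9d6b-example"},
        "time": datetime.now(timezone.utc).isoformat(),
        "fields": {"value": 42.0},
    }
]

# A real collector would batch many points per call - NeCTAR reports
# roughly 30k points/second at peak from ~1000 compute nodes.
client.write_points(points)
```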
09:22:50 presumably that is largely dependent on the retention and aggregation policy
09:23:20 it's a physical host with 8 slow disks in raid 10
09:23:30 we're going to move it to an SSD host soon
09:23:43 b1airo: yes, those are what make the big difference for Influx storage requirements
09:23:50 yeah depends on granularity and retention of data
09:25:11 sorrison_: what time series data gets collected in this deployment?
09:25:41 we are collecting all the standard ceilometer data
09:26:21 we are also generating our own stats for reporting, eg number of instances, number of boots per day etc.
09:26:31 that amounts to libvirt stats, right? Does it also gather hardware monitoring & network data?
09:26:35 (just ducking in between meetings, we are about to set up monitoring on our new production deployment so I will read the logs later for ideas/best practice etc - thanks)
09:26:42 yeah libvirt stats
09:26:50 plus network interfaces and disk interfaces
09:27:03 We'll try to make it an interesting read dave_____ :-)
09:27:27 think those come care of libvirt anyway right?
09:27:41 do we have hypervisor stats going in?
09:27:45 this may render ugly but this is what gets collected on an instance
09:27:51 oneswig: I only have one entertaining(?) nugget in that we already accidentally knocked over our metrics server by pointing 540 OSD's worth of Ceph metrics at it too often
09:28:23 yeah that could do it
09:28:23 sorrison_: how extensible is it for data collection - what would I need to do to collect random things?
09:28:26 metrics we gather on an instance http://paste.openstack.org/show/591620/
09:28:42 dave_____: what were you doing with the metrics from all your other OSDs? :-)
09:28:48 it's really extensible. You can create your own resource types
09:29:21 eg. we created a resource type called "Institution" which has attributes of State, Location etc.
09:29:34 then add metrics on an instance like number of instances etc.
09:29:57 oneswig: :-) heh but all the disk I/O figures every 5 seconds probably was overkill (until/unless we have something with anomaly detection that can look at all the data which no human is going to look at)
09:30:55 dave_____ raises an interesting example - what would I need to do to add monitoring of (say) smartmon data for disks in a hypervisor or a ceph node?
09:31:48 I'm interested in monitoring the physical in the same context as the virtual metrics - wonder if it can do that
09:32:09 there are a couple of ways to do it, it would be nice if you could use something like collectd to push data into the ceilometer pipeline and then it would just appear in gnocchi
09:32:44 there are ipmi plugins for ceilometer to monitor hypervisors and the like
09:33:02 haven't used that yet, I think that's designed to work with ironic
09:33:14 sorrison_: do you know if those plugins play nicely with ironic? or do they fight over the ipmi?
09:33:26 sorrison_: I think it's groundwork yet to be built on, from what I know
09:33:49 yeah not sure, I just saw some ipmi stuff in the ceilometer codebase
09:34:06 ok, I didn't know they existed, I will check them out, thanks
09:34:08 sorrison_: you can certainly push into influx with collectd, which according to your diagram could then end up in gnocchi?
09:34:36 no you'd need to push to gnocchi
09:34:39 ... but would gnocchi know about telemetry data that didn't arrive through it
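(A rough sketch of the resource-type extensibility sorrison_ describes above, using python-gnocchiclient. The "institution" type, its attributes, the archive policy, credentials and endpoints are placeholders, and exact client method signatures may differ between gnocchiclient releases - treat this as an outline, not a recipe.)

```python
# Sketch only: create a custom Gnocchi resource type (like NeCTAR's
# "Institution"), attach a metric to a resource of that type, and push
# a measure.  All names, credentials and values are placeholders.
import datetime
import uuid

from gnocchiclient.v1 import client as gnocchi_client
from keystoneauth1 import loading, session

loader = loading.get_plugin_loader("password")
auth = loader.load_from_options(
    auth_url="https://keystone.example.org:5000/v3",
    username="gnocchi-writer", password="secret", project_name="admin",
    user_domain_id="default", project_domain_id="default")
gnocchi = gnocchi_client.Client(session=session.Session(auth=auth))

# Define a resource type with a couple of extra attributes.
gnocchi.resource_type.create({
    "name": "institution",
    "attributes": {
        "state": {"type": "string", "required": False, "max_length": 64},
        "location": {"type": "string", "required": False, "max_length": 128},
    },
})

# Create a resource of that type with a metric attached to it.
resource_id = str(uuid.uuid4())
gnocchi.resource.create("institution", {
    "id": resource_id,
    "state": "Victoria",
    "location": "Melbourne",
    "metrics": {"instances.count": {"archive_policy_name": "low"}},
})

# Push a measure, addressing the metric by resource + metric name.
gnocchi.metric.add_measures(
    "instances.count",
    [{"timestamp": datetime.datetime.utcnow().isoformat(), "value": 42}],
    resource_id=resource_id)
```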
09:34:51 no it wouldn't
09:35:20 Oh well, thought it was too good to be true
09:35:23 it's stored in influx with the metric ID as a tag which is a gnocchi ID
09:35:39 https://github.com/jd/collectd-gnocchi
09:35:43 for example
09:35:50 I guess if you knew how gnocchi stored stuff you could bypass it
09:35:56 ah nice b1airo
09:36:04 sorrison_: have you extended ceilometer/gnocchi to collect anything else on your deployment?
09:36:29 we should totally do that for nectar swift
09:37:30 oneswig we have written some custom collection for stuff. It used to use graphite and I converted it to use gnocchi https://github.com/NeCTAR-RC/nectar-metrics/blob/master/nectar_metrics/senders/gnocchi.py
09:37:32 ruddy hell Blair that repo is brand new, how did you find it? :-)
09:37:39 sorrison_, there's a public document somewhere describing this architecture you guys built for monitoring? I think it would be a good read for our openstack guys
09:38:22 dariov, I posted the link above somewhere
09:38:26 dariov, did you see the link earlier?
09:38:52 dariov: https://wiki.rc.nectar.org.au/images/a/a2/Ceilometer.png
09:38:53 do you mean the image? I was hoping for some written text :-)
09:39:03 yeah, saw that one already
09:39:08 this is your chance to get it! ;-)
09:39:29 yeah ;-)
09:39:30 So does ceilometer really use the notifications bus for sending metrics?
09:39:30 i probably have to pay him in beer on friday
09:39:54 oneswig yes
09:39:59 strewth
09:40:23 oneswig there was talk about getting rid of the ceilometer collection parts and just using something like collectd
09:40:35 rmq is quiet without ceilometer though, i think ours would get bored
09:40:46 collectd-gnocchi may be a first step
09:40:51 So every metric ceilometer sends is a json object - does it support bulk transfers?
09:41:27 yes there are settings in ceilometer for batching things
09:41:49 and you can also batch send metrics to gnocchi
09:42:00 b1airo: rmq activity/events is one of the things we want to monitor!
09:42:00 the snake is eating itself!
09:42:00 ah, what was that tail-eating snake from last week?
09:42:00 bindo
09:42:00 snap :-)
09:42:00 *bingo
09:42:01 ceilometer http POST to gnocchi
09:43:12 Sounds like Ceilometer is good for monitoring usage but not for troubleshooting - I guess that's its billing heritage?
09:43:53 oneswig yeah agree
09:43:59 I would say Ceilometer is becoming just about acceptable for monitoring
09:44:38 We've been looking at Monasca, which is slightly more decoupled but still has a dependency on functioning keystone
09:45:23 Influx is a contender here also
09:45:38 monasca uses influx?
09:45:47 Can do - or vertica, or...?
09:45:54 Looking at other options
09:46:30 One of the Summit talks mentioned WhisperDB, when looking for alternatives to Influx
09:46:31 One of the questions we currently have is how Monasca is extended to collect other metrics.
09:46:38 http://graphite.readthedocs.io/en/latest/whisper.html
09:46:42 (and off to my next meeting...)
09:46:46 verdurin: not heard of that, thanks
09:47:29 I think we are interested in Monasca for slurping user performance trace data alongside system performance metrics, and somehow coherently presenting the whole thing mashed up together
09:47:55 Mashed-up slurp.
09:48:07 which is where the scientific compute angle creeps in
09:48:15 verdurin: I know, what else is there for breakfast?
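(On the batched sends to Gnocchi mentioned above: Gnocchi exposes a batch endpoint so a collector can POST measures for many metrics in one HTTP request rather than one request per metric. A rough sketch against the REST API follows - the host, token and metric IDs are placeholders.)

```python
# Sketch only: batch-post measures for several metrics in one request via
# Gnocchi's batch API.  Endpoint, token and metric IDs are placeholders.
import datetime

import requests

GNOCCHI = "https://gnocchi.example.org:8041"
TOKEN = "example-keystone-token"

now = datetime.datetime.utcnow().isoformat()
payload = {
    # metric ID -> list of measures for that metric
    "5f1cb4a0-0000-4c7e-9d00-example1": [{"timestamp": now, "value": 12.3}],
    "8a2de970-0000-41aa-b1ff-example2": [{"timestamp": now, "value": 0.7}],
}

resp = requests.post(
    GNOCCHI + "/v1/batch/metrics/measures",
    json=payload,
    headers={"X-Auth-Token": TOKEN})
resp.raise_for_status()  # Gnocchi acknowledges accepted measures with 202
```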
09:49:00 The current test case is using Monasca to catch rabbit cluster fragmentation, which happens for us and we need to isolate why
09:49:23 If it can do that, it's a contender
09:49:34 oneswig fragmentation = partitioning?
09:49:47 ah yes, woolly terminology
09:50:15 so we have 3 controllers, nominally HA, but sometimes one leaves the cluster - but nobody else notices
09:50:27 so it goes wrong and monitoring is not seeing it
09:50:56 Saw some talks last week where this came up as their #1 gripe too
09:50:57 nagios plugin that runs rabbitmqctl cluster_status would fix that for you
09:51:13 we set ours to auto heal
09:51:34 sorrison_: it would find it, but I'm suspicious of why it's happening
09:51:43 with openstack having rabbit up is more important than dropping messages when deciding who is master
09:51:49 which is why I'd like to see it coupled with (say) MLAG events from the network
09:51:58 ok makes sense
09:52:30 zioproto: what do you use at SWITCH?
09:52:33 oneswig: only a few minutes left, can we talk about traces quickly?
09:52:46 priteau: sorry - please do - take the floor!
09:52:51 thanks priteau - i wasn't watching the time either
09:53:00 sorry for interrupting
09:53:25 so we have an interest in getting the OpenStack community to share workload traces from their own clouds
09:53:36 oneswig: we don't have Ceilometer in production
09:53:46 we use Nagios for a lot of alarm monitoring
09:53:46 for scientific research purposes, e.g. people developing new VM scheduling algorithms
09:53:56 and we have collectd with graphite
09:54:09 I have started to review existing (non-OpenStack) work
09:54:15 priteau: can you define the data you would like collecting?
09:54:16 #link https://etherpad.openstack.org/p/Cloud_workload_traces
09:54:30 oneswig: yes, that's the next action item for me
09:54:52 I saw some examples which were pretty basic - create vm xxx, etc
09:55:16 One of the questions I have to answer is for VM lifecycle, whether the nova SQL table is enough, or if we need nova-scheduler logs
09:55:32 Is it specifically about infrastructure events, or things like SLURM events?
09:56:14 I think SLURM events would be more relevant to existing parallel workload traces
09:56:34 I saw some interesting links from that etherpad - https://alexpucher.com/blog/2015/06/29/cloud-traces-and-production-workloads-for-your-research/ - follow a few links at the bottom of the post
09:57:14 priteau, the SQL table won't give us enough unless we are capturing transitions via triggers or something
09:57:26 I saw this one https://alexpucher.com/blog/2015/07/20/cloud-simulators-for-reasearch-and-development/ and wondered if there is a way to faithfully create the stimulus of a large number of compute hypervisors for a test control plane
09:57:33 b1airo: what do you call transitions?
09:57:37 you could use the notifications that are emitted
09:57:46 eg. instance.start, instance.end
09:58:21 i thought it would also be the initial api calls, e.g., boot
09:58:37 these events are actually stored in panko
09:59:17 sorrison_: great point - distill those into something generic, that's the data - I've heard it can be tricky to link the notifications back to an originating API call though - have you seen that done?
10:00:00 when you say link it back to the api call what does that mean? what data needs to be linked?
10:00:06 b1airo: depends what you want to do with the data. A more fine-grained gathering of events would allow you to measure how long a VM takes to get created
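(To illustrate the notification route discussed just above, a rough sketch of a listener on the OpenStack notification bus that records instance lifecycle events keyed by request ID, which is one way of tying them back to the originating API call. The transport URL, topic name and payload keys are assumptions and vary by deployment and release.)

```python
# Sketch only: listen on the notification bus and record instance lifecycle
# events (e.g. compute.instance.create.start / .end), keyed by request ID so
# events from one API call can be grouped.  URL, topic and payload keys are
# deployment-dependent placeholders.
import oslo_messaging
from oslo_config import cfg


class LifecycleEndpoint(object):
    def info(self, ctxt, publisher_id, event_type, payload, metadata):
        if not event_type.startswith("compute.instance."):
            return
        request_id = ctxt.get("request_id")  # ties events to one API call
        print(request_id, event_type, payload.get("instance_id"),
              metadata.get("timestamp"))


transport = oslo_messaging.get_notification_transport(
    cfg.CONF, url="rabbit://guest:guest@rabbit.example.org:5672/")
targets = [oslo_messaging.Target(topic="notifications")]
listener = oslo_messaging.get_notification_listener(
    transport, targets, [LifecycleEndpoint()], executor="threading")
listener.start()
listener.wait()
```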
10:00:20 Ah, reconstructing a hierarchy of notifications from a single API call
10:00:24 yeah you can do that with ceilometer
10:00:27 across services
10:00:43 we are out of time all. Last words!
10:00:49 eg. you get a notification when instance create starts and ends, plus a bunch along the way
10:00:51 It is interesting but not exactly what I had in mind, at least for the scheduling research use case, it's really the creation / deletion events that matter
10:01:03 sorrison_: there is also nova.instance_actions
10:01:12 in SQL
10:01:15 that's exactly what the notifications do
10:01:25 time up, gotta close
10:01:29 #endmeeting
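(As a postscript to the nova.instance_actions pointer at the close: a rough sketch of pulling create/delete events for a workload trace straight from the nova database. The connection string and column names are assumptions based on the Newton-era schema and should be checked against the actual deployment.)

```python
# Sketch only: extract a simple create/delete workload trace from the
# nova.instance_actions table.  Connection string and column names are
# assumptions - verify against the schema of your nova release.
from sqlalchemy import create_engine, text

engine = create_engine("mysql+pymysql://nova:secret@db.example.org/nova")

query = text("""
    SELECT instance_uuid, action, request_id, start_time, finish_time
    FROM instance_actions
    WHERE action IN ('create', 'delete')
    ORDER BY start_time
""")

with engine.connect() as conn:
    for row in conn.execute(query):
        # Each row is one API-level action against an instance; request_id
        # is the same identifier carried on the notification bus, so the
        # two sources can be joined if finer-grained events are needed.
        print(row.instance_uuid, row.action, row.start_time, row.finish_time)
```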