19:00:02 #startmeeting Poppy Weekly Meeting
19:00:03 Meeting started Thu Sep 18 19:00:02 2014 UTC and is due to finish in 60 minutes. The chair is amitgandhinz. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:00:04 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:00:06 The meeting name has been set to 'poppy_weekly_meeting'
19:00:13 #topic Roll Call
19:00:20 o/
19:00:26 o/
19:00:29 hi everyone!
19:00:43 who else here for Poppy?
19:01:19 megan and guimarin cant make it today
19:01:27 o/
19:01:32 Howdy
19:01:52 ok, lets get started....
19:02:08 #topic Review Last Weeks Items
19:02:16 #link Agenda: https://wiki.openstack.org/wiki/Meetings/Poppy
19:02:31 #link: Last Week: http://eavesdrop.openstack.org/meetings/weekly_poppy_meeting/2014/weekly_poppy_meeting.2014-09-11-19.00.html
19:02:58 amitgandhinz to investigate MaxCDN CDN Manager API for master/sub accounts
19:03:04 so i still havent done that =/
19:03:16 #action amitgandhinz to investigate MaxCDN CDN Manager API for master/sub accounts
19:03:19 hi dhellmann_
19:03:30 obulpathi to keep bugging the atlanta openstack meetup organizers to get on the schedule
19:03:38 any progress on that?
19:03:48 today is the OpenStack meetup
19:04:02 I am going to talk to the organizers and get the talk organised
19:04:03 have you managed to contact dhellmann regarding getting on the schedule?
19:04:09 ok
19:04:24 #action obulpathi to keep bugging the atlanta openstack meetup organizers to get on the schedule
19:04:50 ok thats it from last weeks items
19:04:59 #topic Updates on Blueprints
19:05:08 #link https://blueprints.launchpad.net/poppy
19:05:23 create-service is now in review right?
19:05:29 Yes
19:05:39 updated
19:05:51 get-service ?
19:05:54 tonytan4ever: ^
19:06:19 That one is in good progress, I am about to roll out a PR for that today.
19:06:20 started or good progress or slow progress?
19:06:25 oooh cool
19:06:42 obulpathi: health endpoint?
19:06:49 its in final review
19:07:01 cool
19:07:06 addressed Yours and Tonys comments .. now waiting for final review
19:07:17 any other bp's in progress?
19:07:50 can we add the mockcassandra as a blueprint
19:07:59 yes, please register it obulpathi
19:08:02 initially I thought its plug and play ..
19:08:18 but we need to change the functionality quite a bit ..
19:08:24 because its written in 2010 ..
19:08:32 functionality of mock cassandra right?
19:08:46 we need to change the implementation to meet the changes in cassandra
19:08:47 yes
19:09:04 ok, I will register
19:09:08 ok cool, thanks
19:09:31 I will register a delete-service afterwards, and I will pick it up sometime next week.
19:09:54 there already is a delete-service bp there
19:09:59 can I take up one of the service tasks?
19:10:04 Oh ok.
19:10:17 obulpathi: definitely
19:10:25 Yeah, list-services is up for grabs.
19:10:35 cool .. ok
19:10:35 there is list, also purge
19:10:39 will take it up
19:10:47 and patch
19:10:56 ok
19:12:10 #topic new items
19:12:35 nothing was on the agenda here. i think everyone has been busy getting the existing endpoints and functionality implemented
19:12:44 so moving on...
19:12:48 #topic Open Discussion
19:12:57 first one here is from me
19:13:12 ive been thinking about how we do the consumption of logs from the cdn providers
19:13:33 the initial thoughts were to have api endpoints to provision log sinks from cdn providers
19:13:54 eg they send logs to S3, FTP, Swift, etc and operators would configure that through an api
19:14:18 then i thought, does it even need to be done through the api. since operators will have relationships with those providers
19:14:27 should they just do it external to poppy
19:14:44 so poppy doesnt concern itself with log collection, and metering
19:14:56 thats up to the operators to do
19:15:01 agreed +1
19:15:08 +1 on that.
19:15:15 even the log formats and storages are widely different
19:15:21 poppy can concern itself with presenting analytics that are aggregated from the logs though
19:15:52 so if operators want to use that, they build their own aggregation tool (hadoop, or whatever) and feed the data back into the poppy datastore in aggregated form
19:16:02 poppy then has api's to query that aggregated data
19:16:16 thus providing poppy users with the useful analytics they desire
19:16:18 thoughts?
19:16:40 Good idea
19:16:48 It also keeps the product lean
19:16:48 Sounds good to me. We just need to define a standard data format to feed to poppy.
19:16:53 amitgandhinz: does that mean poppy will define standard metrics?
19:16:56 tonytan4ever: +1
19:17:01 malini: yes
19:17:14 poppy would define that we store analytics data in this format
19:17:23 operators do whatever they want to get their logs into that format
19:17:35 poppy queries that data and returns the json to the user
19:17:38 what value do we add by providing an API to query this data?
19:18:00 I am assuming there will be tons of tools out there that can better dice & visualize this data
19:18:10 users can bind stats calls to these endpoints, dashboards can visualize it in graphs etc
19:18:19 CDN is all about distributing the data
19:18:41 so if we provide this info, it gives useful info to the end users of poppy on how the content is being consumed
19:18:43 CDN users want to know how their cdn is doing - cache hits/misses, bandwidth, etc
19:18:48 yes
19:19:10 we wouldnt allow users to create their own queries though
19:19:27 we would define endpoints such as GET /stats/cachehits
19:19:33 or GET /stats/bandwidth
19:19:34 etc
19:19:48 so its a well defined subset of all the data we aggregate
19:20:01 I think it's better to at least allow users to specify a time window.
19:20:08 agreed
19:20:15 time window, marker/limit
19:20:23 +1
19:20:34 we may also decide (or let the operator decide) the ttl of records in the datastore
19:20:39 To make sure I understand it right - Providers will send the data they have in a poppy specific format & poppy will return this data back via API?
19:20:40 so that it doesnt grow crazy big
19:21:18 providers will send logs to a location. operators will ingest this data (via their own service) and aggregate it. The aggregated data is fed back into poppy
19:21:34 poppy then returns that aggregated data back via the API
19:21:52 The aggregated data is in a well defined schema
19:22:21 ok..starting to make sense :)
19:22:33 so do we have any idea of the stats Akamai/CloudFront/Fastly/MaxCDN provide?
19:22:42 Right on.
19:22:45 there is a whole bunch
19:22:51 some in common, some not
19:22:58 it still requires more investigation
19:22:59 also do they allow custom queries by end users?
19:23:01 ok
19:23:09 mostly dont support custom queries
19:23:18 many of them dump logs (some stream logs) periodically
19:23:24 we then have to parse those logs
19:23:32 we can sometimes define what goes into those logs
19:23:44 eg fastly supports apache style formatting to define the log output
19:23:52 im not sure what akamai or maxcdn does
19:23:56 cool
19:24:04 "we then have to parse those logs" - 'we' are operators, not Poppy -rt amitgandhinz?
19:24:12 storing and ingesting logs is not our concern
19:24:15 in that context - operators
19:24:27 thanks
19:24:29 but, let me propose a slightly alternative idea
19:24:35 all ears
19:24:46 so how about we store the ingested data
19:25:03 and expose that?
19:25:21 the ingested data is the data obtained from logs
19:25:34 and should not be too large to handle ...
19:25:43 That's what we are going to do right?
19:25:54 Parse data from logs, store in poppy standard format
19:25:55 obulpathi: how is it different from amitgandhinz's suggestion?
19:25:57 then expose that ?
19:26:09 let me rephrase my question
19:26:22 we dont want to expose raw logs
19:26:25 its too large
19:26:32 no, not the raw logs
19:26:35 plus we need a standard format
19:26:40 the digested information only
19:27:09 are you suggesting to make the aggregated logs available via a swift container?
19:27:19 sort of ..
19:27:28 hmmm...possible
19:27:38 operator dumps aggregated logs into a swift container
19:27:48 my question essentially boils down to will we be answering all the questions that a user has about the content distribution?
19:28:05 poppy is responsible for parsing aggregated logs from swift, storing in datastore, and then making it available via the api
19:28:27 operator is responsible for parsing raw logs, aggregating, and storing in swift
19:28:54 yes, that part is clear to me
19:29:05 lets say we expose api
19:29:11 stats/cache_hits
19:29:17 stats/bandwidth etc ..
19:29:25 i do like the idea of giving users more data that we may not necessarily be querying
19:29:52 when a user comes and says .. hey I need this metric
19:29:59 we should not be adding another endpoint
19:30:03 thats my only concern
19:30:15 but simplicity wins
19:30:32 I am in with the proposal you suggested
19:30:48 but just wanted to toss the idea and discuss if it has any merits
19:30:52 ok, it still requires more thought, but this is a good starting point
19:30:52 thanks :)
19:30:58 +1
19:31:23 To achieve what you are suggesting, obulpathi, we need to carefully design this standard format.
19:31:24 i think once we analyze the data that can be analyzed, will can proceed (that was confusing ;-))
19:31:34 s/will/we
19:31:48 cool
19:31:55 awesome, thanks obulpathi
19:32:02 thanks
19:32:06 ok, anyone have anything else they want to discuss?
19:32:25 i am going to be out tomorrow and all next week
19:32:35 malini: will run next weeks meeting
19:32:46 malini malini malini :P
19:32:50 we'll make sure you get all the #actions amitgandhinz ;)
19:33:02 * amitgandhinz considers not coming back
19:33:10 :)
19:33:13 :D
19:33:14 mmaalliinnii...
19:33:26 ok, anything else?
19:33:30 going once.....
19:33:38 going twice.......
19:33:44 nothing from me
19:33:50 cool, thanks everyone
19:33:54 thanks!
19:34:00 bye!
19:34:03 #endmeeting
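
The stats API floated under Open Discussion (fixed endpoints such as GET /stats/cachehits, a caller-supplied time window, marker/limit paging, and a TTL on stored records) could be sketched roughly as below. This is only an illustration and not Poppy code: the meeting left the standard format undecided, so the StatRecord fields, the function names query_stats and prune_expired, and the plain list standing in for the datastore are all assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class StatRecord:
    """One aggregated data point. Hypothetical schema, not agreed in the meeting."""
    service: str    # CDN service the data point belongs to
    metric: str     # e.g. "cachehits" or "bandwidth"
    timestamp: int  # start of the aggregation bucket, Unix seconds
    value: float


def query_stats(
    records: List[StatRecord],
    service: str,
    metric: str,
    start: int,
    end: int,
    marker: Optional[int] = None,
    limit: int = 10,
) -> Tuple[List[StatRecord], Optional[int]]:
    """Return (page, next_marker) for records with start <= timestamp < end.

    marker is the timestamp of the last record on the previous page.
    Assumes timestamps are unique per (service, metric).
    """
    matching = sorted(
        (
            r for r in records
            if r.service == service
            and r.metric == metric
            and start <= r.timestamp < end
            and (marker is None or r.timestamp > marker)
        ),
        key=lambda r: r.timestamp,
    )
    page = matching[:limit]
    # Hand back a marker only when more records remain after this page.
    next_marker = page[-1].timestamp if len(matching) > limit else None
    return page, next_marker


def prune_expired(records: List[StatRecord], now: int, ttl_seconds: int) -> List[StatRecord]:
    """Keep only records younger than the operator-chosen TTL."""
    return [r for r in records if now - r.timestamp < ttl_seconds]


# Usage: five hourly-style buckets, read back two per page.
records = [StatRecord("mysite", "cachehits", t, float(t)) for t in range(0, 500, 100)]
page, next_marker = query_stats(records, "mysite", "cachehits", start=0, end=500, limit=2)
```

Under this sketch, GET /stats/cachehits and GET /stats/bandwidth would both map onto query_stats with a different metric argument, so serving a new metric would need a new metric name in the aggregated data rather than new query logic. Whether that answers the concern about adding an endpoint per metric is left open, as it was in the meeting.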