20:02:17 <Rockyg> #startmeeting log-wg
20:02:18 <openstack> Meeting started Wed Mar 4 20:02:17 2015 UTC and is due to finish in 60 minutes. The chair is Rockyg. Information about MeetBot at http://wiki.debian.org/MeetBot.
20:02:19 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
20:02:23 <openstack> The meeting name has been set to 'log_wg'
20:03:05 <dhellmann> o/
20:03:20 <Rockyg> #topic intros
20:04:38 <Rockyg> I'd like to get everyone on similar pages, so if you can do a quick intro of who you are, what part of the community and what your top logging issue/goal is, that will get us rolling
20:04:49 <Rockyg> No novels, please, though
20:05:14 <dhellmann> Hi, I'm Doug Hellmann, Oslo PTL and TC member.
20:05:17 <bknudson> Brant Knudson -- Keystone core reviewer -- hoping to have logs that can be used to debug issues.
20:05:49 <masteinhauser> Hi, I'm Myles Steinhauser, Engineer at Blue Box Group. Automated alerting and Ops of primary interest.
20:05:53 <nkrinner> hi, my name is Nanuk Krinner, i am a software developer for suse and work on openstack. i attended the kickoff meeting at the kilo summit and want to help improving the logging situation
20:05:59 <barrett> Carol Barrett -- Win The Enterprise WG -- monitoring/logging is a top issue for Enterprise Deployment of OpenStack
20:06:12 <Rockyg> Rocky Grober -- product group, evangelist, QA -- consistent logs so problems can be quickly identified
20:06:25 <jokke_> Erno Kuvaja - Glance Core reviewer - and my goal is to have logs that make it possible for us to debug and support/ops to support openstack in real life environments
20:07:55 <Rockyg> kewl. let's discuss state of logging and community wrt logging
20:08:02 <ppetit_> Patrick Petit, working on OSt logs analytics project in Fuel. Goal is to bring logs to a level of consistency that can be effectively used for operations
20:08:19 <Rockyg> #topic current state
20:08:51 <bknudson> the keystone logs are useless.
20:09:04 <jokke_> "~agreed not great" :P
20:09:35 <nkrinner> yesterday
20:09:41 <Rockyg> dhellmann and Oslo have created a log library that can help with getting consistent formats. Especially headers, etc
20:09:49 <nkrinner> sorry
20:09:53 <jokke_> I'd like to hear if someone thinks that there is actually a project out there that has great logging
20:10:11 <bknudson> keystone uses the log library... doesn't help us log what's needed.
20:10:21 <Rockyg> I just looked at keystone's log docs and they seem to have vanished. Last cycle, they looked to be promoting oslo_log
20:10:37 <dhellmann> it might be more constructive to talk about some specific deficiencies
20:10:54 <Rockyg> dhellmann: good point
20:11:03 <dhellmann> I assume everyone had a chance to comment on http://specs.openstack.org/openstack/openstack-specs/specs/log-guidelines.html for example?
20:11:08 <jokke_> tbh I don't think our biggest problem is even near oslo.log
20:11:38 <nkrinner> yesterday i worked on a heat issue where the logs provided no information about why starting of a heat stack failed. the other day when i had issues with nova they were very helpful though. the state of logs varies over the projects.
20:11:54 <bknudson> here's a specific deficiency, at least on keystone I think it's needed to have some way to correlate a request with the error response, but we don't have that, either in the client or the logs or the server.
20:12:29 <jokke_> bknudson: I'd say :%s/or/and/
20:13:05 <bknudson> y, I also have specific things where I would have liked the logs to show me what I was doing wrong and there was nothing, had to resort to the debugger.
20:13:17 <dhellmann> nkrinner: I hope you filed a bug?
20:13:26 <jokke_> bknudson: I think one of the most important things is to be able to correlate the user action to possible middle man logs and at the end to the actual service logs
20:13:31 <dhellmann> bknudson: that's a good one, and I think there's another spec related to that...
20:13:43 <nkrinner> dhellmann: not yet, will do so
20:13:46 <Rockyg> I believe it's a glance spec
20:13:46 <masteinhauser> log correlation via req-* and tracking through the entire request pipeline and fanout would be incredibly useful.
20:13:59 <dhellmann> #link https://review.openstack.org/156508
20:14:08 <dhellmann> nkrinner: please do!
20:14:09 <bknudson> jokke_: actually, in keystone it's probably easier since there aren't middlemen... the middlemen is more difficult.
20:14:23 <ppetit_> w.r.t correlating on request id a first step would be to have all projects sync to oslo.log
20:14:37 <bknudson> for some reason https://review.openstack.org/#/c/156508/ says cinder... not sure why it would be specific.
20:14:39 <jokke_> bknudson: nope ... every other service has that middleman towards keystone ;)
20:14:42 <Rockyg> nkrinner: were debug logs of any help or you had to do it interactively?
20:15:15 <dhellmann> bknudson: I think they just wanted to start there
20:15:24 <Rockyg> sorry, bknudson...
20:16:05 <dhellmann> ppetit_: all of the projects are using *a* version of the oslo logging code, except possibly swift. The work we did this cycle was to turn that into a library, but it has been in the incubator for a while.
20:16:09 <bknudson> Rockyg: actually, debug was enabled, and the debug log didn't help... I know where to add the debug log now.
20:16:14 <jokke_> bknudson: iirc that cinder spec was made for one project just to have some scope for it (same as I wrote the start of error code spec under glance just because it was familiar environment to get started with)
20:16:48 <nkrinner> Rockyg: i looked at the debug logs and did not find anything helpful there. I don't have them here, but will look at them and file a bug with relevant information
20:16:50 <Rockyg> dhellmann: would it be reasonable to take the cinder spec and create a cross spec for all projects?
20:17:33 <dhellmann> Rockyg: yes, that was the feedback ttx left after the cross-project meeting discussed the spec
20:17:34 <bknudson> I would love to see https://review.openstack.org/#/c/156508/ for keystone... that's one part of it... other parts are 1) logging the request ID in keystone log, 2) displaying request ID in CLI
20:17:37 <masteinhauser> dhellmann: glad you mentioned swift, that has been incredibly painful for us dealing with logging.
20:18:02 <dhellmann> masteinhauser: I'm not sure of the current situation there, I haven't looked in a while.
20:18:15 <dhellmann> masteinhauser: have you provided that feedback to the swift team?
20:18:31 <masteinhauser> dhellmann: I don't know the exact refs we are running in production, I can check.
20:18:33 <ppetit_> dhellmann the problem with projects using the incubator version is that the request id and tenant id are not properly rendered
20:18:41 <masteinhauser> dhellmann: I have not, yet. Finally diving more into the community aspects.
20:19:01 <Rockyg> can we get a volunteer to generate the crossproject version of spec for https://review.openstack.org/#/c/156508/ ?
20:19:09 <dhellmann> ppetit_: ok, that problem may have been fixed but not synced, but at this point we do want projects to start using the library. Full adoption may not happen until L though.
20:19:27 <dhellmann> Rockyg: someone should work with the original author on that, rather than starting a new spec
20:19:38 <ppetit_> It should happen before L IMO
20:20:00 <ppetit_> Jay Pipes has an action item to make that happen
20:20:03 <dhellmann> ppetit_: I would have liked for that to happen, too, but there were some delays early in the cycle and we're only a few weeks away from feature freeze at this point
20:20:16 <Rockyg> dhellmann: agreed.
20:20:22 <dhellmann> ppetit_: I'm not saying it *shouldn't* just that we shouldn't count on it. :-)
20:20:52 <Rockyg> but a new version of the lib was just released. it should help?
20:20:55 <dhellmann> full adoption of the library is one of my goals for L, though
20:21:22 <jokke_> I'd be happy if the bikeshedding around the X-spec is done by start of Liberty so people get to work on it right away
20:22:04 <Rockyg> dhellmann: noticed that the oslo log docs are getting fleshed out, so should make it easier for developers to implement against
20:22:16 <dhellmann> jokke_: the best way to make that happen is to participate in the conversation and push it to conclusion
20:22:33 <Rockyg> Abhishek Kekane is the owner of 156508
20:22:47 <dhellmann> Rockyg: thanks, I think they're looking fairly good now
20:22:57 <Rockyg> agreed.
20:23:25 <Rockyg> I would love to see some code snippets, though. So devs can cut and paste
20:24:01 <ppetit_> beyond consistency of format we are certainly facing also a lack of consistency and effectiveness of the logs produced at INFO level for the consumption of operators…
20:24:15 <Rockyg> Also want to point out that I saw in passing a question on QA yesterday as to whether oslo_log should be used for tests now. We should socialize that idea.
20:24:25 <dhellmann> ppetit_: agreed, that is a big goal for sdague's spec linked earlier
20:24:59 <Rockyg> Again, I think examples of what should be in INFO go a long ways to getting devs to do it.
20:24:59 <dhellmann> Rockyg: there are fixtures to hook up logs for tests, if that's what you mean?
20:25:22 <dhellmann> I think that's covered by http://specs.openstack.org/openstack/openstack-specs/specs/log-guidelines.html isn't it?
20:25:30 <Rockyg> I think so. It was a one liner
20:26:15 <masteinhauser> ppetit_: agreed, we are running all production stacks at info and some at debug logging consistently to capture usable error output. I can provide examples with some research.
20:26:28 <jokke_> ppetit_ & dhellmann: I think that sdague's spec is good start to that direction, then we just need to beat that scaryness out of WARN ;)
20:26:41 <Rockyg> log-guidelines have general examples. I think we need to get a member of each project to own log improvement
20:27:09 <dhellmann> jokke_: right, the next step there may be to open bugs against projects with specific cases of where logging is not matching those guidelines
20:27:37 <dhellmann> Rockyg: having an owner per project makes sense
20:27:58 <ppetit_> We can take a share of that effort
20:28:03 <dhellmann> ppetit_: ++
20:28:22 <jokke_> I think the bigger problem is to get people out of their old habits and "It's fine on devstack" mentality
20:28:26 <Rockyg> #action get log message owner for each project and add to Thierry's page tracking them
20:28:51 <dhellmann> jokke_: this is going to take a fair amount of education, which will take time, but we can improve the current situation as we go, too
20:29:03 <dhellmann> Rockyg: "Thierry's page"?
20:29:04 <jokke_> dhellmann: I totally agree
20:29:14 <bknudson> #link https://wiki.openstack.org/wiki/CrossProjectLiaisons
20:29:16 <bknudson> ?
20:29:27 <dhellmann> bknudson: ah, yeah, that makes sense
20:29:49 <dhellmann> Rockyg: does this group have a home page in the wiki?
20:30:06 <jokke_> dhellmann: I'd really like to hear about the ideas how to do that ... one person onboard does not make miracles if most active devs and cores disagree :(
20:30:12 <Rockyg> https://wiki.openstack.org/wiki/LogWorkingGroup
20:30:27 <Rockyg> thanks bknudson. You beat me to it
20:30:35 <jokke_> like just making some minor adjustments on Glance was half a year long rocky road to fight through
20:30:48 <bknudson> Are there cores or projects that don't think logging is useful?
20:31:04 <bknudson> obviously it will slow development to -1 over logging issues.
20:31:22 <Rockyg> One person does not, but getting the common devref and fixing project devref gets newbies started
20:31:55 <dhellmann> #link https://wiki.openstack.org/wiki/CrossProjectLiaisons#Logging_Working_Group
20:32:13 <jokke_> bknudson: it's not about logging being useful, it's more about being too verbose is ok or hiding stuff under DEBUG is ok as everyone runs their clouds on debug, right
20:32:13 <Rockyg> I noticed some projects have hacking rules for logs. Propose development of common hacking rules?
20:32:15 <dhellmann> jokke_: around here we have to lead by example
20:32:23 <jokke_> dhellmann: ++
20:32:47 <ppetit_> BTW we are releasing a tool Heka / ElasticSearch / Kibana with pre-configured parsers and templates that help debugging logs. It's being packaged as a Fuel plugin but can be easily extracted
20:32:47 <dhellmann> Rockyg: most of those rules have to do with enforcing the translation markers, don't they? I'm not sure we can use code analysis to enforce the other guidelines.
20:33:13 <Rockyg> Translation is the big one
20:33:29 <jokke_> and important
20:33:30 <dhellmann> I do really believe that if we identify deficiencies and actually start fixing them then people will be on board when they see the improvements.
20:33:44 <dhellmann> So let's focus on that before making more rules or tools.
20:33:45 <Rockyg> dhellmann: is the global requirement to set log style to syslog in?
20:33:58 <dhellmann> I'm not aware of that as a requirement?
20:34:17 <Rockyg> requirements.txt Sorry. I know you had put in a patch
20:34:36 <Rockyg> So the format could be set in global instead of project by project
20:34:58 <dhellmann> oslo.log is in the global requirements list, yes. Not all projects are currently using the library version of the oslo logging module, though.
20:35:11 <jokke_> Rockyg: requirements is not enforcing configs, but providing list of required dependencies
20:35:25 <Rockyg> #action Identify guidelines that can make sense to add to Hacking
20:35:59 <dhellmann> we can work on oslo.log adoption and cleaning up the info vs. debug level issues in parallel, but we will need someone to produce those patches
20:36:29 <Rockyg> dhellmann: right. Got a list of those here: https://etherpad.openstack.org/p/Log-Rationalization
20:36:52 <bknudson> it's been on my list o' things to do in keystone for a long time.
20:36:55 <dhellmann> #link https://etherpad.openstack.org/p/Log-Rationalization
20:37:05 <bknudson> but it's a lot of work and also not the most interesting.
20:37:14 <jokke_> Perhaps we should move on, I think we had other topics in the list still
20:37:22 <Rockyg> under Developer Docs but needs updating...
20:37:25 <dims> bknudson: nova is now oslo.log enabled, we hit a few bumps, one bump still in progress (patch is in nova)
20:37:53 <dhellmann> jokke_: ++
20:38:10 <bknudson> dims: I was talking about switching to oslo.log ... that's easy ... was talking about info vs debug issues & useful logging in general.
20:38:11 <Rockyg> #topic Error codes
20:38:15 * dhellmann looks for his copy of the agenda
20:38:18 <bknudson> *wasn't*
20:38:32 <dims> bknudson: ack
20:38:50 <Rockyg> I started a x-project version of jokke's spec
20:39:23 <jokke_> this seems to be topic that divides people more than request IDs
20:39:34 <Rockyg> #link https://review.openstack.org/#/c/127482
20:40:15 <bknudson> where's the x-project version?
20:40:59 <Rockyg> I think a big issue is to separate the layers of the system so folks understand better what the message focus is. bknudson: still working on it. Focus to get it to review before EOW
20:41:41 <bknudson> Rockyg: ok, thanks.
20:42:00 <jokke_> Rockyg: let me know if you are/get stuck with it
20:42:25 <Rockyg> I figure the right way to do these sorts of xproject specs is to do them, then link to the project versions, which will have the right level of detail for devs
20:42:26 <bknudson> I thought this was going to define an error document or header?
20:43:24 <jokke_> bknudson: I still do not know where that header idea has come from nor really a use case for it
20:43:26 <Rockyg> So, the thinking is: for every error, create a code with a summary description. Then the payload will have specifics of instance, etc
20:43:44 <bknudson> I'm not a fan of numeric codes since it's hard to remember what the mapping is.
20:44:31 <Rockyg> #link http://www.faqs.org/rfcs/rfc5424.html is syslog spec
20:44:40 <dhellmann> Rockyg: +1 on linking from project blueprints to the cross-project specs
20:44:46 <jokke_> bknudson: they tend to be easier to remember than uuids and takes less space than the camel text ... the space is limited after all
20:45:10 <Rockyg> The message format includes "MSGID"
20:45:35 <bknudson> store it in the cloud.
20:45:46 <Rockyg> We are going to split the code into Proj,component, then number
20:46:11 <Rockyg> So, three letters for the project is the first part of the code
20:46:18 <jokke_> bknudson: the whole point for those is exactly that ... enabling the possibility of building knowledge bases around our erroring
20:46:23 <bknudson> 3 letters should be enough for anyone.
20:46:34 <Rockyg> Yup.
20:46:52 <Rockyg> I'll get the spec out pdq
20:46:59 <Rockyg> Then it will make more sense.
20:47:04 <dhellmann> yeah, let's save the format discussion for the spec review
20:47:17 <jokke_> ++
20:47:54 <jokke_> should we move on to Ops meetup
20:48:03 <Rockyg> also, with the way syslog works, one log message could be encapsulated in another if we wanted to cascade to track the effects
20:48:14 <Rockyg> #topic Ops meetup
20:48:28 <Rockyg> I can be there, but I've been waffling. Should I?
20:49:16 <jokke_> so we have some real ideas, specifically around req ids and error codes ... I'd like to get those two specs even if not ready out there and the links to those reviews to the ops for feedback
20:49:23 <Rockyg> I think if we have referrer-id and error code specs in review, we can get ops input
20:49:48 <dhellmann> is there already an agenda for that meetup, or is it being put together on site?
20:49:59 <Rockyg> I also think we need to socialize the logging guidelines spec to ops more.
20:50:06 <jokke_> ++
20:50:32 <Rockyg> I think they could create a spec or two that will make the guidelines more useful for them
20:51:07 <Rockyg> dhellmann: will you be at the meetup?
20:51:15 <dhellmann> Rockyg: no, I'm afraid not
20:51:30 <Rockyg> Any other Oslo core?
20:51:48 <dhellmann> I haven't had anyone say they're going
20:52:12 <Rockyg> I'd love to find a developer who can step up and become the lead for oslo-log, but that's not a small order
20:52:28 * jokke_ volunteers dhellmann :P
20:52:35 <dhellmann> I think dims and I have that covered. I'm more concerned about the work on the other projects.
20:52:48 <Rockyg> Yeah. Like he doesn't have enough hats to wear...
20:52:55 <jokke_> dhellmann: ++
20:52:58 <dhellmann> we don't have that much work to do on the library itself, afaik
20:53:26 <Rockyg> dhellmann: I think you're right. It's really close at this point
20:53:46 <jokke_> dhellmann: agreed and as said in Paris, if there comes something we need to get done, I'm more than happy trying to help
20:54:08 <dhellmann> jokke_: thanks, I'll keep that in mind when the time comes
20:55:11 <Rockyg> OK. So, maybe I can go to the meetup and try to recruit devs who turn up to champion oslo-log on their projects?
20:55:21 <Rockyg> If they are there for Ops, they care.
20:55:39 <dims> Rockyg: +1 :)
20:55:49 <dhellmann> Rockyg: that's a good idea. I would also bring it up as a need during the cross-project weekly meeting, and see about getting PTLs to help identify liaisons
20:56:03 <Rockyg> Carol, will you be there?
20:56:10 <jokke_> Rockyg: that and any input from the ops side is valued ... if you get the message through that we want to do this right, that would be great
20:56:20 <Rockyg> Folks focused on WTE will want good logs
20:56:32 <Rockyg> OK. I'm there.
20:56:47 <Rockyg> Meet with boss this afternoon and I'll let him know I'm going...
20:56:58 <jokke_> :D
20:57:22 <Rockyg> #topic priorities
20:57:32 <dhellmann> fwiw, I don't think the folks working on these logging patches need to necessarily be cores on the projects, so we should be able to recruit from a wider pool than might be at the midcycles
20:57:45 <jokke_> dhellmann: ++
20:58:00 <Rockyg> I think refactoring log messages will be good low hanging fruit
20:58:33 <bknudson> lots of the problems with keystone logging are actually problems with the whole design of the keystone code.
20:58:35 <jokke_> I think the priority nro 1. should be educating about the guideline spec nro 2. getting feedback and hammer down those two other specs by start of Liberty
20:58:56 <Rockyg> 3. get liaisons from projects
20:59:21 <jokke_> bknudson: that's big problem on other projects as well. It's not easy to figure out what should be logged and where
20:59:24 <dhellmann> bknudson: that would make it harder for a new contributor to make improvements :-/
21:00:06 <Rockyg> Maybe we should identify the log messages that are good. It might be a shorter list :P
21:00:18 <jokke_> Well I learned hell of a lot while refactoring the Glance logs :) Have to understand what is going on to be able to do meaningful logging
21:00:32 <bknudson> jokke_: exactly.
21:01:29 <Rockyg> I think that might be why the one dev wanted to start with APIs. Since they're restful, there's less to tweak. Theoretically.
21:01:34 <bknudson> agree with the priorities mentioned here.
21:01:35 <Rockyg> Not actually.
21:02:03 <jokke_> do we have any other burning priorities? Are we agreeing on these? (We're running out of time)
21:02:14 <dhellmann> we're actually over time by a couple of minutes
21:02:33 <jokke_> is there Q behind the door already?
21:02:40 <jokke_> :)
21:02:46 <dhellmann> jokke_: maybe you can start a ML thread about priorities?
21:02:54 <bknudson> I didn't even know there was a meeting-4
21:02:57 <jokke_> dhellmann: will do
21:03:04 <Rockyg> #action Priorities: 1. Education around Logging Guidelines 2. Spec out and feedback for error code spec and referrer id 3. Get project liaisons for Log Working Group
21:03:28 <jokke_> #action jokke will bring up ML thread around priorities agreed
21:03:39 <Rockyg> Anything else?
21:03:50 <Rockyg> Good meeting, guys!
21:03:55 <jokke_> Thanks all!
21:04:01 <nkrinner> thanks everybody
21:04:10 <dhellmann> thanks!
21:04:15 <Rockyg> #endmeeting
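
Note on request ID correlation: the meeting returned several times to correlating one request across services through a shared req-* identifier. The sketch below is only an illustration of that idea using Python's standard logging module. It is not oslo.log and not the mechanism proposed in review 156508. The logger names, the log format, and the helpers new_request_id, make_service_logger, bind_request and handle_request are all hypothetical.

```python
import io
import logging
import uuid

# Hypothetical format: the request ID sits in a fixed position so that
# log tooling can match lines from different services.
LOG_FORMAT = "%(levelname)s %(name)s [%(request_id)s] %(message)s"


def new_request_id():
    # Assumption: a "req-" prefix plus a UUID, echoing the req-* IDs
    # mentioned in the meeting.
    return "req-" + str(uuid.uuid4())


def make_service_logger(name, stream):
    # One logger per (pretend) service, all writing to the same stream.
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(LOG_FORMAT))
    logger.handlers = [handler]
    return logger


def bind_request(logger, request_id):
    # LoggerAdapter adds request_id to every record it emits.
    return logging.LoggerAdapter(logger, {"request_id": request_id})


def handle_request(stream):
    # The same ID is bound in both "services", so their lines correlate.
    request_id = new_request_id()
    api_log = bind_request(make_service_logger("demo.api", stream), request_id)
    backend_log = bind_request(
        make_service_logger("demo.backend", stream), request_id)
    api_log.info("received create request")
    backend_log.error("backend rejected the request")
    return request_id


if __name__ == "__main__":
    buffer = io.StringIO()
    rid = handle_request(buffer)
    print(buffer.getvalue(), end="")
    print("search the logs for:", rid)
```

Searching for the returned ID finds both the INFO line from the API side and the ERROR line from the backend side, which is the kind of correlation the group asked for. Returning the same ID to the client would cover the "displaying request ID in CLI" part.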