15:06:29 <anteaya> #startmeeting third-party
15:06:29 <openstack> Meeting started Mon Jul 11 15:06:29 2016 UTC and is due to finish in 60 minutes. The chair is anteaya. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:06:31 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:06:33 <openstack> The meeting name has been set to 'third_party'
15:06:36 <anteaya> thanks lennyb
15:06:46 <anteaya> I was deep in figuring out storyboard apis
15:06:48 <anteaya> :)
15:07:16 <mmedvede> hi anteaya
15:07:23 <anteaya> how are you today, mmedvede?
15:07:30 <mmedvede> all good, thanks
15:07:36 <anteaya> oh I'm so glad
15:07:39 <anteaya> nice day here
15:07:53 <anteaya> had a deer on my lawn last night for the longest time
15:08:18 <anteaya> does anyone have anything they would like to discuss today?
15:08:19 <mmedvede> lots of deer and rabbit around where I am
15:08:24 <mmedvede> :)
15:08:26 <anteaya> mmedvede: nice
15:08:27 <anteaya> :)
15:08:31 <anteaya> I love watching them
15:08:52 <wznoinsk> hi there
15:09:00 <anteaya> hey wznoinsk
15:09:12 <anteaya> #info http://lists.openstack.org/pipermail/openstack-dev/2016-July/098992.html pin nodepool
15:09:19 <wznoinsk> hi anteaya
15:09:30 <anteaya> so have you all read asselin__'s post to dev about pinning nodepool?
15:09:37 <anteaya> wznoinsk: nice to see you
15:10:02 <asselin__> hi. i'm here but double-booked.
15:10:14 <wznoinsk> anteaya, good to be here ;-) even tho still on vacation
15:10:26 <anteaya> asselin__: thanks for the post to dev
15:10:51 <anteaya> wznoinsk: oh my, well glad you are here but I hope you enjoy vacation
15:11:05 <anteaya> does anyone have anything they would like to discuss today?
15:11:20 <mmedvede> I have a question regarding OpenStack infra monitoring
15:11:27 <anteaya> mmedvede: go ahead
15:11:30 <mmedvede> (trying to set something up myself)
15:11:46 <mmedvede> does the team use any automated notification system?
15:11:50 <anteaya> no
15:11:52 <mmedvede> or looked into having one?
15:12:03 <anteaya> humans are much faster than any automated notification system
15:12:04 <wznoinsk> anteaya, don't mind this little break to get back and set my mind into a technical mode
15:12:21 <anteaya> we have purposely not wanted any automatic notification system
15:12:32 <anteaya> none of the infra team has a pager, nor do we want one
15:12:35 <lennyb> mmedvede: what are you looking for?
15:12:40 <anteaya> wznoinsk: fair enough
15:13:03 <anteaya> some infra team members purposely choose to work in this environment as a way of leaving a pager behind
15:13:08 <wznoinsk> anteaya, this may be due to the amount of notifications teams may get and get around to do the 'really important' ones, isn't it?
15:13:30 <mmedvede> I ideally am looking for alerts being sent to irc, triggered by anomaly detection in metrics
15:13:45 <anteaya> wznoinsk: well there are a couple of reasons, one is a lifestyle choice, when our infra folks are online they are responding to things in channel as fast as possible
15:13:55 <anteaya> when they are offline, they really need to be offline
15:13:56 <mmedvede> anteaya: agree on no pager. I am talking about irc alerts
15:14:00 <wznoinsk> mmedvede, what monitoring system do you have as the source of these alerts?
15:14:06 <anteaya> mmedvede: what kind of alerts?
15:14:38 <mmedvede> wznoinsk: none yet. was considering graphite-beacon initially (simple thresholds trigger a script that sends irc message)
15:15:06 <mmedvede> anteaya: any sorts of alerts, e.g. zuul lost connection to OpenStack gerrit
15:15:45 <mmedvede> anteaya: but my question is mostly to find out if infra considered any tooling (as generally you chose good tools :) )
15:15:54 <anteaya> zuul would reconnect then would it not, if it lost a connection to gerrit?
15:16:11 <anteaya> ah, well our irc bots are in need of an overhaul
15:16:22 <anteaya> but no one has had the time & interest to do it
15:16:48 <anteaya> as our irc bots are fraught with issues I think due to threading which requires us to pin a bot to a server
15:16:58 <anteaya> then if the server goes down we lose the bot
15:17:36 <anteaya> so in terms of irc messaging tool I believe that infra does not feel we are using the latest bright and shiny
15:17:52 <wznoinsk> mmedvede, can an event be used (like running an arbitrary command) when alert occurs in graphite?
15:18:45 <mmedvede> wznoinsk: yes, but you need external tool (e.g. graphite-beacon)
15:19:15 <wznoinsk> on the other hand, the more puzzle pieces in the 'notification' system, the more prone it is to not working in some cases
15:19:51 <wznoinsk> my setup was nagios + nagstamon (windows app that sits on your desktop/systray and makes flashing/noise when nagios sees something bad)
15:19:52 <mmedvede> anteaya: ok, that is more or less what I thought
15:20:27 <anteaya> mmedvede: it is an interesting discussion, I'm curious as to your motivation for it, what are you looking to fix or address?
15:20:43 <wznoinsk> I'd recommend something as simple as that for notification + externally hosted script to check the monitoring system (nagios, graphite) itself ;-)
15:20:48 <mmedvede> wznoinsk: I found nagios so far hard to manage
15:21:26 <wznoinsk> mmedvede, I'm not promoting nagios in any form (it's just what I was used to and didn't want to learn new monitoring tool back then)
15:21:51 <mmedvede> anteaya: main motivation is to react quickly to things going wrong.
we do not have downstream users to complain, so sometimes it takes a while to catch things
15:22:03 <wznoinsk> bad wording, again: I like nagios, but the above was just an example of notification simplicity (to avoid problems with the notification system itself)
15:22:34 <anteaya> mmedvede: ah for your personal use, yes that makes sense
15:23:07 <anteaya> mmedvede: um, how many of your tools send you email alerts on failures?
15:23:11 <mmedvede> wznoinsk: I understand what you are saying :)
15:24:01 <wznoinsk> mmedvede, good ;-)
15:24:06 <mmedvede> anteaya: I did not configure emails on failure. I think after a while you would start ignoring them, as noise
15:24:35 <anteaya> well we are talking about a system for you to know when your system is failing, are we not?
15:25:02 <anteaya> if the tools have the ability to email you, have you tried to figure out how to get that feature to work for you?
15:25:02 <lennyb> mmedvede: we send emails on 5 failures
15:25:26 <mmedvede> anteaya: I thought you meant CI failures
15:25:35 <mmedvede> like jenkins test failed
15:25:45 <anteaya> I don't want your emails
15:26:00 <anteaya> but I thought we were talking about solving a problem you have
15:26:07 <anteaya> so get your tools to email you
15:26:21 <anteaya> and you can configure it as you see fit, as lennyb suggests
15:26:32 <mmedvede> lennyb: that works to a degree, but you might wait 5 hours before you get email
15:27:16 <mmedvede> anteaya: getting a tool to email is not a problem. Main puzzle piece is what to use to decide when to send an alert
15:27:26 <mmedvede> there are a lot of options
15:28:12 <lennyb> mmedvede: correct. my assumption is that a single failure is a developer responsibility, if there are a number of failures, that probably means that the problem is my CI.
I still have in my todo list a script to compare my CI failure to the others
15:28:18 <anteaya> mmedvede: ah, yes I do agree
15:29:51 <watanabe_isao> anteaya, do we have a way to know if the zuul of infra CI is down ASAP? I'm thinking maybe that is mmedvede's question?
15:30:08 <anteaya> mmedvede: is that your question?
15:30:27 <wznoinsk> mmedvede, does graphite have a configuration on how many attempts a check has to fail before it's marked as a WARNING/CRITICAL?
15:31:06 <mmedvede> watanabe_isao: I brought up our CI's zuul as example
15:31:30 <mmedvede> there are many more things we need to monitor, zuul was one of them that misbehaves frequently
15:31:58 <mmedvede> wznoinsk: graphite is not monitoring tool, it is aggregation. So it does not have alerts
15:32:24 <mmedvede> wznoinsk: so someone wrote graphite-beacon to monitor graphite metrics
15:32:29 <watanabe_isao> mmedvede, I see. Well in my third party CI zuul hung before due to some issue, and I need to check it every day now, which is a nightmare.
15:32:30 <mmedvede> (there are many others)
15:32:37 <wznoinsk> mmedvede, is that the graphite you're talking about? http://graphiteapp.org/ ?
15:32:52 <anteaya> mmedvede: well it sounds like there is no existing thing that does what you are looking for, my suggestion would be to put something in an etherpad that specifies _exactly_ what you want, since we seem to be getting lost guessing due to generalities
15:33:21 <mmedvede> wznoinsk: in the context of OpenStack infra - http://graphite.openstack.org/
15:33:31 <anteaya> then once you get a few people to read the etherpad who can then repeat back what you say you need such that they understand what you want, post to the infra list
15:33:45 <anteaya> since I will be honest, currently I don't know what it is you want
15:34:25 <mmedvede> anteaya: it is ok, I know what I want to try already.
And this discussion confirmed I did not miss some obscure super-cool tool everyone is using
15:34:48 <anteaya> ah that was the point of this conversation
15:34:50 <anteaya> okay great
15:34:55 <anteaya> glad you got what you needed
15:35:11 <anteaya> and yeah, I don't think you are missing out on any of the latest hotness
15:35:35 <anteaya> does anyone have anything more for this discussion?
15:35:41 <watanabe_isao> anteaya, are we only talking about 3rd party tool here? May I ask something about devstack-gate?
15:35:55 <anteaya> watanabe_isao: you can ask
15:36:00 <anteaya> this is the third-party meeting
15:36:21 <mmedvede> we use devstack-gate (some of us)
15:36:23 <anteaya> so anything you ask will be viewed in the context of third party operators and their activities
15:36:27 <wznoinsk> mmedvede, ok - I was only reading the 'about' section of graphite, sometimes it's hard to tailor a tool for data collecting/metrics for the monitoring/notifications purposes... would graphite be your only source of data you want to alert/notify on? or would you want to monitor output of different kinds (i.e.: particular processes on a machine, run a completely custom check etc.) ?
15:36:53 <watanabe_isao> Has anyone considered a mid_test_hook?
15:37:07 <mmedvede> wznoinsk: right now it seems graphite (statsd metrics) is a good way to aggregate everything
15:37:20 <anteaya> what do you want a test hook to do in the middle of a test?
15:37:22 <watanabe_isao> Which is used to execute some commands before tempest
15:37:46 <mmedvede> watanabe_isao: do you mean after devstack, but before tempest?
15:38:02 <watanabe_isao> anteaya, to set up the environment, like add a node to ironic.
15:38:09 <watanabe_isao> mmedvede, yes
15:38:11 <wznoinsk> watanabe_isao, if you're thinking what I think you're thinking about you probably want to use local.sh that devstack itself runs at the very end
15:38:34 <mmedvede> watanabe_isao: we actually have ironic job that does something like that
15:38:40 <watanabe_isao> wznoinsk, I know it also can e.g. add the node to ironic
15:39:04 <mmedvede> we use pre_test_hook to create config with baremetal node information
15:40:00 <watanabe_isao> wznoinsk, for example you want to add a node to ironic as late as you can. But devstack install takes too long.
15:41:03 <wznoinsk> mmedvede, when I think about it... I agree, I could probably tailor nearly all of my custom scripts to output some form of metric to graphite... monitoring tools usually have graphing tools built-in tho... my main aim is to monitor and alert/notify, and only 2nd to see historical graphs, hence I use nagios
15:42:02 <wznoinsk> watanabe_isao, sorry, I can't help you here, don't use Ironic here yet
15:42:41 <mmedvede> watanabe_isao: we considered adding node later, but for POC job ended up de-facto adding it before devstack
15:42:47 <wznoinsk> but shouldn't a stacked node, given OS_URL and other links to the controller/keystone, register itself up (note: lack of Ironic knowledge here)
15:42:49 <watanabe_isao> wznoinsk, it's ok. well in my use case, I also want to run some local scripts before tempest
15:42:57 <mmedvede> watanabe_isao: did you consider a devstack plugin?
15:43:27 <watanabe_isao> mmedvede, no just some local scripts.
15:44:10 <watanabe_isao> mmedvede, but it is a good point I think.
15:45:43 <anteaya> anything more on this topic or the monitoring one?
15:45:52 <anteaya> we seemed to be doing both at the same time
15:46:21 <anteaya> does anyone have anything else they would like to discuss today?
15:46:41 <wznoinsk> anteaya, we've agreed with moshele from Mellanox to submit a barcelona talk abstract about SRIOV/NFV CI setups we have...
we both use openstackci toolset... I'd like to bring it up with os infra guys, is tomorrow's 3rdparty WG meeting a good one for this?
15:46:42 <watanabe_isao> anteaya, one more on ci-watch, please.
15:47:09 <watanabe_isao> wznoinsk, you first.
15:47:32 <wznoinsk> watanabe_isao, go with yours, given the time
15:47:46 <watanabe_isao> wznoinsk, thanks
15:48:30 <watanabe_isao> anteaya, is any of us going to give ci-watch a filter?
15:48:44 <anteaya> what filter might they give?
15:49:37 <mjturek1> hey all mmedvede is on his way back. Lost power and is reconnecting
15:49:48 <anteaya> mjturek1: thank you
15:49:59 <mjturek1> np!
15:50:00 <watanabe_isao> anteaya, my CI in cinder is always at below, and I don't want to see some CI's result. I'm talking about a filter to stop showing some results.
15:50:23 <anteaya> ah filter out results you don't want
15:50:35 <watanabe_isao> anteaya, yes.
15:50:43 <anteaya> watanabe_isao: well the ci-watch code is using gerrit, so you could offer a patch
15:51:14 <anteaya> even if you aren't sure of the code, write a clear commit message saying what you want the patch to do, and hopefully kind reviewers can help you get the patch in shape
15:51:43 <watanabe_isao> anteaya, got it. Currently it is just an idea. will do it.
15:52:26 <anteaya> #link http://git.openstack.org/cgit/openstack/third-party-ci-tools/
15:52:33 <anteaya> I believe that is the repo
15:52:49 <anteaya> great, more on this or shall we move to wznoinsk's question?
15:53:06 <watanabe_isao> anteaya, thank you. yes. please go.
15:53:11 <anteaya> thanks
15:53:16 <anteaya> wznoinsk: you are up
15:54:18 <anteaya> wznoinsk: I believe your question was about making the infra team aware of a talk you have submitted?
15:55:40 <anteaya> well a few things, I personally don't endorse anyone else's talk proposal, lest I be inundated with requests
15:55:41 <wznoinsk> you, os infra, guys would know best what community is looking for about 3rdparty CI setups so I wanted to have a chat about our yet-general sriov/nfv CI talk in barcelona on tomorrow's 3rd party WG meeting
15:56:25 <anteaya> if you want to discuss the content of the proposal prior to proposing that is fine, you can ask questions in the infra channel
15:56:59 <anteaya> if you want the whole team to discuss something (both men and women, not just the guys) then you can add an agenda item to the infra team meeting: https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting#Agenda_for_next_meeting
15:57:15 <wznoinsk> ok, I'll share more on this soon then
15:57:26 <anteaya> I have a lot of meetings already and am hard pressed to attend more
15:57:48 <anteaya> I can't speak for other infra team members but if there is someone you would like to invite you are welcome to ask them
15:58:18 <anteaya> thanks
15:58:22 <anteaya> more on this topic?
15:58:39 <wznoinsk> nope, thanks
15:58:42 <anteaya> thank you
15:58:49 <anteaya> anyone with anything else today?
15:58:55 <anteaya> about 1 minutes remaining
15:59:01 <anteaya> 1 minute
15:59:07 <watanabe_isao> anteaya, sorry that I'm new to this meeting. Do we have another meeting, tomorrow?
15:59:20 <anteaya> watanabe_isao: thanks for attending, glad to have you
15:59:40 <anteaya> watanabe_isao: all openstack meetings are listed here: http://eavesdrop.openstack.org/
16:00:07 <anteaya> #link http://eavesdrop.openstack.org/#Third_Party_Meeting
16:00:18 <anteaya> #link http://eavesdrop.openstack.org/#Third_Party_Working_Group_Meeting
16:00:28 <anteaya> those would be the links you are looking for
16:00:31 <anteaya> and time to end
16:00:32 <watanabe_isao> anteaya, ohhh,
16:00:32 <watanabe_isao> Third Party Working Group Meeting
16:00:43 <anteaya> thank you everyone see you next week
16:00:46 <anteaya> #endmeeting
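
Note on the failure-count rule raised at 15:25:02 and 15:28:12 (lennyb: "we send emails on 5 failures"). The following is a minimal Python sketch of that idea, assuming the count means failures in a row, which the log does not state. The class name ConsecutiveFailureAlert and the notify callback are hypothetical names chosen for illustration; they are not part of graphite-beacon, nagios, or any other tool mentioned in the meeting.

```python
from typing import Callable, List


class ConsecutiveFailureAlert:
    """Call `notify` once when `threshold` failures occur in a row.

    Hypothetical helper for illustration only; `notify` is any callable
    that accepts a message string (for example an email or IRC sender).
    """

    def __init__(self, threshold: int, notify: Callable[[str], None]) -> None:
        if threshold < 1:
            raise ValueError("threshold must be at least 1")
        self.threshold = threshold
        self.notify = notify
        self.streak = 0  # current number of failures in a row

    def record(self, success: bool) -> None:
        """Record one CI run result and alert if the streak hits the threshold."""
        if success:
            # Any passing run resets the streak.
            self.streak = 0
            return
        self.streak += 1
        # Notify exactly once, at the moment the streak reaches the threshold,
        # so a long outage does not produce a message per failed run.
        if self.streak == self.threshold:
            self.notify(f"{self.streak} consecutive CI failures; check the CI itself")


if __name__ == "__main__":
    sent: List[str] = []
    alert = ConsecutiveFailureAlert(threshold=5, notify=sent.append)
    # Two failures, one pass (streak resets), then six failures in a row.
    for ok in [False, False, True, False, False, False, False, False, False]:
        alert.record(ok)
    print(sent)
```

A single failure never triggers the callback here, which matches lennyb's assumption that one failure is the developer's responsibility. mmedvede's caveat from 15:26:32 still applies: with infrequent runs, a threshold of 5 can mean a long wait before the first alert.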