15:00:26 <portdirect> #startmeeting openstack-helm
15:00:27 <openstack> Meeting started Tue Sep 10 15:00:26 2019 UTC and is due to finish in 60 minutes.  The chair is portdirect. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:28 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:30 <openstack> The meeting name has been set to 'openstack_helm'
15:00:36 <portdirect> let's give it a few mins for people to arrive
15:00:37 <roman_g> o/
15:00:48 <lamt> \o
15:00:53 <portdirect> the agenda is here: https://etherpad.openstack.org/p/openstack-helm-meeting-2019-09-10
15:01:01 <portdirect> please add away :)
15:01:01 <stevthedev> hello everyone
15:01:11 <megheisler> o/
15:01:29 <mattmceuen> o/
15:02:40 <mbuil> \o
15:04:20 <srwilkers> o/
15:04:35 <rihabb> o/
15:05:32 <portdirect> ok looks like it will be a fairly quiet meeting today
15:05:37 <portdirect> #topic Monitoring/Alerting stack in the gates
15:05:46 <cheng1> o/
15:05:57 <gagehugo> o/
15:06:10 <portdirect> as some users of osh are moving forward with serious deployments
15:06:31 <portdirect> i'm wondering if we could enhance the work we do in the gates
15:06:53 <portdirect> to make more active use of our Logging, monitoring and alerting stack
15:07:38 <portdirect> it would be great (i think) to start querying our monitoring services for the state of things we are attempting to validate
15:07:55 <srwilkers> actually, yes
15:08:10 <srwilkers> i've got a few WIP patches in flight that do this sort of thing
15:08:21 <portdirect> which would help close (some of) the gap we have in that our gate just does point in time checks on pod and service state
15:08:31 <portdirect> srwilkers: awesome!
15:08:36 <portdirect> can you point to them?
15:08:48 <srwilkers> yeah, sec
15:09:07 <stevthedev> I like the thought
15:09:33 <srwilkers> this change added a chaoskube experimental check then queried prometheus for firing alerts: https://review.opendev.org/#/c/630299/28/tools/deployment/common/check-prom-alerts.sh
15:10:15 <srwilkers> this change queried elasticsearch for pod logs using a bash utility i wrote a while ago: https://review.opendev.org/#/c/624435/14
15:10:42 <srwilkers> granted, these are a bit stale.  however, they'd still serve as a decent reference for what we could do in our jobs for validating operation with these tools
15:11:05 <portdirect> i think as a 1st step it would be great to get queries to nagios?
15:11:18 <srwilkers> nagios doesn't have an API, so that's out
15:11:19 <portdirect> as this is the 'front door' we promote to operations etc
15:11:30 <portdirect> what about selenium?
15:11:37 <srwilkers> best we can do is take snapshots with something like selenium
15:11:39 <srwilkers> which we already do
15:11:49 <srwilkers> we just don't do that in every job we run
15:12:16 <portdirect> rather than just take snapshots we should be able to query for element state - eg red/green
15:12:50 <evrardjp> o/
15:13:42 <portdirect> also does our nagios not support ncpa? https://www.nagios.org/ncpa/help/2.0/api.html
15:14:54 <srwilkers> we just use nagios core at the moment - we can see if we can include NCPA, but we don't at the moment
15:16:23 <portdirect> ok, i think we will need something in this space
15:17:03 <srwilkers> the next question is - do we want this as part of every job we run?
15:17:19 <portdirect> if we don't have the ability to simply query nagios via an api, then old skool selenium scraping looks to be our only option?
15:17:27 <portdirect> srwilkers: at the least a periodic job
15:17:42 <srwilkers> i was thinking the periodic multinode jobs would be good candidates, beyond what we do already
15:17:48 <portdirect> yup
15:22:36 <portdirect> srwilkers: lets have a look at our options this week, and come back next week with what we come up with?
15:22:45 <srwilkers> portdirect: works for me
15:23:11 <portdirect> ok - thats all i have for topics today
15:23:46 <portdirect> anything else we should be discussing/thinking about this week, before we move onto the plea for reviews?
15:25:48 <portdirect> ok - lets move on
15:25:53 <portdirect> #topic reviews
15:26:12 <portdirect> https://www.irccloud.com/pastebin/jFzEmB8L/
15:27:27 <rihabb> Could you guys please give this patch (https://review.opendev.org/#/c/643284/) a final review? We have tried to incorporate all the comments that were raised
15:27:39 <rihabb> :)
15:28:35 <cheng1> rihabb, I was also about to mention it, it really needs core reviewer's reviews
15:29:35 <portdirect> will do rihabb/cheng1
15:29:51 <rihabb> Thanks :)
15:29:56 <portdirect> if all the comments have been addressed, then i think we can finally put this one home
15:30:02 <cheng1> portdirect, thanks
15:30:48 <portdirect> ok - lets give everyone 30 mins back
15:31:03 <portdirect> #endmeeting