15:00:49 <srwilkers> #startmeeting openstack-helm
15:00:50 <openstack> Meeting started Tue Nov 14 15:00:49 2017 UTC and is due to finish in 60 minutes. The chair is srwilkers. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:51 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:53 <openstack> The meeting name has been set to 'openstack_helm'
15:01:29 <srwilkers> #topic rollcall
15:01:33 <korzen_> hello
15:01:37 <srwilkers> o/
15:01:39 <jayahn> o/
15:02:01 <mateuszb> o/
15:02:23 <srwilkers> here's the agenda: https://etherpad.openstack.org/p/openstack-helm-meeting-2017-11-14
15:02:33 <srwilkers> we'll give it a few minutes to see if anyone else comes along or wants to add to it
15:03:32 <v1k0d3n> hey all! o/\o
15:03:41 <srwilkers> hey v1k0d3n \o/
15:03:43 <v1k0d3n> o/*\o
15:04:56 <srwilkers> alright, seems we've got a good list to start with
15:05:11 <srwilkers> #topic: graph drawing in documentation
15:05:14 <srwilkers> all you jayahn
15:05:35 <jayahn> I just wanted to have a graph drawing capability in doc. :)
15:05:51 <jayahn> i did my best to copy the example configuration.
15:06:04 <srwilkers> i agree -- pictures are a great way to share things
15:06:46 <jayahn> not sure if i did all the necessary stuff to setup "sphinxcontrib-blockdig". if anyone can give more feedback. please. :)
15:06:50 <srwilkers> admittedly i dont know enough about getting it enabled to tell if that's the right way or not
15:07:04 <srwilkers> lamt is really good at that sort of stuff
15:07:19 <lamt> o/
15:07:28 <jayahn> ah. lamt is here. :)
15:07:42 <jayahn> https://review.openstack.org/#/c/519653/
15:08:43 <jayahn> pls leave any feedback on ps. :)
15:08:46 <srwilkers> ill poke lamt and make sure he gives some feedback there
15:08:52 <srwilkers> anything else on this jayahn?
15:08:57 <jayahn> nope
15:09:05 <srwilkers> sweet
15:09:09 <srwilkers> #topic fluent-logging
15:09:17 <srwilkers> i like this one -- take it away jayahn
15:09:40 <jayahn> we are almost done with putting flunt-based logging.
15:10:00 <jayahn> this is be the first one, it will be followed by next steps.
15:10:16 <jayahn> however, is openstack-helm-infra gate is working properly?
15:10:45 <srwilkers> yeah, it's working -- i can work with you and provide feedback to do what we need to run it properly
15:11:02 <jayahn> okay. that would be great.
15:11:17 <srwilkers> once things are tidied up, ill make sure everythings documented appropriately so adding new services is easy
15:11:29 <jayahn> can you do through review? or will it be better to setup a serate time?
15:11:50 <srwilkers> id be happy to do it through review
15:11:52 <srwilkers> if that's okay
15:11:55 <jayahn> okay. great!
15:12:15 <srwilkers> also, wanted to get everyones and yours thoughts on something with the fluent-logging stuff
15:12:42 <jayahn> fyi, we have checked new version of fluent-bit support kubernetes plugin and some experimental kafka output. that will be our very next thing to do after this is merged.
15:13:10 <jayahn> okay. srwilkers shoot it. :)
15:13:40 <srwilkers> would it be worth considering handling the parsers and fluentbit configurations via values.yaml? reason i ask is because while it works out of the box, its very opinionated in that it's expecting you'll only ever use the json logging driver for docker
15:14:02 <srwilkers> ie, it only gets log events via tailing /var/log/whatever and /var/lib/docker/containers/whatever
15:14:44 <srwilkers> it's not something i want us to spin the wheels on with the current patchset, because some of your work and mateuszb's work really depends on the fluent-logging stuff being finished
15:14:53 <srwilkers> but might be worth considering as an enhancement down the road
15:15:25 <jayahn> okay. we will surely consider your idea into our enhancement task.
15:15:33 <jayahn> i will talk to sungil about this.
15:15:47 <srwilkers> awesome :) i can draw up some pictures and throw some roughed up ideas your way too if that helps
15:16:04 <seungkyua> right all docker container logs are json type. We need to change that.
15:16:30 <seungkyua> and kubernetes logs in /var/log/xxx
15:16:44 <srwilkers> but really jayahn -- it's great work. :)
15:16:48 <jayahn> as you said, the current ps will be just the first of all the waves coming after. let's make this base thing work, then continuosly enhance it
15:16:54 <srwilkers> i agree
15:17:16 <srwilkers> thats all i had
15:17:36 <jayahn> seungkyua is our senior developer. he agrees with you, srwilkers.
15:17:39 <jayahn> :)
15:17:50 <srwilkers> nice -- pleasure to meet you seungkyua o/
15:18:07 <seungkyua> nice to meet you.
15:18:22 <jayahn> if he says yes.. it means yes for us :)
15:18:23 <srwilkers> #topic default alert list spec
15:18:26 <srwilkers> awesome :)
15:18:31 <seungkyua> this is the first time online chat. :)
15:18:36 <jayahn> ah. this is very very early draft.
15:19:00 <jayahn> just want to get everyone's opinion on "what is the best way to write it".
15:19:41 <mateuszb> I'll take a look at it tomorrow jayahn :)
15:20:02 <jayahn> I think we first define "alert/alarm definition" things like we would like to alert on cpu idle, cpu percent, etc.
15:20:16 <jayahn> but not defining actual trigger threshold in this spec.
15:20:17 <srwilkers> yeah, i was going to say your input would be awesome mateuszb since this touches what you've been working against
15:20:29 <v1k0d3n> jayahn: wrt to the fluent work, i can take a look at it too...we've been doing a lot of this recently too for some internal demos. would be nice to get this into upstream.
15:20:32 <srwilkers> jayahn: yeah i agree
15:20:46 <srwilkers> v1k0d3n: thatd be awesome :)
15:20:50 <jayahn> v1k0d3n, awesome :)
15:21:24 <jayahn> thanks mateuszb. i will add more alert definition in this week. your feedback would be really helpful
15:22:55 <srwilkers> anything else on this one?
15:22:59 <jayahn> i think korzen_ only has the firs half of meeting time. let's turn it to him now. :)
15:23:09 <srwilkers> sounds good
15:23:17 <korzen_> yes
15:23:26 <srwilkers> #topic multi namespace support for entrypoint
15:23:44 <korzen_> so I would like to highlight that multiple namespace support is done in the PS
15:23:52 <srwilkers> i just workflowed it :)
15:23:55 <korzen_> #link https://review.openstack.org/#/c/510810 Support services in different namespaces
15:24:09 <korzen_> #link https://review.openstack.org/#/c/511515/ Add jobs and daemonsets namespace support
15:24:27 <korzen_> after it being merged, the full solution is enabled
15:24:57 <korzen_> so we can add cross namespace dependencies for services via enpoints, and for jobs and daemonsets in dependencies section
15:25:19 <korzen_> I am testing it in use-case where every service have its own infra
15:25:31 <v1k0d3n> nice work over there korzen_ ! :) great to see this added.
15:25:36 <srwilkers> nice korzen_ :)
15:25:49 <korzen_> like keystone namespace would have its own mariadb, rabbimq ingress etc
15:25:56 <korzen_> ceph would be common
15:26:08 <v1k0d3n> yeah that's awesome. always been the goal...
15:26:08 <korzen_> thx ;)
15:26:20 <v1k0d3n> you guys made it reality. :)
15:26:55 <korzen_> glad to see that it is appreciated ;)
15:27:39 <korzen_> I guess that multiple namespace it is all
15:27:45 <korzen_> for RBAC
15:27:45 <srwilkers> #topic RBAC support
15:28:01 <korzen_> #link https://review.openstack.org/#/c/464630 RBAC authorization support
15:28:22 <korzen_> this one i huge but it contains all RBAC rules that are needed to be run
15:28:39 <korzen_> I wanted to get portdirect review on that one
15:29:02 <korzen_> all necessary details are included in agenda
15:29:49 <korzen_> I would also test in for multiple namespace use-case in following days
15:30:20 <korzen_> but example with ceph and ceph-config made this PS ready for multiple namespace
15:30:25 <jayahn> korzen_ we will try to review this RBAC one as well
15:30:36 <srwilkers> ill get portdirect to look at it today and provide his feedback
15:31:10 <korzen_> ok, I need to run
15:31:15 <korzen_> it is all from my side
15:31:16 <srwilkers> later korzen_ :)
15:31:26 <srwilkers> #topic log based alerting approaches
15:31:27 <jayahn> thanks korzen_
15:31:27 <korzen_> bye
15:31:29 <srwilkers> take it away mateuszb
15:31:41 <mateuszb> I've got a couple of patchsets in review regarding log-based alarms
15:31:55 <mateuszb> I've grouped them into 2 categories depending on the approach:
15:32:02 <mateuszb> 1. Based on ElastAlert:
15:32:07 <mateuszb> ElastAlert chart: https://review.openstack.org/#/c/516629/
15:32:11 <mateuszb> Nagios: passive check for DB errors: https://review.openstack.org/#/c/518543/
15:32:14 <mateuszb> Pushing notifications from ElastAlert to Nagios: https://review.openstack.org/#/c/518711/
15:32:34 <mateuszb> and 2. Based on fluent-plugin-prometheus:
15:32:43 <mateuszb> Gathering DB errors count using fluent-plugin-prometheus: https://review.openstack.org/#/c/514938/
15:32:51 <mateuszb> Example log-based alert in Prometheus: https://review.openstack.org/#/c/515061/
15:32:54 <mateuszb> Nagios: Prometheus check for DB errors in logs: https://review.openstack.org/#/c/519318/
15:33:28 <mateuszb> So I've verified that both of the solutions are ready to be integrated with Nagios (I wasn't so sure about ElastAlert+Nagios, but it works well)
15:33:29 <jayahn> these are really beautifully categorized examples. :)
15:34:00 <mateuszb> I'm leaving it as it is until the decision is made which of the two approaches we choose (I'd vote for ElastAlert as it's precisely designed for log-based alerting - with a lot of configuration capabilities in place)
15:34:08 <mateuszb> So any comments and votes are welcome :)
15:36:14 <mateuszb> that's all from me
15:37:02 <jayahn> ElastAlert seems great tool to use. +1 on that.
15:37:02 <jayahn> however, since we are probably use prometheus alert manager for metric-based alert, it would be good to use single solution for all the alert. so +1 on fluent-plugin-prometheus. :)
15:37:26 <jayahn> i will do some discussion with my team members, and will leave our feedback.
15:37:34 <mateuszb> You're not helping ;)
15:37:40 <mateuszb> Ok that would be great
15:37:42 <jayahn> yeap. i know. :(
15:37:45 <srwilkers> yeah, im a bit torn on this. i feel like ive introduced some confusion with nagios, as it was meant to be pitched as a deadmans switch for things like backing prometheus with ceph
15:38:12 <srwilkers> but elastalert is able to fire off alerts independently right? it doesnt need nagios or alertmanager?
15:38:33 <mateuszb> No, it doesn't need nagios and alertmanager
15:38:58 <srwilkers> okay, that makes me feel better.
15:39:22 <mateuszb> It fires off alerts independently - but there is a possibility to execute a script, which in turns executes the passive chech to Nagios
15:39:30 <mateuszb> check *
15:39:35 <srwilkers> cool :)
15:40:10 <jayahn> I would like to compare alert template on both solution, i mean how flexible it is to set some alert patterns.
15:40:30 <srwilkers> yeah, that's something to consider for sure
15:40:39 <srwilkers> but great work all around on this stuff mateuszb
15:41:07 <mateuszb> Well, I may prepare a list of what's needed to add additional alert in both cases
15:41:57 <mateuszb> to make things faster, I'll write it on slack tomorrow
15:42:01 <srwilkers> sounds good :)
15:42:08 <mateuszb> in order not to wait until the next meeting :)
15:42:18 <jayahn> great!
15:43:53 <srwilkers> anything else?
15:44:07 <mateuszb> no, that's all. Thanks
15:44:15 <srwilkers> #topic reviews needed
15:45:07 <srwilkers> Cell service: https://review.openstack.org/#/c/516810/
15:45:10 <jayahn> cell service and nova placement is two essential stuff to do ocata. these are almost ready. pls do final review on this. :)
15:45:27 <srwilkers> :)
15:46:06 <jayahn> FYI, as portdirect's request, we will do separate upstream "value override to make ocata work". probably make a new mvp values.
15:46:51 <jayahn> Neutron: Correct section name for linuxbridge bridge_mappings config: https://review.openstack.org/#/c/518503/
15:46:52 <srwilkers> jayahn: that'd be awesome.
15:47:13 <jayahn> we have been testing vlan-based provider-network w/ linuxbridge
15:47:21 <jayahn> to support some of legacy openstack env.
15:47:36 <jayahn> this is one of few thing we are fixing while doing that.
15:47:56 <jayahn> i think it is rather straight-forward. pls review. :)
15:48:04 <srwilkers> just workflowed it
15:48:15 <jayahn> thanks
15:49:03 <jayahn> that is all
15:49:40 <srwilkers> awesome :)
15:49:58 <srwilkers> any other last minute items?
15:50:22 <srwilkers> otherwise we can take the open discussion to the openstack-helm channel -- im getting rushed out of my conference room :)
15:50:37 <jayahn> bye
15:50:49 <srwilkers> :)
15:50:52 <mateuszb> bye
15:50:53 <srwilkers> thanks for coming everyone
15:50:57 <srwilkers> #endmeeting