Tuesday, 2018-08-21

*** yamahata has quit IRC01:22
*** yamahata has joined #openstack-meeting-501:33
*** ricolin has joined #openstack-meeting-501:33
*** lujinluo has joined #openstack-meeting-501:41
*** lujinluo has quit IRC01:43
*** lujinluo has joined #openstack-meeting-501:59
*** yamahata has quit IRC02:15
*** markvoelker has joined #openstack-meeting-503:01
*** lujinluo has quit IRC03:47
*** lujinluo has joined #openstack-meeting-503:51
*** lujinluo has quit IRC03:56
*** markvoelker has quit IRC04:32
*** markvoelker has joined #openstack-meeting-504:43
*** markvoelker has quit IRC04:52
*** skazi has joined #openstack-meeting-505:08
*** yamahata has joined #openstack-meeting-506:07
*** goutham1 has joined #openstack-meeting-513:58
*** sgrasley has joined #openstack-meeting-513:58
*** jamesgu has joined #openstack-meeting-514:13
*** tdoc has joined #openstack-meeting-514:45
*** goutham1 has quit IRC14:49
*** gagehugo has joined #openstack-meeting-514:51
*** hongbin has joined #openstack-meeting-514:56
*** mattmceuen has joined #openstack-meeting-514:57
*** yamahata has quit IRC14:57
mattmceuen#startmeeting openstack-helm14:59
openstackMeeting started Tue Aug 21 14:59:57 2018 UTC and is due to finish in 60 minutes.  The chair is mattmceuen. Information about MeetBot at http://wiki.debian.org/MeetBot.14:59
openstackUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.15:00
*** openstack changes topic to " (Meeting topic: openstack-helm)"15:00
openstackThe meeting name has been set to 'openstack_helm'15:00
mattmceuen#topic Rollcall15:00
*** openstack changes topic to "Rollcall (Meeting topic: openstack-helm)"15:00
srwilkerso/15:00
mattmceuenGM / GE / GD everyone!15:00
mattmceueno/ srwilkers15:00
tdochi15:00
mattmceuenhey tdoc15:00
evrardjpo/15:01
mattmceuenHere's our agenda: https://etherpad.openstack.org/p/openstack-helm-meeting-2018-08-2115:01
mattmceuenPlease go ahead and add anything you'd like to discuss today15:01
mattmceuenOtherwise I'll give one more min for folks to filter in15:01
*** jayahn has joined #openstack-meeting-515:02
mattmceueno/ jayahn!15:02
portdirecto/15:02
jayahno/15:03
mattmceuen#topic LMA News15:03
*** openstack changes topic to "LMA News (Meeting topic: openstack-helm)"15:03
jayahnmattmceuen!!15:03
mattmceuenGood to see you man :)15:03
mattmceuensrwilkers has been hard at work testing our LMA stack in various labs of various sizes and workloads15:03
jayahnyeah.. it was independence holiday + summer vacation last week.15:03
jayahnthat is good.15:04
mattmceuenThat sounds awesome jayahn - hope it was an awesome vacation15:04
mattmceuenwell earned15:04
jayahnwe are analyzing what each exporter gathers, which are the significant ones to watch and alarm on..15:04
portdirecthave you ever deployed the current lma stack at scale in a working osh cluster?15:05
mattmceuensrwilkers has been doing some of the same thing as part of his analysis15:05
srwilkersoh hello15:05
jayahnat scale.. how big?15:05
portdirectlets start small >10 nodes, with active workloads?15:06
jayahni think we did fairly good test on >10, for logging part15:06
portdirectdid you run into any issues? OOM's or similar?15:07
jayahnfor prometheus, we are behind the schedule.15:07
jayahnon elastic search side, i heard sungil experience lots of oom15:07
jayahnsungil had experienced..15:08
srwilkersare you running default values mostly, or have you started providing more fine-grained overrides for things like fluentbit and fluentd?  my biggest takeaway from the logging stack was that it's better to leverage fluentd to provide smarter filters than just jamming everything into elasticsearch15:08
srwilkersonce i started adding more granular filters and dumping specific entries, elasticsearch was much healthier in the long term15:08
jayahnnope, i think we override values on es, fluent-bit.15:08
jayahni will ask sungil tomorrow on this15:09
portdirectthat would be great jayahn15:09
jayahnhe has been struggling with logging for the last two months..15:09
srwilkersi've done quite a bit of work on this and exposed it as part of the work to introduce an ocata based armada gate15:09
srwilkershttps://review.openstack.org/#/c/591808/12/tools/deployment/armada/manifests/ocata/armada-lma.yaml15:09
jayahndid some test on federation as well.15:10
jayahnokay.15:10
srwilkersprometheus is a whole different beast though15:11
jayahnagreed15:11
jayahnit graduated at least. :)15:12
srwilkerslol15:12
mattmceuensrwilkers I believe you're moving more sane defaults into the charts so that operators can choose to let more logs through to elasticsearch if they need them, right?15:12
jayahnwe are urgently hiring a person to take on prometheus, really short handed right now. :)15:12
jayahnsrwilkers: you will be always welcome here :)15:13
mattmceuenhey hey save the poaching for after the team meeting15:13
srwilkersjayahn: lol15:13
portdirectjayahn: I'm his agent15:13
mattmceuenand I'm portdirect's agent15:13
portdirectand alanmeadows is yours?15:13
mattmceuenwe all get a cut15:14
jayahnhey.... pyramid organization...15:14
mattmceuen:D15:14
srwilkersbernie madoff would be proud15:14
srwilkersanyway15:14
evrardjplol15:14
mattmceuenAnything else to share on the LMA front folks?15:14
srwilkersyeah15:14
evrardjpFYI it hurts corporate diversity :p15:14
evrardjpI know,I have been there :p15:14
srwilkersim starting to work on pruning out the metrics we actually ingest into prometheus by default15:15
srwilkersas we were consuming a massive amount of metrics that we arent actually using with grafana/nagios/prometheus by default15:15
jayahnfrankly speaking, we are currently guaranteeing "short time usage" for elastic search, and asking all the operation team to help us to fine-tune this logging beast if they want to use this as more long term logging storage15:15
jayahn:)15:15
srwilkerscadvisor was the biggest culprit -- i've proposed dropping 41 metrics from cadvisor alone, and that reduced the total number of time series in a single node deployment from 18500ish to a little more than 300015:16
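The pruning srwilkers describes amounts to dropping whole metric families before they are stored. A minimal Python sketch of that idea follows. It is not the actual patch: the function names and drop patterns are hypothetical, and the sample metric names are stand-ins, since the 41 dropped cadvisor metrics are not listed in this log.

```python
import re


def prune_series(series_names, drop_patterns):
    """Return the series whose metric name matches none of the drop patterns."""
    compiled = [re.compile(pattern) for pattern in drop_patterns]
    return [
        name for name in series_names
        if not any(regex.fullmatch(name) for regex in compiled)
    ]


def reduction_percent(before, after):
    """Percentage of time series removed by pruning."""
    return round(100.0 * (before - after) / before, 1)


# Hypothetical sample; not the real drop list from the proposed change.
sample = [
    "container_cpu_usage_seconds_total",
    "container_memory_usage_bytes",
    "container_network_tcp_usage_total",
    "container_tasks_state",
]
drop = [r"container_network_tcp_usage_total", r"container_tasks_state"]

kept = prune_series(sample, drop)
# Approximate figures quoted in the meeting: ~18500 series before, ~3000 after.
print(kept, reduction_percent(18500, 3000))
```

With the approximate figures from the meeting, the reduction is in the region of 80 percent of all time series on a single node.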
jayahnsrwilkers: I think we can help on that15:16
mattmceuenwow15:16
srwilkersnode exporter is probably my next target15:17
srwilkersas there's some there we dont really need15:17
evrardjpsrwilkers: good to know15:17
jamesgusrwilkers: is there a doc for the metrics that we are currently collecting?15:18
portdirectsrwilkers: were you able to get all we lost from cadvisor out of k8s itself, or too early to say?15:18
srwilkersit's something that needs some attention though, because ive been seeing prometheus fall over dead in a ~10 node deployment with 16GB memory limits15:18
srwilkersand it was hitting that limit after about 2 days without significant workloads running on top15:18
srwilkersportdirect: too early to say15:18
portdirectjayahn: maybe your team could help there?15:19
srwilkersjamesgu: we currently gather everything available from every exporter we leverage.  i dont have a list handy yet, but can provide a quick list of exporters we have15:19
jayahnfor every exporter, we are doing like this. https://usercontent.irccloud-cdn.com/file/mtrps177/Calico%20Exporter.pdf15:19
evrardjpsrwilkers: if you could document it in, that would be nice :)15:19
*** goutham1 has joined #openstack-meeting-515:19
srwilkersevrardjp: yeah, it's about that time :)15:20
jayahnif we can setup wiki page we can all use, I will certainly upload information we summarized so far, and work together.15:20
jamesgusrwilkers: that would be very nice.15:20
jayahnwiki or anything to put this massive document, or information to share15:20
evrardjpsrwilkers: ping me for reviews when ready15:20
srwilkersevrardjp: nice, cheers15:20
evrardjpdon't need for the whole document, just saying how it works15:21
jamesguhave we run into disk issue too besides memory?15:21
srwilkersjamesgu: yep.  noticing 500gb PVCs filling up in ~7 days time on a similar sized deployment (~8-10 nodes)15:22
srwilkerswhich once again is due to the massive amount of time series we gather and persist15:22
evrardjpyeah pruning would be an important part :)15:22
portdirecthave any idea on the i/o reqs?15:22
portdirectas well as raw capacity15:22
srwilkersand things like cadvisor are especially bad, because there's ~50 metrics that get gathered per container, so if you think about how many containers would be deployed in a production-ish environment, that gets out of hand quickly15:23
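The numbers quoted here lend themselves to a rough back-of-the-envelope estimate. The Python sketch below only restates the arithmetic from the discussion (about 50 cadvisor metrics per container, about 500 GB filled in about 7 days). The function names, the container count, and the 2000 GB volume are hypothetical, and real growth depends on retention settings, labels, and scrape intervals that are not given in this log.

```python
def estimated_series(containers, metrics_per_container=50):
    """Lower-bound series count: one series per metric per container."""
    return containers * metrics_per_container


def daily_growth_gb(used_gb, days):
    """Average on-disk growth per day over the observed period."""
    return used_gb / days


def days_until_full(capacity_gb, used_gb, days):
    """Days a volume of capacity_gb lasts at the observed growth rate."""
    return capacity_gb / daily_growth_gb(used_gb, days)


# Hypothetical environment with 1000 containers.
print(estimated_series(1000))
# Figures from the meeting: ~500 GB consumed in ~7 days.
print(round(daily_growth_gb(500, 7), 1))
# Hypothetical 2000 GB volume at the same growth rate.
print(round(days_until_full(2000, 500, 7), 1))
```

At the observed rate the volume grows by roughly 71 GB per day, so capacity alone buys little time without pruning.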
srwilkersportdirect: not at the moment -- certainly something that would be nice to get multiple peoples' input on.  would be awesome if you could help evaluate that too jayahn15:23
jayahnI am missing the conversation flow..15:24
jayahnsorry, .. what awesome thing I can do?15:25
*** goutham1 has quit IRC15:25
jayahncould you kindly summarize?15:25
srwilkersjayahn: oh, sorry.  just getting a better idea of the io requirements and storage capacity requirements for prometheus in a medium/large-ish deployment15:25
jayahnah.. okay15:25
portdirectjayahn: in your env do you know how much pressure lma has been putting on the storage - both capacity wise, and IOPs/throughput?15:25
jayahnfor prometheus, we have a plan to test that on 20 nodes deployment from the next week15:26
jayahnwe are right now enabling every exporter.15:26
jayahnso, i guess we can share something next month.15:27
evrardjpjayahn: my point (sorry to have disrupted the flow) was that your research can be documented https://docs.openstack.org/openstack-helm/latest/devref/fluent-logging.html15:28
jayahnto get idea on capacity planning / requirement on prometheus.15:28
mattmceuenwould you plan to incorporate srwilkers' pruning work, or do you want all that data?15:28
evrardjpor elsewhere, as this is maybe not enough15:28
*** goutham1 has joined #openstack-meeting-515:29
jayahnevrardjp: document would be a good place once we finalize all the contents, but WIP information sharing might be better with more flexible tools, like wiki15:29
srwilkersevrardjp: that's largely my fault.  it's no secret that the biggest documentation gap we have is the LMA stack15:29
mattmceuenSorry guys, great discussion but we need to move on unfortunately15:29
srwilkersmattmceuen: agreed15:29
mattmceuenLet's touch point next week15:29
jayahnmattmceuen: I will review srwilkers' pruning work, and try to leverage that15:30
mattmceuenThanks jayahn, hopefully it's a quick & easy win for you to learn from our pain :)15:30
mattmceuenOk speaking of this and going slightly out of order as it's probably related to this topic15:30
mattmceuen#topic Korean Documentation15:31
*** openstack changes topic to "Korean Documentation (Meeting topic: openstack-helm)"15:31
portdirectoh15:31
portdirectso - we have some awesome work being done by korean speaking community members15:31
portdirectand they have some fantastic docs15:31
evrardjpthat's nice :)15:32
evrardjpis that linked to i18n team?15:32
portdirectnot yet!15:32
jayahnnot yet. :)15:32
evrardjpsorry, go ahead :)15:32
portdirectjayahn: can we work together to get korean docs up for osh15:32
portdirectso your team can start moving work upstream?15:32
jayahnthat would be no problem.15:32
jayahnso it would be korean docs? not need to translate to english?15:33
portdirectin what would be an awesome bit of reversal, i think the other english speakers would be happy to help translate them into english15:33
evrardjpjayahn: I guess you still need to have upstream english, but that can go through i18n process to publish a korean docs15:33
portdirectevrardjp: i think we need to work out how to handle this case prob a bit differently15:34
evrardjpif it's following the standard process :)15:34
evrardjpyeah I guess the first step would be to do the other way around?15:34
portdirectas there is more korean docs than english....15:34
portdirecti think so?15:34
jayahnokay. I will talk to ian.choi, previous i18n PTL15:34
evrardjpyeah, but I am not sure the tool is ready for that.15:34
portdirectjayahn: can you loop me in on that please15:34
evrardjpjayahn: that's great,I was planning to suggest that :)15:34
mattmceuenI think this is a great idea15:35
jayahnwe both (ian.choi and myself) will be at PTG, we can do f2f discussion on this topic as well15:35
evrardjpif you need help on the english side, shoot. I think good docs is a good factor for community size ramp up.15:35
portdirectcould not agree more evrardjp15:35
portdirectand seeing things like this: https://usercontent.irccloud-cdn.com/file/mtrps177/Calico%20Exporter.pdf15:35
portdirectmake me sad, as this is such a great resource to have15:36
evrardjpjayahn: let's plan that PTG part in a separate channel :)15:36
srwilkersjayahn: youre coming to denver?  time for more beer15:36
jayahnI told the foundation that I only have a budget to do single trip between PTG and Summit15:36
*** goutham1 has quit IRC15:36
portdirectso - mattmceuen can we get an action item to get this worked out at ptg15:36
evrardjpso I guess the question was: do we all agree to bring more docs from jayahn to upstream, and how we do things, right?15:36
jayahnthey kindly offered me a free hotel15:36
mattmceuenI will add it to the agenda15:36
portdirectevrardjp: 100%15:37
mattmceuenoh that's awesome jayahn.  #thanksOSF!!!15:37
jayahnokay. doing upstream in korean is really fantastic!15:37
evrardjpthat's cool indeed :)15:37
mattmceuenAlrighty - anything else before we move on?15:37
evrardjpshould we discuss more about the technicalities at the PTG now that ppl are in agreement we should bring your things in?15:37
evrardjpmattmceuen: I guess we agree there :)15:38
mattmceuenI think that'll be easier15:38
mattmceuenwe can move in that direction ahead of time15:38
mattmceuenbut lets plan on having things in good shape by the time PTG is over15:38
evrardjpI think Frank or Ian's input would be valuable in here.15:38
*** yamahata has joined #openstack-meeting-515:38
alanmeadowsjayahn / evrardjp: quick question, to be able to report things like calico being unable to peer to prometheus, are you running prometheus and all scrapers in host networking mode15:39
jayahnunfortunately, i am not an expert on that, but I will ask hyunsun and get back to you. just put your question on etherpad. :)15:39
portdirectjayahn: is there a reason your team cant attend these? time/language etc?15:40
jayahntime and language15:40
portdirectlol - the double whammy15:41
evrardjp:)15:41
jayahndan, robert often attend these. they have english capa.15:41
portdirectthe other thing i'd like to discuss at the ptg is how to bridge that gap a bit better15:41
jayahnbut most of others are not15:41
jayahni totally agree.. it has been very difficult point for me as well.15:42
portdirectlets start on the docs - and use that as a way to close the language barrier better15:42
mattmceuenNext week let's revisit meeting timing -- we still haven't found a time that works for everyone well15:43
mattmceuenBut if we can try harder and find a good time that would be really valuable15:43
mattmceuenAlright gotta keep movin'15:43
mattmceuen#topic Moving config to secrets15:43
*** openstack changes topic to "Moving config to secrets (Meeting topic: openstack-helm)"15:43
portdirectoh hai15:43
portdirectso - I'm working to move much of the config we have for openstack services to k8s secrets from configmaps15:44
evrardjp\o/15:44
portdirectthis should bring us a few wins15:44
portdirect1) stop writing passwords/creds to disc on nodes15:44
portdirect2) give us more granular control on rbac for ops teams*15:45
portdirect3) let us leverage k8s secrets backends etc15:45
portdirect* this will need to fully come in in follow up work, when we start to split out 'config' from 'sensitive config'15:46
portdirectJust wanted to highlight this - as it will be a bit disruptive for some work in flight15:47
portdirectbut i think moves us in the right direction.15:47
evrardjpit's positive disrupting -- maybe using release notes would help :p15:47
*** gmmaha has joined #openstack-meeting-515:47
mattmceuenMaking sure I understand the last part:  is this the path15:47
mattmceuen1) None of the configs are secrets today15:47
mattmceuen2) All configs that contain passwords etc will be secrets soon15:47
mattmceuen3) More fine-grained split between the two in the future15:47
mattmceuen?15:47
portdirect1) yup15:47
portdirect2) yup15:48
portdirect3) yeah15:48
portdirect* three may take some time to implement, and frankly may not be possible15:49
portdirectbut thats the intent15:49
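To make the ConfigMap-to-Secret move concrete, here is a minimal Python sketch of how the same rendered config file would look as each kind of manifest. It is an illustration only, not the chart templates: the helper names, the `keystone-etc` object name, and the config contents are assumptions. The one fixed point is that values under a Secret's `data` field are base64-encoded. That is an encoding, not encryption, so the gains listed above come from how Kubernetes treats Secret objects rather than from the encoding itself.

```python
import base64
import json


def as_configmap(name, files):
    """ConfigMap manifest: file contents are stored as plain strings."""
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name},
        "data": dict(files),
    }


def as_secret(name, files):
    """Secret manifest: values under `data` must be base64-encoded."""
    encoded = {
        key: base64.b64encode(value.encode("utf-8")).decode("ascii")
        for key, value in files.items()
    }
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name},
        "type": "Opaque",
        "data": encoded,
    }


# Hypothetical config file with a placeholder credential.
files = {
    "keystone.conf": "[database]\nconnection = mysql+pymysql://user:PLACEHOLDER@db/keystone\n"
}
configmap = as_configmap("keystone-etc", files)
secret = as_secret("keystone-etc", files)
print(json.dumps(secret, indent=2))
```

Because only the object kind and the encoding differ, a pod can keep mounting the file at the same path after the switch.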
mattmceuen#2 is my favorite15:49
srwilkers++15:49
mattmceuenBut yeah - #3 would be nice15:49
evrardjp++15:49
mattmceuenThat's awesome portdirect15:49
mattmceuenAny questions on secrecy before we move on15:50
evrardjpnone, positive improvement, thanks portdirect15:50
mattmceuen#topic Tempest15:50
*** openstack changes topic to "Tempest (Meeting topic: openstack-helm)"15:50
mattmceuenWe have several colors of lavender in the etherpad, I think this may be you jayahn :)15:50
jayahnjust curious on tempest usage15:51
mattmceuenSharing the full question:15:51
mattmceuenAT&T uses tempest? We found out that "regex, blacklist, whitelist" part is not working well. tempest 19.0.0 is required for pike, regex generation logic is changed from "currently available tempest 13.0.0 on osh upstream". Just curious how gating or AT&T uses tempest. We think tempest needs to be fixed, similar to rally.15:51
jayahnyeah.. that15:52
mattmceuenWe are still integrating tempest into our downstream gating15:52
evrardjptempest 19.0.0 is required in rocky for keystone api testing, if you do it. 18.0.0 will not work.15:52
evrardjpand queens15:53
portdirectthe tempest chart we have today, is very unloved :(15:53
portdirectand could do with a blanket, and some cocoa.15:53
jayahnso like discussion we had with rally, we need to find a good way to keep tempest version for each openstack release, and have corresponding values15:53
srwilkersi love it only enough to kick it every now and then15:53
mattmceuenrough crowd!15:54
evrardjpjayahn: so, for OSA, we are using tempest 18.0.0 for everything until rocky.15:54
evrardjpthat should work, as tempest is supposed to be backwards compatible15:54
portdirectevrardjp: we should make that same shift then15:54
evrardjpif you point me to your whitelist/blacklist, I can help on which version should be required per upstream branch15:55
evrardjpbut ourselves we are thinking to move everything to smoke.15:55
*** goutham1 has joined #openstack-meeting-515:55
portdirect++ this makes sense for community gates15:55
jayahnwe did manage to make it work.15:56
mattmceuenwhat did you do to get it working jayahn?15:56
portdirectjayahn: can you get a ps, with the changes you made?15:56
evrardjpportdirect: indeed, for community, I'd think that smoke tests are fine. You can do more thorough tests in periodics or internally.15:57
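The whitelist/blacklist selection and the smoke-only idea can be sketched as plain regex filtering. The Python below is an illustration under assumptions, not Tempest's own selection code, and it does not reproduce the regex generation change between 13.0.0 and 19.0.0 mentioned above. The `select_tests` helper and the test ids are hypothetical; the bracketed suffix only mimics the attribute tags that appear in test ids.

```python
import re


def select_tests(test_ids, whitelist, blacklist):
    """Keep ids matching any whitelist regex and no blacklist regex."""
    allow = [re.compile(pattern) for pattern in whitelist]
    deny = [re.compile(pattern) for pattern in blacklist]
    return [
        test_id for test_id in test_ids
        if any(regex.search(test_id) for regex in allow)
        and not any(regex.search(test_id) for regex in deny)
    ]


# Illustrative ids only.
tests = [
    "tempest.api.identity.v3.test_tokens.TokensV3Test.test_create_token[id-1,smoke]",
    "tempest.api.compute.servers.test_servers.ServersTestJSON.test_create_server[id-2]",
    "tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_network_basic_ops[id-3,smoke]",
]

selected = select_tests(
    tests,
    whitelist=[r"\bsmoke\b"],
    blacklist=[r"^tempest\.scenario\."],
)
print(selected)
```

Here the blacklist wins over the whitelist, so a smoke-tagged scenario test is still excluded; whether that precedence matches a given Tempest release is something to verify against that release.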
mattmceuen++15:58
jayahnportdirect: okay15:58
mattmceuenalright guys - we're at a couple minutes to time15:58
mattmceuen#topic Roundtable15:58
*** openstack changes topic to "Roundtable (Meeting topic: openstack-helm)"15:58
mattmceuenI will move the things we didn't get to today to next week, sorry for not hitting everything today15:59
jayahnpls review PS. :)15:59
mattmceuenYes!15:59
portdirectone big thing - helm 2.10 is here!15:59
mattmceuenhelm yeah!15:59
portdirectso expect to see some tls related patches from ruslan and I ;)15:59
jayahnyeah!15:59
evrardjpthanks everyone16:00
goutham1By any chance did anyone go through this16:00
goutham1https://storyboard.openstack.org/#!/story/200350716:00
goutham1portdirect: u said u will check yesterday did u find anything ??16:01
mattmceuenGotta shut down the meeting goutham1 - can we move this into #openstack-helm ?16:01
mattmceuenThanks all!16:01
mattmceuen#endmeeting16:01
*** openstack changes topic to "OpenStack Meetings || https://wiki.openstack.org/wiki/Meetings/"16:01
goutham1yes16:01
openstackMeeting ended Tue Aug 21 16:01:53 2018 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)16:01
openstackMinutes:        http://eavesdrop.openstack.org/meetings/openstack_helm/2018/openstack_helm.2018-08-21-14.59.html16:01
openstackMinutes (text): http://eavesdrop.openstack.org/meetings/openstack_helm/2018/openstack_helm.2018-08-21-14.59.txt16:01
openstackLog:            http://eavesdrop.openstack.org/meetings/openstack_helm/2018/openstack_helm.2018-08-21-14.59.log.html16:01
*** njohnston has joined #openstack-meeting-516:03
*** ricolin has quit IRC16:06
*** gagehugo has left #openstack-meeting-516:09
*** goutham1 has quit IRC16:14
*** jamesgu has left #openstack-meeting-516:34
*** markvoelker has joined #openstack-meeting-516:50
*** gmmaha has left #openstack-meeting-516:51
*** yamahata has quit IRC17:17
*** portdirect has quit IRC17:54
*** yamahata has joined #openstack-meeting-517:58
*** mjturek has joined #openstack-meeting-518:10
*** lamt has quit IRC18:37
*** vkmc has quit IRC18:37
*** ttx has quit IRC18:37
*** fungi has quit IRC18:37
*** ttx has joined #openstack-meeting-518:38
*** vkmc has joined #openstack-meeting-518:40
*** TheJulia has quit IRC18:42
*** polvi has quit IRC18:42
*** fungi has joined #openstack-meeting-518:48
*** spiette has quit IRC19:32
*** spiette has joined #openstack-meeting-519:36
*** evrardjp has quit IRC19:39
*** shan5464_ has quit IRC19:39
*** pcarver has quit IRC19:39
*** njohnston has left #openstack-meeting-519:55
*** beisner_ has joined #openstack-meeting-520:12
*** tdoc has quit IRC20:19
*** beisner has quit IRC20:19
*** beisner_ is now known as beisner20:19
*** mjturek has quit IRC21:08
*** hongbin has quit IRC22:44
