15:00:50 <mattmceuen> #startmeeting openstack-helm
15:00:50 <openstack> Meeting started Tue May 29 15:00:50 2018 UTC and is due to finish in 60 minutes.  The chair is mattmceuen. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:51 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:54 <openstack> The meeting name has been set to 'openstack_helm'
15:00:56 <mattmceuen> #topic rollcall
15:01:02 <mattmceuen> GM all!
15:01:05 <roman_g> good morning
15:01:13 <piotrrr> hi
15:01:16 <lamt> o/
15:01:19 <tdoc_> hello
15:01:21 <Guest157`> hihi!
15:01:23 <mattmceuen> Here's the agenda for today: https://etherpad.openstack.org/p/openstack-helm-meeting-2018-05-29
15:01:28 <mattmceuen> Welcome
15:01:34 <Guest157`> oops
15:01:38 <mattmceuen> Please add in anything else you'd like to discuss in there
15:01:57 <mattmceuen> Hey d|k :)  you have given up your anonymity!
15:02:15 <portdirect> o/
15:02:24 <mattmceuen> Hey portdirect!  Aren't you on vacation?
15:02:29 <portdirect> Yup
15:02:58 <portdirect> In a mountain chalet on dodgy 3g :)
15:02:59 <roman_g> o/
15:03:05 <mattmceuen> Nice
15:03:27 <mattmceuen> hey roman_g tdoc piotrrr lamt
15:03:34 <rwellum> o/
15:03:40 <mattmceuen> hey rwellum
15:04:03 <rwellum> Hey mattmceuen
15:04:31 <mattmceuen> portdirect just make sure not to work too hard in this team meeting, just provide color commentary and lob virtual rotten tomatoes
15:05:20 <mattmceuen> Full disclosure:  I am brushing off all kinds of mental and technological cobwebs this morning following the long (USA) weekend, following the Vancouver summit
15:05:43 <mattmceuen> First thing
15:05:51 <mattmceuen> #topic Quick Summit Recap
15:06:06 <mattmceuen> A lot of good things were set in motion for OSH in Vancouver
15:06:42 <mattmceuen> Met a lot of folks f2f which was excellent as well, and a number of imminent or potential OSH users
15:07:21 <tdoc_> Props on all the sessions, I attended most of them, good stuff. (I would be one of those.)
15:07:22 <mattmceuen> I won't give a rundown of them all here, since it's not all public - but I will be reaching out to them to get them pulled into our team
15:07:38 <mattmceuen> Awesome - thanks for attending tdoc
15:07:59 <mattmceuen> Glad to hear you thought they were valuable, since obviously I'm slightly biased :-D
15:08:41 <tdoc_> Yeah, I think it gave us a good impression of where the project is and is heading...
15:08:50 <piotrrr> yup, good stuff indeed, +1
15:10:02 <mattmceuen> Also there was a lot of discussion around Airship in the summit, which was great from an OSH perspective, as Airship is a consumer of OSH.  As well as one platform to run it on.
15:10:36 <roman_g> summit recap from a youtube viewer: good (not great) presentations - listeners seemed to be a bit frustrated. But I hope that you attracted at least some attention to Airship, since that was one of the goals. Overall, looking at other videos and comparing to previous years, it's clear that there is way less hype (which is good), and more work toward stabilizing the platform.
15:11:29 <portdirect> Thoughts on why people seemed frustrated?
15:11:37 <mattmceuen> Agree with your assessment roman_g that there is overall less hype and more delivery focus in the presentations
15:11:58 <mattmceuen> Yes, was the frustration specific to the OSH presentations, or do you mean overall with the summit roman_g?
15:12:23 <roman_g> OSH/Airship.
15:13:24 <roman_g> Thoughts on why people seemed frustrated? - I would say that not all people who were coming to your sessions were target audience.
15:13:35 <mattmceuen> Ok - I still have a few related sessions I need to catch up on via youtube, and I'll keep an ear out for that
15:13:48 <roman_g> That's what I understood from questions which were asked.
15:14:07 <portdirect> Same, good info roman_g
15:14:25 <mattmceuen> The sessions I was at in person had a good vibe with a lot of Q&A that extended far past the end of the presentation, which was great to see
15:14:39 <roman_g> that's very good
15:15:00 <mattmceuen> But I'll re-watch those too in case I missed some things -- there were a lot of notes to be taken
15:15:05 <mattmceuen> Thanks roman_g
15:15:18 <mattmceuen> Any other summit tidbits to share?
15:15:40 <roman_g> want to visit one of them )
15:15:45 <roman_g> as a speaker
15:16:02 <rwellum> Re the workshop I attended..
15:16:05 <mattmceuen> +1
15:17:15 <rwellum> I could help out a bit with the folks sitting around me - most had no or very little k8s experience. I did think that running a set of scripts one after another was a bit counterintuitive - because on one hand it showed the power of osh on the other hand it didn't teach much about what was going on.
15:17:43 <portdirect> Agree
15:18:00 <rwellum> But I don't have a good alternative tbh.
15:18:01 <mattmceuen> Yeah, that's a hard thing to balance
15:18:12 <mattmceuen> Was wrestling with the same internal monologue rwellum
15:18:37 <portdirect> Unfortunately the scenarios I had slides for changed at the last minute (while on stage)
15:18:49 <rwellum> Yeah I noticed that portdirect
15:18:53 <portdirect> So had to freestyle the whole thing.
15:18:55 <rwellum> You had to pivot
15:19:04 <piotrrr> My impression is that some people (like me), might not necessarily care much about Airship, but do care about using OSH. And they want to use it to deploy OS on top of their, already existing, kubernetes clusters. So, I can imagine that such people might be a bit frustrated with yet one large box of moving parts being introduced into the picture (Airship). Sure, as far as I understand airship
15:19:06 <piotrrr> will not be a hard requirement for OSH, but I would say it would make sense to make a clearer distinction between the two, where possible.
15:19:24 <mattmceuen> Airship has a nice four-line "stand up the stack" experience, which is awesome for showing that the thing exists and works, but really doesn't give hands-on at all.  OTOH you don't want to make it too deep or else newbies may get left behind
15:19:54 <mattmceuen> So I think the script approach is at least one good middle ground way to demo the product and peel back the curtain just a bit
15:20:08 <rwellum> Many of the people around me were confused when we got to the stage where we had to run 'make'.
15:20:16 <rwellum> For example
15:20:47 <roman_g> Motto: run `watch -n1 "kubectl -n xxxx get pods"` for each namespace in tmux screens, and provide more useful info on what is happening during installation and other phases.
15:20:55 <mattmceuen> piotrrr yep, no requirement to use airship for OSH at all.
15:21:17 <portdirect> rwellum: all the vms had the wrong things provisioned on them :(
15:21:21 <portdirect> Oh well
15:21:24 <rwellum> roman_g: +1
15:21:28 <roman_g> also good to have some monitoring of docker pulling images, but it's impossible right now
15:21:32 <tdoc_> rwellum: that was a bit unclear in the instructions, people were running the make in the wrong dir.
15:21:47 <rwellum> tdoc_: me too - took a while to figure out
15:22:01 <rwellum> portdirect: :(
15:22:27 <tdoc_> it's because I had some prior knowledge that I was able to follow...
15:22:39 <portdirect> Yeah I'm really sorry
15:22:39 <rwellum> Overall I think most people I spoke to were just very happy to play with a live demo. So that was good.
15:22:54 <tdoc_> but unfortunately in my instances the db charts had troubles, so got stuck there.
15:22:59 <mattmceuen> Yes I will say that for having the wrong things provisioned on the VMs -- way to roll with the punches portdirect :)  made for a still valuable workshop
15:23:01 <portdirect> I had to figure it out with over 100 eyes on me
15:23:13 <roman_g> 50 ppl
15:23:24 <rwellum> tdoc_: I think although there were plenty of VM's - there were some latency issues maybe>
15:23:25 <mattmceuen> 300 eyes?
15:23:27 <rwellum> ?
15:23:35 <tdoc_> yup
15:23:59 <piotrrr> I think some people had issues with containers/pods getting stuck for some reason
15:24:00 <tdoc_> not complaining, given the conditions etc, it was still worthwhile..
15:24:11 <mattmceuen> good
15:24:44 <mattmceuen> Alrighty - moving on to our next topic:
15:24:51 <mattmceuen> #topic Storyboard
15:25:06 <mattmceuen> So we had reasons to move to Storyboard before
15:25:12 <mattmceuen> But now we are truly motivated
15:25:24 <mattmceuen> I have been told that once we migrate to Storyboard, we can have
15:25:26 <mattmceuen> ...
15:25:30 <mattmceuen> Honey Badger Stickers
15:25:52 <rwellum> %^$*!!!
15:25:54 <mattmceuen> (which is of course the OSH mascot)
15:25:55 <mattmceuen> I know right
15:25:57 <lamt> is that the condition for stickers?
15:26:01 <mattmceuen> It is
15:26:14 <lamt> I will volunteer then - I want stickers
15:26:17 <mattmceuen> slightly tongue-in-cheek, but that's the agreement :)
15:27:08 <piotrrr> What's the status of the migration?
15:27:12 <portdirect> Who is leading this atm?
15:27:20 <mattmceuen> The biggest challenges with migration will be
15:27:20 <mattmceuen> 1) communicating it to everyone
15:27:20 <mattmceuen> 2) using the new storyboard-friendly git commit headers
15:27:25 <mattmceuen> rwellum
15:27:43 <mattmceuen> He has done a POC of the migration that he shared a couple weeks back for feedback
15:28:01 <mattmceuen> rwellum, have any concerns been raised?
15:28:01 <rwellum> Yeah the POC is still up for everyone to look at and play with
15:28:09 <rwellum> No not to me.
15:28:26 <portdirect> So let's just pull the trigger?
15:28:30 <rwellum> There are other teams I spoke to at the Summit that are holding back for various reasons.
15:28:31 <mattmceuen> It sounds like the migration itself is small potatoes, and the trigger can be pulled whenever we're ready
15:28:41 <rwellum> Yeah - it's all ready I think.
15:28:44 <portdirect> rwellum: what like?
15:28:48 <mattmceuen> honey badger don't care about holding back
15:29:06 <mattmceuen> Do we want to set a target of e.g. next monday so we can communicate?
15:29:19 <portdirect> Also can you update the docs, to point to storyboard etc?
15:29:19 <rwellum> Like Cinder for example, they are so embedded in the old way and the sample migration took days to run and didn't complete.
15:29:29 <mattmceuen> portdirect yup
15:29:45 <mattmceuen> interesting rwellum
15:30:09 <rwellum> But I think for OSH - less worries as still new, you guys are writing the process newly.
15:30:26 <rwellum> Bad English but ykwim
15:30:50 <portdirect> Yeah, we dont have that much stuff to pack up and take over.
15:31:08 <rwellum> Yeah so if we target next monday, I'll contact the infra team and ask them to initiate the next step.
15:31:49 <mattmceuen> Good.  Yeah, the biggest constructive criticism I've received re: OSH is that we could use a better community roadmap so it's easy to see where the project is going and to volunteer for work items.  We've been making good strides but the storyboard migration is a great opportunity to get that in good shape.
15:32:12 <portdirect> ++
15:32:28 <mattmceuen> Excellent - let's do that, let me know if Monday turns out to be a bad day for any reason.  I'll plan on sending some comm out in the ML
15:32:48 <rwellum> Ok I will.
15:33:09 <tdoc_> once you guys switch, will that mean new bugs can't be filed in launchpad?
15:33:37 <tdoc_> ie will it be clear to end users at what point they should use which tracker?
15:33:40 <mattmceuen> It definitely means they /shouldn't/ be; will we be able to actually disable launchpad?
15:33:49 <rwellum> That's the idea tdoc_
15:34:00 <rwellum> The disable part - that's one of the things I want to check with infra
15:34:08 <rwellum> I'll report back.
15:34:13 <mattmceuen> excellent
15:34:25 <rwellum> Also think it would be good to add a 'low hanging fruit' project group - for simple things to pass onto new users.
15:34:31 <mattmceuen> +1
15:34:34 <rwellum> Simple bugs etc.
15:34:55 <roman_g> +1
15:35:08 <roman_g> that's important for me
15:35:10 <mattmceuen> We would like to cut our 1.0 release in the next couple months, and identifying e.g. low-hanging doc updates would be good as well
15:35:13 <piotrrr> +1
15:35:28 <rwellum> Yup - that would be a great addition too
15:35:30 <rwellum> imo
15:35:38 <portdirect> rwellum: do you have the bandwidth to take a stab at getting a low hanging list up?
15:35:45 <rwellum> Yes I will do that.
15:35:55 <portdirect> Awesome :D
15:36:00 <portdirect> Thx dude
15:36:06 <rwellum> np
15:36:46 <mattmceuen> Next topic:
15:36:56 <mattmceuen> #topic Creating a set of guidelines which would help contributors troubleshoot issues, e.g. stuck containers
15:37:10 <mattmceuen> piotrrr, want to speak to this one?
15:38:57 <mattmceuen> Adding some solid operational docs is definitely something we want to do as part of our 1.0 release
15:39:15 <piotrrr> yes, so we're just starting with OSH, and we're running into all kinds of different issues. Stuck containers/pods etc. We have no know-how on how to troubleshoot those issues. If we're running into such problems, other contributors/operators might be too.
15:39:22 <mattmceuen> It would be good to capture a list of topics to speak to (and then create storyboard items for!) -- this is a good one
15:39:47 <mattmceuen> +1 piotrrr
15:39:50 <tdoc_> I think it would be nice to have. I've been at the point where I see a bunch of pods in init state and wondering what to do.
15:39:51 <piotrrr> So, my question would be whether the OSH community would like to collab on creating a doc with tips/hints for troubleshooting OSH and OS running on top of it
15:40:10 <portdirect> Yes!
15:40:37 <portdirect> And this is where new users add huge value :)
15:40:45 <roman_g> tdoc_: they were docker-pulling? )) you can't monitor progress of that unfortunately
15:41:09 <tdoc_> It's often unclear to me which pod is waiting for which other pod to complete.
15:41:16 <mattmceuen> We have a troubleshooting doc already, we should all get into the habit a little bit more of adding things into it after we fix them!
15:41:22 <portdirect> I know to go check the init containers, then eventually the kubelet logs - but this is prob not intuitive for new k8s users
15:41:41 <tdoc_> +1
15:41:44 <roman_g> +1.
15:42:02 <mattmceuen> Is there any good "general" kubernetes troubleshooting guide out there that we can refer to for good "technique"?
15:42:23 <portdirect> Just the k8s docs that I'm aware of
15:42:30 <roman_g> haven't seen that
15:42:34 <portdirect> Though they are quite thin
15:42:53 <roman_g> but having dashboard open helps a bit
15:43:04 <rwellum> I have some debug steps from kolla-k8s - some would apply, I can look at the troubleshooting guide and see if any can help.
15:43:23 <portdirect> mattmceuen: in the workshop, how many people were hung up on ceph ns activation?
15:43:28 <mattmceuen> Tip #1 :)  we have LMA user interfaces - good one roman_g
15:44:00 <mattmceuen> I think at least 3-5 folks portdirect
15:44:08 <mattmceuen> that step was easy to miss for whatever reason
15:44:12 <piotrrr> ok, how do we want to start? Maybe by creating an etherpad where everyone from the OSH team could braindump their approaches/hints for troubleshooting. We can organize those into public docs later on.
15:44:22 <roman_g> apache airflow dag dashboard?
15:44:40 <portdirect> roman_g: not airship
15:44:58 <portdirect> The actual lma stack from osh-infra.
15:45:06 <mattmceuen> https://etherpad.openstack.org/p/openstack-helm-troubleshooting
15:45:20 <roman_g> portdirect: ah, yep
15:45:25 <mattmceuen> ^ let's use that to jot down ideas as they come to us (and troubleshooting steps as we do them)
15:45:37 <roman_g> mattmceuen: pin to channel topic here & in slack?
15:45:58 <mattmceuen> Then we can turn them into storyboard things and doc updates at our convenience
15:46:01 <mattmceuen> Good idea
15:47:06 <mattmceuen> thanks piotrrr for bringing this up, let's revisit next week and see how the etherpadding is going
15:47:11 <rwellum> https://github.com/openstack/kolla-kubernetes/blob/master/doc/source/deployment-guide.rst - look at the ts guide at the end. I did most of that.
15:47:48 <piotrrr> sounds good, thanks
15:48:04 <rwellum> I'll add to the etherpad
15:48:26 <piotrrr> (I would be happy to help turn the notes into proper docs later on)
15:48:43 <mattmceuen> That would be awesome, thank you!
15:49:08 <mattmceuen> rwellum there is a lot of good stuff in there that could be adapted to OSH
15:49:48 <rwellum> Yeah it's all k8s
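A minimal sketch of the "check the init containers" step mentioned in this topic, assuming the pod list comes from `kubectl -n <namespace> get pods -o json`. The function name and the sample pod names are hypothetical and only for illustration; the fields read (`status.initContainerStatuses[].state`) are the standard Kubernetes pod status fields. It reports every init container that has not yet exited successfully, which is usually the first hint of what a pod is waiting on.

```python
def pending_init_containers(pod_list):
    """Return (pod, init container, state, reason) for unfinished init containers.

    `pod_list` is the parsed JSON from `kubectl get pods -o json`.
    """
    stuck = []
    for pod in pod_list.get("items", []):
        pod_name = pod["metadata"]["name"]
        statuses = pod.get("status", {}).get("initContainerStatuses", [])
        for status in statuses:
            state = status.get("state", {})
            terminated = state.get("terminated")
            if terminated and terminated.get("exitCode") == 0:
                continue  # this init container finished successfully
            # The state dict has a single key: waiting, running or terminated.
            state_name = next(iter(state), "unknown")
            reason = (state.get(state_name) or {}).get("reason", "")
            stuck.append((pod_name, status["name"], state_name, reason))
    return stuck


# Hypothetical sample data shaped like `kubectl get pods -o json` output.
sample = {
    "items": [
        {"metadata": {"name": "keystone-api-0"},
         "status": {"initContainerStatuses": [
             {"name": "init",
              "state": {"terminated": {"exitCode": 0, "reason": "Completed"}}}]}},
        {"metadata": {"name": "nova-api-0"},
         "status": {"initContainerStatuses": [
             {"name": "init",
              "state": {"running": {"startedAt": "2018-05-29T15:00:00Z"}}}]}},
        {"metadata": {"name": "glance-api-0"},
         "status": {"initContainerStatuses": [
             {"name": "init",
              "state": {"waiting": {"reason": "PodInitializing"}}}]}},
    ]
}

for row in pending_init_containers(sample):
    print(row)
```

In practice the input would be loaded with `json.load` from the kubectl output; the logs of a reported init container (`kubectl logs <pod> -c <init container>`) are then the next place to look.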
15:49:59 <mattmceuen> Ok - we have another item tdoc_ wanted to bring up
15:50:03 <mattmceuen> #topic Roundtable
15:50:39 <tdoc_> yeah, so I brought this up in the irc chan... Having some DNS issues with rabbitmq.
15:51:08 <tdoc_> It seems in my case the rabbitmq-server won't start because it can't lookup its hostname.
15:51:19 <mattmceuen> I haven't had a chance to catch up on the full conversation yet; you're still seeing the issue tdoc_?
15:51:55 <tdoc_> That DNS record does not exist yet, because the readiness/liveness probes don't pass yet... So chicken-egg.
15:52:16 <portdirect> It's odd that the pod cannot resolve itself though
15:52:17 <tdoc_> I need to add this service.alpha.kubernetes.io/tolerate-unready-endpoints to make it work
15:52:30 <mattmceuen> I think you noted it's not an issue in the gates; do you know what the difference between your environment & the gates might be?
15:53:24 <mattmceuen> What version of k8s are you running?
15:53:39 <tdoc_> 1.10.2
15:54:10 <portdirect> And you've now tried with both kube-dns and coredns?
15:54:15 <tdoc_> I'm somewhat assuming it does not come up in the gates, but not familiar with that environment myself yet...
15:54:23 <tdoc_> yup, tried both
15:54:44 <portdirect> Hmm, this is an odd outlier :/
15:55:00 <portdirect> Are you on very slow machines?
15:55:15 <tdoc_> As far as I've understood the docs though, k8s is not supposed to expose dns records for headless services until the probes indicate ready.
15:55:27 <portdirect> Correct
15:55:47 <portdirect> Though the pod should be able to resolve itself
15:55:52 <tdoc_> yeah, all my stuff is running in our local openstack cloud, so in VMs which might not be the fastest...
15:56:10 <tdoc_> It complains about rabbitmq-rabbitmq-0.rabbitmq-dsv-7b1733.openstack.svc.cluster.local
15:56:29 <tdoc_> Which I think is the record for the service
15:56:37 <portdirect> From the 1st rabbit pod?
15:56:42 <tdoc_> yup
15:56:58 <portdirect> Can you paste the full logs?
15:57:07 <mattmceuen> Please paste them in the OSH chat
15:57:16 <mattmceuen> So we can keep this going since we're almost out of time
15:57:19 <tdoc_> hmm, i'm not sure I have those handy right now
15:57:42 <mattmceuen> No worries, if you can share them when you have them handy that would be helpful
15:57:44 <tdoc_> it's something like ERROR: epmd .... that hostname.... domain not found....
15:57:54 <tdoc_> (sorry best I can do right now)
15:58:15 <mattmceuen> We'll get it figured out
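A rough sketch of the naming involved in the rabbitmq DNS discussion above, assuming the standard Kubernetes convention that a StatefulSet pod behind a headless service is published as `<pod>.<service>.<namespace>.svc.<cluster domain>`. The helper names, the `selector` argument and the port value passed in the usage lines are illustrative assumptions, not taken from the OSH charts; the annotation is the one tdoc_ mentions adding.

```python
def statefulset_pod_fqdn(pod, service, namespace, cluster_domain="cluster.local"):
    """Build the per-pod DNS name published under a headless service."""
    return f"{pod}.{service}.{namespace}.svc.{cluster_domain}"


def headless_service_manifest(name, namespace, port, selector):
    """Headless Service that also publishes DNS records for not-yet-ready pods.

    `selector` is a label dict supplied by the caller (hypothetical here).
    """
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {
            "name": name,
            "namespace": namespace,
            "annotations": {
                # Annotation named in the discussion above.
                "service.alpha.kubernetes.io/tolerate-unready-endpoints": "true",
            },
        },
        "spec": {
            "clusterIP": "None",  # "None" is what makes the service headless
            "ports": [{"name": "amqp", "port": port}],
            "selector": selector,
        },
    }


fqdn = statefulset_pod_fqdn("rabbitmq-rabbitmq-0", "rabbitmq-dsv-7b1733", "openstack")
print(fqdn)

manifest = headless_service_manifest(
    "rabbitmq-dsv-7b1733", "openstack", 5672, {"application": "rabbitmq"}
)
print(manifest["metadata"]["annotations"])
```

The name built here matches the one in the error tdoc_ quotes, which suggests it is the per-pod record under the headless service rather than the record for the service itself.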
15:58:48 <mattmceuen> Alright, with two minutes left - any final discussion points?
15:59:44 <mattmceuen> Alright - thanks for a great meeting all
15:59:51 <mattmceuen> See you in #openstack-helm
15:59:54 <mattmceuen> #endmeeting