21:00:43 #startmeeting containers
21:00:44 Meeting started Tue Sep 18 21:00:43 2018 UTC and is due to finish in 60 minutes. The chair is strigazi. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:00:45 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:00:47 The meeting name has been set to 'containers'
21:01:05 #topic Roll Call
21:01:09 o/
21:01:10 hello
21:01:13 o/
21:01:27 o/
21:02:38 Let's go directly to stories/ideas, no more bps :)
21:02:49 #topic Stories/Ideas
21:03:33 What I would like more review on is: Fix cluster update command https://storyboard.openstack.org/#!/story/1722573 Patch in review: https://review.openstack.org/#/c/600806/
21:03:58 To unblock cluster-updates
21:04:13 we'll get some eyes on it
21:04:33 plus we need a patch to be able to fix clusters in UPDATE_FAILED
21:04:47 I can do it, I'll ping you
21:05:11 In heat at least you can do stack update and go back to a good state
21:05:50 #action strigazi to push a patch for allowing cluster-update on UPDATE_FAILED clusters
21:06:30 I would also like to ask for review on making flannel self hosted:
21:06:46 https://review.openstack.org/#/c/597150/
21:06:56 oh awesome!
21:07:07 that will make managing CNI a little cleaner :D
21:07:10 we'll check that out too
21:07:18 Last week we rebooted hypervisors for l1tf and many nodes didn't have network
21:07:38 strigazi: i had been working on some changes that would be good to add to https://review.openstack.org/#/c/585420/
21:08:30 imdigitaljim: I'll strip the patch from the controller manager container, and just have CI code + git mv the container agent
21:08:41 other changes can be added on top
21:08:53 You can push a patch with this one as dependency
21:09:19 imdigitaljim: The changes are for containers or the ci?
21:09:40 some additional changes for the containers
21:09:56 sorry some additional containers*
21:10:00 and only a couple changes
21:10:27 I have a latest (3.3.9) minimal etcd container
21:10:32 if they are not for the CI let's do them in a follow-up patch
21:10:50 and a heat-container-agent using alpine as its base
21:10:53 sounds good
21:10:54 imdigitaljim: sounds good, based on quay.io?
21:11:08 the etcd is
21:11:08 for etcd ^^
21:11:23 FROM quay.io/coreos/etcd:$ETCD_VERSION
21:11:29 cool
21:11:47 3.3.x cleans the log of the api server
21:11:56 :)
21:11:58 and it's much smaller
21:12:04 from blah blah compacted blah blah
21:12:57 I've also finalized the calico update to latest
21:13:16 on our side, i'll get it all updated soon
21:13:25 had a slow ramp-up since our trip :]
21:13:44 :)
21:14:27 oh, right, can you also update the no-keypair option?
21:14:48 I'll also update the patch to pass the actual public key as a string
21:15:03 sorry for being late
21:15:13 not a keypair object, to allow cluster updates
21:15:17 flwang: welcome
21:16:01 Since flwang is here,
21:16:07 https://review.openstack.org/#/c/590443/
21:16:08 this one/
21:16:08 ?
21:16:16 flwang: hi! o/
21:16:45 imdigitaljim: oh, I thought it was in conflict, looks good
21:16:55 yes this one
21:16:59 imdigitaljim: strigazi: i'm keen to know the outcome of Blizzard's visit to CERN ;)
21:17:58 flwang: we missed you :)
21:18:02 strigazi: i think going into the next couple weeks ill be looking into the in-place upgrade mechanism as well
21:18:09 flwang: ^
21:18:42 cool
21:19:08 did you have any new updates to that?
21:19:17 flwang: eandersson might have a fix for the UTs for the cluster healing cmd
21:19:49 strigazi, flwang are you guys using Designate? is anyone else who might be lurking using it? just curious
21:19:53 flwang: we just need to test if it works in an actual py36 env
21:20:33 imdigitaljim: not much, we can sync tmr if you can
21:20:39 yeah sounds good!
21:20:44 strigazi: yep, i saw that
21:21:03 colin-: we have our own DNS, many years old :)
21:21:06 i will test it with both UT and run it in a py36 env
21:21:18 also ill rebase/update the previous PRs with some new changes, such as starting to use config file instead of flags to move forward from deprecation
21:21:24 colin-: no, we haven't enabled Designate, but it's on our roadmap
21:21:36 I wish we knew someone on the Designate team
21:21:36 ok
21:21:38 and kube-proxy daemonset most likely
21:22:47 imdigitaljim: sounds good, I would do proxy first
21:24:09 imdigitaljim: flwang in calico, can you tell easily which pods don't have network?
21:24:56 imdigitaljim: flwang with the l1tf reboots, the nodes were up, they had ips but not network access
21:25:03 strigazi: what do you mean don't have network? there is a calico command that can help you understand the network status of each pod
21:25:12 yeah
21:25:19 i had a blog but i didn't have time to finish it :( shame on me
21:25:20 flwang: ping each other
21:25:25 and you can generally see that the daemonset on the nodes isnt coming online
21:25:28 you can see which nodes are having issues
21:26:07 well, in our case, flannel was up, the pods too but they couldn't connect to each other
21:26:17 strigazi: yep, sure, the debug process with calico is fairly normal like general network debug
21:26:44 ok, I was looking for a heartbeat or smth
21:26:58 1. make sure pod can talk to local node, 2. make sure the nodes can talk to each other
21:27:32 need some iptables knowledge
21:27:39 yes, but it is a bit manual. I was doing this exactly
21:27:48 due to involving the network policy
21:28:32 strigazi: yep, not sure if there is a fully automated way to debug it, but given it's normal process, we could probably write a script
21:29:03 we're deploying magnum into prod in catalyst cloud this week, so pls forgive my latency for upstream work
21:29:16 no prob
21:29:17 gl :)
21:29:21 yeah gl!
21:29:38 go flwang!
21:29:44 \o/
21:30:00 fingers crossed for me :D
21:30:06 keep us posted :)
21:30:14 sure
21:30:21 ah, i do have a question
21:30:35 if heat is going well, magnum will go well too
21:30:41 based on the code, magnum doesn't support rotating certs for k8s
21:30:53 no, it doesn't
21:30:57 but we're showing the menu on magnum-ui
21:31:15 shouldn't we drop it until we can really support it
21:31:25 I think if you change the policy, it will hide
21:31:41 not 100% sure
21:31:53 strigazi: good point, will try
21:32:14 I can ask tmr in our team
21:33:02 https://ibb.co/cFxT4z
21:33:25 that's the dashboard we're using in catalyst cloud
21:34:06 it looks cool
21:34:38 are you in the openstack passport program, not sure how it is called
21:34:47 strigazi: we are
21:35:04 users will see magnum there?
21:35:18 strigazi: yes, for sure, but not now
21:35:26 nice
21:35:29 probably next couple of weeks
21:35:38 very cool
21:35:48 https://catalystcloud.nz/services/paas/catalyst-kubernetes-service/
21:36:05 we're using the vanilla upstream Magnum now
21:36:15 and we'd like to stick to that as much as we can
21:36:42 we'd like to upstream whatever can benefit others
21:36:56 :)
21:38:02 It seems we are covered, anything else to discuss?
21:39:15 no
21:39:17 im good
21:39:24 no all set here :)
21:39:27 ah, one small thing
21:39:38 i can manage to get the ds to work for keystone auth
21:39:51 i will propose new patch set early next week
21:40:07 +1
21:41:14 Thanks for joining imdigitaljim flwang cbrumm colin- eandersson
21:41:22 Anytime!
21:41:32 strigazi: thank you
21:41:32 #endmeeting
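
The two-step check described at 21:26:58 (pod to local node, then node to node), together with the suggestion at 21:28:32 to script it, could look roughly like the Python sketch below. It is only an illustration and was not discussed in the meeting: the function names `ping`, `unreachable` and `check_node` are hypothetical, the pod and node IP lists are assumed to be supplied by the operator, and the `ping -c 1 -W` flags assume the Linux iputils `ping`. It only tests ICMP reachability, so it would not catch problems caused by network policy or iptables rules.

```python
import subprocess
from typing import Callable, Dict, Iterable, List


def ping(host: str, timeout_s: int = 2) -> bool:
    """Send one ICMP echo request; True if the host answered.

    Assumes the Linux iputils `ping` (-c count, -W timeout in seconds).
    """
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def unreachable(targets: Iterable[str],
                probe: Callable[[str], bool] = ping) -> List[str]:
    """Return the targets that the probe could not reach, in input order."""
    return [t for t in targets if not probe(t)]


def check_node(local_pod_ips: Iterable[str],
               peer_node_ips: Iterable[str],
               probe: Callable[[str], bool] = ping) -> Dict[str, List[str]]:
    """Run on one node: step 1 probes local pods, step 2 probes peer nodes."""
    return {
        "pods": unreachable(local_pod_ips, probe),
        "nodes": unreachable(peer_node_ips, probe),
    }
```

Run on each node in turn, an empty `pods` list with a non-empty `nodes` list would point at node-to-node connectivity rather than the local CNI setup. The `probe` parameter exists so the logic can be exercised without sending real traffic.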