09:01:01 <flwang1> #startmeeting magnum
09:01:02 <openstack> Meeting started Wed Aug 26 09:01:01 2020 UTC and is due to finish in 60 minutes. The chair is flwang1. Information about MeetBot at http://wiki.debian.org/MeetBot.
09:01:03 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
09:01:05 <openstack> The meeting name has been set to 'magnum'
09:01:10 <flwang1> #topic roll call
09:01:12 <flwang1> o/
09:01:24 <brtknr> o/
09:01:25 <openstackgerrit> Spyros Trigazis proposed openstack/magnum master: Very-WIP: Drop hyperkube https://review.opendev.org/748141
09:01:25 <dioguerra> o/
09:01:29 <strigazi> o/
09:02:56 <flwang1> thank you for joining, guys
09:03:13 <flwang1> shall we go through the topic list?
09:03:57 <strigazi> +1
09:04:47 <flwang1> ok, we (catalyst cloud) recently had a security review done by a 3rd party and we got some good comments to improve on
09:05:29 <flwang1> hence you have probably seen the patch i proposed to separate the CAs for k8s, etcd and front-proxy, though we discussed this a long time ago
09:06:01 <flwang1> the patch https://review.opendev.org/746864 has been rebased on the CA rotate patch and tested locally
09:06:02 <strigazi> We knew that before, right? That each node could contact etcd.
09:06:09 <flwang1> strigazi: yep
09:06:37 <flwang1> each node can use the kubelet cert to access etcd
09:06:43 <flwang1> to be more clear ^
09:07:06 <strigazi> or the kube-proxy cert
09:07:10 <flwang1> yes
09:07:16 <strigazi> or any cert signed by the CA
09:07:36 <flwang1> so, please review that one asap, hopefully we can get it in this V cycle
09:08:02 <strigazi> I have a question though
09:08:06 <flwang1> sure
09:08:21 <flwang1> i'm listening
09:08:27 <strigazi> We need this patch. I'm not against it by any means; it helps a lot.
09:09:07 <strigazi> The trust and trustee are on all nodes anyway. So one can get whatever certs they need, this is known, right?
09:09:34 <strigazi> And
09:09:53 <strigazi> What about serving the service account keypair via the API?
09:10:18 <strigazi> That's it
09:10:58 <flwang1> re the trust and trustee, yep. that's a good point, we can try to limit the request in the future so it is only allowed from master nodes
09:11:20 <flwang1> though i don't know how yet
09:11:20 <strigazi> So RBAC on the magnum API
09:11:46 <strigazi> I see different trustees or application creds as a solution.
09:11:47 <flwang1> some silly (home-made) RBAC probably
09:11:57 <flwang1> strigazi: that also works
09:12:12 <strigazi> Why silly? It can't get better than this
09:12:34 <strigazi> a different policy per role
09:12:41 <flwang1> sorry, i mean, openstack doesn't really have a good RBAC design yet
09:13:52 <strigazi> This point needs discussion. Maybe a spec. I just wanted to mention it.
09:13:58 <strigazi> For the 2nd point?
09:14:20 <strigazi> Since we pass the type in the API, we could serve the service account RSA keypair
09:14:30 <flwang1> yep, it's a very good point. thanks for the reminder
09:15:35 <flwang1> strigazi: can you explain more about the benefit of serving the service account RSA keypair via the API?
09:15:53 <strigazi> Adding a new master NG
09:16:47 <strigazi> ATM we can't access the RSA keypair. It is a hidden (as it should be) parameter in the heat stack
09:16:52 <flwang1> master NG for what? resizing?
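
An aside on the shared-CA issue discussed above: with a single CA signing everything, etcd accepts any client cert that CA ever issued, including the kubelet's. A quick way to see it, assuming etcdctl v3 on a worker node; the endpoint and cert paths are illustrative, not necessarily Magnum's exact layout:

    # read etcd directly from a worker node using the kubelet's client cert
    ETCDCTL_API=3 etcdctl \
      --endpoints=https://MASTER_IP:2379 \
      --cacert=/etc/kubernetes/certs/ca.crt \
      --cert=/etc/kubernetes/certs/kubelet.crt \
      --key=/etc/kubernetes/certs/kubelet.key \
      get /registry/secrets --prefix --keys-only
    # with separate CAs (https://review.opendev.org/746864), etcd only trusts
    # client certs from its own CA, so this request is rejected
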
09:17:04 <strigazi> resizing is not blocked by this
09:17:14 <strigazi> adding a new one
09:17:19 <strigazi> master-ng-B
09:17:21 <flwang1> i can't see the point of having another master NG
09:17:48 <flwang1> for worker nodes, NGs make good sense
09:18:04 <strigazi> I want to use bigger master nodes; how do I do this?
09:18:10 <flwang1> but what's the value of multiple master NGs?
09:18:25 <flwang1> nova resizing?
09:19:28 <strigazi> That's an option, but you see my point.
09:19:45 <flwang1> sure
09:19:54 <flwang1> very good comments, i appreciate it
09:20:00 <flwang1> next one?
09:20:04 <strigazi> +1
09:20:16 <flwang1> #topic PodSecurityPolicy and Calico
09:20:30 <strigazi> What did they break again?
09:20:54 <flwang1> i'm still working on this, maybe you guys can help confirm
09:21:29 <flwang1> after adding PodSecurityPolicy to the admission controller list label, the calico pod can't start, so the cluster can't come up
09:22:20 <flwang1> based on this post http://blog.tundeoladipupo.com/2019/06/01/Kubernetes,-PodSecurityPolicy-and-Kubeadm/ i think we need a dedicated PSP for calico if we want to enable PodSecurityPolicy
09:22:50 <strigazi> we have a very privileged PSP for calico, it doesn't work?
09:23:00 <strigazi> calico is using a PSP already
09:23:18 <flwang1> strigazi: good to know. i haven't reviewed the code yet
09:23:35 <flwang1> i will do another test, and it would be nice if you guys could help test it as well.
09:23:53 <flwang1> PSP is becoming a common requirement for enterprise users
09:24:03 <flwang1> EKS is enabling it by default
09:24:28 <flwang1> btw, i just found that our default admission list in the code is very old, should we update it?
09:24:39 <strigazi> sure
09:24:53 <strigazi> At CERN we have it in our CT
09:25:01 <flwang1> right
09:25:35 <strigazi> https://github.com/openstack/magnum/blob/master/magnum/drivers/common/templates/kubernetes/fragments/calico-service-v3-3-x.sh#L18
09:25:36 <flwang1> i will propose a patch based on the default list from v1.16.x, sounds ok?
09:27:36 <strigazi> sure
09:27:51 <strigazi> maybe for V we do the list for 1.19?
09:28:52 <flwang1> strigazi: can do
09:29:25 <strigazi> cool
09:29:26 <flwang1> re the calico PSP, maybe i missed something, but at https://github.com/openstack/magnum/blob/master/magnum/drivers/common/templates/kubernetes/fragments/calico-service.sh#L30
09:29:44 <strigazi> yeap, that's it
09:29:44 <flwang1> i can see this role name, but it seems we didn't create the role?
09:29:53 <strigazi> we do
09:30:18 <strigazi> https://github.com/openstack/magnum/blob/master/magnum/drivers/common/templates/kubernetes/fragments/kube-apiserver-to-kubelet-role.sh#L125
09:31:21 <flwang1> ok, got it, so we are using magnum.privileged as the PSP
09:31:26 <strigazi> yes
09:31:42 <strigazi> same as GKE was doing (at least when I checked)
09:32:20 <flwang1> ok, i will test this again
09:32:33 <flwang1> let's move on?
09:32:36 <strigazi> +1
09:32:55 <flwang1> #topic Add observations field to CT
09:33:06 <flwang1> dioguerra: ^
09:34:21 <dioguerra> Hey, so the idea of this is to add a new field, visible to the user, where we can add observations
09:34:53 <dioguerra> The idea came up so that we could label the CT with DEPRECATED/PRODUCTION/BETA or something similar
09:35:09 <flwang1> does the field have to be an enum?
09:35:21 <flwang1> or can the list be defined by the admin via a config option?
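
Back to the PodSecurityPolicy discussion above: the label flwang1 refers to is Magnum's admission_control_list. A sketch of enabling PSP through it; the template name, image, network and the plugin list shown are placeholders, not Magnum's actual defaults:

    openstack coe cluster template create k8s-psp-test \
      --coe kubernetes \
      --image fedora-coreos-32 \
      --external-network public \
      --labels admission_control_list="NamespaceLifecycle,LimitRanger,ServiceAccount,ResourceQuota,NodeRestriction,PodSecurityPolicy"
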
09:35:50 <flwang1> sorry, i haven't read the code yet
09:35:56 <dioguerra> no, i made it so it is free text (other admins might want to add other observations like HA, Multimaster)
09:36:19 <dioguerra> you agree?
09:37:14 <flwang1> i'm not sure. if it's kind of like a label or tag, then if i want to have several tags for the CT, i have to do something like AAA-BBB-CCC, is that it?
09:39:39 <dioguerra> it's not a label, it's a new field with free text
09:39:56 <dioguerra> so --observations <something>
09:39:59 <flwang1> i understand
09:40:23 <flwang1> i'm just saying free text makes it like a free-form tag
09:40:26 <brtknr> It's basically a tag to filter cluster templates, right? "observations" is a bit of a mouthful. Can we just call it "tags"?
09:41:43 <brtknr> does the current implementation allow the user to filter using this field?
09:41:46 <dioguerra> it is not a filter (although you might use it for that), it's just a ref http://paste.openstack.org/show/797160/
09:42:18 <dioguerra> brtknr: no
09:43:38 <flwang1> hmm... i understand this is neater than putting HA or Prod into the cluster template name, but i'd expect it to be more useful to deserve a dedicated db field
09:44:37 <flwang1> dioguerra: i'm not saying i don't like the idea. actually it will be useful, i probably need a bit of time to think about it.
09:45:06 <dioguerra> well, the idea of doing filtering with the field crossed my mind. We can add it now if you would like, or later...
09:46:23 <brtknr> dioguerra: i think that would be the thing that would add value to this proposal
09:46:59 <flwang1> shall we move to the next topic?
09:47:04 <jakeyip> would it be easier to filter using tags instead of free text?
09:47:05 <flwang1> we have 15 mins left
09:47:17 <jakeyip> databases don't match well on TEXT
09:47:54 <jakeyip> sorry, continue please
09:49:07 <dioguerra> in our use case we usually only have a few visible templates (3 to 4), so filtering is not really required. everything else is hidden
09:49:36 <dioguerra> jakeyip: a tag would be better for the DB, yes, but that restricts what you can put in it.
09:50:23 <jakeyip> do we have a description field?
09:50:25 <flwang1> i would suggest putting the discussion into the patch
09:50:35 <flwang1> not here
09:50:42 <flwang1> move on?
09:50:47 <jakeyip> +1
09:51:00 <flwang1> #topic Drop hyperkube https://review.opendev.org/748141
09:51:02 <flwang1> strigazi: ^
09:51:12 <flwang1> tell us more?
09:51:48 <strigazi> the k8s community and some ex-openstack members thought we should not have too much fun with hyperkube and dropped it.
09:52:22 <strigazi> I have a solution there that gets a tarball with kubelet, kubectl, kubeadm and kube-proxy (90mb)
09:52:36 <strigazi> it works, running the kubelet from a binary
09:52:51 <strigazi> and the rest of the components from their respective images.
09:53:01 <strigazi> All good so far, now the problems
09:53:04 <flwang1> kubeadm?
09:53:19 <strigazi> flwang1: well, it is there
09:53:28 <strigazi> flwang1: can't skip it
09:53:44 <strigazi> even the kube-proxy binary, we don't need it
09:53:50 <flwang1> sounds like another breaking change
09:53:51 <strigazi> we only need kubelet and kubectl
09:54:07 <strigazi> flwang1: which one? kubeadm?
09:54:23 <brtknr> hmm, I wonder why k8s.gcr.io makes the other binaries available in containers but not kubelet
09:55:00 <strigazi> flwang1: I'm just mentioning what is in the tarball. Is it clear?
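
The tarball strigazi describes matches the upstream node tarball. A sketch of how fetching it could look; the version pin is only an example:

    # fetch the ~90MB node tarball bundling kubelet, kubectl, kubeadm and kube-proxy
    KUBE_VERSION=v1.19.0
    curl -LO "https://dl.k8s.io/${KUBE_VERSION}/kubernetes-node-linux-amd64.tar.gz"
    tar xzf kubernetes-node-linux-amd64.tar.gz
    # per the discussion, only kubelet (run from the host) and kubectl are
    # strictly needed; the control plane runs from its respective images
    install -m 0755 kubernetes/node/bin/kubelet /usr/local/bin/kubelet
    install -m 0755 kubernetes/node/bin/kubectl /usr/local/bin/kubectl
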
09:55:07 <flwang1> strigazi: i wonder if the new change will allow old clusters to be upgraded
09:55:56 <brtknr> i suppose that is why the PS is "Very WIP"
09:55:57 <strigazi> flwang1: there is literally nothing we can do for clusters that reference k8s.gcr.io/hyperkube.
09:56:38 <strigazi> flwang1: if your clusters use <my-registry>/hyperkube, we can build a hyperkube
09:57:11 <dioguerra> i need to go o/
09:57:18 <strigazi> brtknr: does it matter? They decided to stop building it. (the reason is CVEs in the debian base image)
09:58:09 <strigazi> brtknr: flwang1: hello?
09:58:23 <flwang1> strigazi: i appreciate the work, my only concern is that we need to work out a design that makes sure old clusters can be upgraded
09:59:24 <strigazi> flwang1: what is your situation? (regarding "there is literally nothing we can do for clusters that reference k8s.gcr.io/hyperkube" && "if your clusters use <my-registry>/hyperkube, we can build a hyperkube")
09:59:52 <flwang1> we're getting hyperkube from dockerhub/catalystcloud
10:00:06 <strigazi> pro account i guess
10:00:17 <flwang1> now i'm trying to build hyperkube for v1.19.x
10:00:22 <strigazi> the free account won't cut it any more
10:00:55 <strigazi> it's relatively easy. But let me rephrase
10:01:15 <strigazi> Shall we move to the binary for V, so that we don't have to maintain a new image?
10:01:38 <strigazi> brtknr: ^^ flwang1 ^^
10:02:40 <flwang1> strigazi: moving to the binary is our goal i think, and we don't have a choice
10:02:49 <strigazi> flwang1: easy to build, hard to maintain.
10:03:06 <flwang1> you mean maintain the binary?
10:03:15 <strigazi> flwang1: no, the image
10:03:29 <flwang1> i see
10:03:48 <brtknr> I'm okay with that. I echo flwang's concern that existing clusters should also be upgradable, which should be possible.
10:04:06 <flwang1> yep, but again, as a public cloud with a GAed service, we can't break the upgrade path
10:04:51 <flwang1> though we should be able to do magic in upgrade-k8s.sh
10:05:11 <flwang1> at least, the good thing is we don't have to replace the operating system
10:05:15 <strigazi> The upstream project broke it. So for V we do the binary, hoping they won't break it again.
10:05:55 <flwang1> that's a good excuse for us, but we can't use it with our public cloud customers unfortunately :(
10:06:40 <flwang1> strigazi: i will review your patch and see how we can resolve the upgrade issue
10:09:02 <flwang1> strigazi: brtknr: anything else?
10:09:08 <strigazi> I'm good
10:10:24 <brtknr> strigazi: any reason not to use binaries for everything?
10:11:00 <brtknr> there is also a server binaries tarball, which i assume is for the master node
10:11:37 <strigazi> brtknr: they are 300mb and I think it is more secure and elegant to run them in containers.
10:12:22 <flwang1> ok, let me end the meeting first
10:12:27 <flwang1> #endmeeting
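
A footnote on strigazi's "if your clusters use <my-registry>/hyperkube, we can build a hyperkube": one possible sketch of publishing a stand-in image for versions upstream no longer ships. The registry name and base image are assumptions, and the wrapper only roughly mirrors how late hyperkube images were laid out (individual binaries plus a thin dispatcher), not the removed official build:

    # fetch the server binaries and wrap them in a hyperkube-style image
    KUBE_VERSION=v1.19.0
    curl -LO "https://dl.k8s.io/${KUBE_VERSION}/kubernetes-server-linux-amd64.tar.gz"
    tar xzf kubernetes-server-linux-amd64.tar.gz
    cat > hyperkube <<'EOF'
    #!/bin/bash
    # dispatch to the named component, e.g. "/hyperkube kubelet ..."
    exec "/usr/local/bin/$1" "${@:2}"
    EOF
    cat > Dockerfile <<'EOF'
    FROM debian:buster-slim
    COPY kubernetes/server/bin/kubelet kubernetes/server/bin/kubectl \
         kubernetes/server/bin/kube-apiserver kubernetes/server/bin/kube-controller-manager \
         kubernetes/server/bin/kube-scheduler kubernetes/server/bin/kube-proxy /usr/local/bin/
    COPY hyperkube /hyperkube
    RUN chmod +x /hyperkube
    ENTRYPOINT ["/hyperkube"]
    EOF
    docker build -t my-registry.example.com/hyperkube:${KUBE_VERSION} .
    docker push my-registry.example.com/hyperkube:${KUBE_VERSION}
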