10:00:40 <strigazi> #startmeeting containers
10:00:41 <openstack> Meeting started Tue Jul 17 10:00:40 2018 UTC and is due to finish in 60 minutes. The chair is strigazi. Information about MeetBot at http://wiki.debian.org/MeetBot.
10:00:42 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
10:00:44 <openstack> The meeting name has been set to 'containers'
10:00:48 <strigazi> #topic Roll Call
10:00:54 <strigazi> o/
10:01:10 <flwang1> o/
10:02:49 <strigazi> It is just the two of us then, let's do it quickly
10:02:56 <flwang1> ok
10:02:58 <sfilatov> o/
10:03:08 <strigazi> hey sfilatov
10:03:09 <sfilatov> I'm here to discuss deletion :)
10:03:25 <strigazi> #topic Blueprints/Bugs/Ideas
10:03:40 <strigazi> Test v1.11.0 images
10:03:47 <strigazi> #link https://hub.docker.com/r/openstackmagnum/kubernetes-kubelet/tags/
10:03:59 <strigazi> conformance tests are passing for me
10:04:11 <flwang1> nice nice
10:04:26 <strigazi> Note: you must use cgroupfs as cgroup_driver
10:04:45 <flwang1> so are we going to bump the k8s version for rocky to 1.11.0?
10:04:50 <strigazi> with systemd the node is ready, but it cannot schedule pods
10:04:53 <brtknr> o/
10:05:09 <strigazi> flwang1: I think yes, it is better
10:05:10 <flwang1> strigazi: then we need to document it somewhere
10:05:24 <strigazi> of course
10:05:24 <flwang1> strigazi: cool, i can do that
10:05:37 <strigazi> but we need to test it first :)
10:05:42 <strigazi> not only me :)
10:05:56 <flwang1> strigazi: sure, when i say i can do that, i mean i will test it
10:06:10 <flwang1> and if it works for me, i will propose a patch and add docs
10:06:36 <strigazi> btw I'm still evaluating the gcr.io hyperkube containers vs a fedora-based one
10:07:11 <flwang1> any benefit in using hyperkube?
10:07:56 <strigazi> we don't build kubernetes at all, we just package the hyperkube container
10:07:59 <strigazi> but
10:08:25 <strigazi> hyperkube is based on debian, incompatibilities may occur
10:08:57 <strigazi> I'll push two patches for the two ways to build
10:09:04 <flwang1> hmm...
10:09:30 <flwang1> ok
10:09:37 <strigazi> building the rpms is trivial, like: git clone kube && bazel build ./build/rpms
10:10:52 <strigazi> like this: http://paste.openstack.org/show/726097/
10:11:57 <flwang1> nice
10:12:27 <strigazi> bazel is a black box for me, but seems to work pretty well and pretty fast
10:12:56 <strigazi> we can take this offline
10:12:58 <strigazi> next:
10:13:18 <strigazi> this is a trivial change after all: https://review.openstack.org/#/c/582506/
10:13:23 <strigazi> Resolve stack outputs only on COMPLETE
10:13:42 <strigazi> but I expect it to help a lot when magnum polls heat
10:14:04 <strigazi> in devstack you cannot see it, but on a big stack it will make a difference
10:14:48 <strigazi> are you looking? should I move on?
10:15:09 <sfilatov> + from me
10:15:26 <flwang1> that looks good to me
10:15:48 <strigazi> next is the keypair issue and scaling
10:15:50 <flwang1> though i may need to take a look at the resolve_outputs param
10:16:18 <strigazi> flwang1: there is a show_params api
10:16:27 <flwang1> cool
10:16:44 <strigazi> flwang1: resolve_outputs tells heat to not bother with the outputs of the stack
10:17:02 <flwang1> and outputs means more api calls in heat i guess?
10:17:14 <flwang1> to other services
10:17:15 <strigazi> flwang1: even during stack creation heat will go through all servers to get the IPs
10:17:28 <strigazi> flwang1: it means more load on the engine
10:17:39 <flwang1> right, matching what i thought
10:17:44 <flwang1> all good
10:17:53 <strigazi> flwang1: and it means slow api responses
10:18:09 <strigazi> flwang1: normally I have 250ms response time
10:18:45 <strigazi> flwang1: with a 50 node cluster in progress, any api call goes to 15 seconds
10:19:17 <flwang1> omg
10:19:26 <strigazi> flwang1: all magnum nodes eventually hit all the heat apis with the same request and the apis block
10:19:41 <strigazi> but
10:20:07 <strigazi> if you create the stack with the heat api and magnum is not hammering it, all good
10:20:29 <strigazi> without output resolving, the stack get call is a simple lookup
10:20:44 <flwang1> strigazi: thanks for the clarification
10:21:31 <strigazi> I created a 500 node cluster 2 weeks ago and immediately stopped magnum from hitting heat, everything was smooth
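For reference, the heat call this change is about could look roughly like the following on the client side. This is a minimal sketch assuming python-heatclient and an already-configured keystoneauth1 session; session and stack_id are illustrative names, not from the patch itself.

import sys

from heatclient import client as heat_client

# 'session' is assumed to be an existing keystoneauth1 session
heat = heat_client.Client('1', session=session)

# resolve_outputs=False asks heat not to resolve the stack outputs,
# so on a large stack the GET stays a cheap lookup instead of
# touching every server resource.
stack = heat.stacks.get(stack_id, resolve_outputs=False)
print(stack.stack_status, file=sys.stdout)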
10:21:53 <strigazi> next, keypair
10:22:15 <strigazi> a keypair is an OS resource that cannot be shared in a project
10:22:44 <strigazi> and a cluster owns, let's say, a key
10:23:02 <strigazi> the admin or other members can't do a stack update
10:23:07 <strigazi> with this patch:
10:23:24 <strigazi> #link https://review.openstack.org/#/c/582880/
10:23:40 <strigazi> users can do a stack update freely
10:23:46 <strigazi> with these params in heat:
10:23:51 <strigazi> deferred_auth_method = trusts
10:23:52 <strigazi> reauthentication_auth_method = trusts
10:24:06 <strigazi> all good so far, but
10:24:28 <strigazi> if the user is deleted, their trust is deleted, so the stack cannot be touched again
10:25:18 <strigazi> afaik the only solution: pass the key in user_data and not as a keypair in nova
10:25:22 <strigazi> thoughts?
10:25:53 <brtknr> How does it allow other users to make changes? it is not immediately obvious to me
10:25:59 <brtknr> looking at the patch
10:26:21 <strigazi> brtknr the keypair is created on cluster creation
10:26:54 <flwang1> so instead of using the original user's public key, we just generate a common one for everybody?
10:27:10 <strigazi> brtknr: and authentication is done using the trust, so userB will authenticate behind the scenes with the trust of userA
10:27:34 <strigazi> flwang1 the thing is that there is no such thing as a common key
10:27:51 <strigazi> the key is only visible to the creator
10:28:06 <brtknr> how does userA enable trust for userB? I suppose this has to be set somewhere?
10:28:06 <flwang1> can't we just generate one and 'share' it with all users in the tenant?
10:28:27 <strigazi> flwang1 impossible in nova
10:28:32 <flwang1> strigazi: ok
10:28:32 <sfilatov> flwang1: can we share nova keys
10:28:35 <sfilatov> y
10:28:45 <strigazi> brtknr: heat does this
10:28:49 <strigazi> sfilatov: no we can't
10:29:55 <strigazi> sfilatov: https://docs.openstack.org/horizon/pike/user/configure-access-and-security-for-instances.html
10:30:01 <strigazi> "A key pair belongs to an individual user, not to a project. To share a key pair across multiple users, each user needs to import that key pair."
10:30:23 <strigazi> that is not correctly expressed
10:30:49 <strigazi> it means all users must import the same public_key
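For reference, a minimal sketch of the user_data alternative strigazi mentions above: the same public key is injected on every node through cloud-init instead of a per-user Nova keypair. This is illustrative only (novaclient shown here; in magnum it would go through the heat template's user_data), and session, image_id and flavor_id are assumed to exist.

from novaclient import client as nova_client

# The public key travels with the instance data, so no per-user
# Nova keypair (key_name) is involved at all.
user_data = """#cloud-config
ssh_authorized_keys:
  - ssh-rsa AAAA... cluster-shared-key
"""

nova = nova_client.Client('2.1', session=session)
server = nova.servers.create(
    name='k8s-node-0',   # illustrative name
    image=image_id,      # assumed to exist
    flavor=flavor_id,    # assumed to exist
    userdata=user_data,
)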
10:31:04 <sfilatov> strigazi: yep, so we can't do it natively
10:31:31 <strigazi> we can simulate the shared key with heat using the trust
10:31:48 <strigazi> but as I said, if the user is deleted the trust is gone
10:32:32 <brtknr> hmm
10:32:48 <brtknr> would be nice to set a group privilege
10:33:03 <brtknr> e.g. any user in the admin group can modify the cluster
10:33:03 <strigazi> I don't think it is possible
10:33:54 <brtknr> but this is certainly the next best thing
10:34:02 <strigazi> brtknr: and it is not only desirable for admins
10:34:15 <strigazi> brtknr: in private clouds you have shared resources
10:34:33 <strigazi> in public too, but not as much as in private
10:34:58 <strigazi> and as I mentioned, passing the key in user_data will work in all cases.
10:35:16 <strigazi> does this sound bad to you ^^
10:36:37 <brtknr> how does it limit who is allowed to make changes to the cluster or not?
10:36:53 <brtknr> or is it not limited at all?
10:36:55 <strigazi> it does not
10:37:16 <brtknr> sounds a little worrying lol
10:38:19 <brtknr> how about a heat parameter that is a list of users that are allowed to make changes
10:38:33 <brtknr> or we assume that anyone in the project is allowed to make changes
10:39:10 <strigazi> you know about this right? https://github.com/strigazi.keys
10:39:36 <strigazi> let's take this offline, I need to explain the problem more I guess
10:39:46 <strigazi> it is a limitation of nova
10:39:54 <strigazi> not magnum or heat
10:40:18 <sfilatov> Let's talk about k8s loadbalancer deletions then
10:40:32 <strigazi> I have one more thing for upgrades, sorry
10:40:36 <sfilatov> sure
10:41:12 <strigazi> For the past weeks, I have been trying to drain nodes before rebuilding them
10:41:49 <strigazi> The issue is that this api call must be executed before every node rebuild
10:42:01 <strigazi> so it must be in the heat workflow
10:42:22 <strigazi> otherwise heat is not managing the status
10:42:32 <strigazi> of the infrastructure anymore
10:43:16 <strigazi> I'm trying this pattern so far: http://paste.openstack.org/show/726098/
10:43:45 <strigazi> with no success so far
10:44:14 <strigazi> I'm thinking of putting the workflow in the master or in magnum
10:45:07 <sfilatov> And btw, is draining the node the only right way to do this? Are there issues behind upgrading in-place?
10:45:17 <sfilatov> downtime?
10:45:26 <strigazi> in-place there is no such problem
10:46:04 <sfilatov> yes, but the workflow you are considering is about draining and rebuilding the nodes
10:46:08 <strigazi> but it means the OS must support upgrades in place, and if you have upgraded a few times
10:46:15 <flwang1> can you remind me of the limitations of in-place upgrades?
10:46:17 <strigazi> things will go wrong
10:47:10 <strigazi> 1. GKE and cluster-api are not doing in-place
10:47:33 <strigazi> 2. upgrading an OS in-place is not an atomic operation
10:48:28 <strigazi> rebuild works even in ironic
10:49:09 <strigazi> the suggested way from the lifecycle sig is replace
10:49:25 <strigazi> only master nodes in-place
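For reference, a simplified sketch of the per-node drain step described above, assuming the kubernetes Python client (class names vary across client versions; a real kubectl drain also handles DaemonSets, mirror pods and PodDisruptionBudgets). All names here are illustrative.

from kubernetes import client, config

def drain_node(node_name):
    """Cordon a node and evict its pods, roughly what `kubectl drain` does."""
    config.load_kube_config()
    v1 = client.CoreV1Api()

    # cordon: mark the node unschedulable so nothing new lands on it
    v1.patch_node(node_name, {"spec": {"unschedulable": True}})

    # evict every pod currently scheduled on the node
    pods = v1.list_pod_for_all_namespaces(
        field_selector="spec.nodeName={}".format(node_name))
    for pod in pods.items:
        eviction = client.V1beta1Eviction(
            metadata=client.V1ObjectMeta(name=pod.metadata.name,
                                         namespace=pod.metadata.namespace))
        v1.create_namespaced_pod_eviction(
            pod.metadata.name, pod.metadata.namespace, eviction)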
10:50:14 <sfilatov_> sry, I'm back
10:50:38 <strigazi> flwang1: sfilatov_ thoughts?
10:50:54 <sfilatov_> I don't see the history since I reconnected
10:51:20 <strigazi> < flwang1> can you remind me of the limitations of in-place upgrades?
10:51:20 <sfilatov_> strigazi: could you copy and paste it pls
10:51:23 <flwang1> strigazi: fair enough
10:51:27 <strigazi> 1. GKE and cluster-api are not doing in-place
10:51:31 <strigazi> 2. upgrading an OS in-place is not an atomic operation
10:51:36 <strigazi> rebuild works even in ironic
10:51:40 <strigazi> the suggested way from the lifecycle sig is replace
10:51:44 <strigazi> only master nodes in-place
10:52:33 <strigazi> with multimaster you can even replace masters one by one with no downtime
10:52:39 <sfilatov_> strigazi: what do you mean by upgrading the OS?
10:53:09 <strigazi> kernel versionN to kernel versionN+1
10:53:24 <sfilatov_> ok, got it
10:53:40 <strigazi> have you ever upgraded docker? it is so nice
10:53:57 <strigazi> but mostly the kernel
10:54:06 <sfilatov_> strigazi: that's true
10:54:17 <sfilatov_> strigazi: are you considering rebuilding masters as well?
10:54:34 <strigazi> yes, with a param
10:54:35 <sfilatov_> strigazi: looks like we have more or less the same issues with this
10:55:26 <sfilatov_> strigazi: I agree then. I looked through the API in the upgrade patch
10:55:39 <sfilatov_> strigazi: and it seems we need nodegroups implemented
10:55:52 <strigazi> let's move this to gerrit then
10:55:57 <sfilatov_> ok
10:55:59 <strigazi> sfilatov_: about delete?
10:56:02 <sfilatov_> yes
10:56:04 <strigazi> what is the issue?
10:56:15 <strigazi> I mean, I know the issue
10:56:23 <strigazi> what is the solution(s)?
10:56:29 <sfilatov_> I have almost prepared a patch with software deployments for deletions
10:56:50 <strigazi> with an on-delete SD?
10:56:54 <sfilatov_> yes
10:56:56 <strigazi> push
10:57:06 <sfilatov_> I'd like to discuss 2 issues
10:57:12 <strigazi> shoot :)
10:57:23 <sfilatov_> We still need to wait for the LB in neutron
10:57:41 <sfilatov_> since the cloud provider does not support waiting for LB deletion
10:57:51 <sfilatov_> we can't wait using kubectl
10:58:11 <strigazi> hmm, that is not nice
10:58:29 <strigazi> flwang1: maybe kong has some input for this?
10:58:30 <flwang1> how do you wait for the LB in neutron?
10:58:41 <strigazi> you ask the api I imagine
10:58:43 <sfilatov_> you get the LB by name
10:58:46 <sfilatov_> yes
10:58:59 <sfilatov_> since you know LB name = 'a' + k8s svc id
10:59:09 <sfilatov_> but it's not really nice
10:59:13 <flwang1> and polling the neutron api to see if it's still there?
10:59:20 <sfilatov_> yep
10:59:24 <flwang1> hmm...
10:59:44 <sfilatov_> we can fix this via the cloud provider
10:59:53 <strigazi> 1. must be solved in the cloud-provider
11:00:01 <strigazi> 2. polling as a workaround
11:00:10 <sfilatov_> got it
11:00:12 <strigazi> lgty?
11:00:15 <strigazi> flwang1: ^^
11:00:24 <flwang1> i'm ok with that
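For reference, a minimal sketch of the polling workaround agreed above, assuming openstacksdk with an Octavia (LBaaS v2) endpoint; the cloud name is illustrative and the LB naming follows the 'a' + service UID convention mentioned in the discussion.

import time

import openstack

def wait_for_lb_deletion(service_uid, timeout=300, interval=10):
    """Poll the load balancer API until the LB for a k8s Service is gone."""
    conn = openstack.connect(cloud='mycloud')  # illustrative cloud name
    lb_name = 'a' + service_uid                # cloud provider naming convention
    deadline = time.time() + timeout
    while time.time() < deadline:
        # if no LB with that name remains, the deletion has finished
        if not any(lb.name == lb_name
                   for lb in conn.load_balancer.load_balancers()):
            return True
        time.sleep(interval)
    return False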
11:00:24 <sfilatov_> the other issue
11:00:35 <sfilatov_> what if the user stopped the vms
11:00:42 <sfilatov_> basically - shutdown
11:00:48 <sfilatov_> I have faced the issue
11:00:59 <strigazi> why? I don't get it
11:01:01 <sfilatov_> and there is nothing I can do about it
11:01:11 <sfilatov_> vms are shutdown
11:01:18 <sfilatov_> and the k8s api is not available
11:01:37 <sfilatov_> so when delete is triggered we can't delete the resources
11:02:32 <flwang1> hmm... i'm wondering if magnum should take care of such a corner case
11:02:34 <strigazi> there is no solution for this
11:02:43 <strigazi> only what flwang1 said
11:03:00 <strigazi> if we do it for the corner case though
11:03:14 <flwang1> for this case, the user needs to open a support ticket to ops
11:03:20 <flwang1> and get it removed :D
11:03:32 <strigazi> or he can remove it manually
11:03:38 <flwang1> so magnum ops don't get too bored
11:03:51 <sfilatov_> there's a way
11:03:57 <sfilatov_> to solve all the issues
11:04:10 <flwang1> i don't even think magnum should just bravely delete a cluster and everything on the cluster
11:04:20 <sfilatov_> if we add cluster_id to the lb metadata
11:04:40 <flwang1> sfilatov_: Lingxian Kong is working on that
11:04:41 <strigazi> and then?
11:04:45 <sfilatov_> and delete lbs based on their metadata
11:05:02 <sfilatov_> in this case we don't need to access the k8s API
11:05:05 <flwang1> that's the current solution
11:05:28 <strigazi> is there anything stopping us from doing this now?
11:05:40 <sfilatov_> we need to patch the cloud provider
11:06:40 <strigazi> flwang1: Lingxian's patch is not in?
11:06:58 <flwang1> https://github.com/kubernetes/cloud-provider-openstack/pull/223
11:07:31 <flwang1> they're just putting the cluster name in the lb description
11:07:41 <flwang1> so we're ok to go with the current way i think
11:08:02 <sfilatov_> so there's no need for my patch?
11:08:10 <flwang1> i guess so?
11:08:15 <sfilatov_> with software deployment
11:08:16 <flwang1> if you're happy with this way
11:08:18 <strigazi> we still need a patch
11:08:25 <strigazi> in magnum
11:08:51 <flwang1> i can propose a new patch set on this https://review.openstack.org/#/c/497144/
11:09:04 <flwang1> to check the cluster name
11:09:11 <flwang1> then we should be ok
11:09:22 <strigazi> uuid is better I guess
11:10:00 <flwang1> strigazi: i think so, there is probably a limitation, i will check with the CPO team
11:10:07 <strigazi> folks, anything else? we are 10 mins late and I'm 10 mins late for another meeting
11:10:16 <strigazi> flwang1: it accepts a string
11:10:21 <flwang1> good for me
11:10:24 <strigazi> flwang1: it can be anything
11:10:42 <flwang1> i mean it may be hard for CPO to get the UUID of magnum's cluster
11:11:26 <strigazi> flwang1: it looks like a generic parameter to me, let's see
11:11:27 <flwang1> unless we pass it somewhere so that CPO can easily get it, just my guess
11:11:36 <flwang1> need to check with the author
11:11:42 <strigazi> cool
11:11:56 <strigazi> sfilatov_: anything else?
11:12:42 <flwang1> strigazi: i think you're good to go ;)
11:12:47 <strigazi> let's wrap this up then
11:12:54 <strigazi> thanks flwang1 sfilatov_ and brtknr
11:12:55 <strigazi> #endmeeting
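For reference, a minimal sketch of the description-based cleanup discussed at the end of the meeting, assuming openstacksdk with an Octavia endpoint. It only illustrates the idea: the cluster identifier placed in the LB description by the linked cloud-provider-openstack change is matched here, the cloud name is illustrative, and the cascade flag is an assumption about the SDK and deployment, not a confirmed magnum implementation.

import openstack

def delete_cluster_loadbalancers(cluster_id):
    """Delete load balancers whose description carries the cluster identifier."""
    conn = openstack.connect(cloud='mycloud')  # illustrative cloud name
    for lb in conn.load_balancer.load_balancers():
        if cluster_id in (lb.description or ''):
            # cascade removes listeners/pools/members together with the LB
            # (assumed to be supported by the Octavia deployment)
            conn.load_balancer.delete_load_balancer(lb.id, cascade=True)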