09:00:27 #startmeeting magnum
09:00:28 Meeting started Wed Jul 22 09:00:27 2020 UTC and is due to finish in 60 minutes. The chair is flwang1. Information about MeetBot at http://wiki.debian.org/MeetBot.
09:00:29 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
09:00:31 The meeting name has been set to 'magnum'
09:00:49 #topic roll call
09:00:50 o/
09:00:53 o/
09:00:54 O/
09:01:01 strigazi: hey
09:01:06 stranger
09:01:26 hello
09:01:45 we do have some patches waiting for another +2 from you
09:01:58 Bharat Kunwar proposed openstack/magnum stable/ussuri: [k8s] Use helm upgrade --install in deployment loop https://review.opendev.org/742374
09:02:19 brtknr: strigazi: do you have any topic you'd like to discuss today?
09:02:21 flwang1: I know, I'll take care of them
09:02:37 flwang1: I have one, and a half
09:02:56 strigazi: brtknr: the only update from my side is i'm revisiting the /nodes API patch
09:02:58 hi strigazi
09:03:03 will submit a patchset soon
09:03:14 flwang1: sounds good
09:03:33 strigazi: you first?
09:03:39 ok
09:03:58 1. For hyperkube and 1.19
09:04:15 oh, god
09:04:29 For after ussuri (current master)
09:05:03 I think we will be safer if we move to deploying the binary, or add an option to deploy the binary.
09:05:20 strigazi: i would try to avoid that
09:05:21 For ussuri and older releases, we can do a build
09:05:48 unless we have a good solution for upgrade
09:06:07 flwang1: My argument is: with the binary we maintain nothing. With the build we chase kubernetes releases.
09:06:22 we have broken the upgrade from v1.15 to v1.16, and I don't want to do that again
09:06:47 flwang1: if we do both, we won't have something broken, will we?
09:07:16 how do we upgrade from container to binary?
09:07:25 Also, wait
09:07:25 strigazi: how does the binary get deployed?
09:08:09 by build do you mean a hyperkube build?
09:08:13 stop the container and replace it with the binary in upgrade_kubernetes.sh?
09:08:19 If we build our own image, people that don't mirror both k8s.gcr.io and docker.io/openstackmagnum will be broken too
09:08:58 Because in upgrade.sh you need to switch registries
09:09:09 they need to use openstackmagnum, just like for heat-container-agent
09:10:14 yes, but upgrade.sh doesn't do that
09:10:15 strigazi: makes sense, your argument is that since we are changing the mode anyway, let's switch to binary to reduce maintenance overhead
09:10:29 i don't like the idea of binary, we need more discussion about this
09:11:08 brtknr: what do you mean 'changing the mode anyway'?
09:13:04 brtknr means: k8s.gcr.io -> docker.io/openstackmagnum
09:13:44 flwang1: I propose the following:
09:13:50 if we go for binary, is there a trusted place we can get the binary from?
09:14:10 yes, give me a sec
09:15:04 Your concerns about breaking upgrades are very valid. But it will break no matter what we do, because of upstream kubernetes. Do we agree on this?
09:15:31 flwang1: ^^
09:15:43 if we still use hyperkube, why would it break the upgrade?
09:15:58 Let me explain
09:15:58 yes
09:17:23 in upgrade_kubernetes.sh we just bump the version of the image. The registry is unchanged. At CERN we have a mirror of the registry so we are not affected, but stock magnum will fetch hyperkube from k8s.gcr.io/hyperkube
09:17:41 stock magnum means?
09:17:48 upstream magnum
09:18:09 i believe any user running magnum in production will have a mirror
09:18:22 the default code, with the default cluster creation parameters/labels
09:18:28 that said, at least with hyperkube, it won't break prod-level usage
09:19:21 sure, but this is still an assumption, even if a highly probable one.
09:19:32 So what if we cover both cases?
09:19:49 Regarding the safe place for the binary:
09:20:03 https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md#downloads-for-v1186
09:20:20 strigazi: we do need to support both for sure
09:20:22 https://dl.k8s.io/v1.18.6/kubernetes-client-darwin-amd64.tar.gz with a sha512sum
09:20:29 okay
09:20:52 well not darwin, that would be crazy
09:21:04 we (catalystcloud) got a lot of pain because of the breakage from v1.15 -> v1.16
09:21:30 strigazi: would we use a container to deploy the binaries?
09:21:32 you mean crazy to support both?
09:21:58 it was a joke, we use amd64 and in a few cases arm AFAIK
09:22:21 ah, you mean macos
09:22:31 sorry, i misunderstood
09:22:40 brtknr: like containerd's logic
09:23:26 We fetch the binary, we write the systemd unit. zero things to maintain
09:23:46 strigazi: makes sense
09:23:49 well, apart from magnum's code
09:24:15 I work on this, this week
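A minimal sketch of the binary deployment strigazi describes above: fetch a release artifact from dl.k8s.io, verify it against the checksum published in the release CHANGELOG, and write a systemd unit. The tarball choice, checksum handling, and unit contents are illustrative assumptions, not the patch that eventually landed:

```bash
#!/bin/bash
# Sketch: run kubelet from an official release binary instead of a hyperkube
# image. The URL layout matches the CHANGELOG download links referenced above;
# KUBE_TAG, the checksum source, and the unit contents are assumptions.
set -euo pipefail

KUBE_TAG="v1.18.6"
# Operator-supplied sha512, copied from the downloads table in CHANGELOG-1.18.md.
KUBE_SHA512="<sha512 from the release CHANGELOG>"
TARBALL="kubernetes-node-linux-amd64.tar.gz"

curl -fLo "/tmp/${TARBALL}" "https://dl.k8s.io/${KUBE_TAG}/${TARBALL}"
echo "${KUBE_SHA512}  /tmp/${TARBALL}" | sha512sum -c -

tar -xzf "/tmp/${TARBALL}" -C /tmp
install -m 0755 /tmp/kubernetes/node/bin/kubelet /usr/local/bin/kubelet

# Minimal unit; a real one would carry the same flags currently passed to the
# hyperkube container via the heat templates.
cat > /etc/systemd/system/kubelet.service <<'EOF'
[Unit]
Description=kubelet (release binary)
After=network-online.target

[Service]
EnvironmentFile=-/etc/kubernetes/kubelet
ExecStart=/usr/local/bin/kubelet $KUBELET_ARGS
Restart=always

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload
systemctl enable --now kubelet
```

With this approach there is no image to rebuild per kubernetes release, which is the "zero things to maintain" point made above.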
09:24:26 strigazi: i think it would be diligent to ensure that there is a good upgrade path
09:24:59 strigazi: i can help with the upgrade part
09:25:58 brtknr: we can create an upgrade path for switching to the binary. I don't know if we can do it with the magnum api. We can try
09:26:18 flwang1: brtknr: Regarding supporting high-profile users
09:26:37 I guess catalyst has some important clients
09:26:43 we have some experiments
09:27:16 we do have some customers running clusters created 1 year ago :(
09:27:41 though we tried to push them to upgrade
09:27:51 At CERN we dedicate time directly to them, we can't have a generic solution that works for those cases and dev clusters
09:28:21 strigazi: i appreciate your understanding
09:28:46 Also, we spent some time on serviceType LoadBalancer to create a path for moving to a new cluster
09:29:09 with this https://github.com/kubernetes/cloud-provider-openstack/pull/1118
09:30:01 migrate a lb from cluster A to cluster B?
09:30:10 Anyway, we move to the solution of doing both, hyperkube and binary. People with mirrors will have a transparent upgrade
09:30:13 strigazi: i like the pool
09:30:21 flwang1: no
09:30:26 flwang1: one LB
09:30:36 flwang1: one unmanaged LB
09:30:48 flwang1: add members from both clusters to the LB
09:31:25 gradually remove members and eventually delete the old cluster (or delete it in one go when the new cluster is added to the LB)
09:31:51 strigazi: ok, i will read it later
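The migration pattern strigazi outlines here (one unmanaged load balancer in front, with members drawn from both the old and the new cluster) could look roughly like this with the octavia CLI. The pool name, member names, addresses, and port are hypothetical:

```bash
# Sketch: blue/green cluster migration behind one unmanaged LB.
# Assumes a pre-existing LB with a pool named "ingress-pool"; all names,
# IPs, and ports below are illustrative.

# Add the new cluster's ingress nodes alongside the old members.
openstack loadbalancer member create --name new-cluster-node-0 \
    --address 10.0.1.10 --protocol-port 443 ingress-pool
openstack loadbalancer member create --name new-cluster-node-1 \
    --address 10.0.1.11 --protocol-port 443 ingress-pool

# Once the new members report healthy, drain the old ones gradually.
openstack loadbalancer member list ingress-pool
openstack loadbalancer member delete ingress-pool old-cluster-node-0
openstack loadbalancer member delete ingress-pool old-cluster-node-1

# The old cluster can then be deleted, taking its own managed LBs with it.
```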
09:33:31 move on?
09:33:43 strigazi: sure
09:33:47 i'm keen to review your patch
09:33:53 what's your half one?
09:34:32 master resize, I'm working on dropping the discovery url
09:34:44 the other half is supporting it in the API
09:35:19 support resizing master via the api?
09:35:23 yes
09:35:29 will you do it?
09:35:35 i can do it
09:36:28 that's it
09:36:56 as long as you submit the code, i can start a follow-up patch for the api part
09:37:02 ok
09:37:28 Looks like this just got merged: https://github.com/kubernetes/autoscaler/pull/3155
09:38:04 brtknr: without the /nodes api support from magnum?
09:38:10 it was blocked by the huawei provider
09:39:23 flwang1: i think this extends the previous approach of updating the heat stack directly to support nodegroups
09:40:04 brtknr: right, so it still would be nice to have the /nodes api support in magnum, wouldn't it?
09:40:35 flwang1: yes I believe so
09:40:43 yes
09:40:59 not only nice, it will be an improvement
09:41:42 strigazi: ok, just want to make sure i should still put effort into that one
09:42:51 flwang1: as you can see, this PR was open for a long time, long before the /nodes api was proposed
09:43:27 brtknr: i see.
09:43:39 brtknr: strigazi: anything else you want to discuss?
09:44:14 I'm good
09:44:46 strigazi: i need your blessing on this https://review.opendev.org/#/c/726017/ so that i can start the work on the dashboard side
09:45:39 i'm keen to reduce the templates
09:45:56 we need a few more +2s for the ussuri release: https://review.opendev.org/#/q/status:open+project:openstack/magnum+branch:stable/ussuri
09:46:14 brtknr: thanks for bringing this up
09:46:34 i did a review for the ussuri release recently
09:46:57 Also, what do you guys think about this? https://review.opendev.org/#/c/740439/
09:47:17 I was tired of creating lots of similar templates with 1 small change
09:47:42 brtknr: i love it
09:47:58 great!
09:48:07 i would even like to have a magic command to duplicate a template across regions
09:48:42 so i'm thinking if we can "export" a template and "import" it
09:49:21 I think that clone is good only for admins
09:49:41 strigazi: exactly
09:49:52 I know users can clone things manually
09:49:57 end users don't really need it
09:50:46 but if it is very easy and we serve it to them, we may promote a pattern where they don't use the public cluster templates
09:51:52 for us, we now maintain v1.16, v1.17 and v1.18; the only differences between those templates are the kube_tag and the template name
09:51:56 It would be great to do it in the API but I understand that it is overkill
09:52:09 strigazi: it's overkill
09:52:42 we're actually using a pipeline to publish templates
09:52:51 for public templates
09:53:06 but i still think a clone command may be useful
09:53:31 can we introduce it as an admin command?
09:54:04 before we're confident it's worth opening to all users
09:54:24 we can add a fake validation in the client
09:54:58 strigazi: we should be able to check the user role in the client, no?
09:55:35 yes, but virtualenv, pip install, sed, profit
09:55:55 I think it's ok, we can do it
09:56:04 ok, brtknr, happy with that?
09:56:22 we're running out of time
09:56:28 it's like cephfs that has quotas on the client xD
09:56:40 :D
09:57:00 strigazi: i'm keen to review your master resize code and the binary kube code
09:57:12 +1
09:57:20 brtknr: where are you?
09:57:46 ok
09:57:50 sounds good
09:58:03 not 100% clear what we mean by fake validation
09:58:19 brtknr: you check if the user is admin on the client side
09:58:32 strigazi: ok i will look into it
09:58:36 brtknr: this is fake in the sense that the user can modify the client
09:59:23 i'm going to close this meeting now
09:59:31 thank you for joining
09:59:36 thanks, I'm good
09:59:48 i hope you guys are doing well in the covid-19 world
10:00:13 #endmeeting
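For reference, the template "clone" workflow discussed near the end of the meeting can be approximated today, admin-side, with the existing CLI. A rough sketch, assuming jq and illustrative template names; a real implementation would live in python-magnumclient behind the client-side admin-role check described above, and would need to copy the remaining fields and merge the existing labels rather than just setting kube_tag:

```bash
# Sketch: clone a cluster template, changing only the name and kube_tag.
# Template names and the new tag are hypothetical; only a subset of the
# template's fields is copied here for brevity.

SRC_TEMPLATE="k8s-v1.17"
NEW_TEMPLATE="k8s-v1.18"
NEW_KUBE_TAG="v1.18.6"

SRC=$(openstack coe cluster template show "${SRC_TEMPLATE}" -f json)

openstack coe cluster template create "${NEW_TEMPLATE}" \
    --coe "$(echo "${SRC}" | jq -r .coe)" \
    --image "$(echo "${SRC}" | jq -r .image_id)" \
    --external-network "$(echo "${SRC}" | jq -r .external_network_id)" \
    --flavor "$(echo "${SRC}" | jq -r .flavor_id)" \
    --master-flavor "$(echo "${SRC}" | jq -r .master_flavor_id)" \
    --network-driver "$(echo "${SRC}" | jq -r .network_driver)" \
    --labels "kube_tag=${NEW_KUBE_TAG}"   # the one label that changes
```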