09:00:27 <flwang1> #startmeeting magnum
09:00:28 <openstack> Meeting started Wed Jul 22 09:00:27 2020 UTC and is due to finish in 60 minutes.  The chair is flwang1. Information about MeetBot at http://wiki.debian.org/MeetBot.
09:00:29 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
09:00:31 <openstack> The meeting name has been set to 'magnum'
09:00:49 <flwang1> #topic roll call
09:00:50 <flwang1> o/
09:00:53 <strigazi> o/
09:00:54 <brtknr> O/
09:01:01 <flwang1> strigazi: hey
09:01:06 <flwang1> stranger
09:01:26 <strigazi> hello
09:01:45 <flwang1> we do have some patches waiting for another +2 from you
09:01:58 <openstackgerrit> Bharat Kunwar proposed openstack/magnum stable/ussuri: [k8s] Use helm upgrade --install in deployment loop  https://review.opendev.org/742374
09:02:19 <flwang1> brtknr: strigazi: do you have any topic you'd like to discuss today?
09:02:21 <strigazi> flwang1: I know, I'll take care of them
09:02:37 <strigazi> flwang1: I have one, and a half
09:02:56 <flwang1> strigazi: brtknr: the only update from my side is i'm revisiting the /nodes API patch
09:02:58 <brtknr> hi strigazi
09:03:03 <flwang1> will submit a patchset soon
09:03:14 <brtknr> flwang1: sounds good
09:03:33 <flwang1> strigazi: you first?
09:03:39 <strigazi> ok
09:03:58 <strigazi> 1. For hyperkube and 1.19
09:04:15 <flwang1> oh, god
09:04:29 <strigazi> For after ussuri (current master)
09:05:03 <strigazi> I think we will be safer if we move to deploying the binary, or add an option to deploy the binary.
09:05:20 <flwang1> strigazi: i would try to avoid that
09:05:21 <strigazi> For ussuri and older releases, we can do a build
09:05:48 <flwang1> unless we have a good solution for upgrade
09:06:07 <strigazi> flwang1: My argument is: With the binary we maintain nothing. With the build we chase kubernetes releases.
09:06:22 <flwang1> we have broken the upgrade from v1.15 to v1.16, and I don't want to do that again
09:06:47 <strigazi> flwang1: if we do both, we won't have anything broken, will we?
09:07:16 <flwang1> how to upgrade from container to binary?
09:07:25 <strigazi> Also, wait
09:07:25 <brtknr> strigazi: how does the binary get deployed?
09:08:09 <brtknr> by build do you mean hyperkube build?
09:08:13 <flwang1> stop container and replace with binary in upgrade_kubernetes.sh?
09:08:19 <strigazi> If we build our own image, people who don't have a mirror of both k8s.gcr.io and docker.io/openstackmagnum will be broken too
09:08:58 <strigazi> Because in upgrade.sh you need to switch registries
09:09:09 <flwang1> they need to use openstackmagnum, just like for heat-container-agent
09:10:14 <strigazi> yes, but upgrade.sh doesn't do that
09:10:15 <brtknr> strigazi: makes sense, your argument is that since we are changing the mode anyway, let's switch to binary to reduce maintenance overhead
09:10:29 <flwang1> i don't like the idea of binary, we need more discussion about this
09:11:08 <flwang1> brtknr: what do you mean 'changing the mode anyway'?
09:13:04 <strigazi> brtknr means: k8s.gcr.io -> docker.io/openstackmagnum
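A minimal sketch of the registry problem strigazi describes, assuming a hypothetical helper modelled on Magnum's container_infra_prefix label (illustrative Python, not Magnum's actual upgrade code):

    def hyperkube_image(kube_tag, container_infra_prefix=None):
        # container_infra_prefix mirrors Magnum's label of the same name: when set,
        # images are pulled from that prefix, so an upstream registry switch
        # (k8s.gcr.io -> docker.io/openstackmagnum) is transparent to the cluster.
        default_registry = "k8s.gcr.io/"  # would become docker.io/openstackmagnum/ for self-built images
        prefix = container_infra_prefix or default_registry
        return "{}hyperkube:{}".format(prefix, kube_tag)

    # hyperkube_image("v1.18.6", "docker.io/openstackmagnum/")
    # -> "docker.io/openstackmagnum/hyperkube:v1.18.6"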
09:13:44 <strigazi> flwang1: I propose the following:
09:13:50 <flwang1> if we go for binary, is there a trusted place we can get the binary from?
09:14:10 <strigazi> yes, give me a sec
09:15:04 <strigazi> Your concerns of breaking upgrade are very valid. But it will break no matter what we do because of upstream kubernetes. Do we agree on this?
09:15:31 <strigazi> flwang1: ^^
09:15:43 <flwang1> if we still use hyperkube, why it will break upgrade?
09:15:58 <strigazi> Let me explain
09:15:58 <brtknr> yes
09:17:23 <strigazi> in upgrade_kubernetes.sh we just bump the version of the image. The registry is unchanged. At CERN we have a mirror of the registry so we are not affected, but stock magnum will fetch hyperkube from k8s.gcr.io/hyperkube
09:17:41 <flwang1> stock magnum means?
09:17:48 <brtknr> upstream magnum
09:18:09 <flwang1> i believe any user running magnum in production will have a mirror
09:18:22 <strigazi> the default code, with the default cluster creation parameters/labels
09:18:28 <flwang1> that said, at least with hyperkube, it won't break prod level usage
09:19:21 <strigazi> sure, but this is still an assumption, even if a highly probable one.
09:19:32 <strigazi> So what if we cover both cases?
09:19:49 <strigazi> Regarding the safe place for the binary:
09:20:03 <strigazi> https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md#downloads-for-v1186
09:20:20 <flwang1> strigazi: we do need to support both for sure
09:20:22 <strigazi> https://dl.k8s.io/v1.18.6/kubernetes-client-darwin-amd64.tar.gz with a sha512sum
09:20:29 <flwang1> okay
09:20:52 <strigazi> well not darwin, that would be crazy
09:21:04 <flwang1> we(catalystcloud) got a lot of pain because of the breakage from v1.15 -> v1.16
09:21:30 <brtknr> strigazi: would we use a container to deploy the binaries?
09:21:32 <flwang1> you mean crazy to support both?
09:21:58 <strigazi> it was a joke, we use amd64 and in a few cases arm AFAIK
09:22:21 <flwang1> ah, you mean macos
09:22:31 <flwang1> sorry, i misunderstood
09:22:40 <strigazi> brtknr: like containerd's logic
09:23:26 <strigazi> We fetch the binary, we write the systemd unit. zero things to maintain
09:23:46 <brtknr> strigazi: makes sense
09:23:49 <strigazi> well apart from magnum's code
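A rough sketch of the "fetch the binary, write the systemd unit" idea, assuming the kubernetes-node tarball layout from dl.k8s.io and a placeholder checksum; the helper names and unit contents are illustrative, not the proposed Magnum driver code:

    import hashlib
    import subprocess
    import textwrap
    import urllib.request

    KUBE_VERSION = "v1.18.6"
    TARBALL_URL = "https://dl.k8s.io/{}/kubernetes-node-linux-amd64.tar.gz".format(KUBE_VERSION)
    EXPECTED_SHA512 = "<sha512 from the CHANGELOG download table>"

    KUBELET_UNIT = textwrap.dedent("""\
        [Unit]
        Description=kubelet (release binary)
        After=network-online.target

        [Service]
        ExecStart=/usr/local/bin/kubelet $KUBELET_ARGS
        Restart=always

        [Install]
        WantedBy=multi-user.target
        """)

    def fetch_and_verify(url, expected_sha512, dest):
        # Download the release tarball and check it against the published sha512.
        urllib.request.urlretrieve(url, dest)
        with open(dest, "rb") as f:
            digest = hashlib.sha512(f.read()).hexdigest()
        if digest != expected_sha512:
            raise RuntimeError("checksum mismatch for {}".format(url))

    def install_kubelet():
        fetch_and_verify(TARBALL_URL, EXPECTED_SHA512, "/tmp/kubernetes-node.tar.gz")
        subprocess.check_call(["tar", "-xzf", "/tmp/kubernetes-node.tar.gz", "-C", "/tmp"])
        subprocess.check_call(["install", "-m", "0755",
                               "/tmp/kubernetes/node/bin/kubelet", "/usr/local/bin/kubelet"])
        with open("/etc/systemd/system/kubelet.service", "w") as f:
            f.write(KUBELET_UNIT)
        subprocess.check_call(["systemctl", "daemon-reload"])
        subprocess.check_call(["systemctl", "enable", "--now", "kubelet"])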
09:24:15 <strigazi> I work on this, this week
09:24:26 <brtknr> strigazi: i think it would be diligent to ensure that there is a good upgrade path
09:24:59 <flwang1> strigazi: i can help for the upgrade part
09:25:58 <strigazi> brtknr: we can create an upgrade path for switching to the binary. I don't know if we can do it with the magnum api. We can try
09:26:18 <strigazi> flwang1: brtknr: Regarding supporting high profile users
09:26:37 <strigazi> I guess catalyst have some important clients
09:26:43 <strigazi> we have some experiments
09:27:16 <flwang1> we do have some customers running clusters created 1 year ago :(
09:27:41 <flwang1> though we tried to push them to upgrade
09:27:51 <strigazi> At CERN we dedicate time directly to them; we can't have a generic solution that works for those cases and dev clusters
09:28:21 <flwang1> strigazi: i appreciate your understanding
09:28:46 <strigazi> Also, we spent some time on serviceType LoadBalancer to create a path for moving to a new cluster
09:29:09 <strigazi> with this https://github.com/kubernetes/cloud-provider-openstack/pull/1118
09:30:01 <flwang1> migrate a lb from cluster A to cluster B?
09:30:10 <strigazi> Anyway, we'll move to a solution that does both, hyperkube and binary. People with mirrors will have a transparent upgrade
09:30:13 <brtknr> strigazi: i like the pool
09:30:21 <strigazi> flwang1 no
09:30:26 <strigazi> flwang1: one LB
09:30:36 <strigazi> flwang1: one unmanaged LB
09:30:48 <strigazi> flwang1: add members of both clusters to the LB
09:31:25 <strigazi> gradually remove members and eventually delete the old cluster (or delete in one go when the new cluster is added to the LB)
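Roughly, the pattern with an unmanaged LB could look like this (assumed openstacksdk calls; the pool id, subnet and NodePort are placeholders):

    import openstack

    conn = openstack.connect(cloud="mycloud")
    POOL_ID = "<pool of the unmanaged LB>"

    def add_members(node_addresses, subnet_id, port=30080):
        # Register the nodes of a cluster as members of the shared pool.
        for address in node_addresses:
            conn.load_balancer.create_member(
                POOL_ID, address=address, protocol_port=port, subnet_id=subnet_id)

    def remove_members(node_addresses):
        # Drain the old cluster's nodes from the pool.
        for member in conn.load_balancer.members(POOL_ID):
            if member.address in node_addresses:
                conn.load_balancer.delete_member(member, POOL_ID)

    # add_members(new_cluster_nodes, subnet_id)  # both clusters serve traffic
    # remove_members(old_cluster_nodes)          # then delete the old cluster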
09:31:51 <flwang1> strigazi: ok, i will read it later
09:33:31 <strigazi> move on?
09:33:43 <flwang1> strigazi: sure
09:33:47 <flwang1> i'm keen to review your patch
09:33:53 <flwang1> what's your half one?
09:34:32 <strigazi> master resize; I'm working on dropping the discovery url
09:34:44 <strigazi> the other half is supporting it in the API
09:35:19 <flwang1> support resizing masters in the api?
09:35:23 <strigazi> yes
09:35:29 <strigazi> will you do it?
09:35:35 <flwang1> i can do it
09:36:28 <strigazi> that's it
09:36:56 <flwang1> once you submit the code, i can start a follow-up patch for the api part
09:37:02 <strigazi> ok
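For reference, supporting it in the API would let a client do something like the following (a sketch only; the Client construction and the nodegroup argument to clusters.resize are assumptions based on python-magnumclient, not merged behaviour):

    from magnumclient.client import Client

    client = Client(session=keystone_session)  # keystone_session: an authenticated keystoneauth session
    cluster = client.clusters.get("my-cluster")

    # Today the API rejects resizing the master nodegroup; the work discussed here
    # is to allow it once the etcd discovery url dependency is dropped.
    client.clusters.resize(cluster.uuid, node_count=3, nodegroup="default-master")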
09:37:28 <brtknr> Looks like this just got merged: https://github.com/kubernetes/autoscaler/pull/3155
09:38:04 <flwang1> brtknr: without the /nodes api support from magnum?
09:38:10 <strigazi> it was blocked by the huawei provider
09:39:23 <brtknr> flwang1: i think this extends the previous approach of updating the heat stack directly to support nodegroups
09:40:04 <flwang1> brtknr: right, so it would still be nice to have the /nodes api support in magnum, wouldn't it?
09:40:35 <brtknr> flwang1: yes I believe so
09:40:43 <strigazi> yes
09:40:59 <strigazi> not only nice, it will be an improvement
09:41:42 <flwang1> strigazi: ok, just want to make sure whether i should still put effort into that one
09:42:51 <brtknr> flwang1: as you can see, this PR was open for a long time, long before /nodes api was proposed
09:43:27 <flwang1> brtknr: i see.
09:43:39 <flwang1> brtknr: strigazi: anything else you want to discuss?
09:44:14 <strigazi> I'm good
09:44:46 <flwang1> strigazi: i need your blessing on this https://review.opendev.org/#/c/726017/ so that i can start the work on the dashboard side
09:45:39 <flwang1> i'm keen to reduce the templates
09:45:56 <brtknr> we need a few more +2s for ussuri release:  https://review.opendev.org/#/q/status:open+project:openstack/magnum+branch:stable/ussuri
09:46:14 <flwang1> brtknr: thanks for bringing this up
09:46:34 <flwang1> i did a review pass for the ussuri release recently
09:46:57 <brtknr> Also what do you guys think about this? https://review.opendev.org/#/c/740439/
09:47:17 <brtknr> I was tired of creating lots of similar templates with 1 small change
09:47:42 <flwang1> brtknr: i love it
09:47:58 <brtknr> great!
09:48:07 <flwang1> i would even like to have a magic command to duplicate templates across regions
09:48:42 <flwang1> so i'm wondering whether we can "export" a template and "import" it
09:49:21 <strigazi> I think that clone is good only for admins
09:49:41 <flwang1> strigazi: exactly
09:49:52 <strigazi> I know users can clone things manually
09:49:57 <flwang1> end user doesn't really need it
09:50:46 <strigazi> but if it is very easy and we hand it to them, we may promote a pattern where they don't use the public cluster templates
09:51:52 <flwang1> for us, we now maintain v1.16, v1.17 and v1.18; the only differences between those templates are the kube_tag and the template name
09:51:56 <strigazi> It would be great to do it in the API but I understand that it is overkill
09:52:09 <flwang1> strigazi: it's overkill
09:52:42 <flwang1> we're using a pipeline to publish templates actually
09:52:51 <flwang1> for public templates
09:53:06 <flwang1> but i still think a clone command may be useful
09:53:31 <flwang1> can we introduce it as an admin command?
09:54:04 <flwang1> before we're confident it's worth opening to all users
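A rough sketch of what a clone could look like on the client side, assuming python-magnumclient's cluster_templates manager and a hypothetical clone_template helper (not an existing command):

    from magnumclient.client import Client

    def clone_template(client, src_id, new_name, overrides=None):
        # Copy an existing cluster template under a new name, optionally
        # overriding fields such as the kube_tag label, which is often the
        # only difference between per-version public templates.
        src = client.cluster_templates.get(src_id).to_dict()
        for field in ("uuid", "created_at", "updated_at", "links"):
            src.pop(field, None)  # drop server-generated fields before re-creating
        src["name"] = new_name
        src.update(overrides or {})
        return client.cluster_templates.create(**src)

    # client = Client(session=keystone_session)
    # clone_template(client, "k8s-v1.17", "k8s-v1.18",
    #                {"labels": {"kube_tag": "v1.18.6"}})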
09:54:24 <strigazi> we can add a fake validation in the client
09:54:58 <flwang1> strigazi: we should be able to check the user role in the client, no?
09:55:35 <strigazi> yes, but virtualenv, pip install, sed <the file with the validation>, profit
09:55:55 <strigazi> I think it's ok, we can do it
09:56:04 <flwang1> ok, brtknr, happy with that?
09:56:22 <flwang1> we're running out of time
09:56:28 <strigazi> it's like cephfs that has quotas on the client xD
09:56:40 <flwang1> :D
09:57:00 <flwang1> strigazi: i'm keen to review your master resize code and the binary kube code
09:57:12 <strigazi> +1
09:57:20 <strigazi> brtknr: where are you?
09:57:46 <brtknr> ok
09:57:50 <brtknr> sounds good
09:58:03 <brtknr> not 100% clear what we mean by fake validation
09:58:19 <strigazi> brtknr: you check if the user is admin on the client
09:58:32 <brtknr> strigazi: ok i will look into it
09:58:36 <strigazi> brtknr: this is fake in the sense that the user can modify the client
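The "fake validation" could be as small as checking the token's roles client-side (assumed keystoneauth1 calls; as strigazi notes, this is a UX guard, not security, since users can edit the installed client):

    from keystoneauth1 import loading, session

    def is_admin(sess):
        # Inspect the roles carried by the current token.
        return "admin" in sess.auth.get_access(sess).role_names

    # loader = loading.get_plugin_loader("password")
    # auth = loader.load_from_options(auth_url=..., username=..., password=...,
    #                                 project_name=..., user_domain_id="default",
    #                                 project_domain_id="default")
    # sess = session.Session(auth=auth)
    # if not is_admin(sess):
    #     raise SystemExit("cluster template clone is admin-only for now")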
09:59:23 <flwang1> i'm going to close this meeting now
09:59:31 <flwang1> thank you for joining
09:59:36 <strigazi> thanks, I'm good
09:59:48 <flwang1> i hope you guys are doing well in the covid-19 world
10:00:13 <flwang1> #endmeeting