09:17:49 <jakeyip> #startmeeting magnum
09:17:49 <opendevmeet> Meeting started Wed Sep 13 09:17:49 2023 UTC and is due to finish in 60 minutes. The chair is jakeyip. Information about MeetBot at http://wiki.debian.org/MeetBot.
09:17:49 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
09:17:49 <opendevmeet> The meeting name has been set to 'magnum'
09:17:55 <jakeyip> #topic Roll Call
09:17:59 <jakeyip> o/
09:18:08 <dalees> o/
09:18:15 <jakeyip> ping gbialas
09:18:21 <mkjpryor> o/
09:19:00 <gbialas> o/
09:19:04 <jakeyip> ping mnasiadka :)
09:19:44 <jakeyip> Thanks everyone for joining the meeting
09:19:58 <jakeyip> Agenda:
09:20:01 <jakeyip> #link https://etherpad.opendev.org/p/magnum-weekly-meeting
09:20:08 <jakeyip> Please add your topics to the Agenda
09:21:06 <jakeyip> #topic Changing container_runtime default
09:21:18 <gbialas> Yes, that is mine :)
09:21:22 <jakeyip> gbialas: let's start with this. sorry I wasn't around last week. I read thru the chat.
09:21:43 <jakeyip> I agree we don't want to keep the old things around forever. In fact, I was hoping ClusterAPI solves my problem
09:21:58 <jakeyip> but unfortunately.... long story... :)
09:22:12 <mnasiadka> It's going to solve it in longer run
09:22:31 <gbialas> Well Cluster API is song of the future, till then we have to live with it :)
09:22:36 <mnasiadka> But let's try to make the current driver working out of the box without any labels ;)
09:22:58 <jakeyip> yeah. maybe I should provide some context
09:23:52 <jakeyip> so Antelope supports v1.23. It still works with dockershim.
09:24:52 <mnasiadka> Bobcat is going to support 1.23?
09:25:03 <jakeyip> during last PTG we sort of decided not to touch labels anymore. This is to prevent the case of an operator upgrading Magnum and the default labels changing and breaking existing cluster templates that didn't define it
09:26:27 <jakeyip> For example, a site might be using a template that defaults to dockershim and if we change the runtime, an operator upgrading from Antelope to Bobcat will have it broken because maybe there are some things that expect dockershim and it was the default
09:27:14 <jakeyip> personally, I think the labels are a massive headache. therefore I was hoping CAPI will remove this altogether
09:27:14 <mnasiadka> well, operator is upgrading to Bobcat and still wants to deploy unsupported kubernetes version?
09:27:16 <gbialas> I have spawned cluster in 1.23 version, and then changed default for container_runtime. I have rescaled cluster, and it worked.
09:27:55 <mnasiadka> jakeyip: CAPI will not remove labels, we need to somehow pass additional options to the driver
09:28:23 <mnasiadka> We could manage properly kube_version
09:28:24 <jakeyip> gbialas: the existing clusters will already have the dockershim value. it is new clusters that will have the new value
09:28:31 <mnasiadka> currently we copy kube_tag as kube_version
09:28:46 <mnasiadka> if we would know what is the major version being requested - we could do a conditional on container_runtime
09:28:53 <jakeyip> mnasiadka: yes with CAPI and the CAPI labels we will be very careful not to make this mistake
09:30:21 <jakeyip> mnasiadka: kube_ver is just one, some of them may depend on fcos version, etc
09:31:32 <jakeyip> e.g. use_podman label when Magnum was changing from fedora atomic to fedora coreos
09:31:47 <mnasiadka> just saying that this way we would not break any expectations
09:32:01 <mnasiadka> second thing is - we could have a validator on kubernetes version to the ones we support
09:33:08 <jakeyip> can explain more?
09:33:18 <jakeyip> gbialas any comments?
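A minimal sketch of the two suggestions mnasiadka makes above: a container_runtime default that is conditional on the requested Kubernetes version, and a validator that restricts versions to supported ones. This is not Magnum code. The function names, the SUPPORTED_VERSIONS set and the LEGACY_RUNTIME value are assumptions made for illustration; the only outside fact relied on is that dockershim was removed in Kubernetes 1.24. As jakeyip notes, other labels depend on more than the Kubernetes version, so this covers container_runtime only.

```python
import re

# Hypothetical values for illustration only; not taken from Magnum.
SUPPORTED_VERSIONS = {(1, 23), (1, 24), (1, 25), (1, 26)}
LEGACY_RUNTIME = "docker"  # placeholder for whatever the pre-change default is


def parse_kube_tag(kube_tag):
    """Return (major, minor) from a tag such as 'v1.23.3-rancher1'."""
    match = re.match(r"^v?(\d+)\.(\d+)", kube_tag)
    if match is None:
        raise ValueError(f"unrecognised kube_tag: {kube_tag!r}")
    return int(match.group(1)), int(match.group(2))


def validate_kube_tag(kube_tag):
    """Reject Kubernetes versions outside the (hypothetical) supported set."""
    version = parse_kube_tag(kube_tag)
    if version not in SUPPORTED_VERSIONS:
        raise ValueError(f"Kubernetes {version[0]}.{version[1]} is not supported")
    return version


def default_container_runtime(kube_tag, labels):
    """Pick container_runtime: an explicit label wins, else decide by version."""
    if "container_runtime" in labels:
        return labels["container_runtime"]
    # dockershim was removed in Kubernetes 1.24, so newer versions need containerd.
    if parse_kube_tag(kube_tag) >= (1, 24):
        return "containerd"
    return LEGACY_RUNTIME
```

With this shape, a template that sets no labels and requests v1.23 keeps the legacy runtime, so existing expectations are not broken, while a request for 1.24 or later gets containerd without needing a label.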
09:33:35 <jakeyip> I am not objecting to changing, I am open to ideas
09:33:39 <dalees> it would be nice to be able to have label defaults be "sensible" in each version and able to change. It would probably be a lot of change, but it'd help if templates stored defaults when they were created, so they'd not be affected by code changes of defaults.
09:33:47 <gbialas> We should support some recent version (both k8s and fcos). Right now we are supporting versions which are eol.
09:34:06 <mnasiadka> dalees: don't we do that today?
09:34:46 <dalees> mnasiadka: can't do, if a template is created with no labels, it'll use defaults on cluster creation. So the behaviour changes if the default in code changes.
09:35:04 <dalees> this is why label defaults are annoying; we may break old templates.
09:35:07 <mnasiadka> ah, so we would need to evaluate all labels defaults and push them to the Heat stack
09:35:19 <jakeyip> gbialas: we do support new versions, if you are making a new CT you have to provide the labels that will work for it. this is actually good because this CT won't follow defaults if magnum changes.
09:35:43 <jakeyip> CT becomes immutable and not dependent on Magnum defaults.
09:36:10 <dalees> the other option is to just say "we will change defaults to match k8s version X". And if the user wants to have a template that still works the same, they specify all the labels in the template (we do this at Catalyst Cloud, and are explicit about most labels)
09:36:30 <mnasiadka> well, still containerd will work with 1.23, I guess even with 1.19
09:36:45 <mnasiadka> most projects just do release notes on changed defaults
09:36:59 <jakeyip> mnasiadka: not sure but we don't test it and I'm not keen to look :)
09:38:02 <mnasiadka> yeah, once we test that, it might be better :)
09:38:19 <jakeyip> I like dalees idea on getting the defaults set in the CT on creation.
09:38:37 <mnasiadka> so what for now? we change the default or just write in the docs that any not EOL kubernetes version needs setting container_runtime to containerd?
09:39:36 <dalees> I think that is a difficult change, the defaults live in Heat template files not python code - so I don't think you could just iterate over them and store.
09:39:56 <jakeyip> the way decided last PTG is to document it. Every version we have the tags that should be used when creating CT. That way CT stores all the info
09:40:10 <mnasiadka> ok, so let's continue with that for now
09:40:12 <jakeyip> I also like mnasiadka idea of eventually not providing default lol
09:40:31 <mnasiadka> Yeah, maybe we should have some mandatory labels :)
09:40:51 <dalees> mnasiadka: Jake is working on documenting "known good" labels for k8s versions, containerd is one of them. https://review.opendev.org/c/openstack/magnum/+/894006/2/doc/source/user/index.rst
09:41:01 <jakeyip> if this problem follows us for a while (e.g. CAPI won't save us), dalees may want to explore that idea
09:42:15 <mnasiadka> ok, so that probably concludes
09:42:40 <jakeyip> then we just have a breaking change behaviour version and after that all CT will suck defaults in at create time and store them. Then we are free to ratchet the labels each version
09:43:19 <mnasiadka> yup
09:43:57 <jakeyip> gbialas: you ok with that? we don't mind help in getting a patch for this idea, then we can update labels each version
09:44:30 <gbialas> Yes, I'm ok with this.
09:44:49 <jakeyip> I've just noticed the time, let's move on quickly. gbialas do message me if you want to discuss more. Sorry to disappoint :)
09:45:06 <jakeyip> #topic Fixing meeting name
09:45:08 <gbialas> It is as it is :)
09:45:10 <jakeyip> mnasiadka: updates?
09:45:53 <mnasiadka> #link https://opendev.org/opendev/irc-meetings/src/branch/master/meetings/containers-team-meeting.yaml
09:45:58 <mnasiadka> haven't raised a patch to rename it
09:46:06 <mnasiadka> I was thinking of leaving the name as is
09:46:16 <mnasiadka> changing id to magnum - so it points to correct logs on eavesdrop
09:47:14 <mnasiadka> does it make sense?
09:47:24 <mnasiadka> or should we also rename OpenStack Containers to OpenStack Magnum?
09:47:56 <jakeyip> I am not sure where the bug is actually.
09:48:03 <jakeyip> is this a community thing or can both of us decide?
09:48:24 <jakeyip> or 3 of us (sorry dalees :D )
09:50:04 <jakeyip> OH I think I understand now. because we start meeting with 'magnum'
09:50:24 <jakeyip> old logs here https://meetings.opendev.org/meetings/containers/
09:50:31 <jakeyip> new logs here https://meetings.opendev.org/meetings/magnum/
09:50:35 <jakeyip> am I right?
09:52:44 <jakeyip> did I lose everyone... ?
09:54:05 * dalees is here
09:54:14 <jakeyip> hm
09:54:30 <jakeyip> nvm let's go on to next topic, nearly time
09:54:34 <jakeyip> #topic ClusterAPI
09:56:18 <jakeyip> Update: We have encountered a bit of a roadblock while at the very last stage of the ClusterAPI driver. It turns out that the current implementation, contributed by StackHPC, may conflict with another driver VeXXHost developed in-house
09:58:13 <mnasiadka> jakeyip: we can decide - old logs are in containers
09:58:22 <mnasiadka> jakeyip: but when you write startmeeting magnum - the logs get into magnum
09:58:37 <mnasiadka> jakeyip: so either we change only id, or name as well - core reviewers can decide :)
09:58:38 <jakeyip> we have discussed about this a fair bit. The biggest consideration is keeping the community happy and engaged so the project is sustainable.
09:59:27 <jakeyip> thanks mnasiadka, we will discuss this offline so as not to waste others' time
09:59:42 <mnasiadka> jakeyip: ack
10:00:13 <jakeyip> so in view of that, the current implementation in Magnum can't be merged as is
10:00:41 <jakeyip> There are a few things we will need to do
10:01:20 <jakeyip> 1. Find a way for both drivers to co-exist
10:02:23 <jakeyip> 2. Refactor the driver interface code so others can contribute out of tree drivers without conflicts
10:03:55 <jakeyip> 3. Possibly merge StackHPC and VeXXHost drivers so in the future there is 1 good reference CAPI driver
10:04:17 <jakeyip> I think that's all, mnasiadka or dalees, do you have anything to add?
10:04:53 <dalees> No, you covered what I was going to say in #2
10:05:26 <jakeyip> questions anyone?
10:06:13 <mnasiadka> not me
10:07:32 <jakeyip> mkjpryor has been strangely silent. :)
10:07:52 <jakeyip> maybe he is very disappointed
10:09:05 <jakeyip> ok let's move on
10:09:11 <jakeyip> #topic Open Discussion
10:10:47 <jakeyip> Oh I noticed a BU Mentorship Agenda. It doesn't have a name to it, do we have someone who wants to talk about that?
10:12:07 <jakeyip> alright if there's nothing let's end the meeting.
10:12:19 <mkjpryor> Apologies - got distracted by an urgent email
10:12:27 <jakeyip> oh that's ok
10:12:37 <mkjpryor> We are currently extracting our patches into an out-of-tree driver that we will work on for the time being
10:13:37 <mkjpryor> The problem with our driver and VEXXHOST's co-existing in the same installation is that we use the same image properties to identify our drivers
10:13:53 <mkjpryor> So we would need to come up with a way out of that
10:14:14 <jakeyip> yeah we've discussed about that, we will find a way around that
10:14:57 <mkjpryor> More generally, we are keen to find a way to share as much code between our driver and VEXXHOST's as possible
10:15:14 <mkjpryor> But that is probably a fair distance off at the moment
10:15:18 <jakeyip> ideal scenario is that a Magnum installation will be able to use any number of drivers. The ClusterTemplates will hint which driver to use.
10:15:27 <mkjpryor> That makes sense I think
10:16:05 <mkjpryor> I don't really understand why it wasn't always like that, but I also wasn't using Magnum when the API was designed
10:16:07 <jakeyip> mkjpryor: yes, I think that is the most helpful. we would really love if both StackHPC and VEXXHOST can work together to come up with one good reference.
10:16:35 <jakeyip> the API was designed in mesos days so things changed :)
10:16:42 <mkjpryor> We do favour our Helm approach, TBH, as it allows us to reuse the intelligence in the Helm chart in other places, e.g. for proper gitops and in Azimuth
10:17:19 <mkjpryor> I'm not sure whether VEXXHOST's objections are because we are using Helm or because we are not using ClusterClass
10:17:49 <mkjpryor> FWIW, we are probably going to modify the Helm charts to use ClusterClass at some point in the next few months
10:18:04 <mkjpryor> That may bring us a bit closer
10:19:18 <dalees> mkjpryor, question for you (and I'll ask the same of Vexxhost): how open are you to contributions to your out of tree driver and helm charts? There are lots of changes we've been waiting to submit once it merges, which won't happen for a while. I believe one or two are created as PRs but had no attention.
10:20:17 <mkjpryor> Very open, for us
10:22:14 <mkjpryor> I did see a PR to the CAPI Helm charts for Flatcar integration
10:22:56 <jakeyip> Hi all, I think we can continue the CAPI discussion, but let's close the meeting
10:23:05 <mkjpryor> I need to have a proper think about how we support multiple OSs there, because I'm not convinced that PR is the cleanest approach
10:23:14 <jakeyip> #endmeeting
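A rough sketch of the idea raised under "Changing container_runtime default", that a ClusterTemplate stores the label defaults at creation time so later changes to the defaults cannot alter its behaviour. This is an assumption-laden illustration, not Magnum code: snapshot_labels and labels_for_new_cluster are hypothetical names, and the plain dict of defaults stands in for values that, as dalees points out, actually live in Heat template files and are not easy to enumerate.

```python
def snapshot_labels(user_labels, current_defaults):
    """Labels to persist on a new template: defaults overlaid by user labels.

    current_defaults is a hypothetical dict of the defaults in force when the
    template is created; copying it freezes those values into the template.
    """
    stored = dict(current_defaults)
    stored.update(user_labels)
    return stored


def labels_for_new_cluster(template_labels, cluster_labels):
    """Labels for a new cluster: only the stored template plus overrides.

    The code defaults are deliberately not consulted here, which is what
    makes the template independent of later default changes.
    """
    effective = dict(template_labels)
    effective.update(cluster_labels)
    return effective


# Template created while the (hypothetical) default was still docker.
template = snapshot_labels({}, {"container_runtime": "docker"})

# A later release changes the default; the stored template is unaffected.
new_defaults = {"container_runtime": "containerd"}
old_template_cluster = labels_for_new_cluster(template, {})
new_template = snapshot_labels({}, new_defaults)
```

Here old_template_cluster still carries the value frozen at template creation, while new_template picks up the changed default, which is the property that would leave the project free to update defaults each release.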