09:00:33 #startmeeting magnum
09:00:34 Meeting started Wed Jul 19 09:00:33 2023 UTC and is due to finish in 60 minutes. The chair is jakeyip. Information about MeetBot at http://wiki.debian.org/MeetBot.
09:00:34 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
09:00:34 The meeting name has been set to 'magnum'
09:00:41 #topic Roll Call
09:00:43 o/
09:00:55 o/
09:00:56 o/
09:01:14 o/
09:02:15 thanks all for joining the meeting
09:02:34 Agenda:
09:02:35 #link https://etherpad.opendev.org/p/magnum-weekly-meeting
09:02:37 ny
09:03:03 #topic k8s conformance testing
09:03:48 let's start with this. can someone take over?
09:04:08 Sure lol.
09:04:39 So, basically for the last dozen or so k8s releases, up till 1.24, openstack magnum has been certified as a provider for k8s
09:05:34 There's a set of conformance tests that were run by lxkong for a while and then guilhermesp_____ for a while
09:05:35 (thank you both for keeping up on that for so long)
09:06:09 So, as of early May we fell out of conformance - what with k8s releasing every 3 months it doesn't take long for things to expire
09:06:52 k8s is getting ready to release 1.28 (in August I think) so it would be good to target that or at least 1.27 to get back on their list of certified providers.
09:06:59 That's step 1.
09:07:47 Step 2 would be to get a periodic job set up to run the conformance tests so that we A. keep track of when they merge things that we should be aware of and B. don't have to manually run the tests anymore and can just pull logs from that to submit when the time comes.
09:08:14 #link https://github.com/cncf/k8s-conformance/tree/master Conformance Info
09:08:36 #link https://github.com/cncf/k8s-conformance/tree/master/v1.24/openstack-magnum Our last passing conformance application thingy
09:09:15 Now, when guilhermesp_____ last tried to run the tests with the latest magnum (I think it was Antelope) it didn't pass with k8s 1.25
09:09:23 So passing 1.25 and 1.26 should be fairly straightforward; I submitted Catalyst Cloud's Magnum 1.25 a while back: https://github.com/cncf/k8s-conformance/pull/2414
09:09:29 Unfortunately, I don't have his logs on me to tell you what the issue was.
09:09:33 Oh sweet
09:09:43 That is promising
09:09:48 there were only minor changes required, most of these changes are merged now.
09:09:53 (if not all)
09:10:04 Oh even better then
09:10:32 So I guess my ask is if catalyst cloud runs vanilla openstack magnum or do you have extra stuff added into it?
09:11:05 This relates to the Magnum Heat driver of course - we are migrating to the Magnum CAPI driver and will want to remain passing conformance, but there's some version after which we won't be looking for conformance for Magnum Heat.
09:11:45 Yeah this rings a bell. I had talked to Matt Pryor about this at the summit a little I think.
09:11:50 we run Magnum Wallaby, with several extra patches. I've gone through them recently and only a couple that relate to conformance need to go upstream.
09:12:18 Sounded like vexxhost and stackhpc had made different drivers so neither of them were running pure antelope magnum
09:12:33 dalees: oh okay that doesn't sound so bad
09:12:35 for me, I have been testing with devstack and v1.25 in Antelope and v1.27 in Bobcat (IIRC)
09:13:04 jakeyip: would you be able to run the conformance tests with that environment?
09:13:17 it's not conformance though, just a basic set of tests, as the environment I have will probably not be big enough
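(For reference: the conformance runs discussed above follow the instructions in the linked cncf/k8s-conformance repo and are driven with Sonobuoy. A minimal sketch of such a run against an already-built Magnum cluster might look like the following; the kubeconfig path is illustrative and not from the meeting.)

    # point sonobuoy at the Magnum-created cluster (path is an example)
    export KUBECONFIG=~/clusters/conformance-test/config

    # run the full certified-conformance suite (this takes a while)
    sonobuoy run --mode=certified-conformance --wait

    # fetch and summarise the results; the retrieved tarball contains the
    # e2e log and junit output that a conformance submission PR expects
    results=$(sonobuoy retrieve)
    sonobuoy results "$results"

    # clean up the test pods and namespace afterwards
    sonobuoy delete --wait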
09:13:42 dalees: what's the environment that you use to run?
09:14:08 Ahhh got it - yeah that was my issue actually - hence my looking for an openstack provider that is running pure magnum and my desire to get it running as a periodic non-voting gate job
09:14:15 So it would be true magnum
09:14:18 hm ping mnasiadka (sorry I forgot)
09:14:22 and the latest magnum to boot
09:14:32 I use our preproduction environment for conformance submissions, so real metal.
09:14:56 disk/ram?
09:18:10 I think the tests just required a couple of control plane nodes and a couple of workers. I created the control plane with c2r4 and the workers with c4r8
09:18:28 since guilhermesp_____ was the last to do it, is it possible that we contact them to find out (1) what error they were having and maybe solve that?
09:18:28 (err, 3x control plane, not 2)
09:18:34 dalees: yeah I think so
09:19:06 jakeyip: yeah I asked for his logs in that thread.. I will go back and see if he had them.
09:19:21 jakeyip: guilhermesp_____ emailed on 6th May with "I am trying to run conformance against 1.25 and 1.26 now, but it looks like we are still with this ongoing? https://review.opendev.org/c/openstack/magnum/+/874092 Im still facing issues to create the cluster due to "PodSecurityPolicy\" is unknown."
09:19:36 oh
09:19:39 I don't think we need to test for everything between our last cert and the current cert, for the record.
09:19:42 Oh yeah!
09:19:46 thanks dalees :)
09:19:58 so that is merged
09:21:31 yeah of cos it's PodSecurityPolicy
09:21:38 Lol
09:21:58 Was that merged in antelope?
09:22:12 Or is it in bobcat/master?
09:22:14 no, Bobcat
09:22:27 Okay that makes sense.
09:22:57 so the issue is that that patch breaks compatibility between k8s < 1.25 and >=1.25
09:22:58 I will ask guilhermesp_____ to get master and run the tests again and hopefully we will be good to go. That solves part 1 I think :)
09:23:23 Oh
09:23:47 so vexxhost would need to be running something greater than 1.25
09:23:58 for it to pass/work with master magnum?
09:24:01 jakeyip: does it? we run 1.23, 1.24 and 1.25 currently, maybe we have some other patches (or our templates explicitly list everything needed for older ones to function still)
09:24:49 in the Antelope cycle we didn't want to break compatibility, partly because we respect the OpenStack deprecation cycle, etc, so we needed to do some comms first etc
09:25:17 yeah it should be working with master / bobcat
09:26:05 I wonder if we should publish a set of magnum templates with labels set to working versions for that k8s release and magnum release.
09:26:33 That sounds like a good idea
09:26:52 dalees: so if a cloud has public templates created for users and has upgraded Magnum to a version past this patch, new clusters with existing templates <1.25 will not have PodSecurityPolicy
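(To illustrate the "publish working labels per release" idea and the PodSecurityPolicy concern above, here is a hedged sketch of a pre-1.25 template that pins its labels explicitly. The image, flavors, kube_tag value and the abbreviated admission plugin list are illustrative only, not an agreed set; quoting of the comma-separated list value may need adjusting for your client version.)

    # illustrative only: pin kube_tag and spell out the admission plugins so
    # existing template behaviour does not change when Magnum's defaults
    # move to v1.25+ (the real default admission_control_list is longer)
    openstack coe cluster template create k8s-v1.24-pinned \
      --coe kubernetes \
      --image fedora-coreos-37 \
      --external-network public \
      --master-flavor c2r4 \
      --flavor c4r8 \
      --labels kube_tag=v1.24.16-rancher1 \
      --labels admission_control_list="NodeRestriction,PodSecurityPolicy"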
09:28:35 dalees: I was working on that. it's a big-ish job because I needed to reformat the docs... and face it no one likes docs.
no one likes reviewing docs too :P
09:29:04 jakeyip: ah indeed, unless they define `admission_control_list` in older templates
09:29:18 jakeyip: I volunteer to review your docs
09:29:22 Just add me :)
09:29:26 thanks diablo_rojo :P
09:29:44 Lol of course :)
09:30:07 so the idea is instead of updating default labels for each version, which comes with its own problems, we will publish the working labels for each version
09:30:26 currently the docs format is that there is a 'default' for each version
09:30:50 Makes sense
09:31:03 this is the review I was working on https://review.opendev.org/c/openstack/magnum/+/881802
09:32:09 Tab is open :)
09:32:12 (and sorry I was wrong, it was v1.23 for Antelope)
09:32:41 Ahh got it.
09:34:05 since v1.23 is so old, in the Bobcat PTG we decided to support v1.25+ and remove PodSecurityPolicy (without a toggle / detection code to add PSP if k8s < 1.25)
Makes sense.
09:34:44 I attempted to write some logic / flags, but the fact that I was working to support an EOL k8s made that work discouraging very quickly
09:35:04 hopefully operators can understand why Bobcat is such a big change
09:35:11 Yeah. Their release cadence is insane.
09:37:05 so I guess my question is, how do we handle conformance given this info?
09:37:28 Bobcat release is 2023-10-04
09:37:33 Like long term?
09:37:44 Or right now?
09:38:19 right now, there is no Magnum version that supports a non-EOL version of K8S
09:39:02 heh, k8s 1.25 End of Life is 2023-10-28. So perhaps Bobcat should try to also ensure support for 1.26, else it'll only be relevant for a few weeks?
09:39:14 Hmmmm yeah okay I see your point.
09:39:54 (1) we backport the 'breaking' patch to Antelope. That will give us the ability to run conformance on Antelope on v1.25
09:40:07 (2) we wait until Oct and support v1.25 to v1.27
09:41:23 I would prefer option 1, but I understand that may not be the 'best' option
09:41:34 I think (2) is possible right now. (1) needs a bunch more work.
09:41:40 Right
09:42:01 Where 'best' is for magnum devs?
09:43:27 Hmmmm
09:44:11 Personally, I think our efforts will be best placed towards (2), so we can concentrate on the ClusterAPI efforts
09:44:35 That seems reasonable to me.
09:45:11 Personally, I hate it that Antelope is going to be a 'sad' release (only supporting v1.23), but we should just focus on CAPI to reduce the sadness period
09:45:41 ..the other issue is the whole Skip Level Upgrade Release process.
09:45:58 So well maybe not an issue but something to think about.
09:46:52 yeah... on the surface I find that will be difficult to support given the k8s release cadence
09:47:01 sorry - thinking out loud kinda.
09:47:11 Yeah definitely not a straightforward solution
09:47:46 SLURP will mean a yearly cycle? that'll be 4 K8S releases?
09:48:11 Basically yeah. SLURP means people will go from Antelope to C
09:48:11 hopefully with CAPI we don't need to worry about that :fingers-crossed:
09:48:15 skipping Bobcat.
09:48:24 Yeah that would be nice lol
09:48:34 yeah more work to be done there
09:48:59 we need help on tests. then we can tackle SLURP testing.
09:49:29 running out of time, so summarise?
09:49:30 I guess we advise folks using magnum to not do slurp with k8s for the short term
09:49:41 Sorry to hog the whole meeting.
09:49:48 I appreciate all of everyone's input!
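(For anyone picking up the "conformance on master" testing summarised below, a minimal devstack sketch, assuming a devstack checkout; the enable_plugin line is the standard Magnum devstack plugin, and everything else a real local.conf needs - host IP, extra plugins such as Heat depending on the driver - is omitted.)

    # minimal local.conf fragment for a devstack running Magnum from master
    cat > local.conf <<'EOF'
    [[local|localrc]]
    ADMIN_PASSWORD=secret
    DATABASE_PASSWORD=$ADMIN_PASSWORD
    RABBIT_PASSWORD=$ADMIN_PASSWORD
    SERVICE_PASSWORD=$ADMIN_PASSWORD
    enable_plugin magnum https://opendev.org/openstack/magnum master
    EOF
    ./stack.sh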
09:50:26 so to summarise, can you see if you can run conformance with master, and let us know?
09:50:29 So.
Basically we are currently stuck between a rock and a hard place, but for the meantime we will focus on CAPI and then when Bobcat releases, recert magnum.
09:50:37 Oh yeah I can look into that.
09:51:11 that will allow us to identify any issues, so when Bobcat gets cut we can recert it straight away
09:51:19 +2
09:51:22 sounds like a good plan to me
09:51:27 sounds good to me
09:51:54 Thank you everyone!
09:51:55 #action diablo_rojo to do conformance testing on master with k8s v1.25+
09:52:16 diablo_rojo: thanks for helping out, we need all the help we can get :)
09:52:47 #topic ClusterAPI
09:53:44 I have been reviewing some of the patches and testing CAPI out with varying levels of success
09:54:32 I think I may need some help to point me in the right direction to test - e.g. which patchset will work with devstack
09:55:20 we continue to also, and travisholton has flatcar functional, which we're keen to contribute.
09:56:40 dalees: have you been able to test it on devstack?
09:56:40 jakeyip: I think the top of the patch chain is https://review.opendev.org/c/openstack/magnum/+/884891/2 (aside from the flatcar addition)
09:56:45 jakeyip: the latest active patch is #880805
09:57:24 oh, travisholton will know better than I!
09:57:36 travisholton: will 880805 work in devstack?
09:57:43 884891 works as well.. just in WIP still
09:58:21 jakeyip: yes I have had it running in devstack. I did all the CAPI management setup manually though, which I believe is in the devstack scripts now.
09:58:23 jakeyip: yes I've been using patches in that set for a few weeks now
09:58:53 cool I will try to jump straight to 880805
09:58:53 the work that I've done to set up flatcar is based on those and that has been working as well
09:59:32 #action jakeyip to test CAPI with #880805
09:59:51 any other things related to CAPI?
10:00:03 hoping to see johnthetubaguy or Matt Pryor around soon, we have much to discuss on those patchsets and things to contribute
10:00:13 +1
10:01:37 +1, hopefully next week, if someone can ping them and let them know
10:02:18 I want to discuss a few more things, but better to have StackHPC here too. Such as: 1) What we agree to merge to start with, vs add features later. 2) helm chart upstream location (magnum owned - do we need a new repo?). 3) OCI support for helm charts, and a few other things ;) I will add to the agenda for next time
10:03:40 yes those are all important. I've created the placeholder for next week's agenda, please populate it before we forget https://etherpad.opendev.org/p/magnum-weekly-meeting
10:05:03 travisholton, dalees, diablo_rojo: we are overtime. anything else?
10:05:15 no not from me
10:05:21 None from me
10:05:25 all good for today, thanks jakeyip
10:05:26 thanks jakeyip !
10:05:36 thanks everyone for coming!
10:05:47 #endmeeting