Nick | Message | Time |
---|---|---|
diablo_rojo | Meeting in ten min here right? for magnum? | 08:50 |
dalees | diablo_rojo: yes, depends who is available today. | 08:53 |
travisholton | o/ | 08:57 |
jakeyip | hi all I am back! :) | 08:57 |
jakeyip | yes there will be meeting today | 08:57 |
jakeyip | please feel free to populate the agenda https://etherpad.opendev.org/p/magnum-weekly-meeting | 08:59 |
dalees | welcome back jakeyip, a good holiday i hope | 08:59 |
diablo_rojo | Welcome back jakeyip :) | 08:59 |
jakeyip | yeah it was great. | 09:00 |
jakeyip | #startmeeting magnum | 09:00 |
opendevmeet | Meeting started Wed Jul 19 09:00:33 2023 UTC and is due to finish in 60 minutes. The chair is jakeyip. Information about MeetBot at http://wiki.debian.org/MeetBot. | 09:00 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 09:00 |
opendevmeet | The meeting name has been set to 'magnum' | 09:00 |
jakeyip | #topic Roll Call | 09:00 |
jakeyip | o/ | 09:00 |
dalees | o/ | 09:00 |
travisholton | o/ | 09:00 |
diablo_rojo | o/ | 09:01 |
jakeyip | thanks all for joining the meeting | 09:02 |
jakeyip | Agenda: | 09:02 |
jakeyip | #link https://etherpad.opendev.org/p/magnum-weekly-meeting | 09:02 |
jakeyip | #topic k8s conformance testing | 09:03 |
jakeyip | let's start with this. can someone take over? | 09:03 |
diablo_rojo | Sure lol. | 09:04 |
diablo_rojo | So, basically for the last dozen or so k8s releases, up till 1.24 openstack magnum has been certified as a provider for k8s | 09:04 |
diablo_rojo | There's a set of conformance tests that were run by lxkong for a while and then guilhermesp_____ for a while | 09:05 |
diablo_rojo | (thank you both for keeping up on that for so long) | 09:05 |
diablo_rojo | So, as of early May we fell out of conformance - what with k8s releasing every 3 months it doesn't take long for things to expire | 09:06 |
diablo_rojo | k8s is getting ready to release 1.28 (in August I think) so it would be good to target that or at least 1.27 to get back on their list of certified providers. | 09:06 |
diablo_rojo | That's step 1. | 09:06 |
diablo_rojo | Step 2 would be to get a periodic job set up to run the conformance tests so that we A. keep track of when they merge things that we should be aware of and B. don't have to manually run the tests anymore and can just pull logs from that to submit when the time comes. | 09:07 |
diablo_rojo | #link https://github.com/cncf/k8s-conformance/tree/master Conformance Info | 09:08 |
diablo_rojo | #link https://github.com/cncf/k8s-conformance/tree/master/v1.24/openstack-magnum Our last passing conformance application thingy | 09:08 |
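For anyone picking this work up: the CNCF conformance runs referenced above are normally driven with Sonobuoy, per the instructions in the cncf/k8s-conformance repo. A minimal sketch of that workflow against an existing Magnum-built cluster follows; the kubeconfig path is illustrative, and the exact flags should be checked against the Sonobuoy version matching the k8s release being certified.

```shell
# Point kubectl/sonobuoy at the Magnum-built cluster (path is illustrative)
export KUBECONFIG=~/clusters/magnum-conformance/config

# Run the certified-conformance suite (this takes an hour or more)
sonobuoy run --mode=certified-conformance --wait

# Retrieve the results tarball; e2e.log and junit_01.xml from it are what
# the cncf/k8s-conformance PR needs alongside PRODUCT.yaml and a README
results=$(sonobuoy retrieve)
sonobuoy results "$results"
mkdir -p ./results && tar -xf "$results" -C ./results

# Clean up the test namespaces afterwards
sonobuoy delete --wait
```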
diablo_rojo | Now, the last time guilhermesp_____ tried to run the tests with the latest magnum (I think it was Antelope), it didn't pass with k8s 1.25 | 09:09 |
dalees | So passing 1.25 and 1.26 should be fairly straightforward; I submitted Catalyst Cloud's Magnum 1.25 a while back: https://github.com/cncf/k8s-conformance/pull/2414 | 09:09 |
diablo_rojo | Unfortunately, I don't have his logs on me to tell you what the issue was. | 09:09 |
diablo_rojo | Oh sweet | 09:09 |
diablo_rojo | That is promising | 09:09 |
dalees | there were only minor changes required, most of these changes are merged now. | 09:09 |
dalees | (if not all) | 09:09 |
diablo_rojo | Oh even better then | 09:10 |
diablo_rojo | So I guess my ask is: does catalyst cloud run vanilla openstack magnum, or do you have extra stuff added into it? | 09:10 |
dalees | This relates to the Magnum Heat driver, of course - we are migrating to the Magnum CAPI driver and will want to remain passing conformance, but there's some version after which we won't be looking for conformance for Magnum Heat. | 09:11 |
diablo_rojo | Yeah this rings a bell. I had talked to Matt Pryor about this at the summit a little I think. | 09:11 |
dalees | we run Magnum Wallaby, with several extra patches. I've gone through them recently and only a couple need to go upstream that relate to conformance. | 09:11 |
diablo_rojo | Sounded like vexxhost and stackhpc had made different drivers so neither of them were running pure antelope magnum | 09:12 |
diablo_rojo | dalees: oh okay that doesn't sound so bad | 09:12 |
jakeyip | for me, I have been testing with devstack and v1.25 in Antelope and v1.27 in Bobcat (IIRC) | 09:12 |
diablo_rojo | jakeyip: would you be able to run the conformance tests with that environment? | 09:13 |
jakeyip | it's not conformance though, just a basic set of tests as the environment I have will probably not be big enough | 09:13 |
jakeyip | dalees: what's the environment that you use to run? | 09:13 |
diablo_rojo | Ahhh got it - yeah that was my issue actually - hence my looking for an openstack provider that is running pure magnum, and my desire to get it running as a periodic non-voting gate job | 09:14 |
diablo_rojo | So it would be true magnum | 09:14 |
jakeyip | hm ping mnasiadka (sorry I forgot) | 09:14 |
diablo_rojo | and the latest magnum to boot | 09:14 |
dalees | I use our preproduction environment for conformance submissions, so real metal. | 09:14 |
jakeyip | disk/ram? | 09:14 |
dalees | I think the tests just required a couple of control plane nodes and a couple of workers. I created the control plane with c2r4 nodes and c4r8 workers | 09:18 |
jakeyip | since guilhermesp_____ was the last to do it, is it possible that we contact them to find out (1) what error they were having and maybe solve that? | 09:18 |
dalees | (err, 3x control plane, not 2) | 09:18 |
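For context, a cluster of roughly that shape can be built with the standard Magnum CLI; a sketch follows, assuming an existing cluster template. The template, keypair and flavor names (c2r4 / c4r8 are Catalyst Cloud-style names) are illustrative.

```shell
# Sketch only: create a conformance-sized cluster (3 control plane nodes,
# 2 workers). Template, keypair and flavor names are illustrative.
openstack coe cluster create conformance-test \
  --cluster-template k8s-v1.25 \
  --keypair mykey \
  --master-count 3 \
  --master-flavor c2r4 \
  --node-count 2 \
  --flavor c4r8

# Fetch a kubeconfig for the new cluster once it reaches CREATE_COMPLETE
openstack coe cluster config conformance-test --dir ~/clusters/conformance-test
```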
diablo_rojo | dalees: yeah I think so | 09:18 |
diablo_rojo | jakeyip: yeah I asked for his logs in that thread.. I will go back and see if he had them. | 09:19 |
dalees | jakeyip: guilhermesp_____ emailed on 6th May with "I am trying to run conformance against 1.25 and 1.26 now, but it looks like we are still with this ongoing? https://review.opendev.org/c/openstack/magnum/+/874092 Im still facing issues to create the cluster due to "PodSecurityPolicy\" is unknown." | 09:19 |
jakeyip | oh | 09:19 |
diablo_rojo | I don't think we need to test for everything between our last cert and the current cert, for the record. | 09:19 |
diablo_rojo | Oh yeah! | 09:19 |
diablo_rojo | thanks dalees :) | 09:19 |
dalees | so that is merged | 09:19 |
jakeyip | yeah of cos it's PodSecurityPolicy | 09:21 |
diablo_rojo | Lol | 09:21 |
diablo_rojo | Was that merged in antelope? | 09:21 |
diablo_rojo | Or its in bobcat/master? | 09:22 |
jakeyip | no, Bobcat | 09:22 |
diablo_rojo | Okay that makes sense. | 09:22 |
jakeyip | so the issue is that that patch breaks compatibility between k8s <1.25 and >=1.25 | 09:22 |
diablo_rojo | I will ask guilhermesp_____ to get master and run the tests again and hopefully we will be good to go. That solves part 1 I think :) | 09:22 |
diablo_rojo | Oh | 09:23 |
diablo_rojo | so vexxhost would need to be running sometyhing greater than 1.25 | 09:23 |
diablo_rojo | for it to pass/work with master magnum? | 09:23 |
dalees | jakeyip: does it? we run 1.23, 1.24 and 1.25 currently, maybe we have some other patches (or our templates explicitly list everything needed for older ones to function still) | 09:24 |
jakeyip | in the Antelope cycle we didn't want to break compatibility, partly because we respect the OpenStack deprecation cycle, so we needed to do some comms first etc | 09:24 |
jakeyip | yeah it should be working with master / bobcat | 09:25 |
dalees | I wonder if we should publish a set of magnum templates with labels set to working versions for that k8s release and magnum release. | 09:26 |
diablo_rojo | That sounds like a good idea | 09:26 |
jakeyip | dalees: so if a cloud has public templates created for users and has upgraded Magnum to a version past this patch, new clusters built from existing <1.25 templates will not have PodSecurityPolicy | 09:26 |
jakeyip | dalees: I was working on that. it's a big-ish job because I needed to reformat the docs... and face it, no one likes docs. no one likes reviewing docs either :P | 09:28 |
dalees | jakeyip: ah indeed, unless they define `admission_control_list` in older templates | 09:29 |
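To make that point concrete: a pre-1.25 template can keep working against a Magnum that has dropped PodSecurityPolicy from its defaults if the template pins its own admission plugins via the `admission_control_list` label. A hedged sketch follows; the template name, image, network, flavors and the exact plugin list are illustrative and should be taken from the docs for the Magnum version in use, and note that some client versions are picky about comma-separated label values.

```shell
# Sketch: pin the admission plugins on a pre-1.25 template explicitly,
# so the template no longer depends on Magnum's (changed) default list.
# Image, network, flavor and label values are illustrative.
openstack coe cluster template create k8s-v1.24-pinned \
  --coe kubernetes \
  --image fedora-coreos-35 \
  --external-network public \
  --master-flavor c2r4 --flavor c4r8 \
  --labels kube_tag=v1.24.16,admission_control_list="NodeRestriction,NamespaceLifecycle,LimitRanger,ServiceAccount,ResourceQuota,PodSecurityPolicy"
```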
diablo_rojo | jakeyip: I volunteer to review your docs | 09:29 |
diablo_rojo | Just add me :) | 09:29 |
jakeyip | thanks diablo_rojo :P | 09:29 |
diablo_rojo | Lol of course :) | 09:29 |
jakeyip | so the idea is instead of updating default labels for each version, which comes with its own problems, we will publish the working labels for each version | 09:30 |
jakeyip | currently the docs format is that there is a 'default' for each version | 09:30 |
diablo_rojo | Makes sense | 09:30 |
jakeyip | this is the review I was working on https://review.opendev.org/c/openstack/magnum/+/881802 | 09:31 |
diablo_rojo | Tab is open :) | 09:32 |
jakeyip | (and sorry I was wrong, it was v1.23 for Antelope) | 09:32 |
diablo_rojo | Ahh got it. | 09:32 |
jakeyip | since v1.23 is so old, at the Bobcat PTG we decided to support v1.25+ and remove PodSecurityPolicy (without a toggle / detection code to add PSP if k8s <v1.25). | 09:34 |
diablo_rojo | Makes sense. | 09:34 |
jakeyip | I attempted to write some logic / flags, but the fact that I am working to support an EOL k8s made that work discouraging very quickly | 09:34 |
jakeyip | hopefully operators can understand why Bobcat is such a big change | 09:35 |
diablo_rojo | Yeah. Their release cadence is insane. | 09:35 |
jakeyip | so I guess my question is, how do we handle conformance given this info? | 09:37 |
jakeyip | Bobcat release is 2023-10-04 | 09:37 |
diablo_rojo | Like long term? | 09:37 |
diablo_rojo | Or right now? | 09:37 |
jakeyip | right now, there is no Magnum version that supports a non-EOL version of K8S | 09:38 |
dalees | heh, k8s 1.25 End of Life is 2023-10-28. So perhaps Bobcat should try to also ensure support for 1.26, else it'll only be relevant for a few weeks? | 09:39 |
diablo_rojo | Hmmmm yeah okay I see your point. | 09:39 |
jakeyip | (1) we backport the 'breaking' patch to Antelope. That will give us the ability to run conformance on Antelope with v1.25 | 09:39 |
jakeyip | (2) we wait until Oct and support v1.25 to v1.27 | 09:40 |
diablo_rojo | I would prefer option 1, but I understand that may not be the 'best' option | 09:41 |
jakeyip | I think (2) is possible right now. (1) needs a bunch more work. | 09:41 |
diablo_rojo | Right | 09:41 |
diablo_rojo | Where 'best' is for magnum devs? | 09:42 |
diablo_rojo | Hmmmm | 09:43 |
jakeyip | Personally, I think our efforts will be best placed towards (2), so we can concentrate on the ClusterAPI efforts | 09:44 |
diablo_rojo | That seems reasonable to me. | 09:44 |
jakeyip | Personally, I hate that Antelope is going to be a 'sad' release (only supporting v1.23), but we should just focus on CAPI to reduce the sadness period | 09:45 |
diablo_rojo | ..the other issue is the whole Skip Level Upgrade Release process. | 09:45 |
diablo_rojo | So well maybe not an issue but something to think about. | 09:45 |
jakeyip | yeah... on the surface I find that will be difficult to support given k8s release cadence | 09:46 |
diablo_rojo | sorry - thinking out loud kinda. | 09:47 |
diablo_rojo | Yeah definitely not a straightforward solution | 09:47 |
jakeyip | SLURP will mean a yearly cycle? that'll be 4 K8S releases? | 09:47 |
diablo_rojo | Basically yeah. SLURP means people will go from Antelope to C | 09:48 |
jakeyip | hopefully with CAPI we don't need to worry about that :fingers-crossed: | 09:48 |
diablo_rojo | skipping Bobcat. | 09:48 |
diablo_rojo | Yeah that would be nice lol | 09:48 |
jakeyip | yeah more work to be done there | 09:48 |
jakeyip | we need help on tests. then we can tackle SLURP testing. | 09:48 |
jakeyip | running out of time, so summarise? | 09:49 |
diablo_rojo | I guess we advise folks using magnum to not do slurp with k8s for the short term | 09:49 |
diablo_rojo | Sorry to hog the whole meeting. | 09:49 |
diablo_rojo | I appreciate all of everyone's input! | 09:49 |
jakeyip | so to summarise, can you see if you can run conformance with master, and let us know? | 09:50 |
diablo_rojo | So. Basically we are currently stuck between a rock and a hard place, but for the meantime we will focus on CAPI and then when bobcat releases, recert magnum. | 09:50 |
diablo_rojo | Oh yeah I can look into that. | 09:50 |
jakeyip | that will allow us to identify any issues, so when Bobcat gets cut we can recert it straightaway | 09:51 |
diablo_rojo | +2 | 09:51 |
diablo_rojo | sounds like a good plan to me | 09:51 |
dalees | sounds good to me | 09:51 |
diablo_rojo | Thank you everyone! | 09:51 |
jakeyip | #action diablo_rojo to do conformance testing on master with k8s v1.25+ | 09:51 |
jakeyip | diablo_rojo: thanks for helping out, we need all the help we can get :) | 09:52 |
jakeyip | #topic ClusterAPI | 09:52 |
jakeyip | I have been reviewing some of the patches and testing CAPI out with varying levels of success | 09:53 |
jakeyip | I think I may need some help to point me in the right direction to test - e.g. which patchset will work with devstack | 09:54 |
dalees | we continue to test it also, and travisholton has flatcar functional, which we're keen to contribute. | 09:55 |
jakeyip | dalees: have you been able to test it on devstack? | 09:56 |
dalees | jakeyip: I think the top of the patch chain is https://review.opendev.org/c/openstack/magnum/+/884891/2 (aside from flatcar addition) | 09:56 |
travisholton | jakeyip: the latest active patch is #880805 | 09:56 |
dalees | oh, travisholton will know better than I! | 09:57 |
jakeyip | travisholton: will 880805 work in devstack? | 09:57 |
travisholton | 884891 works as well..just in WIP still | 09:57 |
dalees | jakeyip: yes I have had it running in devstack. I did all the CAPI management setup manually though, which I believe is in the devstack scripts now. | 09:58 |
travisholton | jakeyip: yes I've been using patches in that set for a few weeks now | 09:58 |
jakeyip | cool I will try to jump straight to 880805 | 09:58 |
travisholton | the work that I've done to set up flatcar is based on those and that has been working as well | 09:58 |
jakeyip | #action jakeyip to test CAPI with #880805 | 09:59 |
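For anyone following along, pulling one of those Gerrit changes into a devstack-based Magnum tree for testing usually looks something like the sketch below. Change 880805 is the one referenced above; the checkout path and systemd unit names are assumptions based on a typical devstack deployment with the magnum plugin enabled.

```shell
# Sketch: check out the CAPI driver change under review into a devstack
# magnum tree and restart the services. The path and service names are
# assumptions for a typical devstack setup with the magnum plugin enabled.
cd /opt/stack/magnum

# git-review can download a change by its number (latest patchset)
git review -d 880805

# Restart the Magnum services so they pick up the new code
sudo systemctl restart devstack@magnum-api.service devstack@magnum-cond.service
```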
jakeyip | any other things related to CAPI? | 09:59 |
dalees | hoping to see johnthetubaguy or Matt Pryor around soon, we have much to discuss on those patchsets and things to contribute | 10:00 |
travisholton | +1 | 10:00 |
jakeyip | +1, hopefully next week, if someone can ping them and let them know | 10:01 |
dalees | I want to discuss a few more things, but better to have StackHPC here too. Such as: 1) What we agree to merge to start with, vs add features later. 2) helm chart upstream location (magnum owned - do we need a new repo?). 3) oci support for helm charts, and a few other things ;) I will add to agenda for next time | 10:02 |
jakeyip | yes those are all important. I've created the placeholder for next week's agenda, please populate before we forget https://etherpad.opendev.org/p/magnum-weekly-meeting | 10:03 |
jakeyip | travisholton, dalees, diablo_rojo: we are overtime. anything else? | 10:05 |
travisholton | no not from me | 10:05 |
diablo_rojo | None from me | 10:05 |
dalees | all good for today, thanks jakeyip | 10:05 |
diablo_rojo | thanks jakeyip ! | 10:05 |
jakeyip | thanks everyone for coming! | 10:05 |
jakeyip | #endmeeting | 10:05 |
opendevmeet | Meeting ended Wed Jul 19 10:05:47 2023 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 10:05 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/magnum/2023/magnum.2023-07-19-09.00.html | 10:05 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/magnum/2023/magnum.2023-07-19-09.00.txt | 10:05 |
opendevmeet | Log: https://meetings.opendev.org/meetings/magnum/2023/magnum.2023-07-19-09.00.log.html | 10:05 |
jakeyip | good talk everyone. see you next week! I'll be around for 15 mins or so, if anyone needs me | 10:07 |