Friday, 2025-07-18

opendevreviewOpenStack Proposal Bot proposed openstack/magnum-ui master: Imported Translations from Zanata  https://review.opendev.org/c/openstack/magnum-ui/+/955260 03:44
opendevreviewJasleen proposed openstack/magnum-capi-helm master: Support token based kubeconfig files for the capi management cluster  https://review.opendev.org/c/openstack/magnum-capi-helm/+/953956 13:06
opendevreviewJasleen proposed openstack/magnum-capi-helm master: Support token based kubeconfig files for the capi management cluster  https://review.opendev.org/c/openstack/magnum-capi-helm/+/953956 13:16
andrewbogott_Is there anyone here who knows about capi-helm? And/or is there a proper channel/mailing list/etc to discuss capi-helm or other magnum issues? (I know this channel is for devs but it's all I've got)14:50
atmarkandrewbogott_: It looks like you are missing cluster api resources in the management cluster. Which version of capi-helm-charts are you using? 15:01
atmarkDoes `kubectl api-resources | grep cluster.x-k8s.io` return anything?15:03
andrewbogott_atmark: regarding version, I have 'default_helm_chart_version=0.16.0' in magnum.conf, it seems to be getting the 0.16.0 chart.15:06
andrewbogott_(trying to find my kubectl context, one moment...)15:08
atmarkI'm on 0.16.0 too. No issues so far.  I encountered `ensure CRDs are installed first, resource mapping`  yesterday when I realized I forgot to install cluster api resources 15:08
andrewbogott_ok, I'm caught up15:10
atmarkDid you run `clusterctl init --core cluster-api:v1.9.6 --bootstrap kubeadm:v1.9.6 --control-plane kubeadm:v1.9.6 --infrastructure openstack:v0.11.3` ? 15:10
andrewbogott_indeed, kubectl api-resources | grep cluster.x-k8s.io doesn't return anything.15:10
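
For reference, a minimal sketch of the check being discussed, assuming kubectl is already pointed at the k3s management cluster: once Cluster API is installed, its CRDs show up under the cluster.x-k8s.io API group.

    # List Cluster API resource types registered in the management cluster
    kubectl api-resources | grep cluster.x-k8s.io
    # Equivalent view via the CRDs themselves; clusters, machines,
    # machinedeployments, machinesets, ... should all be listed
    kubectl get crds | grep cluster.x-k8s.io
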
andrewbogott_Looks like I did 'clusterctl init --infrastructure openstack' 15:11
andrewbogott_Not sure what guide I was following, trying to dig that back up...15:12
andrewbogott_heh, do the docs at https://docs.openstack.org/magnum-capi-helm/ even have a section about setting up the management cluster?15:13
andrewbogott_I'm running that clusterctl command you pasted now. Interested in if that's from a step-by-step guide I can follow when I rebuild this...15:14
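
A rough way to confirm the clusterctl init above worked, assuming the standard namespaces clusterctl creates (capi-system and capo-system also come up later in this conversation):

    # Each provider installed by clusterctl runs its controller in its own namespace
    kubectl get pods -n capi-system                          # core Cluster API
    kubectl get pods -n capi-kubeadm-bootstrap-system        # kubeadm bootstrap provider
    kubectl get pods -n capi-kubeadm-control-plane-system    # kubeadm control-plane provider
    kubectl get pods -n capo-system                          # Cluster API Provider OpenStack (CAPO)
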
atmarkNot that I'm aware of. I'm using a portion of this guide from devstack to set up the management cluster using k3s: https://opendev.org/openstack/magnum-capi-helm/src/branch/master/devstack/contrib/new-devstack.sh#L192-L269 15:15
andrewbogott_I'm using k3s too, so that's good.15:16
andrewbogott_But not devstack so I was ignoring that section15:16
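
For anyone following along outside devstack, this is roughly what the linked section of new-devstack.sh does, adapted to a standalone host. The install method and versions below are examples (versions taken from the clusterctl command above), not authoritative steps; defer to the script itself.

    # Install k3s as the management cluster and point kubectl at it
    curl -sfL https://get.k3s.io | sh -
    export KUBECONFIG=/etc/rancher/k3s/k3s.yaml
    # Install clusterctl (pick a release matching your provider versions)
    curl -L -o /usr/local/bin/clusterctl \
      https://github.com/kubernetes-sigs/cluster-api/releases/download/v1.9.6/clusterctl-linux-amd64
    chmod +x /usr/local/bin/clusterctl
    # Initialise Cluster API plus the OpenStack infrastructure provider
    clusterctl init --core cluster-api:v1.9.6 --bootstrap kubeadm:v1.9.6 \
      --control-plane kubeadm:v1.9.6 --infrastructure openstack:v0.11.3
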
andrewbogott_Anyway... after running your suggested command, coe cluster create produces the same error message as before15:16
andrewbogott_let's see if this paste is legible...15:17
andrewbogott_https://www.irccloud.com/pastebin/GV7RUzql/ 15:17
atmarkDoes `kubectl api-resources | grep cluster.x-k8s.io` return anything now?15:19
andrewbogott_oh, good question!15:20
andrewbogott_yes, lots15:20
atmarkHere's what I have running on my management cluster https://paste.openstack.org/show/bFpuMVDKiJvz2WpZRrkz/ 15:21
andrewbogott_Can I safely assume that magnum controller is talking to my management k8s service? Or could this be a simple network issue?15:21
andrewbogott_yeah, my api-resources output looks like yours15:22
atmarkandrewbogott_: Yes, magnum talks to the management cluster. 15:23
andrewbogott_ok, so at least I have an interesting problem :)15:24
atmarkwhich version is your magnum-capi-helm?15:24
andrewbogott_ii  python3-magnum-capi-helm             1.2.0-3~bpo12+1                      all          Magnum driver that uses Kubernetes Cluster API via Helm15:25
andrewbogott_Epoxy15:25
atmarkHave you tried creating a workload cluster without using Magnum? e.g. `helm upgrade my-cluster capi/openstack-cluster --install -f ./clouds.yaml -f ./cluster-configuration.yaml` 15:26
atmarkhttps://github.com/azimuth-cloud/capi-helm-charts/tree/main/charts/openstack-cluster#managing-a-workload-cluster 15:26
andrewbogott_I can copy/paste that command if you want but can't claim to understand what's happening :)15:27
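
For context, a sketch of the direct chart test being suggested. The helm repo URL is an assumption based on the azimuth-cloud GitHub organisation, and cluster-configuration.yaml is a values file holding the image, flavor and network settings described in the chart README.

    # Exercise the capi-helm openstack-cluster chart without Magnum in the loop
    helm repo add capi https://azimuth-cloud.github.io/capi-helm-charts   # assumed repo URL
    helm repo update
    helm upgrade my-cluster capi/openstack-cluster --install \
      -f ./clouds.yaml -f ./cluster-configuration.yaml
    # Watch the Cluster API objects reconcile
    kubectl get clusters,machines -A -w
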
atmarkI think I found your issue15:29
atmarkRun this for me please `helm list -A` 15:29
andrewbogott_https://www.irccloud.com/pastebin/aJH6iweA/ 15:30
atmarkIs your kubeconfig pointing to your management cluster?  15:32
atmarkIf so, you are missing the CAPI addon manager and janitor15:32
andrewbogott_you mean, as opposed to a different cluster?  k8s mgmt cluster is all I've got.15:32
andrewbogott_https://www.irccloud.com/pastebin/3kw5PTfq/ 15:33
atmarkYou are missing these charts https://paste.openstack.org/show/bcXGt9bUWZGKjIjULw5J/ 15:33
andrewbogott_ok! stay tuned...15:34
atmarkLook at the installation steps https://opendev.org/openstack/magnum-capi-helm/src/branch/master/devstack/contrib/new-devstack.sh#L192-L269 15:34
atmarkYou can even just copy paste those commands 15:35
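
A sketch of installing the two missing components. The chart names, repo URLs and namespaces below are assumptions based on the azimuth-cloud projects; the authoritative commands are the ones in the devstack script linked above.

    # CAPI addon manager (installs CNI, CSI, etc. into workload clusters)
    helm upgrade cluster-api-addon-provider cluster-api-addon-provider --install \
      --repo https://azimuth-cloud.github.io/cluster-api-addon-provider \
      --namespace capi-addon-system --create-namespace
    # Janitor that cleans up OpenStack resources when clusters are deleted
    helm upgrade cluster-api-janitor-openstack cluster-api-janitor-openstack --install \
      --repo https://azimuth-cloud.github.io/cluster-api-janitor-openstack \
      --namespace capi-janitor-system --create-namespace
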
atmarkBetter to reset your k3s installation first15:35
andrewbogott_So... just to clarify (and maybe editorialize) -- there are no docs about how to set this up, only a script to do so on devstack?15:41
andrewbogott_That's OK, I can certainly adapt that script. But I definitely didn't think "oh, I should totally read the devstack instructions, since there are 8 critical steps that are only mentioned there" :)15:41
andrewbogott_oops, gotta update my k8s.conf after the rebuild...15:43
andrewbogott_ok! It's getting further now. Will wait a bit and see where we land...15:45
andrewbogott_Seems to be 'CREATE_IN_PROGRESS' forever. But I will keep waiting :)15:55
andrewbogott_thank you for all the help, btw, atmark -- it looks like I'm getting much further, even if not to the finishline15:55
andrewbogott_yeah, seems stuck15:59
andrewbogott_oh, it made a VM though!16:00
atmark Just the script from devstack. I adapted the script too and installed it on a management cluster that's HA.  16:05
atmarkandrewbogott_: Glad it works16:05
andrewbogott_it made the controller node but seems to be hanging rather than making workers. Magnum seems to think it's still working on it but I imagine the actual capi process has long since errored out.16:06
atmarkCheck the logs of the capo-controller-manager-xxxx pod in the capo-system namespace  16:10
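
A minimal sketch of pulling those logs, with kubectl pointed at the management cluster:

    # Find the CAPO controller pod and follow its logs
    kubectl -n capo-system get pods
    kubectl -n capo-system logs deploy/capo-controller-manager --tail=100 -f
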
atmarkit might be that your project doesn't have resources anymore 16:12
andrewbogott_you mean, like, because of resource quotas? I don't think i'm over quota16:14
andrewbogott_Logs are pretty happy although I do see this16:15
andrewbogott_"Bootstrap data secret reference is not yet available" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackMachine" OpenStackMachine="magnum-admin/test-cluster-06-sp3akzg6hxqu-default-worker-4v2pj-rj7sr" namespace="magnum-admin" name="test-cluster-06-sp3akzg6hxqu-default-worker-4v2pj-rj7sr" reconcileID="a2069e7c-0af5-4be1-9691-7fa41fdd9184" 16:15
andrewbogott_openStackMachine="test-cluster-06-sp3akzg6hxqu-default-worker-4v2pj-rj7sr" machine="test-cluster-06-sp3akzg6hxqu-default-worker-4v2pj-rj7sr" cluster="test-cluster-06-sp3akzg6hxqu" openStackCluster="test-cluster-06-sp3akzg6hxqu"16:15
atmarkYes, resource quotas. Here are ways to debug: https://github.com/azimuth-cloud/capi-helm-charts/blob/main/charts/openstack-cluster/DEBUGGING.md 16:16
atmark`Bootstrap data secret reference is not yet available` do you have barbican installed?16:16
andrewbogott_My guess is that I have it installed but not working. And it's not installed at all in the prod cloud where I hope to deploy this.16:17
andrewbogott_Is it possible to run w/out it?  If not I'll go on a side-quest to set up a minimal version.16:17
andrewbogott_I'm going to rip out the endpoint in my test cloud and see what happens :)16:19
atmarkI thought Barbican was required for the magnum-capi-helm driver but it looks like it's not 16:25
atmarkMy bad16:25
andrewbogott_unclear, it's certainly acting like it's required :)16:26
atmarkYou can also check the logs for capi-controller-manager-xxx on capi-system namespace16:26
atmarkDid you build your own image?16:27
andrewbogott_no, upstream image16:28
andrewbogott_this is weird... controller-manager logs say16:29
andrewbogott_cluster is not reachable: Get \"https://185.15.57.22:6443 16:29
andrewbogott_but that's the IP of a different loadbalancer, unrelated to this cluster16:29
andrewbogott_in a different project16:29
andrewbogott_the lb it created is test-cluster-07-i3wtgzrhm4q6-control-plane-x8nkr, IP 10.0.0.24016:30
andrewbogott_hmmmm actually maybe I'm wrong, I don't know /what/ that IP is16:31
atmarkDid it assign a floating IP to the lb?16:31
* andrewbogott_ digs deeper16:31
andrewbogott_yes, you're right, it did.16:31
andrewbogott_So it is the right IP for the managed lb. But the lb is in an error state.16:31
andrewbogott_Also, I wish it wouldn't use that subnet for the lb IP. Maybe that's configured in the template somehow?16:32
atmarkYes it's configurable in the template 16:32
atmark--external-network 16:32
andrewbogott_ok, starting over with that :)  16:32
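
A sketch of recreating the template with an explicit external network. --external-network is the flag being discussed; the template name, image, flavors and node count are placeholders.

    # Cluster template pinning the external network used for floating IPs / the API LB
    openstack coe cluster template create my-capi-template \
      --coe kubernetes \
      --image my-capi-image \
      --external-network my-public-net \
      --master-flavor m1.medium --flavor m1.medium
    openstack coe cluster create test-cluster \
      --cluster-template my-capi-template --node-count 2
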
andrewbogott_while I'm in here I guess I should change --network-driver from flannel (the default, I guess?) to calico?16:34
atmarkThe capi driver doesn't respect --network-driver so it doesn't matter what it's set to. 16:36
andrewbogott_ok :)16:37
atmarkI have set it to Calico but in the cluster I'm running Cilium16:37
andrewbogott_lol, will it /only/ create clusters with floating IPs assigned? I can't unset external_network and also can't set it to anything other than a floating range...16:43
andrewbogott_I guess I need to make a 'fake' floating IP range with internal IPs but I'd rather it just skipped it16:44
atmarkHere are the options in the template that it respects: https://opendev.org/openstack/magnum-capi-helm/src/commit/3159a017a61b42a315ac0a2aa9383b317965d0ae/doc/source/configuration/index.rst 16:44
atmarkandrewbogott_: there's no logic in the driver to disable floating IPs, so your only choice is to edit the values.yaml in the charts16:48
atmarkYou have to set this line to false https://github.com/azimuth-cloud/capi-helm-charts/blob/main/charts/openstack-cluster/values.yaml#L190 16:48
andrewbogott_makes sense16:49
andrewbogott_I'll live with it consuming a floating IP for now and see if I can figure out what connectivity wasn't working16:49
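
If floating IPs do need to be disabled later, a sketch of the kind of override involved. The key name here is a guess and should be confirmed against the values.yaml line linked above; when Magnum drives the install, the change has to be carried in the chart values the driver uses rather than passed on the command line.

    # Assumed override; verify the real key at the linked values.yaml line
    cat > no-fip-overrides.yaml <<'EOF'
    apiServer:
      associateFloatingIp: false
    EOF
    # Only applicable when installing the chart directly, outside Magnum:
    helm upgrade my-cluster capi/openstack-cluster --install \
      -f ./clouds.yaml -f ./cluster-configuration.yaml -f ./no-fip-overrides.yaml
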
atmarkDo you allow NAT hairpinning? It looks like your workers couldn't reach that public IP from inside. 16:52
atmarkWorkers will connect to https://185.15.57.22:6443 endpoint to join the cluster. 16:54
andrewbogott_It should work, assuming the firewall rules are set up properly. Does magnum manage that or do I need to set up default security groups somehow?16:57
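
A quick way to test the hairpin path, run from inside one of the cluster VMs; the IP and port are the API endpoint from the controller-manager log above.

    # Any TLS/TCP response means the path from the VM back to the VIP works
    nc -vz 185.15.57.22 6443
    # or, without netcat
    curl -k https://185.15.57.22:6443/version
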
andrewbogott_Seems weird that they use the public network + the load balancer for what is clearly internal communication though.16:58
andrewbogott_*they16:58
atmarkBy default, Magnum manages the security groups for the workload clusters. Are you using Octavia? 17:00
andrewbogott_yes17:00
atmarkDoes the Octavia listener have any ACLs?  17:01
andrewbogott_I don't think so. The pool is in error state though, let me investigate that a bit...17:03
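
A sketch of digging into that error state with the Octavia CLI; the IDs are placeholders for the load balancer CAPO created.

    openstack loadbalancer list
    openstack loadbalancer show <lb-id>
    openstack loadbalancer status show <lb-id>            # full tree: listeners, pools, members
    openstack loadbalancer pool list --loadbalancer <lb-id>
    openstack loadbalancer listener show <listener-id>    # allowed_cidrs shows any ACL
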
andrewbogott_I set an internal network rather than leaving it on default and now it's making worker nodes!17:40
atmarkNice17:50
atmarkI set mine to use an existing network as well17:51
atmarkbtw, you can also reach the magnum devs on Kubernetes Slack in the #openstack-magnum channel17:55
andrewbogott_oh good to know!17:56
andrewbogott_Right now everything is looking good but the status still says create_in_progress.  And my apparently very-underpowered management cluster is struggling so I'm trying to be patient :)17:57
atmarkandrewbogott_: that's a bug in the driver https://bugs.launchpad.net/magnum/+bug/2115207. As you can see in the comment section, someone uploaded a patch to fix that bug. I patched mine. 18:01
atmarkThere's a patch from upstream waiting to be merged https://review.opendev.org/c/openstack/magnum-capi-helm/+/950806 18:02
andrewbogott_oh, so it will never not say 'create in progress'?18:02
atmarkThe patch is exactly the same. It's a single change in the code.18:03
atmarkThe status should transition to CREATE_COMPLETE 18:05
* andrewbogott_ patches18:05
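
One way to pull the pending change down locally, assuming git-review is set up for OpenDev; how you then roll it into the installed python3-magnum-capi-helm package depends on your packaging.

    git clone https://opendev.org/openstack/magnum-capi-helm
    cd magnum-capi-helm
    git review -d 950806        # fetches the change linked above
    # apply the diff to the installed driver (or rebuild the package),
    # then restart magnum-conductor
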
andrewbogott_what do you think, atmark, healthy or unhealthy?19:07
andrewbogott_https://www.irccloud.com/pastebin/DNczdr72/ 19:07
andrewbogott_magnum still thinks it's creating but it sounds like that is not reliable.19:08
atmarkDid you patch the driver?19:33
atmarkWhat does this command show you? `openstack coe cluster show -c health_status_reason $name` 19:36
-opendevstatus- NOTICE: The Gerrit service on review.opendev.org will be offline briefly for a configuration and version update, but should return to service momentarily20:06
andrewbogott_I did patch the driver.  health status displays as {}20:41
andrewbogott_or, I mean, health_status_reason20:41
atmarkis the status still stuck in CREATE_IN_PROGRESS? 21:15
atmarkNot sure why openstack-cinder-csi-controllerplugin-7b66d6bb65-6v787 pod keeps crashing 21:16
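
A sketch of debugging that crashing pod. It runs in the workload cluster, so this assumes a kubeconfig fetched for it (e.g. via `openstack coe cluster config`), and the namespace placeholder is whatever `kubectl get pods -A` reports.

    # Locate the pod, then read the crashed container's logs and events
    kubectl get pods -A | grep cinder-csi
    kubectl -n <namespace> describe pod openstack-cinder-csi-controllerplugin-7b66d6bb65-6v787
    kubectl -n <namespace> logs openstack-cinder-csi-controllerplugin-7b66d6bb65-6v787 \
      --previous --all-containers
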
