opendevreview | OpenStack Proposal Bot proposed openstack/magnum-ui master: Imported Translations from Zanata https://review.opendev.org/c/openstack/magnum-ui/+/955260 | 03:44 |
opendevreview | Jasleen proposed openstack/magnum-capi-helm master: Support token based kubeconfig files for the capi management cluster https://review.opendev.org/c/openstack/magnum-capi-helm/+/953956 | 13:06 |
opendevreview | Jasleen proposed openstack/magnum-capi-helm master: Support token based kubeconfig files for the capi management cluster https://review.opendev.org/c/openstack/magnum-capi-helm/+/953956 | 13:16 |
andrewbogott_ | Is there anyone here who knows about capi-helm? And/or is there a proper channel/mailing list/etc to discuss capi-helm or other magnum issues? (I know this channel is for devs but it's all I've got) | 14:50 |
atmark | andrewbogott_: It looks like you are missing cluster api resources in the management cluster. Which version of capi-helm-charts are you using? | 15:01 |
atmark | Does `kubectl api-resources | grep cluster.x-k8s.io` return anything? | 15:03 |
andrewbogott_ | atmark: regarding version, I have 'default_helm_chart_version=0.16.0' in magnum.conf, and it seems to be getting the 0.16.0 chart. | 15:06 |
andrewbogott_ | (trying to find my kubectl context, one moment...) | 15:08 |
atmark | I'm on 0.16.0 too. No issues so far. I encountered `ensure CRDs are installed first, resource mapping` yesterday when I realized I forgot to install cluster api resources | 15:08 |
andrewbogott_ | ok, I'm caught up | 15:10 |
atmark | Did you run `clusterctl init --core cluster-api:v1.9.6 --bootstrap kubeadm:v1.9.6 --control-plane kubeadm:v1.9.6 --infrastructure openstack:v0.11.3` ? | 15:10 |
andrewbogott_ | indeed, kubectl api-resources | grep cluster.x-k8s.io doesn't return anything. | 15:10 |
andrewbogott_ | Looks like I did 'clusterctl init --infrastructure openstack' | 15:11 |
andrewbogott_ | Not sure what guide I was following, trying to dig that back up... | 15:12 |
andrewbogott_ | heh, do the docs at https://docs.openstack.org/magnum-capi-helm/ even have a section about setting up the management cluster? | 15:13 |
andrewbogott_ | I'm running that clusterctl command you pasted now. Interested in if that's from a step-by-step guide I can follow when I rebuild this... | 15:14 |
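A minimal verification sketch for this step, assuming the kubeconfig points at the k3s management cluster; the namespaces are the defaults that clusterctl creates:

```bash
# Cluster API CRDs should now be registered
kubectl api-resources | grep cluster.x-k8s.io

# Controller pods for core CAPI, the kubeadm bootstrap/control-plane providers
# and the OpenStack provider should all be Running
kubectl get pods -n capi-system
kubectl get pods -n capi-kubeadm-bootstrap-system
kubectl get pods -n capi-kubeadm-control-plane-system
kubectl get pods -n capo-system
```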
atmark | Not that I'm aware of. I'm using a portion of this from devstack to set up the management cluster using k3s https://opendev.org/openstack/magnum-capi-helm/src/branch/master/devstack/contrib/new-devstack.sh#L192-L269 | 15:15 |
atmark | portion of this guide* | 15:15 |
andrewbogott_ | I'm using k3s too, so that's good. | 15:16 |
andrewbogott_ | But not devstack so I was ignoring that section | 15:16 |
andrewbogott_ | Anyway... after running your suggested command, coe cluster create produces the same error message as before | 15:16 |
andrewbogott_ | let's see if this paste is legible... | 15:17 |
andrewbogott_ | https://www.irccloud.com/pastebin/GV7RUzql/ | 15:17 |
atmark | Does `kubectl api-resources | grep cluster.x-k8s.io` return anything now? | 15:19 |
andrewbogott_ | oh, good question! | 15:20 |
andrewbogott_ | yes, lots | 15:20 |
atmark | Here's what I have running on my management cluster https://paste.openstack.org/show/bFpuMVDKiJvz2WpZRrkz/ | 15:21 |
andrewbogott_ | Can I safely assume that magnum controller is talking to my management k8s service? Or could this be a simple network issue? | 15:21 |
andrewbogott_ | yeah, my api-resources output looks like yours | 15:22 |
atmark | andrewbogott_: Yes, magnum talks to the management cluster. | 15:23 |
andrewbogott_ | ok, so at least I have an interesting problem :) | 15:24 |
atmark | which version is your magnum-capi-helm? | 15:24 |
andrewbogott_ | ii python3-magnum-capi-helm 1.2.0-3~bpo12+1 all Magnum driver that uses Kubernetes Cluster API via Helm | 15:25 |
andrewbogott_ | Epoxy | 15:25 |
atmark | Have you tried creating a workload cluster without using Magnum? e.g. helm upgrade my-cluster capi/openstack-cluster --install -f ./clouds.yaml -f ./cluster-configuration.yaml | 15:26 |
atmark | https://github.com/azimuth-cloud/capi-helm-charts/tree/main/charts/openstack-cluster#managing-a-workload-cluster | 15:26 |
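For reference, a rough sketch of that standalone test following the linked README; the Helm repo URL and the keys in the values file are assumptions here and should be checked against the chart's documented values:

```bash
# clouds.yaml: an application credential for the target project
# cluster-configuration.yaml: a minimal cluster spec, e.g. (keys illustrative):
#
#   kubernetesVersion: 1.30.0
#   machineImageId: <glance-image-uuid>
#   controlPlane:
#     machineFlavor: <flavor-name>
#   nodeGroups:
#     - name: default-worker
#       machineFlavor: <flavor-name>
#       machineCount: 2
#
helm repo add capi https://azimuth-cloud.github.io/capi-helm-charts   # repo URL assumed; see README above
helm upgrade my-cluster capi/openstack-cluster --install \
  -f ./clouds.yaml -f ./cluster-configuration.yaml
```

If this works outside Magnum, the problem is on the Magnum/driver side; if it fails the same way, the problem is in the management cluster or the OpenStack project.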
andrewbogott_ | I can copy/paste that command if you want but can't claim to understand what's happening :) | 15:27 |
atmark | I think I found your issue | 15:29 |
atmark | Run this for me please `helm list -A` | 15:29 |
andrewbogott_ | https://www.irccloud.com/pastebin/aJH6iweA/ | 15:30 |
atmark | Is your kubeconfig pointing to your management cluster? | 15:32 |
atmark | If so, you are missing CAPI addon manager and janitor | 15:32 |
andrewbogott_ | you mean, as opposed to a different cluster? k8s mgmt cluster is all I've got. | 15:32 |
andrewbogott_ | https://www.irccloud.com/pastebin/3kw5PTfq/ | 15:33 |
atmark | You are missing these charts https://paste.openstack.org/show/bcXGt9bUWZGKjIjULw5J/ | 15:33 |
andrewbogott_ | ok! stay tuned... | 15:34 |
atmark | Look at the installation steps https://opendev.org/openstack/magnum-capi-helm/src/branch/master/devstack/contrib/new-devstack.sh#L192-L269 | 15:34 |
atmark | You can even just copy paste those commands | 15:35 |
atmark | Better to reset your k3s installation first | 15:35 |
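The two missing pieces correspond roughly to the charts below, based on the devstack script linked above; the repo URLs, release names and namespaces here are assumptions, so copy the exact commands from the script itself:

```bash
# CAPI addon manager (installs CNI, CSI and other addons into workload clusters)
helm repo add capi-addons https://azimuth-cloud.github.io/cluster-api-addon-provider   # URL assumed
helm upgrade cluster-api-addon-provider capi-addons/cluster-api-addon-provider \
  --install --namespace capi-addon-system --create-namespace

# Janitor that cleans up leftover OpenStack resources when clusters are deleted
helm repo add capi-janitor https://azimuth-cloud.github.io/cluster-api-janitor-openstack   # URL assumed
helm upgrade cluster-api-janitor capi-janitor/cluster-api-janitor-openstack \
  --install --namespace capi-janitor-system --create-namespace
```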
andrewbogott_ | So... just to clarify (and maybe editorialize) -- there are no docs about how to set this up, only a script to do so on devstack? | 15:41 |
andrewbogott_ | That's OK, I can certainly adapt that script. But I definitely didn't think "Oh I should totally read the devstack instructions since there are 8 critical steps that are only mentioned there." | 15:41 |
andrewbogott_ | oops, gotta update my k8s.conf after the rebuild... | 15:43 |
andrewbogott_ | ok! It's getting further now. Will wait a bit and see where we land... | 15:45 |
andrewbogott_ | Seems to be 'CREATE_IN_PROGRESS' forever. But I will keep waiting :) | 15:55 |
andrewbogott_ | thank you for all the help, btw, atmark -- it looks like I'm getting much further, even if not to the finish line | 15:55 |
andrewbogott_ | yeah, seems stuck | 15:59 |
andrewbogott_ | oh, it made a VM though! | 16:00 |
atmark | Just the script from devstack. I adapted the script too and installed it on a management cluster that's HA. | 16:05 |
atmark | andrewbogott_: Glad it works | 16:05 |
andrewbogott_ | it made the controller node but seems to be hanging rather than making workers. Magnum seems to think it's still working on it but I imagine the actual capi process has long since errored out. | 16:06 |
atmark | Check the logs on capo-controller-manager-xxxx pod in capo-system namespace | 16:10 |
atmark | it might be that your project doesn't have any resources left | 16:12 |
andrewbogott_ | you mean, like, because of resource quotas? I don't think I'm over quota | 16:14 |
andrewbogott_ | Logs are pretty happy although I do see this | 16:15 |
andrewbogott_ | "Bootstrap data secret reference is not yet available" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackMachine" OpenStackMachine="magnum-admin/test-cluster-06-sp3akzg6hxqu-default-worker-4v2pj-rj7sr" namespace="magnum-admin" name="test-cluster-06-sp3akzg6hxqu-default-worker-4v2pj-rj7sr" reconcileID="a2069e7c-0af5-4be1-9691-7fa41fdd9184" | 16:15 |
andrewbogott_ | openStackMachine="test-cluster-06-sp3akzg6hxqu-default-worker-4v2pj-rj7sr" machine="test-cluster-06-sp3akzg6hxqu-default-worker-4v2pj-rj7sr" cluster="test-cluster-06-sp3akzg6hxqu" openStackCluster="test-cluster-06-sp3akzg6hxqu" | 16:15 |
atmark | Yes resource quotas. Here's ways to debug https://github.com/azimuth-cloud/capi-helm-charts/blob/main/charts/openstack-cluster/DEBUGGING.md | 16:16 |
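A few commands that are useful at this stage, assuming the kubeconfig still points at the management cluster; the cluster name and namespace below are taken from the log lines above:

```bash
# Overall view of the cluster's CAPI objects and which conditions are not Ready
clusterctl describe cluster test-cluster-06-sp3akzg6hxqu -n magnum-admin

# Which machines are stuck, and in what phase
kubectl get machines -n magnum-admin
kubectl get openstackmachines -n magnum-admin

# Controller logs: core CAPI and the OpenStack infrastructure provider
kubectl logs -n capi-system deploy/capi-controller-manager
kubectl logs -n capo-system deploy/capo-controller-manager
```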
atmark | `Bootstrap data secret reference is not yet available` do you have barbican installed? | 16:16 |
andrewbogott_ | My guess is that I have it installed but not working. And it's not installed at all in the prod cloud where I hope to deploy this. | 16:17 |
andrewbogott_ | Is it possible to run w/out it? If not I'll go on a side-quest to set up a minimal version. | 16:17 |
andrewbogott_ | I'm going to rip out the endpoint in my test cloud and see what happens :) | 16:19 |
atmark | I thought barbican was required for the magnum-capi-helm driver but it looks like it's not | 16:25 |
atmark | My bad | 16:25 |
andrewbogott_ | unclear, it's certainly acting like it's required :) | 16:26 |
atmark | You can also check the logs for capi-controller-manager-xxx in the capi-system namespace | 16:26 |
atmark | Did you build your own image? | 16:27 |
andrewbogott_ | no, upstream image | 16:28 |
andrewbogott_ | this is weird... controller-manager logs say | 16:29 |
andrewbogott_ | cluster is not reachable: Get \"https://185.15.57.22:6443 | 16:29 |
andrewbogott_ | but that's the IP of a different loadbalancer, unrelated to this cluster | 16:29 |
andrewbogott_ | in a different project | 16:29 |
andrewbogott_ | the lb it created is test-cluster-07-i3wtgzrhm4q6-control-plane-x8nkr, IP 10.0.0.240 | 16:30 |
andrewbogott_ | hmmmm actually maybe I'm wrong, I don't know /what/ that IP is | 16:31 |
atmark | Did it assign floating IP to lb? | 16:31 |
* andrewbogott_ digs deeper | 16:31 |
andrewbogott_ | yes, you're right, it did. | 16:31 |
andrewbogott_ | So it is the right IP for the managed lb. But the lb is in an error state. | 16:31 |
andrewbogott_ | Also, I wish it wouldn't use that subnet for the lb IP. Maybe that's configured in the template somehow? | 16:32 |
atmark | Yes it's configurable in the template | 16:32 |
atmark | --external-network | 16:32 |
andrewbogott_ | ok, starting over with that :) | 16:32 |
andrewbogott_ | while I'm in here I guess I should change --network-driver from flannel (the default, I guess?) to calico? | 16:34 |
atmark | The capi driver doesn't respect --network-driver so it doesn't matter what it's set to. | 16:36 |
andrewbogott_ | ok :) | 16:37 |
atmark | I have set it to Calico but in the cluster I'm actually running Cilium | 16:37 |
andrewbogott_ | lol, will it /only/ create clusters with floating IPs assigned? I can't unset external_network and also can't set it to anything other than a floating range... | 16:43 |
andrewbogott_ | I guess I need to make a 'fake' floating IP range with internal IPs but I'd rather it just skipped it | 16:44 |
atmark | Here are the options in the template that it respects https://opendev.org/openstack/magnum-capi-helm/src/commit/3159a017a61b42a315ac0a2aa9383b317965d0ae/doc/source/configuration/index.rst | 16:44 |
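A hedged example of a template that sticks to options the capi-helm driver consumes; the image, network and flavor names are placeholders for this deployment, and the kube_tag value must match whatever the node image provides:

```bash
openstack coe cluster template create k8s-capi-template \
  --coe kubernetes \
  --image <capi-node-image> \
  --external-network <external-net-name> \
  --master-flavor <control-plane-flavor> \
  --flavor <worker-flavor> \
  --master-lb-enabled \
  --network-driver calico \
  --labels kube_tag=v1.30.0
```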
atmark | andrewbogott_: there's no logic in the driver to disable floating IP so your only choice is to edit the values.yaml in the chart | 16:48 |
atmark | You have to set this line to false https://github.com/azimuth-cloud/capi-helm-charts/blob/main/charts/openstack-cluster/values.yaml#L190 | 16:48 |
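If you go down that road, the override would look roughly like the fragment below; the exact key name at that line of values.yaml should be confirmed against the chart, `associateFloatingIP` under `apiServer` is an assumption here:

```yaml
# openstack-cluster chart values fragment (key name assumed; check the
# values.yaml linked above before editing)
apiServer:
  associateFloatingIP: false
```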
andrewbogott_ | makes sense | 16:49 |
andrewbogott_ | I'll live with it consuming a floating IP for now and see if I can figure out what connectivity wasn't working | 16:49 |
atmark | Do you allow NAT hairpinning? It looks like your workers couldn't reach that public IP from inside. | 16:52 |
atmark | Workers will connect to https://185.15.57.22:6443 endpoint to join the cluster. | 16:54 |
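A quick sketch of how to test that path from one of the worker VMs (or any VM on the same tenant network); any TLS or HTTP response at all, even a 401/403, shows the hairpinned endpoint is reachable:

```bash
# TCP reachability of the API endpoint the workers are told to join
nc -vz 185.15.57.22 6443

# TLS handshake check against the API server
curl -k https://185.15.57.22:6443/healthz
```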
andrewbogott_ | It should work, assuming the firewall rules are set up properly. Does magnum manage that or do I need to set up default security groups somehow? | 16:57 |
andrewbogott_ | Seems weird that they use the public network + the load balancer for what is clearly internal communication though. | 16:58 |
andrewbogott_ | *they | 16:58 |
atmark | By default, Magnum manages the security groups for the workload clusters. Are you using Octavia? | 17:00 |
andrewbogott_ | yes | 17:00 |
atmark | Does the octavia listener have any ACL? | 17:01 |
andrewbogott_ | I don't think so. The pool is in error state though, let me investigate that a bit... | 17:03 |
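Some Octavia commands that help narrow down which component is in ERROR (the load balancer itself, its listener, the pool, or the members); the amphora command generally needs admin credentials:

```bash
openstack loadbalancer list
openstack loadbalancer show <lb-id>
openstack loadbalancer listener list --loadbalancer <lb-id>
openstack loadbalancer pool list --loadbalancer <lb-id>
openstack loadbalancer member list <pool-id>

# Amphora state, if you can see it
openstack loadbalancer amphora list --loadbalancer <lb-id>
```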
andrewbogott_ | I set an internal network rather than leaving it on default and now it's making worker nodes! | 17:40 |
atmark | Nice | 17:50 |
atmark | I set mine to use an existing network as well | 17:51 |
atmark | btw, you can also reach the magnum devs on Kubernetes Slack in the #openstack-magnum channel | 17:55 |
andrewbogott_ | oh good to know! | 17:56 |
andrewbogott_ | Right now everything is looking good but the status still says create_in_progress. And my apparently very-underpowered management cluster is struggling so I'm trying to be patient :) | 17:57 |
atmark | andrewbogott_: that's a bug in the driver https://bugs.launchpad.net/magnum/+bug/2115207. As you can see in the comment section, someone uploaded a patch to fix that bug. I patched mine. | 18:01 |
atmark | There's a patch from upstream waiting to be merged https://review.opendev.org/c/openstack/magnum-capi-helm/+/950806 | 18:02 |
andrewbogott_ | oh, so it will never not say 'create in progress'? | 18:02 |
atmark | The patch is exactly the same. It's one change in the code. | 18:03 |
atmark | The status should transition to CREATE_COMPLETE | 18:05 |
* andrewbogott_ patches | 18:05 |
andrewbogott_ | what do you think, atmark, healthy or unhealthy? | 19:07 |
andrewbogott_ | https://www.irccloud.com/pastebin/DNczdr72/ | 19:07 |
andrewbogott_ | magnum still thinks it's creating but it sounds like that is not reliable. | 19:08 |
atmark | Did you patch the driver? | 19:33 |
atmark | What does this command show you? `openstack coe cluster show -c health_status_reason $name` | 19:36 |
-opendevstatus- NOTICE: The Gerrit service on review.opendev.org will be offline briefly for a configuration and version update, but should return to service momentarily | 20:06 |
andrewbogott_ | I did patch the driver. health status displays as {} | 20:41 |
andrewbogott_ | or, I mean, health_status_reason | 20:41 |
atmark | is the status still stuck in CREATE_IN_PROGRESS? | 21:15 |
atmark | Not sure why openstack-cinder-csi-controllerplugin-7b66d6bb65-6v787 pod keeps crashing | 21:16 |