| opendevreview | Michal Nasiadka proposed openstack/ansible-collection-kolla master: CI: Switch linters to lint-requirements.txt https://review.opendev.org/c/openstack/ansible-collection-kolla/+/987584 | 05:46 |
|---|---|---|
| opendevreview | Michal Nasiadka proposed openstack/ansible-collection-kolla master: CI: Switch linters to lint-requirements.txt https://review.opendev.org/c/openstack/ansible-collection-kolla/+/987584 | 05:48 |
| opendevreview | Michal Nasiadka proposed openstack/ansible-collection-kolla master: CI: Switch linters to lint-requirements.txt https://review.opendev.org/c/openstack/ansible-collection-kolla/+/987584 | 05:48 |
| opendevreview | Ilia Petrov proposed openstack/kolla-ansible master: Fix Skyline nginx service proxy paths https://review.opendev.org/c/openstack/kolla-ansible/+/987587 | 05:50 |
| opendevreview | Merged openstack/ansible-collection-kolla master: CI: Switch linters to lint-requirements.txt https://review.opendev.org/c/openstack/ansible-collection-kolla/+/987584 | 06:13 |
| mnasiadka | blanson[m], bbezak, frickler: https://review.opendev.org/c/openstack/releases/+/987590 gazpacho rc1 patch - please try to not merge anything in master before 2026.1 is branched :) | 07:25 |
| bbezak | kk | 07:25 |
| opendevreview | Ilia Petrov proposed openstack/kolla-ansible master: Fix Skyline nginx service proxy paths https://review.opendev.org/c/openstack/kolla-ansible/+/987587 | 07:34 |
| blanson[m] | ack | 07:42 |
| opendevreview | Pierre Riteau proposed openstack/kayobe master: Bump Ansible collections and roles https://review.opendev.org/c/openstack/kayobe/+/987594 | 08:03 |
| opendevreview | OpenStack Release Bot proposed openstack/ansible-collection-kolla stable/2026.1: Update .gitreview for stable/2026.1 https://review.opendev.org/c/openstack/ansible-collection-kolla/+/987597 | 08:11 |
| opendevreview | OpenStack Release Bot proposed openstack/ansible-collection-kolla stable/2026.1: Update TOX_CONSTRAINTS_FILE for stable/2026.1 https://review.opendev.org/c/openstack/ansible-collection-kolla/+/987598 | 08:11 |
| opendevreview | OpenStack Release Bot proposed openstack/ansible-collection-kolla master: Update master for stable/2026.1 https://review.opendev.org/c/openstack/ansible-collection-kolla/+/987599 | 08:11 |
| tafkamax | Hi I have a question. `neutron_dhcp_agent_dnsmasq_qdhcp-5225f6f9-dc56-42e8-b0aa-d866b49b36ed` <-- when we ran stop on our containers these still remained active on compute nodes. | 08:14 |
| tafkamax | I think these are generated if dhcp is enabled on a subnet? | 08:14 |
| opendevreview | OpenStack Release Bot proposed openstack/kolla-ansible stable/2026.1: Update .gitreview for stable/2026.1 https://review.opendev.org/c/openstack/kolla-ansible/+/987600 | 08:14 |
| opendevreview | OpenStack Release Bot proposed openstack/kolla-ansible stable/2026.1: Update TOX_CONSTRAINTS_FILE for stable/2026.1 https://review.opendev.org/c/openstack/kolla-ansible/+/987601 | 08:14 |
| opendevreview | OpenStack Release Bot proposed openstack/kolla-ansible master: Update master for stable/2026.1 https://review.opendev.org/c/openstack/kolla-ansible/+/987602 | 08:14 |
| tafkamax | Is it OK if these remain running, or should they be stopped some other way? | 08:14 |
| opendevreview | OpenStack Release Bot proposed openstack/kolla stable/2026.1: Update .gitreview for stable/2026.1 https://review.opendev.org/c/openstack/kolla/+/987603 | 08:15 |
| tafkamax | They are spawned with this image quay.io/openstack.kolla/neutron-dhcp-agent:2025.1-ubuntu-noble | 08:15 |
| opendevreview | OpenStack Release Bot proposed openstack/kolla stable/2026.1: Update TOX_CONSTRAINTS_FILE for stable/2026.1 https://review.opendev.org/c/openstack/kolla/+/987604 | 08:15 |
| opendevreview | OpenStack Release Bot proposed openstack/kolla master: Update master for stable/2026.1 https://review.opendev.org/c/openstack/kolla/+/987605 | 08:15 |
| tafkamax | * They are spawned with this image neutron-dhcp-agent:2025.1-ubuntu-noble | 08:15 |
| mnasiadka | yes, that’s how it should be - neutron-dhcp-agent stop should not stop the services running | 08:17 |
| mnasiadka | But I think we should have a look in some operational commands how to clean up :) | 08:17 |
| tafkamax | oh okay | 08:18 |
| tafkamax | Or migrate them to a second compute host? | 08:18 |
| tafkamax | Right now I have shut down all services on a single compute | 08:19 |
| tafkamax | for maintenance | 08:19 |
| mnasiadka | No, neutron-dhcp-agent should manage the number of running services (whether you configure it to maintain ha config or not) | 08:21 |
| mnasiadka | You can just remove them | 08:21 |
| tafkamax | ok | 08:21 |
| tafkamax | opendevreview: seems like a small fix, we are experiencing this as well | 08:22 |
| tafkamax | will test it out on our test cluster | 08:22 |
| opendevreview | Seunghun Lee proposed openstack/kayobe master: DNM: Test --continue-on-unreachable https://review.opendev.org/c/openstack/kayobe/+/910511 | 08:26 |
| opendevreview | Seunghun Lee proposed openstack/kayobe master: DNM: Test --continue-on-unreachable https://review.opendev.org/c/openstack/kayobe/+/910511 | 08:27 |
| Vii | Congrats on the 2026.1 release! I think a lot of really great work went into this cycle, huge thanks to everyone involved :) | 08:28 |
| opendevreview | Pierre Riteau proposed openstack/kayobe master: Add release note for broken conditionals https://review.opendev.org/c/openstack/kayobe/+/986629 | 08:28 |
| blanson[m] | Taavi Ansper: I'll check it out, can't be merged before the rc thing, but you can backport afterward | 08:30 |
| tafkamax | We will probably cherry pick if works on test | 08:31 |
| blanson[m] | how's your skyline experience btw ? | 08:31 |
| blanson[m] | we'd like to deploy it, some old school people are reluctant about it. from what I've seen recently it seemed to work fine ? | 08:32 |
| opendevreview | Merged openstack/ansible-collection-kolla stable/2026.1: Update .gitreview for stable/2026.1 https://review.opendev.org/c/openstack/ansible-collection-kolla/+/987597 | 08:34 |
| opendevreview | Merged openstack/ansible-collection-kolla stable/2026.1: Update TOX_CONSTRAINTS_FILE for stable/2026.1 https://review.opendev.org/c/openstack/ansible-collection-kolla/+/987598 | 08:36 |
| opendevreview | Merged openstack/kolla stable/2026.1: Update .gitreview for stable/2026.1 https://review.opendev.org/c/openstack/kolla/+/987603 | 08:37 |
| fprzewozn | blanson[m] I've implemented it recently - TLS enabled, paired with OIDC IdP. It worked almost fine, for SSO I hit https://bugs.launchpad.net/skyline-apiserver/+bug/2083564 (for now I just have a skyline_keystone_url override in the config, set to {{ keystone_public_url }}/v3/). Second issue I got was | 08:39 |
| fprzewozn | https://bugs.launchpad.net/skyline-apiserver/+bug/2133859 (resolved by using 2025.2 image). And third was https://bugs.launchpad.net/kolla-ansible/+bug/2091935 but I fixed it in https://review.opendev.org/c/openstack/kolla-ansible/+/980968 | 08:39 |
| opendevreview | Pierre Riteau proposed openstack/kayobe master: [WIP] Add support for --use-test-images option https://review.opendev.org/c/openstack/kayobe/+/987607 | 08:39 |
| opendevreview | Merged openstack/kolla stable/2026.1: Update TOX_CONSTRAINTS_FILE for stable/2026.1 https://review.opendev.org/c/openstack/kolla/+/987604 | 08:41 |
| fprzewozn | But if you are planning on running 2025.2 without SSO, then it should work out of the box :) | 08:42 |
| opendevreview | Seunghun Lee proposed openstack/kolla-ansible master: Add flag for MariaDB heuristic recovery https://review.opendev.org/c/openstack/kolla-ansible/+/961675 | 08:42 |
| blanson[m] | fprzewozn: we plan on using sso, on 2025.2 or 2026.1, hopefully nothing older. for skyline the sso part would be internal use only, so I guess we can educate people while this is fixed in skyline ? | 08:42 |
| blanson[m] | client facing wouldn't use sso (for now) | 08:43 |
| fprzewozn | in SSO deployment pay attention to endpoints and firewalls, I've encountered some totally unrelated error messages where the issue was that during auth redirect it wasn't able to connect from internal interface to its own external one | 08:47 |
| fprzewozn | ofc same case as for Horizon with SSO, but here you got 2 ports | 08:47 |
| opendevreview | Merged openstack/kolla-ansible stable/2026.1: Update .gitreview for stable/2026.1 https://review.opendev.org/c/openstack/kolla-ansible/+/987600 | 08:48 |
| opendevreview | Merged openstack/kolla-ansible stable/2026.1: Update TOX_CONSTRAINTS_FILE for stable/2026.1 https://review.opendev.org/c/openstack/kolla-ansible/+/987601 | 08:51 |
| tafkamax | <blanson[m]> "how's your skyline experience..." <- Most ppl use it more than horizon | 09:00 |
| blanson[m] | Taavi Ansper: do they ? I might be living under a rock tbh | 09:01 |
| tafkamax | In my org i mean | 09:02 |
| tafkamax | users and admin(s) | 09:02 |
| blanson[m] | oh ! yh I make do with horizon but it's getting old... | 09:04 |
| blanson[m] | and there are some interesting new panels in skyline like the barbican one that would be really useful for clients | 09:04 |
| tafkamax | I am running 2025.2 in production with SSO | 09:05 |
| tafkamax | oidc and keycloak | 09:05 |
| tafkamax | I had the issue I fixed myself. | 09:05 |
| tafkamax | with the federation stuff | 09:05 |
| tafkamax | and also there was a bug in skyline itself, but that was fixed quickly. | 09:05 |
| tafkamax | oh fprzewozn linked the skyline-apiserver bug. Interesting that for me it works, but he still mentions it's not working. | 09:06 |
| opendevreview | Pierre Riteau proposed openstack/kayobe master: [WIP] Add support for --use-test-images option https://review.opendev.org/c/openstack/kayobe/+/987607 | 09:14 |
| fprzewozn | tafkamax I had this bug on 2025.1 and switched skyline to 2025.2 tag | 09:32 |
| opendevreview | Pierre Riteau proposed openstack/kayobe master: [WIP] Add support for --use-test-images option https://review.opendev.org/c/openstack/kayobe/+/987607 | 09:42 |
| opendevreview | Franciszek Przewoźny proposed openstack/kolla-ansible master: Allow overwriting Prometheus exporter listen addresses https://review.opendev.org/c/openstack/kolla-ansible/+/983281 | 09:44 |
| tafkamax | blanson: You had EXP with cyborg. | 09:50 |
| tafkamax | I got my GPU to show under accelerators. | 09:50 |
| tafkamax | I suppose I need to create a profile. But I don't know what options to put there. openstack accelerator device profile create --help | 09:51 |
| tafkamax | There are not many guides | 09:51 |
| blanson[m] | hum | 09:55 |
| blanson[m] | let me see my notes | 09:56 |
| tafkamax | can i specify hostname? | 10:02 |
| blanson[m] | $ openstack accelerator device profile create gpu_a40 '[{"resources:PGPU":"1","trait:CUSTOM_GPU_PRODUCT_ID_2235":"required"}]' is my example | 10:03 |
| tafkamax | aha | 10:03 |
| blanson[m] | you get the trait name from os resource provider trait list | 10:04 |
| tafkamax | Hmm, could cyborg be not fully working if i get these for some CLI invocations: 'Proxy' object has no attribute 'get_attribute' | 10:04 |
| tafkamax | openstack accelerator device attribute show | 10:04 |
| *** | jhorstmann is now known as Guest8838 | 10:04 |
| blanson[m] | and then in nova, os flavor set --property 'accel:device_profile=gpu_a40' gpu.a40.xlarge | 10:05 |
| blanson[m] | I have no cyborg capable cluster on hand to test the cli | 10:06 |
| tafkamax | aha I got CUSTOM_NVIDIA_26B9 from trait list | 10:09 |
| tafkamax | i used the trait list on the GPU itself | 10:10 |
| blanson[m] | in resource provider list you should have something like <hostname>_0000:xx:00.0 ? | 10:10 |
| tafkamax | Yes it's there | 10:10 |
| tafkamax | it had two traits: that and CYBORG_OWNER | 10:10 |
| blanson[m] | then trait list on this you should have CUSTOM_GPU_NVIDIA and CUSTOM_GPU_PRODUCT_XXXXXX ? or something ? | 10:11 |
| tafkamax | just CUSTOM_NVIDIA_26B9 | 10:12 |
| tafkamax | i wonder if it's because no nvidia-smi is installed? | 10:12 |
| blanson[m] | I also have, from my notes: add intel/amd_iommu=on, iommu=pt, and bind vfio-pci driver to all GPUs | 10:13 |
| tafkamax | hmm well iommu | 10:18 |
| tafkamax | [ 3.477685] iommu: Default domain type: Translated | 10:18 |
| tafkamax | [ 3.477685] iommu: DMA domain TLB invalidation policy: lazy mode | 10:18 |
| tafkamax | [ 3.524567] pci 0000:c0:00.3: Adding to iommu group 0 | 10:18 |
| tafkamax | and so on... | 10:18 |
| tafkamax | vfio-pci was not enabled on the hypervisor | 10:19 |
| tafkamax | modprobe vfio-pci | 10:20 |
| tafkamax | Create or edit /etc/modprobe.d/vfio.conf and add your IDs: | 10:21 |
| tafkamax | options vfio-pci ids=xxxx:yyyy,xxxx:zzzz | 10:21 |
| tafkamax | Does that sound reasonable? | 10:21 |
| blanson[m] | it probably does | 10:27 |
| blanson[m] | I remember us going nuclear and patching initramfs to early load it | 10:28 |
| blanson[m] | but it's probably overkill | 10:28 |
| tafkamax | i will continue this #_oftc_#openstack-cyborg:matrix.org | 10:44 |
| tafkamax | I tried a simple profile with a single trait | 10:44 |
| tafkamax | and it failed as well | 10:44 |
| opendevreview | Pierre Riteau proposed openstack/ansible-collection-kolla master: Add Docker config option for Prometheus endpoint https://review.opendev.org/c/openstack/ansible-collection-kolla/+/936717 | 11:33 |
| blanson[m] | woooo 2026.1 | 11:58 |
| blanson[m] | mnasiadka: I assume everything is good and we can start merging again ? | 11:59 |
| mnasiadka | I think let’s hold off for tomorrow’s images upload - and try running all CI jobs on 2026.1 - and focus on fixing them | 12:00 |
| mnasiadka | In case we’ll need to backport any fixes before doing a final GA release of gazpacho | 12:00 |
| mnasiadka | Because now it’s only rc1 | 12:00 |
| blanson[m] | ack | 12:01 |
| blanson[m] | will try to get ahead of tmrw images and build some in our repo this PM to see if there are any catastrophic failures | 12:01 |
| opendevreview | Pierre Riteau proposed openstack/kayobe master: Add support for --use-test-images option https://review.opendev.org/c/openstack/kayobe/+/987607 | 12:02 |
| butjar_ | We have observed error messages in Horizon in our most recent kolla deployments stating “Too many open files” which manifested as 5xx server errors in the UI. The error appears to have also affected the neutron CI: https://review.opendev.org/c/openstack/neutron/+/971064. Moreover, https://review.opendev.org/c/openstack/kolla-ansible/+/971801 was recently merged, which might fix the issue. | 12:29 |
| butjar_ | After some research we think the issue is related to some changes in docker engine v29, since ulimits were decreased drastically: https://docs.docker.com/engine/release-notes/29/#packaging-updates-9. | 12:31 |
| butjar_ | I'm not sure if the issue is already known, or whether this is the right place for raising it. Should we file a bug at launchpad for it? | 12:33 |
| opendevreview | Pierre Riteau proposed openstack/kayobe master: Add support for using Kolla test images https://review.opendev.org/c/openstack/kayobe/+/987607 | 12:43 |
| opendevreview | Piotr Milewski proposed openstack/kolla-ansible master: prometheus: extend openstack-exporter service disabling and tuning flags https://review.opendev.org/c/openstack/kolla-ansible/+/987647 | 12:45 |
| opendevreview | Piotr Milewski proposed openstack/kolla-ansible master: prometheus: extend openstack-exporter service disabling and tuning flags https://review.opendev.org/c/openstack/kolla-ansible/+/987647 | 13:13 |
| opendevreview | Pierre Riteau proposed openstack/kayobe master: Add support for using Kolla test images https://review.opendev.org/c/openstack/kayobe/+/987607 | 13:45 |
| tafkamax | blanson: got it working, in the end the compute node was left disabled from maintenance | 14:59 |
| tafkamax | and the query from placement had !COMPUTE_NODE_DISABLED | 14:59 |
| tafkamax | :D | 14:59 |
| tafkamax | vm attached the gpu instantly and nvidia-smi just works | 15:05 |
| blanson[m] | nice, if you ever get vgpus working I'd like to know how | 15:06 |
| blanson[m] | lol | 15:06 |
| tafkamax | We don't want to pay nvidia tax | 15:06 |
| blanson[m] | because I pulled my hair out with them | 15:07 |
| blanson[m] | you could use intel b50 pro | 15:07 |
| tafkamax | we have amd gpu, but i see amd is not supported in the cyborg drivers | 15:07 |
| blanson[m] | and up | 15:07 |
| tafkamax | hmm true | 15:07 |
| tafkamax | that is a good idea | 15:07 |
| blanson[m] | driver is in the kernel | 15:07 |
| blanson[m] | everything is free | 15:07 |
| blanson[m] | cyborg support is ??? | 15:07 |
| blanson[m] | but in theory it's just virtual functions from cyborg pov, should work just fine ? | 15:08 |
| blanson[m] | (says the guy that never tested it) | 15:08 |
| tafkamax | that is actually a very good idea. as we have 1U nodes and we got 2 spare L40S though and just added 1 currently, we will add the 2nd one later on. so on the other nodes we could try the intel b50 | 15:08 |
| tafkamax | and our R&D might use it in their phys machine if umm it doesnt work | 15:08 |
| tafkamax | and if it doesnt it should just PCIE passthrough? | 15:08 |
| mnasiadka | butjar_: we’re setting them in K-A for each container - see https://github.com/openstack/kolla-ansible/blob/627ed9b9a8f8dcb7885de00a956c65bbbb369af0/ansible/group_vars/all/common.yml#L108 | 17:39 |
| opendevreview | Michal Nasiadka proposed openstack/kolla-ansible master: cinder: Copy multipath.conf into cinder-volume container https://review.opendev.org/c/openstack/kolla-ansible/+/987732 | 17:44 |
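
For the Cyborg device profile discussed between 10:03 and 10:12, the groups argument has to be a JSON list of objects. The following is a minimal sketch of how that string could be assembled in a shell; the helper name `device_profile_groups`, the profile name `gpu_l40s` and the use of the `PGPU` resource class for this particular card are assumptions for illustration, while the trait `CUSTOM_NVIDIA_26B9` is the one reported in the log.

```shell
# Build the JSON "groups" argument for a Cyborg device profile.
# Hypothetical helper, not part of any OpenStack tooling.
# Usage: device_profile_groups <resource_class> <count> <trait>
device_profile_groups() {
    printf '[{"resources:%s":"%s","trait:%s":"required"}]\n' "$1" "$2" "$3"
}

groups_json=$(device_profile_groups PGPU 1 CUSTOM_NVIDIA_26B9)
echo "$groups_json"

# On a host with the OpenStack client and the Cyborg plugin (not run here;
# "gpu_l40s" and the flavor name are placeholders):
#   openstack accelerator device profile create gpu_l40s "$groups_json"
#   openstack flavor set --property 'accel:device_profile=gpu_l40s' <flavor>
```

As noted at 14:59, the profile can be correct and scheduling can still fail if the compute node is left disabled, since the placement query excludes providers with COMPUTE_NODE_DISABLED.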
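
For the vfio-pci binding discussed between 10:19 and 10:21, the `options vfio-pci ids=...` line can be generated instead of typed by hand. This is only a sketch: the helper name `vfio_options_line` is hypothetical, and the vendor:device IDs below are placeholders that should be replaced with the values `lspci -nn` reports on the hypervisor.

```shell
# Join vendor:device IDs with commas into a modprobe options line.
# The body runs in a subshell (parentheses) so the IFS change does not leak.
vfio_options_line() (
    IFS=,
    printf 'options vfio-pci ids=%s\n' "$*"
)

# Placeholder IDs; list the real ones with: lspci -nn
vfio_options_line 10de:26b9 10de:2235

# To persist the setting (as root, not run here):
#   vfio_options_line 10de:26b9 > /etc/modprobe.d/vfio.conf
```

If another driver claims the GPU before vfio-pci is loaded, the module may also need to be loaded early from the initramfs, which is the approach blanson[m] describes at 10:28.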