opendevreview | Jonathan Rosser proposed openstack/openstack-ansible master: Increase number of threads to 2 for glance in AIO https://review.opendev.org/c/openstack/openstack-ansible/+/916810 | 07:51 |
jrosser_ | o/ morning | 07:53 |
noonedeadpunk | o/ | 07:55 |
noonedeadpunk | so, I see very weird TLS issues since yesterday | 07:56 |
noonedeadpunk | https://zuul.opendev.org/t/openstack/build/2aebbc523beb427495137d5fee6a347f | 07:56 |
noonedeadpunk | which I don't get frankly speaking | 07:56 |
noonedeadpunk | huh, I wonder if it could be related to new chain for Let's Encrypt.... | 08:11 |
noonedeadpunk | Though, unlikely: ERROR: Could not install packages due to an OSError: HTTPSConnectionPool(host='172.29.236.101', port=8181): Max retries exceeded with url: /constraints/upper_constraints_cached.txt | 08:11 |
jrosser_ | that is pretty odd | 08:14 |
jrosser_ | because the task 2 before also has the same --constraint and nothing went wrong there | 08:14 |
jrosser_ | not sure LE is involved in this though? | 08:15 |
jrosser_ | it is more that `/openstack/venvs/utility-28.1.0.dev102/bin/pip` is likely seeing the certifi CA bundle, not the system CA store | 08:16 |
noonedeadpunk | not in that... | 08:16 |
noonedeadpunk | at least should not | 08:16 |
noonedeadpunk | I'm spawning AIO now... | 08:16 |
noonedeadpunk | but we changed nothing there. | 08:17 |
jrosser_ | and we set `REQUESTS_CA_BUNDLE` in order to make that work | 08:17 |
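[Editor's note: a minimal sketch of the mechanism under discussion. pip's HTTPS verification goes through its vendored certifi bundle rather than the system trust store, so the env var points it at the distro bundle instead; the path shown is the Debian/Ubuntu default and is an assumption here.]

```shell
# Make pip inside a venv trust an internally-signed endpoint (such as the
# repo server) by pointing requests/certifi at the system CA bundle.
# /etc/ssl/certs/ca-certificates.crt is the Debian/Ubuntu location (assumed).
export REQUESTS_CA_BUNDLE=/etc/ssl/certs/ca-certificates.crt
python3 -c 'import os; print(os.environ["REQUESTS_CA_BUNDLE"])'
```

The catch discussed below: the variable only helps once the login session (and therefore the environment) has actually been reloaded.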
noonedeadpunk | but, it's around time where Root for LE should be changed.... | 08:17 |
noonedeadpunk | yeah | 08:17 |
noonedeadpunk | I actually can't recall if we do use LE for TLS jobs, as we actually could... | 08:17 |
jrosser_ | well remember it is step-ca with a private acme CA, not actual LE for CI jobs | 08:19 |
noonedeadpunk | actually yes, you're right | 08:19 |
noonedeadpunk | we can't issue LE as there's no domain | 08:19 |
jrosser_ | correct | 08:19 |
jrosser_ | and rate limit blah blah too | 08:20 |
noonedeadpunk | and stepca is a separate job | 08:20 |
jrosser_ | so btw i did have a look at 24.04 | 08:20 |
noonedeadpunk | aio almost at utility installation step | 08:20 |
noonedeadpunk | I think everything is terrible there? | 08:20 |
jrosser_ | and somehow lxc is pretty broken, like could not boot 24.04 image on 24.04 host | 08:21 |
jrosser_ | however, i might have another go on a metal deploy instead | 08:21 |
noonedeadpunk | well, given that openstack services are known not to work on python3.12 | 08:21 |
noonedeadpunk | and it's not known if py3.12 will be added to 2024.2 even | 08:21 |
jrosser_ | i do think that the debian people have been chipping away at that | 08:22 |
noonedeadpunk | due to some quite big libraries being deprecated/not supported for py312 | 08:22 |
noonedeadpunk | yup | 08:22 |
jrosser_ | so even though there is no formal support we might get quite some way, but yes absolutely not something releasable | 08:22 |
noonedeadpunk | and, I reproduced in AIO | 08:23 |
jrosser_ | i wonder if there is a new pip or something | 08:25 |
noonedeadpunk | well, we constrain it... | 08:26 |
noonedeadpunk | and https://review.opendev.org/c/openstack/openstack-ansible/+/916792 is not merged | 08:27 |
noonedeadpunk | ofc curl https://172.29.236.101:8181/constraints/upper_constraints_cached.txt has no complaints | 08:27 |
jrosser_ | what would be interesting is to dump the env vars just before that task | 08:28 |
jrosser_ | though if it is something like the ssh session has not re-started (and so not loaded REQUESTS_CA_BUNDLE) then i would expect it to run OK a second time if you re-run the utility playbook | 08:29 |
jrosser_ | we are probably missing an explicit `meta: reset_connection` after we drop the env vars | 08:30 |
noonedeadpunk | nah, re-run fails as well. | 08:30 |
noonedeadpunk | I think cert looks okayish? https://paste.openstack.org/show/bUuQ8DrpcXwux83VAqy3/ | 08:31 |
noonedeadpunk | oh, yes | 08:32 |
noonedeadpunk | that is indeed env not being re-loaded | 08:32 |
noonedeadpunk | (or connection not being re-established) | 08:33 |
jrosser_ | should go here perhaps https://opendev.org/openstack/openstack-ansible-openstack_hosts/src/branch/master/tasks/main.yml#L55 | 08:34 |
noonedeadpunk | yeah, as a handler | 08:34 |
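[Editor's note: the `meta: reset_connection` idea sketched in Ansible terms — whether placed inline or fired from a handler. Task name and placement are illustrative only.]

```yaml
# Sketch: force Ansible to drop the persistent SSH connection so that the
# next task starts a fresh session and picks up freshly written env vars
# (e.g. REQUESTS_CA_BUNDLE from /etc/environment).
- name: Re-initiate connection after environment is changed
  ansible.builtin.meta: reset_connection
```

Note that `meta` tasks do not report ok/changed/skipped status, which comes up again later in the log.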
jrosser_ | this could easily be an updated sshd or something with different/fixed behaviour | 08:34 |
noonedeadpunk | I kind of wonder how 2023.2 is passing then, frankly speaking | 09:05 |
noonedeadpunk | or maybe it's not | 09:05 |
noonedeadpunk | https://zuul.opendev.org/t/openstack/build/c007bb5834bc4fd08c3d6a376ad76604 passing at least | 09:06 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-openstack_hosts master: Re-initiate connection after environment is changed https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/917589 | 09:14 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Update global pins for 2024.1 https://review.opendev.org/c/openstack/openstack-ansible/+/916792 | 09:15 |
noonedeadpunk | so it's actually smth we merged I believe | 09:18 |
noonedeadpunk | as upgrade jobs fail on N+1 | 09:18 |
jrosser_ | i wonder if it is the pip caching | 09:22 |
jrosser_ | like now it has to actually request that constraints file every single time | 09:22 |
jrosser_ | where before it might be sat on the disk | 09:23 |
noonedeadpunk | but it was for nginx only? | 09:32 |
noonedeadpunk | ah, or it wasn't | 09:32 |
noonedeadpunk | as it was client caching | 09:32 |
noonedeadpunk | so you think that the constraints could be fetched by pip during bootstrap-ansible? | 09:33 |
noonedeadpunk | and then cached and passed between venvs? | 09:33 |
jrosser_ | would be interesting to know if `python -m pip cache dir` is the same place for all the venv and also the system pip | 09:35 |
jrosser_ | if that is the case then the cache would be shared | 09:36 |
jrosser_ | andrews patch set an nginx header which told pip not to cache | 09:36 |
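[Editor's note: a quick way to check the cache-sharing hypothesis above. pip's HTTP/wheel cache is per-user, not per-venv, so the system interpreter and every venv normally report the same directory. The venv path below is a throwaway assumption.]

```shell
# If both commands print the same path (typically ~/.cache/pip), cached
# responses for the constraints file are shared across all venv builds.
python3 -m pip cache dir
python3 -m venv /tmp/venv-cache-check
/tmp/venv-cache-check/bin/python -m pip cache dir
```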
noonedeadpunk | yup, you're right here | 09:49 |
noonedeadpunk | https://paste.openstack.org/show/bpmpuGRUPRuj2lT76i09/ | 09:49 |
noonedeadpunk | which explains how disabling cache broke tls jobs... | 09:49 |
noonedeadpunk | that's a kinda fun consequence. As then things are broken outside of aio too, I'd assume | 09:49 |
admin1 | is there a tag to tell osa just to recreate the rabbitmq queues ? | 09:56 |
noonedeadpunk | no, not specifically | 09:59 |
noonedeadpunk | rabbitmq role needs some kind of love to modernize it a bit | 09:59 |
jrosser_ | i guess that then there is some potential different behaviour also on lxc, as there really would be a unique cache per service | 09:59 |
jrosser_ | noonedeadpunk: i think that andrew was going to look at fixing that as part of making a quorum queue migration script | 10:00 |
noonedeadpunk | I was also thinking about modernizing rabbitmq_upgrade thingy | 10:00 |
noonedeadpunk | as with rolling upgrade we shouldn't really need to fully re-install it | 10:01 |
noonedeadpunk | the way we do today | 10:01 |
Ra001 | Hi, I am trying to install a recent version of openstack with OVN, and I am struggling to configure the provider network for the Octavia service, here are my configurations https://pastebin.com/01xSnisH | 10:07 |
noonedeadpunk | Ra001: hey there, let me check | 10:10 |
noonedeadpunk | I think it should be `vlan` network, not `flat` | 10:11 |
noonedeadpunk | also /24 might be a bit low just in case, as 1 LB effectively occupies 2 IPs from this network | 10:12 |
noonedeadpunk | so you are capped with 128 LBs kinda | 10:12 |
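[Editor's note: back-of-envelope version of that cap, assuming 254 usable host addresses in a /24 and roughly 2 ports consumed per load balancer, per the comment above.]

```shell
# 254 usable hosts in a /24, ~2 IPs per LB (VIP port + amphora port)
echo $(( 254 / 2 ))   # prints 127
```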
Ra001 | Well yes sure I can increase the range but I am not sure about the vlan part this lb is on vlan network I create like bond.7 | 10:13 |
Ra001 | Also after running the playbook I think openvswitch is trying to create the bridge br-lbaas and is showing error "error: "could not add network device br-lbaas to ofproto (File exists)" | 10:15 |
noonedeadpunk | Ra001: ok, so the thing with octavia is, that this network should be 1. available on controllers and passed to LXC containers. That's where you need to create br-lbaas with netplan | 10:16 |
noonedeadpunk | and 2. On compute/net nodes this same network should be managed with neutron. So you should not try to create it with netplan there | 10:17 |
Ra001 | the eth14 and eth13 used to be dummy interfaces so do I have to use a real interface or just use the same configuration and openvswitch will take care of it for me? | 10:19 |
noonedeadpunk | so eth14/13 is the interface that will appear inside LXC container | 10:21 |
noonedeadpunk | they should not be relevant for metal hosts | 10:21 |
Ra001 | Also, if I have to disable the netplan interface then what should be the IP address for the bridge, and should the host_bind_override parameter be replaced with network_interface: bond.7? I am not sure | 10:21 |
noonedeadpunk | basically, on compute/network nodes, you should not do anything as long as its provider_networks are configured properly | 10:22 |
noonedeadpunk | you don't need any IP on the bridge itself just in case | 10:22 |
noonedeadpunk | (unless you're doing metal deployment) | 10:22 |
Ra001 | Well I am doing a metal deployment, I have services running in lxc but only on infra nodes | 10:24 |
Ra001 | hypervisor nodes are bare servers with some nodes running ceph osd also | 10:25 |
noonedeadpunk | yeah, so, you need to create br-lbaas only on controllers, and nowhere else | 10:27 |
noonedeadpunk | and they don't need to have IPs on them | 10:27 |
noonedeadpunk | ips should be assigned inside lxc containers, but that should be done by playbooks | 10:28 |
jrosser_ | the dummy interfaces are only relevant for AIO/CI cases, not at all for a real deployment | 10:28 |
noonedeadpunk | and still - I think this interface should be either vlan (actually, depending on how you configured other networks, like the external one) | 10:28 |
noonedeadpunk | or you need to apply host_bind_override to make it bond.7 | 10:30 |
Ra001 | Yeah but how will openvswitch identify the vlan id, should I define network_interface: bond.7 | 10:30 |
Ra001 | Yeah that's what I am trying to say. Also if I define network_interface: bond.7 what happens to the control nodes | 10:31 |
noonedeadpunk | Ra001: so. for compute/net nodes, this network is managed with Neutron. So basically, octavia role (or you manually) should create a network of type vlan (or flat) with tag 7 for that | 10:32 |
jrosser_ | ^ this is just a regular provider network, like your external network | 10:33 |
noonedeadpunk | we have variables for that: https://opendev.org/openstack/openstack-ansible-os_octavia/src/branch/master/defaults/main.yml#L359-L362 | 10:33 |
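[Editor's note: a hypothetical user_variables sketch of the setup being discussed. The variable names are taken from the os_octavia defaults linked above, but the values (vlan type, tag 7) mirror this deployment's bond.7 and are assumptions — verify names against the linked file.]

```yaml
# user_variables.yml (sketch) -- let the octavia role create the lbaas
# provider network in neutron instead of managing it with netplan
octavia_provider_network_name: lbaas
octavia_provider_network_type: vlan
octavia_provider_segmentation_id: 7
```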
Ra001 | Okay let me try | 10:35 |
noonedeadpunk | so this patch doesn't seem to help tls jobs: https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/917589 :( | 11:29 |
jrosser_ | noonedeadpunk what was your SCENARIO there? i can take a look too | 11:38 |
noonedeadpunk | ./scripts/gate-check-commit.sh aio_metal_tls | 11:39 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_neutron master: Fix OVS/OVN versions for EL https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/917192 | 12:43 |
opendevreview | Merged openstack/openstack-ansible-os_neutron stable/2023.2: Add debian package libstrongswan-standard-plugins https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/916765 | 12:43 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_neutron master: Update OVN/OVS versions for EL https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/917193 | 12:46 |
nixbuilder | What is the correct way to have glance use cinder for image storage. I have followed the instructions and samples but it still does not work. Is that still supported under Bobcat? | 12:53 |
nixbuilder | https://paste.openstack.org/show/bzipGD83lhGX15Gc1tBB/ | 12:55 |
noonedeadpunk | nixbuilder: I think this should have worked | 12:56 |
noonedeadpunk | we kinda have a job for that as well: https://zuul.opendev.org/t/openstack/build/50d01769c7c94479a423be18db9e39b8 | 12:57 |
noonedeadpunk | though question is - what driver do you use for cinder? | 12:57 |
noonedeadpunk | as in case of iscsi there might be connection questions between glance and cinder | 12:58 |
noonedeadpunk | as this job works only because cinder_volume and glance are on the same host with LVM_iSCSI driver | 12:59 |
nixbuilder | noonedeadpunk: This is what I use... https://paste.openstack.org/show/bS3q6pjTvHaL95WMRVqM/. Creating/deleting volumes works fine. Just not creating images... I keep getting '500 Internal Server Error' when uploading images. | 13:02 |
noonedeadpunk | you get 500 from glance? anything there in logs? | 13:03 |
nixbuilder | noonedeadpunk: Here are the logs... I don't see a whole lot there to give me a clue... probably missing something https://paste.openstack.org/show/bQWqXsH6rMeMtZwVOHVk/ | 13:07 |
nixbuilder | noonedeadpunk: The SAN logs show the volume being created (volume-ef88e185-b512-4f76-a0bd-ab77148c292c) but then it gets deleted five seconds after it gets created :-( | 13:10 |
noonedeadpunk | nixbuilder: so, this record should have some stack trace in glance logs | 13:11 |
noonedeadpunk | Apr 30 07:52:45 infra01 haproxy[178643]: 172.29.236.100:43892 [30/Apr/2024:07:52:39.941] glance_api-front-2 glance_api-back/infra01 0/0/0/5533/5533 500 | 13:11 |
jrosser_ | noonedeadpunk: so i also reproduced the pip ssl error | 13:16 |
jrosser_ | but i then do things like re-run utility-install a couple of times, wait a bit whilst i read some docs etc | 13:17 |
jrosser_ | and then it just worked /o\ | 13:17 |
noonedeadpunk | huh | 13:18 |
jrosser_ | and now if i delete ansible facts, remove the utility venv, delete .pip/cache, delete /var/www/repo | 13:19 |
jrosser_ | it still works ok, so there is some odd first-time issue | 13:19 |
noonedeadpunk | well, it's aligned with idea of connection persistence | 13:25 |
nixbuilder | noonedeadpunk: After thinking about it some I decided to look at the glance rootwrap.conf file and lo and behold... the path problem reared its ugly head again...https://paste.openstack.org/show/bWmYAZ9j7Oen0wLPJiNE/ | 13:32 |
noonedeadpunk | nixbuilder: oh, huh, that's surprising | 13:34 |
nixbuilder | noonedeadpunk: However if I don't try to use the SAN and let it store the image in a file, then it works just fine. | 13:35 |
jrosser_ | i wonder if this actually is in the 28.x release https://review.opendev.org/c/openstack/openstack-ansible-os_glance/+/901562 | 13:36 |
jrosser_ | looks like it is | 13:37 |
nixbuilder | jrosser_: Yep... that looks right... I'll apply that patch before re-installing and see if that fixes it. | 13:40 |
jrosser_ | nixbuilder: well - i think you should have it already unless i've missed something? | 13:41 |
jrosser_ | from your paste you look to be running openstack-ansible tag 28.2.0 | 13:42 |
noonedeadpunk | it should be from 28.0.0 I guess? | 13:44 |
nixbuilder | jrosser_: Yes... 28.2.0. But I guess I need to set that 'glance_rootwrap_conf_overrides' from now on. | 13:45 |
jrosser_ | i don't think so https://review.opendev.org/c/openstack/openstack-ansible-os_glance/+/901562/1/vars/main.yml | 13:45 |
jrosser_ | it is combined with the defaults from _glance_rootwrap_conf_overrides i think | 13:46 |
noonedeadpunk | yeah, but this one was cut with stable/2023.2? https://review.opendev.org/c/openstack/openstack-ansible-os_glance/+/900930 | 13:47 |
jrosser_ | right, so i think this was included by default in all 2023.2 releases | 13:48 |
jrosser_ | i'm confused about why the file is not correct in that case | 13:48 |
jrosser_ | nixbuilder: does this make sense? we think that patch should already be applied if you are running 28.2.0 | 13:50 |
jrosser_ | so that could be, for example, changing the point at which you have checked out the openstack-ansible repo and forgetting to run `bootstrap-ansible.sh` to get the correct versions of the roles in place? | 13:51 |
nixbuilder | jrosser_: I did run `bootstrap-ansible.sh` and in fact the 'glance_rootwrap_conf_overrides' is in the 'roles' directory... so the patch is there but it didn't take for some reason. | 13:53 |
jrosser_ | it would be really helpful if you could try to see why | 13:54 |
nixbuilder | jrosser_: I will try... shortly :-D | 13:55 |
noonedeadpunk | yeah, that is really surprising | 14:00 |
noonedeadpunk | #startmeeting openstack_ansible_meeting | 15:00 |
opendevmeet | Meeting started Tue Apr 30 15:00:38 2024 UTC and is due to finish in 60 minutes. The chair is noonedeadpunk. Information about MeetBot at http://wiki.debian.org/MeetBot. | 15:00 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 15:00 |
opendevmeet | The meeting name has been set to 'openstack_ansible_meeting' | 15:00 |
noonedeadpunk | #topic rollcall | 15:00 |
noonedeadpunk | o/ | 15:00 |
jrosser_ | o/ hello | 15:00 |
noonedeadpunk | #topic office hours | 15:02 |
noonedeadpunk | So, except weird current issues with tls job, one thing I wanted to discuss, if we should backport https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/917192 | 15:03 |
noonedeadpunk | as there're mixed feelings about it | 15:04 |
noonedeadpunk | I totally don't like that OVN/OVS will get upgraded on its own | 15:04 |
noonedeadpunk | but then it won't help on current versions | 15:05 |
jrosser_ | did we get any info about why these things were upgraded on a stable branch? | 15:05 |
NeilHanlon | hiya | 15:05 |
noonedeadpunk | and in case of minor upgrade - it might be even worse | 15:05 |
noonedeadpunk | I talked to rdo folks today, and it kinda boiled down to this - a neutron core recommended using the latest version of it | 15:05 |
noonedeadpunk | there're some comments in the patch above | 15:05 |
noonedeadpunk | along with RH ticket | 15:06 |
NeilHanlon | thanks for the link.. will look into this a bit more, too | 15:07 |
NeilHanlon | i do think it makes sense to pin versions to not... surprise people | 15:07 |
noonedeadpunk | yeah, true. though whether we should backport is tricky, I think | 15:07 |
NeilHanlon | yeah.. i think it is hard to say without knowing how many people are using OSA and targeting RHEL or Rocky | 15:08 |
NeilHanlon | _and_ using distro path | 15:08 |
noonedeadpunk | NeilHanlon: btw, maybe you know... If we install specific version - I assume it means that it won't be auto-upgraded to the newer one? | 15:08 |
NeilHanlon | not on its own, you'd have to also add excludepkgs | 15:08 |
noonedeadpunk | and it's regardless distro path | 15:08 |
NeilHanlon | oh, right.. we install this on source path too.. nvm | 15:09 |
jrosser_ | yes like UCA, some things always come from RDO | 15:09 |
jrosser_ | even for source installs | 15:09 |
* NeilHanlon forgot about that particular nuance | 15:09 |
noonedeadpunk | so I was suggested to use `excludepkgs=*rdo-openvswitch-3.[!2]*,rdo-ovn*-23.0[!9]*,rdo-ovn*-2[!3]*` | 15:09 |
* jrosser_ parses | 15:10 |
noonedeadpunk | but then we can potentially do it only for trunk install.. which is the only working one though :D | 15:10 |
NeilHanlon | :D | 15:10 |
NeilHanlon | yeah i think that looks okay... though tbh I didn't know until just now that excludepkgs supported that format | 15:10 |
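[Editor's note: for reference, dnf repo options such as `excludepkgs` take comma-separated glob lists, so the suggestion above would land in the repo file roughly as below. The repo id, name, and baseurl are placeholders; the glob list is the one quoted in the conversation.]

```ini
# delorean.repo fragment (sketch) -- keep dnf from pulling newer
# rdo-openvswitch/rdo-ovn meta packages from the trunk repo
[delorean]
name=delorean
baseurl=https://trunk.rdoproject.org/centos9-bobcat/current/
enabled=1
excludepkgs=*rdo-openvswitch-3.[!2]*,rdo-ovn*-23.0[!9]*,rdo-ovn*-2[!3]*
```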
noonedeadpunk | I never tried it tbh | 15:11 |
noonedeadpunk | But also I somehow thought that it shouldn't be needed... | 15:11 |
NeilHanlon | there is a 'versionlock' plugin which can be used, but by default, dnf/yum will update to the latest version contained in the repository | 15:11 |
noonedeadpunk | as I'd assume that ovn23.09 can not or should not be upgraded to ovn24.03 | 15:11 |
noonedeadpunk | as these should be kinda conflicting ones... | 15:12 |
NeilHanlon | let me check how they're configured.. | 15:12 |
NeilHanlon | that's a good point though, they are named differently, not just versioned differently | 15:13 |
noonedeadpunk | yeah | 15:13 |
noonedeadpunk | though rdo-ovn-central is named same... | 15:13 |
NeilHanlon | i think it's the rdo-ovn-XXX stuff that does the upgrading | 15:13 |
noonedeadpunk | So I'm very confused, but also got rusty with EL | 15:14 |
noonedeadpunk | yeah | 15:14 |
noonedeadpunk | ok, yes, DNF update does replace things... | 15:14 |
NeilHanlon | https://git.centos.org/rpms/rdo-openvswitch/blob/c9s-sig-cloud-openstack-caracal/f/SPECS/rdo-openvswitch.spec | 15:14 |
noonedeadpunk | https://paste.openstack.org/show/b9hX0edVOSAnxF5LMgxm/ | 15:15 |
NeilHanlon | are we targeting D now, or C? | 15:15 |
noonedeadpunk | Provides: ovn | 15:15 |
noonedeadpunk | C | 15:15 |
noonedeadpunk | or well. B even | 15:15 |
noonedeadpunk | we haven't merged B->C switch yet | 15:15 |
NeilHanlon | yeah it looks like when they do a new version, they make it obsolete the old ones | 15:16 |
noonedeadpunk | ok, so in fact that version pinning doesn't make too much sense as they're upgraded right away | 15:16 |
NeilHanlon | but i don't know why we'd be getting 24.3 now, tbh.. | 15:16 |
noonedeadpunk | they backported it to all stable branches as well | 15:16 |
noonedeadpunk | afaik | 15:17 |
NeilHanlon | right but I don't see ovn24.03 in their rdo-openvswitch package until D | 15:17 |
NeilHanlon | https://git.centos.org/rpms/rdo-openvswitch/blob/c9s-sig-cloud-openstack-dalmatian/f/SPECS/rdo-openvswitch.spec#_9 | 15:17 |
noonedeadpunk | ah, it's not released yet | 15:17 |
noonedeadpunk | but it's already present in trunk | 15:17 |
* NeilHanlon confused | 15:18 | |
noonedeadpunk | we don't install it through cloudsig | 15:18 |
NeilHanlon | ah | 15:18 |
NeilHanlon | i forgot how this RDO/cloud sig stuff works | 15:18 |
NeilHanlon | but i am remembering now | 15:18 |
noonedeadpunk | so we just fetch https://trunk.rdoproject.org/centos9-bobcat/current/delorean.repo | 15:20 |
noonedeadpunk | and place under yum.repos.d | 15:20 |
noonedeadpunk | which is pre-cloudsig release kinda | 15:21 |
noonedeadpunk | would make sense to do cloudsig package install ofc on production, BUT, they do also ship tons of dependencies | 15:21 |
noonedeadpunk | like rabbitmq, ceph, messaging repos, etc | 15:21 |
noonedeadpunk | so really smth we'd prefer not to have | 15:22 |
noonedeadpunk | so rdo-ovn replacing ovn23.09.x86_64 really negates any pinning attempt kinda... | 15:23 |
noonedeadpunk | NeilHanlon: what would you suggest?:) | 15:23 |
NeilHanlon | vodka | 15:23 |
noonedeadpunk | hahaha | 15:23 |
NeilHanlon | I will follow up with amorlej and see what we can come up with :D | 15:24 |
noonedeadpunk | ok, awesome, thanks for that | 15:24 |
* NeilHanlon bows | 15:24 | |
noonedeadpunk | as in fact I'm a bit lost in options as it all looks a bit confusing | 15:24 |
NeilHanlon | yeah, some of this requires a Doctorate in RPMology | 15:25 |
noonedeadpunk | as a matter of fact I know for sure that such an upgrade between releases being executed while the user is unaware may lead to all kinds of weird things and downtimes | 15:25 |
noonedeadpunk | like in APT I would just pin to a version and forget about it I guess... | 15:25 |
noonedeadpunk | though UCA almost never does this... | 15:26 |
noonedeadpunk | ok | 15:28 |
noonedeadpunk | regarding TLS failure - so far it's very weird. Though reproducible, which is nice | 15:29 |
jrosser_ | do we have an LXC equivalent job? | 15:31 |
noonedeadpunk | no, I don't think so | 15:31 |
jrosser_ | hmm i might try that too | 15:31 |
jrosser_ | so far i don't find anything | 15:31 |
noonedeadpunk | if I was to guess - I think it would work | 15:31 |
jrosser_ | it is a shame that the meta task does not seem to print any useful output | 15:32 |
jrosser_ | not OK / Changed / Skipped, just nothing | 15:32 |
noonedeadpunk | as we do restart container | 15:32 |
jrosser_ | which does make me wonder if it actually does anything | 15:32 |
noonedeadpunk | I kinda wonder if we should instead pass that somehow in python_venv_build | 15:35 |
jrosser_ | the CA bundle location? | 15:35 |
noonedeadpunk | yeah | 15:36 |
jrosser_ | i was looking at the pip module source and it uses module_utils.run_command | 15:36 |
noonedeadpunk | as we do same for uwsgi | 15:36 |
jrosser_ | which is kind of a long way from being the command/shell module tbh | 15:36 |
noonedeadpunk | https://opendev.org/openstack/ansible-role-uwsgi/src/branch/master/vars/debian.yml#L36 | 15:36 |
noonedeadpunk | but then I did logout from VM - login again and it worked | 15:37 |
noonedeadpunk | (as long as env was reloaded) | 15:37 |
jrosser_ | i am just trying a second run of gate-check-commit in the same shell again | 15:37 |
jrosser_ | but ~30mins between tries | 15:38 |
noonedeadpunk | are env cached in facts? | 15:40 |
jrosser_ | they are | 15:41 |
jrosser_ | we could perhaps use ansible_env to set environment: on the task | 15:41 |
jrosser_ | which is a pretty weird construct tbh | 15:42 |
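[Editor's note: one way the `ansible_env` idea could look — feeding the cached fact back into the task's environment so a stale persistent session still sees the variable. The task, package name, venv path, and fallback path are all made up for illustration.]

```yaml
# Sketch: reuse the cached ansible_env fact for a single task; the risk
# noted in the conversation is that this silently overrides whatever the
# live session environment actually contains.
- name: Install packages into the venv
  ansible.builtin.pip:
    name: some-package
    virtualenv: /openstack/venvs/utility
  environment:
    REQUESTS_CA_BUNDLE: "{{ ansible_env.REQUESTS_CA_BUNDLE | default('/etc/ssl/certs/ca-certificates.crt') }}"
```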
noonedeadpunk | yeah, and have the risk of overriding things | 15:42 |
jrosser_ | huh now it worked on my second run of gate-check-commit | 15:43 |
jrosser_ | no logout | 15:43 |
noonedeadpunk | though I was thinking that instead of reset_connection the correct thing would be clear_facts | 15:43 |
jrosser_ | it would have been much longer than the controlpersist timeout there | 15:44 |
jrosser_ | between first and second run | 15:44 |
jrosser_ | this looks massively out of date https://opendev.org/openstack/openstack-ansible/src/branch/master/tests/roles/bootstrap-host/templates/user_variables.aio.yml.j2#L283 | 15:51 |
noonedeadpunk | I found this today: https://opendev.org/openstack/openstack-ansible-openstack_hosts/src/branch/master/vars/redhat-9.yml#L73 | 15:52 |
noonedeadpunk | btw, I will be on OpenInfra Days in Sweden next week on Tuesday | 15:53 |
noonedeadpunk | which means I won't be around on Monday and Wednesday either... | 15:53 |
NeilHanlon | lol at openstack queens | 15:54 |
NeilHanlon | i can run next week, if we still want to meet | 15:54 |
noonedeadpunk | Damian would be also away just in case... | 15:54 |
noonedeadpunk | so it potentially makes sense to skip, even though we're coming very close to the release due date | 15:55 |
noonedeadpunk | Which is Jun 03 | 15:55 |
noonedeadpunk | basically in a month... | 15:56 |
jrosser_ | hmm ok so we need to make sure we prioritise | 15:58 |
jrosser_ | and just for more /o\ i think there will be zed moved to unmaintained | 15:58 |
jrosser_ | still need to finish victoria which is close to working again | 15:59 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Add Tempest test for OVN Octavia driver https://review.opendev.org/c/openstack/openstack-ansible/+/916872 | 16:00 |
noonedeadpunk | well, I've tried to add ovn provider testing for octavia ^ | 16:00 |
noonedeadpunk | then capi I think is pretty much ready | 16:01 |
jrosser_ | there is messaging improvement stuff on all roles (?) too | 16:01 |
noonedeadpunk | for skyline 1 patch about yarn is left: https://review.opendev.org/c/openstack/openstack-ansible-os_skyline/+/914405 | 16:01 |
noonedeadpunk | yeah | 16:01 |
noonedeadpunk | this one I aim to finish pushing by end of the week | 16:02 |
noonedeadpunk | I paused just because gates are borked anyway... | 16:02 |
noonedeadpunk | and, ovn-bgp-agent is the last thing I guess? | 16:03 |
noonedeadpunk | ah, and we're out of time | 16:03 |
noonedeadpunk | #endmeeting | 16:03 |
opendevmeet | Meeting ended Tue Apr 30 16:03:44 2024 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 16:03 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/openstack_ansible_meeting/2024/openstack_ansible_meeting.2024-04-30-15.00.html | 16:03 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/openstack_ansible_meeting/2024/openstack_ansible_meeting.2024-04-30-15.00.txt | 16:03 |
opendevmeet | Log: https://meetings.opendev.org/meetings/openstack_ansible_meeting/2024/openstack_ansible_meeting.2024-04-30-15.00.log.html | 16:03 |
NeilHanlon | ope, apparently my VM decided to die just then 🙃 | 16:15 |
spatel | can we configure multiple AZ for single cinder-volume ? | 16:54 |
noonedeadpunk | spatel: good question | 16:55 |
noonedeadpunk | I'm not sure about that | 16:55 |
noonedeadpunk | spatel: you can define that per storage driver | 16:56 |
spatel | I am dealing with a strange issue where my k8s is running in availability zone A and cinder in a different AZ | 16:56 |
spatel | I have multiple compute AZ but storage is shared between all of them :( | 16:56 |
noonedeadpunk | and should it be available cross-az? | 16:56 |
noonedeadpunk | I'm not sure it's "issue" | 16:56 |
noonedeadpunk | as then you have cinder cross-attach | 16:56 |
noonedeadpunk | which means that cinder volume can be in any az | 16:57 |
spatel | Trying to understand where to configure cross-az ? | 16:57 |
noonedeadpunk | it's either true or false | 16:57 |
spatel | where should I set that config? | 16:57 |
noonedeadpunk | https://docs.openstack.org/nova/latest/configuration/config.html#cinder.cross_az_attach | 16:57 |
spatel | in cinder.conf? | 16:58 |
spatel | or magnum ? | 16:58 |
noonedeadpunk | that is nova config | 16:58 |
noonedeadpunk | don't know about magnum | 16:58 |
spatel | By default there is no availability zone restriction on volume attach. | 16:59 |
noonedeadpunk | yup, that's right | 16:59 |
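[Editor's note: the option linked above lives in nova's configuration, not cinder's or magnum's — roughly as below on the compute hosts.]

```ini
# nova.conf (sketch): the default is True, i.e. volumes may attach to
# instances in a different AZ; set False to enforce same-AZ attachment
[cinder]
cross_az_attach = True
```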
spatel | I have submitted issue here - https://github.com/vexxhost/magnum-cluster-api/issues/366 | 17:00 |
spatel | Looks like k8s doesn't like multi-AZ | 17:00 |
spatel | the k8s csi plugin AZ is set based on where the compute AZ is configured | 17:01 |
noonedeadpunk | have no clue about that | 17:01 |
spatel | k8s CSI plugin need some patch to tell it not to follow worker/compute AZ rule. | 17:01 |
noonedeadpunk | though it seems you got an answer there | 17:01 |
spatel | But still we need a patch in magnum / magnum-capi | 17:02 |
noonedeadpunk | but frankly - I'm not sure how to even sort this out | 17:02 |
jrosser_ | it is the config file for CSI plugin that needs patch | 17:02 |
jrosser_ | not the plugin itself | 17:02 |
noonedeadpunk | as ideally, I guess you'd want to have the volume created in the same az where the vm runs | 17:02 |
noonedeadpunk | though I'm absolutely sure csi does not care about it | 17:02 |
noonedeadpunk | so you will get absolutely random assignment for volume there | 17:03 |
spatel | If CSI AZ and Worker AZ is different then you can't attach volume to pod | 17:03 |
noonedeadpunk | which is slightly different from the behaviour when creating a VM in the first place, where the az will be respected where possible | 17:03 |
spatel | You will get anti-affinity error | 17:03 |
spatel | https://paste.opendev.org/show/bJg4klKcPC6GmAhp9dvk/ | 17:04 |
spatel | Right now I am manually changing this value - topology.cinder.csi.openstack.org/zone=general | 17:05 |
jrosser_ | as far as i can see you have the answer in your github issue | 17:05 |
spatel | After magnum or mcapi patch we can override it with k8s-template or label | 17:06 |
spatel | I think its a hack but not a long term solution | 17:06 |
spatel | How do I tell CSI to change its AZ ? | 17:07 |
jrosser_ | right, so the routes are 1) you patch this yourself and make a PR (it is open source after all) 2) wait for someone else to do that for you | 17:07 |
spatel | I am just trying to understand how many components required to patch | 17:08 |
spatel | I believe CSI and Magnum both need a patch | 17:08 |
spatel | magnum needs a patch to accept a label (Ex: csi_availability_zone) or something and we need to pass this value to the CSI plugin to set the AZ accordingly in k8s during deployment | 17:09 |
spatel | I may be wrong just trying to understand workflow | 17:10 |
nixbuilder | OK... so I have re-installed once more from scratch and still uploading images to glance fails... I went in and fixed the rootwrap.conf file, restarted glance-api and images now upload just fine... here is what I have documented https://paste.openstack.org/show/bOvckMH64N5pIykPc12A/ | 17:22 |
noonedeadpunk | that is very weird | 17:24 |
noonedeadpunk | I've checked couple of my latest AIOs and they do have correct rootwrap... | 17:25 |
jrosser_ | nixbuilder: it would be just as valuable to re-run the glance playbook and provide the output here for the relevant tasks, as to reinstall | 17:26 |
noonedeadpunk | and I do see it in CI as well https://zuul.opendev.org/t/openstack/build/f1d435c814034268b0dd42cbedb6ffc9/log/logs/etc/openstack/aio1-glance-container-b0e673b6/glance/rootwrap.conf.txt#20 | 17:26 |
jrosser_ | what we need to understand is where the content that gets into your rootwrap file is coming from, and why it's wrong | 17:27 |
noonedeadpunk | sounds like there might be wrong ordering or smth | 17:27 |
noonedeadpunk | but I don't see anything obvious | 17:28 |
noonedeadpunk | (like adjusting 1 file and then symlinking smth completely different) | 17:28 |
jrosser_ | so for example, dropping in a debug: task immediately before that file is templated, with `_glance_rootwrap_conf_overrides` and `glance_core_files` displayed would be helpful | 17:29 |
jrosser_ | and seeing this task running with -vvv would be interesting https://opendev.org/openstack/openstack-ansible-os_glance/src/branch/master/tasks/glance_post_install.yml#L126 | 17:31 |
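[editor's note] The debug task jrosser_ suggests could look roughly like this, dropped into `tasks/glance_post_install.yml` immediately before the templating task so the loop inputs are visible in the playbook output (a sketch; only the two variable names come from the messages above):

```yaml
# Sketch of the suggested debug task: print the override dict and the
# file list that the config_template loop will actually iterate over.
- name: Show rootwrap template inputs
  debug:
    msg:
      _glance_rootwrap_conf_overrides: "{{ _glance_rootwrap_conf_overrides | default({}) }}"
      glance_core_files: "{{ glance_core_files | default([]) }}"
```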
noonedeadpunk | and in fact - yes, /etc/glance is a symlink | 17:31 |
noonedeadpunk | https://opendev.org/openstack/openstack-ansible-os_glance/src/branch/master/tasks/glance_install.yml#L86-L95 | 17:32 |
noonedeadpunk | but then by the time the rootwrap change is run, it should already be linked properly | 17:35 |
noonedeadpunk | (although it would be better to re-define glance_etc_dir to be inside the venv) | 17:36 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible-ops master: Add CAPI job to OPS repo https://review.opendev.org/c/openstack/openstack-ansible-ops/+/916640 | 17:42 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible-ops master: Test all supported versions of k8s workload cluster with magnum-cluster-api https://review.opendev.org/c/openstack/openstack-ansible-ops/+/916649 | 17:44 |
jrosser_ | this is all really very complicated with the way the service config files are templated | 17:45 |
jrosser_ | i understand that it is some kind of optimisation | 17:45 |
jrosser_ | but it really is not obvious what is happening | 17:45 |
noonedeadpunk | so the only way I can technically explain that weirdness would be smth being off with symlink to /etc/glance | 17:49 |
noonedeadpunk | but then whole glance.conf should be off... | 17:49 |
noonedeadpunk | nixbuilder: what's the output of ls -l /openstack/venvs/glance-*/etc/glance/rootwrap.conf ? | 17:51 |
nixbuilder | noonedeadpunk: '-rw-r--r-- 1 root root 1079 Apr 30 12:13 /openstack/venvs/glance-28.2.0/etc/glance/rootwrap.conf' | 17:52 |
noonedeadpunk | huh | 17:53 |
noonedeadpunk | I have it as 640 owned by root:glance.... | 17:53 |
noonedeadpunk | and I found broken permissions for /etc/glance/rootwrap.d ... | 17:55 |
nixbuilder | noonedeadpunk: For some reason... that file '/openstack/venvs/glance-28.2.0/etc/glance/rootwrap.conf' is correct... hmmm... but the operative file '/etc/glance/rootwrap.conf' remains wrong. When I edit that file, uploading images works. | 17:56 |
noonedeadpunk | what is `stat /etc/glance`? | 17:57 |
noonedeadpunk | as it should be a symlink `File: /etc/glance -> ../openstack/venvs/glance-28.1.0.dev87/etc/glance` | 17:57 |
noonedeadpunk | or you're having distro install? | 17:57 |
nixbuilder | noonedeadpunk: OK... /etc/glance is a symlink to '/openstack/venvs/glance-28.2.0/etc/glance' | 17:59 |
nixbuilder | noonedeadpunk: No... my install is from source | 18:00 |
nixbuilder | nonedeadpunk: 'install_method: source' | 18:00 |
noonedeadpunk | but /etc/glance/rootwrap.conf and /openstack/venvs/glance-28.2.0/etc/glance/rootwrap.conf are different despite these are symlinks???? | 18:00 |
nixbuilder | noonedeadpunk: No... I did not realize about the symlink... so forget what I said about there being a difference | 18:02 |
nixbuilder | noonedeadpunk: I also re-ran the glance install to capture the output for you and no changes were made. I ran it again with '-t glance-config' and still no changes. That's why I re-install from scratch, because sometimes re-running playbooks does not change things. | 18:04 |
jrosser_ | ansible is idempotent, so if there is nothing to change, it won't | 18:09 |
jrosser_ | i am still not really understanding if the file on the disk is right, or wrong now | 18:09 |
nixbuilder | jrosser: Since I manually changed the file, it is correct. | 18:12 |
jrosser_ | right - so we still need to understand why it was coming out wrong in the first place | 18:13 |
noonedeadpunk | and why we don't see that in CI | 18:13 |
jrosser_ | nixbuilder: currently you have the reproducer for whatever is going on here :) | 18:13 |
nixbuilder | jrosser_: Stand by... running os-glance-install | 18:16 |
nixbuilder | jrosser_: So I just removed my edit to rootwrap.conf and set the file back to what it was after the initial install, then ran 'openstack-ansible os-glance-install.yml -t glance-config -vvv | tee ~/logs/glance-debug.log' expecting that the playbook would pick up my 'glance_rootwrap_conf_overrides', and it did not. | 18:23 |
nixbuilder | jrosser_: Trying to figure a way to get that file into paste | 18:24 |
nixbuilder | jrosser_: I mean the log file I generated | 18:25 |
noonedeadpunk | you can try dropping `/openstack/venvs/glance-28.1.0.dev87/etc/glance` and then run the role with `-e venv_rebuild=true` | 18:25 |
nixbuilder | noonedeadpunk: OK... trying that now | 18:29 |
jrosser_ | just to be sure you've not defined glance_rootwrap_conf_overrides, your comment suggests you have set some variable for that | 18:29 |
jrosser_ | the default for that var is empty {} | 18:30 |
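[editor's note] For context, the default really is an empty dict, and a non-empty override set in user_variables.yml gets deep-merged into the templated file by config_template. A sketch (the exec_dirs value here is purely illustrative, not a recommended setting):

```yaml
# user_variables.yml: example shape of a rootwrap.conf override.
# The default for this variable is {}; the value below is illustrative.
glance_rootwrap_conf_overrides:
  DEFAULT:
    exec_dirs: /openstack/venvs/glance-28.2.0/bin
```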
nixbuilder | jrosser_: It's empty... re-running now | 18:32 |
jrosser_ | ok no worries, we just have to be very precise here or there will be confusion | 18:32 |
noonedeadpunk | how in the world can you reproduce that.... | 18:33 |
noonedeadpunk | maybe I indeed should spawn 28.2.0 explicitly... | 18:33 |
noonedeadpunk | seems we need to spread https://opendev.org/openstack/openstack-ansible-os_cinder/commit/f5ba9b5b9a88ab09f3d6afded011976525c863a3 | 18:36 |
nixbuilder | noonedeadpunk: Which file/task/role sets the contents of the rootwrap.conf? | 18:41 |
nixbuilder | noonedeadpunk: I am trying to help narrow down where this is failing :-D | 18:41 |
nixbuilder | noonedeadpunk: BTW... running 'openstack-ansible -e venv_rebuild=true os-glance-install.yml | tee ~/logs/glance-debug.log' fails to set the proper exec_dir path. | 18:42 |
noonedeadpunk | this one: https://opendev.org/openstack/openstack-ansible-os_glance/src/branch/master/tasks/glance_post_install.yml#L126-L138 | 18:43 |
nixbuilder | noonedeadpunk: OK... thanks. | 18:45 |
jrosser_ | nixbuilder: secret weapon is codesearch.opendev.org | 18:50 |
jrosser_ | put the variable name into that and you will get almost straight to where it is defined / used | 18:50 |
nixbuilder | noonedeadpunk: Ok... so here is the relevant output from my latest install of os-glance (https://paste.openstack.org/show/bkIbH18PlFemDCwHQdN3/) I hope this helps narrow this down. | 18:51 |
noonedeadpunk | um | 18:52 |
noonedeadpunk | rootwrap is not there... | 18:52 |
noonedeadpunk | it really feels like https://opendev.org/openstack/openstack-ansible-os_glance/commit/c2428ab8da9cc3868b5ae86140a63e4a33e28eca is not there | 18:53 |
nixbuilder | noonedeadpunk: This was my command line... 'openstack-ansible -e venv_rebuild=true os-glance-install.yml -vvvv | tee ~/logs/glance-debug.log' and then I searched for the 'Copy common config' | 18:54 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_glance master: Set correct permissions for rootwrap.d https://review.opendev.org/c/openstack/openstack-ansible-os_glance/+/917785 | 18:55 |
noonedeadpunk | nixbuilder: the task is correct, but it doesn't attempt to manage the rootwrap config | 18:55 |
noonedeadpunk | like either the os_glance role is older than expected | 18:56 |
noonedeadpunk | or glance_core_files is somehow overridden | 18:56 |
noonedeadpunk | nixbuilder: can you check that on deploy host /etc/ansible/roles/os_glance/vars/main.yml contains rootwrap.conf as part of glance_core_files? | 18:56 |
jrosser_ | ^ also in that directory please do `git log` and let us know what the hash of the latest commit is | 18:57 |
nixbuilder | jrosser_: noonedeadpunk: Answers to both here https://paste.openstack.org/show/b3RExIQPkbqC7ELihhs2/ | 19:03 |
noonedeadpunk | I really don't get what's going on | 19:05 |
noonedeadpunk | sha apparently is correct | 19:05 |
noonedeadpunk | nixbuilder: but do you have this specific block? https://opendev.org/openstack/openstack-ansible-os_glance/src/commit/d6f5f69bf9efee901a6d0c2b6e700bc2ad782dbe/vars/main.yml#L91-L97 | 19:06 |
nixbuilder | noonedeadpunk: Sorry... did not get all the paste the first time :) https://paste.openstack.org/show/bG3puj6EpQ26WAEK23rK/ | 19:06 |
noonedeadpunk | nixbuilder: ok, I have an idea :D you're doing this every time on the same host, right? | 19:06 |
nixbuilder | noonedeadpunk: Yes... I started out with an AIO and then gradually worked in my changes to fit our environment. | 19:07 |
noonedeadpunk | so, do you apparently have /root/.ansible/roles folder? | 19:08 |
nixbuilder | noonedeadpunk: No | 19:09 |
noonedeadpunk | ugh | 19:09 |
nixbuilder | noonedeadpunk: The /root/.ansible folder exists... but no roles | 19:10 |
noonedeadpunk | yeah, and I just looked at your previous paste and `task path: /etc/ansible/roles/os_glance/tasks/glance_post_install.yml:111` | 19:11 |
noonedeadpunk | so I hoped that maybe you have os_glance coming from some different path | 19:11 |
noonedeadpunk | as this really doesn't add up for me | 19:11 |
noonedeadpunk | so, you have rootwrap.conf in glance_core_files, but then when you iterate over them here https://paste.openstack.org/show/bkIbH18PlFemDCwHQdN3/ - it's not there | 19:12 |
noonedeadpunk | /o\ | 19:12 |
jrosser_ | we never did print out glance_core_files though? | 19:12 |
jrosser_ | as a debug | 19:12 |
nixbuilder | noonedeadpunk: https://paste.openstack.org/show/bc0IG9hfQSV4NKKAHRJg/ | 19:13 |
jrosser_ | nixbuilder: can you try something like this as the first tasks in /etc/ansible/roles/os_glance/tasks/main.yml https://paste.opendev.org/show/bAGTxq1DXLSYWi0SGWAM/ | 19:16 |
nixbuilder | jrosser_: https://paste.openstack.org/show/bDc2GEX15jalvY9KPgoX/ | 19:22 |
jrosser_ | hmm sorry, you might need to adjust the point where those tasks run to get all the vars in scope | 19:23 |
jrosser_ | but even so, do you see that there are references to rootwrap there? | 19:24 |
nixbuilder | jrosser_: Second time... https://paste.openstack.org/show/b9A7uAQRL1dWhAP2rKwr/ | 19:28 |
noonedeadpunk | how is that even possible | 19:28 |
jrosser_ | errrr wtf | 19:28 |
noonedeadpunk | nixbuilder: don't you have some override for glance_core_files ?:) | 19:28 |
noonedeadpunk | oh | 19:29 |
jrosser_ | what OS is this | 19:29 |
noonedeadpunk | wait a second | 19:29 |
noonedeadpunk | Is it Rocky? | 19:29 |
noonedeadpunk | https://opendev.org/openstack/openstack-ansible-os_glance/src/branch/master/vars/redhat.yml#L36-L41 | 19:29 |
* jrosser_ thinking the same :) | 19:29 |
noonedeadpunk | why in the world we have that /o\ | 19:29 |
nixbuilder | Yes... Rocky | 19:30 |
noonedeadpunk | ok, that was useful | 19:31 |
noonedeadpunk | as that's totally a bug | 19:31 |
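[editor's note] The bug pattern identified here is worth spelling out: when an OS-specific vars file redefines a variable, Ansible replaces the whole value rather than merging it, so a list defined in vars/redhat.yml silently shadows the fuller default from vars/main.yml. Schematically (the file contents below are illustrative, not the actual role code):

```yaml
# vars/main.yml (illustrative) -- the default list includes rootwrap.conf
glance_core_files:
  - tmp_f: /tmp/glance-api.conf
    target_f: /etc/glance/glance-api.conf
  - tmp_f: /tmp/rootwrap.conf
    target_f: /etc/glance/rootwrap.conf

# vars/redhat.yml (illustrative) -- loaded on EL hosts such as Rocky;
# this whole list REPLACES the one above, so rootwrap.conf silently
# drops out of the "Copy common config" loop on those distributions.
glance_core_files:
  - tmp_f: /tmp/glance-api.conf
    target_f: /etc/glance/glance-api.conf
```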
nixbuilder | noonedeadpunk: No overrides for 'glance_core_files' | 19:32 |
noonedeadpunk | yeah, I found it | 19:34 |
jrosser_ | good job we put that debug at the top of the file :) | 19:35 |
nixbuilder | noonedeadpunk: I'm glad you found it... I was beginning to think I was crazy =-O | 19:36 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_glance master: Fix rootwrap.conf distribution for EL https://review.opendev.org/c/openstack/openstack-ansible-os_glance/+/917787 | 19:37 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_glance master: Fix rootwrap.conf distribution for EL https://review.opendev.org/c/openstack/openstack-ansible-os_glance/+/917787 | 19:37 |
noonedeadpunk | nixbuilder: this should be fixing it ^ | 19:37 |
noonedeadpunk | I had the same feeling frankly speaking... | 19:37 |
noonedeadpunk | thanks a lot for your time and help narrowing this down | 19:38 |
nixbuilder | noonedeadpunk:jrosser_: No problem... and thank you guys for taking the time to help find this. | 19:39 |
nixbuilder | Success... after applying the patches, glance installed normally https://paste.openstack.org/show/bIpTJUq03NqecSqi5vet/ | 20:12 |
Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!