Tuesday, 2024-04-30

opendevreviewJonathan Rosser proposed openstack/openstack-ansible master: Increase number of threads to 2 for glance in AIO  https://review.opendev.org/c/openstack/openstack-ansible/+/91681007:51
jrosser_o/ morning07:53
noonedeadpunko/07:55
noonedeadpunkso, I see very weird TLS issues since yesterday07:56
noonedeadpunkhttps://zuul.opendev.org/t/openstack/build/2aebbc523beb427495137d5fee6a347f07:56
noonedeadpunkwhich I don't get frankly speaking07:56
noonedeadpunkhuh, I wonder if it could be related to new chain for Let's Encrypt....08:11
noonedeadpunkThough, unlikely: ERROR: Could not install packages due to an OSError: HTTPSConnectionPool(host='172.29.236.101', port=8181): Max retries exceeded with url: /constraints/upper_constraints_cached.txt08:11
jrosser_that is pretty odd08:14
jrosser_because the task 2 before also has the same --constraint and nothing went wrong there08:14
jrosser_not sure LE is involved in this though?08:15
jrosser_it is more that `/openstack/venvs/utility-28.1.0.dev102/bin/pip` is likely seeing the certifi CA bundle, not the system CA store08:16
noonedeadpunknot in that...08:16
noonedeadpunkat least should not08:16
noonedeadpunkI'm spawning AIO now...08:16
noonedeadpunkbut we changed nothing there.08:17
jrosser_and we set `REQUESTS_CA_BUNDLE` in order to make that work08:17
noonedeadpunkbut, it's around time where Root for LE should be changed....08:17
noonedeadpunkyeah08:17
noonedeadpunkI actually can't recall if we do use LE for TLS jobs, as we actually could...08:17
jrosser_well remember it is step-ca with a private acme CA, not actual LE for CI jobs08:19
noonedeadpunkactually yes, you're right08:19
noonedeadpunkwe can't issue LE as there's no domain08:19
jrosser_correct08:19
jrosser_and rate limit blah blah too08:20
noonedeadpunkand stepca is a separate job08:20
jrosser_so btw i did have a look at 24.0408:20
noonedeadpunkaio almost at utility installation step08:20
noonedeadpunkI think everything is terrible there?08:20
jrosser_and somehow lxc is pretty broken, like could not boot 24.04 image on 24.04 host08:21
jrosser_however, i might have another go on a metal deploy instead08:21
noonedeadpunkwell, given that openstack services are known not to work on python3.1208:21
noonedeadpunkand it's not even known if py3.12 will be added to 2024.208:21
jrosser_i do think that the debian people have been chipping away at that08:22
noonedeadpunkdue to some quite big libraries being deprecated/not supported for py31208:22
noonedeadpunkyup08:22
jrosser_so even though there is no formal support we might get quite some way, but yes absolutely not something releaseable08:22
noonedeadpunkand, I reproduced in AIO08:23
jrosser_i wonder if there is a new pip or something08:25
noonedeadpunkwell, we constrain it...08:26
noonedeadpunkand https://review.opendev.org/c/openstack/openstack-ansible/+/916792 is not merged 08:27
noonedeadpunkofc curl https://172.29.236.101:8181/constraints/upper_constraints_cached.txt has no complaints08:27
jrosser_what would be interesting is to dump the env vars just before that task08:28
jrosser_though if it is something like the ssh session has not re-started (and so not loaded REQUESTS_CA_BUNDLE) then i would expect it to run OK second time if you re-run the utility playbook08:29
jrosser_we are probably missing an explicit `meta: reset_connection` after we drop the env vars08:30
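A minimal sketch of that idea (the task name and placement are assumptions, not the actual role content):

```yaml
# Illustrative only: placed right after the task that writes the new
# environment variables (e.g. REQUESTS_CA_BUNDLE to /etc/environment),
# this drops Ansible's persistent SSH connection so the next task's
# shell is spawned with the updated environment loaded.
- name: Reset connection so the updated environment is picked up
  ansible.builtin.meta: reset_connection
```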
noonedeadpunknah, re-run fails as well.08:30
noonedeadpunkI think cert looks okeyish? https://paste.openstack.org/show/bUuQ8DrpcXwux83VAqy3/08:31
noonedeadpunkoh, yes08:32
noonedeadpunkthat is indeed env not being re-loaded08:32
noonedeadpunk(or connection not being re-established)08:33
jrosser_should go here perhaps https://opendev.org/openstack/openstack-ansible-openstack_hosts/src/branch/master/tasks/main.yml#L5508:34
noonedeadpunkyeah, as a handler08:34
jrosser_this could easily be an updated sshd or something with different/fixed behaviour08:34
noonedeadpunkI kind of wonder how 2023.2 is passing then, frankly speaking09:05
noonedeadpunkor maybe it's not09:05
noonedeadpunkhttps://zuul.opendev.org/t/openstack/build/c007bb5834bc4fd08c3d6a376ad76604 passing at least09:06
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-openstack_hosts master: Re-initiate connection after environment is changed  https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/91758909:14
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible master: Update global pins for 2024.1  https://review.opendev.org/c/openstack/openstack-ansible/+/91679209:15
noonedeadpunkso it's actually smth we merged I believe09:18
noonedeadpunkas upgrade jobs fail on N+109:18
jrosser_i wonder if it is the pip caching09:22
jrosser_like now it has to actually request that constraints file every single time09:22
jrosser_where before it might be sat on the disk09:23
noonedeadpunkbut it was for nginx only?09:32
noonedeadpunkah, or it wasn't09:32
noonedeadpunkas it was client caching09:32
noonedeadpunkso you think that it could be constraints be fetched by pip during bootstrap-ansible?09:33
noonedeadpunkand then cached and passed between venvs?09:33
jrosser_would be interesting to know if `python -m pip cache dir` is the same place for all the venv and also the system pip09:35
jrosser_if that is the case then the cache would be shared09:36
jrosser_andrews patch set an nginx header which told pip not to cache09:36
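The mechanism here is that pip's HTTP cache (via the CacheControl library) honours standard caching headers, so serving the constraints file with `no-store` keeps clients from caching it. A hypothetical nginx fragment (the location path is an assumption, not the actual repo server config):

```nginx
location /constraints/ {
    # Tell pip's HTTP cache never to store this response, so the
    # constraints file is re-fetched on every install.
    add_header Cache-Control "no-store";
}
```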
noonedeadpunkyup, you're right here09:49
noonedeadpunkhttps://paste.openstack.org/show/bpmpuGRUPRuj2lT76i09/09:49
noonedeadpunkwhich explains how disabling the cache broke tls jobs...09:49
noonedeadpunkthat's kinda fun consequence. As then things are broken outside of aio I'd assume09:49
admin1is there a tag to tell osa just to recreate the rabbitmq queues ? 09:56
noonedeadpunkno, not specifically09:59
noonedeadpunkrabbitmq role needs some kind of love to modernize it a bit09:59
jrosser_i guess that then there is some potential different behaviour also on lxc, as there really would be a unique cache per service09:59
jrosser_noonedeadpunk: i think that andrew was going to look at fixing that as part of making a quorum queue migration script10:00
noonedeadpunkI was also thinking about modernizing rabbitmq_upgrade thingy10:00
noonedeadpunkas with rolling upgrade we shouldn't really need to fully re-install it10:01
noonedeadpunkthe way we do today10:01
Ra001Hi I am trying to install recent version of openstack with OVN, and I am struggling to configure provider network for Octavia service , here is my configurations https://pastebin.com/01xSnisH10:07
noonedeadpunkRa001: hey there, let me check10:10
noonedeadpunkI think it should be `vlan` network, not `flat`10:11
noonedeadpunkalso /24 might be a bit low just in case, as 1 LB effectively occupies 2 IPs from this network10:12
noonedeadpunkso you are capped with 128 LBs kinda10:12
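The arithmetic behind that cap, assuming 2 IPs consumed per load balancer as described above:

```python
import ipaddress

# A /24 has 256 addresses; subtract the network and broadcast
# addresses, then divide by the ~2 ports each Octavia LB takes
# on this network. That lands at roughly the figure quoted.
net = ipaddress.ip_network("192.0.2.0/24")  # illustrative subnet
usable = net.num_addresses - 2
print(usable // 2)  # -> 127
```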
Ra001Well yes sure I can increase the range but I am not sure about the vlan part, this lb is on a vlan network I create like bond.710:13
Ra001Also after running the playbook I think openvswitch is trying to create the bridge br-lbaas and is showing error "error: "could not add network device br-lbaas to ofproto (File exists)"10:15
noonedeadpunkRa001: ok, so the thing with octavia is, that this network should be 1. available on controllers and passed to LXC containers. That's where you need to create br-lbaas with netplan10:16
noonedeadpunkand 2. On compute/net nodes this same network should be managed with neutron. So you should not try to create it with netplan there10:17
Ra001the eth14 and eth13 used to be dummy interfaces so do I have to use a real interface or just use the same configuration and openvswitch will take care of it for me?10:19
noonedeadpunkso eth14/13 is the interface that will appear inside LXC container10:21
noonedeadpunkthey should not be relevant for metal hosts10:21
Ra001Also If i have to disable netplan interface then what should be the IP address for the bridge and host_bind_override parameter should be replaced with network_interface: bond.7 I am not sure10:21
noonedeadpunkbasically, on compute/network nodes, you should not do anything as long as its provider_networks are configured properly10:22
noonedeadpunkyou don't need any IP on the bridge itself just in case10:22
noonedeadpunk(unless you're doing metal deployment)10:22
Ra001Well I am doing metal deployment , I have services running in lxc but only on infra nodes10:24
Ra001hypervisor nodes are bare servers with some nodes running ceph osd also10:25
noonedeadpunkyeah, so, you need to create br-lbaas only on controllers, and you don't need it anywhere else10:27
noonedeadpunkand they don't need to have IPs on them10:27
noonedeadpunkips should be assigned inside lxc containers, but that should be done by playbooks10:28
jrosser_the dummy interfaces are only relevant for AIO/CI cases, not at all for a real deployment10:28
noonedeadpunkand still - I think this interface should be either vlan (actually, depending on how you configured other networks, like external one)10:28
noonedeadpunkor you need to apply host_bind_override to make it  bond.710:30
Ra001Yeah but how will openvswitch identify the vlan id, should i define network_interface: bond.7 10:30
Ra001Yeah thats what i am trying to say. Also if i define network_interface: bond.7 what happens to control nodes10:31
noonedeadpunkRa001: so. for compute/net nodes, this network is managed with Neutron. So basically, octavia role (or you manually) should create a network of type vlan (or flat) with tag 7 for that10:32
jrosser_^ this is just a regular provider network, like your external network10:33
noonedeadpunkwe have variables for that: https://opendev.org/openstack/openstack-ansible-os_octavia/src/branch/master/defaults/main.yml#L359-L36210:33
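As a sketch, the kind of override being pointed at might look like the following in user_variables.yml. The variable names should be checked against the linked defaults/main.yml; the values are purely illustrative for the bond.7 setup discussed above:

```yaml
octavia_provider_network_name: lbaas       # neutron network name for the lb network
octavia_provider_network_type: vlan        # vlan rather than flat, per the advice above
octavia_provider_segmentation_id: 7        # matching the bond.7 vlan tag
```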
Ra001Okay let me try 10:35
noonedeadpunkso this patch doesn't seem to help tls jobs: https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/917589 :(11:29
jrosser_ noonedeadpunk what was your SCENARIO there? i can take a look too11:38
noonedeadpunk./scripts/gate-check-commit.sh aio_metal_tls11:39
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-os_neutron master: Fix OVS/OVN versions for EL  https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/91719212:43
opendevreviewMerged openstack/openstack-ansible-os_neutron stable/2023.2: Add debian package libstrongswan-standard-plugins  https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/91676512:43
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-os_neutron master: Update OVN/OVS versions for EL  https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/91719312:46
nixbuilderWhat is the correct way to have glance use cinder for image storage.  I have followed the instructions and samples but it still does not work. Is that still supported under Bobcat?12:53
nixbuilderhttps://paste.openstack.org/show/bzipGD83lhGX15Gc1tBB/12:55
noonedeadpunknixbuilder: I think this should have worked12:56
noonedeadpunkwe kinda have a job for that as well: https://zuul.opendev.org/t/openstack/build/50d01769c7c94479a423be18db9e39b812:57
noonedeadpunkthough question is - what driver do you use for cinder?12:57
noonedeadpunkas in case of iscsi there might be connection questions between glance and cinder12:58
noonedeadpunkas this job works only because cinder_volume and glance are on the same host with LVM_iSCSI driver12:59
nixbuildernoonedeadpunk: This is what I use... https://paste.openstack.org/show/bS3q6pjTvHaL95WMRVqM/.  Creating/deleting volumes works fine.  Just not creating images... I keep getting '500 Internal Server Error' when uploading images.13:02
noonedeadpunkyou get 500 from glance? anything there in logs?13:03
nixbuildernoonedeadpunk: Here are the logs... I don't see a whole lot there to give me a clue... probably missing something https://paste.openstack.org/show/bQWqXsH6rMeMtZwVOHVk/13:07
nixbuildernoonedeadpunk: The SAN logs shows the volume being created (volume-ef88e185-b512-4f76-a0bd-ab77148c292c) but then gets deleted five seconds after it gets created :-(13:10
noonedeadpunknixbuilder: so, this record should have some stack trace in glance logs13:11
noonedeadpunkApr 30 07:52:45 infra01 haproxy[178643]: 172.29.236.100:43892 [30/Apr/2024:07:52:39.941] glance_api-front-2 glance_api-back/infra01 0/0/0/5533/5533 50013:11
jrosser_noonedeadpunk: so i also reproduced the pip ssl error13:16
jrosser_but i then do things like re-run utility-install a couple of times, wait a bit whilst i read some docs etc 13:17
jrosser_and then it just worked /o\13:17
noonedeadpunkhuh13:18
jrosser_and now if i delete ansible facts, remove the utility venv, delete .pip/cache, delete /var/www/repo13:19
jrosser_it still works ok, so there is some odd first-time issue13:19
noonedeadpunkwell, it's aligned with idea of connection persistence13:25
nixbuildernoonedeadpunk: After thinking about it some I decided to look at the glance rootwrap.conf file and lo and behold... the path problem reared its ugly head again...https://paste.openstack.org/show/bWmYAZ9j7Oen0wLPJiNE/13:32
noonedeadpunknixbuilder: oh, huh, that's surprising13:34
nixbuildernoonedeadpunk: However if I don't try to use the SAN and let it store the image in a file, then it works just fine.13:35
jrosser_i wonder if this actually is in the 28.x release https://review.opendev.org/c/openstack/openstack-ansible-os_glance/+/90156213:36
jrosser_looks like it is13:37
nixbuilderjrosser_: Yep... that looks right... I'll apply that patch before re-installing and see if that fixes it.13:40
jrosser_nixbuilder: well - i think you should have it already unless i've missed something?13:41
jrosser_from your paste you look to be running openstack-ansible tag 28.2.013:42
noonedeadpunkit should be from 28.0.0 I guess?13:44
nixbuilderjrosser_: Yes... 28.2.0. But I guess I need to set that 'glance_rootwrap_conf_overrides' from now on.13:45
jrosser_i don't think so https://review.opendev.org/c/openstack/openstack-ansible-os_glance/+/901562/1/vars/main.yml13:45
jrosser_it is combined with the defaults from _glance_rootwrap_conf_overrides i think13:46
noonedeadpunkyeah, but this one was cut with stable/2023.2? https://review.opendev.org/c/openstack/openstack-ansible-os_glance/+/90093013:47
jrosser_right, so i think this was included by default in all 2023.2 releases13:48
jrosser_i'm confused about why the file is not correct in that case13:48
jrosser_nixbuilder: does this make sense? we think that patch should already be applied if you are running 28.2.013:50
jrosser_so that could be, for example, changing the point at which you have checked out the openstack-ansible repo and forgetting to run `bootstrap-ansible.sh` to get the correct versions of the roles in place?13:51
nixbuilderjrosser_: I did run `bootstrap-ansible.sh`  and in fact the 'glance_rootwrap_conf_overrides' is in the 'roles' directory... so the patch is there but it didn't take for some reason. 13:53
jrosser_it would be really helpful if you could try to see why13:54
nixbuilderjrosser_: I will try... shortly :-D13:55
noonedeadpunkyeah, that is really surprising14:00
noonedeadpunk#startmeeting openstack_ansible_meeting15:00
opendevmeetMeeting started Tue Apr 30 15:00:38 2024 UTC and is due to finish in 60 minutes.  The chair is noonedeadpunk. Information about MeetBot at http://wiki.debian.org/MeetBot.15:00
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.15:00
opendevmeetThe meeting name has been set to 'openstack_ansible_meeting'15:00
noonedeadpunk#topic rollcall15:00
noonedeadpunko/15:00
jrosser_o/ hello15:00
noonedeadpunk#topic office hours15:02
noonedeadpunkSo, except weird current issues with tls job, one thing I wanted to discuss, if we should backport https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/91719215:03
noonedeadpunkas there're mixed feelings about it15:04
noonedeadpunkI totally don't like that OVN/OVS will get upgraded on its own15:04
noonedeadpunkbut then it won't help on current versions15:05
jrosser_did we get any info about why these things were upgraded on a stable branch?15:05
NeilHanlonhiya15:05
noonedeadpunkand in case of minor upgrade - it might be even worse15:05
noonedeadpunkI talked to rdo folks today, and kinda boiled down - neutron core recommended to use latter version of it15:05
noonedeadpunkthere're some comments in the patch above15:05
noonedeadpunkalong with RH ticket 15:06
NeilHanlonthanks for the link.. will look into this a bit more, too15:07
NeilHanloni do think it makes sense to pin versions to not... surprise people15:07
noonedeadpunkyeah, true. though if we should backport - is tricky I think15:07
NeilHanlonyeah.. i think it is hard to say without knowing how many people are using OSA and targeting RHEL or Rocky15:08
NeilHanlon_and_ using distro path15:08
noonedeadpunkNeilHanlon: btw, maybe you know... If we install specific version - I assume it means that it won't be auto-upgraded to the newer one?15:08
NeilHanlonnot on its own, you'd have to also add excludepkgs 15:08
noonedeadpunkand it's regardless distro path15:08
NeilHanlonoh, right.. we install this on source path too.. nvm15:09
jrosser_yes like UCA, some things always come from RDO15:09
jrosser_even for source installs15:09
* NeilHanlon forgot about that particular nuance15:09
noonedeadpunkso I was suggested to use `excludepkgs=*rdo-openvswitch-3.[!2]*,rdo-ovn*-23.0[!9]*,rdo-ovn*-2[!3]*`15:09
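Dropped into a repo definition, that suggestion would look something like this (the repo id and baseurl here are illustrative; the real delorean.repo content differs, and dnf accepts a comma-separated glob list for excludepkgs):

```ini
[delorean]
name=delorean trunk (illustrative)
baseurl=https://trunk.rdoproject.org/centos9-bobcat/current/
enabled=1
gpgcheck=0
excludepkgs=*rdo-openvswitch-3.[!2]*,rdo-ovn*-23.0[!9]*,rdo-ovn*-2[!3]*
```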
* jrosser_ parses15:10
noonedeadpunkbut then we can potentially do it only for trunk install.. which is the only working one though :D15:10
NeilHanlon:D 15:10
NeilHanlonyeah i think that looks okay... though tbh I didn't know until just now that excludepkgs supported that format15:10
noonedeadpunkI never tried it tbh15:11
noonedeadpunkBut also I somehow thought that it shouldn't be needed...15:11
NeilHanlonthere is a 'versionlock' plugin which can be used, but by default, dnf/yum will update to the latest version contained in the repository15:11
noonedeadpunkas I'd assume that ovn23.09 can not or should not be upgraded to ovn24.0315:11
noonedeadpunkas these should be kinda conflicting ones...15:12
NeilHanlonlet me check how they're configured..15:12
NeilHanlonthat's a good point though, they are named differently, not just versioned differently15:13
noonedeadpunkyeah15:13
noonedeadpunkthough rdo-ovn-central is named same...15:13
NeilHanloni think it's the rdo-ovn-XXX stuff that does the upgrading15:13
noonedeadpunkSo I'm very confused, but also got rusty with EL 15:14
noonedeadpunkyeah15:14
noonedeadpunkok, yes, DNF update does replace things... 15:14
NeilHanlonhttps://git.centos.org/rpms/rdo-openvswitch/blob/c9s-sig-cloud-openstack-caracal/f/SPECS/rdo-openvswitch.spec15:14
noonedeadpunkhttps://paste.openstack.org/show/b9hX0edVOSAnxF5LMgxm/15:15
NeilHanlonare we targeting D now, or C?15:15
noonedeadpunkProvides:   ovn15:15
noonedeadpunkC15:15
noonedeadpunkor well. B even15:15
noonedeadpunkwe haven't merged B->C switch yet15:15
NeilHanlonyeah it looks like when they do a new version, they obsolete the old ones15:16
noonedeadpunkok, so in fact that version pinning doesn't make too much sense as they're upgraded right away15:16
NeilHanlonbut i don't know why we'd be getting 24.3 now, tbh..15:16
noonedeadpunkthey backported it to all stable branches as well15:16
noonedeadpunkafaik15:17
NeilHanlonright but I don't see ovn24.03 in their rdo-openvswitch package until D15:17
NeilHanlonhttps://git.centos.org/rpms/rdo-openvswitch/blob/c9s-sig-cloud-openstack-dalmatian/f/SPECS/rdo-openvswitch.spec#_915:17
noonedeadpunkah, it's not released yet15:17
noonedeadpunkbut it's already present in trunk15:17
* NeilHanlon confused15:18
noonedeadpunkwe don't install it through cloudsig15:18
NeilHanlonah15:18
NeilHanloni forgot how this RDO/cloud sig stuff works15:18
NeilHanlonbut i am remembering now15:18
noonedeadpunkso we just fetch https://trunk.rdoproject.org/centos9-bobcat/current/delorean.repo15:20
noonedeadpunkand place under yum.repos.d15:20
noonedeadpunkwhich is pre-cloudsig release kinda15:21
noonedeadpunkwould make sense to do cloudsig package install ofc on production, BUT, they do also ship tons of dependencies15:21
noonedeadpunklike rabbitmq, ceph, messaging repos, etc15:21
noonedeadpunkso really smth we'd prefer not to have15:22
noonedeadpunkso rdo-ovn replacing ovn23.09.x86_64 really neglects any pinning attempt kinda...15:23
noonedeadpunkNeilHanlon: what would you suggest?:)15:23
NeilHanlonvodka15:23
noonedeadpunkhahaha15:23
NeilHanlonI will follow up with amorlej and see what we can come up with :D 15:24
noonedeadpunkok, awesome, thanks for that15:24
* NeilHanlon bows15:24
noonedeadpunkas in fact I'm a bit lost in options as it all looks a bit confusing15:24
NeilHanlonyeah, some of this requires a Doctorate in RPMology15:25
noonedeadpunkas a matter of fact I know for sure that such an upgrade between releases being executed while the user is unaware - may lead to all kinds of weird things and downtimes15:25
noonedeadpunklike in APT I would do just pin to version and forget about that I guess...15:25
noonedeadpunkthough UCA close to never does like this...15:26
noonedeadpunkok15:28
noonedeadpunkregarding TLS failure - so far it's very weird. Though reproducible, which is nice15:29
jrosser_do we have an LXC equivalent job?15:31
noonedeadpunkno, I don't think so15:31
jrosser_hmm i might try that too15:31
jrosser_so far i don't find anything15:31
noonedeadpunkif I was to guess - I think it would work15:31
jrosser_it is a shame that the meta task does not seem to print any useful output15:32
jrosser_not OK / Changed / Skipped, just nothing15:32
noonedeadpunkas we do restart container 15:32
jrosser_which does make me wonder if it actually does anything15:32
noonedeadpunkI kinda wonder if we should instead pass that somehow in python_venv_build15:35
jrosser_the CA bundle location?15:35
noonedeadpunkyeah15:36
jrosser_i was looking at the pip module source and it uses module_utils.run_command15:36
noonedeadpunkas we do same for uwsgi15:36
jrosser_which kind of a long way from being command/shell module tbh15:36
noonedeadpunkhttps://opendev.org/openstack/ansible-role-uwsgi/src/branch/master/vars/debian.yml#L3615:36
noonedeadpunkbut then I did logout from VM - login again and it worked15:37
noonedeadpunk(as long as env was reloaded)15:37
jrosser_i am just trying a second run of gate-check-commit in the same shell again15:37
jrosser_but ~30mins between tries15:38
noonedeadpunkare env cached in facts?15:40
jrosser_they are15:41
jrosser_we could perhaps use ansible_env to set environment: on the task15:41
jrosser_which is pretty wierd construct tbh15:42
noonedeadpunkyeah, and have risk of overriding thing15:42
jrosser_huh now it worked on my second run of gate-check-commit15:43
jrosser_no logout15:43
noonedeadpunkthough I was thinking that instead of reset_connection the correct thing would be clear_facts15:43
jrosser_it would have been much longer than the controlpersist timeout there15:44
jrosser_between first and second run15:44
jrosser_this looks massively out of date https://opendev.org/openstack/openstack-ansible/src/branch/master/tests/roles/bootstrap-host/templates/user_variables.aio.yml.j2#L28315:51
noonedeadpunkI found this today: https://opendev.org/openstack/openstack-ansible-openstack_hosts/src/branch/master/vars/redhat-9.yml#L7315:52
noonedeadpunkbtw, I will be on OpenInfra Days in Sweden next week on Tuesday15:53
noonedeadpunkwhich means I won't be around on Monday and Wednesday either...15:53
NeilHanlonlol at openstack queens15:54
NeilHanloni can run next week, if we still want to meet15:54
noonedeadpunkDamian would be also away just in case...15:54
noonedeadpunkso potentially makes sense to skip, despite we're coming very close to the release due date15:55
noonedeadpunkWhich is Jun 0315:55
noonedeadpunkbasically in a month...15:56
jrosser_hmm ok so we need to make sure we prioritise 15:58
jrosser_and just for more /o\ i think there will be zed moved to unmaintained15:58
jrosser_still need to finish victoria which is close to working again15:59
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible master: Add Tempest test for OVN Octavia driver  https://review.opendev.org/c/openstack/openstack-ansible/+/91687216:00
noonedeadpunkwell, I've tried to add ovn provider testing for octavia ^16:00
noonedeadpunkthen capi I think is pretty much ready16:01
jrosser_there is messaging improvement stuff on all roles (?) too16:01
noonedeadpunkfor skyline 1 patch about yarn left: https://review.opendev.org/c/openstack/openstack-ansible-os_skyline/+/91440516:01
noonedeadpunkyeah16:01
noonedeadpunkthis one I aim to finish pushing by end of the week16:02
noonedeadpunkI paused just because gates are borked anyway...16:02
noonedeadpunkand, ovn-bgp-agent is the last thing I guess? 16:03
noonedeadpunkah, and we're out of time16:03
noonedeadpunk#endmeeting16:03
opendevmeetMeeting ended Tue Apr 30 16:03:44 2024 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)16:03
opendevmeetMinutes:        https://meetings.opendev.org/meetings/openstack_ansible_meeting/2024/openstack_ansible_meeting.2024-04-30-15.00.html16:03
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/openstack_ansible_meeting/2024/openstack_ansible_meeting.2024-04-30-15.00.txt16:03
opendevmeetLog:            https://meetings.opendev.org/meetings/openstack_ansible_meeting/2024/openstack_ansible_meeting.2024-04-30-15.00.log.html16:03
NeilHanlonope, apparently my VM decided to die just then 🙃16:15
spatelcan we configure multiple AZ for single cinder-volume ?16:54
noonedeadpunkspatel: good question16:55
noonedeadpunkI'm not sure about that16:55
noonedeadpunkspatel: you can define that per storage driver16:56
spatelI am dealing with a strange issue where my k8s is running in availability zone A and cinder in a different AZ 16:56
spatelI have multiple compute AZ but storage is shared between all of them :(16:56
noonedeadpunkand should it be available cross-az?16:56
noonedeadpunkI'm not sure it's "issue"16:56
noonedeadpunkas then you have cinder cross-attach16:56
noonedeadpunkwhich means that cinder volume can be in any az16:57
spatelTrying to understand where to configure cross-az ? 16:57
noonedeadpunkit's either true or false16:57
spatelwhere should I set that config?16:57
noonedeadpunkhttps://docs.openstack.org/nova/latest/configuration/config.html#cinder.cross_az_attach16:57
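In OSA terms, that nova option could be injected through the standard config override hook, e.g. in user_variables.yml (a sketch; per the linked docs the default for cross_az_attach is True):

```yaml
nova_nova_conf_overrides:
  cinder:
    # Allow (True) or forbid (False) attaching volumes that live in a
    # different AZ than the instance.
    cross_az_attach: True
```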
spatelin cinder.conf?16:58
spatelor magnum ?16:58
noonedeadpunkthat is nova config16:58
noonedeadpunkdon't know about magnum16:58
spatelBy default there is no availability zone restriction on volume attach. 16:59
noonedeadpunkyup, that's right16:59
spatelI have submitted issue here - https://github.com/vexxhost/magnum-cluster-api/issues/36617:00
spatelLook like k8s doesn't like multi-AZ 17:00
spatelthe k8s csi plugin AZ is set based on where the compute AZ is configured 17:01
noonedeadpunkhave no clue about that17:01
spatelk8s CSI plugin need some patch to tell it not to follow worker/compute AZ rule. 17:01
noonedeadpunkthough it seems you got an answer there17:01
spatelBut still we need a patch in magnum / magnum-capi 17:02
noonedeadpunkbut frankly - I'm not sure how to even sort this out17:02
jrosser_it is the config file for CSI plugin that needs patch17:02
jrosser_not the plugin itself17:02
noonedeadpunkas ideally, I guess you'd want to have the volume created in the same az where the vm runs17:02
noonedeadpunkthough I'm absolutely sure csi does not care about it17:02
noonedeadpunkso you will get absolutely random assignment for volume there17:03
spatelIf CSI AZ and Worker AZ is different then you can't attach volume to pod 17:03
noonedeadpunkwhich is slightly different from the behaviour when creating a VM in the first place, where the az will attempt to be respected17:03
spatelYou will get anti-affinity error 17:03
spatelhttps://paste.opendev.org/show/bJg4klKcPC6GmAhp9dvk/17:04
spatelRight now I am manually changing this value - topology.cinder.csi.openstack.org/zone=general 17:05
jrosser_as far as i can see you have the answer in your github issue17:05
spatelAfter a magnum or mcapi patch we can override it with a k8s template or label 17:06
spatelI think it's a hack but not a long term solution 17:06
spatelHow do I tell CSI to change its AZ ? 17:07
jrosser_right, so the routes are 1) you patch this yourself and make a PR (it is open source after all) 2) wait for someone else to do that for you17:07
spatelI am just trying to understand how many components required to patch 17:08
spatelI believe CSI and Magnum both need a patch 17:08
spatelmagnum needs a patch to accept a label (Ex: csi_availability_zone) or something and we need to pass this value to the CSI plugin to set the AZ accordingly in k8s during deployment 17:09
spatelI may be wrong just trying to understand workflow 17:10
nixbuilderOK... so I have re-installed once more from scratch and uploading images to glance still fails... I went in and fixed the rootwrap.conf file, restarted glance-api and images now upload just fine... here is what I have documented https://paste.openstack.org/show/bOvckMH64N5pIykPc12A/17:22
noonedeadpunkthat is very weird17:24
noonedeadpunkI've checked a couple of my latest AIOs and they do have correct rootwrap...17:25
jrosser_nixbuilder: it would be just as valuable to re-run the glance playbook and provide the output here for the relevant tasks, as to reinstall17:26
noonedeadpunkand I do see it in CI as well https://zuul.opendev.org/t/openstack/build/f1d435c814034268b0dd42cbedb6ffc9/log/logs/etc/openstack/aio1-glance-container-b0e673b6/glance/rootwrap.conf.txt#2017:26
jrosser_what we need to understand is where the content that gets into your rootwrap file is coming from, and why it's wrong17:27
noonedeadpunksounds like there might be wrong ordering or smth17:27
noonedeadpunkbut I don't see anything obvious17:28
noonedeadpunk(like adjusting 1 file and then symlinking smth completely different)17:28
jrosser_so for example, dropping in a debug: task immediately before that file is templated, with `_glance_rootwrap_conf_overrides` and `glance_core_files` displayed would be helpful17:29
jrosser_and seeing this task running with -vvv would be interesting https://opendev.org/openstack/openstack-ansible-os_glance/src/branch/master/tasks/glance_post_install.yml#L12617:31
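A sketch of that suggested debug task (the variable names come from the discussion above; placement would be immediately before the linked config_template task):

```yaml
# Illustrative only: dump the data that feeds the rootwrap template so
# a wrong or empty override structure shows up before the file is written.
- name: Show data feeding the rootwrap template
  ansible.builtin.debug:
    msg:
      rootwrap_overrides: "{{ _glance_rootwrap_conf_overrides | default({}) }}"
      core_files: "{{ glance_core_files | default([]) }}"
```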
noonedeadpunkand in fact - yes, /etc/glance is a symlink17:31
noonedeadpunkhttps://opendev.org/openstack/openstack-ansible-os_glance/src/branch/master/tasks/glance_install.yml#L86-L9517:32
noonedeadpunkbut then by the time the rootwrap change is run, it should already be linked properly17:35
noonedeadpunk(although better to re-define glance_etc_dir to be inside venv)17:36
opendevreviewJonathan Rosser proposed openstack/openstack-ansible-ops master: Add CAPI job to OPS repo  https://review.opendev.org/c/openstack/openstack-ansible-ops/+/91664017:42
opendevreviewJonathan Rosser proposed openstack/openstack-ansible-ops master: Test all supported versions of k8s workload cluster with magnum-cluster-api  https://review.opendev.org/c/openstack/openstack-ansible-ops/+/91664917:44
jrosser_this is all really very complicated with the way the service config files are templated17:45
jrosser_i understand that it is some kind of optimisation17:45
jrosser_but it really is unobvious what is happening17:45
noonedeadpunkso the only way I can technically explain that weirdness would be smth being off with symlink to /etc/glance17:49
noonedeadpunkbut then whole glance.conf should be off...17:49
noonedeadpunknixbuilder: what's the output of ls -l /openstack/venvs/glance-*/etc/glance/rootwrap.conf ?17:51
nixbuildernoonedeadpunk: '-rw-r--r-- 1 root root 1079 Apr 30 12:13 /openstack/venvs/glance-28.2.0/etc/glance/rootwrap.conf'17:52
noonedeadpunkhuh17:53
noonedeadpunkI have it as 640 owned by root:glance....17:53
noonedeadpunkand I found broken permissions for /etc/glance/rootwrap.d ...17:55
nixbuildernoonedeadpunk: For some reason... that file '/openstack/venvs/glance-28.2.0/etc/glance/rootwrap.conf' is correct... hmmm... but the operative file '/etc/glance/rootwrap.conf' remains wrong.  When I edit that file, uploading images works.17:56
noonedeadpunkwhat is `stat /etc/glance`?17:57
noonedeadpunkas it should be a symlink `File: /etc/glance -> ../openstack/venvs/glance-28.1.0.dev87/etc/glance`17:57
noonedeadpunkor you're having distro install?17:57
nixbuildernoonedeadpunk: OK... /etc/glance is a symlink to '/openstack/venvs/glance-28.2.0/etc/glance'17:59
nixbuildernoonedeadpunk: No... my install is from source18:00
nixbuildernoonedeadpunk: 'install_method: source'18:00
noonedeadpunkbut /etc/glance/rootwrap.conf and /openstack/venvs/glance-28.2.0/etc/glance/rootwrap.conf are different despite these are symlinks????18:00
nixbuildernoonedeadpunk: No... I did not realize about the symlink... so forget what I said about there being a difference18:02
nixbuildernoonedeadpunk: I also re-ran the glance install to capture the output for you and no changes were made.  I ran it again with '-t glance-config' and still no changes.  That's why I re-install from scratch, because sometimes re-running playbooks does not change things.18:04
jrosser_ansible is idempotent, so if there is nothing to change, it won't18:09
jrosser_i am still not really understanding if the file on the disk is right, or wrong now18:09
nixbuilderjrosser: Since I manually changed the file, it is correct.18:12
jrosser_right - so we still need to understand why it was coming out wrong in the first place18:13
noonedeadpunkand why we don't see that in CI18:13
jrosser_nixbuilder: currently you have the reproducer for whatever is going on here :)18:13
nixbuilderjrosser_: Stand by... running os-glance-install18:16
nixbuilderjrosser_: So I just removed my edit to rootwrap.conf and set the file back to what it was after the initial install, then ran 'openstack-ansible os-glance-install.yml -t glance-config -vvv | tee ~/logs/glance-debug.log' expecting that the playbook would pickup my 'glance_rootwrap_conf_overrides' and it did not.18:23
nixbuilderjrosser_: Trying to figure a way to get that file into paste18:24
nixbuilderjrosser_: I mean the log file I generated18:25
noonedeadpunkyou can try dropping `/openstack/venvs/glance-28.1.0.dev87/etc/glance` and then run the role with `-e venv_rebuild=true`18:25
nixbuildernoonedeadpunk: OK... trying that now18:29
jrosser_just to be sure you’ve not defined glance_rootwrap_conf_overrides, your comment suggests you have set some variable for that18:29
jrosser_the default for that var is empty {}18:30
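For context, an override for that variable would be a dict merged over the rendered file. In user_variables.yml it would look something like this (hypothetical values, shown only to illustrate the mechanism):

```yaml
# Hypothetical example only -- rootwrap.conf keeps exec_dirs in its
# [DEFAULT] section, so an override would target that section.
glance_rootwrap_conf_overrides:
  DEFAULT:
    exec_dirs: "/openstack/venvs/glance-28.2.0/bin,/sbin,/usr/sbin,/bin,/usr/bin"
```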
nixbuilderjrosser_: It's empty... re-running now18:32
jrosser_ok no worries, we just have to be very precise here or there will be confusion18:32
noonedeadpunkhow in the world can you reproduce that....18:33
noonedeadpunkmaybe I indeed should spawn 28.2.0 explicitly...18:33
noonedeadpunkseems we need to spread https://opendev.org/openstack/openstack-ansible-os_cinder/commit/f5ba9b5b9a88ab09f3d6afded011976525c863a3 18:36
nixbuildernoonedeadpunk: Which file/task/role sets the contents of the rootwrap.conf?18:41
nixbuildernoonedeadpunk: I am trying to help narrow down where this is failing :-D18:41
nixbuildernoonedeadpunk: BTW... running 'openstack-ansible -e venv_rebuild=true os-glance-install.yml | tee ~/logs/glance-debug.log' fails to set the proper exec_dir path.18:42
noonedeadpunkthis one: https://opendev.org/openstack/openstack-ansible-os_glance/src/branch/master/tasks/glance_post_install.yml#L126-L13818:43
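Simplified, the pattern in the linked task is a config_template loop driven by glance_core_files (a rough sketch of the shape only, not the role's exact code — see the link above for the real task):

```yaml
# Simplified sketch: each entry in glance_core_files names a source
# template, a destination and optional overrides. If rootwrap.conf is
# missing from the list, it is simply never templated at all.
- name: Copy common config
  config_template:
    src: "{{ item.src }}"
    dest: "{{ item.dest }}"
    owner: "root"
    group: "{{ glance_system_group_name | default('glance') }}"
    config_overrides: "{{ item.config_overrides | default({}) }}"
    config_type: "{{ item.config_type | default('ini') }}"
  with_items: "{{ glance_core_files }}"
```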
nixbuildernoonedeadpunk: OK... thanks.18:45
jrosser_nixbuilder: secret weapon is codesearch.opendev.org18:50
jrosser_put the variable name into that and you will be taken almost straight to where it is defined / used18:50
nixbuildernoonedeadpunk: Ok... so here is the relevant output from my latest install of os-glance (https://paste.openstack.org/show/bkIbH18PlFemDCwHQdN3/) I hope this helps narrow this down.18:51
noonedeadpunkum18:52
noonedeadpunkrootwrap is not there...18:52
noonedeadpunkit feels really that https://opendev.org/openstack/openstack-ansible-os_glance/commit/c2428ab8da9cc3868b5ae86140a63e4a33e28eca is not there18:53
nixbuildernoonedeadpunk: This was my command line... 'openstack-ansible -e venv_rebuild=true os-glance-install.yml -vvvv | tee ~/logs/glance-debug.log' and then I searched for the 'Copy common config' 18:54
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-os_glance master: Set correct permissions for rootwrap.d  https://review.opendev.org/c/openstack/openstack-ansible-os_glance/+/91778518:55
noonedeadpunknixbuilder: task is correct. but it doesn't attempt to manage rootwrap config18:55
noonedeadpunklike either the os_glance role is older than expected18:56
noonedeadpunkor glance_core_files is somehow overridden18:56
noonedeadpunknixbuilder: can you check that on deploy host /etc/ansible/roles/os_glance/vars/main.yml contains rootwrap.conf as part of glance_core_files?18:56
jrosser_^ also in that directory please do `git log` and let us know what the hash of the latest commit is18:57
nixbuilderjrosser_: noonedeadpunk: Answers to both here https://paste.openstack.org/show/b3RExIQPkbqC7ELihhs2/19:03
noonedeadpunkI really don't get what's going on19:05
noonedeadpunksha apparently is correct19:05
noonedeadpunknixbuilder: but do you have this specific block? https://opendev.org/openstack/openstack-ansible-os_glance/src/commit/d6f5f69bf9efee901a6d0c2b6e700bc2ad782dbe/vars/main.yml#L91-L9719:06
nixbuildernoonedeadpunk:  Sorry... did not get all the paste the first time :) https://paste.openstack.org/show/bG3puj6EpQ26WAEK23rK/19:06
noonedeadpunknixbuilder: ok, I have an idea :D you're doing all of that on the same host, right?19:06
nixbuildernoonedeadpunk: Yes... I started out with AIO and then gradually worked in my changes to fit our environment.19:07
noonedeadpunkso, do you apparently have /root/.ansible/roles folder?19:08
nixbuildernoonedeadpunk: No19:09
noonedeadpunkugh19:09
nixbuildernoonedeadpunk:  The /root/.ansible folder exists... but no roles19:10
noonedeadpunkyeah, and I just looked at your previous paste and `task path: /etc/ansible/roles/os_glance/tasks/glance_post_install.yml:111`19:11
noonedeadpunkso I hoped that maybe you have os_glance coming from some different path19:11
noonedeadpunkas this really doesn't add up for me19:11
noonedeadpunkso, you have rootwrap.conf in glance_core_files, but then when you iterate over them here https://paste.openstack.org/show/bkIbH18PlFemDCwHQdN3/ - it's not there19:12
noonedeadpunk /o\19:12
jrosser_we never did print out glance_core_files though?19:12
jrosser_as a debug19:12
nixbuildernoonedeadpunk: https://paste.openstack.org/show/bc0IG9hfQSV4NKKAHRJg/19:13
jrosser_nixbuilder: can you try something like this as the first tasks in /etc/ansible/roles/os_glance/tasks/main.yml https://paste.opendev.org/show/bAGTxq1DXLSYWi0SGWAM/19:16
nixbuilderjrosser_: https://paste.openstack.org/show/bDc2GEX15jalvY9KPgoX/19:22
jrosser_hmm sorry, you might need to adjust the point at which those tasks run to get all the vars in scope19:23
jrosser_but even so, do you see that there are references to rootwrap there?19:24
nixbuilderjrosser_: Second time... https://paste.openstack.org/show/b9A7uAQRL1dWhAP2rKwr/19:28
noonedeadpunkhow is that even possible19:28
jrosser_errrr wtf19:28
noonedeadpunknixbuilder: don't you have some override for glance_core_files ?:)19:28
noonedeadpunkoh19:29
jrosser_what OS is this19:29
noonedeadpunkwait a second19:29
noonedeadpunkIs it Rocky?19:29
noonedeadpunkhttps://opendev.org/openstack/openstack-ansible-os_glance/src/branch/master/vars/redhat.yml#L36-L4119:29
* jrosser_ thinking the same :)19:29
noonedeadpunkwhy in the world do we have that /o\19:29
nixbuilderYes... Rocky19:30
noonedeadpunkok, that was useful19:31
noonedeadpunkas that's totally a bug19:31
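The shape of the bug is ordinary Ansible variable shadowing: a platform-specific vars file that redefines a list replaces it wholesale instead of extending it. Schematically (a reconstruction to illustrate the mechanism, not the actual file contents — see the redhat.yml link above):

```yaml
# vars/main.yml -- intended default list, including rootwrap.conf
glance_core_files:
  - src: glance-api.conf.j2
    dest: /etc/glance/glance-api.conf
  - src: rootwrap.conf.j2
    dest: /etc/glance/rootwrap.conf

# vars/redhat.yml -- loaded only on EL systems; it silently replaces
# the whole list above, so rootwrap.conf drops out on Rocky/RHEL
# while CI jobs on other distros keep passing.
glance_core_files:
  - src: glance-api.conf.j2
    dest: /etc/glance/glance-api.conf
```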
nixbuildernoonedeadpunk: No overrides for 'glance_core_files'19:32
noonedeadpunkyeah, I found it19:34
jrosser_good job we put that debug at the top of the file :)19:35
nixbuildernoonedeadpunk: I'm glad you found it... I was beginning to think I was crazy =-O19:36
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-os_glance master: Fix rootwrap.conf distribution for EL  https://review.opendev.org/c/openstack/openstack-ansible-os_glance/+/91778719:37
noonedeadpunknixbuilder: this should be fixing it ^19:37
noonedeadpunkI had the same feeling frankly speaking...19:37
noonedeadpunkthanks a lot for your time and help narrowing this down19:38
nixbuildernoonedeadpunk:jrosser_: No problem... and thank you guys for taking the time to help find this.19:39
nixbuilderSuccess... after applying the patches, glance installed normally https://paste.openstack.org/show/bIpTJUq03NqecSqi5vet/20:12

Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!