Wednesday, 2024-08-07

opendevreviewIvan Vnučko proposed openstack/kolla-ansible master: Add backend TLS encryption of MariaDB replication and SST traffic  https://review.opendev.org/c/openstack/kolla-ansible/+/92531705:07
Core4244tafkamax Could you maybe help me with your keycloak settings in general pls05:51
Core4244I'm lost at this point trying to get it to work05:51
Core4244okay I fixed it. The problem was my keycloak was also using self signed certificate and keystone failed to get info from keycloak due to ssl errors.07:14
Core4244I had to disable verification in wsgi, for testing07:15
*** jhorstmann is now known as Guest239107:25
tafkamaxGood job!07:34
opendevreviewRoman Krček proposed openstack/kolla-ansible master: Put memcache_security_strategy in single place.  https://review.opendev.org/c/openstack/kolla-ansible/+/92544407:54
stromgrenkevko: To continue my question from yesterday regarding libvirt secrets missing on one host. I've done some more research and can see that the files are present in /etc/libvirt/secrets and that they're copied in during the start of the container. I can set them manually, but it's not persisted through reboots. On both the working and non-working machines there are two secrets with different 07:57
stromgrenuuid for "ceph client.cinder". On the working host the correct one is shown when running "virsh secret-list" but on the non-working one it shows the incorrect one. So it seems like there's an old secret lurking that needs to be removed.07:57
opendevreviewRoman Krček proposed openstack/kolla-ansible master: Put memcache_security_strategy in single place at all.yml  https://review.opendev.org/c/openstack/kolla-ansible/+/92544407:57
opendevreviewMichal Nasiadka proposed openstack/kolla master: bifrost: Fix ansible and deps installation  https://review.opendev.org/c/openstack/kolla/+/92424608:13
opendevreviewMichal Nasiadka proposed openstack/kolla master: WIP: Switch to Ubuntu 24.04  https://review.opendev.org/c/openstack/kolla/+/90758908:14
mnasiadkahmm, CI seems to be red09:18
opendevreviewIvan Halomi proposed openstack/kolla-ansible master: Fix octavia test fail if a port is missing  https://review.opendev.org/c/openstack/kolla-ansible/+/92585209:18
mnasiadkahttps://658019d6165ae4c8eab6-597ff148d0ea9164d11e7cb764cf9b04.ssl.cf1.rackcdn.com/924246/31/check/kolla-ansible-rocky9/c1f8070/primary/logs/ansible/init-runonce09:18
mnasiadkafrickler: seen that?09:18
fricklermnasiadka: nope, that seems new, at least some things got merged yesterday09:26
SvenKieskemnasiadka: maybe just a temporary infra failure? I don't recall any invasive changes in that area09:35
mnasiadkasounds like openstackclient/sdk failure09:37
opendevreviewPierre Riteau proposed openstack/kolla-ansible master: CI: Fix variable name for Nova noVNC FQDN  https://review.opendev.org/c/openstack/kolla-ansible/+/92585409:37
SvenKieskemhm, afaik frickler reported a critical regression in osc, but I'm not sure it was in this area?09:39
SvenKieskeno, that was in nova: https://bugs.launchpad.net/python-openstackclient/+bug/207621209:40
opendevreviewIvan Halomi proposed openstack/kolla-ansible master: Fix octavia test fail if a port is missing  https://review.opendev.org/c/openstack/kolla-ansible/+/92585209:49
fricklerkolla should not be using 7.0.0 yet, or does it? in the build log I found 6.6.0 still09:50
mnasiadkawe're not limiting that to anything09:52
mnasiadkawe just install what is latest in pypi - https://opendev.org/openstack/kolla-ansible/src/branch/master/roles/openstack-clients/defaults/main.yml09:53
fricklerhmm, ok, so if it is reproducible, that might be yet another osc regression then09:58
mnasiadkalet's see if pinning helps10:02
opendevreviewRafal Lewandowski proposed openstack/kolla-ansible master: [WIP] Enable ML2/OVN and distributed FIP by default.  https://review.opendev.org/c/openstack/kolla-ansible/+/90495910:04
opendevreviewMichal Nasiadka proposed openstack/kolla-ansible master: CI: Pin openstackclient to <7  https://review.opendev.org/c/openstack/kolla-ansible/+/92585710:04
mnasiadkafrickler: seems quota was migrated to SDK - https://opendev.org/openstack/python-openstackclient/commit/70fbf687cf10e67d1e05120b15533725f106a1f610:06
fricklermnasiadka: yes, lots of things were, but that doesn't explain why it is failing this way10:08
mnasiadkatrue, just stating the obvious :)10:10
fricklerhmm, do we actually not deploy cinder in that scenario? it is indeed missing in the catalog in those OSC cmds10:20
mnasiadkawe only deploy cinder in multinode scenarios10:47
mnasiadkaseems pinning helps10:47
frickleryes, the regression is pretty obvious from the commit you linked10:48
fricklerfwiw I'll be offline soon for a couple of hours, will read the list of review reqs after the meeting ;)10:49
fricklerfiled https://bugs.launchpad.net/python-openstackclient/+bug/207622910:56
opendevreviewMichal Nasiadka proposed openstack/kolla-ansible master: CI: Pin openstackclient to <7  https://review.opendev.org/c/openstack/kolla-ansible/+/92585711:06
opendevreviewMichal Nasiadka proposed openstack/kolla-ansible master: CI: Fix cases where port is not available yet  https://review.opendev.org/c/openstack/kolla-ansible/+/92450611:14
opendevreviewMichal Nasiadka proposed openstack/kolla-ansible master: Bump Ansible versions to 2.16 and 2.17  https://review.opendev.org/c/openstack/kolla-ansible/+/92174311:14
opendevreviewMichal Nasiadka proposed openstack/kolla-ansible master: CI: Fix cases where port is not available yet  https://review.opendev.org/c/openstack/kolla-ansible/+/92450611:37
opendevreviewMichal Nasiadka proposed openstack/kolla-ansible master: CI: Fix cases where port is not available yet  https://review.opendev.org/c/openstack/kolla-ansible/+/92450611:38
opendevreviewMichal Nasiadka proposed openstack/kolla-ansible master: CI: Fix cases where port is not available yet  https://review.opendev.org/c/openstack/kolla-ansible/+/92450611:38
opendevreviewMichal Nasiadka proposed openstack/kolla-ansible master: Bump Ansible versions to 2.16 and 2.17  https://review.opendev.org/c/openstack/kolla-ansible/+/92174311:38
opendevreviewMichal Nasiadka proposed openstack/kolla master: WIP: Switch to Ubuntu 24.04  https://review.opendev.org/c/openstack/kolla/+/90758911:43
SvenKieskemnasiadka: https://review.opendev.org/c/openstack/kolla-ansible/+/925852 and https://review.opendev.org/c/openstack/kolla-ansible/+/924506 are quite similar, can we maybe merge (ha!) these together?11:51
mnasiadkawell, doesn't the second one contain more or less the first one? ;-)11:52
SvenKieskeyes, I wasn't aware there's already a fix in the making when writing the bug report11:53
opendevreviewMichal Nasiadka proposed openstack/kolla-ansible master: CI: Fix cases where port is not available yet  https://review.opendev.org/c/openstack/kolla-ansible/+/92450611:56
opendevreviewMichal Nasiadka proposed openstack/kolla-ansible master: Bump Ansible versions to 2.16 and 2.17  https://review.opendev.org/c/openstack/kolla-ansible/+/92174311:56
opendevreviewMichal Nasiadka proposed openstack/kolla-ansible master: CI: Fix cases where port is not available yet  https://review.opendev.org/c/openstack/kolla-ansible/+/92450612:07
opendevreviewMichal Nasiadka proposed openstack/kolla-ansible master: CI: Fix cases where port is not available yet  https://review.opendev.org/c/openstack/kolla-ansible/+/92450612:07
opendevreviewMichal Nasiadka proposed openstack/kolla-ansible master: Bump Ansible versions to 2.16 and 2.17  https://review.opendev.org/c/openstack/kolla-ansible/+/92174312:08
stromgrenkevko: I have now fixed the issue with the lurking libvirt secret. Removed the incorrect files from /etc/kolla/nova-libvirt/secrets, restarted the libvirt container and set the secret manually. It now persists between reboots. The old uuid is from a test-deploy that I did. Leaving this here for completeness if anyone else encounters the same issue.12:14
SvenKieskestromgren: nice you found a solution. if you really want to archive this I suggest to write a blogpost or send it to the mailing list, afaik the IRC logs are not indexed by search engines so nobody will find the solution in here :)12:26
SvenKieskestromgren: what we can learn from this: always purge test deployments completely before redeployment :)12:26
stromgrenSvenKieske: I'll write something up and send it to the mailing list. The issue was very strange because the multiple secret files existed on all hosts, but only one of them read the incorrect one. At least I'm a little more versed in how the wiring behind the scenes works now :)12:30
SvenKieskestromgren: that's very kind of you for people also stumbling over this, much appreciated! :)12:30
SvenKieskeand yes, debugging stuff always leads to a huge increase in knowledge imho :D12:31
MikeCTZAdoing debugging myself now after a kolla-ansible / openstack upgrade failed (we had a network disconnect) and then we tried to sort it after, so finding all sorts of weirdnesses, and then we made a few mistakes after as well ... so having some "fun" at the moment12:32
SvenKieskeouch, network outage during upgrades is for sure one of the worst cases of bad luck you can have12:33
stromgrenSounds like you have your work cut out for you then. Always eye opening (and a bit sobering) to find your own mistakes..12:35
SvenKieskeif you have (huge) issues during an upgrade/deployment my first advice would be to roll back, if you can. and then analyze. that assumes you can roll back, of course.12:36
SvenKieskeif you have some experience you can start fixing problems during upgrades and continue upgrading, but that is an operation that must be done very carefully imho.12:37
MikeCTZAyes indeed, it's not massive things that aren't working but small annoying things12:38
SvenKieskeI always like to point to this piece of wisdom when it comes to any software deployments, it has a lot of good rules what to do and what not to do, not openstack specific: https://lwn.net/Articles/562333/12:38
MikeCTZAeg keystone tokens, which could be the cause of horizon and a few admin CLI commands not working, we think everything has upgraded fine now12:38
SvenKieskemhm, you mean fernet tokens, I guess? might be worth to check the fernet containers. the rotation runs via cron jobs inside the container and is a little brittle, afaik.12:39
SvenKieskeafaik the cron job still doesn't log anywhere useful, so not to docker logs, or fluentd or anything, might be misremembering though12:40
MikeCTZAthe tokens are the same on all 3 nodes. on 2 we get errors on 1 not eg "exception.TokenNotFound(e) keystone.exception.TokenNotFound: Could not recognize Fernet token" ... so trying to figure it all out12:40
SvenKieskeMikeCTZA: on which release are you?12:42
MikeCTZAwe were on Yoga, which was the reason we moved fwd to Zed and then we had issues and mistakes were made on my side ... so now we are on Antelope/2023.112:43
SvenKieskeif you want to understand the low level details of fernet key rotation this bug: https://bugs.launchpad.net/kolla-ansible/+bug/1809469 and this fix have quite some details: https://opendev.org/openstack/kolla-ansible/commit/6c1442c385450004dd253f3f464fe4336194be9912:43
SvenKieskebut those were way before your release, so I'm fairly certain you have no bug in your k-a with regards to fernet12:44
MikeCTZAthanks I'll give them a read they may have something that helps12:44
SvenKieskeit might be that the rotation during network split or upgrade left some clients with invalid tokens12:44
SvenKieskeyou might be able to just trigger key rotation manually (there's a cronjob for that in the fernet container)12:44
SvenKieskebut be careful :)12:45
MikeCTZAwe actually trashed all the tokens and started them fresh and let kolla-ansible recreate them and it didn't change things, we put the old ones back after backing them up, so it's confusing us ... we don't think we quite understand what the mechanics are 12:48
SvenKieskemhm, that's interesting12:48
MikeCTZAhmmm I see my keystone_fernet container is now unhealthy12:49
opendevreviewVerification of a change to openstack/kayobe master failed: Add support for customising Neutron physical network names  https://review.opendev.org/c/openstack/kayobe/+/92233512:50
SvenKieskedid you fix the underlying network issue? many issues in openstack are actually side effects of the whole system being a distributed system. if the network doesn't work reliably all kinds of stuff pops up with problems.12:51
MikeCTZAthere was a switch PSU issue so that was sorted pretty quickly12:52
mnasiadkamgoddard mnasiadka bbezak frickler kevko SvenKieske mmalchuk gkoper jangutter jsuazo jovial osmanlicilegi mattcrees dougszu darmach - meeting in 5 minutes12:54
SvenKieske\o/12:55
MikeCTZAend of my work day here ... will get back to this tmrw12:55
mnasiadka#startmeeting kolla13:00
opendevmeetMeeting started Wed Aug  7 13:00:00 2024 UTC and is due to finish in 60 minutes.  The chair is mnasiadka. Information about MeetBot at http://wiki.debian.org/MeetBot.13:00
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.13:00
opendevmeetThe meeting name has been set to 'kolla'13:00
mnasiadka#topic rollcall13:00
mnasiadkao/13:00
mmalchuko/13:00
mhinero/13:00
SvenKieskeo/13:00
darmacho/13:01
mattcreeso/13:01
mnasiadka#topic agenda13:02
mnasiadka* CI status13:02
mnasiadka* Release tasks13:02
mnasiadka* Regular stable releases (first meeting in a month)13:02
mnasiadka* Current cycle planning13:02
mnasiadka* Additional agenda (from whiteboard)13:02
mnasiadka* Open discussion13:02
mnasiadka#topic CI Status13:02
mnasiadkaso, status is RED13:02
mnasiadka(for kolla-ansible)13:03
mnasiadkapinning openstackclient in https://review.opendev.org/c/openstack/kolla-ansible/+/92585713:03
mnasiadka#topic Release tasks13:03
mnasiadkaIt's R-8 week, nothing on our calendar13:04
jovialWill we need to pin the openstack client in kayobe too?13:04
mnasiadka#topic Regular stable releases13:04
mnasiadkajovial: it depends, if you have Nova without Cinder, or Cinder without Nova ;-)13:04
mnasiadka(in any job)13:04
jovialAhh, cheers13:04
mnasiadkabbezak: did we release anything last month?13:05
mnasiadkaI did see some movement maybe, but maybe I was just dreaming ;-)13:05
mnasiadkaAh, bbezak is not here13:05
mnasiadkaso I'll ask him later13:05
SvenKieskeyes we did13:05
mnasiadkaOk, so I'll raise this month's releases after the meeting13:06
mnasiadka#topic Current cycle planning13:06
SvenKieskehttps://review.opendev.org/c/openstack/releases/+/92518213:06
mnasiadkaSvenKieske: thanks13:06
mnasiadkaok, it's my first day after vacation - so going to work again on Noble and Ansible bump13:07
mnasiadkaand we'll see how that goes13:07
mnasiadkaI guess no other major features need discussing13:07
mnasiadka#topic Additional agenda (from whiteboard)13:08
mnasiadkaoh boy, that's long13:08
mnasiadkamhiner [2024-08-07]: please review:13:08
mnasiadkaRefactor of docker worker https://review.opendev.org/c/openstack/kolla-ansible/+/90829513:08
mnasiadkaRefactor of kolla_container_facts https://review.opendev.org/c/openstack/kolla-ansible/+/91141713:08
mnasiadkaMove actions to kolla_container_facts https://review.opendev.org/c/openstack/kolla-ansible/+/91150513:08
mnasiadkaAdd uninstall tasks to ACK https://review.opendev.org/c/openstack/ansible-collection-kolla/+/92508313:08
mnasiadkaAdd action for getting container names list https://review.opendev.org/c/openstack/kolla-ansible/+/92438913:08
mhinerAlso, the two changes mentioned in TODOs in migration patchset are done - last two on the above list.13:08
mhinerShould I add them to the migration patchset?13:08
mhinerDoing this will put migration in the fifth place in the relation chain.13:08
ihalomithere is also a problem in that docker worker refactor in upgrade tests, since they use an old release of ansible-collection-kolla which doesn't install docker 6.0.013:10
mhinerthat was discussed on previous meeting13:10
SvenKieskethat last one should be fixed by the osbpo patch13:10
ihalomithen during upgrade it switches to master version of a-c-k but bootstrap-servers is not called again 13:11
SvenKieskeafaik this one it was: https://review.opendev.org/c/openstack/ansible-collection-kolla/+/91625813:11
ihalomiSvenKieske: yes but it needs to be backported to 2024.1 release of a-c-k13:12
mnasiadkaI'm still amazed why are we not using a venv and recommending users to use a venv ;-)13:13
SvenKieskeI guess that's a different discussion that also already took place. I guess it's just missing people actually implementing it.13:14
SvenKieskeand maybe we should also use venvs ourselves everywhere, before recommending it to users ;)13:14
mnasiadkaok, reviewed some of these patches, let's see the answers later13:15
mnasiadka(r-krcek)[2024-08-07] please review13:15
mnasiadkaChange to dev mode https://review.opendev.org/c/openstack/kolla-ansible/+/925714 and https://review.opendev.org/c/openstack/kolla/+/92571213:15
mnasiadkak-a check command https://review.opendev.org/c/openstack/kolla-ansible/+/59973513:15
mnasiadkamemcache_security_strategy templating https://review.opendev.org/c/openstack/kolla-ansible/+/92544413:15
ihalomimnasiadka: what about backporting this so docker worker tests will pass? https://review.opendev.org/c/openstack/ansible-collection-kolla/+/91625813:16
mnasiadkajust propose it, we can discuss it in Gerrit13:17
SvenKieskeregarding dev mode: doesn't the proposed approach break for modern pip which doesn't allow to install outside of venv? will comment on the patch13:20
mnasiadkaWell, first of all it's not a bug fix - it's not backportable13:20
mnasiadkabut let's continue in Gerrit13:20
mnasiadkaok, and now a huge blob of text13:21
mnasiadka(SvenKieske)[2024-08-07]: How do we want to handle slurp upgrades in the future?13:21
mnasiadkacurrently it's afaik only planned to make a special upgrade command for 2023.1 release to upgrade rmq: https://review.opendev.org/c/openstack/kolla-ansible/+/91897613:21
mnasiadkaproblem I see is: won't we need a similar change for the next slurp upgrade cycle as well, and wouldn't it thus make sense to add a generic slurp-upgrade command to master?13:21
mnasiadka(mattcrees): My understanding is that going forward we should only bump RabbitMQ versions once a year during the major/odd releases. As such, we shouldn't need this additional rabbit upgrade in future releases.13:21
mnasiadkamaybe I missed something or is the intent to just always provide one-off patches for the slurp releases to keep the master codebase lean?13:21
mnasiadkaIt's also unclear to me how we handle the process when users set `rabbitmq_image` via global.yml, we need to handle these users during future upgrades, see comments on the patch.13:21
mnasiadkawhat I mean by this: how do we make sure we don't forget a reno for slurp that users need to now change their rabbitmq_image var if they changed it, etc. do we possibly want to test this in CI slurp jobs maybe?13:21
mnasiadka(mattcrees): I guess bumping RabbitMQ versions should already have a reno?13:21
mnasiadka(mattcrees):  We do definitely want to test the double rabbit upgrade in CI, I plan for a follow-up patch once the current ones get merged. 13:21
mnasiadkaOk, so RMQ double version upgrade - since we do rolling upgrade, we can't jump across two releases in one go13:22
SvenKieskeokay, if someone (mnasiadka?) could clarify if we just intend to bump rmq going forward once a year (is that possible with upstreams release cadence? i don't know), I'm fine.13:22
SvenKieskeI actually just now read mattcrees replies, thanks for those :)13:22
mnasiadkawell, RMQ in the past had two major releases per year13:23
SvenKieskeI guess my fear of missing a reno is overblown, I agree we will have reno if we bump rmq version13:23
mnasiadkaAnd it seems they are not changing this13:24
SvenKieskeso if they (rmq) stick to their release cadence we can't only upgrade once a year, can we?13:24
mnasiadkaSo I would be reluctant to agree bumping every two cycles13:24
mnasiadkaI would rather thing about having upgrade prechecks that would check if an upgrade is possible?13:24
SvenKieskeso my proposal would be to move the rmq upgrade command to regular release (master)13:24
mnasiadka*think13:24
SvenKieskecurrently it's sitting on the stable branch13:24
mattcreesWe've got a patch in progress for that kind of precheck: https://review.opendev.org/c/openstack/kolla-ansible/+/91897713:25
mnasiadkaso, let's assume somebody wants to upgrade from A to C13:25
SvenKieskeah you beat me with the link posting matt, thx :913:26
mnasiadkaso we would need to upgrade RMQ to B version, then C version13:26
mattcreesAnd I agree, if we're not sticking to one bump a year then this should get into master. It might be nice to get a less symlinky way of having multiple versions if anyone has ideas around that ;) 13:26
SvenKieskelet's fix the symlinks maybe later :D13:26
mnasiadkaSo then yes, we need a command for upgrading RMQ13:28
mnasiadkaor a slurp-upgrade subcommand that upgrades RMQ first, and then does a regular upgrade13:28
mnasiadkafine with anything really13:28
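The A-to-C case discussed above can be sketched in Python. This is only an illustration of stepping through intermediate versions one at a time; the `upgrade_path` helper and the placeholder versions "A", "B", "C" are hypothetical and are not part of kolla-ansible or of the patches linked in the meeting.

```python
def upgrade_path(current, target, releases):
    """Return the versions to step through, in order, excluding `current`.

    `releases` is an ordered list of versions, oldest first. A rolling
    upgrade cannot skip entries, so every intermediate version is included.
    """
    start = releases.index(current)
    end = releases.index(target)
    if end < start:
        raise ValueError("downgrades are not handled in this sketch")
    return releases[start + 1:end + 1]


# Hypothetical placeholder versions, matching the "A to C" example in the log.
RELEASES = ["A", "B", "C"]

for step in upgrade_path("A", "C", RELEASES):
    # A real slurp-upgrade command would upgrade RabbitMQ to `step` here.
    print(f"upgrade RabbitMQ to {step}")
```

A precheck along the lines mentioned in the meeting could use the same list to refuse an upgrade whose current version is not directly adjacent to the target.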
SvenKieskealright, my point was just that we maybe move the existing approach in gerrit just from the current stable branch to master, ty! :) we can discuss details and improvements on the patches I guess13:29
mattcreesSure, sounds like we're in agreement on the plan. I'll get those patches pointing to master when I get a moment 13:29
mnasiadkawe can and probably we should, although we can't test it in master ;-)13:30
mnasiadkaAnyway, happy to review anything13:31
SvenKieskealright, I have two little things CI and python 3.12 related13:31
SvenKieskewhich are not on the agenda, but I guess the agenda is empty?13:31
mnasiadkamattcrees: can we gather all those RMQ related patches in some Gerrit topic and reference it somewhere on the whiteboard so it's easy to track them?13:32
mnasiadka#topic Open discussion13:32
mnasiadkaSvenKieske: now you can13:32
mnasiadka;-)13:32
mattcreesYeah sure thing13:32
SvenKieskeso I tried fixing a linting issue locally, when I ran into this: https://review.opendev.org/c/openstack/kolla-ansible/+/92567113:32
SvenKieskewhich is really weird because this flake8 error should've been caught by CI long ago, and should've complained every time since it was merged13:33
SvenKieskeI still haven't figured out why CI didn't catch it (I had not that much time to look into it yet)13:33
SvenKieskesecond thing I stumbled over, when doing this, which is interesting for mnasiadka I guess, is that flake8 won't really work with python3.12 without https://review.opendev.org/c/openstack/kolla-ansible/+/92567013:34
SvenKieskeso you might want to incorporate that into your ubuntu noble testing13:34
mnasiadkaor we just merge it now ;-)13:35
SvenKieskediscussed this also with frickler and he had also no immediate idea why CI didn't catch this. also I would like to propose to run our linters on python3.12 maybe?13:35
mnasiadkawell, question is where do we install flake8 ;-)13:35
SvenKieskexD13:35
SvenKieskeI actually am not sure currently where to set python version for our linters, yeah13:35
SvenKieskeI guess I can figure that out somewhere13:35
mnasiadkaI'll have a look - basically the problem with Noble patch is that we always assumed Ubuntu nodes for py3 testing are the same as the ones we use for building Ubuntu13:36
mnasiadkaand now only the latter is moving to Noble13:36
mnasiadkawhich I'll need to solve somehow13:36
mnasiadkabut maybe we should have some additional jobs for py3.12 testing13:36
mnasiadkaI'll have a look13:36
SvenKieskeyeah, I was also thinking about better adding more jobs for py312 instead of just moving, seems more secure, even if the increased load is unfortunate13:37
SvenKieskethanks for looking into ti13:38
SvenKieskeit*13:38
SvenKieskeI still want to figure out why the linter didn't complain about that f string, that's a pretty old basic check. and I accidentally even fixed a bug there, because the hostname wasn't printed properly13:39
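The log does not show the offending line from database_shards.py, so the following is only a guess at the general shape of such a bug: an f-string that contains no placeholder, which is the pattern an "f-string is missing placeholders" lint check flags. The variable name and message text are made up for illustration.

```python
hostname = "controller0"  # hypothetical value, not taken from the patch

# Bug: the f prefix is present but the braces are missing, so the literal
# word "hostname" ends up in the message instead of the variable's value.
broken = f"Host hostname is unreachable"

# Fix: with braces, the variable is interpolated as intended.
fixed = f"Host {hostname} is unreachable"

print(broken)
print(fixed)
```

In a case like this the lint finding and the functional bug are the same line, which would match the remark that fixing the flake8 error also fixed the hostname output.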
SvenKieskeat least it didn't find more bugs..13:40
mnasiadkahaha13:40
SvenKieskebut I fear that somehow that linter is just partially broken and doesn't report anything..13:41
SvenKieskeguess that's paranoia, maybe I should add a testcase that's always failing to ensure the linter works :D13:42
SvenKieskeI also have some code around that I need to push that checks if all our imports are in our requirements.yml/txt files, it's not quite polished enough yet13:43
mnasiadkaah, hacking 3.0.1 depends on flake8<3.8.0 and >=3.6.013:44
mnasiadkaand 3.7.9 doesn't fail on this one13:44
mnasiadkaand we don't have flake8 in lint-requirements.txt13:45
SvenKieskemhm, maybe a bug in flake813:45
SvenKieskeyeah you need a newer flake8 afaik on py31213:45
SvenKieskemaybe should regularly bump those13:45
mnasiadkaI guess so13:45
mnasiadkaI don't even know why we limited hacking13:45
SvenKieskethere was some bug in the past afaik, but CI is green now ;)13:46
SvenKieskesomewhere was also the flake8 version pinned, don't quite remember if it was in u-c or somewhere?13:46
mnasiadkayeah well, need more people paying attention :)13:46
mnasiadkajust checked, flake8 is not in u-c13:46
mnasiadkaafter your patch we should be better anyway13:46
mnasiadkathanks13:46
SvenKieskeimpression I got at least was that the basic py312 jobs don't test enough stuff :D13:46
SvenKieskeyeah thanks as well, guess we can conclude the meeting13:47
mnasiadkaquestion if we should backport the CI patch - I guess it could make some sense13:47
SvenKieskethen we need first to merge all the flake8 backport fixes: https://review.opendev.org/c/openstack/kolla-ansible/+/92567113:48
SvenKieskethose are trivial, but the CI will complain13:48
mnasiadkathose are merging13:49
mnasiadkaanyway, let's finish13:49
mnasiadkathank you all for coming13:49
mnasiadka#endmeeting13:49
opendevmeetMeeting ended Wed Aug  7 13:49:12 2024 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)13:49
opendevmeetMinutes:        https://meetings.opendev.org/meetings/kolla/2024/kolla.2024-08-07-13.00.html13:49
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/kolla/2024/kolla.2024-08-07-13.00.txt13:49
opendevmeetLog:            https://meetings.opendev.org/meetings/kolla/2024/kolla.2024-08-07-13.00.log.html13:49
mnasiadkasee you next week13:49
SvenKieskethank you13:49
darmachsee you and thanks!13:49
SvenKieskeI'll be on vacation next week13:49
mmalchukmnasiadka thanks13:49
SvenKieskethat is, from wednesday onwards, I still stick around monday/tuesday13:50
opendevreviewMichal Nasiadka proposed openstack/kolla-ansible master: CI: Bump codespell pin to <3  https://review.opendev.org/c/openstack/kolla-ansible/+/92587413:52
kevkoany second core for https://review.opendev.org/c/openstack/kolla-ansible/+/92454814:22
opendevreviewMichal Arbet proposed openstack/kolla master: Fix build of prometheus-ovn-exporter  https://review.opendev.org/c/openstack/kolla/+/92588614:43
opendevreviewMerged openstack/kolla-ansible stable/2023.2: fix flake8 error in database_shards.py  https://review.opendev.org/c/openstack/kolla-ansible/+/92573815:03
opendevreviewMerged openstack/kolla-ansible stable/2023.1: fix flake8 error in database_shards.py  https://review.opendev.org/c/openstack/kolla-ansible/+/92573915:03
opendevreviewVerification of a change to openstack/kolla-ansible stable/2024.1 failed: fix flake8 error in database_shards.py  https://review.opendev.org/c/openstack/kolla-ansible/+/92573715:03
SvenKieskekevko: afaik you missed one instance of nova_libvirt_secrets mount in the libvirt_cleanup task, see comment15:07
kevkoSvenKieske: i really don't know what you mean ...15:13
kevkoSvenKieske: why do you have  a problem with volume removal in playbook for removal ? 15:14
kevkoSvenKieske: explain to me please ..15:14
SvenKieskeI will explain on the patchset, give me one second please15:16
SvenKieskereplied15:20
kevkoSvenKieske: also15:21
SvenKieskeworking on a reply to that :)15:30
kevkoSvenKieske: did you test it if you are saying it will fail ? 15:30
kevkoSvenKieske: because I did 15:30
kevkoSvenKieske: check the reply 15:30
kevkoI have ceph refactor there from 30th of January 2024 ...working in 3 productions already ...and we are unable to merge this code 15:32
kevkothis related one is actually a bug ..and we are not able to merge it even though it is a bug 15:32
kevkoi can reply as fast as i can to reviewers comments ..and then i am again waiting for months 15:32
SvenKieskereplied, and I just took the time to test it. you had to wait 9 minutes for that, for which I'm really sorry. You can pay me for faster replies I guess :P15:36
kevkoI am talking about it in general ...not now 15:37
SvenKieskeI understand it's important to fix bugs. But sometimes I feel like we rush out bugfixes without doing proper QA and then having to fix 3 more bugs introduced by one "fix". please take your time. I won't let other people pressure me to review "faster", that's just the path to more bugs, more broken code.15:37
SvenKieskeI can understand it if it takes months, sure :)15:38
SvenKieskebut imho this is a real edge-case bug. I agree we could put in a comment to remove the volume in e cycle15:38
kevkoSvenKieske: you are testing something really different :D :D :D :D 15:39
SvenKieskeno? the cleanup playbook loops over `nova_libvirt_secrets`? is that redefined somewhere? my search didn't find anything. but I will admit you know the codebase way better than me, please explain :)15:40
SvenKieskeand I hate the autogenerated kolla_container.py stuff, I always find it hard to follow through what the actual code there does..15:42
kevkoSvenKieske: where can you see {{ }} ????? https://github.com/openstack/kolla-ansible/blob/a70af118385fab4747e35dc92d2c78006f90b0c6/ansible/roles/nova-cell/tasks/libvirt-cleanup.yml#L5415:42
kevkoSvenKieske: there is a module remove_volume which is removing the item iterated from loop ...in loop you dont have jinja variables but strings ...15:43
SvenKieskeah lol, good point15:43
SvenKieskebut what happens if that volume doesn't exist? doesn't it throw an error? I would not be surprised if it just ignores it15:44
kevkoSvenKieske: it's same as you are iterating list of   /tmp/non_file /tmp/file   /tmp/dir /tmp/nodir   and you will use file  module with path  of {{ item }} and state: absent 15:44
opendevreviewMerged openstack/kolla-ansible master: CI: Pin openstackclient to <7  https://review.opendev.org/c/openstack/kolla-ansible/+/92585715:45
kevkoSvenKieske: check my comment ...you have there playbook output where i added 'non_existant_volume' and commented out all other tasks ...and just run on my testing deployment ... it's OK 15:45
SvenKieskeyeah you are right15:45
SvenKieskeand remove_volume doesn't handle any errors, lol omg..15:45
kevkowhich error ? 15:46
SvenKieskewell what does docker/podman/whatever do if you try to remove a volume by name which doesn't exist? I bet you get an error?15:46
kevkoif i am calling remove_volume  foo .. bar ... i just saying i don't want to have a volume defined with name foo and bar 15:46
SvenKieskedoesn't it actually remove the volume from disk? does it only delete the definition of the volume? ah wait that's the purge command I guess?15:47
SvenKieskeI guess I need some caffeine..15:47
kevkoSvenKieske: it is ...yes it's throwing an error ...but this is ansible python module 15:47
kevkoSvenKieske: again ..it's same as you  write a   file: /tmp/file_which_not_exist state:absent 15:48
kevkoSvenKieske: ansible will be green everything ok ...15:48
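The "state: absent" behaviour kevko describes can be sketched in plain Python. This is a minimal illustration of idempotent removal, assuming a hypothetical `ensure_absent` helper; it is not the Ansible file module and not kolla-ansible's remove_volume code.

```python
import os
import tempfile


def ensure_absent(path):
    """Make sure `path` does not exist; return True only if it was removed."""
    try:
        os.remove(path)
    except FileNotFoundError:
        # Already absent: the desired state holds, so this is not an error.
        return False
    return True


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "file_which_not_exist")
    print(ensure_absent(target))  # nothing there, reports "unchanged"
    open(target, "w").close()
    print(ensure_absent(target))  # file existed and was removed, "changed"
```

The return value plays the role of Ansible's changed flag: a missing target yields "ok, unchanged" rather than a failure, which is the difference from calling rm without --force.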
SvenKieskeI'm just saying we should either catch that error gracefully, or if we think we are a low level python lib we should at least bubble up the error to the caller15:48
SvenKieskebut that's not for the current review I guess, ok.15:48
kevkoroot@controller0:~# rm /tmp/file_which_not_exist15:48
kevkorm: cannot remove '/tmp/file_which_not_exist': No such file or directory15:48
kevkoyou can't compare ansible module and some bash call 15:48
SvenKieskeI don't. It's just good programming practice to either handle all error conditions yourself, or if you can't at least inform your caller (from the point of view of a library author) that what the caller wanted is not possible. silent failures are just bad.15:49
SvenKieskewhich is also what rm does above, it informs you it can't delete that thing. there is a "--force" switch if you don't care about errors :)15:51
kevkoSvenKieske: so you are saying you would rather write a playbook where you first use the stat module to check if the file exists in a first task and then write another task with the file module with state absent and when: my_previous_task.exists ???? 15:51
kevkoSvenKieske: or you just use the file module ...15:51
SvenKieskekevko: yes the file module handles that. I'm just saying our custom kolla modules sometimes don't handle that, which is bad imho.15:52
SvenKieskebut as I said, that's not the topic of the review, thanks for the explanation, you got my +1. unfortunately frickler currently hasn't the time to do the review (he replied as much I guess already). so maybe some other core has some cycles.15:53
kevkoSvenKieske: Sorry, but this is ridiculous. Ansible is not programming per se ... you should be programming the modules that do the low-level work for you ... and as you can see ... the module is written correctly... Ideally, Ansible should be usable in a very simple way ...15:53
kevkoSvenKieske: thank you :D ...finally :D 15:55
kevkoSvenKieske: going for cigarette :D 15:55
SvenKieskewell we actually do raise error codes for docker volumes, I was looking at wrong code, due to a test venv: https://github.com/openstack/kolla-ansible/blob/master/ansible/module_utils/kolla_docker_worker.py#L49015:57
kevkoSvenKieske: yeah, i've also checked 15:58
kevkoit's frustrating ..because this is for me for example important code 15:58
SvenKieskeyeah sorry, was really looking at some python site-packages inside a venv which had the same function name but different code15:58
kevkoi need to backport to several versions for now for our customers ...15:58
SvenKieskeshould put my test envs outside of the code repo I guess, but it's convenient15:59
SvenKieskehow did you trigger this error live, I didn't really understand that from the bugreport.15:59
kevkobecause they decided they want to have AZs and several ceph clusters ... several DCs ... and without proper kolla-ansible code you can't do it in good way 15:59
SvenKieskeah okay, so they had to change nova keyring to support multiple ceph clusters?16:00
SvenKieskedid this work in the past?16:00
kevkoSvenKieske: if you change the keyring for example ... your kolla-ansible code will report CHANGED for keyrings ...so the kolla copy script which is copying files in mounted /etc/kolla/libvirt to /var/lib/kolla/config_files in container  will actually copy to /etc/libvirt/secrets which ARE mounted twice from kolla-ansible 16:02
kevkoso you will get resource busy ....16:02
kevkoyou will not trigger this bug if you never change the keyring 16:02
kevkobut as i spent really a lot of time to write it well ...i manipulate keyrings ..change them ...etc etc ..etc ...so i've tested it more than CI can ...16:03
kevkodid you get it little bit :D ? 16:03
kevkoto be honest ...i really didn't understand what was going on at first sight 16:03
kevkosimply said ...there is a docker volume mounted to /etc/libvirt/secrets and in config.json there is a secrets folder which the kolla copy script wants to copy to /etc/libvirt/secrets and MOREOVER that script is doing rm -rf and then copies it ... but this is a mount point in the docker container and can't be removed...because it's actually a mounted volume ...16:05
kevkojust ugly loop 16:06
kevkoegg or chicken 16:06
SvenKieskemhm, I'm wondering if our copy-into-container stuff is just broken :D16:07
kevkoSvenKieske: it's not ...again ...did you get the point ? 16:08
kevkoSvenKieske: our script just does this ->   1. Read config json 2. Okay, I need to copy this path 3. rm -rf /destination 4. Copy the path from 2. to /destination ....16:09
kevkoSvenKieske: which is correct right ? 16:09
SvenKieskeI'm actually not sure if this is a good way to copy files into containers, no.16:09
SvenKieskebut I need to recheck the code, the history etc first16:10
kevkoSvenKieske: why not ? 16:10
kevkoSvenKieske: what you don't like ? 16:10
SvenKieskewell it's not a simple filesystem, it's layered stuff, it's in use by docker daemon, it's complicated. But I really need to read the code. very well possible that it's ugly but there's nothing better to do.16:12
kevkojust to finish the story ^^^    ...So script will fail and will throw resource is busy ...because that path is actually docker volume mounted from outside 16:12
kevkoSvenKieske: i actually don't know why it's not mounted directly into container and there is some "copy script" ...this is a history i don't know anything about 16:13
kevkoSvenKieske: i was just explaining where i found a bug 16:13
kevkoi mean configuration ... 16:13
SvenKieskesure, do you happen to have a pointer to our current copy stuff? I just don't find it currently :D16:14
kevkoSvenKieske: https://github.com/openstack/kolla/blob/master/docker/base/set_configs.py16:16
SvenKieskeah right, there's still a patch from you open in that area. I still have a draft comment there where I try to document how this stuff works, but it's not finished.16:18
kevkoSvenKieske: yeah, and again ..it's quite important bug and nobody cares :D 16:19
SvenKieskeI care so much that I'm taking over your/the original author's obligation and documenting the algorithm myself. this config merge is way too complex to code down without docs.16:21
kevkoSvenKieske: just imagine you change your api-paste.ini file ...or change policy.yaml ... i mean ..place the custom one to /etc/kolla/config/whatever/your_modified_config ....   but after some time you will just decide that you don't need anything custom and you will want to go back to the default one ...   you are in trouble ....16:21
SvenKieskewhich is evident by the fact that it's not bug free.16:21
kevkoSvenKieske: i also have some unit tests written somewhere ...but didn't have time for this ...as i am using it for now ..and don't have time to put pressure on gerrit .... because kolla-ansible code is harder to maintain downstream 16:22
SvenKieskemaybe we should just get rid of "merge" mechanic and just have simple "copy defaults" "copy custom stuff, you must include defaults yourself" :D16:22
SvenKieskeand everytime something is copied old config is nuked completely.16:23
SvenKieskethat would be simple16:23
SvenKieskeat least simpler than now.16:23
SvenKieskebut I guess that won't happen :D16:23
kevkoSvenKieske: this is not about the merge mechanism in kolla ..this is about the copy script ...once something is copied as config.json has defined it  ...the default is overridden ...and if you remove it ....the copy script just will not copy anything ...but the file will still be present  16:24
SvenKieskethis folder alone makes me cry: https://github.com/openstack/kolla/tree/master/docker/base mixing code and repository information (data)16:24
SvenKieskeI guess I call it a day for today16:24
kevkoSvenKieske: haha, it's only about some nice reorganization ...kolla is simpler than kolla-ansible 16:25
SvenKieskeyeah, complex enough already. at least it's not DNS I guess, which is about naming things and cache invalidation.16:26
kevkoyep16:26
kevkoI have 32 patches open 16:27
kevkosome of them are wips ...but almost everything is ready to be merged ... something about 20 - 25 with renos ..with bugs ..everything ...still in gerrit 16:28
kevkosad story 16:28
kevko:'(16:28
opendevreviewMichal Arbet proposed openstack/kolla-ansible master: Use more descriptive libvirt secret names corresponding to reality  https://review.opendev.org/c/openstack/kolla-ansible/+/92454816:38
mnasiadkakevko: we all have a high number of patches, instead of complaining - maybe we should find a solution to get them reviewed and merged?17:18
opendevreviewMichal Nasiadka proposed openstack/kolla-ansible master: CI: Bump codespell pin to <3  https://review.opendev.org/c/openstack/kolla-ansible/+/92587417:19
opendevreviewMichal Nasiadka proposed openstack/kolla-ansible master: CI: Fix cases where port is not available yet  https://review.opendev.org/c/openstack/kolla-ansible/+/92450617:20
opendevreviewMichal Nasiadka proposed openstack/kolla-ansible master: Bump Ansible versions to 2.16 and 2.17  https://review.opendev.org/c/openstack/kolla-ansible/+/92174317:20
opendevreviewMichal Nasiadka proposed openstack/kolla-ansible master: CI: Use u-c in openstack-clients role  https://review.opendev.org/c/openstack/kolla-ansible/+/92590917:25
kevkomnasiadka: yeah, but how ? Some mix of comments not-resolved/resolved, age of patch, maturity ? Queue ? 17:58
kevkoWhich will result in some type of weight ? Or what ? 17:59
arcayneIs this the best channel for Kayobe-specific questions?18:17
kevkoYes it is 18:23
kevkoBut I am not responsible to answer your question :/ but I am sure someone does 18:23
arcayneAwesome, and no worries -thanks for confirming I'm in the right channel. :)18:24
arcayneI'm working through my first attempt at using Kayobe, but I'm struggling to wrap my head around the network-interface configs. My goal is to configure each overcloud node to have two bonds (one bond per dual-port NIC), with multiple VLANs/bridges tied to each bond. I've done this plenty of times in other environments, with OVS, etc - but I'm just going dumb trying to figure out how 18:34
arcayneto translate that knowledge to Kayobe.18:34
arcayneIf I could find even an example config to reference, that would be super helpful - but the Kayobe docs don't seem to go that deep, and the couple of real-world kayobe-config repos from StackHPC are also fairly simple looking (in terms of the network-interface files, at least). Does anyone know of any other resources I can check out?18:36