opendevreview | Ivan Vnučko proposed openstack/kolla-ansible master: Add backend TLS encryption of MariaDB replication and SST traffic https://review.opendev.org/c/openstack/kolla-ansible/+/925317 | 05:07 |
---|---|---|
Core4244 | tafkamax: could you maybe help me with your keycloak settings in general, please? | 05:51 |
Core4244 | I'm lost at this point trying to get it to work | 05:51 |
Core4244 | okay, I fixed it. The problem was that my keycloak was also using a self-signed certificate, so keystone failed to get info from keycloak due to SSL errors. | 07:14 |
Core4244 | I had to disable verification in wsgi, for testing | 07:15 |
*** jhorstmann is now known as Guest2391 | 07:25 | |
tafkamax | Good job! | 07:34 |
opendevreview | Roman Krček proposed openstack/kolla-ansible master: Put memcache_security_strategy in single place. https://review.opendev.org/c/openstack/kolla-ansible/+/925444 | 07:54 |
stromgren | kevko: To continue my question from yesterday regarding libvirt secrets missing on one host. I've done some more research and can see that the files are present in /etc/libvirt/secrets and that they're copied in during the start of the container. I can set them manually, but it's not persisted through reboots. On both the working and non-working machines there are two secrets with different | 07:57 |
stromgren | uuid for "ceph client.cinder". On the working host the correct one is shown when running "virsh secret-list" but on the non-working one it shows the incorrect one. So it seems like there's an old secret lurking that needs to be removed. | 07:57 |
opendevreview | Roman Krček proposed openstack/kolla-ansible master: Put memcache_security_strategy in single place at all.yml https://review.opendev.org/c/openstack/kolla-ansible/+/925444 | 07:57 |
opendevreview | Michal Nasiadka proposed openstack/kolla master: bifrost: Fix ansible and deps installation https://review.opendev.org/c/openstack/kolla/+/924246 | 08:13 |
opendevreview | Michal Nasiadka proposed openstack/kolla master: WIP: Switch to Ubuntu 24.04 https://review.opendev.org/c/openstack/kolla/+/907589 | 08:14 |
mnasiadka | hmm, CI seems to be red | 09:18 |
opendevreview | Ivan Halomi proposed openstack/kolla-ansible master: Fix octavia test fail if a port is missing https://review.opendev.org/c/openstack/kolla-ansible/+/925852 | 09:18 |
mnasiadka | https://658019d6165ae4c8eab6-597ff148d0ea9164d11e7cb764cf9b04.ssl.cf1.rackcdn.com/924246/31/check/kolla-ansible-rocky9/c1f8070/primary/logs/ansible/init-runonce | 09:18 |
mnasiadka | frickler: seen that? | 09:18 |
frickler | mnasiadka: nope, that seems new, at least some things got merged yesterday | 09:26 |
SvenKieske | mnasiadka: maybe just a temporary infra failure? I don't recall any invasive changes in that area | 09:35 |
mnasiadka | sounds like openstackclient/sdk failure | 09:37 |
opendevreview | Pierre Riteau proposed openstack/kolla-ansible master: CI: Fix variable name for Nova noVNC FQDN https://review.opendev.org/c/openstack/kolla-ansible/+/925854 | 09:37 |
SvenKieske | mhm, afaik frickler reported a critical regression in osc, but I'm not sure it was in this area? | 09:39 |
SvenKieske | no, that was in nova: https://bugs.launchpad.net/python-openstackclient/+bug/2076212 | 09:40 |
opendevreview | Ivan Halomi proposed openstack/kolla-ansible master: Fix octavia test fail if a port is missing https://review.opendev.org/c/openstack/kolla-ansible/+/925852 | 09:49 |
frickler | kolla should not be using 7.0.0 yet, or does it? in the build log I found 6.6.0 still | 09:50 |
mnasiadka | we're not limiting that to anything | 09:52 |
mnasiadka | we just install what is latest in pypi - https://opendev.org/openstack/kolla-ansible/src/branch/master/roles/openstack-clients/defaults/main.yml | 09:53 |
frickler | hmm, ok, so if it is reproducible, that might be yet another osc regression then | 09:58 |
mnasiadka | let's see if pinning helps | 10:02 |
opendevreview | Rafal Lewandowski proposed openstack/kolla-ansible master: [WIP] Enable ML2/OVN and distributed FIP by default. https://review.opendev.org/c/openstack/kolla-ansible/+/904959 | 10:04 |
opendevreview | Michal Nasiadka proposed openstack/kolla-ansible master: CI: Pin openstackclient to <7 https://review.opendev.org/c/openstack/kolla-ansible/+/925857 | 10:04 |
mnasiadka | frickler: seems quota was migrated to SDK - https://opendev.org/openstack/python-openstackclient/commit/70fbf687cf10e67d1e05120b15533725f106a1f6 | 10:06 |
frickler | mnasiadka: yes, lots of things were, but that doesn't explain why it is failing this way | 10:08 |
mnasiadka | true, just stating the obvious :) | 10:10 |
frickler | hmm, do we actually not deploy cinder in that scenario? it is indeed missing in the catalog in those OSC cmds | 10:20 |
mnasiadka | we only deploy cinder in multinode scenarios | 10:47 |
mnasiadka | seems pinning helps | 10:47 |
frickler | yes, the regression is pretty obvious from the commit you linked | 10:48 |
frickler | fwiw I'll be offline soon for a couple of hours, will read the list of review reqs after the meeting ;) | 10:49 |
frickler | filed https://bugs.launchpad.net/python-openstackclient/+bug/2076229 | 10:56 |
opendevreview | Michal Nasiadka proposed openstack/kolla-ansible master: CI: Pin openstackclient to <7 https://review.opendev.org/c/openstack/kolla-ansible/+/925857 | 11:06 |
opendevreview | Michal Nasiadka proposed openstack/kolla-ansible master: CI: Fix cases where port is not available yet https://review.opendev.org/c/openstack/kolla-ansible/+/924506 | 11:14 |
opendevreview | Michal Nasiadka proposed openstack/kolla-ansible master: Bump Ansible versions to 2.16 and 2.17 https://review.opendev.org/c/openstack/kolla-ansible/+/921743 | 11:14 |
opendevreview | Michal Nasiadka proposed openstack/kolla-ansible master: CI: Fix cases where port is not available yet https://review.opendev.org/c/openstack/kolla-ansible/+/924506 | 11:37 |
opendevreview | Michal Nasiadka proposed openstack/kolla-ansible master: CI: Fix cases where port is not available yet https://review.opendev.org/c/openstack/kolla-ansible/+/924506 | 11:38 |
opendevreview | Michal Nasiadka proposed openstack/kolla-ansible master: CI: Fix cases where port is not available yet https://review.opendev.org/c/openstack/kolla-ansible/+/924506 | 11:38 |
opendevreview | Michal Nasiadka proposed openstack/kolla-ansible master: Bump Ansible versions to 2.16 and 2.17 https://review.opendev.org/c/openstack/kolla-ansible/+/921743 | 11:38 |
opendevreview | Michal Nasiadka proposed openstack/kolla master: WIP: Switch to Ubuntu 24.04 https://review.opendev.org/c/openstack/kolla/+/907589 | 11:43 |
SvenKieske | mnasiadka: https://review.opendev.org/c/openstack/kolla-ansible/+/925852 and https://review.opendev.org/c/openstack/kolla-ansible/+/924506 are quite similar, can we maybe merge (ha!) these together? | 11:51 |
mnasiadka | well, doesn't the second one contain more or less the first one? ;-) | 11:52 |
SvenKieske | yes, I wasn't aware there's already a fix in the making when writing the bug report | 11:53 |
opendevreview | Michal Nasiadka proposed openstack/kolla-ansible master: CI: Fix cases where port is not available yet https://review.opendev.org/c/openstack/kolla-ansible/+/924506 | 11:56 |
opendevreview | Michal Nasiadka proposed openstack/kolla-ansible master: Bump Ansible versions to 2.16 and 2.17 https://review.opendev.org/c/openstack/kolla-ansible/+/921743 | 11:56 |
opendevreview | Michal Nasiadka proposed openstack/kolla-ansible master: CI: Fix cases where port is not available yet https://review.opendev.org/c/openstack/kolla-ansible/+/924506 | 12:07 |
opendevreview | Michal Nasiadka proposed openstack/kolla-ansible master: CI: Fix cases where port is not available yet https://review.opendev.org/c/openstack/kolla-ansible/+/924506 | 12:07 |
opendevreview | Michal Nasiadka proposed openstack/kolla-ansible master: Bump Ansible versions to 2.16 and 2.17 https://review.opendev.org/c/openstack/kolla-ansible/+/921743 | 12:08 |
stromgren | kevko: I have now fixed the issue with the lurking libvirt secret. Removed the incorrect files from /etc/kolla/nova-libvirt/secrets, restarted the libvirt container and set the secret manually. It now persists between reboots. The old uuid is from a test-deploy that I did. Leaving this here for completeness if anyone else encounters the same issue. | 12:14 |
SvenKieske | stromgren: nice you found a solution. if you really want to archive this I suggest to write a blogpost or send it to the mailing list, afaik the IRC logs are not indexed by search engines so nobody will find the solution in here :) | 12:26 |
SvenKieske | stromgren: what we can learn from this: always purge test deployments completely before redeployment :) | 12:26 |
stromgren | SvenKieske: I'll write something up and send it to the mailing list. The issue was very strange because the multiple secret files existed on all hosts, but only one of them read the incorrect one. At least I'm a little more versed in how the wiring behind the scenes works now :) | 12:30 |
SvenKieske | stromgren: that's very kind of you for people also stumbling over this, much appreciated! :) | 12:30 |
SvenKieske | and yes, debugging stuff always leads to a huge increase in knowledge imho :D | 12:31 |
MikeCTZA | doing debugging myself now after a kolla-ansible / openstack upgrade failed (we had a network disconnect) and then we tried to sort it after, so finding all sorts of weirdnesses, and then we made a few mistakes after as well ... so having some "fun" at the moment | 12:32 |
SvenKieske | ouch, network outage during upgrades is for sure one of the worst cases of bad luck you can have | 12:33 |
stromgren | Sounds like you have your work cut out for you then. Always eye opening (and a bit sobering) to find your own mistakes.. | 12:35 |
SvenKieske | if you have (huge) issues during an upgrade/deployment my first advice would be to roll back, if you can. and then analyze. that assumes you can roll back, of course. | 12:36 |
SvenKieske | if you have some experience you can start fixing problems during upgrades and continue upgrading, but that is an operation that must be done very carefully imho. | 12:37 |
MikeCTZA | yes indeed, it's not massive things that aren't working but small annoying things | 12:38 |
SvenKieske | I always like to point to this piece of wisdom when it comes to any software deployments, it has a lot of good rules what to do and what not to do, not openstack specific: https://lwn.net/Articles/562333/ | 12:38 |
MikeCTZA | e.g. keystone tokens, which could be the cause of horizon and a few admin CLI commands not working, we think everything has upgraded fine now | 12:38 |
SvenKieske | mhm, you mean fernet tokens, I guess? might be worth to check the fernet containers. the rotation runs via cron jobs inside the container and is a little brittle, afaik. | 12:39 |
SvenKieske | afaik the cron job still doesn't log anywhere useful, so not to docker logs, or fluentd or anything, might be misremembering though | 12:40 |
MikeCTZA | the tokens are the same on all 3 nodes. on 2 we get errors on 1 not eg "exception.TokenNotFound(e) keystone.exception.TokenNotFound: Could not recognize Fernet token" ... so trying to figure it all out | 12:40 |
SvenKieske | MikeCTZA: on which release are you? | 12:42 |
MikeCTZA | we were on Yoga, which was the reason we moved fwd to Zed and then we had issues and mistakes were made on my side ... so now we are on Antelope/2023.1 | 12:43 |
SvenKieske | if you want to understand the low level details of fernet key rotation this bug: https://bugs.launchpad.net/kolla-ansible/+bug/1809469 and this fix have quite some details: https://opendev.org/openstack/kolla-ansible/commit/6c1442c385450004dd253f3f464fe4336194be99 | 12:43 |
SvenKieske | but those were way before your release, so I'm fairly certain you have no bug in your k-a with regards to fernet | 12:44 |
MikeCTZA | thanks I'll give them a read they may have something that helps | 12:44 |
SvenKieske | it might be that the rotation during network split or upgrade left some clients with invalid tokens | 12:44 |
SvenKieske | you might be able to just trigger key rotation manually (there's a cronjob for that in the fernet container) | 12:44 |
SvenKieske | but be careful :) | 12:45 |
MikeCTZA | we actually trashed all the tokens and started them fresh and let kolla-ansible recreate them and it didn't change things, we put the old ones back after backing them up, so it's confusing us ... we don't think we quite understand what the mechanics are | 12:48 |
SvenKieske | mhm, that's interesting | 12:48 |
MikeCTZA | hmmm I see my keystone_fernet container is now unhealthy | 12:49 |
opendevreview | Verification of a change to openstack/kayobe master failed: Add support for customising Neutron physical network names https://review.opendev.org/c/openstack/kayobe/+/922335 | 12:50 |
SvenKieske | did you fix the underlying network issue? many issues in openstack are actually side effects of the whole system being a distributed system. if the network doesn't work reliably all kinds of stuff pops up with problems. | 12:51 |
MikeCTZA | there was a switch PSU issue so that was sorted pretty quickly | 12:52 |
mnasiadka | mgoddard mnasiadka bbezak frickler kevko SvenKieske mmalchuk gkoper jangutter jsuazo jovial osmanlicilegi mattcrees dougszu darmach - meeting in 5 minutes | 12:54 |
SvenKieske | \o/ | 12:55 |
MikeCTZA | end of my work day here ... will get back to this tmrw | 12:55 |
mnasiadka | #startmeeting kolla | 13:00 |
opendevmeet | Meeting started Wed Aug 7 13:00:00 2024 UTC and is due to finish in 60 minutes. The chair is mnasiadka. Information about MeetBot at http://wiki.debian.org/MeetBot. | 13:00 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 13:00 |
opendevmeet | The meeting name has been set to 'kolla' | 13:00 |
mnasiadka | #topic rollcall | 13:00 |
mnasiadka | o/ | 13:00 |
mmalchuk | o/ | 13:00 |
mhiner | o/ | 13:00 |
SvenKieske | o/ | 13:00 |
darmach | o/ | 13:01 |
mattcrees | o/ | 13:01 |
mnasiadka | #topic agenda | 13:02 |
mnasiadka | * CI status | 13:02 |
mnasiadka | * Release tasks | 13:02 |
mnasiadka | * Regular stable releases (first meeting in a month) | 13:02 |
mnasiadka | * Current cycle planning | 13:02 |
mnasiadka | * Additional agenda (from whiteboard) | 13:02 |
mnasiadka | * Open discussion | 13:02 |
mnasiadka | #topic CI Status | 13:02 |
mnasiadka | so, status is RED | 13:02 |
mnasiadka | (for kolla-ansible) | 13:03 |
mnasiadka | pinning openstackclient in https://review.opendev.org/c/openstack/kolla-ansible/+/925857 | 13:03 |
mnasiadka | #topic Release tasks | 13:03 |
mnasiadka | It's R-8 week, nothing on our calendar | 13:04 |
jovial | Will we need to pin the openstack client in kayobe too? | 13:04 |
mnasiadka | #topic Regular stable releases | 13:04 |
mnasiadka | jovial: it depends, if you have Nova without Cinder, or Cinder without Nova ;-) | 13:04 |
mnasiadka | (in any job) | 13:04 |
jovial | Ahh, cheers | 13:04 |
mnasiadka | bbezak: did we release anything last month? | 13:05 |
mnasiadka | I did see some movement maybe, but maybe I was just dreaming ;-) | 13:05 |
mnasiadka | Ah, bbezak is not here | 13:05 |
mnasiadka | so I'll ask him later | 13:05 |
SvenKieske | yes we did | 13:05 |
mnasiadka | Ok, so I'll raise this month's releases after the meeting | 13:06 |
mnasiadka | #topic Current cycle planning | 13:06 |
SvenKieske | https://review.opendev.org/c/openstack/releases/+/925182 | 13:06 |
mnasiadka | SvenKieske: thanks | 13:06 |
mnasiadka | ok, it's my first day after vacation - so going to work again on Noble and Ansible bump | 13:07 |
mnasiadka | and we'll see how that goes | 13:07 |
mnasiadka | I guess no other major features need discussing | 13:07 |
mnasiadka | #topic Additional agenda (from whiteboard) | 13:08 |
mnasiadka | oh boy, that's long | 13:08 |
mnasiadka | mhiner [2024-08-07]: please review: | 13:08 |
mnasiadka | Refactor of docker worker https://review.opendev.org/c/openstack/kolla-ansible/+/908295 | 13:08 |
mnasiadka | Refactor of kolla_container_facts https://review.opendev.org/c/openstack/kolla-ansible/+/911417 | 13:08 |
mnasiadka | Move actions to kolla_container_facts https://review.opendev.org/c/openstack/kolla-ansible/+/911505 | 13:08 |
mnasiadka | Add uninstall tasks to ACK https://review.opendev.org/c/openstack/ansible-collection-kolla/+/925083 | 13:08 |
mnasiadka | Add action for getting container names list https://review.opendev.org/c/openstack/kolla-ansible/+/924389 | 13:08 |
mhiner | Also, the two changes mentioned in TODOs in migration patchset are done - last two on the above list. | 13:08 |
mhiner | Should I add them to the migration patchset? | 13:08 |
mhiner | Doing this will put migration in the fifth place in the relation chain. | 13:08 |
ihalomi | there is also a problem in that docker worker refactor in upgrade tests, since they use an old release of ansible-collection-kolla which doesn't install docker 6.0.0 | 13:10 |
mhiner | that was discussed on previous meeting | 13:10 |
SvenKieske | that last one should be fixed by the osbpo patch | 13:10 |
ihalomi | then during upgrade it switches to master version of a-c-k but bootstrap-servers is not called again | 13:11 |
SvenKieske | afaik this one it was: https://review.opendev.org/c/openstack/ansible-collection-kolla/+/916258 | 13:11 |
ihalomi | SvenKieske: yes but it needs to be backported to 2024.1 release of a-c-k | 13:12 |
mnasiadka | I'm still amazed why are we not using a venv and recommending users to use a venv ;-) | 13:13 |
SvenKieske | I guess that's a different discussion that also already took place. I guess it's just missing people actually implementing it. | 13:14 |
SvenKieske | and maybe we should also use venvs ourselves everywhere, before recommending it to users ;) | 13:14 |
mnasiadka | ok, reviewed some of these patches, let's see the answers later | 13:15 |
mnasiadka | (r-krcek)[2024-08-07] please review | 13:15 |
mnasiadka | Change to dev mode https://review.opendev.org/c/openstack/kolla-ansible/+/925714 and https://review.opendev.org/c/openstack/kolla/+/925712 | 13:15 |
mnasiadka | k-a check command https://review.opendev.org/c/openstack/kolla-ansible/+/599735 | 13:15 |
mnasiadka | memcache_security_strategy templating https://review.opendev.org/c/openstack/kolla-ansible/+/925444 | 13:15 |
ihalomi | mnasiadka: what about backporting this so docker worker tests will pass? https://review.opendev.org/c/openstack/ansible-collection-kolla/+/916258 | 13:16 |
mnasiadka | just propose it, we can discuss it in Gerrit | 13:17 |
SvenKieske | regarding dev mode: doesn't the proposed approach break for modern pip which doesn't allow to install outside of venv? will comment on the patch | 13:20 |
mnasiadka | Well, first of all it's not a bug fix - it's not backportable | 13:20 |
mnasiadka | but let's continue in Gerrit | 13:20 |
mnasiadka | ok, and now a huge blob of text | 13:21 |
mnasiadka | (SvenKieske)[2024-08-07]: How do we want to handle slurp upgrades in the future? | 13:21 |
mnasiadka | currently it's afaik only planned to make a special upgrade command for 2023.1 release to upgrade rmq: https://review.opendev.org/c/openstack/kolla-ansible/+/918976 | 13:21 |
mnasiadka | problem I see is: won't we need a similar change for the next slurp upgrade cycle as well, and wouldn't it thus make sense to add a generic slurp-upgrade command to master? | 13:21 |
mnasiadka | (mattcrees): My understanding is that going forward we should only bump RabbitMQ versions once a year during the major/odd releases. As such, we shouldn't need this additional rabbit upgrade in future releases. | 13:21 |
mnasiadka | maybe I missed something or is the intent to just always provide one-off patches for the slurp releases to keep the master codebase lean? | 13:21 |
mnasiadka | It's also unclear to me how we handle the process when users set `rabbitmq_image` via global.yml, we need to handle these users during future upgrades, see comments on the patch. | 13:21 |
mnasiadka | what I mean by this: how do we make sure we don't forget a reno for slurp that users need to now change their rabbitmq_image var if they changed it, etc. do we possibly want to test this in CI slurp jobs maybe? | 13:21 |
mnasiadka | (mattcrees): I guess bumping RabbitMQ versions should already have a reno? | 13:21 |
mnasiadka | (mattcrees): We do definitely want to test the double rabbit upgrade in CI, I plan for a follow-up patch once the current ones get merged. | 13:21 |
mnasiadka | Ok, so RMQ double version upgrade - since we do rolling upgrade, we can't jump across two releases in one go | 13:22 |
SvenKieske | okay, if someone (mnasiadka?) could clarify if we just intend to bump rmq going forward once a year (is that possible with upstreams release cadence? i don't know), I'm fine. | 13:22 |
SvenKieske | I actually just now read mattcrees replies, thanks for those :) | 13:22 |
mnasiadka | well, RMQ in the past had two major releases per year | 13:23 |
SvenKieske | I guess my fear of missing a reno is overblown, I agree we will have reno if we bump rmq version | 13:23 |
mnasiadka | And it seems they are not changing this | 13:24 |
SvenKieske | so if they (rmq) stick to their release cadence we can't only upgrade once a year, can we? | 13:24 |
mnasiadka | So I would be reluctant to agree bumping every two cycles | 13:24 |
mnasiadka | I would rather thing about having upgrade prechecks that would check if an upgrade is possible? | 13:24 |
SvenKieske | so my proposal would be to move the rmq upgrade command to regular release (master) | 13:24 |
mnasiadka | *think | 13:24 |
SvenKieske | currently it's sitting on the stable branch | 13:24 |
mattcrees | We've got a patch in progress for that kind of precheck: https://review.opendev.org/c/openstack/kolla-ansible/+/918977 | 13:25 |
mnasiadka | so, let's assume somebody wants to upgrade from A to C | 13:25 |
SvenKieske | ah you beat me with the link posting matt, thx :9 | 13:26 |
mnasiadka | so we would need to upgrade RMQ to B version, then C version | 13:26 |
mattcrees | And I agree, if we're not sticking to one bump a year then this should get into master. It might be nice to get a less symlinky way of having multiple versions if anyone has ideas around that ;) | 13:26 |
SvenKieske | let's fix the symlinks maybe later :D | 13:26 |
mnasiadka | So then yes, we need a command for upgrading RMQ | 13:28 |
mnasiadka | or a slurp-upgrade subcommand that upgrades RMQ first, and then does a regular upgrade | 13:28 |
mnasiadka | fine with anything really | 13:28 |
SvenKieske | alright, my point was just that we maybe move the existing approach in gerrit just from the current stable branch to master, ty! :) we can discuss details and improvements on the patches I guess | 13:29 |
mattcrees | Sure, sounds like we're in agreement on the plan. I'll get those patches pointing to master when I get a moment | 13:29 |
mnasiadka | we can and probably we should, although we can't test it in master ;-) | 13:30 |
mnasiadka | Anyway, happy to review anything | 13:31 |
SvenKieske | alright, I have two little things CI and python 3.12 related | 13:31 |
SvenKieske | which are not on the agenda, but I guess the agenda is empty? | 13:31 |
mnasiadka | mattcrees: can we gather all those RMQ related patches in some Gerrit topic and reference it somewhere on the whiteboard so it's easy to track them? | 13:32 |
mnasiadka | #topic Open discussion | 13:32 |
mnasiadka | SvenKieske: now you can | 13:32 |
mnasiadka | ;-) | 13:32 |
mattcrees | Yeah sure thing | 13:32 |
SvenKieske | so I tried fixing a linting issue locally, when I ran into this: https://review.opendev.org/c/openstack/kolla-ansible/+/925671 | 13:32 |
SvenKieske | which is really weird because this flake8 error should've been caught by CI long ago, and should've complained every time since it was merged | 13:33 |
SvenKieske | I still haven't figured out why CI didn't catch it (I had not that much time to look into it yet) | 13:33 |
SvenKieske | second thing I stumbled over, when doing this, which is interesting for mnasiadka I guess, is that flake8 won't really work with python3.12 without https://review.opendev.org/c/openstack/kolla-ansible/+/925670 | 13:34 |
SvenKieske | so you might want to incorporate that into your ubuntu noble testing | 13:34 |
mnasiadka | or we just merge it now ;-) | 13:35 |
SvenKieske | discussed this also with frickler and he had also no immediate idea why CI didn't catch this. also I would like to propose to run our linters on python3.12 maybe? | 13:35 |
mnasiadka | well, question is where do we install flake8 ;-) | 13:35 |
SvenKieske | xD | 13:35 |
SvenKieske | I actually am not sure currently where to set python version for our linters, yeah | 13:35 |
SvenKieske | I guess I can figure that out somewhere | 13:35 |
mnasiadka | I'll have a look - basically the problem with Noble patch is that we always assumed Ubuntu nodes for py3 testing are the same as the ones we use for building Ubuntu | 13:36 |
mnasiadka | and now only the latter is moving to Noble | 13:36 |
mnasiadka | which I'll need to solve somehow | 13:36 |
mnasiadka | but maybe we should have some additional jobs for py3.12 testing | 13:36 |
mnasiadka | I'll have a look | 13:36 |
SvenKieske | yeah, I was also thinking about better adding more jobs for py312 instead of just moving, seems more secure, even if the increased load is unfortunate | 13:37 |
SvenKieske | thanks for looking into ti | 13:38 |
SvenKieske | it* | 13:38 |
SvenKieske | I still want to figure out why the linter didn't complain about that f string, that's a pretty old basic check. and I accidentally even fixed a bug there, because the hostname wasn't printed properly | 13:39 |
SvenKieske | at least it didn't find more bugs.. | 13:40 |
mnasiadka | haha | 13:40 |
SvenKieske | but I fear that somehow that linter is just partially broken and doesn't report anything.. | 13:41 |
SvenKieske | guess that's paranoia, maybe I should add a testcase that's always failing to ensure the linter works :D | 13:42 |
SvenKieske | I also have some code around that I need to push that checks if all our imports are in our requirements.yml/txt files, it's not quite polished enough yet | 13:43 |
mnasiadka | ah, hacking 3.0.1 depends on flake8<3.8.0 and >=3.6.0 | 13:44 |
mnasiadka | and 3.7.9 doesn't fail on this one | 13:44 |
mnasiadka | and we don't have flake8 in lint-requirements.txt | 13:45 |
SvenKieske | mhm, maybe a bug in flake8 | 13:45 |
SvenKieske | yeah you need a newer flake8 afaik on py312 | 13:45 |
SvenKieske | maybe should regularly bump those | 13:45 |
mnasiadka | I guess so | 13:45 |
mnasiadka | I don't even know why we limited hacking | 13:45 |
SvenKieske | there was some bug in the past afaik, but CI is green now ;) | 13:46 |
SvenKieske | somewhere was also the flake8 version pinned, don't quite remember if it was in u-c or somewhere? | 13:46 |
mnasiadka | yeah well, need more people paying attention :) | 13:46 |
mnasiadka | just checked, flake8 is not in u-c | 13:46 |
mnasiadka | after your patch we should be better anyway | 13:46 |
mnasiadka | thanks | 13:46 |
SvenKieske | impression I got at least was that the basic py312 jobs don't test enough stuff :D | 13:46 |
SvenKieske | yeah thanks as well, guess we can conclude the meeting | 13:47 |
mnasiadka | question if we should backport the CI patch - I guess it could make some sense | 13:47 |
SvenKieske | then we need first to merge all the flake8 backport fixes: https://review.opendev.org/c/openstack/kolla-ansible/+/925671 | 13:48 |
SvenKieske | those are trivial, but the CI will complain | 13:48 |
mnasiadka | those are merging | 13:49 |
mnasiadka | anyway, let's finish | 13:49 |
mnasiadka | thank you all for coming | 13:49 |
mnasiadka | #endmeeting | 13:49 |
opendevmeet | Meeting ended Wed Aug 7 13:49:12 2024 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 13:49 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/kolla/2024/kolla.2024-08-07-13.00.html | 13:49 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/kolla/2024/kolla.2024-08-07-13.00.txt | 13:49 |
opendevmeet | Log: https://meetings.opendev.org/meetings/kolla/2024/kolla.2024-08-07-13.00.log.html | 13:49 |
mnasiadka | see you next week | 13:49 |
SvenKieske | thank you | 13:49 |
darmach | see you and thanks! | 13:49 |
SvenKieske | I'll be on vacation next week | 13:49 |
mmalchuk | mnasiadka thanks | 13:49 |
SvenKieske | that is, from wednesday onwards, I still stick around monday/tuesday | 13:50 |
opendevreview | Michal Nasiadka proposed openstack/kolla-ansible master: CI: Bump codespell pin to <3 https://review.opendev.org/c/openstack/kolla-ansible/+/925874 | 13:52 |
kevko | any second core for https://review.opendev.org/c/openstack/kolla-ansible/+/924548 | 14:22 |
opendevreview | Michal Arbet proposed openstack/kolla master: Fix build of prometheus-ovn-exporter https://review.opendev.org/c/openstack/kolla/+/925886 | 14:43 |
opendevreview | Merged openstack/kolla-ansible stable/2023.2: fix flake8 error in database_shards.py https://review.opendev.org/c/openstack/kolla-ansible/+/925738 | 15:03 |
opendevreview | Merged openstack/kolla-ansible stable/2023.1: fix flake8 error in database_shards.py https://review.opendev.org/c/openstack/kolla-ansible/+/925739 | 15:03 |
opendevreview | Verification of a change to openstack/kolla-ansible stable/2024.1 failed: fix flake8 error in database_shards.py https://review.opendev.org/c/openstack/kolla-ansible/+/925737 | 15:03 |
SvenKieske | kevko: afaik you missed one instance of nova_libvirt_secrets mount in the libvirt_cleanup task, see comment | 15:07 |
kevko | SvenKieske: i really don't know what you mean ... | 15:13 |
kevko | SvenKieske: why do you have a problem with volume removal in playbook for removal ? | 15:14 |
kevko | SvenKieske: explain to me please .. | 15:14 |
SvenKieske | I will explain on the patchset, give me one second please | 15:16 |
SvenKieske | replied | 15:20 |
kevko | SvenKieske: also | 15:21 |
SvenKieske | working on a reply to that :) | 15:30 |
kevko | SvenKieske: did you test it if you are saying it will fail ? | 15:30 |
kevko | SvenKieske: because I did | 15:30 |
kevko | SvenKieske: check the reply | 15:30 |
kevko | I have ceph refactor there from 30th of January 2024 ...working in 3 productions already ...and we are unable to merge this code | 15:32 |
kevko | this related is actually bug ..and we are not able to merge even if it is bug | 15:32 |
kevko | i can reply as fast as i can to reviewers comments ..and then i am again waiting for months | 15:32 |
SvenKieske | replied, and I just took the time to test it. you had to wait 9 minutes for that, for which I'm really sorry. You can pay me for faster replies I guess :P | 15:36 |
kevko | I am talking about it in general ...not now | 15:37 |
SvenKieske | I understand it's important to fix bugs. But sometimes I feel like we rush out bugfixes without doing proper QA and then have to fix 3 more bugs introduced by one "fix". please take your time. I won't let other people pressure me to review "faster", that's just the path to more bugs, more broken code. | 15:37 |
SvenKieske | I can understand it if it takes months, sure :) | 15:38 |
SvenKieske | but imho this is a real edge-case bug. I agree we could put in a comment to remove the volume in e cycle | 15:38 |
kevko | SvenKieske: you are testing something reallly different :D :D :D :D | 15:39 |
SvenKieske | no? the cleanup playbook loops over `nova_libvirt_secrets`? is that redefined somewhere? my search didn't find anything. but I will admit you know the codebase way better than me, please explain :) | 15:40 |
SvenKieske | and I hate the autogenerated kolla_container.py stuff, I always find it hard to follow through what the actual code there does.. | 15:42 |
kevko | SvenKieske: where can you see {{ }} ????? https://github.com/openstack/kolla-ansible/blob/a70af118385fab4747e35dc92d2c78006f90b0c6/ansible/roles/nova-cell/tasks/libvirt-cleanup.yml#L54 | 15:42 |
kevko | SvenKieske: there is a module remove_volume which is removing the item iterated from loop ...in loop you dont have jinja variables but strings ... | 15:43 |
SvenKieske | ah lol, good point | 15:43 |
SvenKieske | but what happens if that volume doesn't exist? doesn't it throw an error? I would not be surprised if it just ignores it | 15:44 |
kevko | SvenKieske: it's the same as iterating over a list of /tmp/non_file /tmp/file /tmp/dir /tmp/nodir and using the file module with a path of {{ item }} and state: absent | 15:44 |
opendevreview | Merged openstack/kolla-ansible master: CI: Pin openstackclient to <7 https://review.opendev.org/c/openstack/kolla-ansible/+/925857 | 15:45 |
kevko | SvenKieske: check my comment ...you have there playbook output where i added 'non_existant_volume' and commented out all other tasks ...and just run on my testing deployment ... it's OK | 15:45 |
SvenKieske | yeah you are right | 15:45 |
SvenKieske | and remove_volume doesn't handle any errors, lol omg.. | 15:45 |
kevko | which error ? | 15:46 |
SvenKieske | well what does docker/podman/whatever do if you try to remove a volume by name which doesn't exist? I bet you get an error? | 15:46 |
kevko | if i am calling remove_volume foo .. bar ... i am just saying i don't want to have a volume defined with name foo and bar | 15:46 |
SvenKieske | doesn't it actually remove the volume from disk? does it only delete the definition of the volume? ah wait that's the purge command I guess? | 15:47 |
SvenKieske | I guess I need some caffeine.. | 15:47 |
kevko | SvenKieske: it is ...yes it's throwing an error ...but this is ansible python module | 15:47 |
kevko | SvenKieske: again ..it's the same as writing file: /tmp/file_which_not_exist state: absent | 15:48 |
kevko | SvenKieske: ansible will be green everything ok ... | 15:48 |
SvenKieske | I'm just saying we should either catch that error gracefully, or if we think we are a low level python lib we should at least bubble up the error to the caller | 15:48 |
SvenKieske | but that's not for the current review I guess, ok. | 15:48 |
kevko | root@controller0:~# rm /tmp/file_which_not_exist | 15:48 |
kevko | rm: cannot remove '/tmp/file_which_not_exist': No such file or directory | 15:48 |
kevko | you can't compare ansible module and some bash call | 15:48 |
SvenKieske | I don't. It's just good programming practice to either handle all error conditions yourself, or if you can't at least inform your caller (from the point of view of a library author) that what the caller wanted is not possible. silent failures are just bad. | 15:49 |
SvenKieske | which is also what rm does above, it informs you it can't delete that thing. there is a "--force" switch if you don't care about errors :) | 15:51 |
kevko | SvenKieske: so you are saying you would rather write a playbook where a first task uses the stat module to check if the file exists, and then another task uses the file module with state absent and when: my_previous_task.exists ???? | 15:51 |
kevko | SvenKieske: or you just use the file module ... | 15:51 |
SvenKieske | kevko: yes the file module handles that. I'm just saying our custom kolla modules sometimes don't handle that, which is bad imho. | 15:52 |
SvenKieske | but as I said, that's not the topic of the review, thanks for the explanation, you got my +1. unfortunately frickler currently hasn't the time to do the review (he replied as much I guess already). so maybe some other core has some cycles. | 15:53 |
kevko | SvenKieske: Sorry, but this is ridiculous. Ansible is not programming per se ... you should be programming the modules that do the low-level work for you ... and as you can see ... the module is written correctly... Ideally, Ansible should be usable in a very simple way ... | 15:53 |
kevko | SvenKieske: thank you :D ...finally :D | 15:55 |
kevko | SvenKieske: going for cigarette :D | 15:55 |
SvenKieske | well we actually do raise error codes for docker volumes, I was looking at wrong code, due to a test venv: https://github.com/openstack/kolla-ansible/blob/master/ansible/module_utils/kolla_docker_worker.py#L490 | 15:57 |
kevko | SvenKieske: yeah, i've also checked | 15:58 |
kevko | it's frustrating ..because this is for me for example important code | 15:58 |
SvenKieske | yeah sorry, was really looking at some python site-packages inside a venv which had the same function name but different code | 15:58 |
kevko | i need to backport to several versions for now for our customers ... | 15:58 |
SvenKieske | should put my test envs outside of the code repo I guess, but it's convenient | 15:59 |
SvenKieske | how did you trigger this error live, I didn't really understand that from the bugreport. | 15:59 |
kevko | because they decided they want to have AZs and several ceph clusters ... several DCs ... and without proper kolla-ansible code you can't do it in good way | 15:59 |
SvenKieske | ah okay, so they had to change nova keyring to support multiple ceph clusters? | 16:00 |
SvenKieske | did this work in the past? | 16:00 |
kevko | SvenKieske: if you change the keyring for example ... your kolla-ansible code will report CHANGED for keyrings ...so the kolla copy script which is copying files in mounted /etc/kolla/libvirt to /var/lib/kolla/config_files in container will actually copy to /etc/libvirt/secrets which ARE mounted twice from kolla-ansible | 16:02 |
kevko | so you will get resource busy .... | 16:02 |
kevko | you will not trigger this bug if you never change the keyring | 16:02 |
kevko | but as i spent a lot of time writing it properly ...i manipulate keyrings ..change them ...etc etc ..etc ...so i've tested it more than CI can ... | 16:03 |
kevko | did you get it little bit :D ? | 16:03 |
kevko | to be honest ...i really didn't understand what was going on at first sight | 16:03 |
kevko | simply said ...there is docker volume mounted to /etc/libvirt/secrets and in config.json there is a secrets folder which kolla copy script wants to copy to /etc/libvirt/secrets and MOREOVER that script is doing rm -rf and then copy it ... but this is mount point in docker container and can't be removed...because it's actually a volume mounted ... | 16:05 |
kevko | just ugly loop | 16:06 |
kevko | egg or chicken | 16:06 |
SvenKieske | mhm, I'm wondering if our copy-into-container stuff is just broken :D | 16:07 |
kevko | SvenKieske: it's not ...again ...did you get the point ? | 16:08 |
kevko | SvenKieske: our script just does this -> 1. Read config json 2. Okay, I need to copy this path 3. rm -rf /destination 4. Copy the path from 2. to /destination .... | 16:09 |
kevko | SvenKieske: which is correct right ? | 16:09 |
SvenKieske | I'm actually not sure if this is a good way to copy files into containers, no. | 16:09 |
SvenKieske | but I need to recheck the code, the history etc first | 16:10 |
kevko | SvenKieske: why not ? | 16:10 |
kevko | SvenKieske: what don't you like ? | 16:10 |
SvenKieske | well it's not a simple filesystem, it's layered stuff, it's in use by docker daemon, it's complicated. But I really need to read the code. very well possible that it's ugly but there's nothing better to do. | 16:12 |
kevko | just to finish the story ^^^ ...So script will fail and will throw resource is busy ...because that path is actually docker volume mounted from outside | 16:12 |
kevko | SvenKieske: i actually don't know why it's not mounted directly into container and there is some "copy script" ...this is a history i don't know anything about | 16:13 |
kevko | SvenKieske: i was just explaining where i found a bug | 16:13 |
kevko | i mean configuration ... | 16:13 |
SvenKieske | sure, do you happen to have a pointer to our current copy stuff? I just don't find it currently :D | 16:14 |
kevko | SvenKieske: https://github.com/openstack/kolla/blob/master/docker/base/set_configs.py | 16:16 |
SvenKieske | ah right, there's still a patch from you open in that area. I still have a draft comment there where I try to document how this stuff works, but it's not finished. | 16:18 |
kevko | SvenKieske: yeah, and again ..it's quite important bug and nobody cares :D | 16:19 |
SvenKieske | I care so much that I take over your/the original author's obligation and document the algorithm myself. this config merge is way too complex to code down without docs. | 16:21 |
kevko | SvenKieske: just imagine you change your api-paste.ini file ...or change policy.yaml ... i mean ..place the custom one to /etc/kolla/config/whatever/your_modified_config .... but after some time you will just decide that you don't need anything custom and you will want to go back to the default one ... you are in trouble .... | 16:21 |
SvenKieske | which is evident by the fact that it's not bug free. | 16:21 |
kevko | SvenKieske: i also have some unit tests written somewhere ...but didn't have time for this ...as i am using it for now ..and don't have time to put pressure on gerrit .... because kolla-ansible code is harder to maintain downstream | 16:22 |
SvenKieske | maybe we should just get rid of "merge" mechanic and just have simple "copy defaults" "copy custom stuff, you must include defaults yourself" :D | 16:22 |
SvenKieske | and everytime something is copied old config is nuked completely. | 16:23 |
SvenKieske | that would be simple | 16:23 |
SvenKieske | at least simpler than now. | 16:23 |
SvenKieske | but I guess that won't happen :D | 16:23 |
kevko | SvenKieske: this is not about merge mechanism in kolla ..this is about copy script ...once something is copied as config.json has defined it ...the default is overridden ...and if you remove it ....copy script just will not copy anything ...but the file will be still present | 16:24 |
SvenKieske | this folder alone makes me cry: https://github.com/openstack/kolla/tree/master/docker/base mixing code and repository information (data) | 16:24 |
SvenKieske | I guess I call it a day for today | 16:24 |
kevko | SvenKieske: haha, it's only about some nice reorganization ...kolla is simpler than kolla-ansible | 16:25 |
SvenKieske | yeah, complex enough already. at least it's not DNS I guess, which is about naming things and cache invalidation. | 16:26 |
kevko | yep | 16:26 |
kevko | I have 32 patches open | 16:27 |
kevko | some of them are wips ...but almost everything is ready to be merged ... something about 20 - 25 with renos ..with bugs ..everything ...still in gerrit | 16:28 |
kevko | sad story | 16:28 |
kevko | :'( | 16:28 |
opendevreview | Michal Arbet proposed openstack/kolla-ansible master: Use more descriptive libvirt secret names corresponding to reality https://review.opendev.org/c/openstack/kolla-ansible/+/924548 | 16:38 |
mnasiadka | kevko: we all have a high number of patches, instead of complaining - maybe we should find a solution to get them reviewed and merged? | 17:18 |
opendevreview | Michal Nasiadka proposed openstack/kolla-ansible master: CI: Bump codespell pin to <3 https://review.opendev.org/c/openstack/kolla-ansible/+/925874 | 17:19 |
opendevreview | Michal Nasiadka proposed openstack/kolla-ansible master: CI: Fix cases where port is not available yet https://review.opendev.org/c/openstack/kolla-ansible/+/924506 | 17:20 |
opendevreview | Michal Nasiadka proposed openstack/kolla-ansible master: Bump Ansible versions to 2.16 and 2.17 https://review.opendev.org/c/openstack/kolla-ansible/+/921743 | 17:20 |
opendevreview | Michal Nasiadka proposed openstack/kolla-ansible master: CI: Use u-c in openstack-clients role https://review.opendev.org/c/openstack/kolla-ansible/+/925909 | 17:25 |
kevko | mnasiadka: yeah, but how ? Some mix of comments not-resolved/resolved, age of patch, maturity ? Queue ? | 17:58 |
kevko | Which will result in some type of weight ? Or what ? | 17:59 |
arcayne | Is this the best channel for Kayobe-specific questions? | 18:17 |
kevko | Yes it is | 18:23 |
kevko | But I am not the one to answer your question :/ but I am sure someone will | 18:23 |
arcayne | Awesome, and no worries -thanks for confirming I'm in the right channel. :) | 18:24 |
arcayne | I'm working through my first attempt at using Kayobe, but I'm struggling to wrap my head around the network-interface configs. My goal is to configure each overcloud node to have two bonds (one bond per dual-port NIC), with multiple VLANs/bridges tied to each bond. I've done this plenty of times in other environments, with OVS, etc - but I'm just going dumb trying to figure out how | 18:34 |
arcayne | to translate that knowledge to Kayobe. | 18:34 |
arcayne | If I could find even an example config to reference, that would be super helpful - but the Kayobe docs don't seem to go that deep, and the couple of real-world kayobe-config repos from StackHPC are also fairly simple looking (in terms of the network-interface files, at least). Does anyone know of any other resources I can check out? | 18:36 |
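
The copy sequence kevko lists at 16:09 (read config.json, remove the destination, copy the source over it) can be sketched roughly as below. This is an illustration only, not the code from set_configs.py: the function names `delete_path` and `copy_configs` and the dict layout are assumptions made for this sketch. It shows the two behaviours discussed in the log: the destination is wiped before every copy (which would be expected to fail with a "resource busy" error when the destination is a mount point, since a mount point cannot be removed), and a destination whose entry has been dropped from the config is never touched again, so the old file stays in place.

```python
import os
import shutil
import tempfile


def delete_path(path):
    """The 'rm -rf /destination' step: remove a file or tree if present."""
    if os.path.isdir(path) and not os.path.islink(path):
        # Assumption: if `path` is a mount point (e.g. a mounted volume),
        # removing the directory itself is expected to raise OSError.
        shutil.rmtree(path)
    elif os.path.lexists(path):
        os.remove(path)
    # A missing path is not an error: the desired state is already reached.


def copy_configs(config):
    """Apply each entry of a config.json-like dict: delete, then copy."""
    for entry in config["config_files"]:
        source, dest = entry["source"], entry["dest"]
        delete_path(dest)
        if os.path.isdir(source):
            shutil.copytree(source, dest)
        else:
            os.makedirs(os.path.dirname(dest), exist_ok=True)
            shutil.copy2(source, dest)


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    src = os.path.join(root, "secrets-src")
    dest = os.path.join(root, "secrets-dest")
    os.makedirs(src)
    with open(os.path.join(src, "new.xml"), "w") as handle:
        handle.write("new secret")

    # First run: the entry is present, so dest is replaced by src.
    copy_configs({"config_files": [{"source": src, "dest": dest}]})
    print(sorted(os.listdir(dest)))

    # Second run: the entry was dropped, so nothing is copied or removed
    # and the previously copied file is still there.
    copy_configs({"config_files": []})
    print(os.path.exists(os.path.join(dest, "new.xml")))

    shutil.rmtree(root)
```

Treating a missing path as a no-op in `delete_path` mirrors the `state: absent` behaviour of the Ansible file module that comes up around 15:44 to 15:52.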