Friday, 2025-06-06

clarkbfungi: should we approve https://review.opendev.org/c/opendev/system-config/+/951873 ?14:53
corvus+214:55
fungiclarkb: yes14:56
clarkbI've approved it15:05
fungithanks!15:06
Ramereth[m]Yes, everything still seems stable and good like it was before?15:07
clarkbmnasiadka: frickler ^15:07
clarkbRamereth[m]: the only reports I've heard are positive so that is my assumption15:07
mnasiadkaYes, I commented the other day that all seems to be back to normal :)15:10
fricklerI didn't check today, but all arm builds passed on https://review.opendev.org/c/opendev/zuul-providers/+/951471 yesterday, too15:23
opendevreviewMerged opendev/system-config master: Block access to Gitea's archive feature  https://review.opendev.org/c/opendev/system-config/+/95187317:01
clarkbthat is behind the hourly jobs17:04
clarkbthe gitea job failed doing an apt cache update on gitea10. Due to running each gitea in sequence that is as far as we got and I think only gitea09 may have a new app.ini config17:13
clarkbI'm going to try a manual apt-get update on gitea10 now17:13
clarkbTemporary failure resolving 'us.archive.ubuntu.com'17:14
clarkb;; communications error to 127.0.0.1#53: connection refused17:15
fungilovely17:16
fungidns config problem?17:16
clarkberror: failed to read /var/lib/unbound/root.key17:16
clarkbfrom journalctl -u unbound17:16
clarkbfungi: I suspect this is fallout from full disks since 10, 11, 13, and 14 all seem to have the same issue17:17
clarkb09 and 12 appear to be ok and they were also the ones that didn't fill disks17:18
fungiah, i wonder if ansible tried to overwrite the file while the disk was full17:18
clarkbfungi: maybe we can modify /etc/resolv.conf to point at 1.1.1.1/8.8.8.8 temporarily, then reinstall the unbound security package infos then restart unbound?17:18
clarkbthen restore /etc/resolv.conf?17:18
clarkbthen reenqueue the job17:18
clarkbany chance you're able to dig into that?17:19
fungisure, on it17:19
clarkbI want to say that data comes from that package that we now have to explicitly install. But ya maybe package updates went sideways or something?17:20
fungiunfortunately the package is not still present in /var/cache/apt/archives/ or i'd just reinstall from there17:20
clarkb-rw-r--r--  1 unbound unbound    0 Jun  4 14:10 root.key17:20
clarkb0 bytes definitely not something I would expect unbound to be able to handle17:20
fungii think the file is generated, because it's not in the list of files shipped in any package17:21
clarkbah17:21
clarkbso script ran and wrote out bytes that couldn't be preserved17:22
clarkbpossible it worked until we rebooted too if things were in cache somehow?17:22
fungithe unbound-anchor package includes a /usr/sbin/unbound-anchor tool which seems to do that17:23
fungiyeah, i expect the running unbound daemon had it in process memory until the reboot17:24
fungilooks like /etc/init.d/unbound is supposed to run it?17:24
fungispecifically on start it runs `/usr/lib/unbound/package-helper root_trust_anchor_update 2>&1 | logger -p daemon.info -t unbound-anchor`17:25
clarkbI wonder if systemd even runs that script anymore?17:26
clarkb/usr/lib/systemd/system/unbound.service also exists but it also has a pre start helper to run that17:26
clarkbmaybe we can just stop start unbound?17:27
fungilooks like systemd is using /usr/lib/systemd/system/unbound.service17:27
clarkbperhaps it's a chicken and egg?17:28
fungiwhich runs `/usr/lib/unbound/package-helper root_trust_anchor_update` in ExecStartPre17:28
clarkbdo we need working dns for that script to run?17:28
fungimaybe just removing the empty /var/lib/unbound/root.key would work?17:28
clarkboh ya that could be. Maybe it sees the file is present and doesn't check for non zero size17:29
fungialternatively i can shuffle a copy over from another machine like gitea0917:29
clarkbfungi: I think we can try (re)moving the empty file then do a systemctl stop unbound && systemctl start unbound and see if it is happier17:29
fungiyeah, /var/lib/unbound/root.key has content after that17:30
fungiand apt update works again17:30
clarkband there is an unbound process cool17:30
fungii'll repeat on the other affected servers17:30
clarkbthanks17:30
clarkbalso gitea09 did not restart as expected so we need to followup and do that too17:31
clarkbbut first fix dns, reenqueue the change to deploy so that the other 5 get updated app.ini files17:31
fungiokay, on gitea10, 11, 13 and 14 i removed the empty root.key file, restarted unbound and tested apt update is working17:33
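The repair described above (an empty /var/lib/unbound/root.key is removed, then unbound is stopped and started so its pre-start helper regenerates the file) could be sketched roughly as follows. This is only a dry-run planner under stated assumptions: the function name `unbound_repair_commands` is hypothetical, and it prints the commands rather than running them.

```python
import os
import shlex
from typing import List

# Path taken from the log above.
ROOT_KEY = "/var/lib/unbound/root.key"


def unbound_repair_commands(root_key: str = ROOT_KEY) -> List[str]:
    """Return the shell commands to run if root_key is a zero-byte file.

    Hypothetical helper: it only plans the steps discussed in the log
    and never executes anything itself.
    """
    try:
        size = os.path.getsize(root_key)
    except FileNotFoundError:
        # A missing file is out of scope for this sketch.
        return []
    if size > 0:
        # The trust anchor has content, so nothing needs fixing.
        return []
    return [
        f"rm {shlex.quote(root_key)}",  # drop the empty file
        "systemctl stop unbound",
        "systemctl start unbound",      # ExecStartPre recreates root.key
        "apt-get update",               # confirm DNS resolution works again
    ]


if __name__ == "__main__":
    for command in unbound_repair_commands():
        print(command)
```

Printing the plan instead of executing it keeps the sketch safe to run on any host; an operator would still review and run each command by hand, as was done here.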
fungishall i do the deploy reenqueue as well or were you working on that already?17:33
clarkbgo for it17:33
fungiwell, that was unexpected... my workstation just spontaneously powered itself off and back on17:34
fungii'll be a few then i'll get that going17:35
clarkbok17:35
fungii should be in the deploy pipeline again now17:38
fungier, it should be17:38
clarkbI see it17:38
fungias do i17:39
fungigoing to take this opportunity to do a bit of power recabling on the workstation while that re-runs17:39
clarkbfungi: the safe gitea restart process is to disable the host in haproxy, then docker-compose down on the host, then docker-compose up -d mariadb memcached zuul-web, then wait for the web to load on that particular node, then docker-compose up -d to start the ssh daemon, finally reenable the host in haproxy17:39
clarkbThe haproxy stuff is probably not super necessary either as it should notice that things are down, but that risks breaking a non zero number of connections whereas stopping things in haproxy first should cause stuff to drain a bit more17:40
clarkband then the container start order is done so that gerrit doesn't try to replicate before the gitea service can be aware of the updates (the ssh container will accept the updates and the gitea service won't be aware of those updates even though they will be in the git repo)17:41
fungiwe have that encoded in a playbook too yes?17:42
clarkbya17:42
fungiwould it make more sense to run that manually from bridge?17:42
clarkbwell the deploy job is already running the same playbook. The problem is those tasks only trigger if the container images change currently17:43
clarkband since there are no new container images they won't run17:43
clarkbI guess we could make a new one off playbook that doesn't have that restriction similar to how we have the zuul restart playbooks17:44
clarkbthen run that17:44
clarkbthat isn't a terrible idea17:44
fungiyeah, i guess trying to configure it to also run for config changes might be overkill17:46
fungithis doesn't come up often17:46
clarkbhttps://zuul.opendev.org/t/openstack/build/6e81a7d43ad841c1afebb6ae95774f33 success this time around17:48
clarkband I see the new config option in the app.ini file on gitea1017:48
clarkbI'm happy to just sort of work through these really quickly unless you think we should try and sort out a better automated system or tooling first17:49
funginah, manual process is fine, it was more a matter of me remembering the mystic socat incantations (or more likely grepping them out of my shell history on the lb)17:50
clarkbit should be documented in our docs too17:50
clarkbbut ya I often just look at my history :)17:50
fungiah, right, good old-fashioned runbooks17:52
clarkbjust to avoid any confusion and stepping on toes. Should I start on that now with gitea09 or did you want to do it?17:54
fungii was going to do it, just making sure i have the right keys in place so i don't need my workstation for this18:01
clarkbok I'll hang around and can be an extra set of eyes and hands if necessary18:03
fungii have a root screen session going on gitea-lb0218:04
fungii'll start by disabling gitea09 in the http and https pools18:04
fungiboth are showing in maint now18:07
fungiproceeding with the docker-compose down now18:08
fungion gitea0918:09
fungiclarkb: when you mentioned the specific subset of containers to up, you listed zuul-web. was that supposed to be something else?18:09
clarkbfungi: yes sorry it should be gitea-web18:10
clarkbwe want to start the three containers that are not gitea-ssh18:10
fungiah, yep i see it18:10
clarkbthen wait for the web service to respond then start the last container gitea-ssh18:10
fungiokay, those three are up and gitea-ssh is still down18:10
clarkband https://gitea09.opendev.org:3081/opendev/system-config/ loads so now you can just do a default up -d to start the last container18:11
clarkb(also if you hit the code dropdown on that page there are no more artifact links as expected)18:11
fungiawesome!18:12
clarkbwe just need the gitea web service sufficiently up that it can process gerrit replication and having the web service respond seems to be sufficient indicator18:12
fungiokay, did a full docker-compose up -d now and will add gitea09 back to the pools and start the same process on 10 through 14 in serial18:13
clarkbyup sounds great18:14
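The restart order walked through above (with gitea-web in place of the mistaken zuul-web) can be written down as an ordered plan. This is a rough sketch, not the playbook opendev actually runs: the HAProxy socket path and backend names are placeholders that do not appear in the log, and `gitea_restart_plan` is a hypothetical helper that only builds command strings.

```python
from typing import List, Sequence, Tuple

# Assumptions: the log names neither the HAProxy stats socket nor the
# backend (pool) names, so these are placeholders to replace.
HAPROXY_SOCKET = "/path/to/haproxy/stats.sock"
BACKENDS = ("example_git_http", "example_git_https")


def haproxy_cmd(action: str, backend: str, server: str, socket: str) -> str:
    """Build a socat one-liner for HAProxy's runtime API."""
    return (
        f'echo "{action} server {backend}/{server}" '
        f"| socat stdio unix-connect:{socket}"
    )


def gitea_restart_plan(
    server: str,
    backends: Sequence[str] = BACKENDS,
    socket: str = HAPROXY_SOCKET,
) -> List[Tuple[str, str]]:
    """Return (where, command) steps in the order described in the log."""
    plan: List[Tuple[str, str]] = []
    # 1. Put the server into maintenance in every pool so connections drain.
    for backend in backends:
        plan.append(("lb", haproxy_cmd("disable", backend, server, socket)))
    # 2. Stop all containers on the gitea host.
    plan.append((server, "docker-compose down"))
    # 3. Start everything except gitea-ssh.
    plan.append((server, "docker-compose up -d mariadb memcached gitea-web"))
    # 4. Manual check: the web UI on this node must load before continuing.
    plan.append((server, "# wait until the gitea web UI responds"))
    # 5. Start the remaining container (gitea-ssh).
    plan.append((server, "docker-compose up -d"))
    # 6. Put the server back into rotation.
    for backend in backends:
        plan.append(("lb", haproxy_cmd("enable", backend, server, socket)))
    return plan


if __name__ == "__main__":
    for where, command in gitea_restart_plan("gitea09"):
        print(f"[{where}] {command}")
```

Holding gitea-ssh back until the web service responds is the point of the ordering: Gerrit replicates over ssh, so the ssh container should not accept pushes before the gitea service can see them.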
clarkbslittle: ^ fyi I think it's come up before that starlingx has asked about non working gitea archive links. We're formally disabling them today due to all the problems we've had with them historically18:21
clarkbsomething we probably would've done sooner if we had realized it was an option. I think we did our big initial debug around this and it wasn't an option then didn't realize they later made it a possibility18:21
fungiokay, all done18:27
clarkbI hopped off the screen on lb02. I think you can close it whenever you're comfortable18:28
clarkband ya I was following along and checking the code dropdowns as each server finished. I think it looks good from here18:28
fungicool18:29
fungiand done18:30
clarkbthanks!18:30
fungithank you for the help!18:31
johnsomHi packaging gurus, I have a question you might know the answer to.20:25
johnsomwe have: https://github.com/openstack/octavia/blob/master/setup.cfg#L2720:25
johnsomdata_files we like to include in the octavia package. I'm trying to fill out a pyproject.toml for octavia. Since we are still using pbr which creates the manifest, I assume I don't need to include anything special in the pyproject.toml for those. Is that a correct assumption?20:26
Clark[m]Yes PBR should continue to honor the setup.cfg20:28
johnsomExcellent, thank you20:28
fungithough you could move them to the equivalent pyproject keys and let setuptools handle them directly from there as well20:29
fungii don't recall the exact names off the top of my head, but they should be in the pyproject spec20:30
johnsomI looked at using "package_data", but I was worried it might conflict with the manifest file settings. (basically this whole data_files to pyproject.toml is a bit foggy for me)20:30
johnsomHere is the proposed patch if you are curious what I am up to: https://review.opendev.org/c/openstack/octavia/+/951994/1/pyproject.toml20:33
fungiwhat i usually do to test is just run pyproject-build in the old and new versions of the repo, then list the file contents of the resulting sdist/wheel files and compare them. also sometimes extract the metadata files and diff them too20:34
fungidid that extensively for the setup.cfg->pyproject.toml conversions in several opendev tools20:35
johnsomYeah, that is a good idea. I used validate-pyproject for my syntax check20:35
johnsomI need to do a bit more research, but I might propose validate-pyproject in global requirements so we have some validation on the file. It caught a few of my mistakes.20:38
fungisounds great to me. i also usually run twine check as part of my dist file validation for my personal projects20:39
fungisince twine is what we're using to upload them to pypi eventually, it's good to know early if it's going to balk20:40
johnsomYep, it seems good. The files in the tarball and twine check passed20:55
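The comparison fungi describes at 20:34 (build the old and new trees, then compare the file lists of the resulting sdist/wheel files) could look something like this minimal sketch. The helper names `dist_members` and `compare_dists` are hypothetical, and `strip_top` is an added convenience for sdists whose top-level `name-version/` directory differs between builds.

```python
import tarfile
import zipfile
from typing import List, Set, Tuple


def dist_members(path: str) -> Set[str]:
    """Return the file names inside a wheel (.whl) or an sdist tarball."""
    if path.endswith((".whl", ".zip")):
        with zipfile.ZipFile(path) as zf:
            return {n for n in zf.namelist() if not n.endswith("/")}
    with tarfile.open(path) as tf:
        return {m.name for m in tf.getmembers() if m.isfile()}


def _strip_top_dir(names: Set[str]) -> Set[str]:
    """Drop the leading 'name-version/' directory that sdists carry."""
    return {n.split("/", 1)[1] if "/" in n else n for n in names}


def compare_dists(
    old: str, new: str, strip_top: bool = False
) -> Tuple[List[str], List[str]]:
    """Return (files only in old, files only in new)."""
    old_names, new_names = dist_members(old), dist_members(new)
    if strip_top:
        old_names = _strip_top_dir(old_names)
        new_names = _strip_top_dir(new_names)
    return sorted(old_names - new_names), sorted(new_names - old_names)


if __name__ == "__main__":
    import sys

    missing, added = compare_dists(sys.argv[1], sys.argv[2], strip_top=True)
    print("only in old:", *missing, sep="\n  ")
    print("only in new:", *added, sep="\n  ")
```

Leave `strip_top` off when comparing wheels, since their top-level entries are the package directories themselves. A `data_files` entry dropped during the pyproject.toml conversion would show up in the "only in old" list.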
