Thursday, 2024-12-12

fungiubuntu mirror vos release still in progress00:20
corvusno i meant the other thing but your idea sounds smart and potentially useful :)00:27
corvusi agree that changing the runtime for speculative execution complicates that a bit, for the images we build.  basically, mirroring images we don't build is easy, but that doesn't help with images we do build -- that needs the runtime change.00:28
clarkbya and reducing the total number of requests by mirroring things like mariadb will cut down on total requests and maybe we get into a place where things mostly work00:39
clarkbgitea 1.22.5 looks happy in CI maybe we upgrade that tomorrow00:39
opendevreviewMerged openstack/diskimage-builder master: Force grub2-install to bypass secureboot complaints.  https://review.opendev.org/c/openstack/diskimage-builder/+/93744204:13
*** rlandy_ is now known as rlandy11:27
*** jhorstmann is now known as Guest286911:51
fungiubuntu mirror refresh finished at 06:47:26 utc, i'm going to rerun it to make sure it's quick/no-op13:58
fungiand then i'll drop the lock13:58
fungi#status log Rebooted wiki.openstack.org to get OpenId logins working again14:06
fungiload average has already climbed to nearly 100 within a few minutes of coming back up14:07
opendevstatusfungi: finished logging14:08
fungialso it looks like the last update to mirror.ubuntu-ports was 5 months ago, i'll look into that and see if something's wrong there. mirror.epel too (4 months)14:21
fungisecond pass on mirror.ubuntu is still taking a while, but not erroring15:07
fungithere it went15:09
fungistarting one more immediate pass to see if this will go any quicker now15:09
fungiof course now it's erroring on something not afs-related... "Wrong checksum during receive of 'http://security.ubuntu.com/ubuntu/dists/jammy-security/main/binary-amd64/Packages.xz'"15:11
fungiokay, after a couple more tries it's back to pulling successfully again, so i must have caught them mid-update upstream15:18
opendevreviewJeremy Stanley proposed opendev/system-config master: Goodbye EPEL 7 mirroring  https://review.opendev.org/c/opendev/system-config/+/93763415:28
fungithat ^ is what i sussed out of https://static.opendev.org/mirror/logs/rsync-mirrors/epel.log15:29
fungiand looks like the issue with ubuntu-ports is that the lockfile for our script was left behind when something aborted it15:56
fungii've cleared that with the flock held, and am manually refreshing it in the same screen session i used for mirror.ubuntu15:57
clarkbfungi: I left a comment on 93763416:01
fungithanks!16:02
fungiunfortunately the ubuntu-ports mirroring is not going so well...16:02
fungiInternal error of the underlying BerkeleyDB database:16:02
fungiWithin checksums.db subtable pool at get: BDB0075 DB_PAGE_NOTFOUND: Requested page not found16:02
fungii wonder if we need to rebuild that reprepro db16:03
clarkbhttps://review.opendev.org/c/opendev/system-config/+/937574 is the gitea upgrade to 1.22.5. Our CI job was happy with it16:04
clarkbI'm around all day today if we want to proceed with it. My rough goal today is to update gitea, get a dib release sorted out, and start poking at docker compose + podman in a CI job16:09
clarkbfungi: oh did epel empty the repo in such a way that we sync that emptying first? I guess thats nice of them16:10
fungiyeah, the error is literally because the file we want to check isn't there any longer16:11
fungioh, i take that back16:11
fungithey emptied upstream which broke our syncing, so we're stuck at the immediate pre-cleanup state16:11
fungihttps://static.opendev.org/mirror/epel/7/16:11
fungiso yeah, cleanup definitely a good idea for this one16:12
clarkbya we've got cleanup encoded in many of the other scripts so should be straightforward16:14
opendevreviewJeremy Stanley proposed opendev/system-config master: Goodbye EPEL 7 mirroring  https://review.opendev.org/c/opendev/system-config/+/93763416:16
fungiright, i adapted that ^ from another adjacent script16:16
clarkbfungi: should we add a log line to indicate we're deleting things? We only set -x if running interactively (so not when run under cron)16:18
clarkbbut that code looks correct16:19
fungithe script i cargo-culted it from didn't log anything about it, but sure we can16:19
fungifollowing https://docs.opendev.org/opendev/system-config/latest/reprepro.html#advanced-recovery-techniques for mirror.ubuntu-ports, the db backup step is taking a while16:19
clarkbfungi: the fedora script emits a purging old mirrors line16:20
clarkbI'm going to reboot for my semi weekly local updates. Back soon16:20
*** Guest2639 is now known as diablo_rojo_phone16:22
opendevreviewJeremy Stanley proposed opendev/system-config master: Goodbye EPEL 7 mirroring  https://review.opendev.org/c/opendev/system-config/+/93763416:23
clarkbfungi: I just realized we can delete 8 as well16:26
clarkbdo we want to do that in a followup change or in one go?16:26
opendevreviewJeremy Stanley proposed opendev/system-config master: Goodbye EPEL 7 mirroring  https://review.opendev.org/c/opendev/system-config/+/93763416:26
fungik, i'm going to pop out to grab a quick lunch and start rebuilding the ubuntu-ports checksum db when i get back16:26
clarkbfungi: sorry one more thing on that change (related to removing 8) but enjoy your lunch first16:28
opendevreviewMerged openstack/diskimage-builder master: Update default Ubuntu to noble (latest LTS)  https://review.opendev.org/c/openstack/diskimage-builder/+/93620916:59
opendevreviewClark Boylan proposed opendev/system-config master: WIP Run containers on Noble with docker compose and podman  https://review.opendev.org/c/opendev/system-config/+/93764117:39
clarkbcorvus: ^ your previous testing was really helpful. I did change a few things to make it more targeted to opendev's use case (for example the socket path) definitely open to feedback on that sort of thing17:40
clarkbalso I fully expect this first pass to fail17:40
clarkbfor the next patchset I'll need to trim the list of jobs down to a representative non noble job and the paste job that I'm bumping up to noble.17:41
corvusclarkb: cool!  does the docker-compose shim need to set the env var?17:43
corvusoh i see17:44
corvusthe path thing you just mentioned :)17:44
corvusyeah, i guess i didn't think of doing that because i probably was trying to keep the ability to run docker too... but if we don't need that then this is easier17:44
corvusand means we could just remove the docker shim later17:44
corvusthat wfm.  :)17:45
clarkbya I think for the zuul tutorial what you did is correct since you don't know what people are doing on their laptops17:48
clarkbbut for opendev I suspect we can get away with this if it works17:48
clarkbmy understanding is that socket activation works by passing file descriptors through fork and exec so we don't even need to update config on the podman side17:50
clarkbbut we shall see17:50
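Editorial sketch of the descriptor hand-off described above, assuming the usual systemd convention: the service manager sets LISTEN_PID and LISTEN_FDS in the environment, and the inherited descriptors start at number 3. The function name and the dictionary-based interface are illustrative only and are not taken from podman or docker.

```python
SD_LISTEN_FDS_START = 3  # first inherited descriptor under the sd_listen_fds convention


def inherited_fds(environ, pid):
    """Return the descriptor numbers handed over by the service manager, or []."""
    # LISTEN_PID stops a child process from mistaking the variables for its own.
    if environ.get("LISTEN_PID") != str(pid):
        return []
    count = int(environ.get("LISTEN_FDS", "0"))
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + count))
```

A real service would call something like inherited_fds(os.environ, os.getpid()) at startup. A daemon that expects exactly one listening socket would reject a result such as [3, 4], which is the kind of count mismatch that shows up later in this log.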
fungithink i've found a typo in the reprepro database rebuild docs17:51
clarkbI'm realizing that sorting out the selenium machinery for testing is something that will be impacted by the docker compose + podman stuff. Though I suspect we can do something as simple as use podman if it is installed18:11
corvusclarkb: how is it affected?18:12
clarkbcorvus: it uses a lot of docker commands and not docker-compose18:16
clarkbcorvus: we run a selenium server container that interacts with the browser18:16
clarkbso paste installed podman and claims to have disabled docker and enabled podman but I strongly suspect that we ended up using docker and not podman in my test18:17
clarkbI'm probably going to jump straight to forcing a failure and holding a node18:17
clarkbbecause debugging that from afar seems unfun18:18
opendevreviewClark Boylan proposed opendev/system-config master: WIP Run containers on Noble with docker compose and podman  https://review.opendev.org/c/opendev/system-config/+/93764118:20
clarkbcorvus: I guess if we modified that selenium role to use docker compose then it could rely on the new stuff and maintain compatibility with the old stuff. That may actually be the best way to approach this18:23
corvususing docker-compose with selenium sounds nice; i think it's easier to deal with rather than long cmdlines, but, shouldn't the docker commands "just work"?18:30
corvusie, if the test is running "docker run selenium" shouldn't that invoke the docker cli which will then tell the podman socket to start a container?18:31
clarkboh thats a good point18:31
clarkbso now I'm wondering if we did actually use podman or not :)18:32
clarkbholding a node and inspecting it directly seems like an even better idea now18:33
clarkbI'm not even sure how you'd distinguish. Maybe via journalctl -u podman vs -u docker?18:34
clarkbthere are probably signs in the process listing too18:34
clarkbok I think it is using docker based on journalctl output18:39
clarkbsomething is restarting dockerd and it listens on the sock18:40
clarkband it seems to turn off podman when it does so18:40
clarkbI wonder if dockerd also uses socket activation so I need to override its socket activation file18:40
clarkbyup that is it I think18:41
opendevreviewJeremy Stanley proposed opendev/system-config master: Goodbye EPEL 7 and 8 mirroring  https://review.opendev.org/c/opendev/system-config/+/93763418:41
funginext up, centos mirroring has also been stuck for 6 months according to grafana18:45
fungioh, that's non-stream centos18:46
fungii guess we could get rid of that entirely now?18:46
clarkbfungi: yes I think so18:47
clarkbthis is fun docker listens on docker.sock even if you change its systemd socket activation path18:52
clarkbor maybe I can't override the listen path this way?18:53
clarkbpodman[21823]: Error: wrong number of file descriptors for socket activation protocol (2 != 1) <- ok I think that is what is happening my override isn't actually an override but an addition to18:55
clarkbya systemctl list-sockets confirms. So I need to find a way to actually override this config18:57
fungican it be uninstalled? or killed with fire? how about silver crosses and wooden stakes through the pid?18:59
clarkbI think the issue is the use of the .d dir loading all the config and merging it together19:00
clarkbinstead I need to create a new copy of the socket file entirely. I'm working on that now (have to unravel what I did already and start from a clean slate though)19:00
fungialso good, if somewhat less dramatic19:01
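Editorial sketch of why a drop-in added a second listener instead of replacing the first. It is a simplified model, not systemd's parser: ListenStream= is a list-type setting, so each assignment appends, and per the systemd.socket documentation an empty assignment resets the list. The socket paths below are placeholders, and the log itself went with a full replacement socket file rather than the reset form.

```python
def merged_listen_streams(*unit_fragments):
    """Model how ListenStream= accumulates across a unit file and its drop-ins."""
    streams = []
    for fragment in unit_fragments:
        for line in fragment.splitlines():
            key, sep, value = line.strip().partition("=")
            if not sep or key.strip() != "ListenStream":
                continue
            value = value.strip()
            if value:
                streams.append(value)  # list-type setting: each assignment adds
            else:
                streams.clear()  # empty assignment resets everything seen so far
    return streams


# Placeholder paths, chosen only for illustration.
BASE_UNIT = "[Socket]\nListenStream=/run/podman/podman.sock\n"
ADDING_DROPIN = "[Socket]\nListenStream=/run/docker.sock\n"
RESETTING_DROPIN = "[Socket]\nListenStream=\nListenStream=/run/docker.sock\n"
```

With the adding drop-in the model yields two listen paths, which matches the "2 != 1" descriptor count reported above; with the resetting drop-in it yields only the new path.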
clarkbya that sorted it out. But then immediately hit Error response from daemon: container create: running container create option: invalid log driver: invalid argument19:05
corvusa ritual purging is always a good idea19:05
clarkbthis is all coming back to me podman doesn't support syslog19:05
clarkbbecause why would you need it...19:05
clarkbI think maybe there is a journald option though which will probably be sufficient for us19:06
clarkbyes switching to journald from syslog seems to continue to capture the logs via syslog in our /var/log/containers dir19:10
fungiyeah, you can still tell journald to stream to syslog19:10
clarkbthis is because journald and syslog work together and both understand the tag option. Ok I think I have enough now to push a new patchset19:10
fungiand apparently we already do19:10
clarkband docker ps -a works too19:11
clarkbhacks19:11
fungiokay, the present state of https://review.opendev.org/937634 looks ready to go, and should clean up old package trees as desired19:13
opendevreviewClark Boylan proposed opendev/system-config master: WIP Run containers on Noble with docker compose and podman  https://review.opendev.org/c/opendev/system-config/+/93764119:17
clarkbfungi: approved thanks19:19
clarkbso I think maybe we can start switching our containers over to journald across the board19:19
fungithe ubuntu-ports reprepro db recreation is probably going to take at least a few more hours19:19
clarkbthat might be a step 0 here19:19
clarkbbut I'll need to investigate the behaviors a bit more19:19
clarkbside note docker will happily listen on multiple sockets. Podman will not19:20
clarkba behavior difference! but one without too much impact19:20
fungithey're risking that bug-compatible reputation19:24
clarkbalso I wonder if it is worth filing a bug against systemd since it didn't explode or complain when I had two different services activated by the same socket path19:28
clarkbseems like that should be a warning at the very least19:28
clarkbI would have to test this a lot more to quantify it properly19:28
fungiyeah, seems like it would try to start both, and they would race to listen on the socket19:35
fungithough i guess systemd is actually what listens on the socket, and then opens pipes to the services that get activated, so maybe it just multiplexes to as many as you have?19:36
clarkbfungi: no it seemed like systemd killed podman then started docker19:36
clarkbfungi: the way I had it set up podman was running until docker commands talking to the socket were running then podman got killed and docker started19:36
clarkbI suspect it may be whichever config is first in the systemd internals19:37
opendevreviewMerged opendev/system-config master: Goodbye EPEL 7 and 8 mirroring  https://review.opendev.org/c/opendev/system-config/+/93763419:48
opendevreviewClark Boylan proposed opendev/system-config master: WIP Run containers on Noble with docker compose and podman  https://review.opendev.org/c/opendev/system-config/+/93764119:49
Clark[m]I'm eating lunch but ^ is working (the failure is artificial and there to capture the node for inspection)20:16
Clark[m]The next steps will be to look into things like database backups that use docker commands. I guess those should keep working though as long as the container names are correct20:17
Clark[m]And then we can update those after the transition to noble + podman is done for each service?20:17
Clark[m]Oh we also need to handle the "is my container newer than before" check that gitea and jitsi use.20:17
Clark[m]corvus: also I didn't use any config to change cgroup manager. I suspect that isn't necessary as we run as root20:18
clarkbdocker compose is less verbose about whether or not an image has updated as a result of a pull when compared to docker-compose (did just confirm that)21:19
corvusi think refactoring that to use jq or something would be more robust21:21
clarkbya the inspect output is json so could do that. Another option may be to simply do docker image list | sha256sum; docker compose pull ; docker image list | sha256sum and compare the two sums21:22
clarkbthe annoying thing about using inspect is we'll need to inspect every one of the images we care about pretty explicitly. Doable but potentially a fair bit of ansible looping and slow compared to just comparing docker image list output.21:23
clarkbI'll start looking at the inspect angle next21:23
clarkboh docker image list takes a format option so maybe that is easier and then we just compare the image id columns21:24
opendevreviewClark Boylan proposed opendev/system-config master: WIP Run containers on Noble with docker compose and podman  https://review.opendev.org/c/opendev/system-config/+/93764121:34
clarkbcorvus: ^ I think that is a simple and easy to understand method21:35
clarkbthough it just occurred to me that the output may not be order stable between docker image listings21:35
clarkb(however I think it is based on the timestamp so it should be)21:35
clarkbya the order seems to be consistently based on the image timestamp21:37
corvusclarkb: if it's weird, could always run it through `|sort`21:41
corvusregardless, i like it21:41
corvus(that particular use of json doesn't require that it be valid :)21:41
clarkbok that hit jinja errors due to the {{ }}21:44
clarkbyou can actually do docker image list --quiet and it will only print the IDs. I'll do that21:44
clarkbmaybe a little less descriptive but a comment can fix it21:44
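Editorial sketch of the before/after comparison discussed above, assuming only that `docker image list --quiet` prints one image ID per line. The function names and the injectable `runner` and `pull` parameters are illustrative; the actual change in system-config is an Ansible task and is not reproduced here.

```python
import subprocess


def image_ids(runner=subprocess.run):
    """Return the set of local image IDs from `docker image list --quiet`."""
    result = runner(
        ["docker", "image", "list", "--quiet"],
        check=True,
        capture_output=True,
        text=True,
    )
    return set(result.stdout.split())


def pull_changed_images(pull, runner=subprocess.run):
    """Run pull() and report whether the set of local image IDs changed."""
    before = image_ids(runner)
    pull()  # for example, a callable that runs `docker compose pull`
    after = image_ids(runner)
    return before != after
```

Comparing sets rather than raw output sidesteps the ordering question raised above, so no sort step is needed.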
opendevreviewClark Boylan proposed opendev/system-config master: WIP Run containers on Noble with docker compose and podman  https://review.opendev.org/c/opendev/system-config/+/93764121:46
clarkbfungi: should we proceed with https://review.opendev.org/c/opendev/system-config/+/937574 ? I'm guessing you are the only one who is going to review that today21:54
clarkb(thats the gitea upgrade)21:54
clarkbone thing I've caught myself doing is continuing to use docker commands for everything even though the backend is podman21:59
clarkbone upside to this is its checking rough compatibility between old docker and new podman expectations which is helpful during the transition period between running on older node and then running on noble. After we switch we can make things more podman specific22:00
clarkbsweet comparing container images that way seems to work. I'll push up a change to "backport" this to jitsi meet and gitea config management and anywhere else I can find us doing similar22:07
opendevreviewClark Boylan proposed opendev/system-config master: Refactor check for new container images  https://review.opendev.org/c/opendev/system-config/+/93765522:13
opendevreviewClark Boylan proposed opendev/system-config master: Log with journald and not syslog in lodgeit docker compose  https://review.opendev.org/c/opendev/system-config/+/93765622:16
clarkbI'm avoiding stacking these and will use depends on in the WIP change to collect things together because I'm not sure what order makes the most sense to land this stuff in and for the most part they are not interrelated22:17
fungiyeah, let's go ahead with 937574, i'll approve it now22:19
opendevreviewClark Boylan proposed opendev/system-config master: Update gitea containers to use journald logging  https://review.opendev.org/c/opendev/system-config/+/93765722:24
opendevreviewClark Boylan proposed opendev/system-config master: WIP Run containers on Noble with docker compose and podman  https://review.opendev.org/c/opendev/system-config/+/93764122:25
clarkbI think this is going to be the basic structure of docker compose + podman work. Basically use this WIP change to test specific services in a forward looking manner. Make changes to update syslog -> journald and whatever else we find doesn't work in a docker and podman compatible manner. Then at some point we can land the WIP change without modifying any specific services, then we22:26
clarkbredeploy specific services onto noble and they will transition to the new backend as part of the new platform22:26
clarkbmostly I want to make sure we've got good coverage of things before we commit to this22:27
clarkbsince I would like to avoid a mix of runtimes22:27
clarkbfungi: thank you for approving that22:27
fungii should be around for a while still to test when it deploys22:28
fungiubuntu-ports index rebuild is still in progress22:29
clarkbI'm guessing if we had nvme's instead of openafs on whatever our disks are it would be much faster :)22:29
fungieh, probably22:30
fungiyeah, i mean, in theory i could run it locally on a fileserver with the right tools, but there's no urgency22:31
clarkband now I'm hitting quota errors. Probably a good indicator to work on something else22:38
clarkbif the gitea 1.22.5 change hits rate limits I think we can probably just try again tomorrow first thing22:46
fungiyeah, i can make a point of rechecking it first thing when i wake up22:47
fungisystem-config-upload-image-gitea succeeded at least22:51
opendevreviewJay Faulkner proposed openstack/diskimage-builder master: Fix: Run final Gentoo tasks in post-install.d  https://review.opendev.org/c/openstack/diskimage-builder/+/93765822:54
fungisystem-config-run-gitea is only just to the point of installing docker now though22:56
fungilooks like it installed haproxy container images successfully at leasy23:02
fungileast23:02
fungigitea images too23:06
fungilooking likely to pass23:07
clarkbgreat I am around (and just got a new cup of tea)23:23
fungiwell, the console log stream has been silent for over 20 minutes, so not really sure if that's good or bad23:26
fungibut zuul seems to think it's only 13 minutes from finishing at least23:27
clarkbits the nested ansible which doesn't emit to the console23:31
clarkbI am pretty sure that is normal23:31
clarkbwhen it completes we'll get a huge amount of output that it was buffering23:31
fungiyeah, i figured23:33
fungianyway, that it got past fetching container images is encouraging23:34
clarkbI've got another docker compose with podman related update for when gitea is done (I need to edit the logging driver for haproxy too in the gitea job)23:39
opendevreviewJay Faulkner proposed openstack/diskimage-builder master: Stop using deprecated pkg_resources API  https://review.opendev.org/c/openstack/diskimage-builder/+/90769123:44
fungisystem-config-run-gitea is wrapping up finally23:54
fungiyeah, it's running the testinfra tests now23:57
clarkbslow node I guess. Here's hoping we don't hit a job timeout after all that23:58
fungibuild started 22:42:33 so it's already been over an hour23:58
fungitimeout's 4800 seconds23:59
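Editorial sketch of the timeout arithmetic, under two assumptions that the log does not confirm: the 4800 second timeout is counted from the quoted build start time, and the 23:59 message timestamp stands in for "now" with seconds taken as zero.

```python
from datetime import datetime, timedelta


def timeout_deadline(start, timeout_seconds):
    """Return the moment a job would time out, counting from its start time."""
    return start + timedelta(seconds=timeout_seconds)


# Build start and timeout as quoted in the log; the date is the log's date.
start = datetime(2024, 12, 12, 22, 42, 33)
deadline = timeout_deadline(start, 4800)

# Assumed "now": the 23:59 timestamp on the last message above.
remaining = deadline - datetime(2024, 12, 12, 23, 59, 0)
print(deadline.isoformat(), remaining)
```

Under those assumptions 4800 seconds is 80 minutes, so the deadline falls a few minutes after midnight, leaving very little headroom at the point the log ends.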
