fungi | ubuntu mirror vos release still in progress | 00:20 |
---|---|---|
corvus | no i meant the other thing but your idea sounds smart and potentially useful :) | 00:27 |
corvus | i agree that changing the runtime for speculative execution complicates that a bit, for the images we build. basically, mirroring images we don't build is easy, but that doesn't help with images we do build -- that needs the runtime change. | 00:28 |
clarkb | ya and reducing the total number of requests by mirroring things like mariadb will cut down on total requests and maybe we get into a place where things mostly work | 00:39 |
clarkb | gitea 1.22.5 looks happy in CI maybe we upgrade that tomorrow | 00:39 |
opendevreview | Merged openstack/diskimage-builder master: Force grub2-install to bypass secureboot complaints. https://review.opendev.org/c/openstack/diskimage-builder/+/937442 | 04:13 |
*** rlandy_ is now known as rlandy | 11:27 | |
*** jhorstmann is now known as Guest2869 | 11:51 | |
fungi | ubuntu mirror refresh finished at 06:47:26 utc, i'm going to rerun it to make sure it's quick/no-op | 13:58 |
fungi | and then i'll drop the lock | 13:58 |
fungi | #status log Rebooted wiki.openstack.org to get OpenId logins working again | 14:06 |
fungi | load average has already climbed to nearly 100 within a few minutes of coming back up | 14:07 |
opendevstatus | fungi: finished logging | 14:08 |
fungi | also it looks like the last update to mirror.ubuntu-ports was 5 months ago, i'll look into that and see if something's wrong there. mirror.epel too (4 months) | 14:21 |
fungi | second pass on mirror.ubuntu is still taking a while, but not erroring | 15:07 |
fungi | there it went | 15:09 |
fungi | starting one more immediate pass to see if this will go any quicker now | 15:09 |
fungi | of course now it's erroring on something not afs-related... "Wrong checksum during receive of 'http://security.ubuntu.com/ubuntu/dists/jammy-security/main/binary-amd64/Packages.xz'" | 15:11 |
fungi | okay, after a couple more tries it's back to pulling successfully again, so i must have caught them mid-update upstream | 15:18 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Goodbye EPEL 7 mirroring https://review.opendev.org/c/opendev/system-config/+/937634 | 15:28 |
fungi | that ^ is what i sussed out of https://static.opendev.org/mirror/logs/rsync-mirrors/epel.log | 15:29 |
fungi | and looks like the issue with ubuntu-ports is that the lockfile for our script was left behind when something aborted it | 15:56 |
fungi | i've cleared that with the flock held, and am manually refreshing it in the same screen session i used for mirror.ubuntu | 15:57 |
clarkb | fungi: I left a comment on 937634 | 16:01 |
fungi | thanks! | 16:02 |
fungi | unfortunately the ubuntu-ports mirroring is not going so well... | 16:02 |
fungi | Internal error of the underlying BerkeleyDB database: | 16:02 |
fungi | Within checksums.db subtable pool at get: BDB0075 DB_PAGE_NOTFOUND: Requested page not found | 16:02 |
fungi | i wonder if we need to rebuild that reprepro db | 16:03 |
clarkb | https://review.opendev.org/c/opendev/system-config/+/937574 is the gitea upgrade to 1.22.5. Our CI job was happy with it | 16:04 |
clarkb | I'm around all day today if we want to proceed with it. My rough goal today is to update gitea, get a dib release sorted out, and start poking at docker compose + podman in a CI job | 16:09 |
clarkb | fungi: oh did epel empty the repo in such a way that we sync that emptying first? I guess that's nice of them | 16:10 |
fungi | yeah, the error is literally because the file we want to check isn't there any longer | 16:11 |
fungi | oh, i take that back | 16:11 |
fungi | they emptied upstream which broke our syncing, so we're stuck at the immediate pre-cleanup state | 16:11 |
fungi | https://static.opendev.org/mirror/epel/7/ | 16:11 |
fungi | so yeah, cleanup definitely a good idea for this one | 16:12 |
clarkb | ya we've got cleanup encoded in many of the other scripts so should be straightforward | 16:14 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Goodbye EPEL 7 mirroring https://review.opendev.org/c/opendev/system-config/+/937634 | 16:16 |
fungi | right, i adapted that ^ from another adjacent script | 16:16 |
clarkb | fungi: should we add a log line to indicate we're deleting things? We only set -x if running interactively (so not when run under cron) | 16:18 |
clarkb | but that code looks correct | 16:19 |
fungi | the script i cargo-culted it from didn't log anything about it, but sure we can | 16:19 |
fungi | following https://docs.opendev.org/opendev/system-config/latest/reprepro.html#advanced-recovery-techniques for mirror.ubuntu-ports, the db backup step is taking a while | 16:19 |
clarkb | fungi: the fedora script emits a purging old mirrors line | 16:20 |
clarkb | I'm going to reboot for my semi weekly local updates. Back soon | 16:20 |
*** Guest2639 is now known as diablo_rojo_phone | 16:22 | |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Goodbye EPEL 7 mirroring https://review.opendev.org/c/opendev/system-config/+/937634 | 16:23 |
clarkb | fungi: I just realized we can delete 8 as well | 16:26 |
clarkb | do we want to do that in a followup change or in one go? | 16:26 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Goodbye EPEL 7 mirroring https://review.opendev.org/c/opendev/system-config/+/937634 | 16:26 |
fungi | k, i'm going to pop out to grab a quick lunch and start rebuilding the ubuntu-ports checksum db when i get back | 16:26 |
clarkb | fungi: sorry one more thing on that change (related to removing 8) but enjoy your lunch first | 16:28 |
opendevreview | Merged openstack/diskimage-builder master: Update default Ubuntu to noble (latest LTS) https://review.opendev.org/c/openstack/diskimage-builder/+/936209 | 16:59 |
opendevreview | Clark Boylan proposed opendev/system-config master: WIP Run containers on Noble with docker compose and podman https://review.opendev.org/c/opendev/system-config/+/937641 | 17:39 |
clarkb | corvus: ^ your previous testing was really helpful. I did change a few things to make it more targeted to opendev's use case (for example the socket path), definitely open to feedback on that sort of thing | 17:40 |
clarkb | also I fully expect this first pass to fail | 17:40 |
clarkb | for the next patchset I'll need to trim the list of jobs down to a representative non noble job and the paste job that I'm bumping up to noble. | 17:41 |
corvus | clarkb: cool! does the docker-compose shim need to set the env var? | 17:43 |
corvus | oh i see | 17:44 |
corvus | the path thing you just mentioned :) | 17:44 |
corvus | yeah, i guess i didn't think of doing that because i probably was trying to keep the ability to run docker too... but if we don't need that then this is easier | 17:44 |
corvus | and means we could just remove the docker shim later | 17:44 |
corvus | that wfm. :) | 17:45 |
clarkb | ya I think for the zuul tutorial what you did is correct since you don't know what people are doing on their laptops | 17:48 |
clarkb | but for opendev I suspect we can get away with this if it works | 17:48 |
clarkb | my understanding is that socket activation works by passing file descriptors through fork and exec so we don't even need to update config on the podman side | 17:50 |
clarkb | but we shall see | 17:50 |
fungi | think i've found a typo in the reprepro database rebuild docs | 17:51 |
clarkb | I'm realizing that sorting out the selenium machinery for testing is something that will be impacted by the docker compose + podman stuff. Though I suspect we can do something as simple as use podman if it is installed | 18:11 |
corvus | clarkb: how is it affected? | 18:12 |
clarkb | corvus: it uses a lot of docker commands and not docker-compose | 18:16 |
clarkb | corvus: we run a selenium server container that interacts with the browser | 18:16 |
clarkb | so paste installed podman and claims to have disabled docker and enabled podman but I strongly suspect that we ended up using docker and not podman in my test | 18:17 |
clarkb | I'm probably going to jump straight to forcing a failure and holding a node | 18:17 |
clarkb | because debugging that from afar seems unfun | 18:18 |
opendevreview | Clark Boylan proposed opendev/system-config master: WIP Run containers on Noble with docker compose and podman https://review.opendev.org/c/opendev/system-config/+/937641 | 18:20 |
clarkb | corvus: I guess if we modified that selenium role to use docker compose then it could rely on the new stuff and maintain compatibility with the old stuff. That may actually be the best way to approach this | 18:23 |
corvus | using docker-compose with selenium sounds nice; i think it's easier to deal with rather than long cmdlines, but, shouldn't the docker commands "just work"? | 18:30 |
corvus | ie, if the test is running "docker run selenium" shouldn't that invoke the docker cli which will then tell the podman socket to start a container? | 18:31 |
clarkb | oh that's a good point | 18:31 |
clarkb | so now I'm wondering if we did actually use podman or not :) | 18:32 |
clarkb | holding a node and inspecting it directly seems like an even better idea now | 18:33 |
clarkb | I'm not even sure how you'd distinguish. Maybe via journalctl -u podman vs -u docker? | 18:34 |
clarkb | there are probably signs in the process listing too | 18:34 |
clarkb | ok I think it is using docker based on journalctl output | 18:39 |
clarkb | something is restarting dockerd and it listens on the sock | 18:40 |
clarkb | and it seems to turn off podman when it does so | 18:40 |
clarkb | I wonder if dockerd also uses socket activation so I need to override its socket activation file | 18:40 |
clarkb | yup that is it I think | 18:41 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Goodbye EPEL 7 and 8 mirroring https://review.opendev.org/c/opendev/system-config/+/937634 | 18:41 |
fungi | next up, centos mirroring has also been stuck for 6 months according to grafana | 18:45 |
fungi | oh, that's non-stream centos | 18:46 |
fungi | i guess we could get rid of that entirely now? | 18:46 |
clarkb | fungi: yes I think so | 18:47 |
clarkb | this is fun: docker listens on docker.sock even if you change its systemd socket activation path | 18:52 |
clarkb | or maybe I can't override the listen path this way? | 18:53 |
clarkb | podman[21823]: Error: wrong number of file descriptors for socket activation protocol (2 != 1) <- ok I think that is what is happening my override isn't actually an override but an addition to | 18:55 |
clarkb | ya systemctl list-sockets confirms. So I need to find a way to actually override this config | 18:57 |
fungi | can it be uninstalled? or killed with fire? how about silver crosses and wooden stakes through the pid? | 18:59 |
clarkb | I think the issue is the use of the .d dir loading all the config and merging it together | 19:00 |
clarkb | instead I need to create a new copy of the socket file entirely. I'm working on that now (have to unravel what I did already and start from a clean slate though) | 19:00 |
fungi | also good, if somewhat less dramatic | 19:01 |
clarkb | ya that sorted it out. But then immediately hit Error response from daemon: container create: running container create option: invalid log driver: invalid argument | 19:05 |
corvus | a ritual purging is always a good idea | 19:05 |
clarkb | this is all coming back to me podman doesn't support syslog | 19:05 |
clarkb | because why would you need it... | 19:05 |
clarkb | I think maybe there is a journald option though which will probably be sufficient for us | 19:06 |
clarkb | yes switching to journald from syslog seems to continue to capture the logs via syslog in our /var/log/containers dir | 19:10 |
fungi | yeah, you can still tell journald to stream to syslog | 19:10 |
clarkb | this is because journald and syslog work together and both understand the tag option. Ok I think I have enough now to push a new patchset | 19:10 |
fungi | and apparently we already do | 19:10 |
clarkb | and docker ps -a works too | 19:11 |
clarkb | hacks | 19:11 |
fungi | okay, the present state of https://review.opendev.org/937634 looks ready to go, and should clean up old package trees as desired | 19:13 |
opendevreview | Clark Boylan proposed opendev/system-config master: WIP Run containers on Noble with docker compose and podman https://review.opendev.org/c/opendev/system-config/+/937641 | 19:17 |
clarkb | fungi: approved thanks | 19:19 |
clarkb | so I think maybe we can start switching our containers over to journald across the board | 19:19 |
fungi | the ubuntu-ports reprepro db recreation is probably going to take at least a few more hours | 19:19 |
clarkb | that might be a step 0 here | 19:19 |
clarkb | but I'll need to investigate the behaviors a bit more | 19:19 |
clarkb | side note docker will happily listen on multiple sockets. Podman will not | 19:20 |
clarkb | a behavior difference! but one without too much impact | 19:20 |
fungi | they're risking that bug-compatible reputation | 19:24 |
clarkb | also I wonder if it is worth filing a bug against systemd since it didn't explode or complain when I had two different services activated by the same socket path | 19:28 |
clarkb | seems like that should be a warning at the very least | 19:28 |
clarkb | I would have to test this a lot more to quantify it properly | 19:28 |
fungi | yeah, seems like it would try to start both, and they would race to listen on the socket | 19:35 |
fungi | though i guess systemd is actually what listens on the socket, and then opens pipes to the services that get activated, so maybe it just multiplexes to as many as you have? | 19:36 |
clarkb | fungi: no it seemed like systemd killed podman then started docker | 19:36 |
clarkb | fungi: the way I had it set up podman was running until docker commands talking to the socket were run, then podman got killed and docker started | 19:36 |
clarkb | I suspect it may be whichever config is first in the systemd internals | 19:37 |
opendevreview | Merged opendev/system-config master: Goodbye EPEL 7 and 8 mirroring https://review.opendev.org/c/opendev/system-config/+/937634 | 19:48 |
opendevreview | Clark Boylan proposed opendev/system-config master: WIP Run containers on Noble with docker compose and podman https://review.opendev.org/c/opendev/system-config/+/937641 | 19:49 |
Clark[m] | I'm eating lunch but ^ is working (the failure is artificial and there to capture the node for inspection) | 20:16 |
Clark[m] | The next steps will be to look into things like database backups that use docker commands. I guess those should keep working though as long as the container names are correct | 20:17 |
Clark[m] | And then we can update those after the transition to noble + podman is done for each service? | 20:17 |
Clark[m] | Oh we also need to handle the "is my container newer than before" check that gitea and jitsi use. | 20:17 |
Clark[m] | corvus: also I didn't use any config to change cgroup manager. I suspect that isn't necessary as we run as root | 20:18 |
clarkb | docker compose is less verbose about whether or not an image has updated as a result of a pull when compared to docker-compose (did just confirm that) | 21:19 |
corvus | i think refactoring that to use jq or something would be more robust | 21:21 |
clarkb | ya the inspect output is json so could do that. Another option may be to simply do docker image list | sha256sum; docker compose pull ; docker image list | sha256sum and compare the two sums | 21:22 |
clarkb | the annoying thing about using inspect is we'll need to inspect every one of the images we care about pretty explicitly. Doable but potentially a fair bit of ansible looping and slow compared to just comparing docker image list output. | 21:23 |
clarkb | I'll start looking at the inspect angle next | 21:23 |
clarkb | oh docker image list takes a format option so maybe that is easier and then we just compare the image id columns | 21:24 |
opendevreview | Clark Boylan proposed opendev/system-config master: WIP Run containers on Noble with docker compose and podman https://review.opendev.org/c/opendev/system-config/+/937641 | 21:34 |
clarkb | corvus: ^ I think that is a simple and easy to understand method | 21:35 |
clarkb | though it just occurred to me that the output may not be order stable between docker image listings | 21:35 |
clarkb | (however I think it is based on the timestamp so it should be) | 21:35 |
clarkb | ya the order seems to be consistently based on the image timestamp | 21:37 |
corvus | clarkb: if it's weird, could always run it through `|sort` | 21:41 |
corvus | regardless, i like it | 21:41 |
corvus | (that particular use of json doesn't require that it be valid :) | 21:41 |
clarkb | ok that hit jinja errors due to the {{ }} | 21:44 |
clarkb | you can actually do docker image list --quiet and it will only print the IDs. I'll do that | 21:44 |
clarkb | maybe a little less descriptive but a comment can fix it | 21:44 |
opendevreview | Clark Boylan proposed opendev/system-config master: WIP Run containers on Noble with docker compose and podman https://review.opendev.org/c/opendev/system-config/+/937641 | 21:46 |
clarkb | fungi: should we proceed with https://review.opendev.org/c/opendev/system-config/+/937574 ? I'm guessing you are the only one who is going to review that today | 21:54 |
clarkb | (that's the gitea upgrade) | 21:54 |
clarkb | one thing I've caught myself doing is continuing to use docker commands for everything even though the backend is podman | 21:59 |
clarkb | one upside to this is it's checking rough compatibility between old docker and new podman expectations, which is helpful during the transition period between running on older nodes and then running on noble. After we switch we can make things more podman specific | 22:00 |
clarkb | sweet, comparing container images that way seems to work. I'll push up a change to "backport" this to jitsi meet and gitea config management and anywhere else I can find us doing similar | 22:07 |
opendevreview | Clark Boylan proposed opendev/system-config master: Refactor check for new container images https://review.opendev.org/c/opendev/system-config/+/937655 | 22:13 |
opendevreview | Clark Boylan proposed opendev/system-config master: Log with journald and not syslog in lodgeit docker compose https://review.opendev.org/c/opendev/system-config/+/937656 | 22:16 |
clarkb | I'm avoiding stacking these and will use depends on in the WIP change to collect things together because I'm not sure what order makes the most sense to land this stuff in and for the most part they are not interrelated | 22:17 |
fungi | yeah, let's go ahead with 937574, i'll approve it now | 22:19 |
opendevreview | Clark Boylan proposed opendev/system-config master: Update gitea containers to use journald logging https://review.opendev.org/c/opendev/system-config/+/937657 | 22:24 |
opendevreview | Clark Boylan proposed opendev/system-config master: WIP Run containers on Noble with docker compose and podman https://review.opendev.org/c/opendev/system-config/+/937641 | 22:25 |
clarkb | I think this is going to be the basic structure of docker compose + podman work. Basically use this WIP change to test specific services in a forward looking manner. Make changes to update syslog -> journald and whatever else we find doesn't work in a docker and podman compatible manner. Then at some point we can land the WIP change without modifying any specific services, then we | 22:26 |
clarkb | redeploy specific services onto noble and they will transition to the new backend as part of the new platform | 22:26 |
clarkb | mostly I want to make sure we've got good coverage of things before we commit to this | 22:27 |
clarkb | since I would like to avoid a mix of runtimes | 22:27 |
clarkb | fungi: thank you for approving that | 22:27 |
fungi | i should be around for a while still to test when it deploys | 22:28 |
fungi | ubuntu-ports index rebuild is still in progress | 22:29 |
clarkb | I'm guessing if we had nvme's instead of openafs on whatever our disks are it would be much faster :) | 22:29 |
fungi | eh, probably | 22:30 |
fungi | yeah, i mean, in theory i could run it locally on a fileserver with the right tools, but there's no urgency | 22:31 |
clarkb | and now I'm hitting quota errors. Probably a good indicator to work on something else | 22:38 |
clarkb | if the gitea 1.22.5 change hits rate limits I think we can probably just try again tomorrow first thing | 22:46 |
fungi | yeah, i can make a point of rechecking it first thing when i wake up | 22:47 |
fungi | system-config-upload-image-gitea succeeded at least | 22:51 |
opendevreview | Jay Faulkner proposed openstack/diskimage-builder master: Fix: Run final Gentoo tasks in post-install.d https://review.opendev.org/c/openstack/diskimage-builder/+/937658 | 22:54 |
fungi | system-config-run-gitea is only just to the point of installing docker now though | 22:56 |
fungi | looks like it installed haproxy container images successfully at leasy | 23:02 |
fungi | least | 23:02 |
fungi | gitea images too | 23:06 |
fungi | looking likely to pass | 23:07 |
clarkb | great I am around (and just got a new cup of tea) | 23:23 |
fungi | well, the console log stream has been silent for over 20 minutes, so not really sure if that's good or bad | 23:26 |
fungi | but zuul seems to think it's only 13 minutes from finishing at least | 23:27 |
clarkb | it's the nested ansible which doesn't emit to the console | 23:31 |
clarkb | I am pretty sure that is normal | 23:31 |
clarkb | when it completes we'll get a huge amount of output that it was buffering | 23:31 |
fungi | yeah, i figured | 23:33 |
fungi | anyway, that it got past fetching container images is encouraging | 23:34 |
clarkb | I've got another docker compose with podman related update for when gitea is done (I need to edit the logging driver for haproxy too in the gitea job) | 23:39 |
opendevreview | Jay Faulkner proposed openstack/diskimage-builder master: Stop using deprecated pkg_resources API https://review.opendev.org/c/openstack/diskimage-builder/+/907691 | 23:44 |
fungi | system-config-run-gitea is wrapping up finally | 23:54 |
fungi | yeah, it's running the testinfra tests now | 23:57 |
clarkb | slow node I guess. Here's hoping we don't hit a job timeout after all that | 23:58 |
fungi | build started 22:42:33 so it's already been over an hour | 23:58 |
fungi | timeout's 4800 seconds | 23:59 |
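
The "did the pull bring in a new image" check discussed around 21:22 to 22:07 (list image IDs, pull, list again, compare) could look roughly like the sketch below. This is only an illustration in Python; the change proposed in the log lives in Ansible and its contents are not shown here. The function names `image_ids`, `images_changed`, `list_images` and `pull_and_check` are hypothetical. Sorting the IDs is an assumption added along the lines of corvus's `|sort` suggestion, so the comparison does not depend on listing order.

```python
import subprocess


def image_ids(listing: str) -> list[str]:
    """Return the sorted, non-empty image IDs from `docker image list --quiet` output."""
    return sorted(line.strip() for line in listing.splitlines() if line.strip())


def images_changed(before: str, after: str) -> bool:
    """True if the set of local image IDs differs between the two listings."""
    return image_ids(before) != image_ids(after)


def list_images() -> str:
    """Capture the current image IDs via the docker CLI (assumed to be on PATH)."""
    result = subprocess.run(
        ["docker", "image", "list", "--quiet"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


def pull_and_check(compose_dir: str) -> bool:
    """Run `docker compose pull` in compose_dir and report whether any image ID changed."""
    before = list_images()
    subprocess.run(["docker", "compose", "pull"], cwd=compose_dir, check=True)
    return images_changed(before, list_images())
```

A caller would restart the service only when `pull_and_check("/path/to/compose/dir")` returns `True`; the path here is a placeholder.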