Friday, 2024-12-13

clarkbI think testinfra should be fairly quick the early part of the job is slow because it does two passes of gitea setup for idempotency checking00:00
clarkbI wonder if this might indicate gitea 1.22.5 is slow though00:00
clarkbor maybe we just got a slow node. A slow node seems more likely00:01
fungiyeah, we barely made it to post-run in the nick of time00:01
opendevreviewMerged opendev/system-config master: Update Gitea to 1.22.5  https://review.opendev.org/c/opendev/system-config/+/93757400:02
fungiyay!00:02
fungiunfortunately the hourly jobs kicked off a few minutes ago00:03
clarkbya it will be another 15 minutes or so until we start upgrading00:04
clarkbthe promote job for the image succeeded00:04
clarkbthe sshd log and replication log files on review02 have one extra day compared to httpd and error logs00:10
clarkbI'm suddenly concerned that Gerrit's log pruning might only work for half of the logs :/ tomorrow should give us a better indication though00:10
clarkbcould also just be a rotation timing thing00:10
fungilikely00:14
fungideploying now00:20
clarkbit does the load balancer first which means there is a small lag until we update the backends00:21
clarkbgitea09 just restarted its services00:22
fungiPowered by Gitea Version: v1.22.500:23
fungilgtm00:23
clarkba test clone is happy as is the web ui on a quick load00:23
fungiso far everything's checking out00:24
clarkbyup when the whole cluster is done I'm going to push my update to the podman stuff which we can use to check replication00:24
fungiperfect00:24
clarkbI've been saving that update just for this purpose00:24
clarkbbah my ssh agent just unloaded my keys00:25
fungiso close00:25
clarkbI've reloaded them00:26
clarkbI have them on timers to help me remember it's time to stop working. Though these days early dinner because of kids tends to do a good job00:26
clarkb09-11 are done now00:27
clarkbhalfway there00:27
clarkbthe job succeeded and checks against the web ui for all 6 lgtm. I'm pushing my change update now00:31
opendevreviewClark Boylan proposed opendev/system-config master: Update gitea containers to use journald logging  https://review.opendev.org/c/opendev/system-config/+/93765700:31
fungiconfirmed00:31
clarkbgit fetch origin refs/changes/57/937657/2 worked for me where origin is https://opendev.org/opendev/system-config (fetch). Git show against FETCH_HEAD also has the new content00:32
clarkbso I think replication is working too00:33
fungiyeah, findable with some effort at https://opendev.org/opendev/system-config/commit/d9e884400:34
clarkbanything else you think we should check?00:35
funginope, cloning and stuff worked as did browsing around the various gitea webui features00:35
fungii think we're all set00:36
fungireprepro db rebuild for ubuntu-ports is still underway, i'll probably just check back in on it when i wake up tomorrow00:37
clarkbsounds good00:37
opendevreviewRafal Lewandowski proposed openstack/diskimage-builder master: Prevent from overwriting grub defaults if no variables are set  https://review.opendev.org/c/openstack/diskimage-builder/+/93768413:52
opendevreviewRafal Lewandowski proposed openstack/diskimage-builder master: Prevent from overwriting grub defaults if no variables are set  https://review.opendev.org/c/openstack/diskimage-builder/+/93768413:53
fungiubuntu-ports reprepro db build still in progress14:53
opendevreviewClark Boylan proposed opendev/system-config master: Refactor check for new container images  https://review.opendev.org/c/opendev/system-config/+/93765515:55
opendevreviewClark Boylan proposed opendev/system-config master: Use docker-compose for container execs in gitea  https://review.opendev.org/c/opendev/system-config/+/93771716:08
opendevreviewClark Boylan proposed opendev/system-config master: WIP Run containers on Noble with docker compose and podman  https://review.opendev.org/c/opendev/system-config/+/93764116:08
clarkbI'm trying to stick all of the podman and docker compose related changes under topic:podman-prep. So far I think we've found viable alternatives to behavior differences that accommodate things in backward/forward compatible ways16:09
clarkbmnasiadka: frickler: any reason for me to keep my raxflex ephemeral device test node or should I clean it up?16:10
fungiepel mirror is back to updating again after my change deployed, the epel 7 and 8 cleanup dropped utilization by half from 80gb to 40gb according to grafana16:50
clarkbthank you for sorting that out16:53
fungiso other than finishing the repair on ubuntu-ports and decommissioning the old non-stream centos volume (also opensuse?), all the other mirrors are updating regularly now16:55
fungialso the non-stream centos and opensuse mirrors are at 0 utilization so they're not taking up space in afs, just cluttering our grafana dashboard/reporting16:56
clarkbhttps://9cdcc73dee173a4574a8-d3d57836e2611e65f8d99a9aeb37f4da.ssl.cf1.rackcdn.com/937641/8/check/system-config-run-gitea/1f492b1/bridge99.opendev.org/ara-report/results/279.html is the next thing to sort out. It looks like docker compose as docker-compose has broken those execs (and that likely implies the database backup execs are also broken)16:58
clarkbI wonder if using -T means you don't get the env vars passed through or something16:59
fungithat would be odd16:59
fungibut not all that surprising i suppose16:59
clarkbya it could also be a quoting issue I suppose16:59
clarkbI still have a held paste server with a mariadb container so I'm going to try and figure out what is going on there17:00
clarkbnot exactly 1:1 but hopefully close enough to figure out17:00
clarkbI can definitely reproduce the behavior17:02
clarkbif I exec bash then run `mysql -uroot -ppasswrd` it works17:04
clarkbif I `exec bash -c 'mysql -uroot -ppasswrd'` it fails17:05
clarkbI suspect the issue may be the shim17:10
clarkbya if I run things without the shim I get saner behavior so something with variable replacement and quoting in the shim17:11
clarkbya the quotes are getting chomped17:18
clarkband we're using the bash -c so that we can rely on the env vars for the secrets17:19
clarkbnot sure what the best way to address this is. Maybe quote the entire string passed to the container in such a way that we preserve quoting?17:19
clarkbmaybe alias would work better17:20
clarkbalias docker-compose='docker compose' does seem to fix my problems on the held node. But aliases are per user loaded in the bashrc which makes me wonder if this will work for cron jobs17:24
funginested wrappers might also work, but would be messy17:24
clarkbwell nested wrappers is sort of what we already have17:24
fungiyeah, i mean yet another level of nesting17:24
clarkbdocker-compose shim -> docker compose plugin -> container bash -> mysql17:25
clarkbfungi: I guess I'm not understanding how another wrapper would help17:25
clarkbthe problem seems to be with bash processing of $@ and $* stripping out quoting in ways that are incompatible between the docker-compose shim and running docker compose directly17:26
fungiwherever the quotes are getting chomped, put that in its own script instead of quoting within the existing script, but i'm also not following where exactly it's breaking17:26
fungiah, yes so dedicated wrapper for the desired command instead of relying on parameter expansion could be another way17:27
clarkb`docker compose -f /etc/lodgeit-compose/docker-compose.yaml exec mariadb bash -c 'mysql -u root -ppassword -e"USE lodgeit;"'` <- this works. But if you run it through my naive docker-compose shim it executes `docker compose -f /etc/lodgeit-compose/docker-compose.yaml exec mariadb bash -c mysql -u root -ppassword -e'USE lodgeit;'` and fails17:27
clarkbthe problem is bash -c needs everything after it to be considered input to the command argument. But instead it gets just bash -c 'mysql' and -u 'root' etc as separate args to bash17:28
clarkbmaybe bash has a way to feed in a command that takes everything after a symbol as command input17:28
clarkb-s will read commands from STDIN17:30
fungithat might work, then use a heredoc17:30
clarkbdo heredocs work in crontab entries?17:30
fungioh, this is being embedded in a cronjob?17:31
clarkbfungi: the failure in the job I linked was not in a cronjob but we do a very similar thing for mysql backups which are in cronjobs17:31
fungiaha, you're trying to redo the database backups to use podman instead of docker17:31
fungisorry, i was deep in other things and missing some of the context17:32
clarkband also fix the gitea queries that failed in the job. They are similar but one runs in a ansible shell/command context and the other in a cronjob context17:32
clarkbI've found that alias docker-compose='docker compose' works well but I worry about ensuring the alias is in use in different contexts17:32
clarkbsimilarly a heredoc and -s would also probably work but maybe not for cronjob entries17:32
fungiin theory any command that's in a crontab entry could be put into its own script and then cron just invokes the script17:32
clarkbI guess for cron you just write a script that runs shell with a heredoc17:32
fungilooking at the bash manpage, it seems $* and $@ work differently when written as "$*" and "$@" as far as word-splitting during expansion17:41
clarkbok I have heredoc with -s working with the shim for the non cron case17:41
clarkbwhich implies we can make it work with the cron case by having cron run a shell script17:42
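
The heredoc plus `-s` approach described above can be sketched as follows. This is only an illustration under stated assumptions: `naive_shim` is a hypothetical stand-in for the docker-compose shim (it forwards its arguments unquoted, as the shim in this log apparently did), and it runs a local `bash` rather than `docker compose exec` so the example works anywhere.

```shell
# Hypothetical stand-in for the naive docker-compose shim: it forwards its
# arguments unquoted, so any argument containing spaces would be re-split.
naive_shim() {
    bash $@
}

# With -s, bash reads its commands from stdin. The command text arrives via
# the heredoc, so it never passes through the shim's argument handling.
# Quoting the delimiter ('EOF') also disables variable interpolation.
out=$(naive_shim -s <<'EOF'
printf '%s\n' "USE lodgeit;"
EOF
)
echo "$out"
```

For the cron case, the same heredoc would presumably live in a small script file that the crontab entry invokes.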
clarkbfungi: oh let me test that too17:42
fungiyeah, read the "special parameters" subsection in the manpage, other than it not being as straightforward as one might assume, i'm struggling to get the exact implications from it17:43
clarkbexperimentally "$@" is what I want17:44
fungithat said, putting things in a separate script seems more straightforward and maintainable than relying on inherently confusing bash special-casing magic17:45
clarkbya though making the shim more likely to work is also good for interactive usage17:46
fungiagreed, of course17:46
clarkbI'll update the shim to quote $@ and see if that is sufficient to make these commands happy and if not we're doing the other options anyway and if yes we can still refactor to make it easier to understand17:46
clarkbI worry the quoting in this: "USE gitea; UPDATE user SET avatar = '\''{{ item }}'\'', use_custom_avatar = 1 WHERE name = '\''{{ item }}'\''"' will still explode17:48
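
A minimal sketch of why quoting `$@` matters in a wrapper, assuming nothing about the real shim beyond what is said above. The function names are hypothetical, and `count_args` stands in for `docker compose` so the effect is visible without containers.

```shell
# Hypothetical helper: reports how many arguments it received.
count_args() { echo "$#"; }

# Unquoted $@ undergoes word splitting, so the single argument after
# `bash -c` is broken apart on whitespace.
shim_unquoted() { count_args $@; }

# "$@" expands to exactly the arguments the wrapper was given.
shim_quoted() { count_args "$@"; }

unquoted=$(shim_unquoted bash -c 'mysql -u root -e "USE lodgeit;"')
quoted=$(shim_quoted bash -c 'mysql -u root -e "USE lodgeit;"')
echo "unquoted: $unquoted args, quoted: $quoted args"
```

With the unquoted form, `bash -c` would receive only `mysql` as its command string and the remaining words as separate arguments, which matches the failure described earlier in the log.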
clarkbbut maybe not17:48
fungialso maybe stick a comment adjacent referring to interpreter magic so when someone needs to make adjustments and stuff confusingly breaks they know where to start digging in the documentation 17:49
fungievery time i try to find something in the bash manpage i feel like i'm stepping out into the wilderness without a map and compass17:50
opendevreviewClark Boylan proposed opendev/system-config master: WIP Run containers on Noble with docker compose and podman  https://review.opendev.org/c/opendev/system-config/+/93764117:51
clarkbfungi: ^ something like that17:51
clarkboh I should add a keyword to grep in the bash manpage too17:51
fungi"special parameters" yeah17:51
opendevreviewClark Boylan proposed opendev/system-config master: WIP Run containers on Noble with docker compose and podman  https://review.opendev.org/c/opendev/system-config/+/93764117:52
clarkbdone17:52
fungithat looks good, thanks!17:52
clarkbre the string above that I'm worried about I'm not even sure I understand the quoting without the extra layer of indirection :)17:52
fungiit's like somebody scattered toothpicks and pepper on the command17:53
clarkbecho'ing it the '\'' become single 's which makes sense from an "ok that is what we are trying to execute" standpoint but not from a "how does '\'' escape to '" standpoint17:54
clarkblooks like to escape a ' within a bash string with 's you have to wrap the \' around 's?17:57
fungiembedding one language's syntax inside another is a recipe for madness, yes17:57
fungioh, hey, the big reindex step for ubuntu-ports finished!17:58
fungitime to do the other steps17:58
clarkboh I see 'string1''string2' -> "string1string2" so its literally just ending and starting strings based on what is most convenient for quoting17:59
clarkbI feel like I probably knew that once upon a time then my brain evicted it as unnecessary17:59
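
The `'\''` idiom worked out above can be checked with a short sketch. The SQL text and the value `opendev` are illustrative placeholders (standing in for the templated `{{ item }}`), not the actual query from the change.

```shell
# Adjacent quoted strings are concatenated into a single word.
joined='string1''string2'

# '\'' ends the current single-quoted string, adds an escaped literal ',
# and then starts a new single-quoted string.
sql='UPDATE user SET avatar = '\''opendev'\'' WHERE name = '\''opendev'\'''

printf '%s\n' "$joined"
printf '%s\n' "$sql"
```

The second `printf` shows the statement with plain single quotes around each value, which is the string that would eventually reach the database client.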
clarkbhowever that makes me more hopeful that the shim thing might actually work? I dunno17:59
fungiah, yeah was there an outer set of ' quotes omitted in your example? if so, that's what's going on18:00
clarkbyes18:00
fungiyeah, it was missing the opening ' but i see the closing ' is still there18:00
clarkbso them maybe this magical quotes will do the right thing here since we're abusing string quoting and concatenation throughout18:00
clarkbs/them/then/18:00
fungihopefully-repaired reprepro database files are being copied back into afs now, which will presumably take a while as i'm overwriting nearly 4gb of data18:05
clarkbthis has been the biggest problem I've run into so far though since it may require us to be very careful in how we do things to maintain backward and forward compatibility18:05
clarkbheredocs feel like the ultimate in fixing that though so we should have a fallback (in particular because you can have them not interpolate variables and also avoid quoting/unquoting madness)18:05
fungicache invalidation, naming things, and bash parameter quoting18:06
clarkbha18:06
clarkbI'm glad I started poking at this with some of the more complicated services though. It's good to find these corner cases early before we decide whether or not we're going to commit to this18:06
clarkbone upside is that once a service transitions to docker compose we can stop using the shim entirely within that service which may resimplify things18:07
fungioh, the db write into afs took a lot less time than reading it out of afs originally did18:08
fungitrying to refresh the mirror now18:09
fungigrr, back to the same error18:11
clarkb:/18:13
fungiInternal error of the underlying BerkeleyDB database:18:14
fungiWithin checksums.db subtable pool at get: BDB0075 DB_PAGE_NOTFOUND: Requested page not found18:14
fungiThere have been errors!18:14
clarkbfwiw I ran the lodgeit db backup script and redirected into a file (it normally sends out to borg if configured to backup with borg) and that worked with the "$@" so it may just be the extra db manipulation we do for avatar images that have problems if the last patchset doesn't work18:14
clarkbfungi: are we out of quota and maybe short writing?18:14
fungino, nearly 100gb of headroom before we hit quota in that volume18:15
clarkbI wonder if the inputs are corrupt so we get a corrupt db? seems like reprepro is built specifically to handle such cases of bad repos though18:17
clarkbI haven't heard anything from mnasiadka or frickler so I'm going to go ahead and clean up my jammy test node in raxflex18:24
clarkbIt is easy to recreate18:24
fungimmm, running reprepro with the db directory overridden to my local copy doesn't seem to be hitting the same errors, at least18:25
fungithough it is hitting different errors that are more tractable (packages with incorrect checksums that need cleaning up)18:25
clarkblooking through collected logs from test nodes the switch from syslog to journald for logging our containers seems to work and continues to write out to /var/log/containers too18:59
clarkbthe only other real concern I have is that we might fill the disk more quickly with journald filling up? I'm not sure what the rotation policy is on that19:00
clarkbreading journald.conf's manpage it should police itself on maximum disk usage19:04
clarkbusing up to 10% of the filesystem it lives on which seems reasonable and makes me less concerned about triple accounting those logs instead of double accounting them19:05
clarkb(previously was syslog + /var/log/containers, now we're adding journald to the mix)19:05
clarkbI suspect that a step -1 here is going to be switching to journald for all of our services, then updating docker exec commands to use docker-compose commands to avoid the container name change problem. Then see what additional corner cases we can find19:06
opendevreviewClark Boylan proposed opendev/system-config master: Use docker-compose for container execs in gitea  https://review.opendev.org/c/opendev/system-config/+/93771719:21
clarkbwith ^ I think gitea might work with both old docker-compose and new docker compose19:21
clarkboh wait there is one more issue19:24
opendevreviewClark Boylan proposed opendev/system-config master: Use docker-compose for container execs in gitea  https://review.opendev.org/c/opendev/system-config/+/93771719:36
clarkbgood news is docker compose is quite a bit richer with what it can do. Bad news is docker-compose is pretty simple so we're left finding workable middle grounds while we try to be backward and forward compatible19:39
clarkbI did leave a todo for where we can improve things once on docker compose though19:39
fungiubuntu-ports mirror update is still in progress, i'm popping out for a bit to grab early dinner before i go back to paperwork20:00
opendevreviewClark Boylan proposed opendev/system-config master: Use docker-compose for container execs in gitea  https://review.opendev.org/c/opendev/system-config/+/93771720:18
*** elodilles is now known as elodilles_pto20:51
fungiubuntu-ports mirror update seems to be about halfway through the alphabet21:48
fungiargh. *now* i'm getting "Disk quota exceeded"22:44
fungiaborting and i'll bump up the quota a bit22:44
clarkbif it makes you feel better I finally got the system-config-run-gitea job passing with docker compose and podman and then the other jobs I was testing with docker compose failed due to rate limits...22:45
clarkbbut the good news is I think this is backward and forward compatible as written22:46
fungiokay, quota on ubuntu-ports has been increased from 850gb to 1tb for now (regular ubuntu volume quota is 1.2tb with 1.1tb in use, for comparison)22:48
fungithe good news is it seems to have effectively resumed where it left off22:53
clarkball of the prep changes I've written so far pass CI now. Trying to get the WIP change to pass showing the forward compatibility of the changes. If we get that I think we can start evaluating some of these prep changes and landing them. In particular they should be roughly equivalent and therefore safe to use23:23
