| Nick | Message | Time |
|---|---|---|
| opendevreview | Clark Boylan proposed opendev/system-config master: Build haproxy and zk statsd containers on Trixie https://review.opendev.org/c/opendev/system-config/+/970208 | 00:21 |
| clarkb | these two images are our simplest based on python. They don't even use the assemble routines. I think if those look good then we do an irc bot or three then gerrit | 00:22 |
| clarkb | reading the gerrit 3.12 release notes I fear they may have made the h2 cache situation worse by updating to h2 v2 because now unclean shutdowns corrupt the cache files | 00:33 |
| clarkb | previously it seemed like we could shutdown and h2 would be ok but https://www.gerritcodereview.com/3.12.html#known-issues implies this may not always be the case with h2 v2 :/ | 00:34 |
| clarkb | this will require some investigating. I'm also not sure I understand the implications of https://www.gerritcodereview.com/3.12.html#gitattributes-configuration-support-in-jgit-merge-driver is that maybe for the web editing of rebases/merges allowing you to edit things with markers in the browser? Otherwise it is an error to merge something in gerrit with conflicts so not sure why | 00:35 |
| clarkb | configuring the way conflict tags are handled is helpful | 00:35 |
| clarkb | I'll have to ask some questions as we spin up on the next upgrade | 00:35 |
| *** darmach3 is now known as darmach | 04:26 | |
| *** dhill is now known as Guest33216 | 13:29 | |
| mordred | clarkb: I think the gitattributes thing links to a not very helpful explanation. I'd be willing to bet that's more about honoring gitattributes for things like EOL markers and binary file handling | 14:04 |
| fungi | i'd buy that explanation | 14:46 |
| clarkb | mordred: ah ya could be that it is less about merge conflict behavior and other behaviors like that. I can ask Nasser who wrote the change for more details though | 15:02 |
| opendevreview | Merged openstack/project-config master: Cleanup Monasca infra https://review.opendev.org/c/openstack/project-config/+/970193 | 15:55 |
| clarkb | infra-root: gerrit 3.11 -> 3.12 upgrades require an h2 cache version change. I believe the way the current upgrade test is working is h2 must be recreating the database from scratch. If we do that in production I believe everyone will be logged out of the server during the upgrade. The alternative is to convert the underlying h2 cache dbs from v1 to v2 with this tool | 15:58 |
| clarkb | https://manticore-projects.com/H2MigrationTool/index.html | 15:59 |
| clarkb | is that something we think we would like to do? Do we think this software source is trustable? (I think the account session cache is sensitive info) | 15:59 |
| clarkb | another concern is that this tool might be really slow with our large h2 v1 files. So if we do the conversion route we're probably actually going to do so selectively for caches we believe are important to preserve like the login sessions cache | 16:01 |
| clarkb | and then let the others rebuild from scratch | 16:01 |
| clarkb | part of me is thinking forcing everyone to log back in may not be the worst thing | 16:07 |
| clarkb | fungi: re project renames: https://zuul.opendev.org/t/openstack/build/3730086c8554461aa74ec28f7966c055/log/bridge99.opendev.org/ansible/bootstrap-and-test-review.yaml.log#956-1369 this is where we test that in our CI jobs and this log ran against gerrit 3.11 | 16:46 |
| fungi | oh perfect | 16:47 |
| clarkb | fungi: reading through the log I think my main concern is that the gerrit stop may timeout and we're not deleting caches | 16:47 |
| clarkb | however since we just restarted gerrit less than a week before the planned rename I suspect this won't be a big issue | 16:47 |
| clarkb | basically gerrit needs to run longer for the caches to become a problem | 16:47 |
| fungi | yeah, it'll have been 5 days so should stop normally | 16:47 |
| clarkb | ya I think normal prep steps are all that we need | 16:48 |
| clarkb | as a side note import playbook doesn't seem to log its own task name. I had a hard time finding where those logs were and had to look for the inner task names | 16:49 |
| clarkb | a quirk of ansible I guess | 16:49 |
| clarkb | I approved the haproxy-statsd and zookeeper-statsd trixie update change. I'll watch it | 20:52 |
| fungi | thanks | 20:54 |
| fungi | lists.o.o is getting hammered again | 20:56 |
| fungi | it's starting to respond again, but i think it ran out of apache slots for a bit | 21:00 |
| opendevreview | Jeremy Stanley proposed opendev/project-config master: Add record for planned rename on December 12, 2025 https://review.opendev.org/c/opendev/project-config/+/970307 | 21:11 |
| *** parallax is now known as Guest33237 | 21:14 | |
| clarkb | fungi: do you think we should increase the slot count like we did with static? I feel like that is trickier with lists as its load is much higher under more typical usage levels | 21:37 |
| fungi | 5-minute load average was only up around 6-7 when i was getting no response out of apache | 21:39 |
| fungi | so yeah, i think it varies | 21:39 |
| fungi | clarkb: oh, james replied to you almost immediately! | 21:42 |
| clarkb | oh cool. i also followed up with EMS too | 21:43 |
| clarkb | so all the promised emails are off to their destinations | 21:43 |
| clarkb | fungi: did you have any thoughts on the testing I added to gerritlib here: https://review.opendev.org/c/opendev/gerritlib/+/970142 Mostly I'm trying to get ahead of the gerrit releases so that it's a proactive rather than post upgrade thing | 21:44 |
| fungi | clarkb: one question on that change, mostly for my own education | 21:58 |
| clarkb | fungi: oh sorry that was the main thing that needed fixing. Gerrit 3.11 and newer requires all edits to refs/meta/config go through code review unless you have force push permissions to the ref | 21:59 |
| fungi | mostly making sure that wasn't cruft from some separate change that accidentally got picked up in the diff | 21:59 |
| clarkb | fungi: last year they basically decided that what we've been doing with manage-projects and project-config for 15 years is how everyone should use gerrit | 21:59 |
| clarkb | so we've had to update all the gerrit testing in zuul and system-config etc to push a patch to add force push then code review to land it to get manage-projects to work | 22:00 |
| clarkb | I had a note about that in the commit message at first but when I started going off on making all the versions work I think I rewrote it and dropped it (which in hindsight was a mistake) | 22:00 |
| fungi | no worries, i really just wanted to know where it came from | 22:00 |
| fungi | also interesting that you'd need to push --force when it's a fast-forward update | 22:01 |
| clarkb | yes because they think all updates to refs/meta/config should be code reviewed. Which we think as well we just solved it outside of gerrit forever ago | 22:01 |
| fungi | or is it that they require permission to use push --force even if you're not specifying the --force flag? | 22:02 |
| clarkb | it's that the permission that lets you bypass the implicit requirement for code review is the force push permission | 22:02 |
| clarkb | they are overloading git terms/behavior a bit | 22:02 |
| fungi | that's a weird option to overload, yeah | 22:02 |
| fungi | i wonder why they didn't make it a separate permission/option, but whatever | 22:02 |
| clarkb | they did actually. But I tested it and it doesn't work | 22:03 |
| fungi | oh weird, so this is a workaround for their broken permissions model? | 22:03 |
| clarkb | which I asked them about and never got a response on. It probably is a bug but ya | 22:03 |
| clarkb | yes pretty much | 22:03 |
| clarkb | fungi: it's a gerrit.config server wide option that basically says don't require code review on refs/meta/config | 22:03 |
| clarkb | but in my testing of it setting it didn't change the behavior | 22:04 |
| clarkb | allowing force push does | 22:04 |
| fungi | mind if i push up a followup change to summarize this situation in a code comment? guessing you asked about it in discord, not a bug report | 22:04 |
| clarkb | yes it was on discord | 22:04 |
| fungi | cool, just making sure there's no bug url i should include in the comment | 22:05 |
| clarkb | and no I don't mind. You can find similar justification for similar changes in system-config's gerrit testing and zuul's quickstart testing too | 22:05 |
| fungi | ah, maybe i should add the comment to one of those instead | 22:05 |
| fungi | (or all of them) | 22:05 |
| clarkb | fungi: https://etherpad.opendev.org/p/gerrit-upgrade-3.11 line 31 and the block below covers some of this too | 22:05 |
| fungi | okay great, thanks | 22:05 |
| clarkb | it links to where I tested the setting of the config option though I'm guessing those logs have since been deleted | 22:05 |
| fungi | did you test without the config option and only +force in the acl? | 22:06 |
| fungi | wondering if it needs both to work | 22:06 |
| clarkb | fungi: yes that is the situation in production and our system-config test jobs today | 22:07 |
| clarkb | (we don't set the server wide config option at all only the +force acl) | 22:07 |
| fungi | okay, so just +force, the separate permission is currently pointless i guess | 22:07 |
| clarkb | yup and my question on discord was basically "I set this value and the behavior didn't change. Changing the acl works. What is the point of the option in this case" and never got a response | 22:08 |
| clarkb | I feel like this is one of those situations where being an early adopter (another is pbr) creates some inflexibility and strong opinions whereas for others they're still learning as they go | 22:12 |
| opendevreview | Clark Boylan proposed opendev/system-config master: Build accessbot on trixie https://review.opendev.org/c/opendev/system-config/+/970321 | 22:13 |
| fungi | looks like we may not need anything similar in git-review testing, we seem to run it with the default acl | 22:14 |
| clarkb | ya I think this may be very related to manage-projects needing to push directly to the acls | 22:15 |
| clarkb | if the default acls work for you then it's a noop | 22:16 |
| opendevreview | Clark Boylan proposed opendev/system-config master: Update Hound Container to Debian Trixie https://review.opendev.org/c/opendev/system-config/+/970322 | 22:25 |
| opendevreview | Merged opendev/gerritlib master: Update Gerrit integration testing to test many Gerrit versions https://review.opendev.org/c/opendev/gerritlib/+/970142 | 22:26 |
| opendevreview | Merged opendev/system-config master: Build haproxy and zk statsd containers on Trixie https://review.opendev.org/c/opendev/system-config/+/970208 | 22:27 |
| opendevreview | Clark Boylan proposed opendev/system-config master: Update matrix-eavesdrop container to build on Debian Trixie https://review.opendev.org/c/opendev/system-config/+/970325 | 22:27 |
| clarkb | ok I think that set of three additional changes is a reasonable set to get in before Gerrit moves to trixie. We could also do lodgeit potentially. | 22:28 |
| clarkb | there are probably a couple other candidates I'm forgetting too, but the mix I've pushed up seems like decent sanity coverage so don't need to overthink it | 22:29 |
| clarkb | I've logged into both load balancers and zk01 and will check that their statsd containers update cleanly then move over to grafana to ensure the stats keep rolling in | 22:30 |
| clarkb | the job is deploying now to all three | 22:30 |
| clarkb | all three hosts I'm looking at have new containers now | 22:31 |
| clarkb | the zookeepers go one by one so will be a few minutes for the other two servers in the zk cluster to update | 22:32 |
| clarkb | hrm zk02 isn't updating and the job just turned red | 22:33 |
| clarkb | hrm docker hub rate limits were hit but I thought we weren't using docker hub for this image any more | 22:34 |
| clarkb | oh! it's the zk image not the zookeeper-statsd image that hit the rate limit | 22:35 |
| clarkb | I think I'm ok leaving it in this state. It should recover on its own, and if zk01's statsd are arriving that is good enough for now | 22:35 |
| fungi | i guess we don't mirror zk's images to quay | 22:36 |
| clarkb | we might. I was going to check that after looking at grafana | 22:36 |
| fungi | for that matter, i wonder if some of the projects we're mirroring to quay might have separately started to publish images to non-dockerhub registries on their own | 22:37 |
| fungi | we can't be the only ones struggling with this | 22:38 |
| clarkb | https://grafana.opendev.org/d/21a6e53ea4/zuul-status?orgId=1&from=now-30m&to=now&timezone=utc has data from zk01 in the last 5 minutes (if you click zk01 on the graph it will drop the data for the other two) (approximate data size and ephemeral node counts change often enough to see it is working) | 22:38 |
| clarkb | fungi: I think some of the issue is that docker itself is managing these library images in many cases | 22:38 |
| clarkb | rather than say zookeeper themselves | 22:38 |
| clarkb | or python etc | 22:38 |
| clarkb | https://grafana.opendev.org/d/1f6dfd6769/opendev-load-balancer?orgId=1&from=now-5m&to=now&timezone=utc also has data from the gitea lb system so I think the new statsd container is working | 22:39 |
| clarkb | fungi: also ipv6 makes this a million times worse | 22:41 |
| clarkb | fungi: and I suspect many/most people are still ipv4 only | 22:41 |
| opendevreview | Clark Boylan proposed opendev/system-config master: Mirror zookeeper:3.9 to our quay.io mirror org https://review.opendev.org/c/opendev/system-config/+/970327 | 22:41 |
| clarkb | we were not mirroring zookeeper:3.9 only :latest | 22:42 |
| clarkb | (they may be the same thing today I'm not sure but we shouldn't count on that long term) | 22:45 |
| clarkb | fungi: remind me: apt-key is the old thing right? And that would explain why trixie doesn't have it by default? | 22:47 |
| clarkb | fungi: instead we can write the ascii armored file directly to some path in /etc/apt/ ? | 22:47 |
| *** parallax is now known as Guest33250 | 22:50 | |
| clarkb | aha you're supposed to put specific signed-by lines in apt sources list entries then stick the file wherever you like | 22:50 |
| opendevreview | Clark Boylan proposed opendev/system-config master: Update Hound Container to Debian Trixie https://review.opendev.org/c/opendev/system-config/+/970322 | 22:55 |
| fungi | clarkb: yes, you've found the answer now though | 23:01 |
| clarkb | ya found a good enough pointer via google then from that discovered we already had an example in system-config | 23:02 |
| fungi | if you look at the main sources list on a trixie install you'll see clear examples | 23:02 |
| fungi | apt-key add was deprecated a couple of debian releases back, and then global keyrings more generally were deprecated too | 23:03 |
| clarkb | ya seems like an improvement to make sure each key is only validating the packages you expect it to | 23:04 |
| fungi | correct | 23:04 |
| clarkb | also it is far more convenient to write these files to disk like any other fiel and refer to them | 23:04 |
| clarkb | rather than maitnain what is basicalyl a databawse | 23:05 |
| clarkb | wow I cannot type | 23:05 |
| fungi | not that it's all that huge of an improvement from a security sense, because installing a deb runs maintscripts as root and can do any arbitrary thing the package creator wants to your system | 23:05 |
| clarkb | ah so could just say trust my key for everything anyway | 23:05 |
| fungi | potentially. or just replace /bin/sh with your pwn3d backdoor | 23:06 |
| opendevreview | Clark Boylan proposed opendev/system-config master: Update Hound Container to Debian Trixie https://review.opendev.org/c/opendev/system-config/+/970322 | 23:11 |
| clarkb | I guess you'd need some way of limiting access to portions of the system per package/per repo and have that info get signed by some more authoritative key and at that point you've hobbled the ability of anyone to have their own package repo | 23:30 |
| clarkb | and maybe you've reinvented snaps/flatpaks at that point | 23:30 |
| fungi | yeah, that's basically the challenge, snap/flatpak or containers more generally | 23:39 |
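
On the apt-key exchange above (22:47 to 23:04): a minimal Python sketch of the per-source signed-by approach that replaces `apt-key add`. It renders a one-line sources list entry that names a single key file, so that key only validates packages from that one repository. The function name `sources_list_entry`, the repository URL, and the key path are hypothetical and chosen for illustration; none of them come from system-config. The sketch covers only the key-file form of signed-by, and the `.asc`/`.gpg` extension check reflects my understanding of how apt tells armored keys from binary keyrings, so treat it as an assumption.

```python
from pathlib import PurePosixPath


def sources_list_entry(uri, suite, components, keyring_path):
    """Render a one-line apt source entry pinned to a single key file."""
    keyring = PurePosixPath(keyring_path)
    if not keyring.is_absolute():
        raise ValueError("signed-by needs an absolute path to the key file")
    # Assumption: apt distinguishes ASCII-armored (.asc) from binary (.gpg)
    # key files by their extension.
    if keyring.suffix not in (".asc", ".gpg"):
        raise ValueError("key file should end in .asc (armored) or .gpg (binary)")
    return "deb [signed-by={}] {} {} {}".format(
        keyring, uri, suite, " ".join(components)
    )


# Hypothetical repository and key path, for illustration only.
entry = sources_list_entry(
    "https://example.org/debian",
    "trixie",
    ["main"],
    "/etc/apt/keyrings/example.asc",
)
print(entry)
```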
Generated by irclog2html.py 4.0.0 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!