clarkb | just about meeting time | 18:59 |
---|---|---|
clarkb | #startmeeting infra | 19:00 |
opendevmeet | Meeting started Tue Jan 21 19:00:38 2025 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot. | 19:00 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 19:00 |
opendevmeet | The meeting name has been set to 'infra' | 19:00 |
clarkb | Hello! | 19:00 |
clarkb | #link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/JWBLUYVPNULENDQWGEKO6VX27CVXQLGO/ Our Agenda | 19:00 |
frickler | \o | 19:01 |
clarkb | #topic Announcements | 19:01 |
clarkb | I didn't have anything to announce. Did anyone else? | 19:01 |
clarkb | sounds like no. Let's dive into the agenda then | 19:02 |
clarkb | #topic Zuul-launcher image builds | 19:02 |
clarkb | #link https://review.opendev.org/q/hashtag:niz+status:open is next set of work happening in Zuul | 19:02 |
clarkb | I believe that a good chunk of this work landed last week and should be deployed now (via our weekly updates). But there are still some open changes last I looked so not sure if we need to get those in before this can proceed | 19:03 |
clarkb | corvus: ^ is this something where we're still waiting or is what landed sufficient to make progress? | 19:03 |
clarkb | we may not have corvus right now. We can continue and get back to this later if that changes | 19:05 |
clarkb | in any case some progress has been made I'm just not sure of how much yet | 19:05 |
clarkb | #topic Deploying new Noble Servers | 19:05 |
fungi | a noble endeavor | 19:05 |
clarkb | Progress has happened here as well. I migrated paste01's db to paste02 last week and updated DNS. The next step was to get backups working which is where I ran into complications | 19:06 |
clarkb | the tl;dr is that we need borg ~1.2.8 to run on noble for compatibility with python3.12 | 19:06 |
clarkb | we currently pin to 1.1.18 and there are some big nasty warnings from borg about mixing these. However, after much reading and attempts to understand the problems I think the risk to us is extremely low | 19:06 |
clarkb | basically what could happen is we could delete valid archives (backups) if we run `borg check --repair` on a borg 1.x created archive using 1.2 | 19:07 |
clarkb | however, this particular issue seems to be unlikely for us because we never used a borg old enough to produce the archives that would now be considered invalid. And we don't automatically run borg check --repair anywhere | 19:08 |
corvus | (sorry for tardiness; almost ready to make progress on images; expect real progress by next week) | 19:08 |
clarkb | (personally I would've appreciated clearer and more direct communication of the problem from borg rather than the big scary messages we got but we muddled through) | 19:08 |
clarkb | so anyway paste02 is backing up with borg 1.2.8 to servers running 1.1.18. Worst case today only paste02's backups would be impacted but as mentioned I don't expect problems anyway | 19:09 |
clarkb | corvus: ack thanks | 19:09 |
clarkb | #link https://review.opendev.org/c/opendev/system-config/+/939667 fixup for warnings treated as errors with new borg spamming email | 19:09 |
clarkb | there is one little borg update annoyance though and that is borg 1.2 exits with rc 1 if there are warnings so we've been spamming root email with backup failures that are most likely all warnings not errors. This change would treat rc 1 when using newer borg as a successful backup | 19:10 |
clarkb | the most common warning (the only one I've seen anyway) is a warning for files changing while being backed up. This is common with log files in particular | 19:10 |
fungi | also the unknown unencrypted volume warning | 19:10 |
clarkb | I think if we can get 939667 or something like it in and confirm that backups behave nicely afterwards we'll be in a spot to consider deleting paste01 and retiring its backups | 19:10 |
clarkb | fungi: that one we explicitly override though and in testing it doesn't seem to affect rc 1 with our override | 19:11 |
clarkb | I thought it did at first but that was a red herring | 19:11 |
clarkb | basically if unknown unencrypted volume warning is the only warning you get an rc 0 with our flag to ignore that warning | 19:11 |
fungi | ah, okay, so it shows up in the log as a warning but doesn't contribute to the return code? | 19:11 |
fungi | it was the only warning i saw logged in the one i looked at | 19:12 |
clarkb | yes that seems to be the case as I had logs with that warning and no others exit 0 | 19:12 |
clarkb | fungi: the other warning isn't logged as a warning | 19:12 |
fungi | bwahahahaha | 19:12 |
clarkb | it's just a message with no prefix | 19:12 |
clarkb | let me find an example really quickly | 19:12 |
clarkb | https://zuul.opendev.org/t/openstack/build/49255ec995394248a24c5eb1e11c9a68/log/borg-backup-noble.opendev.org/borg-backup-borg-backup01.region.provider.opendev.org.log#8560 | 19:13 |
fungi | okay, so exits nonzero on warnings, logs warnings even when they're explicitly disabled, but also doesn't state that some warnings it logs are warnings | 19:13 |
clarkb | right. When the linked message doesn't appear we get rc 0 even with the explicit warning about unknown unencrypted volume | 19:13 |
clarkb | so anyway long story short I think this is mostly working now from podman to python 3.12 to borg. With the above change being the last known cleanup. And I think I'm happy to start on old server cleanups once the last issue is sorted | 19:14 |
clarkb | let me know if you have concerns or questions about that and we can dig in more and make sure we're happy with it | 19:14 |
clarkb | #topic Deploying Lodgeit entirely without Docker Hub | 19:14 |
fungi | for the record, the log that's currently /var/log/borg-backup-backup01.ord.rax.opendev.org.log.3.gz on paste02 is the one i was looking at | 19:14 |
clarkb | fungi: look for 'changed while' and you should see what I linked to above just in prod | 19:15 |
fungi | /var/log/borg-backup-backup01.ord.rax.opendev.org.log: file changed while we backed it up | 19:15 |
fungi | indeed, it's in there | 19:15 |
clarkb | cool | 19:15 |
clarkb | this next topic is related to the previous one in that I realized we could take this opportunity of paste02 running podman and docker compose to test our assumptions about speculative image testing and switch it over to quay entirely | 19:16 |
clarkb | #link https://review.opendev.org/c/opendev/lodgeit/+/939385 Publish lodgeit image to quay.io | 19:16 |
clarkb | that change updates where we push the image to (quay instead of docker hub) then I still need to write a followup to pull it from quay if we want to proceed with that | 19:16 |
clarkb | I think it is a good idea to triple check our assumptions before we get too far down this path again | 19:16 |
clarkb | please review and let me know if you have any questions or concerns about that | 19:16 |
tonyb | ++ | 19:17 |
tonyb | sounds like a good plan to me | 19:17 |
clarkb | #topic Upgrading Old Servers | 19:17 |
clarkb | I think we can move on I mostly wanted to call out the effort and discussion can proceed in review | 19:17 |
clarkb | tonyb: I know you were out until recently, but anything we need to do / think about re wiki? | 19:18 |
tonyb | nope. I'll send the announce email and switch the skin | 19:19 |
fungi | yay! so close | 19:19 |
clarkb | tonyb: did the changes still need some updates? I seem to remember planning to change the proxy setup maybe? | 19:19 |
tonyb | given your experience with paste do you think it's "safe" to go with noble | 19:19 |
clarkb | I guess ping for reviews when that is ready and let us know if they have already been updated | 19:19 |
tonyb | will do | 19:20 |
clarkb | noble was definitely an uplift but I am hopeful I've sorted out the major items | 19:20 |
tonyb | okay | 19:20 |
fungi | just make sure you enable configdrive when launching it | 19:21 |
clarkb | anything else on this topic? | 19:21 |
fungi | (rackspace's jammy image doesn't need that, so it's not on by default, but our noble image needs it for cloud-init to work) | 19:21 |
tonyb | not from me. | 19:22 |
clarkb | ya because we uploaded tonyb's converted upstream noble image | 19:22 |
clarkb | I suspect rax does magic to make not config drive work and they haven't uploaded noble yet | 19:22 |
fungi | rackspace presumably fiddles with the image they make available | 19:22 |
clarkb | ya | 19:22 |
clarkb | #topic Gerrit 3.10.4 | 19:22 |
clarkb | last week we managed to get borg going for noble and upgrade gitea. The last major item on my list was updating gerrit to 3.10.4 but due to our prior gerrit restart experience going poorly and a holiday weekend approaching with basically only me around I decided to defer this to this week | 19:23 |
fungi | i'm happy to help with it basically any time this week | 19:23 |
clarkb | I think I'll aim for doing this tomorrow once I'm caught up with the plan being to land the h2 db setting and also update to the newer point release | 19:23 |
fungi | links to changes/topic might help | 19:24 |
clarkb | this may require a couple of restarts to see it take effect, but otherwise it should be similar to most of our restarts | 19:24 |
clarkb | #link https://review.opendev.org/c/opendev/system-config/+/938000 | 19:24 |
clarkb | #link https://review.opendev.org/c/opendev/system-config/+/939167 | 19:24 |
clarkb | those are the two related changes | 19:24 |
fungi | both lgtm, thanks | 19:25 |
clarkb | the updates between 3.10.3 and 3.10.4 seem minimal and also safe | 19:25 |
clarkb | mostly bugfixes | 19:25 |
clarkb | #topic Running certcheck on bridge | 19:25 |
clarkb | fungi: is there a change for this yet? I suspect no given the distractions last week but wanted to double check | 19:26 |
fungi | i think we had rough consensus in favor last week, but no i said i wouldn't have time to put that together until this week. thanks for the reminder! | 19:26 |
clarkb | yup I definitely didn't expect it yet. Just checking | 19:27 |
clarkb | #topic Service Coordinator Election | 19:27 |
fungi | worth noting though, the build failure which prompted me to start looking at all of that turned out to be related to a github outage and not rate limits, so it's not super urgent | 19:27 |
clarkb | ack | 19:27 |
clarkb | Last week I said that I would send email to make the proposed plan for this election official then didn't do that. So this is now on my list for today | 19:27 |
clarkb | as a reminder Nominations Open From February 4, 2025 to February 18, 2025. Voting February 19, 2025 to February 26, 2025. All times are UTC based | 19:28 |
clarkb | #topic Beginning of the Year (Virtual) Meetup | 19:28 |
clarkb | and finally a reminder we're trying to meetup several times this week (with the first block of time occurring in 1.5 hours) | 19:29 |
clarkb | #link https://etherpad.opendev.org/p/opendev-january-2025-meetup | 19:29 |
clarkb | the etherpad has the schedule info and a number of topics from myself | 19:29 |
clarkb | feel free to add items | 19:29 |
clarkb | frickler do you know if the "early" block from 1800-2000 UTC tomorrow and day after are something you'll be attending? | 19:30 |
clarkb | if not I suspect we may only use the "late" blocks from 2100-2300 UTC each day | 19:30 |
frickler | I won't, sorry | 19:30 |
clarkb | ok I'll go ahead and remove the early blocks. We'll just use the late blocks | 19:30 |
tonyb | sounds good | 19:31 |
fungi | updated my reminders accordingly | 19:31 |
clarkb | we can probably do additional self organization when we jump on meetpad later today | 19:32 |
clarkb | #topic Open Discussion | 19:32 |
clarkb | This didn't make it onto the agenda but I think we should push a Bindep release with changes up to everything before switching to pyproject.toml | 19:32 |
clarkb | then we can switch to pyproject.toml and have bindep act as a canary for how that all works with PBR's proposed updates to better support that system | 19:32 |
fungi | oh, right | 19:32 |
fungi | i can also try to find some time to push that this week | 19:33 |
fungi | related, there are a couple of documentation updates for pbr in review related to pyproject.toml support | 19:33 |
clarkb | unfortunately the python packaging world is full steam ahead into pyproject.toml as an assumed tool and python3.12 and newer are getting clunkier without it (also creates tons of confusion for people when they need to manually install setuptools) | 19:33 |
clarkb | the changes should continue to support the old school method as long as you preinstall setuptools. But we're also trying to make pyproject.toml work for people who are too confused about that | 19:34 |
fungi | but yeah, it would be great to have one or more simple opendev projects we could point at as examples of how to use pbr that way | 19:34 |
clarkb | also we updated gitea last week which includes a "fix" for the memory leak | 19:35 |
fungi | also it was a good exercise for mapping out the remaining rough edges and future possible polish around pbr's support there, as well as fixing up its documentation | 19:35 |
clarkb | basically they disabled fuzzy search by default as fuzzy search is apparently very memory hungry | 19:35 |
clarkb | fungi: ++ | 19:35 |
clarkb | we also had to emergency apply a new user agent filter string for a valid but old version of edge that was impacting service availability | 19:36 |
clarkb | very likely more ai crawler bots not being nice | 19:36 |
clarkb | I found a hacker news post from someone else that had to take their gitea off the internet due to similar problems. The discussion on hacker news about it was interesting as apparently different crawler bots respect robots.txt in different orthogonal ways that people have inferred over time | 19:37 |
clarkb | like apparently open ai will only respect entries specifically for its user agent string and not the generic top level rules | 19:37 |
clarkb | and others have noticed that crawl-delay has really inconsistent support | 19:38 |
fungi | but also there seemed to be some consensus that gitea is not designed to withstand aggressive crawlers, worse than many web applications anyway | 19:38 |
clarkb | https://news.ycombinator.com/item?id=42750420 that was this post | 19:38 |
fungi | as well as general lament that it's starting to seem like the only viable solutions are to start relying on cdn-oriented ai filtering service providers | 19:39 |
clarkb | what else? there was a small zuul blip with a bug that would impact a subset of playbook runs. corvus took care of that over the weekend before it was a larger problem. Thank you for the quick turnaround on that | 19:39 |
corvus | i had help; someone wrote a patch first, but i did restart over the weekend. :) | 19:40 |
corvus | also, i wrote the bug in the first place :( | 19:40 |
fungi | if you don't write them, who will? | 19:41 |
clarkb | I'll give it a few more minutes but we may get 15 minutes back for $meal today | 19:41 |
corvus | (fun story: it was the result of a particularly gruesome git conflict resolution. it's the biggest fail of git merge i've seen. i basically had to just completely reconstruct everything manually from the diffs) | 19:41 |
clarkb | oof | 19:41 |
fungi | i hate it when that happens | 19:41 |
fungi | e.g. when git merges a chunk to the wrong part of the file due to multiple context matches | 19:42 |
corvus | yep | 19:42 |
corvus | it got 1 line right and the remaining 70 lines wrong. | 19:42 |
clarkb | sounds like that is everything. Thank you everyone. We'll be back here next week same time and location. We're also going to hang out on meetpad from 2100-2300 UTC today, tomorrow, and Thursday to go over higher level topic discussion | 19:44 |
fungi | thanks clarkb! | 19:44 |
clarkb | see you there in about 1.25 hours | 19:44 |
clarkb | #endmeeting | 19:44 |
opendevmeet | Meeting ended Tue Jan 21 19:44:31 2025 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 19:44 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/infra/2025/infra.2025-01-21-19.00.html | 19:44 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/infra/2025/infra.2025-01-21-19.00.txt | 19:44 |
opendevmeet | Log: https://meetings.opendev.org/meetings/infra/2025/infra.2025-01-21-19.00.log.html | 19:44 |
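
The exit-code handling discussed under "Deploying new Noble Servers" (treating rc 1 from newer borg as a successful backup, since it signals warnings such as "file changed while we backed it up") could be sketched roughly as below. This is only an illustration: the function names and the version threshold are assumptions, and the actual change in 939667 may be implemented differently.

```python
from typing import Tuple

# Borg's documented return codes: 0 = success, 1 = warning, 2 = error.
BORG_OK = 0
BORG_WARNING = 1


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version string such as '1.2.8' into a comparable tuple."""
    return tuple(int(part) for part in text.strip().split("."))


def backup_succeeded(returncode: int, borg_version: str) -> bool:
    """Hypothetical helper: decide whether a borg run counts as a good backup.

    Assumption: with borg 1.2 or newer, rc 1 (warnings only) is accepted,
    while older versions keep the strict "rc 0 only" behaviour.
    """
    if returncode == BORG_OK:
        return True
    if returncode == BORG_WARNING and parse_version(borg_version) >= (1, 2):
        return True
    return False
```

With this sketch, `backup_succeeded(1, "1.2.8")` is accepted while `backup_succeeded(1, "1.1.18")` and any rc 2 are still reported as failures, so real errors keep generating email.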