| @clarkb:matrix.org | Both gitea and etherpad have new releases we can start preparing to deploy. Historically I've done a lot of these updates. I'm happy to walk someone else through one or both if there is interest. | 14:34 |
|---|---|---|
| Note we should avoid actually upgrading etherpad until after the ptg | ||
| @tafkamax:matrix.org | The etherpad creator is developing a new version of etherpad where the backend is in go | 14:34 |
| @tafkamax:matrix.org | FYI | 14:34 |
| @tafkamax:matrix.org | Haven't tested it myself | 14:35 |
| @clarkb:matrix.org | Yup eventually we will need to switch I suspect. But the nodejs version just got a release so we can continue to deploy that | 14:35 |
| @tafkamax:matrix.org | Do you use plugins as well? | 14:35 |
| @tafkamax:matrix.org | I use and like to compile them into docker instead of installing at runtime | 14:36 |
| @clarkb:matrix.org | We use one or two minor plugins. Nothing too crazy. The bigger concern I have is migrating existing pads to the new system when it comes to that | 14:36 |
| @tafkamax:matrix.org | Yeah | 14:36 |
| @clarkb:matrix.org | And yes we add them directly to our docker images | 14:36 |
| @tafkamax:matrix.org | Nice | 14:36 |
| -@gerrit:opendev.org- Clark Boylan proposed: [opendev/system-config] 985834: Update Gitea to 1.26.0 https://review.opendev.org/c/opendev/system-config/+/985834 | 15:14 | |
| @clarkb:matrix.org | That is a first draft on the gitea update. I figured getting the ball moving was a good idea. I'll probably do the same for etherpad. Still happy for others to jump in if interested. Chances are these changes will need updates to make things work and/or to update the versions as new ones may be available by the time we're ready to upgrade | 15:15 |
| @mnasiadka:matrix.org | The zp03 changes are awaiting reviews if there’s room in anybody’s todo list - https://review.opendev.org/c/opendev/system-config/+/985620 | 15:23 |
| @fungicide:matrix.org | i have a break in my ptg schedule so am going to take this opportunity to grab lunch, but can review when i get back | 15:44 |
| -@gerrit:opendev.org- Zuul merged on behalf of yatin: [zuul/zuul-jobs] 984404: [ensure-python] Fix rpm_python_pkg_name For CentOS/RHEL https://review.opendev.org/c/zuul/zuul-jobs/+/984404 | 15:52 | |
| @clarkb:matrix.org | github seems to be really slow today | 15:53 |
| @clarkb:matrix.org | githubstatus seems to confirm. I just need the etherpad changelog :) | 15:53 |
| -@gerrit:opendev.org- Clark Boylan proposed: [opendev/system-config] 985843: Upgrade etherpad to 2.7.0 https://review.opendev.org/c/opendev/system-config/+/985843 | 16:16 | |
| @clarkb:matrix.org | https://review.opendev.org/c/opendev/zuul-providers/+/984866 based on the consistent post failures there I think something may be properly broken with our intermediate swift container for zuul-launcher builds | 16:19 |
| @clarkb:matrix.org | I did a brief check to see if we might have the same requests exceptions issue that affected log uploads and that doesn't appear to be the problem. In this case swift appears to return a 404 saying our container doesn't exist? | 16:20 |
| @clarkb:matrix.org | If I do a `openstack container list` against that cloud region I see the container listed | 16:22 |
| @clarkb:matrix.org | if I do a container show I get a 404 | 16:22 |
| @clarkb:matrix.org | interesting | 16:22 |
| @clarkb:matrix.org | has anyone else seen this before where container listings show a container but container show says 404? I don't think this is an auth issue because you'd expect a 403 or a failure to list the containers in the first place | 16:23 |
| @clarkb:matrix.org | corvus: ^ fyi as this affects opendev's zuul-launcher stuff | 16:23 |
| @clarkb:matrix.org | part of me wonders if we should just try using a new container but I'm not sure what is involved in setting that up | 16:24 |
| @jim:acmegating.com | Clark: i'm about to head out for an appointment, but i would support trying a new container; i suspect there may be a bunch of old cruft in there anyway from early launcher stuff, and a restart would be good) | 16:25 |
| (but -- perhaps this is just a temporary cloud issue?) | ||
| @clarkb:matrix.org | corvus: yes I thought it may be temporary but it appears to have started ~April 15 and based on my manual testing appears to be continuing. So about a week long issue or so | 16:26 |
| @clarkb:matrix.org | I can recheck the change again, but based on manual testing I don't expect a different result | 16:26 |
| @clarkb:matrix.org | without being able to container show the container I'm not sure what sorts of acls we might need on a new container :/ | 16:26 |
| @jim:acmegating.com | i thought we went through this during the great re-keying | 16:27 |
| -@gerrit:opendev.org- Michal Nasiadka proposed: [opendev/irc-meetings] 985845: Move Magnum weekly team meeting https://review.opendev.org/c/opendev/irc-meetings/+/985845 | 16:27 | |
| @jim:acmegating.com | did we document anything as a result of that? | 16:27 |
| @clarkb:matrix.org | oh yes we did | 16:27 |
| @jim:acmegating.com | if not, maybe it will be in logs | 16:28 |
| @clarkb:matrix.org | I don't know if we explicitly logged anything, but I can look through communication logs to see what can be found | 16:28 |
| @jim:acmegating.com | (or maybe in the password file) | 16:28 |
| @clarkb:matrix.org | I'll work on that and if I can find it I'll go ahead and create a new container and set acls and then take it from there | 16:28 |
| @jim:acmegating.com | cool, sorry i can't help more right now, but will check in when i get back | 16:28 |
| @clarkb:matrix.org | thanks. I don't think this is urgent so don't feel bad about that. it's been broken for a week; another few hours or a day won't be the end of the world | 16:29 |
| @clarkb:matrix.org | ok I've found where we discussed rotating the credentials. It looks like we simply created a new application credential and then updated the secret values for its uuid and token/password/secret. It does not look like we changed anything about the container itself. This implies to me that maybe we did not have any special acls on the container itself. Given that I'm inclined to create a new container and push an update to opendev/zuul-providers and see what happens | 16:36 |
| @clarkb:matrix.org | I've created a new container with `openstack container create images-e86a313cdad6` and I can container show this container. I'll push up a change to use it momentarily | 16:42 |
| -@gerrit:opendev.org- Clark Boylan proposed: [opendev/zuul-providers] 985847: Change the intermediate swift storage container https://review.opendev.org/c/opendev/zuul-providers/+/985847 | 16:45 | |
| -@gerrit:opendev.org- Clark Boylan proposed: [opendev/system-config] 985834: Update Gitea to 1.26.0 https://review.opendev.org/c/opendev/system-config/+/985834 | 17:03 | |
| @clarkb:matrix.org | something about how gitea is built has changed, but they super optimized their dockerfile so figuring out the delta is a bit of a pain. I don't think those optimizations make a lot of sense for us since we're not maintaining local images and building them often so if all I needed was the explicit go mod download step then maybe we're ok? | 17:04 |
| @clarkb:matrix.org | however, if things continue to not work maybe we port our dockerfile to something more closely matching theirs. | 17:04 |
| @clarkb:matrix.org | oh actually maybe they removed the explicit step to build that command entirely and the thing is simply not in the codebase anymore which would explain the compiler complaint that the name isn't in std anymore? | 17:06 |
| -@gerrit:opendev.org- Zuul merged on behalf of Michal Nasiadka: [opendev/zone-opendev.org] 985619: Add zp03.opendev.org https://review.opendev.org/c/opendev/zone-opendev.org/+/985619 | 17:11 | |
| -@gerrit:opendev.org- Clark Boylan proposed: [opendev/system-config] 985834: Update Gitea to 1.26.0 https://review.opendev.org/c/opendev/system-config/+/985834 | 17:16 | |
| -@gerrit:opendev.org- Zuul merged on behalf of Michal Nasiadka: [opendev/system-config] 985620: Add zp03.opendev.org https://review.opendev.org/c/opendev/system-config/+/985620 | 17:43 | |
| @mordred:waterwanders.com | So, I've been noodling a bunch with using ai agents for both coding and code-review with gerrit and zuul. It's working well so far. Like - it's almost like the tools we've built with OpenDev and Zuul are exactly the type of guardrails needed for AI sanity. :) Based on experience so far, I'd like to try something that would require a bit more buy in from OpenDev than just approving the occasional mordred repo. | 17:54 |
| I'm finding that my review agent does a good enough and quick enough job of catching dumb things, that having zuul jobs not start until a change has an agent+1 feels more right. Instead of "patch upload -> zuul +1 -> human review / +A -> zuul gate" it would be "patch upload -> agent+1 -> zuul +1 -> human review / +A -> zuul gate" To do that I'd need new pipeline definitions, which in turn would need a new zuul tenant and config repo - and on the repos in question they'd want an additional gerrit review category. this feels like something that should be discussed more broadly before I start writing new config-repo patches. | ||
| Do you guys want me to write up a formal spec? Or is rambling back and forth here sufficient? | ||
| @fungicide:matrix.org | i don't have objections to creating a new zuul tenant for that, even as an experiment. alternatively you might be able to get away with just a separate pipeline in an existing tenant? | 17:57 |
| @fungicide:matrix.org | obviously you need slightly different trigger conditions than the typical check pipeline, but you could set the involved projects up with no check pipeline and then have a separate validation pipeline on them | 17:58 |
| @mordred:waterwanders.com | you know, I like that as a potential smaller first step for the experiment than a whole new tenant out of the gate | 17:59 |
| @clarkb:matrix.org | It also seems to be similar to what we had suggested when this came up earlier. Basically use zuul to drive the process of LLM code review. The main difference is the use of a separate pipeline which seems like it would be ok to do particularly in another tenant. I think the main drawback to that is it assumes you'll always have an LLM do pre review which is fine if tenant scoped | 17:59 |
| @clarkb:matrix.org | I'm not sure how you'd do that in an existing tenant without changing the rules for all repos in that tenant | 18:00 |
| @mordred:waterwanders.com | yeah - I don't think it can be a global thing, definitely has to be an opt-in. | 18:00 |
| @clarkb:matrix.org | Since you would need to modify the check pipeline trigger criteria | 18:00 |
| @fungicide:matrix.org | well, check could be undefined on those projects and would never trigger | 18:01 |
| @clarkb:matrix.org | Though maybe you can set llm-review pipeline to run noop job | 18:01 |
| @mordred:waterwanders.com | well - if we do fungi's thing, we could make a "post-agent-check" pipeline (or a better name) - and then, as long as there isn't a clean-check rule, if there weren't any jobs in check, it _should_ just work, yeah? | 18:01 |
| @fungicide:matrix.org | as long as the gate pipeline in that tenant doesn't require a vote from any other pipeline as a condition, and gate is always human-triggered, i don't see why there couldn't be multiple check-like pipelines used in parallel by different projects | 18:02 |
| @clarkb:matrix.org | Oh ya if you invert it where check is still the first pass but opt into using LLMs or traditional jobs then ya that would work | 18:02 |
| @clarkb:matrix.org | Though it might be weird to get the triggers right if using verified +1 still | 18:02 |
| @clarkb:matrix.org | It kinda seems like you want a llm-review label and llm-review pipeline then configure that pipeline to post to llm-review label and trigger check off of that | 18:03 |
| @fungicide:matrix.org | a custom gerrit label in those projects' acls could also be an option | 18:03 |
| @mordred:waterwanders.com | yeah - well, it's "have the LLM do a pass, then once they're happy, trigger Zuul" so the alternate check pipeline could have the rule that it needs agent+1 or whatever | 18:03 |
| @clarkb:matrix.org | But you can probably exercise the bulk of the workflow without that level of commitment | 18:03 |
| @fungicide:matrix.org | yeah, basically what you just said | 18:03 |
| @fungicide:matrix.org | right, there's a lot of flexible bits to work with here, between project-specific gerrit and zuul configuration | 18:04 |
| @clarkb:matrix.org | Right to be clear I think this is doable. I think what is awkward is to do it in a tenant with existing jobs and pipelines and expectations | 18:04 |
| @clarkb:matrix.org | Using a new tenant you side step all of that without needing to retrofit the old tenant first | 18:04 |
| @mordred:waterwanders.com | yah. I'll go see if I can't cook up a patch to add a new pipeline but not impact existing pipelines. | 18:04 |
| @fungicide:matrix.org | i'm okay with it as long as the experiment doesn't bleed over onto other uninvolved projects/changes | 18:04 |
| @mordred:waterwanders.com | ++ me too. I very much don't want uninvolved bleed | 18:05 |
| @fungicide:matrix.org | the only ugly bit i can think of is that there would be an additional pipeline showing up in zuul status for that tenant | 18:05 |
| @fungicide:matrix.org | but if that pipeline doesn't run jobs for or leave votes on uninvolved projects i don't have a concern | 18:05 |
| @clarkb:matrix.org | I guess zuul can filter verified votes by voter so maybe that is the trick to getting the triggers right | 18:06 |
| @clarkb:matrix.org | Which would allow you to overload the verified label | 18:06 |
| @fungicide:matrix.org | and for that matter, the default status view these days hides pipelines that aren't running any jobs at the time, so odds are people would almost never notice except when the projects participating in the experiment have changes actively running jobs | 18:06 |
| @mordred:waterwanders.com | yah- and since all my repos are in the opendev tenant which has a pretty low user base right now, it would be _super_ low exposure | 18:08 |
| @fungicide:matrix.org | so all that's to say, i can see a few ways it could work; whether i'm okay with the idea would depend on the details of course | 18:08 |
| @fungicide:matrix.org | but seems like we can come up with something | 18:08 |
| @clarkb:matrix.org | mordred: re the thesis this is the thing I've been trying to explain a few times to others. We built these tools for a couple thousand largely independent developers to go crazy making OpenStack. That doesn't look too different than a bunch of agents going crazy on a shared code base | 18:09 |
| @mordred:waterwanders.com | ++ cool. I can work on some patches to propose and we can discuss whether they are terrible or not | 18:09 |
| @mordred:waterwanders.com | Clark: EXACTLY | 18:09 |
| @mordred:waterwanders.com | like, this is not a new problem for us | 18:09 |
| @mordred:waterwanders.com | it's the places that have adopted the "you should build all tools assuming high trust for the devs involved" that are in a bad spot currently | 18:10 |
| @fungicide:matrix.org | that's the theme of the talk i've proposed for all things open, as well | 18:10 |
| @mordred:waterwanders.com | let's be honest - our tools were built with the assumption that *I* would be writing code :) | 18:10 |
| @fungicide:matrix.org | so interested to have this as an anecdote | 18:10 |
| @fungicide:matrix.org | mordred: i think some of our tools still have inline comments that say `# TODO(morderd): actually make this work` | 18:11 |
| @clarkb:matrix.org | I dunno I learned pretty quickly that you just push TODOs to prompt someone else to write the actual code. Hey that isn't much different either :) | 18:11 |
| @mordred:waterwanders.com | I'm pondering a talk for shanghai and the lf event in slc in November. haven't gotten all the way to submitting anything yet | 18:11 |
| @clarkb:matrix.org | fungi: I found one today in the gitea dockerfile even | 18:11 |
| @mordred:waterwanders.com | hahahaha | 18:11 |
| @fungicide:matrix.org | mordred: the only reason i already put in a proposal for ato is that their cfp was closing and it's the closest open source conference to me geographically. they also have an "all things ai" conference they started a few years ago, but the 2026 one was last month so won't be happening again until next march | 18:12 |
| @clarkb:matrix.org | Something to do with how files are copied not working for us because we clone the repo so we skip a step gitea does and the todo says we should fix that if we ever figure it out | 18:12 |
| @mordred:waterwanders.com | fungi: yeah - unfortunately CFPs are so far ahead. I wasn't doing any of this when the CFP for the next LF event was still open :) | 18:13 |
| @mordred:waterwanders.com | Clark: see - now that TODO makes me want to go in and use bind-mount copy | 18:14 |
| @fungicide:matrix.org | to be clear, so far the talk consists of about 5 sentences of abstract. i have faith that by the time october rolls around (if it gets accepted) i'll have something to put onto slides for it | 18:15 |
| @mordred:waterwanders.com | I'll definitely share whatever I come up with too | 18:15 |
| @mordred:waterwanders.com | I have a good amount of anecdata so far | 18:16 |
| @clarkb:matrix.org | For those that don't know, one of my very first interactions with Monty was him saying the python bindings for libdrizzle were broken and someone should fix them. Guess who fixed them? This was at least a couple years before OpenStack too | 18:17 |
| @mordred:waterwanders.com | nerdsniping is my favorite method of coding | 18:17 |
| @clarkb:matrix.org | mordred: it honestly feels like a layer 8 problem more than anything else. Between the foundation policy, DCO, llm service ToS, and costs it seems like there are a bunch of hoops to jump through to do anything officially. But the tools themselves should largely plug together and support the workflow | 18:19 |
| @fungicide:matrix.org | nerdsniping was vibecoding with a mechanical turk | 18:19 |
| @clarkb:matrix.org | But maybe if you unofficially nerdsnipe yourself into proving that the tools are ready then the layer 8 problems become easier to work through? "Look this could be possible" | 18:20 |
| @mordred:waterwanders.com | yeah. luckily many of those layer 8 problems don't exist on my personal projects. no foundation policy or dco to manage before proving. and ... we're big enough I did not really have to explain how gerrit or zuul worked, although i did make those gerrit and zuul plugins, but those were optimizations | 18:27 |
| @clarkb:matrix.org | the etherpad 2.7.0 change passes testing and the two screenshots look good. I guess the next step there is holding a node and testing it. But that isn't urgent as we shouldn't upgrade during the PTG and I'm in meetings next week too | 19:03 |
| @shrews:matrix.org | wow. didn't expect to see a libdrizzle reference in matrix today | 19:29 |
| @fungicide:matrix.org | nobody expects the spanish inquisition | 19:29 |
| -@gerrit:opendev.org- Zuul merged on behalf of Clark Boylan: [opendev/zuul-providers] 985847: Change the intermediate swift storage container https://review.opendev.org/c/opendev/zuul-providers/+/985847 | 19:52 | |
| @jim:acmegating.com | opendev tenant has no clean check requirement (and i will continue to advocate against it!) so i think that would be a fine place to put a new pipeline if we're okay with that. | 19:55 |
| @fungicide:matrix.org | yep, that's what i was thinking as well | 19:56 |
| @clarkb:matrix.org | I guess we should recheck an image build change if someone hasn't already and see if the new container works? | 20:02 |
| @clarkb:matrix.org | Someone beat me to it according to zuul's status page | 20:04 |
| @jim:acmegating.com | was me | 20:05 |
| -@gerrit:opendev.org- Clark Boylan proposed: | 20:51 | |
| - [opendev/system-config] 985834: Update Gitea to 1.26.0 https://review.opendev.org/c/opendev/system-config/+/985834 | ||
| - [opendev/system-config] 985877: Fix gitea selenium screenshots https://review.opendev.org/c/opendev/system-config/+/985877 | ||
| @clarkb:matrix.org | I don't actually know if that name change will work as I'm not certain the selenium installation will be able to find names in /etc/hosts as I think it may run in a container? | 20:54 |
| @clarkb:matrix.org | might need to bind mount in /etc/hosts and hope that doesn't break other things? it is using network mode host so maybe this will just work though | 20:56 |
| @clarkb:matrix.org | corvus: looks like the zuul-providers job failures are now due to running out of disk during the compression step? | 21:00 |
| @clarkb:matrix.org | so that is a different failure to the swift 404 from before. Not sure if this is before or after that step would've been previously though | 21:00 |
| @fungicide:matrix.org | we were also running out of disk on those before, so i guess the swift errors were masking that we still hadn't completely solved the disk usage? | 21:02 |
| @clarkb:matrix.org | ya I'm not sure which direction we may have been masking before | 21:02 |
| @clarkb:matrix.org | and I have to pop out for the school run in about 2 minutes so not in a good spot to debug further at the moment | 21:03 |
| @jim:acmegating.com | well we were only running out of disk on the resolute build, and only barely; so i'm surprised if we're now running out of space in general | 21:49 |
| @jim:acmegating.com | looks like that happened once for a resolute build, and once for a rocky build? | 21:50 |
| @clarkb:matrix.org | Ya I think those were the two I saw when I checked earlier | 21:50 |
| @jim:acmegating.com | most of the builds in the first change passed | 21:50 |
| @jim:acmegating.com | like, all the others so far | 21:50 |
| @clarkb:matrix.org | Ah ok so maybe it's not a consistent enough issue or it's specific to those builds | 21:51 |
| @jim:acmegating.com | the two that failed were on ovh | 21:52 |
| @fungicide:matrix.org | which i think usually has a lot of available rootfs? | 21:54 |
| @jim:acmegating.com | one that passed was on ovh https://zuul.opendev.org/t/opendev/build/7c19f664095b486d90578d6c3e2f90ef/log/zuul-info/inventory.yaml | 21:55 |
| -@gerrit:opendev.org- Monty Taylor https://matrix.to/#/@mordred:inaugust.com proposed: [opendev/project-config] 985898: Add vouched pipeline to OpenDev tenant https://review.opendev.org/c/opendev/project-config/+/985898 | 21:55 | |
| @mordred:waterwanders.com | naming is impossible - but how does that look/sound? | 21:56 |
| @fungicide:matrix.org | wfm, i'm trying to do better about sticking to my personal policy not to get into bikeshed color debates | 21:56 |
| @mordred:waterwanders.com | RED | 21:56 |
| @fungicide:matrix.org | looking at the successful run on ovh, its rootfs is 80gb (79928754176 bytes reported by df) | 21:58 |
| @jim:acmegating.com | does it only have one fs? (no separate /opt) | 21:59 |
| @fungicide:matrix.org | yes | 21:59 |
| @clarkb:matrix.org | I wonder if the number of successes is enough to just land that change as an immediate first step since we think it may make things generally more reliable | 21:59 |
| @fungicide:matrix.org | so /opt is shared with / | 21:59 |
| @jim:acmegating.com | Clark: could -- or we could squash the other disk-saving change | 22:00 |
| @jim:acmegating.com | each one gets us 10GB | 22:00 |
| @fungicide:matrix.org | looking at the dstat report for the successful job, the highest i see usage go on / is 66gb | 22:00 |
| @clarkb:matrix.org | corvus: oh ya maybe that is what we should do. Combine them to maximize savings | 22:01 |
| @fungicide:matrix.org | when zuul-info is collected at the start of the job, the rootfs has 13gb used and about 60gb free, fwiw | 22:01 |
| @jim:acmegating.com | yeah, i'm just stuck on thinking that we didn't think either of these was necessary before; wondering what changed | 22:01 |
| @jim:acmegating.com | maybe rockylinux is that much bigger? | 22:02 |
| @jim:acmegating.com | and maybe we were marginal before? perhaps it's even the case that we have previously seen rocky failures on ovh nodes, but that's the only failure condition, and it's rare? | 22:03 |
| @jim:acmegating.com | rocky failure: /dev/vda1 78055180 69604308 4845968 94% / | 22:03 |
| focal success: /dev/vda1 78055180 65657132 8793144 89% / | ||
| both on ovh | ||
| @jim:acmegating.com | that's from the "Collect disk usage info post image build" task | 22:04 |
| -@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: | 22:05 | |
| - [opendev/zuul-providers] 984866: Run vhd builds first and delete cache https://review.opendev.org/c/opendev/zuul-providers/+/984866 | ||
| - [opendev/zuul-providers] 982182: Add Ubuntu resolute image build job https://review.opendev.org/c/opendev/zuul-providers/+/982182 | ||
| @jim:acmegating.com | okay, omnibus space saving change followed by resolute change | 22:05 |
| @clarkb:matrix.org | It wouldn't surprise me if there are differences. | 22:06 |
| @jim:acmegating.com | 4GB is a big difference | 22:07 |
| @jim:acmegating.com | but it may not all be in the image, it could be in the build scaffolding | 22:07 |
| @clarkb:matrix.org | infra-root https://review.opendev.org/c/opendev/system-config/+/985877 this change does appear to work and fixes the gitea screenshots. Landing that one should be safe (I stacked the 1.26.0 upgrade on top of it so they can happen separately) | 22:38 |
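
For reference, the "4GB is a big difference" remark at 22:07 can be checked against the two `df` lines quoted at 22:03. The sketch below is only an illustration, not part of the conversation: it assumes the `df` columns are in 1K blocks (the GNU `df` default, which the log does not state explicitly), and the helper names `parse_df_line` and `kib_to_gib` are made up for this example.

```python
# The two `df` rows quoted in the log at 22:03, both from ovh nodes.
# Assumption: sizes are 1K blocks (the log does not say which unit df used).
DF_LINES = {
    "rocky failure": "/dev/vda1 78055180 69604308 4845968 94% /",
    "focal success": "/dev/vda1 78055180 65657132 8793144 89% /",
}


def parse_df_line(line):
    """Split one df row into integer block counts (hypothetical helper)."""
    _device, total, used, avail, _percent, _mount = line.split()
    return {"total": int(total), "used": int(used), "avail": int(avail)}


def kib_to_gib(kib):
    """Convert a count of 1 KiB blocks to GiB."""
    return kib / (1024 * 1024)


rocky = parse_df_line(DF_LINES["rocky failure"])
focal = parse_df_line(DF_LINES["focal success"])

# How much more of the root filesystem the rocky build had consumed.
extra_used_kib = rocky["used"] - focal["used"]
extra_used_bytes = extra_used_kib * 1024

print(f"extra space used by the rocky build: {kib_to_gib(extra_used_kib):.2f} GiB")
print(f"same figure in decimal gigabytes: {extra_used_bytes / 1e9:.2f} GB")
```

Under that unit assumption the gap works out to a little under 4 GiB (a little over 4 GB in decimal units), consistent with the figure mentioned in the discussion. It says nothing about whether the extra usage comes from the image itself or from the build scaffolding.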