| *** liuxie is now known as liushy | 05:58 | |
| dtantsur | Hey folks, how do I figure out why a post job never ran on https://review.opendev.org/c/openstack/ironic-python-agent/+/960732? | 09:31 |
|---|---|---|
| dtantsur | Also https://zuul.opendev.org/t/openstack/builds?project=*ironic-python-agent&branch=stable%2F2025.2&skip=0 just hangs even though we've had patches merged there.. | 09:34 |
| tonyb | dtantsur: Looking | 09:38 |
| tonyb | The hanging is something corvus will need to look at https://zuul.opendev.org/t/openstack/builds?project=*ironic-python-agent*&branch=stable%2F2025.2&skip=0 works (note the additional * in the project name) but adding more filters causes the same lack of results | 09:40 |
| tonyb | dtantsur: for the post pipeline we need a SHA rather than a change number; do you have that to hand? | 09:42 |
| tonyb | Oh I see it | 09:42 |
| tonyb | the job ran: https://zuul.opendev.org/t/openstack/build/24b5390629844f95aca887886b9314f6 | 09:43 |
| tonyb | dtantsur: what makes you think it didn't? | 09:43 |
| dtantsur | tonyb: https://zuul.opendev.org/t/openstack/builds?job_name=ironic-python-agent-build-image-dib-centos9 | 09:46 |
| dtantsur | not a single mention of 2025.2 despite two patches merging there | 09:46 |
| dtantsur | it kept running on master after the merge date of https://review.opendev.org/c/openstack/ironic-python-agent/+/960732 | 09:47 |
| dtantsur | so I'm quite puzzled | 09:47 |
| tonyb | Okay so 960732 did merge and triggered the post pipeline, but that specific job, and perhaps others, didn't | 09:50 |
| dtantsur | Could it be caused by the fact that the job is defined in another project? | 09:51 |
| tonyb | dtantsur: No I don't think so. Zuul knows that job applies on that branch | 09:52 |
| tonyb | https://zuul.opendev.org/t/openstack/job/ironic-python-agent-build-image-dib-centos9 | 09:52 |
| tonyb | that's the right job isn't it? | 09:53 |
| dtantsur | Could it be that stable/2025.2 on IPA-builder (where the job is defined) was created after the IPA change merged? | 09:53 |
| dtantsur | yes, that's the one | 09:53 |
| tonyb | That's possible | 09:54 |
| tonyb | Trying to think how to verify that | 09:54 |
| dtantsur | We could merge a dummy patch to 2025.2 :) | 09:55 |
| tonyb | Well yes you could :) I was trying to think how to verify the hypothesis at hand. | 09:56 |
| tonyb | I *think* I can re-enqueue that job, but again that doesn't verify why it didn't run | 09:57 |
| dtantsur | tonyb: could you re-enqueue it nonetheless please? Ironic 2025.2 is now broken because it expects the artifacts to be generated already. | 09:57 |
| tonyb | sure | 09:58 |
| tonyb | dtantsur: I haven't done this before gimme 5 | 10:03 |
| tonyb | if I can't figure it out you may want to merge a dummy change | 10:04 |
| dtantsur | of course | 10:04 |
| tonyb | https://zuul.opendev.org/t/openstack/status?change=960732%2C1&pipeline=post | 10:07 |
| tonyb | Running now | 10:08 |
| tonyb | Oh yeah the branch creation for IPA-Builder was a week after IPA, that's almost certainly the root cause. | 10:12 |
| dtantsur | I see, so unfortunate timing.. | 10:21 |
| dtantsur | succeeded, nice! thank you again tonyb | 10:22 |
| tonyb | I think so. We can ask elodilles if there is any easy way to ensure that IPA-Builder is tagged before IPA (without looking at the associated release models) | 10:22 |
| tonyb | dtantsur: glad I could help | 10:22 |
| elodilles | tonyb dtantsur : i'm not aware of any easy way to force tagging order :/ both projects have type 'other' and the release patches are either created by the team at some stage around/after Milestone-3, or by the release team with 'missing-releases', so there isn't really any automation for this | 11:44 |
| tonyb | elodilles: That's more or less what I suspected. | 12:03 |
| fungi | would adding an inline comment to the deliverable file in the release repo work as a reminder maybe? | 13:25 |
| fungi | though maybe that's only helpful if people are manually editing to propose releases, not if scripts are doing bulk release additions | 13:25 |
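A rough sketch of the inline-comment idea, assuming the usual openstack/releases deliverable layout; the path, field values, version, and hash below are placeholders rather than anything taken from the real file:

```yaml
# deliverables/2025.2/ironic-python-agent-builder.yaml (abbreviated, hypothetical)
launchpad: ironic-python-agent-builder
team: ironic
type: other
# NOTE: tag/branch ironic-python-agent-builder BEFORE ironic-python-agent,
# otherwise IPA's first post-pipeline run on the new stable branch will not
# find builder jobs for that branch (see the stable/2025.2 incident above).
releases:
  - version: 0.0.0                                       # placeholder
    projects:
      - repo: openstack/ironic-python-agent-builder
        hash: 0000000000000000000000000000000000000000   # placeholder
```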
| *** sfernand_ is now known as sfernand | 13:28 | |
| tonyb | fungi: Can't hurt. | 13:29 |
| clarkb | elodilles: you could use depends-on then approve them one at a time. (I think if approved at the same time then the release tagging process can go in either order) | 14:46 |
| fungi | yeah, the challenge is remembering that doing so is necessary when bulk release candidate tagging is being done | 14:48 |
| elodilles | yepp | 14:48 |
| fungi | i'm heading out to an early lunch, won't be long, back soon | 14:48 |
| elodilles | enjoy your food o/ | 14:50 |
| clarkb | fungi: when you get back https://review.opendev.org/c/opendev/zuul-providers/+/962233 https://review.opendev.org/c/opendev/zuul-providers/+/962235 and https://review.opendev.org/c/opendev/system-config/+/962237 are the followups to the raxflex double nic saga | 15:02 |
| clarkb | I think if we want to land those today I should be able to check the instances that get booted afterwards | 15:02 |
| opendevreview | Clark Boylan proposed opendev/system-config master: Remove ze11 from our inventory and management https://review.opendev.org/c/opendev/system-config/+/962371 | 15:21 |
| opendevreview | Michal Nasiadka proposed openstack/diskimage-builder master: almalinux-container: Add support for building 10 https://review.opendev.org/c/openstack/diskimage-builder/+/960336 | 15:49 |
| opendevreview | Merged opendev/zuul-providers master: Fix test sjc3 region flavors https://review.opendev.org/c/opendev/zuul-providers/+/962233 | 16:05 |
| fungi | all three of those lgtm | 16:05 |
| clarkb | thanks. the last one won't actually apply until the launchers get restarted which I may just let the weekly updates do in the next 24 hours | 16:06 |
| clarkb | but I can monitor instance boots after the second lands to ensure they still have one nic | 16:06 |
| opendevreview | Merged opendev/zuul-providers master: Revert "Remove raxflex networks config" https://review.opendev.org/c/opendev/zuul-providers/+/962235 | 16:06 |
| clarkb | ok instances booted after ~now | 16:06 |
| clarkb | every single node in sjc3 right now has a hostname that ends with 4 | 16:09 |
| clarkb | wonder what the odds are of that happening | 16:09 |
| clarkb | I have not seen any double nics yet in sjc3. I'll check the other regions now | 16:10 |
| clarkb | fungi: ok I think DFW3 has double nics again. I'm so confused | 16:11 |
| clarkb | fungi: I workflow -1'd https://review.opendev.org/c/opendev/system-config/+/962237 while I try to make sense of this | 16:11 |
| fungi | okay | 16:13 |
| fungi | were the launchers restarted onto the fix yet? | 16:13 |
| clarkb | fungi: yes I did that wednesday | 16:14 |
| fungi | that's what i thought. so strange | 16:14 |
| clarkb | ok some of sjc3's nodes are double nic now and some are not. Maybe one of the launchers didn't update and the other did (though I pulled new images on both and restarted both on Wednesday) | 16:14 |
| clarkb | they both claim to be running the same image | 16:15 |
| fungi | even stranger | 16:15 |
| clarkb | and were restarted ~42 and ~41 hours ago | 16:15 |
| clarkb | I think the best thing here may be to revert the revert (ugh) and see if corvus has any further insight when he is able to look closer. I haven't been following the configuration inheritance changes super closely so not sure how all the bits may interact here | 16:17 |
| fungi | the components page shows them both running on the same version | 16:18 |
| opendevreview | Clark Boylan proposed opendev/zuul-providers master: Reapply "Remove raxflex networks config" https://review.opendev.org/c/opendev/zuul-providers/+/962379 | 16:18 |
| fungi | and on a newer version than everything else, as expected | 16:18 |
| clarkb | the first change to fix flavors shouldn't be related to this and can stay | 16:18 |
| fungi | yep | 16:18 |
| opendevreview | Merged opendev/zuul-providers master: Reapply "Remove raxflex networks config" https://review.opendev.org/c/opendev/zuul-providers/+/962379 | 16:19 |
| clarkb | I'm wondering if there is cached config in zookeeper maybe | 16:19 |
| fungi | oh, and it's causing the fix not to take effect because the set of networks is a subset of what was cached? | 16:20 |
| fungi | though if that were the case, why did we stop getting the extra network when we dropped the configuration? | 16:20 |
| fungi | since the empty set is still a subset of the earlier set | 16:21 |
| clarkb | good point. We did restart when we put the config back in clouds.yaml but I think we went straight to the "multiple networks found" error when removing it last time without config in clouds.yaml | 16:21 |
| clarkb | so ya I'm not sure. Maybe the fix was incomplete. It did get some new test case updates but possible that didn't fully capture what is happening? | 16:22 |
| fungi | yeah, something's not adding up... but also it's friday afternoon so lots of things are no longer adding up for me at this point in the week | 16:22 |
| clarkb | I'm starting to see single nic instances again so ya updating the config in this direction did work | 16:23 |
| clarkb | ok I'm reasonably confident we're back to "normal" now and we can pick this up again when we're able to more closely debug why it isn't working | 16:35 |
| opendevreview | Merged opendev/system-config master: Remove ze11 from our inventory and management https://review.opendev.org/c/opendev/system-config/+/962371 | 16:41 |
| clarkb | oh cool once that is done deploying I guess I'll look into fully deleting ze11 (and I think it has a volume attached too that can be deleted) | 16:42 |
| clarkb | then I can push up a dns cleanup change | 16:42 |
| stephenfin | What component is responsible for determining "are there sufficient resources in the underlying cloud to run this job"? Is it zuul or nodepool? My google-fu is failing me and I vaguely recall reading something recently about nodepool being merged into zuul (but I may have imagined that) | 16:43 |
| clarkb | stephenfin: in opendev it is 'zuul-launcher', but most (all?) other zuul deployments will still be using nodepool | 16:43 |
| clarkb | stephenfin: opendev is acting as the beta tester for zuul launcher | 16:43 |
| stephenfin | Okay, so likely still nodepool in the RDO CI. zuul-launcher gives me something new to google though | 16:44 |
| fungi | stephenfin: i'll save google the work: https://zuul-ci.org/docs/zuul/latest/developer/specs/nodepool-in-zuul.html | 16:45 |
| stephenfin | and one other question: are there any jobs where resources are allocated from the underlying "CI cloud" (RAX or whatever) as part of the job itself, rather than by nodepool/zuul-launcher? | 16:45 |
| stephenfin | fungi++ thanks :) | 16:45 |
| clarkb | stephenfin: not in opendev, but if you had credentials in your jobs to do that I don't think anything zuul does would prevent it | 16:45 |
| fungi | scs has a zuul that does what you're describing, i thino | 16:46 |
| fungi | think | 16:46 |
| fungi | (sovereign cloud stack) | 16:46 |
| stephenfin | Drat. I was hoping there would be, and I was going to ask how you were managing resources when Zuul wasn't the sole owner of them | 16:47 |
| clarkb | stephenfin: within opendev what typically happens is we deploy a nested cloud or kubernetes and test against that | 16:47 |
| clarkb | a lot of jobs run devstack not to test openstack changes but to test integration with openstack APIs. Similar with k8s | 16:47 |
| fungi | oh, zuul sharing resource pools/quota is orthogonal to jobs spinning up their own cloud resources, though i can see how they could be related | 16:48 |
| stephenfin | Yeah, orthogonal but related. Something needs to be done to e.g. stop Zuul filling the cloud up and starving jobs of the resources they need | 16:51 |
| clarkb | nodepool will respect quotas | 16:51 |
| stephenfin | I figure projects + quota, host aggregates, or separate (potentially nested) clouds, but was looking for prior art | 16:51 |
| stephenfin | clarkb: The context is that we're running OpenShift jobs via Prow against a cloud that is also being used by RDO Zuul. We have quotas, but currently the sum of quotas for all projects exceeds what's actually available on the cloud | 16:52 |
| clarkb | ah yes openshift... I have strong opinions and well the openshift 3 -> 4 transition made testing openshift extremely painful | 16:53 |
| clarkb | it was once possible to deploy openshift within your zuul job in a straightforward manner as you could run openshift 3 atop preallocated resources | 16:54 |
| opendevreview | Michal Nasiadka proposed openstack/diskimage-builder master: almalinux-container: Add support for building 10 https://review.opendev.org/c/openstack/diskimage-builder/+/960336 | 16:54 |
| stephenfin | Private clouds cost money and lowering quotas means the cloud could go underutilised if some of the projects are quiet. So I'm guessing something needs to mediate. But again, I was hoping there might be prior art | 16:54 |
| clarkb | but then 4 switched to a more entire-rack management model and now it's very difficult to deploy unless you hand it cloud credentials or ipmi details | 16:54 |
| clarkb | stephenfin: one option may be to have nodes in your zuul nodesets that are only there to grab the quota/resources then have your openshift deployment take over from zuul for them | 16:54 |
| stephenfin | I will hold my tongue on testability of OpenShift in general, but needless to say, I miss devstack | 16:54 |
| clarkb | I don't know if that would work as I've never tried it. But you may be able to do something along those lines and that would solve the underutilization problem as zuul would mediate all the quota usage | 16:55 |
| stephenfin | Hmm, that's not a bad idea | 16:55 |
| clarkb | you might need to put the nodes in a group that zuul ignores or something like that so that the zuul setup and teardown steps don't trip over the external management | 16:56 |
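A minimal sketch of that idea as Zuul configuration, with invented label and group names; whether an external installer can cleanly take over nodes held this way is untested:

```yaml
# Hypothetical nodeset: the worker nodes exist largely so Zuul's quota
# accounting reserves their capacity; job playbooks would skip the
# "externally-managed" group and let the OpenShift installer own those hosts.
- nodeset:
    name: openshift-capacity
    nodes:
      - name: controller
        label: ubuntu-noble
      - name: worker-0
        label: openshift-worker    # invented label
      - name: worker-1
        label: openshift-worker
    groups:
      - name: externally-managed
        nodes:
          - worker-0
          - worker-1
```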
| fungi | nodepool (and zuul-launcher) can be told a maximum node count which you could manually set on the provider below the actual quota, if you don't want zuul using more than that amount of the capacity. as clarkb noted nodepool and zuul-launcher are capable of monitoring quota usage reported by the cloud even when it's not the only thing using it | 16:57 |
| fungi | so it shouldn't try to use more than what's remaining available even if it's sharing that resource pool | 16:58 |
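For reference, the cap fungi describes is a single pool setting in nodepool's OpenStack driver config; this is a trimmed, hypothetical provider stanza, not anyone's real deployment:

```yaml
# Hypothetical nodepool provider: max-servers sits below the tenant quota so
# the other consumer of the shared project keeps headroom, while nodepool
# still checks the quota the cloud reports before each launch.
providers:
  - name: shared-cloud
    cloud: shared-cloud
    pools:
      - name: main
        max-servers: 40                  # deliberately under the real quota
        labels:
          - name: ubuntu-noble
            flavor-name: ci-standard     # invented flavor name
            diskimage: ubuntu-noble
```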
| clarkb | fwiw complex systems should consider "how do we test this" as a major part of design input. That includes openstack fwiw | 16:59 |
| clarkb | I think trove mightily struggled with the whole "run a nested virt database just to test the management pieces" approach, none of which was actually necessary to test the management piece | 17:01 |
| clarkb | and that pattern has been a common issue within openstack. Today it is less of an issue as nested virt is more reliable and available. | 17:01 |
| fungi | a delicate balance between "test the software the way users will be running it" and making use of the available test resources | 17:05 |
| stephenfin | Yeah, I wouldn't put any blame at the feet of zuul here. boskos (the resource manager used by k8s prow) doesn't have any insight into OpenStack resource utilisation so the best it can do is manage a static number of slices (one consumed per job) | 17:06 |
| stephenfin | Even if it did, there's no mechanism in OpenStack (at least without something like blazar) to reserve resources without consuming them. And we can't consume them early in the job since the job itself is testing the creation of those resources | 17:07 |
| stephenfin | So yeah, I suspect I'll need some level of communication between prow and zuul so one of them can mediate | 17:08 |
| stephenfin | Or projects + quota, host aggregates, or separate (potentially nested) clouds | 17:08 |
| fungi | maybe prow can be improved upstream in order to become as smart as zuul/nodepool | 17:10 |
| fungi | becoming quota-aware is a means of communication, quota usage for various resources is essentially a communication channel for all systems using the available quota | 17:11 |
| clarkb | the deploy buildset for 962371 failed, but not in a way that prevents me from removing the server I don't think | 17:13 |
| clarkb | hrm actually it didn't run the infra-prod-service-zuul job which is the one I wanted it to run to pick up firewall updates | 17:14 |
| clarkb | as soon as we delete the node its IP can be recycled so we want to ensure our firewall rules are updated before deleting the node | 17:14 |
| clarkb | the hourlies do run infra-prod-service-zuul so this may be fine anyway | 17:15 |
| clarkb | https://zuul.opendev.org/t/openstack/build/6da769919f4a47c9b39eab8bec36462d I think this may have updated firewall rules | 17:15 |
| clarkb | ya I think we are good. Zookeeper doesn't show firewall rules for ze11 anymore | 17:17 |
| clarkb | specifically zk01 | 17:17 |
| clarkb | fungi: ^ let me know if you have any concerns with clearing out that server at this point | 17:19 |
| clarkb | stephenfin: fungi: theoretically nothing prevents you from just running the entire CI system from zuul too | 17:20 |
| clarkb | I assume the biggest issue there is simply users not wanting to go into unfamiliar waters | 17:20 |
| stephenfin | clarkb: Except the fact that all of OpenShift uses Prow, and by diverging we lose the ability to integrate and report into their "pipelines" | 17:21 |
| fungi | i think he means using zuul to run prow | 17:22 |
| opendevreview | Clark Boylan proposed opendev/zone-opendev.org master: Remove the ze11 DNS records https://review.opendev.org/c/opendev/zone-opendev.org/+/962385 | 17:22 |
| fungi | so prow could still report things | 17:22 |
| stephenfin | Same issue though, if it's a different prow | 17:22 |
| fungi | clarkb: clearing out as in deleting it from the cloud provider? seems fine, it's out of the components list in zuul which should be all that matters now | 17:22 |
| clarkb | fungi: yup deleting the server and its cinder volume from the cloud provider. Once I'm done doing that I can land 962385 | 17:23 |
| clarkb | I'll start looking at cleaning it out shortly | 17:24 |
| clarkb | stephenfin: ya I guess these are administrative issues. Openshift could just use zuul is what I was trying to get at | 17:24 |
| stephenfin | If only ❤️ | 17:24 |
| stephenfin | The NIH is real | 17:25 |
| fungi | well, zuul could run *the same* prow for everything | 17:25 |
| fungi | and then it's not *a different* prow | 17:25 |
| clarkb | there is no ze11 cinder volume. It uses the ephemeral disk | 17:26 |
| clarkb | so I would just be deleting the one server instance named ze11.opendev.org | 17:26 |
| clarkb | if sticking with Prow making it quota aware seems like the most correct thing | 17:28 |
| clarkb | the two systems will compete at times similar to what it was like for us to run zuul-launcher and nodepool at the same time. but it is workable as we proved | 17:28 |
| fungi | yeah, from a simple mathematical perspective i don't know of any "cooperative fair use" algorithm they could employ to avoid that | 17:30 |
| clarkb | fungi: so this deployment issue has affected us since september 12 | 17:31 |
| clarkb | the borg-backup job is failing consistently. I think we need to dig into this | 17:31 |
| clarkb | as it effectively means our daily catch up/enforcement runs are not running for half the things | 17:31 |
| fungi | from a xkcd.com/927 perspective, what is needed is a new scheduler that coordinates the work of both zuul and prow, if one isn't going to coordinate the other | 17:32 |
| clarkb | The task includes an option with an undefined variable. The error was: 'ansible.vars.hostvars.HostVarsVars object' has no attribute 'borg_user'. 'ansible.vars.hostvars.HostVarsVars object' has no attribute 'borg_user' | 17:32 |
| clarkb | The error appears to be in '/home/zuul/src/opendev.org/opendev/system-config/playbooks/roles/borg-backup-server/tasks/main.yaml': line 57, column 3, but may be elsewhere in the file depending on the exact syntax problem. | 17:33 |
| fungi | huh, that's a new one on me, looking... | 17:33 |
| fungi | did we leave borg_user out of an inventory group somewhere? | 17:33 |
| fungi | or host | 17:33 |
| clarkb | I wonder if we have a server in the emergency file list that prevents it from populating that borg_user variable | 17:33 |
| clarkb | fungi: we auto generate it from the hostname by default | 17:34 |
| clarkb | I wish a million times that ansible would list the iteration items | 17:34 |
| fungi | clarkb: could it be the ze11 entry? otherwise there's nothing new in there | 17:34 |
| clarkb | I don't think ze11 does backups | 17:34 |
| fungi | everything else in the emergency list has been there for months, so i don't guess it's there | 17:35 |
| clarkb | ack | 17:36 |
| clarkb | looks like each backup host has processed users borg-etherpad02 borg-gitea09 borg-review03 borg-zuul01 and borg-zuul02 then whatever the next list item is breaks | 17:36 |
| clarkb | fungi: I think it is kdc03 | 17:37 |
| fungi | huh... | 17:37 |
| clarkb | that task loops over the borg-backup group of servers and if you look at https://opendev.org/opendev/system-config/src/branch/master/inventory/service/groups.yaml#L24-L45 the next one listed is kdc03 and it seems to go in order | 17:37 |
| fungi | it's been up 14 days, which matches the september 12 timeframe | 17:38 |
| fungi | maybe upgrades broke auth? | 17:38 |
| clarkb | fungi: https://opendev.org/opendev/system-config/src/branch/master/playbooks/roles/borg-backup/tasks/main.yaml#L1-L4 | 17:38 |
| clarkb | this is where we should autogenerate the name | 17:39 |
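Roughly, the pattern being described (simplified here, not the literal system-config tasks) is a per-host default plus a loop on the backup server that dereferences it through hostvars, which is why a single host missing the fact breaks the whole task:

```yaml
# On each backed-up host: default the borg username from the short hostname
# unless a variable already provides one.
- name: Set default borg username
  set_fact:
    borg_user: "borg-{{ inventory_hostname.split('.')[0] }}"
  when: borg_user is not defined

# On the backup server: create one account per member of the borg-backup
# group. hostvars[item].borg_user raises the "has no attribute 'borg_user'"
# error seen above if any member never ran the task that sets it.
- name: Create per-host backup users
  user:
    name: "{{ hostvars[item].borg_user }}"
    shell: /bin/bash
  loop: "{{ groups['borg-backup'] }}"
```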
| clarkb | fungi: well I think we're able to run the base playbook against kdc03. Its more like that specific fact isn't being set for some reason | 17:39 |
| fungi | i did a `sudo rm /var/cache/ansible/facts/kdc0{3,4}.openstack.org` after upgrading those servers | 17:40 |
| clarkb | ya which we thought we needed to clear out bad facts | 17:40 |
| clarkb | but maybe that is the genesis if we're not regenerating it properly | 17:40 |
| fungi | i also did the same to the afs server fact caches though | 17:41 |
| clarkb | fungi: https://opendev.org/opendev/system-config/src/branch/master/playbooks/service-borg-backup.yaml#L11-L24 this is where we should generate that info for each server then use it in that order | 17:41 |
| clarkb | fungi: I don't think we backup the afs servers | 17:41 |
| clarkb | fungi: I see it | 17:44 |
| frickler | stephenfin: what we (osism, not scs) do for testing our deployment tooling is having static nodes each with their own cloud credentials. we still need core to set an extra flag to run these in order to avoid leaking credentials, but this way we can still run in check and are not restricted to post only like with zuul secrets | 17:44 |
| clarkb | fungi: in service-borg-backup we're failing to install borg backup before we get to the part where we set the borg_user tuple value from borg_username and the ssh key value | 17:45 |
| clarkb | fungi: I don't know why execution doesn't stop there and why it continues to the backup servers and then breaks harder there. But ModuleNotFoundError: No module named 'pip' is the error | 17:45 |
| stephenfin | frickler: Does that mean you have wholly separate clouds? | 17:45 |
| stephenfin | i.e. nested clouds | 17:45 |
| stephenfin | or just different tenants on the same shared cloud? | 17:46 |
| frickler | stephenfin: different tenants on the same cloud. the overall capacity is large enough, since it is not only used for our CI, so global resource count is not an issue | 17:46 |
| fungi | though worth noting that opendev has two tenants/projects in every cloud provider, in our case so that we can isolate zuul test node usage from other more permanent resources | 17:47 |
| clarkb | fungi: I think the issue is that it's trying to use the venv from the old deployment (it's python3.10, not python3.12, from what I can tell) | 17:48 |
| clarkb | fungi: I thought when you were trying to fix up kdc03 that it got moved aside, but maybe I misremembered? | 17:48 |
| fungi | oh that would do it, we should blow away the venv | 17:48 |
| clarkb | fungi: ya I think we need to clear out or move that venv then we can reenqueue the deploy job for my earlier change to get ahead of the daily runs later today and check things are happy | 17:49 |
| fungi | i can delete it, we don't keep any state in that tree right? | 17:49 |
| clarkb | fungi: I don't think there is any state in the borg venv (borg will keep that in a .dir in /root somewhere iirc) | 17:49 |
| clarkb | I think /root/.cache/borg is the state carrying dir | 17:49 |
| clarkb | which we may also need to clear out I'm not sure | 17:49 |
| clarkb | but we can take this one step at a time | 17:50 |
| fungi | on august 20 i seem to have done a `mv /opt/borg{,.old}` which probably fixed it for jammy but didn't repeat it for noble | 17:50 |
| fungi | my bad | 17:50 |
| clarkb | aha that explains why I remembered this and why it's python3.10 and not python3.8 | 17:50 |
| clarkb | I think that root causes it | 17:50 |
| fungi | done | 17:51 |
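For context, the failure mode is generic to creates-guarded venv tasks; the sketch below is an assumption about the shape of the role, not the actual system-config code. Because the guard sees the pre-upgrade /opt/borg, the venv built against python3.10 is never rebuilt after the OS upgrade and pip inside it stops importing; moving the directory aside makes the guard miss and forces a fresh venv.

```yaml
# Sketch of a creates-guarded venv install: after an in-place OS upgrade the
# old /opt/borg still exists, so the venv step is skipped and the stale
# interpreter later fails with "No module named 'pip'".
- name: Create the borg virtualenv
  command: python3 -m venv /opt/borg
  args:
    creates: /opt/borg/bin/pip
- name: Install borgbackup into the venv
  pip:
    name: borgbackup
    virtualenv: /opt/borg
```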
| clarkb | do we want to reenqueue deploy jobs for https://review.opendev.org/c/opendev/system-config/+/962371 now? | 17:51 |
| fungi | sure, i can do that now | 17:51 |
| clarkb | I don't think anything has merged since that ran (in project-config or system-config) but we want to be careful about not going backwards | 17:51 |
| clarkb | fungi: thanks | 17:51 |
| clarkb | once we're happy with that I'll look at ze11 again and probably just go ahead and delete it (can't think of any reason not to at this point) | 17:52 |
| fungi | done | 17:52 |
| clarkb | fungi: probably want to check on kdc03 backups once this is all settled too (just to be sure they start running successfully with the new venv) | 17:53 |
| fungi | yep, will do | 17:53 |
| clarkb | infra-prod-service-borg-backup is running now | 18:04 |
| clarkb | it succeeded! | 18:08 |
| clarkb | hopefully this gets us back to happy daily runs and we just need to check kdc03 backups are happy after this | 18:08 |
| opendevreview | Michal Nasiadka proposed openstack/diskimage-builder master: almalinux-container: Add support for building 10 https://review.opendev.org/c/openstack/diskimage-builder/+/960336 | 18:26 |
| clarkb | down to the last job in that deployment buildset. If that comes back green I'll proceed with ze11 deletion at that point | 18:30 |
| clarkb | https://zuul.opendev.org/t/openstack/buildset/92d1e4bf21014455b9d1612cb9829262 success | 18:39 |
| clarkb | now to delete ze11 | 18:39 |
| clarkb | #status log Deleted ze11.opendev.org (8b5220c9-e3cf-4f28-a2f9-5eded6f963be) as it had network issues cloning git repos. It can be replaced later if we need the extra executor. | 18:42 |
| clarkb | fungi: do you want to review https://review.opendev.org/c/opendev/zone-opendev.org/+/962385 before I approve it? | 18:42 |
| clarkb | I've removed ze11 from the emergency file on bridge too | 18:43 |
| fungi | sorry, that one snuck past me, approved | 18:44 |
| clarkb | it's ok, I didn't want to approve it until I had deleted the server anyway so the timing works out | 18:45 |
| clarkb | thanks | 18:45 |
| opendevreview | Merged opendev/zone-opendev.org master: Remove the ze11 DNS records https://review.opendev.org/c/opendev/zone-opendev.org/+/962385 | 18:50 |
| clarkb | going to find lunch now then I have some changes to review when I get back | 18:53 |
| clarkb | fungi: speaking of backups https://review.opendev.org/c/opendev/system-config/+/961752 is where I ended up after pruning the vexxhost backup server less than a month after the prior prune | 20:38 |
| clarkb | we'll want to follow that up with some purges too but starting with the retirements made sense to me | 20:38 |
| clarkb | fungi: and then separately do we think we can safely land https://review.opendev.org/c/opendev/system-config/+/958666 now that your openafs cleanups have been completed? | 20:38 |
| fungi | yeah, thanks for the reminder, approved it now | 20:45 |
| clarkb | I'm just trying to clear things out of my backlog. Granted big items like gerrit upgrades and summit presentations are not getting done but the collection of small items I've accumulated is slowly decreasing in size | 20:49 |
| fungi | yep | 20:50 |
| opendevreview | Merged opendev/system-config master: Revert "reprepro: temporarily ignore undefinedtarget" https://review.opendev.org/c/opendev/system-config/+/958666 | 21:17 |
| opendevreview | Merged opendev/system-config master: Retire eavesdrop01 and refstack01 backups on the smaller backup server https://review.opendev.org/c/opendev/system-config/+/961752 | 21:17 |
| clarkb | the backups appear to be retired, disk usage hasn't dropped yet as that occurs during the next pruning | 21:38 |
| clarkb | I think we can either wait for normal pruning to be required and prune then, or if we wish to start purging some of these retired backups we can do a prune pass sooner than necessary then start purging | 21:40 |
| clarkb | https://mailarchive.ietf.org/arch/msg/ietf/q6A_anL1u-Y9iXe-vboiOYamsl0/ this may be useful to refer to in discussion around mailing lists | 21:54 |
| clarkb | I also found https://mailarchive.ietf.org/arch/msg/ietf/3tNNSIOc0BOO64Sm7xQLYmTwoco/ interesting in that same thread | 22:00 |
| fungi | yeah, on game development and modding forums i constantly see people talking about which games are playable on their phone, or what they're looking forward to playing when they get a computer | 22:05 |
| fungi | i think there are more and more people not using computers at all as phones have grown increasingly capable of performing many of the same tasks | 22:05 |
| fungi | and so the remaining tasks that phones can't do well become a hindrance | 22:06 |
| fungi | also the idea of being "offline" (and so any benefits of software that can deal with such an apocalyptic catastrophe) just makes no sense to them | 22:07 |
| clarkb | one of the things I need to do (which I should start shortly) is upgrade my router | 22:08 |
| clarkb | speaking of being offline | 22:08 |
| clarkb | I should just get that done with now | 22:09 |
| fungi | here's hoping you survive the darkness | 22:10 |
| clarkb | that was easy. I appreciate that opnsense's smaller more frequent updates seem to result in quicker upgrades | 22:22 |
| clarkb | fungi: the 22:00 UTC update for the debian mirror is running now and I think is the first check of the ignore missing removal | 22:23 |
| clarkb | I'm tailing the log now but I expect it to succeed | 22:24 |
| clarkb | reprepro just succeeded and it is working on the vos release | 22:26 |
| clarkb | 2025-09-26 22:26:22 | Done. | 22:26 |
| fungi | perfect | 22:26 |
| clarkb | I think this is working happily now | 22:26 |
| clarkb | https://www.mail-archive.com/nginx@nginx.org/msg25485.html related to the why mailing lists exist discussion | 22:32 |
| clarkb | tldr is nginx is going to stop using theirs September 30 | 22:32 |
| fungi | the python community semi-regularly gets long-timers complaining about how terrible discourse is compared to mailing lists/usenet | 22:34 |
| fungi | my personal experience is getting reprimanded by forum moderators for replying to a post that they had deleted. i was told that since i choose to follow the forum in "mailing list mode" i'm obligated to check the web interface for the forum before replying to make sure i don't violate the forum guidelines again | 22:36 |
| fungi | i mostly stopped engaging after that and just treat it as a read-only information source | 22:38 |
| clarkb | seems like that is easily solvable by having discourse emit a "this message thread is closed" email | 22:39 |
| clarkb | and then reject new posts with similar information | 22:39 |
| fungi | yeah, or reject replies where the in-reply-to header refers to a deleted post | 22:39 |