Wednesday, 2026-05-13

@mnasiadka:matrix.org > <@clarkb:matrix.org> mnasiadka: ok posted some thoughts to https://review.opendev.org/c/opendev/system-config/+/988310/ In particular I wonder if we can run prometheus and greptimedb on the same host and simplify things a bit? 04:53
I think that should be possible, I’ll rework the patches
-@gerrit:opendev.org- Zuul merged on behalf of Jack Hodgkiss: [openstack/diskimage-builder] 986427: fix: add support for `cloud-init` in `Ubuntu Resolute` https://review.opendev.org/c/openstack/diskimage-builder/+/986427 06:39
-@gerrit:opendev.org- Michal Nasiadka proposed on behalf of Mohammed Naser: [opendev/system-config] 980840: Add Prometheus monitoring service https://review.opendev.org/c/opendev/system-config/+/980840 08:39
-@gerrit:opendev.org- Sylvain Bauza proposed: [opendev/system-config] 988406: Add bots to #openstack-agentic-worfklows https://review.opendev.org/c/opendev/system-config/+/988406 09:10
-@gerrit:opendev.org- Michal Nasiadka proposed: [opendev/system-config] 988310: Add GrepTimeDB long term storage for Prometheus https://review.opendev.org/c/opendev/system-config/+/988310 09:17
-@gerrit:opendev.org- Michal Nasiadka proposed: [opendev/system-config] 988310: Add GrepTimeDB long term storage for Prometheus https://review.opendev.org/c/opendev/system-config/+/988310 09:21
-@gerrit:opendev.org- Michal Nasiadka proposed: [opendev/system-config] 988310: Add GrepTimeDB long term storage for Prometheus https://review.opendev.org/c/opendev/system-config/+/988310 09:22
-@gerrit:opendev.org- Michal Nasiadka proposed: [opendev/system-config] 988310: Add GrepTimeDB long term storage for Prometheus https://review.opendev.org/c/opendev/system-config/+/988310 09:23
-@gerrit:opendev.org- Michal Nasiadka proposed: [opendev/system-config] 988310: Add GrepTimeDB long term storage for Prometheus https://review.opendev.org/c/opendev/system-config/+/988310 09:44
-@gerrit:opendev.org- Mohammed Naser proposed: [opendev/system-config] 980994: Deploy node_exporter across all managed hosts https://review.opendev.org/c/opendev/system-config/+/980994 09:44
-@gerrit:opendev.org- Michal Nasiadka proposed: [opendev/system-config] 988310: Add GrepTimeDB long term storage for Prometheus https://review.opendev.org/c/opendev/system-config/+/988310 09:44
-@gerrit:opendev.org- Michal Nasiadka proposed on behalf of Mohammed Naser: [opendev/system-config] 980840: Add Prometheus monitoring service https://review.opendev.org/c/opendev/system-config/+/980840 09:50
-@gerrit:opendev.org- Elod Illes proposed: [openstack/project-config] 988430: [release-tools] Fix dist_name fetch for upper bump https://review.opendev.org/c/openstack/project-config/+/988430 11:23
-@gerrit:opendev.org- Mohammed Naser proposed: [opendev/system-config] 980994: Deploy node_exporter across all managed hosts https://review.opendev.org/c/opendev/system-config/+/980994 13:27
-@gerrit:opendev.org- Michal Nasiadka proposed: [opendev/system-config] 988310: Add GrepTimeDB long term storage for Prometheus https://review.opendev.org/c/opendev/system-config/+/988310 13:27
@mnasiadka:matrix.org Clark: looking at your comments on node_exporter patch - and I'm leaning towards writing a local role maybe? 14:20
@clarkb:matrix.org mnasiadka: a local role to manage the node exporter config that builds atop the upstream stuff? I think that makes sense 14:50
@clarkb:matrix.org basically a concrete way of encoding how we want node_exporter to run? 14:51
@clarkb:matrix.org infra-root I think ansible upgrade on bridge (https://review.opendev.org/c/opendev/system-config/+/976282/) is still a potential go for today. However, I forgot that there is a parent change to make borg testing happy on older systems which has not had the same level of review 14:51
@clarkb:matrix.org infra-root maybe we can land that first change nowish if there are people able to review it. Confirm that borg installs still look good then depending on timing and what else comes up in the meantime proceed with the ansible 9 change? 14:52
@mnasiadka:matrix.org Clark: yeah, we could even run it as a container as well, Kolla does that and mounts / as /host inside the container read only - https://opendev.org/openstack/kolla-ansible/src/commit/8510b763fec5fbfdcd677177809c682eb69e97dc/ansible/roles/prometheus-node-exporters/defaults/main.yml#L42 14:52
-@gerrit:opendev.org- Michal Nasiadka proposed: [openstack/diskimage-builder] 988472: CI: Switch dib-devstack to use openstack.cloud https://review.opendev.org/c/openstack/diskimage-builder/+/988472 14:55
@fungicide:matrix.org i've approved the borg focal fix (976286) now 15:59
@fungicide:matrix.org also +2 on ansible 9 for bridge, but didn't approve yet in case we want to coordinate hands available 15:59
@fungicide:matrix.org as for the resolute mirror, the mirror.ubuntu volume is currently taking up 986gb for 3 releases so we'll want at least 400gb of free quota on that volume (i recommend we increase from 1.3tb to 1.5 just to be safe) 16:02
@fungicide:matrix.org and mirror.ubuntu-ports is at 1.01tb used so we ought to raise it to 1.6tb at least 16:03
@clarkb:matrix.org I think a quick check on the focal borg update first is a good idea just so that we don't break ansible if we need to address borg first 16:05
@fungicide:matrix.org assume we're looking at ~900gb of additional space resolute is going to take up between the two of those, right now afs01.dfw is at 80.2% used (4.24/5.28tb), we're looking at landing around 97% full on /vicepa 16:06
@clarkb:matrix.org and then we may also want to move the old ansible venv aside as that deploys so that we can restore it quickly if necessary. But I think checking borg first then proceeding is a good plan 16:06
@clarkb:matrix.org fungi: not the change I proposed is only for x86_64 and we only have x86_64 images in zuul currently 16:06
@clarkb:matrix.org * fungi: note the change I proposed is only for x86_64 and we only have x86_64 images in zuul currently 16:07
@fungicide:matrix.org given this is all napkin math already, 3% headroom doesn't feel safe to me. i expect my estimates are already +/- more than that on the error bars 16:07
@clarkb:matrix.org agreed, but I also think we're avoiding arm64 for now 16:07
@fungicide:matrix.org yeah, if we go with just amd64 and no arm64 yet, that's going to be 89% utilization on afs01.dfw /vicepa 16:08
@fungicide:matrix.org still a bit uncomfortable, but doable i think 16:08
@clarkb:matrix.org in particular arm64 gets much less utilization so I think it is less of a priority 16:08
@clarkb:matrix.org we can sort arm out as a secondary step 16:09
@fungicide:matrix.org we'll need to clear more space or add more cinder volumes before we can talk about mirroring anything else large (including resolute arm64) 16:09
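[Editor's note] The napkin math above can be reproduced with a short sketch. The used and total figures (4.24 of 5.28 TB) and the ~900 GB combined estimate come from the discussion; the even split between amd64 and arm64 (0.45 TB each) is only an assumption made here to land near the 89% figure quoted, and the function name is hypothetical.

```python
# Rough utilization estimate for afs01.dfw /vicepa, using figures quoted in the log.
USED_TB = 4.24       # currently used, per the log
CAPACITY_TB = 5.28   # total size, per the log


def utilization(used_tb, capacity_tb, added_tb=0.0):
    """Return percent utilization after adding added_tb of new data."""
    return 100.0 * (used_tb + added_tb) / capacity_tb


current = utilization(USED_TB, CAPACITY_TB)
# ~900 GB for Resolute on both mirror.ubuntu and mirror.ubuntu-ports
both_arches = utilization(USED_TB, CAPACITY_TB, added_tb=0.90)
# ASSUMPTION: amd64 accounts for about half of that estimate
amd64_only = utilization(USED_TB, CAPACITY_TB, added_tb=0.45)

print(f"current:     {current:.1f}%")
print(f"both arches: {both_arches:.1f}%")
print(f"amd64 only:  {amd64_only:.1f}%")
```

With these inputs the results come out at roughly 80%, 97%, and 89%, in line with the estimates in the conversation; the error bars mentioned above apply equally here.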
-@gerrit:opendev.org- Zuul merged on behalf of Goutham Pacha Ravi: [openstack/project-config] 987313: Add devstack-plugin-lustre project https://review.opendev.org/c/openstack/project-config/+/987313 16:20
@fungicide:matrix.org Clark: okay, so are you in favor of me increasing the mirror.ubuntu quota from 1.3tb to 1.5tb and then approving 988279 and 988280? (mirror.deb-docker has plenty of room, current packages aren't even using 1% of its quota anyway) 16:25
@clarkb:matrix.org fungi: I think so. Do we want to coordinate that around the ansible 9 update at all? 16:26
@clarkb:matrix.org and yes the docker update seemed super safe 16:26
@fungicide:matrix.org i don't expect the bridge ansible upgrade to impact the mirror setup. i'll probably need to run reprepro manually under screen anyway in order to avoid timeouts 16:30
@fungicide:matrix.org worst case, bridge ansible is too broken to deploy the mirror addition 16:31
@fungicide:matrix.org and then we work around that or wait until bridge is back on track 16:31
-@gerrit:opendev.org- Zuul merged on behalf of Clark Boylan: [opendev/system-config] 976286: Pin llfuse during borg install on Focal https://review.opendev.org/c/opendev/system-config/+/976286 16:32
@clarkb:matrix.org fungi: ack wfm then. 16:32
@clarkb:matrix.org fungi: re bridge the venv exists at `/usr/ansible-venv` I'm thinking that I can cp -r /usr/ansible-venv /usr/ansible-venv.8 or similar? 16:33
@clarkb:matrix.org fungi: so that we can easily move it back to /usr/ansible-venv if the new version breaks for any reason? 16:33
@clarkb:matrix.org (mostly worried about an unexpected chicken and egg that might need ansible to fix ansible) 16:34
@fungicide:matrix.org #status log Increased AFS quota for the mirror.ubuntu volume by 200GB (from 1.3TB to 1.5TB) in order to make room for Ubuntu 26.04 LTS (Resolute) packages 16:36
@status:opendev.org @fungicide:matrix.org: finished logging 16:36
@clarkb:matrix.org ok the borg backup job just succeeded. So my next step is to "backup" the ansible venv and then we can approve the upgrade for ansible 16:37
@clarkb:matrix.org fungi: does that cp -r seem reasonable? I think that should work well enough for this use case as it's not crossing host or fs boundaries 16:37
@fungicide:matrix.org Clark: `cp -r` is probably fine, but you might want `cp -a` instead 16:38
@clarkb:matrix.org oh yup preserving links is probably important for a venv 16:38
@fungicide:matrix.org also you'll keep the same file modes, ownership, and timestamps 16:39
@clarkb:matrix.org ok running `cp -a /usr/ansible-venv /usr/ansible-venv.8` now 16:40
@clarkb:matrix.org that is done and a quick glance at the contents of the new dir lgtm 16:41
@fungicide:matrix.org yeah, lgtm too 16:42
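[Editor's note] The backup-and-restore idea above can be sketched in Python as a rough analogue of `cp -a`. This is not what was run on bridge (that was the `cp -a` command quoted above); the helper names and the `.broken` suffix are hypothetical. `shutil.copytree(..., symlinks=True)` keeps symlinks as symlinks and, through its default `copy2`, keeps file modes and timestamps, but unlike `cp -a` it does not preserve ownership.

```python
import os
import shutil
import tempfile


def backup_tree(src, dst):
    """Copy src to dst, keeping symlinks as links plus modes and timestamps."""
    # Ownership is NOT preserved here, which is one reason `cp -a` was used instead.
    return shutil.copytree(src, dst, symlinks=True)


def restore_tree(backup, live):
    """Move a broken live tree aside (hypothetical .broken suffix) and restore the backup."""
    if os.path.exists(live):
        os.rename(live, live + ".broken")
    os.rename(backup, live)


# Small demonstration in a throwaway directory standing in for /usr/ansible-venv.
with tempfile.TemporaryDirectory() as root:
    venv = os.path.join(root, "ansible-venv")
    os.makedirs(os.path.join(venv, "bin"))
    with open(os.path.join(venv, "bin", "python3"), "w") as handle:
        handle.write("placeholder interpreter\n")
    # venvs typically contain links like bin/python -> python3
    os.symlink("python3", os.path.join(venv, "bin", "python"))

    copy = backup_tree(venv, venv + ".8")
    link = os.path.join(copy, "bin", "python")
    print(os.path.islink(link), os.readlink(link))

    restore_tree(venv + ".8", venv)
    print(sorted(os.listdir(root)))
```

Both paths must be on the same filesystem for `os.rename` to work, which matches the "not crossing host or fs boundaries" remark above.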
@clarkb:matrix.org fungi: anything else you can think of before approve https://review.opendev.org/c/opendev/system-config/+/976282 ? 16:42
@clarkb:matrix.org * fungi: anything else you can think of before we approve https://review.opendev.org/c/opendev/system-config/+/976282 ? 16:43
@clarkb:matrix.org my typing today is terrible 16:43
@fungicide:matrix.org nope, i went ahead and approved it now 16:45
@fungicide:matrix.org also the resolute mirror additions 16:45
@fungicide:matrix.org all three are in the gate 16:45
@fungicide:matrix.org zuul is estimating they'll merge around 17:20 utc, ~25 minutes from now 16:56
@fungicide:matrix.org that should be enough time that the hourly deploys will already have wrapped up 16:56
@clarkb:matrix.org Perfect 16:57
-@gerrit:opendev.org- Michal Nasiadka proposed: [openstack/diskimage-builder] 988472: CI: Switch dib-devstack to use openstack.cloud https://review.opendev.org/c/openstack/diskimage-builder/+/988472 17:17
-@gerrit:opendev.org- Zuul merged on behalf of Clark Boylan: [opendev/system-config] 988279: Mirror Ubuntu Resolute Docker packages https://review.opendev.org/c/opendev/system-config/+/988279 17:18
-@gerrit:opendev.org- Zuul merged on behalf of Clark Boylan: 17:18
- [opendev/system-config] 988280: Mirror Ubuntu Resolute Packages https://review.opendev.org/c/opendev/system-config/+/988280
- [opendev/system-config] 976282: Upgrade Ansible on Bridge to Ansible 9 https://review.opendev.org/c/opendev/system-config/+/976282
@clarkb:matrix.org deployment jobs are meandering through bridge now 17:22
@clarkb:matrix.org fungi: ^ not sure if you think you need to hold the mirror lock now? 17:23
@fungicide:matrix.org nah, the ubuntu mirror starts at 15 after even hours, so it'll be another 50 minutes 17:26
@clarkb:matrix.org due to the way things deploy I think we are already using ansible 9 fwiw 17:26
@fungicide:matrix.org i'll just wait for the deploys to complete first 17:26
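[Editor's note] The "another 50 minutes" estimate follows from the schedule described above (15 minutes past every even hour, UTC). A minimal sketch of that calculation, assuming the schedule is exactly as stated in the chat; the function name is hypothetical and the actual cron definition is not shown in this log.

```python
from datetime import datetime, timedelta


def next_even_hour_run(now, minute=15):
    """Return the next time at `minute` past an even hour, strictly after `now`."""
    candidate = now.replace(minute=minute, second=0, microsecond=0)
    if candidate.hour % 2 == 1:
        # Odd hour: the next slot is in the following (even) hour.
        candidate += timedelta(hours=1)
    elif candidate <= now:
        # Even hour but the slot has already passed: skip ahead two hours.
        candidate += timedelta(hours=2)
    return candidate


asked_at = datetime(2026, 5, 13, 17, 26)  # time of the message above
run_at = next_even_hour_run(asked_at)
print(run_at.strftime("%H:%M"), int((run_at - asked_at).total_seconds() // 60), "minutes away")
```

For a question asked at 17:26 this yields the 18:15 slot, 49 minutes later, which matches the rounded figure given in the conversation.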
@clarkb:matrix.org and the first buildset for docker package mirror updates succeeded so that is a good sign 17:26
@clarkb:matrix.org the 1800 UTC hourly run will be the next real test (since the next two buildsets won't exercise much more than what is already exercised I think) 17:27
@fungicide:matrix.org infra-prod-bootstrap-bridge is finally running 17:31
@clarkb:matrix.org yup though it is a noop at this point as I think the prior two buildsets already updated ansible (per pip freeze in the venv) 17:32
@fungicide:matrix.org and succeeded 17:32
@fungicide:matrix.org oh, indeed, so we likely ran at least one of them with 9 17:33
@clarkb:matrix.org I think the bridge bootstrapping uses master rather than the change state (which may be a bug I guess) 17:33
@clarkb:matrix.org yup exactly. I think this is looking good from early results. 1800 hourlies will be the next test 17:33
@fungicide:matrix.org i'll go ahead and do the reprepro manual run now 17:34
-@gerrit:opendev.org- Michal Nasiadka proposed: [openstack/diskimage-builder] 988472: CI: Switch dib-devstack to use openstack.cloud https://review.opendev.org/c/openstack/diskimage-builder/+/988472 17:36
@fungicide:matrix.org oh, huh somehow i missed that the mirror.ubuntu volume has been stale for a month? that one didn't run out of quota though 17:39
@fungicide:matrix.org left behind a stale /afs/.openstack.org/mirror/ubuntu/db/lockfile so something killed the reprepro process, likely a point release caused it to timeout 17:39
@clarkb:matrix.org oh should we put the mirror in the emergency file and temporarily remove resolute while we address that first? 17:41
@fungicide:matrix.org while i probably should have done that, it's running now 17:42
@fungicide:matrix.org i'll just keep an eye on it for the next however many days it takes to complete 17:42
@clarkb:matrix.org ack 17:42
@fungicide:matrix.org it's running under a root screen session on mirror-update, if anyone else needs to check on it 17:43
@clarkb:matrix.org 1800 hourlies have begun. I'll keep an eye on them 18:02
@fungicide:matrix.org no failures so far, at least 18:04
@fungicide:matrix.org all successful: https://zuul.opendev.org/t/openstack/buildset/a4335d0211804e2f9303d7bd53d87fa1 18:09
@clarkb:matrix.org great, the real test will be daily runs early tomorrow UTC time. But this is all pointing in the right direction. I guess if you see any problems with system-config-run jobs, say something as well, as those may catch unexpected things early 18:10
@scott.little:matrix.org has anything changed at opendev that might be blocking my ability to push a signed tag?   I'm still a member of the starlingx release group.  The repo is starlingx/root.git ...    git push gerrit vf/gazpacho ... remote: error: branch refs/tags/vf/gazpacho: 18:28
remote: use a SHA1 visible to you, or get update permission on the ref
remote: User: slittle1
@scott.little:matrix.org never mind, i see the problem 18:29
@clarkb:matrix.org For completeness the most recent change that may have impacted that was the Gerrit 3.12 upgrade last month 18:30
-@gerrit:opendev.org- Clark Boylan proposed: [opendev/system-config] 988535: Drop graphite port 80 redirect https://review.opendev.org/c/opendev/system-config/+/988535 19:22
-@gerrit:opendev.org- Michal Nasiadka proposed wip: [openstack/diskimage-builder] 988472: CI: Switch dib-devstack to use openstack.cloud https://review.opendev.org/c/openstack/diskimage-builder/+/988472 19:43
-@gerrit:opendev.org- Zuul merged on behalf of Clark Boylan: [opendev/system-config] 988535: Drop graphite port 80 redirect https://review.opendev.org/c/opendev/system-config/+/988535 20:25
-@gerrit:opendev.org- Clark Boylan proposed: [opendev/system-config] 893571: DNM Forced fail on Gerrit to test 3.13 upgrades and downgrades https://review.opendev.org/c/opendev/system-config/+/893571 23:12
@clarkb:matrix.org I cleared out my autoholds for etherpad 2.7.3 and gitea 1.26.1 and replaced them with a gerrit 3.12 autohold to test the upgrade process to 3.13 23:13
@clarkb:matrix.org I still want to upgrade to 3.12.7 on Friday if possible, but this way I can start testing the upgrade process tomorrow hopefully and get a head start 23:13
