clarkb | corvus: the upgrades and reboots have gotten as far as zl01. We issued a graceful stop, which the process recorded in its debug log. I think we're waiting now for all the uploads to finish maybe? I guess that's fine for rolling upgrades but it is different from the hard stop/start we have been doing. I wonder if we should update the playbook to match or not | 00:26 |
clarkb | looks like it took approximately 15 minutes to gracefully stop. That isn't too bad, so maybe this is fine. It does mean the launcher falls off the components list because, unlike the executors, it isn't "paused" during that period of time | 00:27 |
clarkb | anyway I think this is working, it just didn't look how I expected for a moment so I dug in | 00:28 |
opendevreview | OpenStack Proposal Bot proposed openstack/project-config master: Normalize projects.yaml https://review.opendev.org/c/openstack/project-config/+/957995 | 02:25 |
opendevreview | Merged openstack/project-config master: kolla: Allow kolla-core to remove RP votes https://review.opendev.org/c/openstack/project-config/+/949842 | 04:19 |
opendevreview | Merged openstack/project-config master: Remove nodepool configuration/elements https://review.opendev.org/c/openstack/project-config/+/956184 | 04:22 |
opendevreview | Merged openstack/project-config master: Prepare for retirement of RefStack repositories https://review.opendev.org/c/openstack/project-config/+/947859 | 04:24 |
opendevreview | Merged openstack/project-config master: Stop syncing run_tests/Vagrantfiles for OSA https://review.opendev.org/c/openstack/project-config/+/956944 | 04:24 |
ykarel | Hi is this known? ERROR! couldn't resolve module/action 'openvswitch_bridge'. This often indicates a misspelling, missing collection, or incorrect module path. | 04:47 |
ykarel | or was a timing thing during ansible 11 switch? | 04:47 |
ykarel | seen in today's periodic run | 04:47 |
ykarel | https://c4e65dc90d54a7cb0d09-c58207963db0f03dec19154799b50d2d.ssl.cf5.rackcdn.com/openstack/41fccd9ecb204b53bc4e82d4e6cd9dec/job-output.txt | 04:47 |
ykarel | https://6ab93f4ca7b96e79e883-fb8b0f0ff152f556a5802daf1433e080.ssl.cf5.rackcdn.com/openstack/492f0763d7b54cb388374107cc79cf62/job-output.txt | 04:48 |
ykarel | or maybe we need to adapt things in https://codesearch.opendev.org/?q=Ensure%20the%20infra%20bridge%20exists&i=nope&literal=nope&files=&excludeFiles=&repos= | 04:49 |
frickler | ykarel: IIUC that module was dropped from what gets installed with ansible 11; a short term workaround might be to let those jobs run with ansible 9 | 04:59 |
opendevreview | Takashi Kajinami proposed opendev/system-config master: Add OpenVox to mirror https://review.opendev.org/c/opendev/system-config/+/957299 | 05:00 |
ykarel | frickler, ack i was going to copy the module like https://opendev.org/zuul/zuul-jobs/raw/branch/master/roles/multi-node-bridge/library/openvswitch_bridge.py | 05:00 |
frickler | ah, yes. maybe there is a way we can make it usable from other playbooks without duplication, though. | 05:05 |
ykarel | https://review.opendev.org/c/openstack/neutron-tempest-plugin/+/958008 | 05:09 |
ykarel | yes, looks like by setting ANSIBLE_LIBRARY it can be reused; any way to set it? | 05:10 |
frickler | sorry, I don't know too much about these ansible-in-zuul details, maybe clarkb or corvus have an idea | 05:14 |
ykarel | ok thx | 05:33 |
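Outside of Zuul (where the executor controls the Ansible environment and this isn't directly settable), a local reproduction of the ANSIBLE_LIBRARY idea might look like the sketch below; the directory layout and playbook name are hypothetical. Note also that a library/ directory inside a role is picked up automatically for that role's own tasks, which is how the zuul-jobs vendoring works.

```bash
# hypothetical paths: a vendored copy of openvswitch_bridge.py in the repo under test
export ANSIBLE_LIBRARY=$PWD/roles/multi-node-setup/library
# plays run from here can now resolve the openvswitch_bridge module again
ansible-playbook -i inventory.yaml playbooks/multinode.yaml
```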
*** clarkb is now known as Guest24535 | 11:02 | |
*** dhill is now known as Guest24539 | 12:35 | |
fungi | ykarel: yeah, ansible dropped the ovs module, so clarkb vendored it into the multi-node-bridge role with https://review.opendev.org/c/zuul/zuul-jobs/+/957188 | 12:54 |
fungi | you could also just do something similar | 12:54 |
fungi | though i agree finding a way to not have two copies floating around would be nice | 12:55 |
fungi | longer term it would probably make more sense to replace it entirely in multi-node-bridge with a similar setup just using linux's bridge driver | 12:55 |
ykarel | fungi, for the limited use I'm going with https://review.opendev.org/c/openstack/neutron-tempest-plugin/+/958008/2/roles/multi-node-setup/tasks/main.yaml | 12:56 |
frickler | infra-root: looks like we again have a kolla-ansible change stuck in check for > 24h, https://review.opendev.org/935704 PS20. not sure if that's related to the ongoing restarts? I'll leave it in place for now in case someone wants to dig further | 13:02 |
*** dhill is now known as Guest24544 | 13:36 | |
Clark[m] | frickler: I wouldn't expect that to be related to the restarts as the launchers updated closer to 14 hours ago. I think if you hover over the build status bar in the UI you get the request ID, which you can grep for in the launcher and scheduler logs to see where it may be stuck | 13:46 |
Clark[m] | We probably want to check on rax ord and dfw as well since reenabling those may have caused them to error again? | 13:47 |
opendevreview | Clif Houck proposed openstack/diskimage-builder master: Add a sha256sum check for CentOS Cloud Images https://review.opendev.org/c/openstack/diskimage-builder/+/957983 | 13:52 |
opendevreview | Clif Houck proposed openstack/diskimage-builder master: Add a sha256sum check for CentOS Cloud Images https://review.opendev.org/c/openstack/diskimage-builder/+/957983 | 13:53 |
frickler | Clark[m]: if neither you nor corvus want to check deeper, I'd just abandon and restore that change in order to rerun jobs | 13:54 |
Clark[m] | I can take a look but it will be a bit. I'm trying to get out for an early morning bike ride before the heat of the day but can look when I get back | 13:56 |
fungi | looks like it has two builds that are waiting on specific nodeset requests | 13:57 |
fungi | we could check what provider those are for | 13:57 |
fungi | though it's been in the queue since long before we reenabled the rax classic regions, and these don't look like retries | 13:58 |
corvus | nodeset requests are a56f042da27b4cfda9af080eea029ac7 and ea449d9a4f744590a1e0af1b7c3bc625, for posterity | 14:08 |
corvus | given that they each have the requisite nodes ready and assigned to the request, i think that's very likely a launcher bug | 14:10 |
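For anyone retracing this later, the grep Clark[m] described might look roughly like the sketch below; the debug log locations are assumptions rather than the actual paths on the launcher and scheduler hosts.

```bash
# nodeset request id taken from the messages above
REQ=a56f042da27b4cfda9af080eea029ac7
# follow the request through the launcher and scheduler debug logs
grep "$REQ" /var/log/zuul/launcher-debug.log
grep "$REQ" /var/log/zuul/scheduler-debug.log
```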
fungi | i'll be in and out a bit today. taking a break from storm prep to go grab lunch, then when i get back i'll split my time between last-minute yardwork and server upgrades | 14:59 |
cloudnull | Clark[m] fungi can we get you all to shut down jobs on the rackspace legacy cloud environments? We're seeing more than 300k api requests hammering the environment this morning. | 15:07 |
cloudnull | Maybe there's a runaway process? Bad return from the legacy api? | 15:07 |
opendevreview | James E. Blair proposed opendev/zuul-providers master: Revert "Reenable rax DFW and ORD providers" https://review.opendev.org/c/opendev/zuul-providers/+/958094 | 15:16 |
corvus | 2025-08-20 15:13:51,274 ERROR zuul.Launcher: keystoneauth1.exceptions.connection.ConnectFailure: Unable to establish connection to https://dfw.servers.api.rackspacecloud.com/v2/637776/servers/d92503ac-b6c8-4d5d-a54c-f6c0a4717271: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response')) | 15:16 |
corvus | cloudnull ^ i see a lot of those errors | 15:16 |
corvus | 6422 of those errors since utc midnight | 15:18 |
corvus | 31294 regular api calls | 15:20 |
corvus | cloudnull: opendev's zuul issues 37716 api calls to both regions of rax legacy since utc midnight, 6422 of them failed with connection errors | 15:20 |
opendevreview | Merged opendev/zuul-providers master: Revert "Reenable rax DFW and ORD providers" https://review.opendev.org/c/opendev/zuul-providers/+/958094 | 15:21 |
cloudnull | We'll give it a look once things calm down a bit. | 15:26 |
cloudnull | Do you happen to have an api breakdown of calls made across DFW and ORD since it was reenabled? Maybe this is just an issue with DFW? | 15:27 |
corvus | cloudnull: yes, we have a record of every call. one sec. | 15:32 |
corvus | dfw: 19009 successful calls, 6422 failures; ord: 11271 successful calls, 0 failures | 15:35 |
corvus | cloudnull: ^ so yeah, looks like dfw is the only one we're seeing errors for | 15:35 |
corvus | frickler: i think i have what i need from that change; it's never going to resolve on its own. i will dequeue it. bugfix to come later. | 15:40 |
cloudnull | corvus, could I bother you to re-enable ord and iad? | 15:42 |
cloudnull | We think there’s an issue specifically with DFW and I’d like to prove that and help with resources | 15:43 |
corvus | cloudnull: we hadn't enabled iad yet, just because its failure mode was different (slow booting nodes) so we wanted to monitor that separately... can i do ord for now and wait for Clark or fungi to be around to add iad back? | 15:44 |
opendevreview | James E. Blair proposed opendev/zuul-providers master: Re-enable rax-ord https://review.opendev.org/c/opendev/zuul-providers/+/958099 | 15:45 |
opendevreview | Merged opendev/zuul-providers master: Re-enable rax-ord https://review.opendev.org/c/opendev/zuul-providers/+/958099 | 15:46 |
cloudnull | Thank you corvus | 15:46 |
frickler | corvus: thx, rechecked the change and will check later whether things progress better now. those stuck requests weren't related to the dfw issues, right? | 15:50 |
corvus | don't think so; it's an error in the ready node reuse code | 15:56 |
fungi | okay, back at the keyboard for a bit to see what i missed | 16:15 |
fungi | and yeah, the stuck jobs had node assignments pending for hours before we tried to bring rax classic back online yesterday, so should have been entirely unrelated | 16:17 |
Guest24535 | I am around now and can monitor reenabling rax-iad if cloudnull is still ok with that: https://review.opendev.org/c/opendev/zuul-providers/+/957957 is the change for that | 16:24 |
Guest24535 | oh heh I've guestified | 16:24 |
Guest24535 | one moment please | 16:24 |
*** Guest24535 is now known as clarkb | 16:25 | |
fungi | welcome back Guest24535! ;) | 16:25 |
clarkb | thanks | 16:26 |
fungi | i'll approve that change, i'm semi-around again now | 16:26 |
clarkb | thanks I'm properly around at this point | 16:27 |
opendevreview | Merged opendev/zuul-providers master: Reenable rax IAD https://review.opendev.org/c/opendev/zuul-providers/+/957957 | 16:27 |
clarkb | then also in my backlog is https://review.opendev.org/c/opendev/zone-opendev.org/+/957981 and child | 16:27 |
fungi | looking | 16:29 |
fungi | approved both, though on the second what do you think about just adding a www vhost to static and doing the redirect there since it already hosts a ton of them? | 16:30 |
fungi | alternatively we can probably point it to the lb as long as we add a redirect in the apache configs for the backends | 16:31 |
opendevreview | Merged opendev/zone-opendev.org master: Delete review02 DNS records https://review.opendev.org/c/opendev/zone-opendev.org/+/957981 | 16:32 |
clarkb | fungi: ya I didn't think too far ahead on that one. I just didn't want us to accidentally enable the record and have it point to a nonexistent location | 16:32 |
clarkb | but now that apache is doing all the initial connections on the giteas I think we could handle it there | 16:32 |
opendevreview | Merged opendev/zone-opendev.org master: Update commented out www.opendev.org record https://review.opendev.org/c/opendev/zone-opendev.org/+/957982 | 16:33 |
fungi | oh, good point, it was a less clean set of options in the past when we were doing haproxy->gitea instead of haproxy->apache->gitea | 16:34 |
clarkb | I've shutdown the screen that was running the zuul reboots as those completed successfully | 16:38 |
frickler | if someone gets bored, there's a new big yaml reformatting change, I didn't check yet what happened to trigger this https://review.opendev.org/c/openstack/project-config/+/957995 | 16:39 |
clarkb | corvus: we're largely caught up as of yesterday evening. I think the setup_hook change landed just a bit too late to get deployed though so we aren't running that in prod yet (but we did direct checking of it so I'm not too worried) | 16:39 |
clarkb | fungi: frickler: do you recall what the process was for setting the MTU to 1500 on the rax flex network? that is up next for me if the ssh keys and routers and networks etc all got created properly overnight. Then we can boot a mirror | 16:42 |
clarkb | based on https://grafana.opendev.org/d/fd44466e7f/zuul-launcher3a-rackspace?orgId=1&from=now-1h&to=now&timezone=utc&var-region=$__all rax classic iad seems to be working. we have ready and in use nodes implying that boot timeouts are not a major issue | 16:43 |
fungi | `openstack --os-cloud=opendevzuul-rax-flex --os-region-name=SJC3 network set --mtu=1500 opendevzuul-network1` | 16:44 |
fungi | that's from my shell history on bridge | 16:44 |
frickler | clarkb: from what I remember we needed to set that on the tenant network? | 16:44 |
frickler | like that, yes | 16:44 |
clarkb | perfect thanks. run_cloud_launcher.log seems to indicate success so I'll set that in IAD3 momentarily | 16:44 |
frickler | maybe check the current value first | 16:45 |
clarkb | frickler: ++ | 16:45 |
frickler | would be interesting to see whether rax fixed their deployments | 16:45 |
fungi | well, they already fixed them at least once to no longer be <1500; they merely made them much larger | 16:46 |
clarkb | it's 3942 currently | 16:46 |
fungi | but yeah, i dunno if they re-lowered them to 1500 | 16:46 |
fungi | guess not, that's the setpoint i recall | 16:46 |
frickler | ok, so bad configuration for a public cloud IMO, but up to them. | 16:46 |
fungi | not entirely broken at least, but yes likely to lead to pmtud negotiations and/or fragmentation/reassembly | 16:47 |
clarkb | ok I set the mtu to 1500 on that network in both accounts | 16:48 |
fungi | and possible unreachability to/from some places impacted by pmtud black holes, though hopefully those parts of the internet are vanishingly rare these days | 16:48 |
fungi | i still have ptsd from dealing with customer sites where their "security" people had been convinced that icmp was dangerous so they just blocked all of it at their edge | 16:49 |
clarkb | there is an Ubuntu 24.04 image in this cloud region. Do you think we should upload our own noble image or just use theirs? (we uploaded our own image into most other clouds since it wasn't otherwise available) | 16:49 |
fungi | the whole "ping of death" scare did more lasting damage to the internet in misplaced security filters than the actual packets ever could | 16:50 |
clarkb | I'm somewhat inclined to just use the cloud provided image | 16:50 |
clarkb | our config management should coerce it to look basically identical to whatever we would upload I think | 16:50 |
fungi | i think i deleted the noble-server-cloudimg-amd64.img from my homedir that i had uploaded to the other regions | 16:52 |
clarkb | ya we would probably just download the ubuntu published image and re-upload it, which is likely the same result as what that cloud image is anyway | 16:52 |
fungi | `wget https://cloud-images.ubuntu.com/noble/current/{noble-server-cloudimg-amd64.img,SHA256SUMS{,.gpg}}` is how i pulled it, fwiw | 16:52 |
fungi | followed by `gpg --verify SHA256SUMS.gpg SHA256SUMS && sha256sum -c --ignore-missing SHA256SUMS` | 16:53 |
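For completeness, the upload into the other regions was presumably something along these lines once the checksum verified; the cloud, region, and image names below are placeholders, not the exact values used:

```bash
# upload the verified qcow2 to glance (names are illustrative)
openstack --os-cloud=openstackci-rax-flex --os-region-name=IAD3 \
    image create "Ubuntu 24.04 LTS (Noble Numbat)" \
    --file noble-server-cloudimg-amd64.img \
    --disk-format qcow2 --container-format bare
```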
keekz | cloudnull corvus: dfw should be better now | 16:57 |
fungi | keekz: thanks for the followup! | 16:58 |
clarkb | did someone else want to push a change up to enable that region? I'm working on a new mirror in iad3 | 17:01 |
fungi | gimme a sec i can push a revert | 17:01 |
opendevreview | Jeremy Stanley proposed opendev/zuul-providers master: Reapply "Reenable rax DFW and ORD providers" https://review.opendev.org/c/opendev/zuul-providers/+/958104 | 17:03 |
fungi | though i guess the subject is now misleading since part of it was reverted already | 17:03 |
fungi | happy to amend with revised commit message if anyone cares | 17:03 |
clarkb | ya maybe amend it just to avoid confusion if we have problems in the future. It will be clearer that ord was ok | 17:05 |
fungi | can do | 17:13 |
opendevreview | Jeremy Stanley proposed opendev/zuul-providers master: Reapply "Reenable rax DFW provider" https://review.opendev.org/c/opendev/zuul-providers/+/958104 | 17:14 |
fungi | better? | 17:14 |
clarkb | approved | 17:16 |
opendevreview | Merged opendev/zuul-providers master: Reapply "Reenable rax DFW provider" https://review.opendev.org/c/opendev/zuul-providers/+/958104 | 17:17 |
frickler | fwiw I'd be fine with using the cloud ubuntu image | 17:19 |
opendevreview | Clark Boylan proposed opendev/zone-opendev.org master: Add mirror01.iad3.raxflex to our DNS zone https://review.opendev.org/c/opendev/zone-opendev.org/+/958106 | 17:22 |
opendevreview | Clark Boylan proposed opendev/system-config master: Add mirror01.iad3.raxflex to our inventory https://review.opendev.org/c/opendev/system-config/+/958107 | 17:26 |
clarkb | frickler: ya that is what I ended up doing | 17:26 |
clarkb | I think the main reason we didn't use cloud images previously was that they were simply not available, so I took advantage of the image being available here to simplify things. It seems to have worked fine and you can ssh in to the IP address in ^ and check it out first too | 17:26 |
opendevreview | Merged opendev/zone-opendev.org master: Add mirror01.iad3.raxflex to our DNS zone https://review.opendev.org/c/opendev/zone-opendev.org/+/958106 | 17:30 |
opendevreview | Clark Boylan proposed opendev/zuul-providers master: Add raxflex iad3 region to zuul's resource pools https://review.opendev.org/c/opendev/zuul-providers/+/958109 | 17:33 |
clarkb | cloudnull: ^ I noticed that we're still set to a 10 instance limit and unlimited floating IPs in iad3. I've hardcoded us to the 10 instance limit as a conservative starting point but can update that to whatever we set the floating ip limit to later | 17:34 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Use Jammy for our Kerberos servers https://review.opendev.org/c/opendev/system-config/+/958112 | 17:47 |
fungi | infra-root: ^ i couldn't find anything similar for the afs db or file servers, am i blind or do we not do test deployments of those? | 17:47 |
clarkb | fungi: in zuul.d/system-config-roles.yaml we have tests for the openafs role. I think that may only be the client side though | 17:49 |
fungi | yeah, nothing that deploys test servers on specific platforms | 17:50 |
clarkb | I don't see any job for that seems to run the service-afs playbook | 17:50 |
fungi | okay, so good enough | 17:50 |
clarkb | you could add a job that does ^ but I'm wondering if part of the reason for that is that it isn't fully automated? | 17:50 |
clarkb | oh I wonder if some of the problem is with the domain and authentication and all that | 17:50 |
clarkb | since it's a global filesystem it probably isn't trivial to spin up something working without making it a different domain? | 17:51 |
fungi | yeah, i assumed it was complexities of actually having a working subtree in global afs | 17:52 |
clarkb | did anyone else want to review the mirror01.iad3.raxflex server addition? I'll probably approve it in ~10 minutes if there is no -1 between now and then | 18:21 |
fungi | yeah, i just wanted to make sure it was in dns before approving | 18:26 |
corvus | lgtm | 18:27 |
fungi | since my jammy change for the kerberos servers is passing i'm going to stick them and all the afs servers into the emergency disable list now | 18:27 |
clarkb | fungi: I think we have a documented process for kerberos server outages fwiw | 18:28 |
fungi | afs servers too | 18:34 |
fungi | i'm pulling them all up | 18:34 |
fungi | https://docs.opendev.org/opendev/system-config/latest/kerberos.html#no-service-outage-server-maintenance and https://docs.opendev.org/opendev/system-config/latest/afs.html#no-outage-server-maintenance for the record | 18:42 |
fungi | i've placed the following servers temporarily in the emergency disable list on bridge in order to start working through upgrades over the rest of the week: afs01.dfw.openstack.org, afs01.ord.openstack.org, afs02.dfw.openstack.org, afsdb01.openstack.org, afsdb02.openstack.org, afsdb03.openstack.org, kdc03.openstack.org, kdc04.openstack.org | 18:48 |
fungi | these are the only afs and kerberos servers i found in our inventory | 18:48 |
fungi | per the no-outage docs i'm upgrading kdc04 first since it's the inactive replica | 18:50 |
fungi | packages for focal are already up to date, but it apparently needs a clean reboot before i can run do-release-upgrade to jammy. i expect this will be common across the entirety of the set | 18:52 |
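The per-host sequence being described amounts to roughly the sketch below, assuming a root shell on a host that is already in the emergency disable list:

```bash
# make sure focal is fully patched, then reboot onto the current kernel
apt-get update && apt-get -y dist-upgrade
reboot
# once the host comes back up, start the release upgrade to jammy;
# this is interactive and (as noted below) opens a fallback sshd on 1022/tcp
do-release-upgrade
```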
fungi | it starts an extra sshd on 1022/tcp, for reference | 18:58 |
Clark[m] | Our iptables likely blocks that fwiw. I'm on matrix now due to lunch | 18:59 |
fungi | "Sorry, this storage driver is not supported in kernels for newer releases. There will not be any further Ubuntu releases that provide kernel support for the aufs storage driver. Please ensure that none of your containers are using the aufs storage driver, remove the directory /var/lib/docker/aufs and try again." | 19:00 |
fungi | i don't think we rely on it? | 19:00 |
Clark[m] | That should be fine. I didn't even think we run docker on the kdcs | 19:00 |
fungi | yeah, don't think we do | 19:01 |
Clark[m] | Are there any containers? If not then nothing should use aufs | 19:01 |
Clark[m] | And I would expect new containers to have stopped using aufs at some point | 19:02 |
fungi | Command 'docker' not found | 19:02 |
fungi | survey says "no" | 19:02 |
opendevreview | Merged opendev/system-config master: Add mirror01.iad3.raxflex to our inventory https://review.opendev.org/c/opendev/system-config/+/958107 | 19:03 |
Clark[m] | That change will run all the jobs but you've got the hosts in the emergency file so shouldn't matter | 19:03 |
fungi | right | 19:03 |
fungi | i did a `rm /var/lib/docker/aufs` and retried do-release-upgrade, seems maybe happier now | 19:04 |
Clark[m] | Ack | 19:04 |
fungi | er, rm -rf because it was a directory | 19:05 |
Clark[m] | I suspect it was empty too and just auto created by something for some reason at some point :) | 19:05 |
fungi | i concur | 19:06 |
fungi | so far i've only told it to keep our sshd and sudoers config changes, anything else i expect ansible can (re-)correct | 19:28 |
clarkb | https://mirror.iad3.raxflex.opendev.org/ubuntu/ has content now. I think it should be ok to land https://review.opendev.org/c/opendev/zuul-providers/+/958109 as a result. But I know that cloudnull wants to adjust quotas there. Not sure if we want to wait for that to happen before we try to use the 10 instance quota | 19:39 |
fungi | lgtm, yep | 19:51 |
fungi | kdc04 is up and running on jammy now. i'll do the switchover steps in our docs next | 19:57 |
fungi | Database propagation to kdc04.openstack.org: SUCCEEDED | 19:59 |
corvus | clarkb: did you verify the flavor names? (they are different in the other 2 flex regions) | 20:06 |
corvus | (i mean to say, dfw3 and sjc3 are different from each other, so i wonder if iad3 should be different still, or is the same as sjc3) | 20:07 |
fungi | oh! good memory | 20:07 |
fungi | granted, that should have become readily apparent when booting the new mirror instance | 20:08 |
clarkb | corvus: I did. sjc3 and iad3 have matching flavors | 20:08 |
fungi | wonder why dfw3 is the odd one out | 20:09 |
clarkb | actually iad3 is a subset. But the three flavors we use are in the subset | 20:09 |
corvus | clarkb: cool, lgtm then... | 20:09 |
corvus | the zuul-providers config for iad3 == sjc3 | 20:10 |
corvus | 4 flavors there in your change | 20:10 |
clarkb | ya I had to check for booting the mirror as well so made sure everything lined up | 20:10 |
clarkb | the fourth is a duplicate; we just alias the nested virt one to the 8gb flavor iirc. But yes I checked they are in there | 20:10 |
fungi | did i miss the zuul-providers addition? | 20:10 |
clarkb | gp.0.4.4 gp.0.4.8 and gp.0.4.16 show up | 20:11 |
clarkb | fungi https://review.opendev.org/c/opendev/zuul-providers/+/958109 its this change | 20:11 |
corvus | ah yes, 4 of our flavors, 3 of theirs. i thought you meant that iad3 was a subset of sjc3 (but it's not) | 20:13 |
corvus | clarkb: i think from our pov, it's okay to start with the 10. | 20:13 |
clarkb | corvus: oh sorry I meant the flavors on the cloud side of iad3 are a subset of the flavors in sjc3 | 20:17 |
clarkb | but those that do exist overlap | 20:17 |
clarkb | corvus: in that case I think we can probably land the change and see how it goes? | 20:17 |
clarkb | then after the quota is adjusted we can change that value | 20:17 |
fungi | lgtm, thanks! | 20:24 |
opendevreview | Merged opendev/zuul-providers master: Add raxflex iad3 region to zuul's resource pools https://review.opendev.org/c/opendev/zuul-providers/+/958109 | 20:24 |
corvus | clarkb: oh that's interesting about the cloud flavors. gtk. all caught up now. :) | 20:24 |
fungi | re-ran /usr/local/bin/run-kprop.sh on kdc03 after upgrade to jammy, all done there now | 20:40 |
fungi | starting on afsdb01 with our no-downtime maintenance instructions | 20:43 |
fungi | Instance ptserver, temporarily disabled, has core file, currently shutdown. Instance vlserver, temporarily disabled, currently shutdown. | 20:44 |
clarkb | I think for the fileservers we have to transition the primary volume away from the host being updated. That might be a bit more painful unless things are already distributed in a way that just works for all but one | 20:48 |
clarkb | but this already seems like good progress! | 20:48 |
fungi | yeah | 20:49 |
fungi | already reading ahead to those | 20:49 |
fungi | but we have 3 db servers to get through first | 20:50 |
fungi | and then maybe repeat from jammy to noble | 20:50 |
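The file server volume moves Clark mentioned above would presumably be done with vos move; a sketch with an illustrative volume name, assuming vicepa partitions on both servers and localauth on the server running the command:

```bash
# move a read-write volume off afs01.dfw before taking it down
# (volume name and partitions are illustrative, not a real plan)
vos move -id mirror.ubuntu \
    -fromserver afs01.dfw.openstack.org -frompartition a \
    -toserver afs02.dfw.openstack.org -topartition a \
    -localauth
```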
clarkb | one thing that just occurred to me is you may want to remove the ansible fact cache files for those hosts before you remove them from the emergency file | 20:54 |
clarkb | that way ansible rereads all the facts as they are now rather than potentially relying on old fact info | 20:54 |
fungi | any idea where that is these days? | 20:58 |
fungi | but good call, yep | 20:58 |
clarkb | /var/cache/ansible/facts on bridge I think | 20:59 |
fungi | k, thx | 20:59 |
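Concretely, that would be something like the following on bridge for each host before it comes back out of the emergency file; kdc04 here is just the first host from this batch:

```bash
# drop the cached facts so the next ansible run regathers them
rm /var/cache/ansible/facts/kdc04.openstack.org
```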
fungi | Instance ptserver, has core file, currently running normally. Instance vlserver, currently running normally. | 21:12 |
fungi | i'll move on to afsdb02 and 03 | 21:13 |
cloudnull | clarkb: I can go get those quotas updated now. | 21:20 |
clarkb | cloudnull: ack I don't think we're in a hurry. But we have everything configured on our side to take advantage of them once they're set | 21:21 |
clarkb | fungi: are you having to do a preparatory reboot on all of these before beginning the update process? | 21:24 |
clarkb | and did any other server complain about aufs? | 21:24 |
fungi | clarkb: yeah, all of them want a reboot before running, and i've just preemptively been removing that directory | 21:26 |
clarkb | ack thanks | 21:27 |
fungi | basically if there's been a kernel update applied since the last reboot they want to be rebooted first, so that's every last one really | 21:27 |
fungi | because we reboot them infrequently | 21:27 |
clarkb | makes sense | 21:28 |
fungi | hopefully afsdb02 will be done soon and i can move on to 03 | 21:28 |
fungi | finally finished the yardwork so i can focus on this a little more intently | 21:28 |
fungi | though i'll probably grab a shower while afsdb03 is upgrading | 21:29 |
fungi | otherwise christine will complain | 21:29 |
clarkb | once you get to a good pausing point you can always pick it back up again in the morning | 21:33 |
clarkb | I have an appointment tomorrow morning but am around otherwise | 21:35 |
clarkb | I'm off to get my annual eyeball scan | 21:36 |
fungi | yeah, other than meetings i've got nothing pressing tomorrow | 21:36 |
fungi | assuming my cables don't get sucked up by a hurricane (unlikely) | 21:37 |
clarkb | omnomnom | 21:37 |
fungi | copper ramen | 21:38 |
fungi | working on 03 now | 21:46 |
fungi | okay, afsdb03.openstack.org is now upgraded to jammy. that just leaves the file servers, which i'll pick back up with in my morning | 23:04 |
clarkb | and we're leaving all of the hosts in the emergency file for now? (I think that is fine just double checking) | 23:07 |
Generated by irclog2html.py 4.0.0 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!