clarkb | we did not get an email about filling the disk on the backup server today | 00:30 |
---|---|---|
clarkb | pruning did its job. thank you fungi | 00:31 |
fungi | indeed, it worked a treat | 00:31 |
clarkb | One thing that just occurred to me re gerrit upgrade is to compare the config files between 3.3 and 3.4 in the test systems to ensure that gerrit isn't adding any new data to the config (it likes to do that) | 00:37 |
clarkb | if it does that isn't the end of the world but means ansible will fight gerrit over the content and we should try and make them match to avoid that. I can probably look into that over the weekend | 00:38 |
clarkb | fungi: https://review.opendev.org/c/openstack/pbr/+/662035 has a question for you | 00:43 |
fungi | oh, thanks | 00:45 |
fungi | clarkb: i don't see any new questions since my answer from thirsday | 00:46 |
fungi | thursday | 00:47 |
clarkb | oh sorry. I read them out of order | 00:47 |
clarkb | my brain is clearly fried enough for one week | 00:47 |
fungi | no worries, thought maybe gertty had swallowed my comments or something | 00:49 |
clarkb | no, you didn't quote-reply to the question; you commented in the code where you would make the change, and I got things mixed up in my head | 00:52 |
fungi | ahh, yeah | 00:52 |
*** rcastillo|rover is now known as rcastillo|out | 02:01 | |
*** mazzy509881292958 is now known as mazzy50988129295 | 04:00 | |
*** mgoddard- is now known as mgoddard | 09:05 | |
*** ysandeep|out is now known as ysandeep | 14:19 | |
*** ysandeep is now known as ysandeep|out | 14:41 | |
corvus | i'm going to do a very relaxed rolling restart of zuul | 15:53 |
corvus | i'll start with a graceful shutdown of half the executors | 15:53 |
corvus | components page shows ze01-ze06 paused | 15:56 |
corvus | and actually, looks like ze04 has exited already | 15:57 |
fungi | thanks for the heads up | 16:13 |
fungi | also i rebooted all the executors yesterday for a kernel upgrade, so they're probably fairly fresh processes anyway | 16:13 |
fungi | corvus: any way we can work in reboots of the schedulers and mergers? or is it too far along? | 16:13 |
corvus | fungi: oh sure, can do. we're still at wait for first half of executors to stop. | 16:16 |
corvus | 3 to go | 16:16 |
fungi | cool, as i said the 12 executors are already good on reboots, so it's just the 8 mergers and 2 schedulers | 16:17 |
fungi | i figure we'll do rolling reboots of the zk cluster next week | 16:17 |
corvus | ze01-06 are stopped | 17:10 |
fungi | awesome | 17:10 |
corvus | i'm starting them back up now | 17:10 |
fungi | that was, what, ~1.5 hours? | 17:10 |
fungi | granted, it's a saturday | 17:10 |
corvus | maybe 1:15 | 17:10 |
corvus | ze01-06 are back up; gracefully stopping 07-12 now | 17:12 |
corvus | ze07-12 show paused on the components page | 17:13 |
corvus | i'm going to stop all the mergers and reboot their hosts now | 17:13 |
fungi | oh, yep i keep forgetting that page exists | 17:13 |
corvus | fungi: just 'reboot' ? | 17:14 |
corvus | no other prep needed for the update? | 17:14 |
fungi | yeah, just reboot | 17:15 |
fungi | they'll already have the latest kernels installed | 17:16 |
fungi | Linux zm08 5.4.0-96-generic #109-Ubuntu SMP | 17:17 |
fungi | lgtm | 17:17 |
fungi | that's the one we wanted | 17:17 |
corvus | cool. mergers are back up. | 17:18 |
corvus | will restart zuul01 now | 17:19 |
fungi | thanks! | 17:19 |
fungi | and yeah, we discovered yesterday that the images we've been using recently explicitly disable unattended-upgrades, so i manually fixed the config on zuul01 and forced a package update there, but we still need to add some configuration management to solve that | 17:20 |
corvus | oops :( | 17:25 |
corvus | ok, zuul01 is coming back up now | 17:25 |
fungi | yep, it's running on the intended kernel now too, thanks | 17:26 |
corvus | zuul01 is back online; restarting zuul02 now; this will cause a web outage | 17:30 |
corvus | zuul02 is coming up | 17:34 |
fungi | thanks, booted kernel looks right on 02 now too | 17:42 |
corvus | it's fully up now. only tasks remaining are to wait for ze07-12 to stop and restart them. i may just check in later for that. or if anyone else wants to run `docker-compose up -d` on them after they stop, feel free. | 17:46 |
fungi | i'll check on them periodically as well and mention in here | 17:51 |
fungi | 09 no longer has any processes owned by zuuld, but i'll wait until at least a few executors have stopped before starting them back up | 18:53 |
fungi | 07 has stopped too | 19:56 |
fungi | 07-09 are stopped now, so i'll go ahead and start those back up | 21:25 |
fungi | 10-12 have not stopped yet | 21:25 |
fungi | https://zuul.opendev.org/components shows ze01-ze09 running now, ze10-ze12 still paused | 21:28 |
fungi | and those have finally stopped, so starting them up | 22:09 |
fungi | corvus: components page shows all 12 executors running again now on the same 4.11.1.dev66 1ed18610 version | 22:10 |
fungi | all components list a consistent version now | 22:11 |
corvus | \o/ thanks! | 22:14 |
corvus | #status log rolling restart of zuul onto commit 1ed186108956a1f7cc5fe34dc9d93731beaa56f6 | 22:14 |
opendevstatus | corvus: finished logging | 22:14 |
corvus | i think that's our first complete+successful rolling restart :) | 22:15 |
fungi | so smooth! | 22:18 |
fungi | though the second half of the executors did take close to 5 hours to come to a full graceful stop, that's fine when we're not in a hurry | 22:20 |
opendevreview | Neil Hanlon proposed openstack/diskimage-builder master: Setup rocky linux container https://review.opendev.org/c/openstack/diskimage-builder/+/825957 | 22:55 |
opendevreview | Neil Hanlon proposed openstack/diskimage-builder master: Setup rocky linux container https://review.opendev.org/c/openstack/diskimage-builder/+/825957 | 23:11 |
opendevreview | Neil Hanlon proposed openstack/diskimage-builder master: Add new container element - Rocky Linux https://review.opendev.org/c/openstack/diskimage-builder/+/825957 | 23:15 |