corvus | clarkb: whoops thanks. no rush on merging 970 we can just leave it | 00:09 |
---|---|---|
corvus | clarkb: i'm unlikely to debug those queue items in the near term, so if they need to go, that's fine | 00:09 |
Clark[m] | Ack sorry switched to dinner prep once I was happy with gerrit | 01:03 |
opendevreview | Tony Breeds proposed openstack/diskimage-builder master: Add a tool for displaying CPU flags and QEMU version https://review.opendev.org/c/openstack/diskimage-builder/+/937836 | 01:55 |
opendevreview | Tony Breeds proposed opendev/system-config master: Also include tzdata when installing ARA https://review.opendev.org/c/opendev/system-config/+/923684 | 01:59 |
opendevreview | Tony Breeds proposed opendev/system-config master: Update ansible-devel job to run on a newer bridge https://review.opendev.org/c/opendev/system-config/+/930538 | 01:59 |
opendevreview | Tony Breeds proposed opendev/system-config master: Update pip3 role to work on Ubuntu Noble https://review.opendev.org/c/opendev/system-config/+/934937 | 01:59 |
opendevreview | Tony Breeds proposed opendev/system-config master: Add ara and tzdata as installed requirements for the ansible-devel job https://review.opendev.org/c/opendev/system-config/+/934917 | 01:59 |
opendevreview | Tony Breeds proposed opendev/system-config master: Install ARA master in the ansible-devel job https://review.opendev.org/c/opendev/system-config/+/924012 | 01:59 |
opendevreview | Tony Breeds proposed opendev/system-config master: Add some debugging commands to the post job https://review.opendev.org/c/opendev/system-config/+/925667 | 01:59 |
*** ykarel_ is now known as ykarel | 06:02 | |
opendevreview | Karolina Kula proposed openstack/diskimage-builder master: WIP: Add support for CentOS Stream 10 https://review.opendev.org/c/openstack/diskimage-builder/+/934045 | 07:22 |
tonyb | karolinku[m]: can you add a depends-on: https://review.opendev.org/c/openstack/diskimage-builder/+/937836 to your change | 07:26 |
karolinku[m] | I just rebased on that change | 07:51 |
tonyb | ahh cool I didn't notice | 07:54 |
tonyb | karolinku[m]: Looks like you got an x86_64-v3 VM, but the job timed out. Would you like help debugging that? | 09:57 |
tonyb | Oh NM, the node from nodepool is v3, but the VM that got launched isn't so the CentOS-10 VM didn't boot. | 09:59 |
karolinku[m] | yes, this is the most recent problem. it looks like even if it shows Haswell, the AVX flags are disabled | 10:02 |
frickler | there aren't enough logs collected from devstack to be sure, but I assume the libvirt KVM flag doesn't make it into devstack really. either add more log collection or hold a node to check I'd suggest | 10:07 |
tonyb | It looks like it does, libvirt.cpu_model = Haswell is set according to https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_22c/934045/26/check/dib-nodepool-functional-openstack-centos-10-stream-src/22cc926/nodepool/openstack/screen-n-cpu.txt | 10:08 |
tonyb | The QEMU in the nodepool VM isn't new enough to emulate x86_64-v3 but it should be able to set it correctly | 10:09 |
tonyb | But yeah I think placing an autohold will be my next step | 10:38 |
tonyb | the config looks right but it'll be easier to verify that interactively | 10:38 |
tonyb | karolinku[m]: If you send me an SSH public key I can add you to the held node | 10:39 |
opendevreview | Rafal Lewandowski proposed openstack/diskimage-builder master: Prevent from overwriting grub defaults if no variables are set https://review.opendev.org/c/openstack/diskimage-builder/+/937684 | 11:07 |
karolinku[m] | tonyb: https://github.com/karolinku.keys | 11:31 |
hashar | clarkb: corvus: hi, I was looking at the Gerrit metrics for caches and "Cache disk metrics are expensive to compute on larger installations and are not computed by default." Which I guess explains why `gerrit show-cache` was slow yesterday | 12:26 |
* hashar at the bottom of https://gerrit.wikimedia.org/r/Documentation/metrics.html#_caches | 12:26 | |
hashar | I also restarted the Wikimedia Gerrit which has been up since October 22nd, or about two months | 12:39 |
hashar | git_file_diff.h2.db went from 12G to 500MB and gerrit_file_diff.h2.db from 3.3G to 683M as well | 12:40 |
hashar | other caches got made smaller | 12:40 |
hashar | all thanks to `-Dh2.maxCompactTime=15000` being passed to java | 12:41 |
*** darmach5 is now known as darmach | 14:24 | |
clarkb | hashar: looks like the compact time occurs during db shutdown not startup so we would have to restart then restart again before it takes effect. Not a big deal and still likely an improvement for us. I'll work on a change in a bit | 15:47 |
hashar | oh possibly yeah | 15:47 |
clarkb | infra-root I also want to delete the caches I moved aside on gerrit yesterday to return that disk space to the service. Any concerns with me doing that this morning? | 15:47 |
hashar | and your caches should have some noticeable size now | 15:48 |
clarkb | ya I need to load ssh keys and check | 15:49 |
frickler | I haven't noticed anything wrong with gerrit today, so seems fine to proceed with the cleanup | 15:53 |
clarkb | those two caches I moved aside are now up to 5GB and 4GB in production again. So ya they grow, but still much smaller than they were when I moved them aside | 16:02 |
clarkb | infra-root I have deleted the contents of /home/gerrit2/tmp/clarkb/old_caches/ on review02 | 16:08 |
clarkb | I will also shutdown the screen used to do the restarts yesterday | 16:08 |
opendevreview | Clark Boylan proposed opendev/system-config master: Set h2.maxCompactTime to 15 seconds https://review.opendev.org/c/opendev/system-config/+/938000 | 16:17 |
clarkb | hashar: ^ fyi and thanks again for the help | 16:17 |
hashar | oh you are even adding my blog post as a reference, that is kind :) | 16:22 |
clarkb | hashar: of course! | 16:23 |
clarkb | I am a fan of capturing as much info for future me and you as possible. Probably 90% of the time we never refer back to that info but the 10% that you do is so much better when you have the information. Also you did all the debugging and deserve the credit | 16:24 |
corvus | ++ | 16:28 |
hashar | I +1ed and left a comment about our gerrit being probably killed by systemd after 90 seconds | 16:30 |
hashar | and I have no idea why it takes a long time for it to stop | 16:30 |
hashar | that MIGHT be due to compaction | 16:30 |
hashar | or some other oddity, I have never investigated | 16:30 |
hashar | $ du -s -h node_modules/ | 16:37 |
hashar | 442M    node_modules/ | 16:37 |
hashar | err wrong window | 16:37 |
clarkb | fun fact those two large cache files were consuming more disk than the rest of the complete gerrit installation combined | 16:56 |
clarkb | the ubuntu ports reprepro process is still running. Doesn't look like the log file has updated for a few hours though so not sure if it is stalling out or just busy doing expected work | 17:53 |
opendevreview | Clark Boylan proposed opendev/system-config master: Set h2.maxCompactTime to 15 seconds https://review.opendev.org/c/opendev/system-config/+/938000 | 18:05 |
corvus | that's one heck of a cache | 19:52 |
hashar | clarkb: fun you apparently noticed the issue back in June 2022 based on https://review.opendev.org/c/opendev/system-config/+/849886 | 22:14 |
hashar | and I hit it live in December 2022 (resulting in an outage) | 22:14 |
hashar | that change can be abandoned | 22:14 |
* hashar sleeps | 22:15 | |
clarkb | oh good research | 22:15 |
clarkb | pre-commit doesn't log the versions of things it installs like tox and nox do | 23:38 |
* clarkb adds that to reasons to not use pre-commit | 23:38 |
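
For reference on the x86_64-v3 discussion above (09:57 to 10:09): a minimal sketch of how one might check, from inside a guest, whether the CPU flags added by the x86-64-v3 level are exposed. This is not the tool proposed in change 937836; the function name `missing_v3_flags` is hypothetical, and the flag list assumes the names Linux uses in `/proc/cpuinfo` (LZCNT reported as `abm`, OSXSAVE approximated by `xsave`). It only checks the v3 additions, not the v1/v2 baseline.

```python
# Flags that Linux reports in /proc/cpuinfo for the features x86-64-v3
# adds on top of v2. Assumption: LZCNT shows up as "abm" and OSXSAVE is
# approximated by the "xsave" flag.
X86_64_V3_FLAGS = frozenset(
    {"avx", "avx2", "bmi1", "bmi2", "f16c", "fma", "abm", "movbe", "xsave"}
)


def missing_v3_flags(cpuinfo_text):
    """Return the sorted v3 flags absent from the first 'flags' line.

    If the text has no 'flags' line at all, every v3 flag is reported
    as missing.
    """
    for line in cpuinfo_text.splitlines():
        name, _, value = line.partition(":")
        if name.strip() == "flags":
            return sorted(X86_64_V3_FLAGS - set(value.split()))
    return sorted(X86_64_V3_FLAGS)
```

On a Linux guest this could be called as `missing_v3_flags(open("/proc/cpuinfo").read())`; an empty list would suggest the v3 additions are visible, while a result containing `avx` or `avx2` would match the symptom karolinku[m] described.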