clarkb | wow that managed to merge first pass after the check +1. | 00:02 |
---|---|---|
clarkb | All of the images appear to have promoted too | 00:02 |
*** rlandy is now known as rlandy|out | 00:41 | |
ianw | weirdly i can not update a change on gerrit's gerrit | 00:41 |
ianw | https://gerrit-review.googlesource.com/admin/repos/plugins/zuul-results-summary,access | 00:41 |
ianw | suggests to me that i should be able to, but ... "cannot add patch set to" | 00:42 |
opendevreview | Ian Wienand proposed opendev/system-config master: reprepro doc: mention contents.cache.db https://review.opendev.org/c/opendev/system-config/+/857530 | 01:30 |
opendevreview | Merged opendev/system-config master: translate: fix dump with MySQL 5.7 https://review.opendev.org/c/opendev/system-config/+/857241 | 02:00 |
opendevreview | Ian Wienand proposed opendev/system-config master: Add an .allowedSigners file https://review.opendev.org/c/opendev/system-config/+/857542 | 03:33 |
ianw | interested to see what we all think about ^ :) | 03:34 |
ianw | -rw-r--r-- 1 10004 root 210M Sep 14 03:43 contents.cache.db | 03:43 |
ianw | -rw-r--r-- 1 10004 root 724M Sep 13 18:26 contents.cache.db.old | 03:43 |
ianw | so it's ... ~30% done? i started it at about 8am, it's ~2pm now ... so i make it about 16-17 hours to go, at this rate | 03:44 |
ianw | 7pm-ish UTC | 03:45 |
*** ysandeep|out is now known as ysandeep | 05:52 | |
*** jpena|off is now known as jpena | 07:37 | |
fungi | okay, so it did end up needing to slowly rebuild the cache? | 11:31 |
*** dviroel|brb is now known as dviroel | 11:37 | |
opendevreview | Merged openstack/project-config master: Add Keystone OpenID Connect charm to OpenStack charms https://review.opendev.org/c/openstack/project-config/+/857492 | 12:23 |
frickler | amorin: the flavors that you mentioned for testing the nested kvm issue are a bit low on cpu and ram, would using [bc]2-15 or even -30 work, too? otherwise I can set up a smaller devstack based test | 12:38 |
*** jpena is now known as jpena|off | 13:03 | |
*** ysandeep is now known as ysandeep|afk | 13:15 | |
*** ysandeep|afk is now known as ysandeep | 13:41 | |
*** ysandeep is now known as ysandeep|out | 13:43 | |
corvus | clarkb: looks like the zuul reboot script stopped after an error on zm05 | 13:51 |
*** Guest305 is now known as dasm | 13:57 | |
Clark[m] | corvus: it is probably something related to docker-compose ps output not discriminating running containers vs exited like docker ps. I'll take a look after the school run this morning | 13:57 |
corvus | yeah, it said no container found, so it may have been a previously stopped container or something (?) | 13:57 |
Clark[m] | Ya zm05 has stopped containers as fallout from the previous crash. I thought I had addressed that but maybe not completely. | 13:58 |
Clark[m] | And we should be able to restart with a limit to mergers and schedulers | 13:58 |
Clark[m] | To avoid iterating through executors again | 13:59 |
*** frenzyfriday is now known as frenzyfriday|lunch | 14:11 | |
*** jpena|off is now known as jpena | 14:55 | |
amorin | frickler ack, c2-15 is ending on the same hardware as c2-7 | 14:55 |
amorin | so it's good | 14:55 |
amorin | same for -30 | 14:55 |
opendevreview | Clark Boylan proposed opendev/system-config master: Fix error checking with zuul graceful stops https://review.opendev.org/c/opendev/system-config/+/857725 | 15:17 |
clarkb | corvus: ^ | 15:18 |
clarkb | I can restart the playbook limiting it to mergers and schedulers if we are happy with that and it lands | 15:18 |
*** dviroel is now known as dviroel|lunch | 15:21 | |
fungi | clarkb: when you were cleaning up the old mm3 holds, it looks like you left one from two weeks ago | 15:24 |
fungi | though in good news, the hold i set yesterday did trigger, so we should have a fresh example to work from now with the latest state of the implementation | 15:24 |
clarkb | fungi: the one from two weeks ago is the one you did the last round of imports on. I didn't want to clean it up from under you | 15:25 |
clarkb | fungi: if you are done with it feel free to remove it though | 15:25 |
fungi | ahh, okay, yeah i thought you'd asked if i was done with it. happy to delete it now. thanks! | 15:25 |
fungi | and done | 15:26 |
fungi | 104.239.143.143 is the latest held listserv | 15:27 |
*** marios is now known as marios|out | 15:28 | |
opendevreview | Alfredo Moralejo proposed zuul/zuul-jobs master: Use AFS mirrors for extras-common in CS9 https://review.opendev.org/c/zuul/zuul-jobs/+/857730 | 15:36 |
fungi | rsyncs from the production listservs to the new held node are underway | 15:36 |
*** frenzyfriday|lunch is now known as frenzyfriday | 15:55 | |
*** tweining_ is now known as tweining | 16:01 | |
*** dviroel|lunch is now known as dviroel | 16:43 | |
*** jpena is now known as jpena|off | 17:00 | |
clarkb | I think https://github.com/ansible/ansible/issues/76572 is related to our ansible slowness | 17:22 |
clarkb | it affects ansible 5 but not 6. They appear to have decided to not fix ansible 5 for some reason | 17:22 |
clarkb | testing locally this definitely has an impact (though harder to measure how big of one without doing far more involved testing) | 17:23 |
clarkb | I went ahead and approved the zuul reboot playbook update | 17:24 |
corvus | clarkb: we still have ansible 2 available; you could throw up a change to switch to ansible 2 for a job and compare runtimes | 17:29 |
clarkb | corvus: thats a good point. I'll give that a go | 17:30 |
corvus | clarkb: 857725 lgtm | 17:31 |
opendevreview | Clark Boylan proposed opendev/system-config master: DNM Check if older ansible 2.9 pipelining is faster than ansible 5 https://review.opendev.org/c/opendev/system-config/+/857748 | 17:35 |
clarkb | Using the system-config-run-zuul job since that has a large nodeset which seems to amplify these issues | 17:35 |
opendevreview | James E. Blair proposed opendev/system-config master: Add Jaeger tracing server https://review.opendev.org/c/opendev/system-config/+/855983 | 18:01 |
clarkb | corvus: ^ the currently running system-config-run-zuul for the reboot playbook change took about 2 minutes to copy ssh host keys to all the hosts. That same process took about a minute on the job running for the ansible 2.9 DNM change on rax-ord | 18:01 |
clarkb | not a direct comparison but at least we haven't disproven it helps yet | 18:01 |
corvus | groovy! | 18:02 |
corvus | clarkb: i love that the change to add ansible 6 is waiting on the release of zuul 6.4.0 which is waiting on the opendev restart which is waiting on the zuul test job which is slow because we don't have ansible 6 | 18:04 |
fungi | zuul 6.4.0, in which franz kafka meets rube goldberg for a few beers | 18:06 |
clarkb | checking the previous 3 successful runs: 2 minutes, 3 minutes, and 6 minutes for the same block of task work | 18:06 |
clarkb | none of them ran on rax-ord but the 2 minute one did run on rax-dfw | 18:06 |
clarkb | definitely seems like this may help | 18:07 |
clarkb | I can recheck the dnm change a couple times to generate more data too | 18:07 |
clarkb | In a way this was a good thing. We did find a few places where the code was just inefficient even if we speed it up a bit with pipelining | 18:09 |
*** persia is now known as Guest377 | 18:13 | |
opendevreview | Merged opendev/system-config master: Fix error checking with zuul graceful stops https://review.opendev.org/c/opendev/system-config/+/857725 | 18:19 |
clarkb | once the repo on bridge updates I'll start the playbook again with a --limit 'zuul-merger:zuul-scheduler' added | 18:20 |
clarkb | the playbook is running limited to mergers and schedulers | 18:27 |
*** rlandy is now known as rlandy|mtg | 18:28 | |
clarkb | corvus: schedulers are restarting now. I notice https://zuul.opendev.org/t/openstack/build/90a46366443d4e3baaa14beb97619764 failed without logs. Makes me suspicious given timing but I haven't started tracking down why that happened yet | 18:41 |
clarkb | I can probably take a look after lunch if no one beats me to it | 18:41 |
clarkb | corvus: reboot is done | 18:47 |
*** rlandy|mtg is now known as rlandy | 19:23 | |
clarkb | the error was a decryption error | 19:48 |
clarkb | other jobs don't seem to have that issue so likely due to the job config | 19:49 |
clarkb | it ran on ze02 if you want to find the logs for it. I think we're ok | 19:49 |
fungi | argh, bit of a hangup on the latest mm3 import test. the held node is in rackspace which only gets a 37gb rootfs. i'll need to move /var/lib/mailman to the ephemeral disk before i can get all the data copied over | 20:17 |
fungi | wish i'd noticed that before i copied some 30gb of data over to it and filled the rootfs | 20:23 |
fungi | hopefully the local file move won't take too long, and then i can resume the rsync rather than having to re-copy everything over the net | 20:23 |
*** dviroel is now known as dviroel|afk | 20:26 | |
fungi | going to pop out for some food while the resumed prod mailman copies finish up | 20:44 |
fungi | bbiaw | 20:44 |
*** timburke__ is now known as timburke | 20:59 | |
*** dasm is now known as dasm|off | 21:27 | |
*** rlandy is now known as rlandy|bbl | 22:01 | |
*** dviroel|afk is now known as dviroel | 22:12 | |
opendevreview | James E. Blair proposed opendev/system-config master: Add Jaeger tracing server https://review.opendev.org/c/opendev/system-config/+/855983 | 22:15 |
fungi | see, going out to the biergarten is the answer to slow file transfers. i'm certain that's what got them to finish | 22:24 |
ianw | -rw-r--r-- 1 10004 root 555M Sep 14 20:41 contents.cache.db | 22:25 |
ianw | i predicted 7pm utc, so close enough :) | 22:26 |
fungi | i don't think anyone else's jellybean count was closer, so you win | 22:26 |
ianw | so the debian volume has been released -- i guess afaik we'll call it fixed? why exactly the file went corrupt is still unknown? | 22:27 |
ianw | i'm just moving the old file to my homedir just in case, but i've dropped the locks | 22:29 |
ianw | #status performed reprepro db recovery on debian mirror; has been synced and volume released | 22:29 |
opendevstatus | ianw: unknown command | 22:29 |
ianw | #status log performed reprepro db recovery on debian mirror; has been synced and volume released | 22:29 |
opendevstatus | ianw: finished logging | 22:29 |
ianw | i added a quick on it via https://review.opendev.org/c/opendev/system-config/+/857530 too | 22:30 |
ianw | a quick note on it | 22:30 |
ianw | also we are still seeing the "afs: Warning: We are having trouble keeping the AFS stat cache trimmed down under the configured limit" popping up every now and then | 22:31 |
fungi | ianw: best guess is that whatever event leaked the reprepro lockfile on the 13th terminated the process and left the bdb file for its content cache dirty, but that's really as far as we got | 22:33 |
clarkb | ianw: https://review.opendev.org/c/opendev/system-config/+/857520 was my attempt at dealing with the afs warning but it appears I got something horribly wrong | 22:42 |
opendevreview | Clark Boylan proposed opendev/system-config master: Up openafs client -stat value https://review.opendev.org/c/opendev/system-config/+/857520 | 22:43 |
clarkb | maybe that is better | 22:43 |
*** dviroel is now known as dviroel|afk | 22:52 | |
opendevreview | Merged zuul/zuul-jobs master: Update gpg key file for extras-common in CS9 https://review.opendev.org/c/zuul/zuul-jobs/+/838450 | 23:10 |
ianw | yeah i couldn't find much googling for that error | 23:19 |
*** timburke_ is now known as timburke | 23:33 |
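
A note on the rebuild estimate from 03:43-03:45: ianw's "~30% done" and "hours to go" figures are a linear extrapolation from the size of the new contents.cache.db against the old one. The sketch below reproduces that arithmetic. The function name and the assumption of a constant write rate are illustrative only and are not anything reprepro provides. The old file size is also only a proxy for the finished size (the rebuilt file ended up at 555M, not 724M), so the result is a rough upper bound.

```python
def estimate_remaining_hours(done_mb: float, total_mb: float, elapsed_hours: float) -> float:
    """Linearly extrapolate the time left from the progress made so far.

    Hypothetical helper: assumes the rebuild writes at a constant rate and
    that total_mb (the old file's size) approximates the finished size.
    """
    if done_mb <= 0 or elapsed_hours <= 0:
        raise ValueError("need some progress and elapsed time to extrapolate")
    rate_mb_per_hour = done_mb / elapsed_hours
    remaining_mb = max(total_mb - done_mb, 0.0)
    return remaining_mb / rate_mb_per_hour


# Figures quoted in the log: 210M written, old file 724M, roughly 6 hours elapsed.
fraction_done = 210 / 724
hours_left = estimate_remaining_hours(210, 724, 6)
print(f"{fraction_done:.0%} done, about {hours_left:.1f} hours to go")
```

With the quoted numbers this gives a little under 15 hours, in the same ballpark as the 16-17 hours guessed in channel. The rebuild finished close to the predicted time, which fits the write rate not being perfectly constant.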