ianw | # grep 'Could not access submodule' * | wc -l | 00:36 |
ianw | 137 | 00:36 |
ianw | this is hitting a lot on the builders updating the cache | 00:36 |
ianw | # grep -B1 'Could not access submodule' * | grep 'Updating cache' | awk '{print $7}' | uniq -c | 00:38 |
ianw | 31 https://opendev.org/openstack/openstack.git | 00:38 |
ianw | it really looks like this repo is the major cause | 00:38 |
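A minimal consolidation of the triage above, as a sketch: it assumes the commands run from the directory holding the builder logs, and adds a `sort` before `uniq -c` so the per-repo counts aggregate even when matches are not adjacent.

```shell
# total number of submodule access failures across all build logs
grep 'Could not access submodule' * | wc -l

# which upstream repo was being cached when each failure happened
grep -B1 'Could not access submodule' * \
  | grep 'Updating cache' \
  | awk '{print $7}' \
  | sort | uniq -c | sort -rn
```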
ianw | i'm going to have to try to extract the script and run it separately. what we are doing is fairly convoluted | 00:44 |
fungi | i expect i'm partly to blame, i vaguely recall adding bits to force caching of all the branches which was... ugly | 01:32 |
*** ricolin_ is now known as ricolin | 04:18 | |
ianw | https://paste.opendev.org/show/810990/ is what's happening distilled | 04:27 |
*** pojadhav|afk is now known as pojadhav | 04:56 | |
opendevreview | Merged opendev/system-config master: Enable mirroring of centos stream 9 contents https://review.opendev.org/c/opendev/system-config/+/817136 | 05:23 |
*** ysandeep|out is now known as ysandeep | 05:24 | |
opendevreview | Ian Wienand proposed opendev/system-config master: Revert "Enable mirroring of centos stream 9 contents" https://review.opendev.org/c/opendev/system-config/+/817899 | 06:08 |
opendevreview | Ian Wienand proposed opendev/system-config master: Enable mirroring of 9-stream https://review.opendev.org/c/opendev/system-config/+/817900 | 06:08 |
ianw | fungi: i totally missed that it uses /mirror/centos-stream which needs to be a new volume mount. i think it makes sense to do it like that, which 817900 enables. | 06:09 |
opendevreview | Ian Wienand proposed opendev/system-config master: Enable mirroring of 9-stream https://review.opendev.org/c/opendev/system-config/+/817900 | 06:55 |
opendevreview | Merged opendev/system-config master: Revert "Enable mirroring of centos stream 9 contents" https://review.opendev.org/c/opendev/system-config/+/817899 | 06:57 |
ianw | fungi: ^ i've set up the users/permissions/volumes/quota (200g) etc. for 817900 now | 08:09 |
ianw | ahh, i replicated the no such module issue | 08:15 |
ianw | https://paste.opendev.org/show/810992/ | 08:15 |
ianw | $ git --version | 08:15 |
ianw | git version 2.33.1 | 08:15 |
ianw | so this happens at least on my f35 version and git version 2.30.2 on the builder | 08:15 |
ianw | the final script that replicated this was -> https://paste.opendev.org/show/810993/ | 08:16 |
ianw | infra-root: ^ might be good if someone else can confirm this. i see suspects in either the gerrit submodule update plugin thing, replication to gitea, gitea serving itself, or git client issues ... i.e. i have no idea :) | 08:17 |
ianw | running it, leaving it for a while for a few things to merge, then running again is required. i can't get a deterministic replication, yet | 08:18 |
*** ykarel__ is now known as ykarel | 08:19 | |
ianw | i'm out of time for now, but tracing through a request to see which backend it ends up at, and seeing if anything in gitea logs will be my first thought | 08:20 |
opendevreview | Merged opendev/irc-meetings master: Update QA meeting info https://review.opendev.org/c/opendev/irc-meetings/+/817224 | 09:00 |
*** giblet is now known as gibi | 09:02 | |
*** pojadhav is now known as pojadhav|lunch | 09:21 | |
*** ykarel is now known as ykarel|lunch | 09:42 | |
*** ysandeep is now known as ysandeep|afk | 09:44 | |
*** pojadhav|lunch is now known as pojadhav | 10:04 | |
*** ykarel|lunch is now known as ykarel | 10:43 | |
*** ysandeep|afk is now known as ysandeep | 10:57 | |
*** jpena|off is now known as jpena | 10:59 | |
fungi | ianw: in which log files were you finding that error? | 12:43 |
opendevreview | Merged opendev/system-config master: Enable mirroring of 9-stream https://review.opendev.org/c/opendev/system-config/+/817900 | 13:10 |
fungi | i'll check in on that ^ once it's deployed | 13:11 |
frickler | fungi: those errors seem to be in the build logs, e.g. https://nb01.opendev.org/centos-8-stream-0000129890.log | 13:54 |
fungi | ahh, thanks, i checked some build logs but just didn't get lucky | 14:09 |
frickler | seems one has a good chance when looking at those that are much shorter than the others, but not too short like the c-9-s builds | 14:11 |
fungi | is it the clone or the fetch triggering that? i guess it's the fetch operation? | 14:30 |
noonedeadpunk | hey folks! can I ask to review https://review.opendev.org/c/openstack/project-config/+/817271 and https://review.opendev.org/c/openstack/project-config/+/799825 ? | 14:48 |
fungi | sure, lookin | 14:52 |
*** ykarel is now known as ykarel|away | 15:03 | |
opendevreview | Merged openstack/project-config master: Add Vault role to Zuul jobs https://review.opendev.org/c/openstack/project-config/+/799825 | 15:08 |
fungi | that centos stream 9 mirror change failed to deploy, checking the log on bridge.o.o now | 15:15 |
fungi | broke on "mirror-update : Copy keytab files in place" | 15:16 |
fungi | but unfortunately, "censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result" | 15:16 |
fungi | maybe the private hiera is missing the value or mistyped | 15:17 |
fungi | i see a mirror_update_keytab_centos-stream in host_vars/mirror-update01.opendev.org.yaml and it looks properly base64 encoded | 15:19 |
fungi | aha! they needed to be added to host_vars/mirror-update02.opendev.org.yaml instead | 15:27 |
fungi | i've copied it into there now | 15:29 |
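A quick sanity check for that kind of mixup, as a sketch: it assumes it is run from the directory on bridge that holds the private host_vars files (the exact location is not shown in this log).

```shell
# show which mirror-update host file actually defines the keytab variable
grep -l 'mirror_update_keytab_centos-stream' host_vars/mirror-update0*.opendev.org.yaml
```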
Clark[m] | re the openstack/openstack repo I wonder if we are replicating the openstack/openstack submodule update prior to replicating the ref the submodule was updated to point at. This is possible because we run multiple replication threads and submodule updates may be cheap but updates to other repos less so? | 15:38 |
Clark[m] | And if we try to fetch/clone in that period of time we lose the race. Now this should only be an issue if actually updating or checking the submodules which we don't need to do in the git cache? Maybe we can force git to be more naive there | 15:39 |
fungi | yeah, i have a feeling at some point git fetch changed its behavior to also update or validate submodules by default | 15:48 |
fungi | or maybe the default behavior is expressly overridden on specific distros? | 15:49 |
*** jpena is now known as jpena|off | 16:08 | |
fungi | perhaps we need to explicitly pass --recurse-submodules=false? | 16:09 |
fungi | "Use on-demand to only recurse into a populated submodule when the superproject retrieves a commit that updates the submodule’s reference to a commit that isn’t already in the local submodule clone. By default, on-demand is used..." | 16:11 |
fungi | i think that explains why we're seeing it try | 16:11 |
fungi | and yes, i agree a race between openstack/openstack and some other project it treats as a submodule getting replicated in the "wrong" order could explain it | 16:12 |
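A sketch of what the cache update could look like with submodule recursion disabled, per the suggestion above; the cache path is illustrative, and `--recurse-submodules=no` is used here since that is the spelling `git fetch` documents (rather than `false`).

```shell
# update the cached openstack/openstack clone without ever touching
# submodules; on newer git the default is --recurse-submodules=on-demand
git --git-dir=/opt/git-cache/openstack/openstack/.git \
    fetch --prune --update-head-ok --recurse-submodules=no \
    https://opendev.org/openstack/openstack.git \
    '+refs/heads/*:refs/heads/*' '+refs/tags/*:refs/tags/*'
```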
*** ysandeep is now known as ysandeep|out | 16:34 | |
clarkb | fungi: it might be nice to get https://review.opendev.org/c/opendev/system-config/+/816771 and/or https://review.opendev.org/c/opendev/system-config/+/816869/ in today if we are feeling confident in them? | 17:21 |
clarkb | (it's early in the week etc etc) | 17:21 |
fungi | i'll take a look | 17:21 |
clarkb | I'm happy to pick one or the other so that we can monitor if we think that is prudent. Mostly didn't want that work to stall out because I got sick last week | 17:22 |
clarkb | I do have a meeting here in the next few minutes. Some people are interested in hearing how we do arm64 ci stuff | 17:22 |
clarkb | But other than that I expect to be around all day and able to help | 17:22 |
clarkb | also I need to do a system update that is removing xorg-video-mach64 which I don't think is used by my current gpu, but on the off chance it is I'll be falling back to the laptop I guess | 17:23 |
clarkb | when this meeting is done I'll send email to linaro about the expiring cert and what appear to be a number of leaked instances as well | 17:27 |
fungi | okay, so if i were to approve 816771 now, that's cool with you? | 17:30 |
clarkb | fungi: I think so, assuming you are able to help watch it as my meeting just started | 17:31 |
*** jgwentworth is now known as melwitt | 17:34 | |
fungi | yeah, i can. as for your point about the admin user possibly being used for some service, while i doubt we would have done that, it's not easy to search our configuration for it since that string appears all over | 17:37 |
opendevreview | Merged opendev/system-config master: Cleanup users launch-node.py might have used https://review.opendev.org/c/opendev/system-config/+/816771 | 17:56 |
fungi | watching that deploy ^ | 17:59 |
clarkb | fungi: ya I doubt we have too. Ok meeting over. Turns out that the hardware we are on for the linaro cloud is old and needs to go away, so we need to work with equinix and linaro to shift onto new hardware. I'm working on an email to get kevinz_ looped into that conversation next | 18:00 |
clarkb | then also an email about the ssl cert and the leaked nodes :) | 18:00 |
fungi | yeah, it started complaining about the 30-day expiration over the weekend | 18:00 |
fungi | should hopefully see /var/log/ansible/base.yaml.log start updating once the infra-prod-base job is underway | 18:02 |
fungi | looks like it's working on the account updates now | 18:05 |
fungi | occasional stderr about the ubuntu account having no mail spool to remove | 18:06 |
clarkb | fungi: anything unexpected or concerning? | 18:06 |
fungi | not that i could see, though it spammed past very quickly so i need to take a slower look through the log | 18:07 |
fungi | i'll wait to make sure the build result comes back | 18:07 |
clarkb | ok first email sent | 18:14 |
clarkb | and now I've also asked kevinz_ to look at the leaked nodes and the ssl cert. I feel so productive today :) | 18:20 |
clarkb | fungi: looks like the base job finished but failed? | 18:21 |
fungi | yeah, i'm looking now to see what host failed | 18:22 |
clarkb | fungi: it was lists.o.o, a dpkg lock was being held so it couldn't update the cache | 18:22 |
clarkb | I think that is the only error so we are probably good | 18:22 |
fungi | E: Unable to acquire the dpkg frontend lock (/var/lib/dpkg/lock-frontend), is another process using it? | 18:22 |
fungi | yep | 18:22 |
clarkb | whew | 18:23 |
fungi | i'm putting together a list of hosts now which seem to have gotten accounts removed in that build | 18:23 |
clarkb | thanks. I spot checked a couple to make sure the removal actually happened and it seems to have | 18:24 |
clarkb | I'm going to do that local update and hope it doesn't break my xorg (will fallback to laptop if it does) | 18:24 |
clarkb | Gerrit User Summit has been announced for December 2&3 | 18:25 |
clarkb | I'll probably try to attend that at least. But also propose a talk about how we deploy Gerrit these days as I think the automation we've got is probably interesting to other gerrit users | 18:26 |
fungi | these reported 4 changed tasks instead of the usual 3 (or occasional 2): afs01.ord afs02.dfw afsdb02 cacti02 ethercalc02 lists.katacontainers.io translate01 | 18:26 |
clarkb | fungi: but those 4 tasks weren't anything unusual? | 18:27 |
clarkb | I guess only odd in that they were different? | 18:27 |
fungi | i think it represents the ones which had accounts removed, but i'm checking now to be sure | 18:27 |
clarkb | fungi: note the task count is for the entire playbook not just the user stuff | 18:30 |
clarkb | spot checking lists.kc.io I think it only did one change for the user | 18:30 |
fungi | for translate01, the tasks which reported changed status were the ubuntu account removal, running `ua status`, adding the puppet remote ssh key to the root account, and running `apt-get autoremove -y` | 18:31 |
clarkb | ya if you filter by changed.*lists.katacontainers.io you can see what they are. That one lgtm | 18:31 |
fungi | i bet the list with 4 changed tasks is the ones with ua configured | 18:31 |
clarkb | yup similar for lists.kc.io (no ua status) | 18:31 |
fungi | okay, so it'll probably be the ones with only 2 changed tasks which had no accounts to remove | 18:32 |
clarkb | ya that seems likely | 18:33 |
fungi | gitea* and mirror02.mtl01.inap | 18:33 |
fungi | that does indeed make far more sense | 18:33 |
clarkb | alright this seems happy. I'm going to go ahead with my local reboot now. Back in a bit (I hope!) | 18:34 |
fungi | the two changed tasks for mirror02.mtl01.inap were adding the puppet remote ssh key to the root account, and running `apt-get autoremove -y` | 18:35 |
fungi | no account removals | 18:35 |
fungi | so that does seem reasonable | 18:35 |
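For reference, the filtering described above as a one-liner sketch; the log path is the one mentioned earlier in the discussion and the host pattern is illustrative.

```shell
# show only the tasks that reported "changed" for a given host in the
# infra-prod-base deploy log
grep -E 'changed.*lists\.katacontainers\.io' /var/log/ansible/base.yaml.log
```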
fungi | okay, i've approved 816869 now | 18:37 |
clarkb | ok back again. | 18:39 |
clarkb | https://review.opendev.org/c/opendev/system-config/+/816772 is the other one, but I'm still a bit worried we might end up breaking functionality for those instances if we do this | 18:39 |
clarkb | cloudnull: ^ may know? | 18:39 |
* cloudnull looking | 18:40 | |
clarkb | cloudnull: mostly I'm not sure if we need to keep that because Xen or similar. | 18:40 |
clarkb | like will console access stop working or live migration or etc | 18:40 |
cloudnull | ¯\_(ツ)_/¯ | 18:41 |
fungi | yeah, in short, what do we lose by uninstalling nova-agent on rackspace server instances? | 18:41 |
clarkb | ha ok. I wonder if the source code is readable on disk. We might be able to decipher it that way | 18:41 |
cloudnull | the nova agent used to query the xen meta data and setup networking. but IDK if that's still a thing | 18:41 |
clarkb | that is a good hint that it does important tasks though | 18:42 |
clarkb | and we should be careful :) | 18:42 |
cloudnull | ++ | 18:42 |
cloudnull | IDK if the rax cloud still relies on that ? | 18:42 |
clarkb | ya maybe what we should do is boot a throwaway instance and do some testing | 18:42 |
cloudnull | ++ | 18:42 |
fungi | on bridge.o.o, dpkg says we used to have rax-nova-agent installed but no longer do (and do not have python3-nova-agent installed either) | 18:42 |
clarkb | fungi: interesting I thought we were uninstalling it at one point but then couldn't find evidence of that. Maybe because the package name changed | 18:43 |
fungi | we do have qemu-guest-agent installed though | 18:43 |
fungi | amusingly, /usr/bin/nova-agent (supplied by the python3-nova-agent package) is an executable zipball | 18:45 |
fungi | /usr/bin/nova-agent: Zip archive data, made by v?[0x314], extract using at least v2.0, last modified Fri Sep 13 04:00:00 2013, uncompressed size 87176, method=store, K\003\004\024 | 18:45 |
fungi | it seems to bundle netifaces, pyyaml, pyxs, distro, and novaagent | 18:46 |
fungi | the __main__.py supplied in the root of the bundle looks like a sort of entrypoint wrapper calling novaagent.novaagent.main() | 18:49 |
fungi | looks like https://github.com/Rackspace-DOT/nova-agent is probably the current source | 18:50 |
fungi | and has a readme explaining what it's for | 18:50 |
clarkb | nice find | 18:50 |
fungi | and we clearly have at least some server instances in rackspace without any version of that agent running | 18:52 |
clarkb | the network changes are probably what we would need to dig into more. Is it possible for those to happen without our input (I really doubt it as that would be an outage for the users) | 18:52 |
clarkb | and ya I did think we removed it at one time, but then couldn't find where we had done that | 18:52 |
fungi | yes, in the past we've seen the agent change "backend" (eth1) interface routes when they added or removed rfc-1918 networks in a particular region, and also change our dns resolver configuration | 18:54 |
fungi | which is what i seem to recall prompted us to start uninstalling it the first time | 18:54 |
clarkb | ah | 18:54 |
clarkb | fungi: any idea where we uninstalled it before? I swear I looked when I wrote this change and couldn't find it. But if we can find those logs we may find additional useful info | 19:17 |
fungi | looking to see if i can find archeological evidence of it | 19:36 |
fungi | clarkb: https://review.opendev.org/713341 merged last year and seems related, https://review.openstack.org/84543 cleaned up the old service removal code in 2014 though | 19:39 |
fungi | oh, no not 84543 that was in a job config | 19:40 |
fungi | i have a feeling if we did take the agent off our servers at some point, we did so manually | 19:43 |
opendevreview | Merged opendev/system-config master: Lower UID/GID range max to make way for containers https://review.opendev.org/c/opendev/system-config/+/816869 | 20:20 |
clarkb | sorry grabbed lunch. Ya I guess it could've been manual | 20:22 |
clarkb | wow the deploy jobs for the first change are still running and ^ is queued up behind that | 20:23 |
clarkb | corvus: I'm updating tomorrow's meeting agenda. For zuul we are running two mostly up to date schedulers and zuul-web talks to zk directly now too so no more weird flapping? | 20:28 |
clarkb | corvus: is gearman completely removed at this point or is it still around for the cli commands? What is the plan there? | 20:28 |
clarkb | also generally seems like queues are working as we expect them to today. I don't see anything that looks odd in zuul status right now | 20:47 |
clarkb | corvus: we might want to consider a zuul tag this week if that holds up? | 20:47 |
fungi | yeah, it's been looking good since the upgrade and even the hitless rolling scheduler restart turned up no obvious issues | 20:48 |
corvus | clarkb: 1: correct. 2: gearman is still around vestigially; we need to write more changes to finish removing it. a zuul tag sounds potentially useful, but i think we may want to merge a few more bugfixes. | 20:53 |
clarkb | cool thanks for the update | 20:54 |
clarkb | I'll go do some hashtag:sos reviews now | 20:55 |
ianw | fungi: thanks for fixing up the 9-stream error. the initial deploy might need to be done under lock? | 21:09 |
ianw | we should delete that old hiera file; also the instructions for using hiera edit didn't work either, i'll look at both | 21:09 |
ianw | the 9-stream update did timeout. i'm running it now under no_timeout lock in a root screen | 21:30 |
ianw | (on mirror-update) | 21:33 |
fungi | oh, thanks i didn't check back to see if it deployed properly after i fixed it | 21:42 |
clarkb | fungi: the most recent run of infra-prod-base for the uid and gid shift also failed on lists.o.o due to the dpkg lock. I suspect we may have a stale lock there? | 21:58 |
clarkb | fungi: I think there is an apt-get autoremove from november 11 that is causing that to happen on lists.o.o | 22:11 |
clarkb | fungi: also looking in /boot I see two vmlinuz files newer than the one that I seem to have extracted so that xen could boot it. I don't think our package pins are working there | 22:13 |
clarkb | fungi: if you get a chance can you take a look? We can probably go ahead and extract the two newer files and put them in place too? | 22:14 |
clarkb | oh wait we end up using menu.lst I think | 22:16 |
clarkb | oh except that chain loads grub2 | 22:17 |
clarkb | ya Ok I've paged enough of this back in again. Basically we rely on the menu.lst which xen reads to chainload grub2 | 22:18 |
clarkb | grub2 will boot a new kernel by default which is compressed with the wrong compression algorithm for xen. This means our kernel pinning isn't working and we need to extract newer kernels before rebooting | 22:18 |
clarkb | I wonder if the autoremove is broken due to our pinning and we've confused it somehow? | 22:19 |
opendevreview | Ian Wienand proposed opendev/system-config master: mirror: Add centos-stream link https://review.opendev.org/c/opendev/system-config/+/818026 | 22:27 |
clarkb | https://review.opendev.org/c/opendev/base-jobs/+/817289 and child are a couple of what should be safe cleanups to our base jobs | 22:27 |
clarkb | basically we can stop scanning for the growroot info because we've not had problems with that recently | 22:28 |
ianw | ok, my "git --git-dir=openstack_179b61797588a5983c2f97c6533dca570c8f887d/.git fetch --prune --update-head-ok https://opendev.org/openstack/openstack.git '+refs/heads/*:refs/heads/*' '+refs/tags/*:refs/tags/*'" | 22:34 |
ianw | just got a bunch of "Could not access submodule 'ceilometer'" messages | 22:35 |
ianw | i'm going to try tracing what backend i went to | 22:35 |
ianw | ...45c8:37818 [15/Nov/2021:22:34:16.037] balance_git_https [38.108.68.124]:53688 balance_git_https/gitea08.opendev.org 1/0/2280 79208 -- 27/26/25/1/0 0/0 must have been me | 22:37 |
clarkb | ianw: did you see my theory that openstack/openstack is getting replicated before the ceilometer repo in this case? | 22:38 |
clarkb | ianw: we should be able to check the timestamps in the gerrit replication log to check that now that we have a concrete case | 22:38 |
ianw | Nov 15 22:09:20 gitea08 docker-gitea[846]: 2021/11/15 22:09:20 Completed GET /openstack/openstack/commit/bf679fc618d360a7d1f3b329bef50f67d3d40fa3 500 Internal Server Error in 19.713562ms | 22:39 |
fungi | ianw: and that's coming from the fetch, right? any chance you could try to recreate with --recurse-submodules=false? i have a feeling it's the default of on-demand which is causing it | 22:42 |
fungi | i feel like there's probably no point in recursing submodules in our git cache anyway | 22:43 |
clarkb | ya the jobs update all the repos when they run | 22:43 |
clarkb | we just want enough object data in place that those updates are not super slow | 22:44 |
clarkb | fungi: please see notes about lists.o.o above too | 22:44 |
ianw | here is the error from above -> https://paste.opendev.org/show/811016/ | 22:44 |
fungi | clarkb: yeah, i'm trying to make sure i remember what the specifics are with the initrd there | 22:45 |
clarkb | fungi: I don't think it was initrd. It is vmlinuz since that is compressed by some newer algorithm on ubuntu now that xen doesn't understand. In my homedir on lists under kernel-stuff is a script from the linux kernel repo that will extract an uncompressed kernel | 22:46 |
fungi | oh, yep right you are | 22:46 |
clarkb | Our boot sequence is something like xen finds grub1 menu.lst and finds our pvchain loader boot thing which knows how to read the grub 2 config and get the kernel from that for xen | 22:47 |
fungi | yeah | 22:47 |
clarkb | xen then tries to uncompress it and fails or it finds a pre decompressed kernel and is fine | 22:47 |
fungi | i'm going to kill the hung autoremove and try running it manually without the -y | 22:47 |
clarkb | we thought we had pinned the kernel on that server to prevent my old uncompressed file from getting replaced but there are two newer kernels in grub now | 22:48 |
clarkb | fungi: ok | 22:48 |
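For the kernel side, a sketch of the extraction step being discussed; `extract-vmlinux` ships in the kernel source tree as scripts/extract-vmlinux, but the local copy's location, the filenames, and swapping the image in place are all assumptions here.

```shell
cd /boot
# keep the original compressed image around
cp vmlinuz-5.4.0-90-generic vmlinuz-5.4.0-90-generic.orig
# extract-vmlinux writes the decompressed kernel to stdout
/path/to/kernel-stuff/extract-vmlinux vmlinuz-5.4.0-90-generic.orig \
  > vmlinuz-5.4.0-90-generic.extracted
# sanity check before swapping it in; expect an uncompressed Linux/ELF image
file vmlinuz-5.4.0-90-generic.extracted
mv vmlinuz-5.4.0-90-generic.extracted vmlinuz-5.4.0-90-generic
```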
ianw | https://paste.opendev.org/show/811017/ was the update and the list of submodules that "failed" | 22:48 |
fungi | The following packages will be REMOVED: linux-image-5.4.0-88-generic | 22:48 |
clarkb | ianw: hrm the number of submodules complaining there would imply my theory isn't very valid (just too many for that race to be a problem) | 22:48 |
fungi | we're booted on 5.4.0-84-generic | 22:49 |
fungi | there's a vmlinuz-5.4.0-90-generic installed | 22:49 |
clarkb | fungi: yes 84-generic is the one I uncompressed (you'll see the original and a copy of the uncompressed version in my homedir) | 22:49 |
fungi | and a vmlinuz-5.4.0-89-generic | 22:49 |
clarkb | fungi: we can uncompress -90, but then when -91 happens we'll be in the same boat | 22:49 |
fungi | right | 22:49 |
clarkb | mostly I am worried about getting our pin working then we can uncompress whatever is current | 22:49 |
fungi | but also the autoremove isn't trying to remove the one we've booted from | 22:50 |
clarkb | got it | 22:50 |
fungi | and yeah, looks like it's probably set to boot vmlinuz-5.4.0-90-generic by default | 22:53 |
ianw | ok, here's my hit on the LB, and the subsequent hits on the gitea08 backend -> https://paste.opendev.org/show/811018/ | 22:56 |
ianw | there are these "Unsupported cached value type: <nil>" errors periodically, but in this case it's a red-herring | 22:56 |
ianw | from the actual error (https://paste.opendev.org/show/811017/) i do not see any backend hits for charm-* -- i.e. i do not think my client is making any requests related to these "missing" submodules | 22:58 |
ianw | none of these submodules are populated -- i don't think it is doing --recurse-submodules | 22:59 |
clarkb | https://opendev.org/openstack/openstack/src/branch/master/.gitmodules#L245-L252 those submodules are relative to the openstack/openstack repo | 23:01 |
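A quick way to see how those submodules are declared, assuming a local clone of openstack/openstack; relative entries resolve against the superproject's own remote URL, which is why the fetches go back through opendev.org.

```shell
cd openstack   # an existing clone of openstack/openstack (path assumed)
# list every declared submodule URL; relative entries (e.g. ../ceilometer)
# are resolved against this repo's configured remote URL
git config -f .gitmodules --get-regexp 'submodule\..*\.url'
```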
fungi | ianw: --recurse-submodules=on-demand is the default on recent git versions | 23:03 |
fungi | which is why i suspected it might be related | 23:03 |
ianw | https://github.com/git/git/blob/master/submodule.c#L1525 is where the error comes from | 23:03 |
clarkb | it is complaining the dir isn't empty? | 23:04 |
ianw | feels like it must be hitting that !is_empty_dir | 23:05 |
fungi | what would cause it to populate the directory though? | 23:05 |
clarkb | doing a clone of openstack/openstack on my desktop using git 2.33.1 I get all empty dirs for the submodules which seems to line up with not hitting that error | 23:06 |
ianw | fungi: what does the "-on-demand" do in the fetch case? | 23:06 |
clarkb | I wonder if some subsequent action is populating some of those submodules, then the next time we try to update it complains? | 23:06 |
fungi | the manpage explains, but basically it recurses if the commit updates the reference for a submodule | 23:06 |
fungi | which is basically all the commits in openstack/openstack ever do | 23:07 |
fungi | so may as well be --recurse-submodules=true where that repo is concerned | 23:07 |
ianw | it says "populated submodules" | 23:09 |
clarkb | fungi: if you have a moment for https://review.opendev.org/c/opendev/base-jobs/+/817289 we can land that one and I can run a quick test then we can land the child | 23:09 |
ianw | since i don't think this has populated the submodules at all, i don't think it's trying to clone those bits at all | 23:09 |
ianw | https://github.com/git/git/commit/505a27659638614157a36b218fdaf25fe9fed0ce is what introduces this check | 23:11 |
fungi | yeah, and the git clone manpage doesn't state what its default for --recurse-submodules is unfortunately, so it seems like it defaults to off | 23:12 |
fungi | so any of the submodules in the cache look like they've got a populated worktree? | 23:12 |
fungi | i guess we should expect to have a submodule.active set for those if so | 23:13 |
clarkb | or at least not empty | 23:13 |
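A rough spot check for that condition, run inside the cached openstack/openstack checkout (path assumed): it lists any declared submodule path whose directory exists and is not empty, which is what the linked git code trips over.

```shell
git config -f .gitmodules --get-regexp '\.path$' | awk '{print $2}' |
  while read -r p; do
    # a populated (non-empty) submodule directory is what triggers the error
    [ -d "$p" ] && [ -n "$(ls -A "$p")" ] && echo "populated: $p"
  done
```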
ianw | it's definitely racy though. https://paste.opendev.org/show/811017/ isn't a consistent list of what fails, it always changes | 23:13 |
ianw | i guess instrumenting git to print why it thinks the directory isn't empty might help | 23:14 |
clarkb | fungi: re lists. Let me know if you want me to extract the -90 kernel or I can walk you through it. But I'll defer to you on package pinning as that always seems like apt magic to me | 23:15 |
fungi | yeah, linux-image-5.4.0-84-generic is in held state, which explains why it's not trying to remove that | 23:23 |
fungi | linux-image-5.4.0-88-generic was listed as being in an error state, but when i manually re-ran `apt-get autoremove` and agreed to remove the package, it seems to have done so without error | 23:24 |
fungi | now 89 is showing removed, 84 is still installed, and rerunning autoremove no longer thinks it needs to do anything | 23:24 |
fungi | clarkb: i'll try to get up with you tomorrow about decompressing 90 and testing a reboot on that | 23:25 |
fungi | er, 88 is showing removed i meant to say, 89 and 90 are still installed as the two most recent kernels, as is 84 because we held it | 23:26 |
fungi | it seems to have previously cleaned up 85, 86 and 87 without problem, so not sure what caused it to choke on 88, i wasn't able to reproduce the problem | 23:26 |
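On the pinning question, a minimal sketch with apt-mark using the package names from this discussion; whether holding only the versioned image is enough for this boot setup is exactly the open question above.

```shell
# confirm what is currently pinned (the log above suggests -84 is held)
apt-mark showhold
# hold the known-bootable kernel so autoremove never takes it away
apt-mark hold linux-image-5.4.0-84-generic
# note: this does not stop the linux-image-generic metapackage from pulling
# in newer kernels, which would explain why -89 and -90 still showed up
```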
clarkb | sounds good. The process to use the extraction tool is straightforward, and it came out of the linux kernel git repo. Would be good to have someone else that knows how to use it though :) | 23:28 |