Wednesday, 2026-04-15

@harbott.osism.tech:regio.chatClark: fungi please check https://review.opendev.org/c/openstack/project-config/+/978566 when you have a moment, I think this is ready to get tested for real now06:58
-@gerrit:opendev.org- Kai Liu proposed: [zuul/zuul-jobs] 984689: Start zuul_console in prepare-workspace-{git,openshift} https://review.opendev.org/c/zuul/zuul-jobs/+/98468907:15
-@gerrit:opendev.org- Michal Nasiadka proposed: [opendev/system-config] 984694: Remove mirror03.gra1.ovh from configuration https://review.opendev.org/c/opendev/system-config/+/98469407:54
-@gerrit:opendev.org- Kai Liu proposed: [zuul/zuul-jobs] 984695: Fix missing {{ }} in remove-registry-tag role https://review.opendev.org/c/zuul/zuul-jobs/+/98469507:59
-@gerrit:opendev.org- Michal Nasiadka proposed: [opendev/system-config] 984694: Remove mirror03.gra1.ovh from configuration https://review.opendev.org/c/opendev/system-config/+/98469408:16
-@gerrit:opendev.org- Michal Nasiadka proposed: [opendev/system-config] 984698: Remove mirror02.ord.rax from configuration https://review.opendev.org/c/opendev/system-config/+/98469808:20
-@gerrit:opendev.org- Michal Nasiadka proposed: [opendev/system-config] 984702: Remove mirror02.dfw.rax from configuration https://review.opendev.org/c/opendev/system-config/+/98470208:40
-@gerrit:opendev.org- Michal Nasiadka proposed: [opendev/zone-opendev.org] 984738: Remove mirror03.gra1, mirror02.ord and mirror02.dfw from DNS https://review.opendev.org/c/opendev/zone-opendev.org/+/98473812:40
-@gerrit:opendev.org- Michal Nasiadka proposed: [opendev/system-config] 984694: Remove mirror03.gra1.ovh from configuration https://review.opendev.org/c/opendev/system-config/+/98469412:40
-@gerrit:opendev.org- Michal Nasiadka proposed: [opendev/system-config] 984698: Remove mirror02.ord.rax from configuration https://review.opendev.org/c/opendev/system-config/+/98469812:41
-@gerrit:opendev.org- Michal Nasiadka proposed:12:41
- [opendev/system-config] 984698: Remove mirror02.ord.rax from configuration https://review.opendev.org/c/opendev/system-config/+/984698
- [opendev/system-config] 984702: Remove mirror02.dfw.rax from configuration https://review.opendev.org/c/opendev/system-config/+/984702
-@gerrit:opendev.org- Michal Nasiadka proposed: [opendev/system-config] 984702: Remove mirror02.dfw.rax from configuration https://review.opendev.org/c/opendev/system-config/+/98470212:41
@mnasiadka:matrix.orgOk, the whole set is ready for reviews12:41
@lchams:matrix.orgHi, I hope this is the right place to send this message. I think the recent update to Gerrit may have broken things for my account. I can no longer auth with git review via ssh. I've tested from multiple machines, it always denies my key. Could anyone help me with this please?13:29
@fungicide:matrix.orgLeonie Chamberlin-Medd: can you let us know the error message? you might also add a `-v` on the command line to get more detail14:08
@lchams:matrix.orgAll here: https://paste.opendev.org/show/bzSg0Ok3cHdKAcToBRev/ 14:17
-@gerrit:opendev.org- Zuul merged on behalf of Michal Nasiadka: [openstack/project-config] 978566: propose-updates: Add pcu target https://review.opendev.org/c/openstack/project-config/+/97856614:18
@fungicide:matrix.orgLeonie Chamberlin-Medd: in the gerrit sshd log i'm seeing a "user not found" error for the username you're supplying14:32
@fungicide:matrix.orgi'll see if i can figure out what's up with that14:32
@fungicide:matrix.orgLeonie Chamberlin-Medd: when was the last time it worked for you? i'm not seeing any successful logins with that username for at least a month14:34
@harbott.osism.tech:regio.chatnot sure if this is a general zuul issue or just our database being slow: https://zuul.opendev.org/t/openstack/builds?project=openstack%2Fpython-cinderclient&project=openstack%2Fpython-brick-cinderclient-ext&pipeline=periodic-weekly&skip=0&limit=10 works fine for me, but having a larger limit the results page spins forever (limited by my patience) without showing results14:39
@clarkb:matrix.orgfungi: Leonie Chamberlin-Medd https://review.opendev.org/c/opendev/sandbox/+/984731 this was pushed today. Not sure if over ssh or https though14:49
@lchams:matrix.orgInteresting. I was using it absolutely fine last week on Wednesday. With LChams as the username right? 14:49
@lchams:matrix.orgYeah that was me testing https so I could keep submitting patches14:50
@clarkb:matrix.orgfungi: Leonie Chamberlin-Medd LChams shows up in the sshd_log from the 8th14:53
@clarkb:matrix.organd it looks like successful attempts. I would expect to see the unsuccessful attempts in today's log but don't14:54
@clarkb:matrix.orgreview.openstack.org appears to point at the correct location in DNS so that isn't the problem14:54
@clarkb:matrix.orgLeonie Chamberlin-Medd: can you go to https://review.opendev.org/settings/ and confirm that Username: says LChams?14:55
@clarkb:matrix.organd https://review.opendev.org/settings/#SSHKeys shows an expected key? I'm just trying to rule things out while we sort out what the problem could be14:55
@lchams:matrix.orgYep LChams14:55
@lchams:matrix.orgYeah I even tried generating a fresh key but nothing14:56
@clarkb:matrix.orgok. I think the next step is to try ssh directly maybe you're on a system that can't negotiate an ssh connection with the updated MINA SSHD for some reason. If you run `ssh -p 29418 LChams@review.opendev.org gerrit ls-projects` this is a command that should list all of the projects via ssh and can be used to test things with fewer tools in the way. If that doesn't work you can add -v up to -vvv to get more ssh client debug output to see what the issue might be from that side. Then I'm hoping that will also create something in the sshd_log I can check14:57
@clarkb:matrix.orgLeonie Chamberlin-Medd: nevermind I think i found the issue14:59
@clarkb:matrix.orgit doesn't like your key15:00
@clarkb:matrix.orgone second while I sanitize the log and share it15:00
@clarkb:matrix.org`failed (ExecutionError) to consult delegate for ssh-ed25519 key=SHA256:$KEYHASHHERE: java.lang.NoClassDefFoundError: net/i2p/crypto/eddsa/EdDSAPublicKey`15:00
@clarkb:matrix.orgthe only key I have in gerrit is also an ed25519 key so ed25519 generally works15:01
@clarkb:matrix.orgbut let me triple check that by using my key really quickly15:01
@clarkb:matrix.orgyes I can use my ed25519 key against gerrit just fine (I used ssh-add -c to get a confirmation prompt then confirmed the hash/name matches what ssh-add -l lists as an ed25519 key)15:03
@clarkb:matrix.orgare you using one of the newer -sk keys maybe?15:03
@clarkb:matrix.orgmy hunch here is that the key you supplied isn't strictly an ed25519 key (which is a subset of eddsa) and so MINA is looking for a generic eddsa handler and can't find it15:04
@clarkb:matrix.orgpossibly because your ssh client is supplying it as eddsa not ed25519 due to the difference15:05
@clarkb:matrix.orgthe example ssh command I gave above with an extra -vvv may help illustrate that case if it is the issue15:05
@fungicide:matrix.orgClark: Leonie Chamberlin-Medd: oh, sorry, i was going off the error from the paste which showed you trying to log in as user `leonie` and there were a bunch of errors in gerrit's `sshd_log` between 10:17 and 14:11 today saying `user-not-found` for that username15:07
@clarkb:matrix.orgfungi: yup then git review prompts for the user name after probing the local machine username and that also failed15:08
@clarkb:matrix.orgbut oddly sshd_log doesn't get the errors only error_log does15:08
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [opendev/system-config] 984795: Add a script to test user agent patterns https://review.opendev.org/c/opendev/system-config/+/98479515:10
@lchams:matrix.orghttps://paste.opendev.org/show/bAzQ5uFBuKt8LqGOp5Zn/ 15:11
@clarkb:matrix.orgLeonie Chamberlin-Medd: that is a different key than the one that was failing before but the client hash and the server hash match so it is that key that is failing15:14
@clarkb:matrix.organd to be clear it is still failing the same way as the prior key15:15
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [opendev/system-config] 984796: Remove a rewrite rule that matches googlebot https://review.opendev.org/c/opendev/system-config/+/98479615:16
@clarkb:matrix.orglet me rerun my test with my ED25519 key and see if there is a behavior delta15:16
@clarkb:matrix.orgin the ssh -vvv output I mean15:16
@jim:acmegating.comClark: fungi ^ i think that rule is blocking crawlers from our static sites15:16
@jim:acmegating.comClark: fungi note it has a parent change with a script that can help us find things like this15:17
@clarkb:matrix.orgLeonie Chamberlin-Medd: the only difference I see during that part of the negotiation is `explicit` vs `agent` but I think that just has to do with where the key lives15:18
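The comparison of `ssh -vvv` output described above can be made a little less manual. The following is a minimal sketch, assuming the client prints its key offers as `debug1: Offering public key: <name> <TYPE> SHA256:<hash> <source>` lines; the function name, the regular expression, and the sample text are illustrative assumptions, and the fingerprints are placeholders rather than real log data.

```python
import re

# Assumed shape of an OpenSSH client debug line for a key offer:
#   debug1: Offering public key: <path or comment> <TYPE> SHA256:<hash> <source>
OFFER_RE = re.compile(
    r"^debug1: Offering public key: (?P<name>.+?) "
    r"(?P<type>\S+) (?P<fingerprint>SHA256:\S+)(?: (?P<source>\w+))?\s*$"
)


def offered_keys(debug_output):
    """Return (type, fingerprint, source) for every key the client offered."""
    offers = []
    for line in debug_output.splitlines():
        match = OFFER_RE.match(line.strip())
        if match:
            offers.append((match["type"], match["fingerprint"], match["source"]))
    return offers


# Illustrative sample only; paths, comments, and fingerprints are placeholders.
SAMPLE = """\
debug1: Offering public key: /home/user/.ssh/opendev-ssh-key ED25519 SHA256:AAAAplaceholder explicit
debug1: Offering public key: user@laptop ED25519-SK SHA256:BBBBplaceholder agent
"""

for key_type, fingerprint, source in offered_keys(SAMPLE):
    print(key_type, fingerprint, source)
```

Listing the offers this way shows at a glance whether the client is presenting more keys, or different key types, than expected, and whether each came from the agent or from an explicit identity file.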
@clarkb:matrix.orgLeonie Chamberlin-Medd: how is the key generated? can you try rsa and see if that has the same problem?15:19
@clarkb:matrix.orgcorvus: I've approved both15:20
@fungicide:matrix.orgit's possible the mina-sshd version changed between gerrit 3.11 and 3.12, and they dropped support for some older key options or regressed on something related to key handling15:21
@clarkb:matrix.orgfungi: yes MINA did update. This new version should be post quantum ready for example15:21
@clarkb:matrix.orgI'm about to generate a new ed25519 key and test it15:22
-@gerrit:opendev.org- Michal Nasiadka proposed: [openstack/project-config] 984800: Follow up for I170bdb3ffc89bc3307a08da7cd4c7f2793a4e491 https://review.opendev.org/c/openstack/project-config/+/98480015:22
@clarkb:matrix.orgsince the key I am using is a couple years old15:22
@mnasiadka:matrix.orgfungi: noticed a flaw in job config in 984800 - and I'll followup with a mechanism to install pip-check-updates only in jobs that use it15:22
@fungicide:matrix.orgi'm popping out for a lunch errand, back in an hourish15:23
@lchams:matrix.orgGenerated with ssh-keygen -t ed25519. Just tried with an rsa key and am seeing the same issue15:23
@clarkb:matrix.orgLeonie Chamberlin-Medd: ok is that with openssh? and what version of openssh?15:24
@harbott.osism.tech:regio.chatseems https://docs.openstack.org/ is down?15:24
-@gerrit:opendev.org- Michal Nasiadka proposed: [openstack/project-config] 984804: Install pip-check-updates only in jobs requiring that https://review.opendev.org/c/openstack/project-config/+/98480415:25
@lchams:matrix.orgSorry yes. When on powershell ssh -V gives OpenSSH_for_Windows_9.5p2, LibreSSL 3.8.2 and then on wsl ubuntu OpenSSH_9.6p1 Ubuntu-3ubuntu13.15, OpenSSL 3.0.13 30 Jan 202415:27
@clarkb:matrix.orglooks like even with -i ssh will prefer keys in my agent so this is taking longer than I had hoped.15:27
@mnasiadka:matrix.orgClark: With ssh-agent I usually end up with that mix that works: -o "IdentitiesOnly yes" -o "IdentityFile path_to_the_public_key"15:29
@mnasiadka:matrix.org(or killing ssh-agent or removing keys from ssh-agent)15:29
@clarkb:matrix.orgyup I ended up doing `ssh-add -D` and then rerunning. I cannot reproduce. I generated a new key with `ssh-keygen -t ed25519 -f test-ed25519` and then ssh -i with that key works here. The explicit vs agent thing does seem to be as I assumed, as switching to the new key not in an agent reports explicit rather than agent.15:31
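For repeating this kind of test without emptying the agent, the `IdentitiesOnly` suggestion can be folded into the `gerrit ls-projects` check. This is a minimal sketch that only builds the argument list; the helper name and its defaults are assumptions for illustration, while the host, port, and Gerrit command are the ones quoted in the conversation.

```python
import shlex


def gerrit_ssh_command(user, key_path, host="review.opendev.org", port=29418,
                       verbosity=0):
    """Build an ssh argv that offers only the given identity file."""
    # IdentitiesOnly=yes tells ssh to use only explicitly configured
    # identities, even if an agent is holding other keys.
    args = ["ssh", "-o", "IdentitiesOnly=yes", "-i", key_path, "-p", str(port)]
    if verbosity:
        # ssh accepts up to three -v flags for increasing debug detail.
        args.append("-" + "v" * min(verbosity, 3))
    args += [f"{user}@{host}", "gerrit", "ls-projects"]
    return args


# Print the command instead of running it, so it can be inspected first.
print(shlex.join(gerrit_ssh_command("LChams", "test-ed25519", verbosity=3)))
```

The resulting list can be passed to `subprocess.run` as is; returning a list rather than a string avoids shell quoting problems with key paths that contain spaces.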
-@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [opendev/system-config] 984795: Add a script to test user agent patterns https://review.opendev.org/c/opendev/system-config/+/98479515:31
@clarkb:matrix.orgOpenSSH_10.2p1, OpenSSL 3.5.3 is my local ssh version running on opensuse tumbleweed15:31
@clarkb:matrix.orgLeonie Chamberlin-Medd: did both of those openssh versions generate a non working key for you?15:32
@clarkb:matrix.orgI think there must be something either in the key generation process or the client handling of the key that is not working with MINA SSHD15:33
@clarkb:matrix.orgsince both my old and new ed25519 keys work I don't think this is a problem with ed25519. You also reported that rsa has the same problem which is more evidence that the issue is maybe not type specific but more client specific for some reason15:34
@clarkb:matrix.orgLeonie Chamberlin-Medd: could it be an encoding issue maybe? Does clicking the "Click To View" button at https://review.opendev.org/settings/#SSHKeys work for the pubkey?15:37
@clarkb:matrix.orgI don't know if they do any validation of that content when rendering it. But if they do it's possible you may get an error there?15:38
@clarkb:matrix.orgJens Harbott: it does look like we're running a full complement of apache servers there again15:38
@lchams:matrix.orgYeah. Don't really use powershell but also from my vm (OpenSSH_9.6p1 Ubuntu-3ubuntu13.14, OpenSSL 3.0.13 30 Jan 2024) the keys aren't working. 15:40
Click to view on the ssh section works fine
@clarkb:matrix.orgLeonie Chamberlin-Medd: did you try RSA? I don't see a recent failure with rsa15:45
@clarkb:matrix.orgya if I grep -v eddsa I get no returns for failed ssh attempts15:46
-@gerrit:opendev.org- Michal Nasiadka proposed: [openstack/project-config] 984804: Install pip-check-updates only in jobs requiring that https://review.opendev.org/c/openstack/project-config/+/98480415:48
@lchams:matrix.orgJust tried again with rsa, can you see the fail?15:51
@clarkb:matrix.orgLeonie Chamberlin-Medd: yes, I see it but it reports the same error. As if the key is actually an ed25519/eddsa key15:52
@clarkb:matrix.orgsha256sums should be the same length for the two key types right so I can't infer anything from that15:53
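The point about fingerprint length can be checked directly: a SHA256 fingerprint is the base64 encoding of a 32-byte digest with the padding stripped, so its length does not depend on the size or type of the key. A minimal sketch, where the zero-filled blobs are stand-ins for real public key blobs and the sizes are only approximate:

```python
import base64
import hashlib


def sha256_fingerprint(key_blob):
    """Fingerprint in the SHA256:<unpadded base64> style shown by ssh tools."""
    digest = hashlib.sha256(key_blob).digest()
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")


# Stand-in blobs, not real keys; only their differing sizes matter here.
short_blob = b"\x00" * 51   # roughly the size of an ed25519 public key blob
long_blob = b"\x00" * 279   # roughly the size of a 2048-bit rsa public key blob

print(len(sha256_fingerprint(short_blob)), len(sha256_fingerprint(long_blob)))
```

Both lengths come out the same, which is why the key type cannot be inferred from the hash in the log line.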
@clarkb:matrix.orgI'm going to start reading the Gerrit source code now I guess15:54
@clarkb:matrix.orgLeonie Chamberlin-Medd: `WARN  com.google.gerrit.sshd.CachingPublicKeyAuthenticator` is the source of the issue and now I'm also wondering if the problem could be in whatever is caching the keys? Do you have other keys defined maybe and it is failing on them. Like maybe you added an -sk key at one time and it's looking through your keys and failing on that one before it can get to the others?15:56
@lchams:matrix.orgInteresting. It's definitely an rsa key.15:58
@clarkb:matrix.orgya that is why I'm wondering if it is looking at an entirely different key in this process15:58
@lchams:matrix.orgWe added my coworker's key to my account. We can see it fails with my username LChams but works with his, mattcrees https://paste.opendev.org/show/bWL3LcQjPtltoii24mrR/ 15:59
Same ssh key in both
@clarkb:matrix.orgeg you say "here's my rsa key" then it goes through its cache of keys looking for it and if it hits a broken key before hand then short circuits?15:59
@clarkb:matrix.orgok, that doesn't rule out the client or the server. But probably does rule out problems with key generation15:59
@clarkb:matrix.orgI think https://github.com/apache/mina-sshd/blob/sshd-2.15.0/sshd-core/src/main/java/org/apache/sshd/server/auth/pubkey/CachingPublicKeyAuthenticator.java#L57-L64 is generating the warning message in the logs. I think authenticate() is this: https://gerrit.googlesource.com/gerrit/+/refs/tags/v3.12.6/java/com/google/gerrit/sshd/DatabasePubKeyAuth.java#11116:10
@lchams:matrix.orgI do have a -sk key, but I think it shouldn't be offering it as I'm passing -c core.sshCommand="ssh -i ~/.ssh/opendev-ssh-key" I only see it offer the one key in the verbose output. I've tried killing the ssh agent in case it was caching anything, but this didn't help16:19
I've also just tried deleting all keys from Gerrit and adding a new ed25519 key but am still seeing issues
@clarkb:matrix.orgLeonie Chamberlin-Medd: ok I was asking questions upstream and they thought that maybe if you had -sk keys available locally it could cause this problem. I agree it seems from the -vvv output that your client is only offering what appears to be a normal ed25519 key16:20
@clarkb:matrix.orgdid you ever add that -sk key to gerrit?16:20
@clarkb:matrix.orgI'm beginning to suspect: https://gerrit.googlesource.com/gerrit/+/refs/tags/v3.12.6/java/com/google/gerrit/sshd/DatabasePubKeyAuth.java#128 is where things may be crashing if you did16:20
@lchams:matrix.orgYes it was added until I removed it a few mins ago16:21
@clarkb:matrix.orgok I think my theory then is that having a key type that Gerrit cannot handle may cause the authentication process to short circuit around line 128 linked above if that key is checked before any other valid keys16:22
@clarkb:matrix.orgnow if you've removed that key from your account but the problem is persisting the next question becomes how do we clear the cache for you16:22
@clarkb:matrix.orgI think I can run the `gerrit flush-caches sshkeys` command and flush it for everyone16:26
@lchams:matrix.orgJust got my coworker to try it and it works on his ubuntu machine now. I do need to head off, thank you for all your help today and we'll try and get it working on our end again soon. Must be an issue with the sk key 16:26
@clarkb:matrix.orgLeonie Chamberlin-Medd: try what?16:26
@lchams:matrix.orgwe uploaded my coworker's ssh key on my gerrit account and he could run git review -s just fine 16:27
@clarkb:matrix.orgoh interesting so something with your client then?16:27
@lchams:matrix.orgwasn't working before so deleting the sk key seems to be the fix 16:28
@clarkb:matrix.orggot it16:28
@lchams:matrix.orgThank you for your help! :) 16:28
@clarkb:matrix.orgto be clear I won't flush keys given that it seems to be working now16:29
@clarkb:matrix.orgfeel free to check back in later if you find new issues. But to summarize we suspect that having a -sk key within the user account caused gerrit to short circuit in key cache lookups when trying to find a matching key during authentication16:29
@clarkb:matrix.orgremoving the -sk key seems to have resolved the problem16:29
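The suspected failure mode summarized above can be illustrated in isolation. This is a hypothetical sketch of the theory, not Gerrit's or MINA SSHD's actual code: every name here is invented for illustration, and the exception stands in for the `NoClassDefFoundError` seen in the log.

```python
class UnsupportedKeyType(Exception):
    """Stand-in for the server failing to load a handler for a key type."""


def parse_key(stored):
    """Hypothetical parser that cannot handle -sk key types."""
    key_type, blob = stored
    if key_type.startswith("sk-"):
        raise UnsupportedKeyType(key_type)
    return (key_type, blob)


def authenticate_short_circuit(stored_keys, offered):
    """One unparseable stored key aborts the whole lookup."""
    for stored in stored_keys:
        if parse_key(stored) == offered:
            return True
    return False


def authenticate_tolerant(stored_keys, offered):
    """Skip stored keys that cannot be parsed and keep checking the rest."""
    for stored in stored_keys:
        try:
            candidate = parse_key(stored)
        except UnsupportedKeyType:
            continue
        if candidate == offered:
            return True
    return False


# The account holds an -sk key ahead of the valid key the client offers.
ACCOUNT_KEYS = [("sk-ssh-ed25519@openssh.com", "blob-a"), ("ssh-ed25519", "blob-b")]
OFFERED = ("ssh-ed25519", "blob-b")

try:
    authenticate_short_circuit(ACCOUNT_KEYS, OFFERED)
except UnsupportedKeyType as exc:
    print("lookup aborted on", exc)
print("tolerant lookup:", authenticate_tolerant(ACCOUNT_KEYS, OFFERED))
```

Under this theory the offered key is never compared at all, which would explain why the same error appeared for both the ed25519 and the rsa key, and why removing the -sk key from the account made the remaining keys work.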
@lchams:matrix.orgThanks again!16:30
@clarkb:matrix.orgJens Harbott: fungi re static/docs.openstack.org the mod security db is 2.6GB large. I think we should drop mod security from all of the vhosts now and restart apache and see if we can keep up better in that situation?16:41
@clarkb:matrix.orgI worry that the large db is causing requests that go through mod security to do large amounts of work just to look up IPs and that is slowing requests down which causes apache workers to be used for longer16:41
@clarkb:matrix.orgthen we eventually run out of workers16:42
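The worry about slow lookups exhausting workers follows from Little's law: the average number of busy workers is the request arrival rate multiplied by the average time each request holds a worker. A minimal sketch with made-up numbers; the request rate, timings, and worker limit below are assumptions for illustration, not measurements from the static server.

```python
import math


def workers_needed(requests_per_second, seconds_per_request):
    """Average number of busy workers (Little's law), rounded up."""
    return math.ceil(requests_per_second * seconds_per_request)


WORKER_LIMIT = 150  # hypothetical process limit, not the real configuration

for label, service_time in (("fast lookup", 0.25), ("slow lookup", 2.0)):
    busy = workers_needed(100, service_time)
    state = "exhausted" if busy > WORKER_LIMIT else "ok"
    print(f"{label}: {busy} workers busy of {WORKER_LIMIT} ({state})")
```

With the request rate held constant, an eightfold increase in per-request time needs eight times as many workers, so a slow per-request database check can hit the limit without any increase in traffic.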
@clarkb:matrix.orgbut I want to defer to fungi on that in particular as fungi has been dealing with the static stuff more than anyone else I think16:44
@clarkb:matrix.orgoh we still have mod security enabled for openstack docs too16:47
@clarkb:matrix.organd its rules are non trivial I suspect that is why we're slow again. I wonder why this was better on the other server? Maybe just less contention and dedicated resources/workers?16:47
@fungicide:matrix.orgit seems to have recovered for the moment?16:48
@fungicide:matrix.orgbut yes, i'll push a change to clean up those rules16:48
@clarkb:matrix.orgfungi: well but also the implication is that things are hitting those rules since the db is growing16:51
@clarkb:matrix.orgbut I guess it has to check the db on every request to determine if something is blocked16:51
@clarkb:matrix.organd this server is bigger than the old one that had issues that led to the mod security rules. So ya maybe we reset and evaluate with new valid info based on what we see after cleanups16:52
@fungicide:matrix.orgthat makes sense, yes17:05
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul-jobs] 984141: Add prepare-repos role https://review.opendev.org/c/zuul/zuul-jobs/+/98414117:08
-@gerrit:opendev.org- Ron Stone proposed: [openstack/project-config] 984830: Update StarlingX docs promote job for R12 release https://review.opendev.org/c/openstack/project-config/+/98483017:15
@clarkb:matrix.orgmnasiadka: we're removing nodes from inventory then from DNS? (just making sure I'm following the process properly)17:31
@mnasiadka:matrix.orgClark: we can do it in other order if you prefer that, we don’t use DNS names in Ansible inventory - but currently the DNS change depends on the system-config stack17:36
@clarkb:matrix.orgno I think this is fine. I'm just getting my bearings it has been a busy morning17:36
@clarkb:matrix.orgall three system-config changes lgtm. I'm not sure if anyone else will review those today (feels like everyone is busy). Maybe I single core approve them after lunch if no one else reviews them and I don't have new fires to fight17:37
@clarkb:matrix.orgThis is interesting docs.opendev.org seems to be quite responsive right now and the total apache process count is down below the limit. But system load is high17:46
@clarkb:matrix.orgso not sure if this is the leading edge of things are going to break shortly or we're just seeing different behaviors from different types of crawlers?17:46
@clarkb:matrix.orgok ya we are right back to the process limit17:46
@clarkb:matrix.organd its slower but still responsive17:47
@clarkb:matrix.organd load is falling back off again (but still really high)17:47
@fungicide:matrix.orgworking on the config change for that now, sorry for the delay17:49
@clarkb:matrix.orgload like that may be explained by the mod security db I suppose? I guess I'm trying to determine if we should change approach here17:50
@clarkb:matrix.orgbut I don't think so17:50
@jim:acmegating.comare we talking about https://review.opendev.org/981160 ?17:51
@fungicide:matrix.orgoh, looks like we are17:52
@fungicide:matrix.orgthanks for reminding me i'd already written this once17:52
@clarkb:matrix.orgyes except that change won't disable them as the mod is enabled. I think we also need to disable the mod?17:52
@fungicide:matrix.orgit drops all the additional secrules from the vhost though17:53
@jim:acmegating.comyeah, it looks like that change will remove the rules that add new ips to the honeypot, but we will still check.  but eventually, the list will be empty and not grow any more.  is that right?17:53
@clarkb:matrix.orgoh yup for docs.openstack.org specifically17:53
@clarkb:matrix.orgcorvus: yes I think if we clear the 2.6GB db as part of applying this change that would be the case17:53
@clarkb:matrix.orgso that seems like a reasonable next step even without disabling the mod security module17:54
@jim:acmegating.comat that point, are we using mod_security for anything?  should we disable it in a followup to that change?17:54
@clarkb:matrix.orgcorvus: I think we should disable it as a followup yes17:54
@clarkb:matrix.orgit was an experiment that has turned out to have a fatal flaw, and looking for other tools seems appropriate at this point17:54
@fungicide:matrix.orglooks like that change is not even a month old. what does it say about the past month that it's erased my etch-a-sketch?17:54
@clarkb:matrix.orgfungi: should we go ahead and approve https://review.opendev.org/c/opendev/system-config/+/981160 now or was there something you wanted to modify in it?17:55
@jim:acmegating.comfungi: haha -- all these bot fighting days run together17:55
@fungicide:matrix.orgi'm still good with it as long as there's no merge conflict17:55
@clarkb:matrix.orgI have approved it17:59
@clarkb:matrix.orgGerrit doesn't report a merge conflict and zuul should report one quickly if there is one17:59
-@gerrit:opendev.org- Zuul merged on behalf of Jeremy Stanley https://matrix.to/#/@fungicide:matrix.org: [opendev/system-config] 981160: Drop custom WAF rules from docs.openstack.org https://review.opendev.org/c/opendev/system-config/+/98116018:14
@fungicide:matrix.orgload average on static03 is still falling, back under 4.0 now18:24
@fungicide:matrix.organd there are still crawlers hitting static04 even almost a day after we moved dns away from it18:24
@jim:acmegating.comjust for situational awareness -- the googlebot cleanup patch hit a post-run failure so i rechecked it.  based on the jobs, looks like this affects lists and gitea too.18:30
@fungicide:matrix.orgyes, they all use the same ua filter18:31
@jim:acmegating.comi don't see a job for static, actually -- should we add the user agents file to the static job?18:31
@fungicide:matrix.orgprobably yes18:31
@fungicide:matrix.orgClark: ^ ?18:31
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [opendev/system-config] 984848: Add the ua-filter to static job file matcher https://review.opendev.org/c/opendev/system-config/+/98484818:32
@jim:acmegating.comdid that as a followup ^18:32
@clarkb:matrix.orgoh yes we removed static because lp was failing18:33
@clarkb:matrix.orgbut I thought that was only on the system-config-run side not infra-prod-service- side18:33
@jim:acmegating.comi was talking about system-config-run18:33
@clarkb:matrix.orgah got it18:33
@jim:acmegating.comyeah looks like it's still there on prod-service18:34
@clarkb:matrix.orglooks like the deployment did run against static for the custom waf rules18:34
@clarkb:matrix.orgfungi: I think we need to stop apache, remove the /var/cache/modsecurity content then start apache again. Should I go ahead and do that?18:34
@fungicide:matrix.orgplease do, should only take a few seconds of downtime18:35
@clarkb:matrix.orgcorvus: I went ahead and approved 984848 since it is test only18:35
@fungicide:matrix.orgmost of that will be apache freeing swap18:35
@clarkb:matrix.org`systemctl stop apache2 && rm /var/cache/modsecurity/www-data-ip.dir && rm /var/cache/modsecurity/www-data-ip.pag && systemctl start apache2` ok this is the command I'll run18:36
@fungicide:matrix.orglgtm18:38
@clarkb:matrix.orgthat is done now18:38
-@gerrit:opendev.org- Zuul merged on behalf of Ron Stone: [openstack/project-config] 984830: Update StarlingX docs promote job for R12 release https://review.opendev.org/c/openstack/project-config/+/98483018:43
@clarkb:matrix.orgapache looks pretty happy right now so maybe this was the ticket18:49
@fungicide:matrix.orgyeah, load average is back down around 0.318:51
@fungicide:matrix.orgthrilled that the solution was that simple18:51
-@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com:18:51
- [opendev/system-config] 984796: Remove a rewrite rule that matches googlebot https://review.opendev.org/c/opendev/system-config/+/984796
- [opendev/system-config] 984848: Add the ua-filter to static job file matcher https://review.opendev.org/c/opendev/system-config/+/984848
@clarkb:matrix.orgok lunch is over and I said I would proceed with mirror cleanups from mnasiadka going to do that now. static's load appears to have skyrocketed again while I was eating and is falling now19:54
@clarkb:matrix.orgbut the mod security db is still size 0 and the service is responsive and isn't using all worker processes so I think that may just be an artifact of how things are crawling?19:54
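Checking whether the database has started growing again can be scripted. A minimal sketch, assuming the data lives in files matching `www-data-ip.*` under `/var/cache/modsecurity`, as in the cleanup command quoted earlier; the function name and the glob pattern default are illustrative assumptions.

```python
from pathlib import Path


def cache_size_bytes(cache_dir, pattern="www-data-ip.*"):
    """Total size in bytes of the files in cache_dir matching pattern."""
    return sum(
        path.stat().st_size
        for path in Path(cache_dir).glob(pattern)
        if path.is_file()
    )


if __name__ == "__main__":
    # A missing directory simply yields 0 because the glob matches nothing.
    print(cache_size_bytes("/var/cache/modsecurity"))
```

Running this periodically and comparing the result against the earlier 2.6GB figure would show whether any remaining rules are still adding entries.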
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed:20:17
- [opendev/zuul-providers] 982182: Add Ubuntu resolute image build job https://review.opendev.org/c/opendev/zuul-providers/+/982182
- [opendev/zuul-providers] 984866: Run vhd builds first https://review.opendev.org/c/opendev/zuul-providers/+/984866
@jim:acmegating.comClark Jens Harbott mnasiadka ^ that's an idea to try to fit in the space we have20:17
@jim:acmegating.comhttps://opendev.org/openstack/diskimage-builder/src/branch/master/diskimage_builder/lib/img-functions#L149-L154 is a reference btw.  i looked at the vhd-util source and i don't see a way to avoid that without modification.20:18
@clarkb:matrix.org+2 from me. I think even if that doesn't fix this particular issue it should make things more reliable over all as the total disk usage at any one time should be reduced20:18
@jim:acmegating.comer, i think that commit should say "vhd conversion" not build but you get the idea20:18
@jim:acmegating.comanother idea i had: we could remove /opt/dib_cache after the build but before the conversion -- is there a dib element hook point or whatever you call them where we could put that?20:20
@clarkb:matrix.orgcorvus: I think the "hooks" are the run phases. I suspect that we could add an element that executes a cache cleanup in the very last run phase20:20
@clarkb:matrix.orglet me find the docs for the order20:20
@clarkb:matrix.orgcorvus: https://docs.openstack.org/diskimage-builder/2.7.0/developer/developing_elements.html20:23
@clarkb:matrix.orgI think cleanup.d would be an appropriate spot for that20:23
@clarkb:matrix.orgit runs outside of the chroot too so we don't have to break out or do anything weird like that20:24
@jim:acmegating.comcool, thanks, in progress20:24
@clarkb:matrix.orginfra-root one of the things on my todo list is cleaning up the old h2 v1 caches from Gerrit 3.11 as they are consuming something like 60GB20:25
@clarkb:matrix.organy concern with doing that at this point? It seems unlikely we will revert and if we do we can start over with new caches like we did when we updated to h2 v220:26
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [opendev/zuul-providers] 984867: Delete cache at end of build https://review.opendev.org/c/opendev/zuul-providers/+/98486720:29
@jim:acmegating.comClark: h2 cleanup sounds good20:30
@jim:acmegating.comClark: ^ i did the delete-cache as a followup since it's more uncertain; but if we still have problems, we can move it up the stack20:30
@jim:acmegating.comeach of those changes should give us ~10GB20:30
@clarkb:matrix.orgsounds great reviewing that one now20:31
@clarkb:matrix.orgcorvus: I did post one thought but its more of a "if this doesn't work maybe try this next" option rather than something we need to address if it works as is20:33
-@gerrit:opendev.org- Zuul merged on behalf of Michal Nasiadka: [opendev/system-config] 984694: Remove mirror03.gra1.ovh from configuration https://review.opendev.org/c/opendev/system-config/+/98469420:33
@clarkb:matrix.orgok old h2 v1 caches are deleted. It actually freed up about 79GB20:39
@clarkb:matrix.orgI have to pop out in about an hour and 15 minutes to shuttle kids around. I've just approved the mirror02.ord.rax mirror cleanup change. Not sure I'll get to the dfw one today20:45
@clarkb:matrix.orgthe deployment for the first mirror03.gra1.ovh mirror removal failed on LE. Looks like we got an ansible rc -13 error against gitea10. This is the issue that occurs with ansible trying to use an ssh control persisted connection that is shutting down iirc. The deployment for removal of mirror02.ord.rax should hopefully be fine and then we'll be set. I don't think I need to reenqueue the failed deployment as these changes are pulling things out of the inventory20:59
-@gerrit:opendev.org- Zuul merged on behalf of Michal Nasiadka: [opendev/system-config] 984698: Remove mirror02.ord.rax from configuration https://review.opendev.org/c/opendev/system-config/+/98469821:27
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [opendev/zuul-providers] 984867: Delete cache at end of build https://review.opendev.org/c/opendev/zuul-providers/+/98486721:29
@jim:acmegating.com*sudo* delete cache  :)21:30
@clarkb:matrix.orgOh I had the same problem when I tried to delete the old Gerrit caches21:30
@clarkb:matrix.orgFirst time I didn't have permissions had to sudo :)21:30
@clarkb:matrix.orgletsencrypt did successfully deploy in the second mirror cleanup change's deploy buildset21:52
@clarkb:matrix.orgfungi: if you have a moment https://review.opendev.org/c/opendev/gerritlib/+/983476 and https://review.opendev.org/c/opendev/jeepyb/+/983482 are sort of related to the git-review change in that I'm trying to improve the testing that we run against tools that talk to gerrit. Note I think the jeepyb change may rebuild Gerrit images which isn't as big of a deal now that we're pinned to specific tags but keep that in mind as I'm out friday etc. Happy to approve next week during the ptg if we prefer21:57
