Thursday, 2023-11-30

opendevreviewBrian Rosmaita proposed openstack/project-config master: Implement openstack-unmaintained-core group
fricklerclarkb: johnsom: ah, sorry for blocking that patch, I just wanted to make sure that it reaches the V+1 state so it can get merged faster after the gerrit update. but then it looks like I gave a good practicing opportunity to tonyb, so not too bad in the end ;)07:55
*** maxh1 is now known as maxh08:01
*** tobias-urdin is now known as tobias-urdin-pto09:34
tonybfrickler: yup it worked out pretty well really 12:57
ykarelHi, I need help checking what's wrong with job neutron-ovs-tempest-plugin-scenario-iptables_hybrid-nftables; it fails with RETRY_LIMIT, and in the console the last task is Preparing Job Workspace14:19
ykarelcan someone check zuul executor log, or any other way to troubleshoot it?14:21
fungiykarel: looks like it's hitting that on stable/xena but working on stable/yoga, correct?14:22
fungiykarel: this is what i'm finding on the executor. i think it's having trouble with the checkout override for neutron-tempest-plugin:
opendevreviewMerged openstack/project-config master: Implement openstack-unmaintained-core group
fungiykarel: looks like 2023-11-18 is when it started doing this, but the checkout override has been in place since 2023-04-21, so i suspect this is a zuul behavior change14:36
fricklerthat override is in place for half a year, can this be a regression in zuul's handling of it?14:37
fungicorvus: if you have a moment, does that look like it could be related to ref handling changes a couple of weeks ago? seems like it started happening coincident with our automated zuul upgrade two weeks ago14:37
corvusfungi: can do14:39
fungiskimming codesearch, we don't have a lot of jobs specifying override-checkout to a tag, but there are more beyond just neutron14:42
frickler looks suspicious to me, it only talks about branch refs, not about tags14:42
ykarelfungi, yes only xena impacted, for yoga it works fine. last success of job was on 17th Nov14:47
fricklernice, found another javascript error in the zuul UI while checking for other examples, reported in #zuul14:51
clarkbI've got a meeting in 20 ish minutes. After that I'll generate a new ed25519 ssh key for gerrit -> gitea replication and push a change to rotate it into gitea as well as add it to hostvars for gerrit16:33
opendevreviewBrian Rosmaita proposed openstack/project-config master: Address TODO in acl normalization script
*** tkajinam is now known as Guest866617:02
tonybclarkb: Did that key get made?17:23
TheJuliafungi: you can reclaim that node, thanks. looks like it was all stupid human tricks17:27
fungiTheJulia: isn't it always? ;)17:28
TheJuliaoften, yes17:28
clarkbtonyb: not yet, I'm about to do that17:28
fungihold has been deleted, thanks TheJulia!17:28
clarkbtonyb: is that something you want to do together?17:29
clarkbtonyb: I'm currently doing system updates then will reboot and then load ssh keys so I can ssh into bridge and do that17:30
clarkbsystem updates locally to be clear17:30
tonybfungi: zuul-client autohold-list #to get the request_id; zuul-client autohold-delete request_id ?17:30
clarkbtonyb: yup you need to pass the --tenant openstack (or other tenant name) flag as well but those are the commands17:30
fungitonyb: exactly, on one of the schedulers, with sudo or as root, and need to tell it the tenant17:31
tonybcool beans17:31
fungithough to parrot corvus's comment from yesterday, it could be done through the webui too17:31
tonybclarkb: I figured it was local updates ;P  and sure I guess creating all new credentials doesn't happen often so it'd be good to see if there is anything unexpected17:32
tonybfungi, corvus: noted17:32
clarkbtonyb: ok I've rebooted and added ssh keys. We can use that same meetpad we've been using?17:41
clarkband I've started a root screen on bridge17:42
tonybokay. gimme 2 mins17:43
tonybokay I'm attached17:44
fungiproposed command lgtm17:45
fungilooks like it worked17:46
clarkbyup I'll edit hostvars in a sec17:46
clarkbjust trying to find the old one to make them adjacent17:48
clarkbhave to refer to the code to find the old names17:48
fungihopefully gerrit handles new-style private keys17:50
clarkbhrm that is a good question17:51
clarkbI feel like it didn't before and we had to manually convert all of our keys at one point but I'm not sure if MINA updates have solved that17:51
fungithey've been "new" for enough years that i expect it's not an issue17:51
clarkbnow to figure out which version of mina gerrit 3.8 has17:53
clarkbI see the eddsa artifact in a bazel BUILD file for mina so I'm pretty sure we've got those deps for ed25519 but still not finding the release version17:55
tonybThat doc says: "(although for reading ed25519 keys one needs to add the EdDSA support artifacts)", how would I verify we have included that artifact?17:55
tonybThanks, you're a mind reader ;P17:56
fungimy bigger concern was over the "-----BEGIN OPENSSH PRIVATE KEY-----" vs "-----BEGIN <algorithm> PRIVATE KEY-----"17:56
clarkbfungi: yup and we are running this version:
clarkbinside gerrit/tools/nongoogle.bzl there is an SSHD_VERS var set that sets all the mina lib versions17:57
clarkbinside that same file eddsa is listed as a dep17:57
clarkbtonyb: ^17:57
clarkbso I think we're good. I'll proceed17:57
clarkbshould I rm the original files that were used to generate things?17:58
fungii would shred and then rm them17:59
fungiat least the privkey17:59
tonybI agree18:00
clarkbok done18:01
opendevreviewClark Boylan proposed opendev/system-config master: Rotate the new Gitea replication key into Gitea config
fungispecifically, is what i was wondering about18:03
fungisince mina-ssh may want pem format instead of the new openssh-specific format18:03
opendevreviewClark Boylan proposed opendev/system-config master: Switch Gerrit replication to using an ed25519 key
clarkbfungi: yes the link I gave above seems to say it accepts openssh formatted files?18:04
clarkbif you click the hyperlinked OpenSSH in the link it seems to point at the new format so I think that is correct18:05
fungican you quote the passage you're looking at? it's a big document and what i found was "Reading key files in PEM format (including encrypted ones) is supported by default for the standard keys and formats. Using additional non-standard special features requires that the Bouncy Castle supporting artifacts be available in the code's classpath."18:05
tonyb"OpenSSH file format support18:05
tonybThe code supports OpenSSH formatted files without any specific extra artifacts (although for reading ed25519 keys one needs to add the EdDSA support artifacts). "18:05
fungiopenssh isn't creating them in pem format lately, but its own RFC4716 format18:05
clarkbfungi: it deep links to the specific section but here is the quote: "The code supports OpenSSH formatted files without any specific extra artifacts (although for reading ed25519 keys one needs to add the EdDSA support artifacts)." The OpenSSH string there is a hyperlink to the openssh new private key format18:05
fungiaha, perfect. thanks18:06
clarkb is what that links to18:06
corvusfungi: remote: Fix repo state restore for zuul role tag override        18:06
fungithanks corvus! frickler ykarel ^18:08
clarkbnote we don't want to approve the gerrit change until after the gitea change has applied just to make sure that applies cleanly first18:09
clarkbalso reviewers please double check the host vars I added when you review the gerrit change18:09
clarkbI'm now going to look at gerrit 3.2 mina which I'm almost positive did not support the new formatted keys and we can hopefully use that to confirm mina added support in the interim18:09
tonybclarkb: Should we do the playbooks/roles/gerrit/templates/gerrit_ssh_config.j2 part of 902169 in a separate patch or perhaps use the RSA key and have a separate patch to switch in the new key18:10
clarkbhrm is the gerrit 3.2 version and it also reports support there. So less confident now18:10
tonybthat'd give us a chance to double check ssh with both keys before we "go live"18:10
clarkbtonyb: ya maybe first change switches over to using the rsa key (though note we have to restart gerrit to pick that up each time we change it) then a followup to do ed2551918:11
clarkbthe need for a restart is why I didn't bother splitting it up because I was thinking we should minimize those, but might give us more confidence18:11
clarkbfungi: also I can convert to PEM If we think we need to18:11
fungiclarkb: nah, it looks like they do intentionally support openssh key format, i just wasn't seeing the keywords i expected in their document18:12
clarkbok I'll shutdown the screen now then18:12
clarkbfungi: tonyb: let me know if you think we should split up the change to explicitly choose the rsa key first, do a restart, then update to use the ed25519 key and do another restart18:14
corvusi'm going to restart zuul-web to pick up the recent js changes18:14
tonybcorvus: okay18:16
clarkbI've just noticed that the gerrit replication plugin docs do say you should use the PEM format18:17
clarkbit is possible that is no longer necessary and we can risk it or I can just go ahead and convert the key now18:18
tonybclarkb: Yeah I think converting to PEM is probably safest18:19
tonybclarkb: WRT gerrit_ssh_config.j2 I'm happy to go with your risk assessment.18:20
clarkbtonyb: I think our "it doesn't work" fallback is to (re)move the .ssh/config file and restart gerrit again18:20
clarkbI'm somewhat inclined to try a single restart given that is straightforward18:21
clarkbheh now I wish I hadn't shredded and rm'ed the files18:21
clarkbwill the pubkey format change?18:22
clarkbI'm half inclined to generate new keys if so since I'll have to update the gitea change18:22
* clarkb does this; the key isn't used anywhere anyway so a new one is fine18:22
tonybcurse fungi and his detail focused questions!18:23
corvus#status log restarted zuul-web to pick up js changes18:23
opendevstatuscorvus: finished logging18:23
clarkbheh I prefer getting this stuff right before we put it in production18:24
tonybI'm sure18:24
fungiclarkb: yeah, generating a new key is easier than converting an existing one18:24
fungii'd recommend that where possible18:24
clarkbhrm I did -m PEM but the output file didn't change the header and footer text18:27
clarkbfungi: does the header and footer not change?18:30
* clarkb is testing locally18:30
clarkbwith rsa keys the headers change. With ed25519 they don't. Not sure if those headers actually matter for parsing within the client18:31
clarkb`file` distinguishes the pem vs openssh but only for rsa as well18:32
clarkbI wonder if ed25519 can only exist as new format key?18:33
fungioh, maybe...18:34
clarkbthe man page hints at this " Setting a format of “PEM” when generating or updating a supported private key type will cause the key to be stored in the legacy PEM private key format."18:35
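The header check discussed above can be sketched in Python. This is a minimal illustration, not part of ssh-keygen, MINA or Gerrit: the function name `private_key_container` and the label strings it returns are hypothetical, and it only looks at the first armor line of the file rather than parsing the key material.

```python
# Well-known private key armor headers mapped to a short label.
# The labels on the right are made up for this sketch.
KNOWN_HEADERS = {
    "-----BEGIN OPENSSH PRIVATE KEY-----": "openssh",
    "-----BEGIN RSA PRIVATE KEY-----": "pem-rsa",
    "-----BEGIN EC PRIVATE KEY-----": "pem-ec",
    "-----BEGIN DSA PRIVATE KEY-----": "pem-dsa",
    "-----BEGIN PRIVATE KEY-----": "pkcs8",
    "-----BEGIN ENCRYPTED PRIVATE KEY-----": "pkcs8-encrypted",
}


def private_key_container(text: str) -> str:
    """Return a label based on the first non-blank line of a private key file."""
    for line in text.splitlines():
        line = line.strip()
        if line:
            return KNOWN_HEADERS.get(line, "unknown")
    return "unknown"
```

Going by the local test described above, an ed25519 key written with `-m PEM` would still be labelled `openssh` by this check, while an RSA key written the same way would be labelled `pem-rsa`.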
clarkbI think our next step may be to test this with a held node18:35
clarkbthis == ed25519 and replication18:36
clarkbwhich is annoying but better than breaking in production when it doesn't need to18:36
clarkbin that case I won't bother replacing this with the new key. Instead I'll try to set up a test situation that can verify the ed25519 key works with MINA in modern gerrit18:37
clarkbor we can just use a large rsa key18:37
clarkbhere I thought using rsa would be more work :)18:37
tonybI think we should use a larger RSA key.  ed25519 is nice for other reasons but not essential for what we're trying to do18:38
tonybwe can use the new key rotation tasks later to switch to ed25519 after we've tested it18:39
fungi"OpenSSH doesn't currently support reading or writing Ed25519 keys in any format other than the OpenSSH native key format."18:40
fungifixed in openssh 9.618:41
clarkbfungi: that confirms my suspicion then. I'm happy to do a 4096 or 8192 bit rsa key instead and update the existing changes18:41
clarkband then we can still use .ssh/config to select the key we want I think18:41
clarkband have two rsa keys side by side18:42
clarkbthat might make it more difficult to determine if we're authenticating with the new key but I think we needed to sort that out with ed25519 anyway18:42
clarkbI'm thinking we should do 8192 to avoid needing to rotate again in the near future when germany decides 4096 isn't enough18:43
tonybclarkb: sounds good to me18:46
fungiconventional cryptanalytic wisdom is that any advances which make 4096-bit rsa easy enough to factor are likely to be advances which have simply broken rsa entirely, in which case 8192-bit rsa will be just as broken. 4096 is already larger than the latest openssh's default of 307218:47
clarkb3072 is also the minimum that gitea will accept18:48
clarkbI guess 4096 is fine as that is above the minimum18:48
clarkb`ssh-keygen -m PEM -f ./replication_id_rsa_B -t rsa -b 4096 -C ' 20231130'` is what I intend to use18:49
clarkband then I'll update the changes I just pushed to use that instead with appropriately renamed ansible vars18:49
fungithere's a law of diminishing returns at play, since the work factor involved in brute-forcing an rsa key is exponentially proportional to its length, so 4096 is already so much stronger than 3072 that it's unlikely to be brute-forced before the heat death of the universe18:49
clarkbbut I guess the difference between 2048 and 3072 is a problem18:50
fungi1024-bit is known to be unsafe, 2048 is probably fine in most cases, but openssh raised its default to 3072 recently just to be sure18:51
fungibecause there's a risk that in the distant future 2048-bit might be brute-forcible (3072 really shouldn't)18:51
fungirecommendations have generally been to replace 2048 keys by/not use them after ~2030 just to be safe18:52
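For a rough sense of the key-size comparison above, the sketch below evaluates the standard heuristic running-time formula for the general number field sieve, the best publicly known classical factoring method, whose cost grows sub-exponentially with key length. This is a back-of-the-envelope estimate under stated assumptions: it drops the o(1) term, so the results are only useful for comparing sizes against each other, not as real security levels. The function name `gnfs_work_bits` is made up for this illustration.

```python
import math


def gnfs_work_bits(modulus_bits: int) -> float:
    """Estimate log2 of the heuristic GNFS cost for an RSA modulus of this size.

    Uses exp((64/9)^(1/3) * (ln n)^(1/3) * (ln ln n)^(2/3)), o(1) term dropped.
    """
    ln_n = modulus_bits * math.log(2)  # ln(2^bits), approximating n by 2^bits
    c = (64 / 9) ** (1 / 3)
    exponent = c * ln_n ** (1 / 3) * math.log(ln_n) ** (2 / 3)
    return exponent / math.log(2)  # convert natural-log exponent to bits


for bits in (1024, 2048, 3072, 4096):
    print(f"{bits}-bit RSA: roughly 2^{gnfs_work_bits(bits):.0f} operations")
```

Under this estimate each step up in modulus size adds fewer bits of work than the previous one, which is one way to read the diminishing-returns point made in the discussion.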
tonybfungi: interesting timeline.18:53
fungispeaking of timelines, has a neat timeline of what's actually been publicly acknowledged as factored18:54
tonybclarkb: that command looks good to me, although I tend to avoid spaces in the comment18:54
funginobody has actually broken a 1024-bit rsa key and proven it publicly yet, though they're getting sort of close18:55
clarkbtonyb: ya I just noticed that the existing comment doesn't have spaces. I can remove the space in hostvars since it shouldn't actually affect the key material to do so18:56
clarkbthough I guess it is encoded into the private key itself18:57
* clarkb generates a new key18:57
clarkbI'm getting really good at generating keys18:57
tonybclarkb: #silverlining18:58
clarkbfungi: the reason gitea set their limit to 3072 is germany is now saying you should use at least that many bits apparently18:58
clarkbtoday rather than in 2030 but that may be a "start now so it's done in time" thing18:59
fungiyes, based on voodoo science, but then again politicians have no idea what any of this means18:59
opendevreviewMerged opendev/system-config master: Add inventory/LE records for and
opendevreviewMerged opendev/system-config master: Add inventory/LE records for mirror02.dfw.rax
fungimathematicians say 1024-bit rsa is still working but it could be factored in coming years, so using larger keys means that stuff you encrypt now won't be accessible to attackers years in the future once breaking 1024-bit rsa becomes relatively easy. security folks see the warning from mathematicians and say, well if the mathematicians say we should be using at least 2048 bit now to be on the19:03
fungisafe side, then let's go ahead and recommend something even stronger just in case. politicians then see what the security folks said, and increase it even more so they can look like they're doing something useful19:03
fungiit becomes more absurd when, instead of talking about long-term/archival encryption or the ability to trust old signatures, you look at it from the perspective of point-in-time authentication and encrypted communication streams19:06
fungibasically, the risk in this case is that someone records a packet capture of our ssh session and we transmit sensitive data over it, then many years in the future the surreptitious packet capture they've held onto all that time can finally be decrypted and they'll be able to see what we transmitted over the connection19:07
funginone of which is even sensitive now, much less in the distant future19:08
opendevreviewClark Boylan proposed opendev/system-config master: Rotate the new Gitea replication key into Gitea config
opendevreviewClark Boylan proposed opendev/system-config master: Switch Gerrit replication to a larger RSA key
clarkbok I think that should be mergeable now. Reviewers: please review both changes before approving the first and double check the contents in ansible host vars19:16
clarkbalso maybe wait for CI results to come back just to make sure I didn't do anything silly19:16
clarkbI am going to help my brother install a new drop in stove (it amazes me that drop in stoves were ever considered a good idea) at 2100 UTC so I'll be afk for a bit otherwise I'm around today to address feedback and/or monitor changes as they land19:24
clarkbmaybe goal should be to get gitea updated today then plan for a gerrit restart to pick up new config tomorrow?19:24
clarkbthe gitea test job logs should also show us successfully adding both keys without removing them (the remove task should be skipped)19:25
clarkbnevermind delivery says thirty minutes away. That is probably better as I can hopefully get that done while waiting for zuul19:30
corvusthey're nice if you keep your ovens in the wall!19:35
fungii keep my ovens in the ceiling so i can elevate my cooking19:37
corvusfungi: above your drop-in-floor-mounted range?19:39
fungiof course, it also doubles as a footwarmer in the winter19:40
clarkbI'm noting that we do sometimes run the gitea and gerrit jobs at the same time. I wonder if there is a way to make test gerrit replicate to test gitea without too much delay. The naive way to do it would be to have gitea run first which is not short then run gerrit which is also not short19:46
clarkbanyway not something we need to solve right now. Just an idea for something that would be neat to test19:46
clarkbI'm going to pop out now but will try to keep an eye on any feedback from zuul or reviewers while I avoid electrocution19:46
fungirubberize yourself19:47
fungiyep, that looks like it ought to do the trick19:49
corvusclarkb: since gerrit would need the gitea address, i think gerrit pausing after starting then running gitea is the most straightforward way but slow.  another idea would be to combine them into one multi-node job.  that might make sense for this particular pairing.20:14
Clark[m]corvus: ya maybe combining makes sense. Also before electrocution we must do limb removal carpentry work20:29
tonybChange 901628 failed keycloak and nodepool in the deploy pipeline, both with gateway timeouts doing the docker-compose pull.20:35
tonybI'm assuming that this is "okay", and won't block 902008 from running.20:37
Clark[m]tonyb the second change should run regardless and apply both20:55
Clark[m]Since they touch the same code iirc20:55
tonybokay.  I'll keep an eye on that20:56
fungimeeting up with some friends to do holiday things, so disappearing nowish for the evening, but will be around all tomorrow as usual21:32
corvusi'm continuing to see dockerhub errors locally too21:39
corvusthird attempt succeeded; so seems like partial outage21:41
tonybI don't understand the state of things after the 2 runs of the deploy pipeline after 901628 and 902008 merged.  901628 ran all the jobs (keycloak and nodepool failed because partial outage), importantly letsencrypt and service-mirror ran and included all 3 of the new mirror nodes (  Then 902008 ran and letsencrypt failed (possibly because there was no work to be done?)   23:10
tonybso none of the dependent jobs (including service-{mirror,keycloak,nodepool}) ran.23:11
tonybI think overall it's fine, as across the 2 runs the jobs we really needed (letsencrypt and service-mirror) ran, and the 3 nodes look okay after checking SSL, the process table and mounted filesystems23:16
Clark[m]tonyb: without being able to check that seems possible. The only thing is the LE job should succeed even when no work is to be done (we run it daily in periodic jobs for example)23:24
tonybokay.  I'll look some more when I get home.  the fact the LE job failed was what worried me.23:30
