fungi | the infra-root address got the same message too | 00:45 |
---|---|---|
fungi | i really don't know, i generally defer such questions to i18n team folks (ian and seongsoo mainly) | 00:46 |
clarkb | bit of a slow start this morning, but local system is updated which is nice. Once I've caught up on messages I'm going to look into updating gerrit vars for the ssh host keys so that we can move to land that change. | 15:32 |
clarkb | I also realized that synchronizing the gerrit data might require temporarily adding personal ssh keys to the gerrit port 22 ssh user account so that we can sync with the correct perms and take advantage of rsync speedups when updating existing data | 15:32 |
clarkb | I'll update the etherpad with more details on the dns thing and ^ today too | 15:33 |
clarkb | hoping we can actually sync data today/tomorrow and test the server too | 15:33 |
fungi | wouldn't we just generate a temporary ssh key on the new server and add it to authorized_keys on the old server? | 15:37 |
clarkb | ya that would also work. I was thinking use ssh -A and then it's slightly less effort? | 15:39 |
clarkb | but we need to do something to make the sync possible from review_site to review_site. Another option would be to sync from review_site to gerrit2/tmp using personal keys and then mv/cp over to review_site which loses some rsync benefits | 15:39 |
fungi | for similar work on the mailman 3 migration i generated a key on the new server, trusted it on the old server, then ran rsync on the new server to pull files from the old, since i was going to need to do it multiple times | 15:42 |
clarkb | infra-root secret vars should be in place for https://review.opendev.org/c/opendev/system-config/+/947044 now. I updated the host vars for both review02 and review03 to use review02's current keys. Please double check that. I'm pretty sure I didn't copy the wrong new data from review03 but worth ensuring the data I put in place matches review02's content as part of review | 15:59 |
clarkb | ok got a number of updates into https://etherpad.opendev.org/p/i_vt63v18c3RKX2VyCs3 there are a number of todos to prep changes that I'll start on now. Then it's a matter of sorting out how to synchronize stuff. Feedback very much welcome | 16:08 |
opendevreview | Clark Boylan proposed opendev/zone-opendev.org master: Reduce the review.o.o record TTL https://review.opendev.org/c/opendev/zone-opendev.org/+/947136 | 16:12 |
opendevreview | Clark Boylan proposed opendev/zone-opendev.org master: Switch review.o.o to review03 https://review.opendev.org/c/opendev/zone-opendev.org/+/947137 | 16:12 |
opendevreview | Clark Boylan proposed opendev/system-config master: Unstage review03.opendev.org https://review.opendev.org/c/opendev/system-config/+/947139 | 16:18 |
clarkb | fungi: re mm3 syncs is there a reason to prefer the orientation you used of having mm2 trust a key on mm3 then mm3 runs the sync vs having mm2 be trusted by mm3 and having mm2 run the sync? | 16:20 |
clarkb | I'm just trying to mental model if there is a benefit to doing things in a specific way that I haven't considered | 16:20 |
fungi | mainly trusting the replacement more than the server that's being replaced | 16:21 |
fungi | since the replacement will presumably live on long after the replaced server has been archived and deleted | 16:22 |
clarkb | got it. Not a case of making rsync more efficient/faster but one of appropriate trust levels | 16:23 |
clarkb | one thing that is different with gerrit compared to mailman is gerrit uses the ssh key in its user for replication and it relies on the default key names for that. But as long as we use a non default for rsync and pass -i keyname we should be fine (you can do that with rsync right?) | 16:25 |
clarkb | looks like you have to construct the ssh command for rsync to use or set up .ssh/config to do that | 16:25 |
clarkb | I guess another option is don't generate another key but trust the replication key | 16:26 |
fungi | yeah, honestly i'd just reuse the replication key and be done with it | 16:26 |
fungi | then there's no additional cleanup to worry about, the only addition is to authorized_keys on the old server which is going to get deleted anyway | 16:27 |
clarkb | makes sense. I added that idea to the etherpad and it seems like the one we'll likely go with | 16:27 |
clarkb | ok the etherpad is updated with notes for the db dump, copy, restore; git content rsync; and gerrit index rsync | 17:11 |
clarkb | before I proceed it would be good to get feedback on those to ensure I'm not going to do anything stupid as this has the potential to impact the production server | 17:12 |
clarkb | on further inspection the replication key is actually a non default key and we set some .ssh/config to force that key to be used with giteas. I think the easiest solution here is to generate a new ed25519 key on review03 and add its pubkey to authorized_keys on review02. That way we're not impacting any existing keys but we get a default key so rsync doesn't need complicated | 17:18 |
clarkb | configuration. Then when all this is done we can remove that ed25519 key | 17:18 |
clarkb | I'll update the etherpad to state ^ is my plan and let others look that over with the more complete context of the etherpad | 17:18 |
clarkb | https://etherpad.opendev.org/p/i_vt63v18c3RKX2VyCs3 is the etherpad | 17:20 |
corvus | the default gerrit2 ssh key isn't used for anything? like gitea replication? or any gerrit hook scripts? | 17:33 |
clarkb | corvus: it isn't used for gitea replication anymore. It was used but then gitea complained it didn't have enough bits so we added a new flip flop key system to be able to switch keys out in gitea and then .ssh/config selects it | 17:35 |
clarkb | I haven't been able to find anything that would use the old id_rsa key (jeepyb seems to use a key specific to it) | 17:35 |
clarkb | at this point i have a new ed25519 key on 03 that is trusted by 02. | 17:35 |
clarkb | This seems like it should be safe and then easily cleaned up when the host migration is complete | 17:35 |
corvus | sgtm then; that was the only thing that jumped out at me. i left a few supporting comments. | 17:36 |
clarkb | thanks! | 17:36 |
clarkb | I think fungi and I will start on these synchronization tasks around 1800 UTC so that we can get the test node up and running and confirm we get a working gerrit. Then we'll replay what we did in a week and update DNS to make it official | 17:39 |
corvus | clarkb: after your last comment i now fully understand "on further inspection the replication key is actually a non default key" :) | 17:41 |
corvus | s/last/earlier/ | 17:41 |
corvus | (for some reason i was reading that as "key type" or something similar and didn't understand the relevance) | 17:42 |
clarkb | and still no concerns? | 17:43 |
opendevreview | Tim Burke proposed opendev/git-review master: Fix TypeError when reporting old git version https://review.opendev.org/c/opendev/git-review/+/947145 | 18:16 |
corvus | correct | 18:17 |
clarkb | fwiw we're doing the index sync first and it is not very fast :/ | 19:13 |
clarkb | I have a family lunch in a bit and the goal is to start up the git repo sync when the index sync finishes, check that all looks good before popping out. If things look not good we can ^C and stop the sync on review03 and then go to lunch | 19:14 |
clarkb | but that way it can run in the background while we're all busy with other things anyway. Then probably tomorrow morning redo syncs to see how much of a speedup we get and then test the actual service on the new server | 19:14 |
clarkb | the first index sync completed and took 55 minutes and 6 seconds. The git repo sync is running now and will probably take less time as there is less total data | 19:39 |
clarkb | I think the way my day is going my plan is to run another pass of synchronization tomorrow morning and get an idea of the speedups we can expect from subsequent syncs then some point tomorrow turn on gerrit on review03 | 19:39 |
clarkb | the current git sync is running in a root screen on review03 if anyone notices problems with that it should be safe to attach and ^C and we'll just do a short sync that needs catching up later | 19:40 |
clarkb | and the git sync finished | 19:55 |
clarkb | the screen is still up on review03 but I'm going to leave things as is for the afternoon so that I can eat lunch and work on other stuff. Will pick this back up tomorrow | 19:59 |
clarkb | When I get back from lunch I'll start putting a meeting agenda together. Let me know if there are any edits you'd like to see and i'll do my best to get to them | 19:59 |
clarkb | I've updated the meeting agenda. Anything else we need to add? | 22:09 |
clarkb | last call on meeting agenda updates. I'll probably send it out not long after 23:00 UTC | 22:42 |
opendevreview | James E. Blair proposed opendev/system-config master: Mirror node:22 https://review.opendev.org/c/opendev/system-config/+/947160 | 22:44 |
corvus | i'm restarting zuul sched/web/launcher to pick up the change that will allow us to set image import timeouts | 22:48 |
clarkb | corvus: that will also pick up the gerrit event stream update? | 22:49 |
clarkb | (something to keep in mind I guess if we notice any oddities with that) | 22:49 |
corvus | sure will :) | 22:49 |
clarkb | ok I tried to write a good number of test cases and it should largely noop for opendev since we won't set the timeout | 22:50 |
clarkb | but ... something to keep an eye on | 22:50 |
corvus | okay, we should be on the new code now as the second batch is starting up | 22:59 |
opendevreview | James E. Blair proposed opendev/zuul-providers master: Add import timeout for rax classic https://review.opendev.org/c/opendev/zuul-providers/+/947163 | 23:00 |
corvus | and that should exercise both things :) | 23:00 |
corvus | it got an instant +1 so that's good. | 23:01 |
clarkb | +2 from me | 23:01 |
corvus | James E. Blair: Uploaded patch set 1. (2025-04-14 23:00:18+0000) | 23:01 |
corvus | Zuul: Patch Set 1: Verified+1 (2025-04-14 23:00:25+0000) | 23:01 |
corvus | that's < 10 seconds, so i think the change is in effect :) | 23:02 |
clarkb | ++ | 23:02 |
opendevreview | Merged opendev/zuul-providers master: Add import timeout for rax classic https://review.opendev.org/c/opendev/zuul-providers/+/947163 | 23:02 |
corvus | this is so fast :) | 23:02 |
clarkb | zoom zoom | 23:02 |
clarkb | would be good to exercise it with a change that runs jobs too | 23:02 |
corvus | Last reconfigured: | 23:03 |
corvus | Mon, Apr 14, 2025 11:02 PM | 23:03 |
corvus | that change is in production :) | 23:03 |
clarkb | let me send the meeting agenda then I can look to see if I have any changes. | 23:03 |
clarkb | hrm nothing easily convenient to update. I guess I can push something throwaway | 23:06 |
opendevreview | Clark Boylan proposed opendev/system-config master: DNM rebuild gitea https://review.opendev.org/c/opendev/system-config/+/947165 | 23:08 |
clarkb | that should be a sufficiently interesting change to exercise things | 23:08 |
clarkb | and it queued up and has node requests which implies the merge happened? So ya seems like things are streaming in | 23:08 |
clarkb | and jobs are running | 23:13 |
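
The sync approach discussed in the log (the new server holds a key that the old server trusts, and rsync runs on the new server to pull from the old one) could be sketched roughly as below. This is only an illustration, not the commands that were actually run: the function name `build_pull_command`, the hostname, the `gerrit2` remote user, the key path, and the directory paths are all assumptions chosen for the example. The only real interfaces it relies on are rsync's `-a` and `-e` options and ssh's `-i` option. When the key is a default one (as with the new ed25519 key in the discussion), the `-e` part can be left out, which is the simplification clarkb was after.

```python
import shlex


def build_pull_command(remote_host, remote_path, local_path,
                       remote_user="gerrit2", identity_file=None):
    """Return an rsync argv list that pulls remote_path into local_path.

    All argument values are caller-supplied; the "gerrit2" default is an
    assumption for illustration, not a confirmed account name.
    """
    # -a (archive) preserves permissions, ownership, and timestamps.
    cmd = ["rsync", "-a"]
    if identity_file is not None:
        # A non-default key has to be passed through the ssh command that
        # rsync invokes; with a default key this step is unnecessary.
        cmd += ["-e", "ssh -i " + shlex.quote(identity_file)]
    cmd.append(f"{remote_user}@{remote_host}:{remote_path}")
    cmd.append(local_path)
    return cmd


if __name__ == "__main__":
    # Hypothetical host and paths; run on the new server to pull from the old.
    argv = build_pull_command(
        "old-server.example.org",
        "/path/to/review_site/git/",
        "/path/to/review_site/git/",
    )
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(argv))
```

Because rsync only transfers what has changed, rerunning the same command later should take less time than the first pass, which matches the plan in the log to repeat the sync and measure the speedup.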