Monday, 2024-04-22

opendevreviewAlberto Gonzalez proposed openstack/project-config master: Add new repository powertrain-build  https://review.opendev.org/c/openstack/project-config/+/916038 06:49
opendevreviewAlberto Gonzalez proposed openstack/project-config master: Add new repository powertrain-build  https://review.opendev.org/c/openstack/project-config/+/916038 06:57
opendevreviewDmitriy Rabotyagov proposed openstack/project-config master: Move vexxhost/ansible-role-frrouting to openstack namespace  https://review.opendev.org/c/openstack/project-config/+/910018 08:22
opendevreviewDmitriy Rabotyagov proposed openstack/project-config master: Move vexxhost/ansible-role-frrouting to openstack namespace  https://review.opendev.org/c/openstack/project-config/+/910018 08:22
opendevreviewMerged opendev/system-config master: gitea: move robots.txt to public directory  https://review.opendev.org/c/opendev/system-config/+/916420 12:19
frickler^^ verified new image deployed and running on all gitea instances, all serving the moved robots.txt file with updated timestamp12:49
fungifrickler: thanks! and no more error about serving that from a legacy path, i take it?12:58
fricklerfungi: yes, that message is gone now, too, thx for the reminder to check it :)13:10
fungiperfect!13:16
clarkbthank you!15:04
clarkbinfra-root: if you have time over the next few hours please review the project rename plan etherpad and related gerrit changes: https://etherpad.opendev.org/p/opendev-project-renames-20240422 (change links are in the etherpad)15:05
clarkbI need to copy the renames.yaml file to bridge but otherwise I think everything should be prepped unless you find concerns or issues15:05
clarkbwe will begin renaming in just under 5 hours. cc noonedeadpunk 15:09
clarkbok I've created the little staging dir on bridge and copied the rename yaml file into it15:32
clarkbthat was the last todo I had in terms of prep other than review15:33
frickleris it a known issue that the gitea links on the gerrit repo tags page are wrong? like https://opendev.org/openstack/sushy-oem-idrac/src/tag/refs/tags/5.0.0 instead of https://opendev.org/openstack/sushy-oem-idrac/src/tag/5.0.0 ?16:03
frickler(on https://review.opendev.org/admin/repos/openstack/sushy-oem-idrac,tags)16:03
clarkbno that is news to me. `tag = ${project}/src/tag/${tag}` is what we have in gerrit's config that sets that16:04
clarkbwe probably need a different ${tag} variable that doesn't include the refs/tags/ prefix16:05
clarkbhttps://gerrit-review.googlesource.com/Documentation/config-gerrit.html#gitweb indicates there isn't currently a different option there16:06
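The mismatch described above comes from substituting the full ref name into the `${project}/src/tag/${tag}` pattern. A minimal sketch of the prefix stripping that would produce the working link, assuming a hypothetical helper `gitea_tag_url` (this is an illustration only, not something Gerrit's gitweb configuration currently offers):

```python
# Hypothetical helper: illustrates the URL difference only.
# It is not part of Gerrit or Gitea.
TAG_PREFIX = "refs/tags/"


def gitea_tag_url(base: str, project: str, ref: str) -> str:
    """Build a Gitea tag URL, dropping a leading refs/tags/ if present."""
    tag = ref[len(TAG_PREFIX):] if ref.startswith(TAG_PREFIX) else ref
    return f"{base}/{project}/src/tag/{tag}"


url = gitea_tag_url(
    "https://opendev.org",
    "openstack/sushy-oem-idrac",
    "refs/tags/5.0.0",
)
print(url)
```

Passing either `refs/tags/5.0.0` or plain `5.0.0` yields the same link, which is the behavior a prefix-free `${tag}` variable would give.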
* noonedeadpunk around16:18
clarkbnoonedeadpunk: mostly just a heads up that we plan to do the rename later today and to double check things. I had to rebase your change to address merge conflicts16:20
noonedeadpunkyeah, thanks a lot for that16:20
fungirename will be happening in a little over 3.5 hours16:21
clarkbI just checked ssh (not from bridge but from my local network so maybe not the best test) and all hosts are up. I however cannot ssh into storyboard-dev01 with my new key because it is in the emergency file16:25
clarkbthe playbook does not honor the emergency list which means we'll rename things there anyway. fungi you're on the emergency file entry comment not sure if that is a problem for you16:25
fungii think it's fine, we've had it in there for 4 years already since the attachments work stalled on patches that were behind the upgrade to python 316:27
clarkback16:28
fungiif the playbook didn't respect the emergency file the last dozen times we ran it, then it's probably still fine16:28
fungialso it doesn't appear to have used or be using storyboard anyway, so i don't expect it's going to end up doing any update queries there16:29
fungithe ansible-role-frrouting project i mean16:29
clarkbeven better16:30
fungimaintenance plan lgtm. i added the urls to test and updated the patchset number in the submit command17:12
fungii'm also +2 on both the changes17:12
clarkbthanks!17:12
clarkbI added a #status notice message to step three in the etherpad18:13
clarkbfeel free to edit with more or less info18:13
clarkbI'm going to try and grab early lunch in a little bit. I'll stick servers in the emergency file before I do that (so that step will get done a little early)18:22
fungisounds good18:29
clarkbfungi: so I can't ssh to storyboard-dev01 from bridge due to ssh keys not matching18:36
clarkbfungi: I suspect that fixing that will be quicker than landing a change to remove storyboard-dev01 from the playbook18:36
clarkbas an alternative we could use a --limit on the ansible-playbook command18:36
clarkbfungi: any chance you can look at storyboard-dev01 and see if we can/should fix ssh from bridge to it and if not then we can fall back to the limit I think18:38
fungiclarkb: yeah, i'm trying18:45
clarkbok you must've just fixed it because I'm seeing authorized_keys file match between storyboard and storyboard-dev18:45
clarkbinterestingly it still fails18:45
fungithe "from" restriction was for the old bridges, yeah18:46
fungii updated that but it doesn't seem to have helped18:46
clarkbI think sshd_config is preventing it too18:47
clarkbI'm somewhat inclined to say maybe we use the --limit since ansible hasn't hopped onto this host in ever18:47
fungi/var/log/auth.log records it as refused, yeah18:47
clarkband then we remove the dev server from the rename playbook18:47
clarkbor we use a local checkout of system-config and update the playbook to remove the one play for storyboard-dev instead of using --limit18:48
clarkbbecause who knows what else may fail if we try to connect to this server with ansible18:48
fungithe reason we have been renaming there is that manage-projects still tries to update it too18:48
fungiso we'd need to remove it there first18:49
clarkbI'm having a hard time understanding how this ever worked with the rename playbook though18:49
clarkbalso I don't see where manage-projects runs on storyboard-dev18:49
clarkbthe manage-projects.yaml playbook runs against gitea and gerrit18:50
fungiokay, got it fixes for now18:51
fungier, fixed18:51
fungiif a project specified use-storyboard, manage-projects added it to storyboard and storyboard-dev when it didn't exist. maybe we've already pared that down?18:52
fungiand yeah, we had a similar ip address restriction in /etc/ssh/sshd_config that also needed updating18:52
fungifollowed by an ssh service restart18:53
clarkbfungi: looks like puppet runs jeepyb things but since storyboard-dev is in emergency and has been for some time puppet won't run there18:53
clarkbso ya it just never touches storyboard-dev I don't think18:53
fungiyeah, so probably fine to remove18:54
clarkbright but what do we want to do for this run18:54
clarkbbecause I don't think we can land a change to remove it from the playbook quickly enough18:54
fungiit should be fine anyway now that bridge can ssh again18:54
clarkbI think our options are let it run as is and see what breaks if anything, use --limit to exclude that server, or use a copy of the playbook in a checkout of the repo with the playbook edited to remove that play18:55
clarkbsounds like you think we should go ahead with option 1 I guess I'm good with that and we can change our approach if that fails18:55
clarkbfungi: oh actually I think it will fail because that repo is unlikely to be in storyboard-dev at all18:55
clarkbso the sql update won't have any rows to update? or maybe that isn't a failure with mysql18:56
clarkbya actually that should be a non error right? it will just update zero rows?18:56
fungiit's not a failure with sql update queries. will just match 0 rows18:56
fungiand therefore no-op18:56
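The point that an UPDATE matching no rows is a no-op and not an error can be checked with a small sketch. This uses Python's built-in `sqlite3` as a stand-in for MySQL, and the `projects` table and `rename_project` function are hypothetical, not the query the rename playbook actually runs:

```python
import sqlite3


def rename_project(conn: sqlite3.Connection, old: str, new: str) -> int:
    """Rename a project row and return how many rows were changed."""
    cur = conn.execute(
        "UPDATE projects SET name = ? WHERE name = ?", (new, old)
    )
    conn.commit()
    return cur.rowcount


# Hypothetical schema: the project being renamed is absent from this database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE projects (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO projects (name) VALUES ('openstack/example')")

updated = rename_project(
    conn,
    "vexxhost/ansible-role-frrouting",
    "openstack/ansible-role-frrouting",
)
print(updated)  # no exception is raised; zero rows matched
```

The statement completes normally and reports zero affected rows, so a rename run against a database that never held the project should leave it untouched.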
clarkbok then ya I think we can proceed with the playbook as is and address it if there is a problem18:56
clarkbin that case steps 2 and 2.5 are both done, but feel free to double check18:57
clarkband I'm going to eat a sandwich18:57
clarkbI put a clone of system-config in our tmpdir for the rename just in case we end up needing a place to edit that playbook19:08
clarkbI've started a screen and asked it to log into our working dir19:31
clarkbthe timestamp on the rename_repos.yaml playbook on bridge is quite a bit newer than I expected. However the content looks correct to me19:34
fungitimestamp is when you did the git clone, i think?19:36
clarkbfungi: I mean on the file in /home/zuul/.../etc19:38
clarkbnot the new clone I made19:38
clarkbbut ya maybe that aligns with zuul jobs updating the repo19:38
clarkbWith 10 minutes to go before our announced window I guess now is a good time to double check that we're still happy to proceed? and if so any updates to make to the proposed #status notice message?19:49
fungieverything lgtm including the status notice19:52
clarkbgreat we'll get started in a few minutes then19:53
fungithough if you wanted you could also append https://lists.opendev.org/archives/list/service-announce@lists.opendev.org/message/KP6NCOKJEYRGFD5FS26CZPVLEKFSY2ZO/ for reference19:53
fungii always like doing that because it serves as a subtle reminder that you can get advance warning of this stuff if you subscribe to the service-announce ml19:54
clarkbproblem is those links are so long now :)19:56
clarkbbut still within the irc message limit so ya I'll do that.19:56
clarkbok it is 2000 UTC. I'm going to send the notice now20:00
clarkb#status notice Gerrit will be offline for a short time while we rename a project repo. https://lists.opendev.org/archives/list/service-announce@lists.opendev.org/message/KP6NCOKJEYRGFD5FS26CZPVLEKFSY2ZO/ for more details20:00
opendevstatusclarkb: sending notice20:00
fungithanks!20:00
-opendevstatus- NOTICE: Gerrit will be offline for a short time while we rename a project repo. https://lists.opendev.org/archives/list/service-announce@lists.opendev.org/message/KP6NCOKJEYRGFD5FS26CZPVLEKFSY2ZO/ for more details20:00
clarkbfungi: I put the command in screen commented out then ls'd the file paths just to make sure all is well. I guess I go ahead and run the command as soon as the bot is done sending the notice?20:02
opendevstatusclarkb: finished sending notice20:03
clarkbfungi: you good with me running the playbook now?20:03
clarkboops looks like you had it in copy mode. sorry if I dropped it out of that20:04
fungithis is the screen session on bridge?20:04
clarkbyes20:04
funginot seeing the commented command20:04
clarkbits a few lines up20:04
clarkbI did two ls's and a cat after it20:04
fungiaha the ansible-playbook command20:05
fungiyep, lgtm20:05
clarkbyes, that is what will actually do the rename20:05
clarkbok I'm proceeding to invoke that command now20:05
fungitonyb: ^ you had mentioned being around for this, so fyi20:06
fungilookin' good so far20:07
clarkbok playbook is done with no errors20:10
fungican confirm, yes20:12
clarkbeverything I've checked so far lgtm20:12
fungii see a working redirect in gitea20:12
clarkbthose three links you have for example as well as general gerrit web ui20:12
fungigerrit project link has content, yes20:12
clarkbI'm going to look at gerrit queues next then we can decide if we want to force merge the project-config change with or without hosts in the emergency file20:13
fungiold project link in gerrit is a 404 too20:13
fungiand changes have been moved according to the query url20:13
fungiquery for the old url turns up an empty list20:13
clarkbshow queue doesn't show any replication tasks. I think one of the reasons we were particularly concerned about replication and letting jobs run and then noop is that gerrit would replicate all the things on startup then anything that merged would be behind that list20:14
clarkbsince gerrit doesn't do that anymore and I've just confirmed that there is nothing in the replication queue I'm reasonably happy to see what happens if we force merge with hosts outside of the emergency file20:15
clarkbmaybe we wait for the index queue to complete first though?20:15
fungiyeah, i think we can20:16
clarkbwe're down to 1939 indexing tasks and had 2173 when I first checked20:16
fungithe only real risk is if we have in-flight project changes that could merge in the interim and rerun manage-projects with the wrong state, i think?20:17
clarkbfungi: yes20:17
fungioh, though it should result in skipping the servers in emergency too20:17
clarkbright the risk begins when we remove the servers from the emergency file if we want to land the change and have it run normally to noop20:17
clarkbI have not removed any servers from the emergency file yet so we're good for now20:18
fungiand the waiting change should also be a no-op since we made the updates out of band to reflect its intended eventual state20:18
clarkbexactly20:18
clarkbI think what we can do is wait for gerrit to settle the index queue (just so there isn't a bunch of stuff fighting for cpu time), then remove hosts from the emergency file and force merge the project-config change. Then check that it replicates and applies normally20:18
clarkbnow 1741 indexing tasks. It seems to be moving fairly quickly20:19
fungiit's way faster than it used to be20:19
clarkbya they chunk it up by estimated work now rather than by naive project order20:20
clarkband they can split larger projects into multiple tasks so each task has a ceiling of effort (roughly)20:20
clarkbresults in much better scheduling of effort20:20
clarkbfungi: are you good with my proposed plan or do you think we should force merge with things still in emergency then check replication and then either reenqueue the manage-project jobs or just wait for the daily run later today?20:21
clarkbbut ya thinking back on it the main concern was gerrit replication could lag because gerrit would replicate everything on startup. It doesn't do that anymore so I don't think we need to go through that crazy dance20:22
fungii agree we should be able to simplify this20:27
fungii suppose if there are multiple rename changes then we need to keep the emergency file set for all but the last one yeah?20:28
clarkbcool in that case we wait for a few more minutes for reindexing to end and then plan the force merge20:28
clarkbfungi: ++20:28
clarkbdown to 853 now20:28
clarkbdown to 10420:33
clarkbfungi: do you want to escalate privs to force merge or should I?20:33
clarkbyou reviewed the change and I did a rebase of it. Might be good to have you do the merging?20:33
clarkband I'll let you know when the emergency file is cleaned up and that can happen20:33
fungican do20:34
fungitechnically the pad is missing the gerrit set-members --add command20:35
clarkbits on line 4320:35
fungioh, i missed it because of the optional step. never mind!20:36
clarkbit also used the wrong hostname for the vote setting command. I've just fixed that20:36
clarkbreindexing is done. I'll remove hosts from the emergency file now20:38
clarkbfungi: that is done. I did a cat of the file in the screen to confirm it too20:39
clarkbI think you can merge the change when you are ready20:39
opendevreviewMerged openstack/project-config master: Move vexxhost/ansible-role-frrouting to openstack namespace  https://review.opendev.org/c/openstack/project-config/+/910018 20:40
fungiclarkb: tonyb: ^20:40
clarkbthanks replication happened almost instantly and the change shows up at https://opendev.org/openstack/project-config/commits/branch/master 20:41
clarkbthe manage-projects job is running now and should noop, once we've confirmed that we can land the recording change and I think we're done20:41
clarkbthe playbook is done and it ran quickly enough that I suspect it did noop20:44
clarkbchecking gitea and gerrit links again20:44
fungiyeah, looks good to me still20:46
clarkbya the gitea redirect is still there. In gerrit I get an error trying to open the vexxhost project name and if I search by the vexxhost project name I get no changes and that name doesn't show in the autocomplete list20:46
fungiso it didn't recreate the old project, which is the biggest concern20:46
clarkbcorrect20:46
fungior would be the biggest concern if it had, i mean20:46
clarkbmerging https://review.opendev.org/c/opendev/project-config/+/916323 is the last thing in the todo list20:47
clarkbwhich can happen normally 20:47
clarkbI also made a note of what we did at the end of the etherpad20:47
clarkbI have self approved 91632320:48
fungioh, i also approved it20:48
opendevreviewMerged opendev/project-config master: Add record for planned rename on April 22, 2024  https://review.opendev.org/c/opendev/project-config/+/916323 20:48
clarkbfungi: should I go ahead and exit the screen as well?20:48
fungisure20:48
clarkbthe log is in our working dir so we've got that captured if necessary20:48
clarkbI'm going to try and take a break for a bit as I ended up eating a sandwich in about 5 minutes and then getting back to my office for lunch. But I think this can be considered done and seems to have gone well. When I get back I'll put our weekly meeting agenda together. Now is a good time to add items if you have them20:49
fungiyeah, gonna warm up some leftovers for dinner20:54
clarkblooks like fungi added an agenda item. I added a couple and pruned some others. I'll get this sent out at about 23:00 UTC.22:14
fungicool, thanks!22:18
clarkbfungi: if you have time I did end up pushing two changes for mailman web hosting to add a robots.txt and also add the UA filtering22:19
clarkbI think we can probably land those changes alongside the zuul changes that do similar if/when you are happy with them22:19
clarkbthere is also the glean change for python3.12 support https://review.opendev.org/c/opendev/glean/+/915907 22:19
fungii thought i reviewed them but will double-check22:20
opendevreviewMerged opendev/glean master: Use importlib when pkg_resources isn't available  https://review.opendev.org/c/opendev/glean/+/915907 23:20
