*** rlandy is now known as rlandy|out | 00:15 | |
opendevreview | Neil Hanlon proposed openstack/diskimage-builder master: Add new container element - Rocky Linux https://review.opendev.org/c/openstack/diskimage-builder/+/825957 | 00:19 |
---|---|---|
opendevreview | Harald Jensås proposed openstack/diskimage-builder master: dhcp-all-interfaces: opt let NetworkManager doit. https://review.opendev.org/c/openstack/diskimage-builder/+/825983 | 00:46 |
opendevreview | Neil Hanlon proposed openstack/diskimage-builder master: Add new container element - Rocky Linux https://review.opendev.org/c/openstack/diskimage-builder/+/825957 | 01:27 |
opendevreview | Ian Wienand proposed opendev/grafyaml master: Generate and use UID for acessing dashboards https://review.opendev.org/c/opendev/grafyaml/+/825990 | 01:47 |
opendevreview | Ian Wienand proposed opendev/system-config master: grafana: update to oss latest release https://review.opendev.org/c/opendev/system-config/+/825410 | 01:52 |
fzzf[m] | <clarkb> "fzzf: you need to ensure you..." <- Hi, is there any way to only monitor the events of the manila project? I have added a lot of projects in the zuul tenant configuration. | 02:03 |
fzzf[m] | I use zuul as CI, but after submitting to manila, CI doesn't seem to be triggered. Where should I check? | 02:05 |
*** melwitt is now known as Guest344 | 02:26 | |
*** chkumar|rover is now known as chandankumar | 03:03 | |
*** bhagyashris is now known as bhagyashris|ruck | 03:04 | |
opendevreview | Ian Wienand proposed openstack/diskimage-builder master: Switch 9-stream testing to use opendev mirrors https://review.opendev.org/c/openstack/diskimage-builder/+/821651 | 04:05 |
opendevreview | Ian Wienand proposed openstack/diskimage-builder master: Add debian-bullseye-arm64 build test https://review.opendev.org/c/openstack/diskimage-builder/+/821652 | 04:05 |
opendevreview | Ian Wienand proposed openstack/diskimage-builder master: Add 9-stream ARM64 testing https://review.opendev.org/c/openstack/diskimage-builder/+/821653 | 04:05 |
opendevreview | Ian Wienand proposed openstack/diskimage-builder master: debian-minimal: remove old testing targets https://review.opendev.org/c/openstack/diskimage-builder/+/821654 | 04:05 |
opendevreview | Ian Wienand proposed openstack/diskimage-builder master: Switch 9-stream testing to use opendev mirrors https://review.opendev.org/c/openstack/diskimage-builder/+/821651 | 05:26 |
opendevreview | Ian Wienand proposed openstack/diskimage-builder master: Add debian-bullseye-arm64 build test https://review.opendev.org/c/openstack/diskimage-builder/+/821652 | 05:26 |
opendevreview | Ian Wienand proposed openstack/diskimage-builder master: Add 9-stream ARM64 testing https://review.opendev.org/c/openstack/diskimage-builder/+/821653 | 05:26 |
opendevreview | Ian Wienand proposed openstack/diskimage-builder master: debian-minimal: remove old testing targets https://review.opendev.org/c/openstack/diskimage-builder/+/821654 | 05:26 |
*** chandankumar is now known as raukadah | 05:54 | |
opendevreview | Dr. Jens Harbott proposed opendev/system-config master: Add docs for restoring an etherpad https://review.opendev.org/c/opendev/system-config/+/826017 | 06:35 |
frickler | clarkb: ^^ I adapted that from a comment from you that I found in my irc logs, please doublecheck | 06:53 |
*** bhagyashris is now known as bhagyashris|ruck | 07:05 | |
*** ysandeep is now known as ysandeep|afk | 07:07 | |
*** ysandeep|afk is now known as ysandeep | 07:45 | |
opendevreview | Michal Nasiadka proposed openstack/diskimage-builder master: Add rocky-container element https://review.opendev.org/c/openstack/diskimage-builder/+/826027 | 08:33 |
*** jpena|off is now known as jpena | 08:36 | |
*** ysandeep is now known as ysandeep|brb | 08:37 | |
*** ysandeep|brb is now known as ysandeep | 08:44 | |
*** odyssey4me is now known as Guest380 | 09:43 | |
*** bhagyashris_ is now known as bhagyashris|ruck | 09:52 | |
*** rlandy|out is now known as rlandy|ruck | 11:14 | |
*** dviroel|out is now known as dviroel | 11:22 | |
rlandy|ruck | hello - anyone with merge rights on project-config, can you take a look at https://review.opendev.org/c/openstack/project-config/+/825435? | 11:23 |
*** marios_ is now known as marios | 11:37 | |
opendevreview | Eduardo Santos proposed openstack/diskimage-builder master: Fix openSUSE images and bump them to 15.3 https://review.opendev.org/c/openstack/diskimage-builder/+/825347 | 11:57 |
*** ysandeep is now known as ysandeep|afk | 12:07 | |
rlandy|ruck | fungi: thanks for the review of https://review.opendev.org/c/openstack/project-config/+/825435 | 12:46 |
fungi | sure thing | 12:47 |
*** ysandeep|afk is now known as ysandeep | 13:13 | |
*** amoralej is now known as amoralej|lunch | 13:18 | |
*** amoralej|lunch is now known as amoralej | 14:08 | |
*** rcastillo|out is now known as rcastillo|rover | 14:09 | |
opendevreview | Neil Hanlon proposed openstack/diskimage-builder master: Add new container element - Rocky Linux https://review.opendev.org/c/openstack/diskimage-builder/+/825957 | 14:27 |
*** bhagyashris is now known as bhagyashris|ruck | 14:35 | |
opendevreview | Neil Hanlon proposed openstack/diskimage-builder master: Add new container element - Rocky Linux https://review.opendev.org/c/openstack/diskimage-builder/+/825957 | 14:57 |
*** dviroel is now known as dviroel|lunch | 15:01 | |
*** ysandeep is now known as ysandeep|out | 16:05 | |
*** dviroel|lunch is now known as dviroel | 16:14 | |
clarkb | frickler: lgtm thanks! | 16:16 |
clarkb | anyone know if we have a change to flip the gerrit image to 3.4 in system-config yet? If not I can get that up. I'll also work on going over the etherpad, adding a reboot step and trying to combine the renames and upgrades a bit more | 16:17 |
*** Guest344 is now known as melwitt | 16:18 | |
clarkb | infra-root I'm working on updating the etherpad plan for today's gerrit work and one slightly awkward thing is landing the changes to reflect the new realities. For manage projects we land the project-config updates after we rename. ianw has proposed in his upgrade doc that we land the 3.4 gerrit image update change first since we don't auto restart/upgrade gerrit. | 16:29 |
clarkb | The problem is we'll have gerrit in the emergency file or we need to take the time in the middle to ensure that project-config runs successfully before proceeding with the upgrade. The doc I've got punts everything to the end but if we want to pause in the middle which may take some time we can do that instead | 16:30 |
frickler | clarkb: hmm, is it possible that the etherpad revert is no longer working? wget doesn't give any output, trying with curl I see: {"code":3,"message":"no such function","data":null} and the pad seems unchanged | 16:32 |
frickler | trying to revert to https://etherpad.opendev.org/p/kolla-ansible-letsencrypt-https/timeslider#3594 | 16:32 |
clarkb | frickler: oh I think your url may need to specify the api version | 16:32 |
clarkb | frickler: it is likely that v1 doesn't have that function but a newer version does | 16:33 |
clarkb | frickler: https://etherpad.org/doc/v1.8.4/#index_restorerevision_padid_rev ya api 1.2.11 adds that function. I think you can replace 1 in the url with 1.2.11 | 16:33 |
*** amoralej is now known as amoralej|off | 16:36 | |
frickler | clarkb: yes, that did the trick, thx. I'll update the docs patch with it | 16:37 |
opendevreview | Dr. Jens Harbott proposed opendev/system-config master: Add docs for restoring an etherpad https://review.opendev.org/c/opendev/system-config/+/826017 | 16:39 |
opendevreview | Merged openstack/project-config master: Adding nested-virt pools for centos-9-stream https://review.opendev.org/c/openstack/project-config/+/825435 | 16:55 |
clarkb | heads up Gerrit is asking if anyone uses/relies on the file is reviewed functionality in the web UI. Sounds like the number of queries that makes can make gerrit quite slow for some changes. I'm letting them know we didn't migrate the DB when we moved servers last and no one complained. | 16:57 |
clarkb | and that removing it would simplify our CI as we now have to check it works due to previous issues with the mariadb jdbc connector | 16:57 |
*** rlandy is now known as rlandy|ruck | 17:11 | |
frickler | clarkb: oh, so far I think I only heard about the marks getting lost during an upgrade, not about removing the feature completely. it sure is helpful for me sometimes, a) by showing which files were unchanged from a previous PS and reviewed there, b) allowing for easier restart if I had to interrupt a review halfway through | 17:12 |
noonedeadpunk | hey! We have some issue while trying to drop centos-8 jobs - might be you can suggest some solution.... So our Xena patch https://review.opendev.org/c/openstack/openstack-ansible/+/824567 fails on W having jobs and V https://review.opendev.org/c/openstack/openstack-ansible/+/824570 failing because of X | 17:18 |
noonedeadpunk | I'm not sure I understand what's really wrong with them | 17:18 |
noonedeadpunk | ah, we likely get job mentioned elsewhere.... | 17:20 |
noonedeadpunk | I can't find where though | 17:21 |
noonedeadpunk | also a bit weird that it references cross-branch | 17:22 |
clarkb | frickler: I can followup with those pieces of info if you like | 17:23 |
clarkb | noonedeadpunk: usually when that happens it implies there is a stray branch: matcher somewhere | 17:27 |
clarkb | noonedeadpunk: I see that your ubuntu bionic jobs have bad branch matchers for example. But I haven't found any for centos 8 yet | 17:28 |
noonedeadpunk | you mean these experimental ones? https://opendev.org/openstack/openstack-ansible/src/branch/stable/xena/zuul.d/jobs.yaml#L237-L261 | 17:31 |
* noonedeadpunk wishes codesearch searching on specific branches | 17:32 | |
clarkb | noonedeadpunk: yes every one of your stable branches defines that job as applying to devel and master which means when the job runs on devel and master it is going to get config from every stable branch | 17:33 |
clarkb | which is almost definitely not what you want and that causes problems like the one you are seeing now with the other jobs | 17:33 |
fungi | yeah, branch matchers in a multi-branch repository is almost never what you want. either use branch matchers in another single-branch repository, or only put the configuration in the branches where you want it to apply | 17:38 |
clarkb | I'm still not sure why the specific complaints above are happening though | 17:39 |
*** jpena is now known as jpena|off | 17:40 | |
clarkb | fungi: if you get a chance can you look over the gerrit downtime etherpads and if possible form an opinion on whether or not we should do the renames first and complete that with successful runs on project-manage or if we should try and combine them a bit more and do the project-manage and merging post upgrade | 17:40 |
clarkb | I think that is the biggest open question for me today on the gerrit stuff | 17:40 |
noonedeadpunk | we can easily drop this job from every branch if it's the one that's causing issues. I believe that, when implemented, the branches there were meant to refer to this extra defined required-project, but I doubt that this setup ever worked or was used | 17:43 |
jrosser | noonedeadpunk: https://paste.opendev.org/show/812330/ | 17:44 |
jrosser | on W we use the centos-8 job but only define the centos-8-stream one | 17:45 |
noonedeadpunk | we've dropped that with https://review.opendev.org/c/openstack/openstack-ansible/+/824569 ? | 17:45 |
clarkb | jrosser: noonedeadpunk ya I am beginning to suspect that creating the variant in the project-template causes zuul to do lookup resolution for that job and it finds it in xena and uses that definition | 17:45 |
clarkb | if you were to leave off the voting: false then zuul should say the job doesn't exist or just ignore it I think. But since you create a variant in the template itself the job is resolved | 17:46 |
noonedeadpunk | doh | 17:46 |
clarkb | and once you remove it from xena it can no longer resolve that job | 17:46 |
jrosser | i think we have tripped over this before | 17:47 |
jrosser | it would be nice to be able to choose that jobs were not visible across branches and just failed hard straight away | 17:48 |
clarkb | I think that is the default behavior for most of the zuul config. But creating a variant is like creating a new job | 17:48 |
clarkb | the implicitness of the variant is what is confusing | 17:48 |
jrosser | ahhh ok | 17:48 |
clarkb | when you say voting: false there you create an entire new job config for wallaby | 17:48 |
clarkb | which is different than saying just run this job. Because in the just run this job case it would fail since the job doesn't exist | 17:49 |
clarkb | in the make me a new job case it does its best to make the job and then it becomes runnable | 17:49 |
clarkb | That said maybe the variants shouldn't be so greedy across branches | 17:49 |
clarkb | corvus: ^ | 17:49 |
noonedeadpunk | clarkb: thanks for the help and your time! | 17:54 |
*** prometheanfire is now known as Guest414 | 17:54 | |
clarkb | you're welcome | 18:01 |
*** sshnaidm is now known as sshnaidm|afk | 18:05 | |
opendevreview | Clark Boylan proposed opendev/system-config master: Manage apt.conf.d/20auto-upgrades https://review.opendev.org/c/opendev/system-config/+/826145 | 18:22 |
opendevreview | Clark Boylan proposed opendev/system-config master: Replace 10periodic with 20auto-upgrades https://review.opendev.org/c/opendev/system-config/+/826146 | 18:22 |
clarkb | infra-root ^ that is followup on a thing we discovered last week | 18:23 |
clarkb | I'm not 100% sure I got the 10periodic -> 20auto-upgrades replacement in debuntu packaging correct so I split the changes in two and reviewers can double check that. I'm happy to squash the changes if we prefer too | 18:23 |
clarkb | But the first one should be very safe to land as it reflects the state of most of our servers | 18:24 |
clarkb | frickler: gerrit has clarified that it is only the automatic setting of the reviewed flag on files that they are looking at changing. I guess you would be able to make it explicitly if you like | 18:29 |
clarkb | frickler: is the automatic flip important to you or is it sufficient to flip it manually? | 18:30 |
*** Guest414 is now known as prometheanfire | 18:47 | |
opendevreview | Clark Boylan proposed opendev/system-config master: Upgrade Gerrit to 3.4 https://review.opendev.org/c/opendev/system-config/+/826148 | 18:56 |
opendevreview | Clark Boylan proposed opendev/system-config master: Manage apt.conf.d/20auto-upgrades https://review.opendev.org/c/opendev/system-config/+/826145 | 19:17 |
opendevreview | Clark Boylan proposed opendev/system-config master: Replace 10periodic with 20auto-upgrades https://review.opendev.org/c/opendev/system-config/+/826146 | 19:17 |
clarkb | I missed the testinfra testing for those files. Should be better now | 19:17 |
fungi | clarkb: sorry, i've been pulled in 50 different directions today. looking over the upgrade plan again now | 19:26 |
fungi | we're planning to start downtime at 2200z right? | 19:26 |
clarkb | yes 2200z is what we emailed about | 19:26 |
clarkb | I think the big open question is whether or not we're trying to combine steps for the two different actions or if we want to run the rename to full completion first then start the upgrade after | 19:27 |
clarkb | My only concern with doing them as completely separate tasks is the amount of time we might have to deal with zuul. I suppose we can always dequeue/enqueue in order to speed things up though so maybe doing one then the other each to completion is fine | 19:27 |
fungi | where "full completion" means bringing gerrit back online and waiting for changes to merge and deploy? | 19:28 |
clarkb | yes | 19:28 |
clarkb | the alternative is wait for gerrit to come back online and spot check the rename was fine. Then do the upgrade. Then merge and deploy the upgrade and rename changes together | 19:28 |
clarkb | But it would also be good for people to double check the rename input file in the opendev/project-config change and review the changes to land after the rename and the upgrade image change change | 19:29 |
fungi | and i guess the reason we don't want to do the rename after the upgrade is that it complicates any potential version rollback? | 19:31 |
clarkb | yes | 19:31 |
clarkb | also 3.4 has never had renames done in production. We do test renames against 3.3 and 3.4 though | 19:33 |
fungi | both very good points | 19:33 |
clarkb | (we have done one rename on 3.3) | 19:35 |
clarkb | Overall I'm reasonably confident in both tasks. More just concerned about how best to combine them. | 19:37 |
clarkb | https://etherpad.opendev.org/p/project-renames-2022-01-24 is the etherpad if anyone needs it | 19:39 |
fungi | so given all the above, i agree rename first and then upgrade is the way to go, we're not stopping/starting zuul so i suppose the runtime for merging the changes won't be bad, though the deploy to run manage-projects likely will be dicey | 19:40 |
fungi | the manage-projects run should be a noop, so it's really the config update we care about, right? | 19:42 |
clarkb | yes it's mostly just a sanity check to ensure that the change lands to reflect the new state and that when we run manage-projects it noops | 19:42 |
fungi | maybe we could speed it along by manually updating the projects.yaml? | 19:42 |
clarkb | We don't expect it to get any real work done, just confirming it runs as expected | 19:42 |
fungi | we could manually run manage-projects too | 19:43 |
clarkb | Ya I think what we could do is dequeue the hourly jobs. Then when we force merge the change it should enqueue and run right away? | 19:43 |
fungi | as long as we don't have half a dozen other deploy jobs triggered by the same merge which have to run before it | 19:44 |
clarkb | I don't think we will it should just be review and manage projects iirc | 19:44 |
clarkb | it is triggered by project-config which has very limited deploy actions | 19:44 |
clarkb | So ya maybe the thing to do is run the rename to completion and manually dequeue the hourly jobs if necessary, remove the hosts from emergency, force merge project-config update, ensure it runs happily, then put review back into emergency? | 19:45 |
clarkb | Then we can roll through ianw's process pretty much as is | 19:45 |
fungi | in theory we could simply approve the project-config change so that it can merge normally, but if you want to bypass zuul to merge it then that does save a bit of time | 19:47 |
fungi | and yes, the real risk is that we end up merging some other change to projects.yaml first which triggers a manage-projects run and recreates the old project names | 19:48 |
fungi | if we're careful not to approve any other changes then it's just a question of whether we want to incur the gating delay | 19:48 |
clarkb | I think we do have to force merge it because zuul won't know about the new project yet (I don't know that merging the change will trigger zuul jobs either) | 19:48 |
clarkb | fungi: well it's not just that because the other option is to do it after the 3.4 upgrade | 19:49 |
clarkb | which is mostly what I'm wondering about. Should we try to combine the post change merges for the rename and upgrade for after both are done | 19:49 |
clarkb | or do them sequentially | 19:49 |
clarkb | I think force merging is a given for the project-config change either way | 19:49 |
fungi | the project-config change which updates projects.yaml is already mergeable | 19:50 |
fungi | updates gerrit/projects.yaml i mean | 19:50 |
fungi | the zuul.d/projects.yaml change is separate, and yes zuul will need to find out about the project rename for that to work | 19:51 |
clarkb | oh right | 19:51 |
clarkb | ya it's the second change that would need to wait and we should merge the second change normally | 19:51 |
fungi | in theory we could merge both normally, it's merely a question of whether we want to wait for gating on the first one | 19:52 |
fungi | (or on either one for that matter) | 19:52 |
clarkb | ya I think I'm coming around to lets do the rename and complete it first then do the upgrade. And if we do that we do not want to wait for gating in the middle | 19:53 |
*** melwitt is now known as Guest440 | 19:54 | |
fungi | i'll go ahead and review the changes. going to need to take a break before maintenance to make/eat dinner | 19:55 |
clarkb | ya I'm about to have lunch now | 19:55 |
fungi | otherwise i could be eating very late if the window runs long | 19:55 |
clarkb | fungi: I updated the process on the etherpad to reflect ^ basically lets do the rename then force merge the first project-config change and ensure manage-projects runs normally as expected. Then put review02 back in the emergency file and do the upgrade once reindexing is done | 19:55 |
clarkb | we have to wait for reindexing anyway so may as well use that time to complete the rename fully | 19:56 |
fungi | thankfully, reindexes are way faster than they used to be | 19:56 |
clarkb | yup | 19:57 |
clarkb | the other nice thing about doing it this way is we can worry about one set of challenges at a time. If we run over time a bit oh well | 19:57 |
fungi | agreed | 19:58 |
clarkb | Alright lunch now. Back in a bit | 19:59 |
fungi | all four changes related to the maintenance lgtm (822222, 825680, 825677, 826148) | 20:04 |
clarkb | I put some #status notice drafts into the etherpad | 20:41 |
fungi | ooh, thanks! | 20:42 |
clarkb | if those look good I'll send the first one out at 21:00 UTC and put servers in the emergency file | 20:42 |
fungi | clarkb: i often tack on the url for the maintenance announcement from the ml archive just for completeness | 20:43 |
ianw | o/ | 20:43 |
fungi | ianw: good morning! we're on track for the announced time | 20:44 |
fungi | some adjustments have been made to the plan in order to speed up the delay between renaming and upgrading | 20:44 |
clarkb | fungi: like that? | 20:45 |
fungi | clarkb: yep, lgtm | 20:45 |
clarkb | ianw: ya basically I think the plan I've settled on is that we should try and complete the rename as completely as possible before the upgrade. This means force merging the project-config change so that we run the manage-projects job more quickly rather than waiting around for that | 20:45 |
clarkb | ianw: then once we're happy with the renaming and reindexing has completed we can proceed with the gerrit upgrade. One question for you is if you think we need to land the 3.4 update change prior to upgrading or if we can land that after? I've proposed after in my etherpad since that will speed things up a bit but your etherpad indicates before | 20:46 |
clarkb | otherwise I think we can largely follow your etherpad for the gerrit bits | 20:46 |
ianw | what's the rename etherpad link sorry -- i closed it and can't find it now | 20:47 |
fungi | https://etherpad.opendev.org/p/project-renames-2022-01-24 | 20:47 |
clarkb | re dequeue I'm suddenly remembering that I don't know if we can dequeue the hourly jobs because they are periodic | 20:48 |
clarkb | corvus: ^ do you know? | 20:48 |
clarkb | I can try to dequeue the 2100 UTC hourly jobs | 20:48 |
fungi | yeah, that'll be a good test | 20:50 |
fungi | i was able to use zuul-client dequeue successfully yesterday from zuul02 | 20:50 |
clarkb | for a check entry though right? | 20:50 |
fungi | right | 20:50 |
fungi | though also the webui might be able to do it | 20:50 |
clarkb | I don't know how to login to the web ui | 20:51 |
fungi | i have not tested authenticated actions in the zuul webui yet | 20:51 |
clarkb | I think I need a keycloak account which I've not made | 20:51 |
ianw | clarkb: i think 826148 can land after, but we just need to hand edit in a s/3.3/3.4/ to the docker-compose then? | 20:52 |
clarkb | ianw: yup that is what I proposed on my etherpad | 20:52 |
fungi | clarkb: ahh, yeah i was helping test the keycloak poc so have an account i can try with | 20:52 |
ianw | ++ | 20:53 |
clarkb | ok cool I think we have a reasonable plan then | 20:53 |
clarkb | ianw: do you want me to finish migrating your steps from your etherpad to mine or should we just jump from mine to yours when ready? | 20:54 |
ianw | i think we can minimise work and just jump over? i can update to reflect hand editing | 20:55 |
clarkb | wfm | 20:55 |
*** melwitt_ is now known as melwitt | 20:55 | |
clarkb | ianw: can you also edit it to indicate we shouldn't start upgrade until the reindexing for renaming is done? | 20:55 |
clarkb | that was the other small thing | 20:55 |
fungi | nevermind, i think i have an account in a different test realm than the one zuul is set up for, so this is probably not the time to try to iron it out | 20:56 |
fungi | best to just try with zuul-client for now | 20:57 |
ianw | yeah, on that, i think we can see that in the config file. https://gerrit-review.googlesource.com/c/homepage/+/328539 ; luca likes to drop some cryptic-ish comments which i'm trying to convert to "not a gerrit java developer" level :) | 20:57 |
clarkb | ya though I am fairly certain it will fail now that I think about it because there isn't a way to express the ref to remove iirc | 20:57 |
clarkb | ianw: you can do a gerrit task listing against the ssh api | 20:57 |
clarkb | it also logs when it is done in the error log iirc | 20:58 |
*** melwitt is now known as Guest450 | 20:59 | |
clarkb | I'm going to send the first notice now | 21:02 |
clarkb | then I'll update the emergency file. Then I'll try dequeuing | 21:02 |
clarkb | #status notice review.opendev.org will have a few short outages over the next few hours (beginning 22:00 UTC) while we rename projects and then upgrade to Gerrit 3.4. See https://lists.opendev.org/pipermail/service-announce/2022-January/000030.html for details. | 21:02 |
opendevstatus | clarkb: sending notice | 21:02 |
-opendevstatus- NOTICE: review.opendev.org will have a few short outages over the next few hours (beginning 22:00 UTC) while we rename projects and then upgrade to Gerrit 3.4. See https://lists.opendev.org/pipermail/service-announce/2022-January/000030.html for details. | 21:02 | |
clarkb | ok I've just realized one minor thing. Zuul has multiple schedulers now | 21:04 |
*** melwitt_ is now known as melwitt | 21:05 | |
*** melwitt is now known as Guest451 | 21:07 | |
opendevreview | Clark Boylan proposed opendev/system-config master: Run zuul project rename steps on a single scheduler https://review.opendev.org/c/opendev/system-config/+/826156 | 21:07 |
clarkb | I think we'll need ^ that | 21:08 |
clarkb | we can manually apply it for this time | 21:08 |
clarkb | the dequeue worked as listed on the etherpad | 21:10 |
*** Guest451 is now known as melwitt | 21:10 | |
clarkb | I updated the plan to use a checkout of 826156 | 21:12 |
clarkb | the etherpad notes don't list zuul as needing to go in the emergency file. I put zuul01 and zuul02 in the emergency file anyway as I don't think it will hurt anything | 21:15 |
corvus | clarkb: was your Q answered? | 21:22 |
clarkb | corvus: yes we tested against the 2100 UTC enqueue and it worked. I guess zuul-client is smarter than the gearman one | 21:23 |
clarkb | ianw: I've just remembered (yay) that the manage-projects.yaml job doesn't sync project-config to the remote before updating gitea servers because it uses the checkout on bridge for that. Do you know what will update project-config on bridge? Is it just the sync for the change landing? | 21:24 |
corvus | kk | 21:25 |
clarkb | ianw: ya it looks like that is what does it: https://zuul.opendev.org/t/openstack/build/485ec807009d41929213af2f080e58ac/log/job-output.txt#132 | 21:26 |
clarkb | so ya I think we are safe to force merge and have those jobs run as normal | 21:27 |
clarkb | alright I'm going to take a break now and be back before 22:00 UTC to start things | 21:28 |
clarkb | I did start a root screen on bridge that others can join when we are ready too | 21:29 |
ianw | clarkb: yeah i think that's right; runs against the zuul version | 21:43 |
clarkb | ianw: where does ^A H log to? | 21:46 |
clarkb | I can turn that on in the bridge screen but not sure where it will write | 21:47 |
opendevreview | James E. Blair proposed zuul/zuul-jobs master: Add upload-logs-ibm role https://review.opendev.org/c/zuul/zuul-jobs/+/826158 | 21:47 |
fungi | it will write something like a hardcopy.txt in the current working dir | 21:47 |
fungi | i don't remember the precise extension (if it even has one) | 21:48 |
*** dviroel is now known as dviroel|afk | 21:48 | |
clarkb | looks like hardcopy then a digit | 21:48 |
clarkb | I'll do that now | 21:48 |
clarkb | oh it told me screenlog.0 | 21:48 |
ianw | yep :) | 21:50 |
ianw | i've attached to the screen | 21:50 |
clarkb | we'll start with the review02 reboot | 21:51 |
clarkb | I'll docker-compose down and reboot then docker-compose up afterwards to make sure gerrit is in the state that the playbook expects | 21:52 |
fungi | i've joined the screen session, thanks | 21:54 |
clarkb | fungi: ianw: and my use of a local checkout of system config to get https://review.opendev.org/c/opendev/system-config/+/826156 looks good to you? | 21:55 |
clarkb | I'll be on the meetpad for our etherpad just because I can be if anyone else wants to join | 21:57 |
ianw | ++ from me | 21:57 |
fungi | lgtm, yeah. i don't think i'm set up for meetpad at the moment unfortunately but happy to coordinate in here and in the etherpad(s) | 21:58 |
clarkb | ya if others prefer here that's fine | 21:58 |
clarkb | no one has joined and I'm ok with that | 21:59 |
clarkb | alright it is 22:00 now | 22:00 |
clarkb | sending the second notification | 22:00 |
clarkb | #status notice The review.opendev.org maintenance work is beginning now. Expect Gerrit outages over the next couple of hours. See https://lists.opendev.org/pipermail/service-announce/2022-January/000030.html for details. | 22:00 |
opendevstatus | clarkb: sending notice | 22:00 |
-opendevstatus- NOTICE: The review.opendev.org maintenance work is beginning now. Expect Gerrit outages over the next couple of hours. See https://lists.opendev.org/pipermail/service-announce/2022-January/000030.html for details. | 22:00 | |
ianw | all you would hear here is me nagging kids that if they don't do their chores as requested they will not be leaving for their playdates with friends :) | 22:00 |
clarkb | infra-root I'll go ahead with the review02 docker-compose down and reboot. then start gerrit up post reboot then we can run the playbook | 22:01 |
fungi | sounds good, thanks | 22:01 |
clarkb | it is rebooting now | 22:01 |
fungi | ianw: oh, is it summer break there? | 22:01 |
clarkb | gerrit is starting up again | 22:02 |
ianw | yes, back to school next week | 22:02 |
clarkb | uname -a lgtm | 22:02 |
ianw | also back in and lgtm | 22:03 |
clarkb | it says it is ready | 22:03 |
clarkb | I already checked that I can ssh from bridge to the hosts involved in this playbook | 22:03 |
clarkb | That takes us to running the rename playbook. Are we ready to do that now? | 22:03 |
*** melwitt is now known as Guest456 | 22:04 | |
clarkb | this bit will run in the screen | 22:04 |
fungi | yep, seems fine | 22:04 |
clarkb | ok proceeding now | 22:04 |
clarkb | fungi: maybe you can prep to do the force merge? I can get the dequeue ready and ianw will probably want to monitor the reindexing | 22:08 |
fungi | reindexing should be underway now | 22:08 |
fungi | yeah, i'll prepare for submitting the change | 22:08 |
clarkb | and ya the playbook seems to have run happily | 22:09 |
fungi | following our process documented at https://docs.opendev.org/opendev/system-config/latest/sysadmin.html#force-merging-a-change | 22:09 |
clarkb | fungi: ya can you hold off on actually doing it until we dequeue the hourly jobs and pull hosts out of emergency? | 22:09 |
clarkb | but I know you have to add yourself to groups and stuff so figured getting started was a good idea | 22:10 |
clarkb | the gitea redirects lgtm | 22:10 |
clarkb | and the projects show up in gerrit so I think sanity checking can be marked done | 22:10 |
fungi | right, that's what i'm doing | 22:10 |
clarkb | cool I'll proceed with dequeuing hourly now | 22:10 |
ianw | yeah i can see all the index jobs, keeping an eye | 22:10 |
fungi | it's just 822222 we want merged right? | 22:11 |
clarkb | fungi: yes https://review.opendev.org/c/openstack/project-config/+/822222 since that is the one that updates projects.yaml | 22:12 |
*** Guest456 is now known as melwitt | 22:12 | |
clarkb | fungi: you can merge now | 22:12 |
clarkb | hosts are removed from the emergency file and the hourly runs have been dequeued | 22:13 |
opendevreview | Merged openstack/project-config master: Rename skyline/skyline* project to openstack/skyline* https://review.opendev.org/c/openstack/project-config/+/822222 | 22:13 |
*** melwitt is now known as Guest458 | 22:13 | |
clarkb | that replicated so that's good | 22:14 |
fungi | specifically, i followed our cut-n-paste cli example at https://docs.opendev.org/opendev/system-config/latest/sysadmin.html#gerrit-admins | 22:14 |
fungi | i'll leave my admin account in project bootstrappers for the moment in case we need to do that for any other changes related to the maintenance | 22:14 |
*** Guest458 is now known as melwitt | 22:15 | |
*** dviroel|afk is now known as dviroel | 22:15 | |
clarkb | sounds good | 22:15 |
clarkb | I figured the reindexing would be slow and it appears to not be fast :) I'm happy we didn't try to squash things together given that. | 22:17 |
clarkb | but given there is online reindexing with the 3.4 upgrade not having a stale index felt important (we'll backup a good index) | 22:17 |
fungi | yes | 22:18 |
ianw | it did 20 indexing tasks in 27 seconds | 22:19 |
ianw | 1485 to go -- 20 minutes or so maybe | 22:19 |
clarkb | ianw: ya the cost of each task depends on the size of the repos themselves and the number of changes etc | 22:20 |
clarkb | I think the manage projects playbook is complete | 22:20 |
clarkb | the opendev.org redirects still work as expected | 22:20 |
fungi | yeah, the tasks are per-repo and the duration of each seems to scale linearly by the number of changes for a repo | 22:20 |
clarkb | and searching gerrit for skyline only shows the new names | 22:20 |
clarkb | I think that means manage-projects appropriately noop'd | 22:20 |
clarkb | I skimmed the log too and didn't see anything unexpected and the project-config on review02 was updated | 22:21 |
clarkb | Other than the reindexing and the zuul config update which should happen shortly I think we can consider this largely done for the renames | 22:21 |
fungi | the change reindexer packs its threads with the largest (change-count-wise) repos first, in order to maximize parallel throughput | 22:22 |
fungi | so in theory it should speed up a bit as it finishes the longer-running ones | 22:22 |
clarkb | I think that might not be true for online reindexing | 22:22 |
clarkb | since nova is queued up still | 22:22 |
clarkb | there may be an optimization that upstream could make there | 22:22 |
fungi | oh, maybe they changed it :/ | 22:22 |
clarkb | anyway this seems like a hurry up and wait for reindexing step so I'm going to grab a glass of water back in a few | 22:22 |
fungi | k | 22:23 |
clarkb | under 1k tasks now | 22:26 |
clarkb | I've rechecked https://review.opendev.org/c/openstack/project-config/+/825680 as zuul should know about the renamed projects now | 22:29 |
ianw | getting close now | 22:30 |
clarkb | I wonder if we need to wait for the zuul smart reconfigure to end | 22:31 |
clarkb | because zuul pulls from gerrit | 22:31 |
clarkb | corvus: ^ is there an easy way to track when that is done? | 22:31 |
clarkb | maybe when the zuul queue length drops back to 0 it is implied | 22:32 |
clarkb | I think zuul may be done as it has enqueued the change I rechecked | 22:34 |
clarkb | and now it has queued a job for that change so ya I think we're good there | 22:34 |
fungi | yeah, otherwise it would have errored back immediately | 22:34 |
clarkb | should I exit the root screen on bridge? I don't think there is anything else to do there | 22:35 |
ianw | fd7cfc10 22:08:46.774 Index all changes of project openstack/nova | 22:35 |
ianw | that's the long ones | 22:36 |
fungi | the upgrade will be done from review.o.o directly, yeah. i'm fine closing out the session on bridge | 22:36 |
clarkb | ya though it seems to be making decent progress | 22:36 |
clarkb | fungi: yup upgrade will be on review02 | 22:36 |
clarkb | ok closing the bridge screen now | 22:36 |
fungi | i have a root screen session started on review now | 22:37 |
clarkb | [2022-01-24T22:36:58.885Z] [Reindex changes v60-v60] INFO com.google.gerrit.server.index.OnlineReindexer : Reindex changes to version 60 complete | 22:37 |
clarkb | ianw: I think you are good to proceed with your upgrade now. Let me know how I can help :) | 22:37 |
fungi | we need review back in the emergency list now, yeah? | 22:38 |
clarkb | yes | 22:38 |
clarkb | fungi: https://etherpad.opendev.org/p/gerrit-upgrade-3.4 has the steps for review now | 22:38 |
fungi | yep | 22:38 |
fungi | i'll add it | 22:38 |
fungi | i can edit the docker-compose file too | 22:39 |
fungi | see if what's in the screen session looks right | 22:40 |
fungi | if so, i'll pull | 22:40 |
clarkb | it does, but I was going to let ianw drive | 22:40 |
fungi | sure, waiting for him to be ready | 22:41 |
ianw | i'm happy to drive if we like | 22:41 |
fungi | feel free! | 22:41 |
clarkb | I mean I'm happy to let fungi do things I just meant I didn't want to say go for the next step if ianw wasn't ready :) | 22:41 |
clarkb | but ya go for it | 22:41 |
fungi | we're up to step u4 | 22:41 |
clarkb | fungi: I think you're on the wrong etherpad | 22:41 |
clarkb | though I don't think any of the steps you've done interfere greatly with ianw's document | 22:41 |
clarkb | sorry maybe leaving my plan on the other one is confusing | 22:42 |
fungi | is there another one besides https://etherpad.opendev.org/p/project-renames-2022-01-24 and https://etherpad.opendev.org/p/gerrit-upgrade-3.4 ? | 22:42 |
clarkb | fungi: just those two but https://etherpad.opendev.org/p/gerrit-upgrade-3.4 has the canonical steps for the gerrit upgrade | 22:43 |
clarkb | after we discussed earlier I noted on my etherpad that these steps are historical as ianw updated his more detailed steps on his pad | 22:43 |
fungi | for some reason gerrit-upgrade-3.4 shows me only myself on the pad | 22:44 |
fungi | okay, reloaded and now i see everyone else on it too | 22:44 |
fungi | i guess it was a etherpad session bug | 22:44 |
clarkb | ianw: if the shas don't match it may be due to the image update on friday | 22:44 |
clarkb | we did an update to pull in a logging bugfix | 22:44 |
clarkb | looks like they match what you've got in the etherpad though | 22:45 |
ianw | yep f64852c9fb561bea81f9e522ad6c0edb3e81168ea392348f2392674c77771bf4 is the latest, and that's what we have | 22:45 |
fungi | lgtm | 22:45 |
clarkb | gerrit log says it is ready | 22:47 |
ianw | com.google.gerrit.pgm.Daemon : Gerrit Code Review 3.4.3-22-gf8ae0e043a-dirty ready | 22:47 |
clarkb | I get my personal dashboard | 22:48 |
ianw | looks up | 22:48 |
ianw | i'm seeing grey circles next to peoples names, that is new i think | 22:48 |
clarkb | I think those might show a profile pic if properly configured somehow | 22:49 |
clarkb | and yes I see them too after a hard refresh | 22:49 |
ianw | wonder if we can plug into gravatar or something | 22:49 |
ianw | that must be where upstream got my head profile | 22:50 |
clarkb | I can generally browse changes | 22:50 |
ianw | ok, has zuul picked something up | 22:50 |
clarkb | I don't see anything since the restart | 22:50 |
fungi | we can approve something like 825677 | 22:51 |
fungi | see if it notices? | 22:51 |
opendevreview | Clark Boylan proposed opendev/system-config master: DNM testing things https://review.opendev.org/c/opendev/system-config/+/826190 | 22:51 |
clarkb | I just pushed ^ to check | 22:51 |
ianw | i put in a recheck on https://review.opendev.org/c/opendev/grafyaml/+/825990 and it is running | 22:51 |
clarkb | cool | 22:51 |
fungi | thanks | 22:51 |
fungi | lgtm | 22:52 |
clarkb | https://review.opendev.org/825680 should report back soon too | 22:52 |
ianw | only thing unconfirmed is gitea replication but i think we can keep an eye on that | 22:52 |
clarkb | ianw: the replication log should show us if 826190 replicated. Looking | 22:53 |
ianw | hrm, i wonder why we have differences in the config files | 22:54 |
clarkb | new gerrit will add stuff to configs and it never makes sense | 22:54 |
fungi | https://opendev.org/opendev/system-config/commit/51d9d04c2a70d0750f1abc48a50ed5eeb9cf6d90 | 22:55 |
ianw | but we should be validating that and erroring out in the job now | 22:55 |
fungi | replication seems to have worked | 22:55 |
clarkb | ianw: oh I see | 22:56 |
fungi | zuul reported back +1 on 825680 now | 22:56 |
clarkb | fungi: thanks for looking I was just catching up to that in the logs and I see some replication completed events. I see some not attempted log messages for the change metadata but I think that may be intentional since it's a lot of data? | 22:56 |
fungi | not sure, i thought we were replicating change metadata previously | 22:57 |
fungi | at least the notes ref | 22:57 |
clarkb | ianw: what are you diffing? | 22:57 |
clarkb | I just did a manual diff and I think it is empty? | 22:57 |
clarkb | well I diffed the main config file. I'm assuming the delta you found is in another file | 22:58 |
ianw | diff -urN /home/gerrit2/tmp/upgrade-3.3/etc / | 22:58 |
ianw | home/gerrit2/review_site/etc > /home/gerrit2/tmp/upgrade-3.3/config.diff | 22:58 |
clarkb | ianw: that's the 3.3 path? | 22:59 |
clarkb | I think you might be diffing against 3.2 configs | 22:59 |
clarkb | we named the dirs after the target version not the version they were generated with | 22:59 |
clarkb | ya `cp -r /home/gerrit2/review_site/etc /home/gerrit2/tmp/upgrade-3.4` happened pre upgrade | 23:00 |
clarkb | so upgrade-3.4 has the 3.3 configs in it | 23:00 |
ianw | oh doh, i've put in the wrong command in the etherpad | 23:00 |
clarkb | is it clean when you use the newer path? | 23:00 |
*** dviroel is now known as dviroel|out | 23:00 | |
fungi | looking like i won't need bootstrappers membership for my admin account any longer, so i'll remove it again | 23:01 |
fungi | and done | 23:02 |
clarkb | reading the replication log more carefully I think the NOT_ATTEMPTED is logged before we attempt it. Then it completes a few seconds later and we actually are replicating the meta stuff | 23:03 |
ianw | ok, yeah it's clean. will fix those commands before 3.5 upgrade! | 23:03 |
fungi | once we merge a new change i can check whether i get an updated review notes ref | 23:03 |
clarkb | cool. I'm double checking replication of the meta ref now | 23:04 |
ianw | going to quit the screen now | 23:04 |
clarkb | but ya I think we can generally call this done and monitor as we land changes to catch up | 23:04 |
fungi | thanks | 23:04 |
fungi | ready for me to approve the remaining changes, or did someone else want to? | 23:04 |
fungi | also i guess we need to take the server back out of the disable list now? | 23:05 |
clarkb | refs/changes/90/826190/meta I can fetch that from opendev.org and that change was created after the upgrade so we wouldn't be fetching old meta there | 23:05 |
clarkb | I think replication is happy | 23:05 |
clarkb | ya I think we can remove from emergency, then approve the various changes. I'll get links in a minute. | 23:05 |
clarkb | Then we can also consider dequeuing the hourly jobs | 23:06 |
clarkb | ianw: ^ that sounds good? | 23:06 |
ianw | ++ i think we're ready to go | 23:06 |
fungi | that technically reverses the order of steps 15 and 16 | 23:07 |
fungi | do we care if they're done at the same time? | 23:07 |
clarkb | https://review.opendev.org/c/opendev/system-config/+/826148 https://review.opendev.org/c/opendev/system-config/+/826156 https://review.opendev.org/c/opendev/project-config/+/825677 https://review.opendev.org/c/openstack/project-config/+/825680 | 23:07 |
clarkb | fungi: no I think it is fine | 23:07 |
clarkb | worst case we'll change the docker-compose.yaml back to 3.3 for a short time | 23:07 |
clarkb | the reindexing is still running so gerrit will probably be a little slower than usual until that is done | 23:08 |
fungi | okay, i'll start approving unless someone else wants to | 23:08 |
clarkb | I think you can go for it. I pushed many/most of them so should avoid approving if I can | 23:08 |
ianw | have we removed from emergency? | 23:08 |
clarkb | doesn't look like it | 23:09 |
clarkb | fungi: ^ were you still going to do that? | 23:09 |
fungi | we haven't, but i can | 23:09 |
ianw | ok, i just did it since i had it open | 23:09 |
opendevreview | Merged opendev/project-config master: Record project renames happening January 24, 2022 https://review.opendev.org/c/opendev/project-config/+/825677 | 23:09 |
fungi | wfm, thanks | 23:09 |
ianw | i've approved 826148 to update the docker-compose | 23:10 |
fungi | confirmed, refs/notes/review was updated for 825677 and tells me the approval vote and submitted time | 23:12 |
clarkb | cool | 23:12 |
fungi | Submitted-at: Mon, 24 Jan 2022 23:09:34 +0000 | 23:12 |
clarkb | We can start looking at adding gerrit 3.5 image builds next, update the upgrade job to do 3.4 to 3.5 :) But that isn't urgent | 23:12 |
fungi | and git claims it pulled refs/notes/review from opendev.org not review.opendev.org | 23:12 |
fungi | so pretty sure that proves it replicated to gitea | 23:13 |
clarkb | and I'd like to keep the 3.3 image builds around until we are confident we won't revert | 23:13 |
fungi | of course | 23:13 |
ianw | ++. this feels routine now -- a testament to a lot of hard work put in! | 23:14 |
clarkb | Yes! thank you everyone who helped make this as easy as it has been | 23:15 |
opendevreview | Merged openstack/project-config master: Configure renamed skyline projects with openstack jobs https://review.opendev.org/c/openstack/project-config/+/825680 | 23:17 |
fungi | very smooth indeed. thanks ianw and clarkb for all the upgrading and renaming planning! | 23:17 |
clarkb | fungi: ianw do we want a #status notice along the lines of "Unless any major issues show up we expect the review.opendev.org maintenance to be complete." ? | 23:19 |
fungi | we don't usually, but up to you | 23:19 |
clarkb | ya I'm fine leaving it out then | 23:20 |
clarkb | reindexing is complete too now | 23:24 |
clarkb | Anything else I can do to be helpful? if not I'm going to take a break. Then check back in in half an hour or so. I'll send out the meeting agenda then too | 23:24 |
ianw | i think we're all good, thanks! | 23:24 |
clarkb | side note: we've overtaken wikimedia's gerrit now :) | 23:25 |
clarkb | not that it is a competition, but really does show how well we've managed to get ahead of where we were in the past | 23:25 |
clarkb | ok back in a bit | 23:25 |
opendevreview | Ian Wienand proposed opendev/system-config master: gerrit: add avatar-gravatar plugin https://review.opendev.org/c/opendev/system-config/+/826196 | 23:28 |
*** rlandy|ruck is now known as rlandy|out | 23:39 | |
johnsom | How do we audit gerrit access these days? The access page is blank now, https://review.opendev.org/admin/repos/openstack/octavia,access | 23:50 |
opendevreview | Ian Wienand proposed openstack/project-config master: gerrit: add gravatar avatar plugin https://review.opendev.org/c/openstack/project-config/+/826202 | 23:51 |
opendevreview | Ian Wienand proposed opendev/system-config master: gerrit: add avatar-gravatar plugin https://review.opendev.org/c/opendev/system-config/+/826196 | 23:52 |
clarkb | johnsom: through openstack/project-config/gerrit/acls | 23:55 |
clarkb | johnsom: unfortunately one of the side effects of the bug that fungi and I discovered before we upgraded to 3.2 was that to fix it they locked that down much more | 23:56 |
johnsom | Ah, ok, a new, maybe temporary thing. | 23:56 |
clarkb | It might be worth having a conversation with them about opening that up again just for the acl listings | 23:56 |
clarkb | johnsom: ya the problem was they were exposing too much notedb stuff previously. They locked that down and now they expose too little :/ | 23:57 |
clarkb | it isn't ideal | 23:57 |
johnsom | Yeah, it was super handy | 23:57 |
clarkb | we might be able to expose it specifically in the leaf projects | 23:57 |
clarkb | I think the main issue is in the central projects All-Projects and All-Users you do not want to expose those refs | 23:58 |
clarkb | fungi: ^ may recall more details on that | 23:58 |
clarkb | infra-root https://review.opendev.org/c/opendev/system-config/+/826145 and child are relevant to things fungi and I discovered last week. Not all servers had properly enabled unattended-upgrades. | 23:58 |
clarkb | johnsom: but we don't allow any changes to those refs except through openstack/project-config/gerrit/acls so that should be a reasonable place to go for now | 23:59 |
opendevreview | Ian Wienand proposed opendev/grafyaml master: Generate and use UID for acessing dashboards https://review.opendev.org/c/opendev/grafyaml/+/825990 | 23:59 |
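
The config-diff mix-up discussed between 22:58 and 23:03 came from the snapshot directories being named after the target version, so `upgrade-3.4` holds the pre-upgrade 3.3 configs. A minimal sketch of how the two steps could be tied to one version parameter follows. The paths are the ones quoted in the log; the variable names `TARGET`, `SITE`, `SNAP` and the function names are hypothetical, and writing the result to `config.diff` inside the snapshot directory is an assumption based on the command ianw pasted.

```shell
#!/bin/sh
# Hypothetical helper names; paths follow those quoted in the log.
TARGET="${TARGET:-3.4}"                            # version being upgraded TO
SITE="${SITE:-/home/gerrit2/review_site}"          # live Gerrit site directory
SNAP="${SNAP:-/home/gerrit2/tmp/upgrade-$TARGET}"  # snapshot dir, named after the target

# Before the upgrade: copy the current configs into the snapshot directory.
snapshot_config() {
    mkdir -p "$SNAP" && cp -r "$SITE/etc" "$SNAP"
}

# After the upgrade: compare the snapshot with the live configs.
# diff exits 0 when the trees are identical and 1 when they differ.
diff_config() {
    diff -urN "$SNAP/etc" "$SITE/etc" > "$SNAP/config.diff"
}
```

Because both functions derive the directory from the same `TARGET` value, a later upgrade would only need that one value changed, which may help avoid diffing against an older snapshot.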
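
For the reindex estimate at 22:19 (20 indexing tasks in 27 seconds, 1485 to go), a straight linear extrapolation can be sketched as below. The function name `eta_seconds` is hypothetical, and the calculation assumes every task costs the same, which the discussion in the log says is not the case since task duration scales with the number of changes in each repo. Treat the result as a rough figure only.

```shell
#!/bin/sh
# eta_seconds DONE ELAPSED_SECONDS REMAINING
# Linear extrapolation in integer seconds (rounded down).
eta_seconds() {
    echo $(( $3 * $2 / $1 ))
}

secs=$(eta_seconds 20 27 1485)
echo "${secs}s (~$(( secs / 60 )) min)"   # prints "2004s (~33 min)"
```

The log shows the rename reindex reporting complete at 22:36:58, sooner than this figure suggests, so the early sample rate was evidently not representative of the remaining tasks.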