Monday, 2023-11-27

tkajinamhmm. it seems reset author works. wondering if something can be fixed at infra side or I should update number of patches by my side...01:28
tkajinamhmmm. we may face the same problem when we attempt to backport a change but its author's email can't be validated so I think we have to fix it if the change in email address is the cause02:35
fricklertkajinam: the issue is a bug in gerrit, I mentioned a patch yesterday we could likely apply, but we'll have to wait for people to recover from stuffing turkeys it seems, so if you need some patch to merge soon, better do the email amending07:05
tkajinamfrickler, yeah I did reset-author for some patches we may want to merge soon, but will leave the other non-urgent ones until people come back from nice thanksgiving holidays and we get a conclusion.07:16
mrungeHow would one look at CI results these days, like for
ykareluntil the issue is fixed can check log url in
mrungethank you ykarel! This is good to know10:49
opendevreviewTakashi Kajinami proposed openstack/project-config master: Redirect irc notifications from os-(apply|collect|refresh)-config
opendevreviewTakashi Kajinami proposed openstack/project-config master: Move os-(apply|collect|refresh)-config to heat's queue
opendevreviewTakashi Kajinami proposed openstack/project-config master: Move os-(apply|collect|refresh)-config to heat
dpanechHello, we are getting a Zuul error, it just says "Something went wrong" with no further information, could someone help? Here:
fungidpanech: yes, an upgrade over the weekend brought in a new version of a javascript library that has exposed a bug in the dashboard. we're working on fixing it now:
dpanechfungi: ok thanks15:46
fungidpanech: looks like the fix we have will solve it based on the preview build in check, so it's been approved. once it merges and new container images appear, we'll restart the web front-ends to get it into production asap16:00
dpanechfungi: thank you16:01
clarkbtkajinam: frickler: this feels like something that upstream should definitely fix rather than us carrying a local backport as others will be affected too16:07
clarkbfrickler: I can bring it up with them this morning16:07
clarkbfirst thing to confirm is that the identified fix is not on stable-3.8. It is not so upstream probably does need a backport16:13
clarkbI've asked in discord if there is a reason to not backport and how far back it needs to be backported. It does cherry pick cleanly16:28
opendevreviewMerged opendev/ master: Add DNS records for and
clarkbinfra-root are we ready to clean up gerrit images? I can reapprove that one if so17:08
fungii think so17:09
clarkbalso there are new gerrit point releases so I can actually stick a new change in the middle to update the versions and make 3.9 the proper version from the start17:09
fungiseems pretty much no chance we'll roll back the upgrade at this point17:09
tonybclarkb: I don't think we've seen anything that would cause a rollback17:09
clarkbya the issue that tkajinam and frickler debugged is probably the biggest one and we have a workaround for that and a presumed fix we can backport17:09
clarkbI'll approve it17:10
clarkband now to update the rest of that stack with the latest releases17:10
opendevreviewClark Boylan proposed opendev/system-config master: Add gerrit 3.9 image builds
opendevreviewClark Boylan proposed opendev/system-config master: Add gerrit 3.8 to 3.9 upgrade testing
opendevreviewClark Boylan proposed opendev/system-config master: Update Gerrit 3.8 images to 3.8.3
clarkbok I updated the 3.9 stuff to 3.9.0 from 3.9.0-rc5 and then stacked the 3.8.3 upgrade on top of that so that we don't have to get prod updated to 3.8.3 before adding 3.9 testing (which I think we can just go ahead and approve now if people want)17:21
clarkbtkajinam: frickler: upstream indicated that the identified fix should be backported to stable-3.8. I have done so here:
clarkbif we get lucky we'll pull it in when lands (depends on timing)17:30
opendevreviewMerged opendev/system-config master: Cleanup Gerrit 3.7 image jobs and disable Gerrit upgrade job
tonybIn frickler asked "Just wondering if we should keep increasing the ids for new mirrors or recycle deleted ones, i.e. can we go back to mirror01 instead?"  I don't have a strong preference.17:31
tonybI can see arguments for either option.  It isn't a blocker but something for us to think about in prep for the upcoming jammy->noble transition17:33
clarkbone upside to increasing the numbers is we avoid weird ansible caching problems17:34
clarkbansible isn't great about knowing a new host has the same name and you have to do manual cleanup17:34
fungiwe often have reused the lowest available number when replacing servers, unless it would cause confusion for some historical data tracking. for mirrors it seems safe enough, but also nothing i'd worry about redoing work over17:34
clarkbpersonally I like avoiding those problems in the first place and just have new numbers17:34
fungiand yes, it got more problematic when we switched to ansible17:34
fungisince you may have to clear the ansible fact cache on the bridge17:35
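The two numbering policies weighed above (always take a fresh number, or reuse the lowest free one) can be sketched as follows. This is only an illustration: the short `mirrorNN` names and the `next_mirror_name` helper are hypothetical and are not taken from system-config.

```python
import re


def mirror_numbers(hostnames):
    """Collect the numeric suffixes of names shaped like 'mirror03'."""
    numbers = set()
    for name in hostnames:
        match = re.fullmatch(r"mirror(\d+)", name)
        if match:
            numbers.add(int(match.group(1)))
    return numbers


def next_mirror_name(ever_used, in_use, recycle=False):
    """Pick a name for a replacement mirror (hypothetical helper).

    ever_used: every name that has existed, including deleted hosts.
    in_use:    names of hosts that still exist.
    recycle:   False -> always increase past the highest number seen;
               True  -> reuse the lowest number that is not in use.
    """
    seen = mirror_numbers(ever_used) | mirror_numbers(in_use)
    active = mirror_numbers(in_use)
    if recycle:
        candidate = 1
        while candidate in active:
            candidate += 1
    else:
        candidate = max(seen, default=0) + 1
    return f"mirror{candidate:02d}"
```

With `recycle=False` a new host never shares a name with a deleted one, which is what sidesteps the stale-cache cleanup mentioned above.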
clarkbinfra-root in addition to the gerrit image updates the base python image updates for python3.12 should be ready to go now and parent17:37
tonybThat's a good point about fact caching.    It is only a cache, so we could just remove it whenever we delete a host (although just removing that host would be better) as it'll just recreate on the next run.17:39
tonybThe only change where it's relevant now is the mirror03 one and it isn't worth rebuilding that server.17:40
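A minimal sketch of the "just remove that host" idea, assuming a file-based fact cache that keeps one file per host, named after the inventory hostname, in a single directory. That layout is an assumption here, not something confirmed in the discussion, and `drop_cached_facts` is a hypothetical helper.

```python
from pathlib import Path


def drop_cached_facts(cache_dir, hostname):
    """Delete one host's cached facts; return True if a file was removed.

    Assumes one cache file per host, named exactly like the host.
    The cache is rebuilt on the next run, so removal loses nothing.
    """
    # Refuse names that would escape the cache directory.
    if not hostname or Path(hostname).name != hostname:
        raise ValueError(f"unexpected hostname: {hostname!r}")
    cache_file = Path(cache_dir) / hostname
    if cache_file.is_file():
        cache_file.unlink()
        return True
    return False
```

Removing only the deleted host's entry leaves the cached facts for every other host intact, so the next run only has to regather facts for the replaced server.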
opendevreviewClark Boylan proposed opendev/system-config master: Update gitea to 1.21.1
clarkbI'm not going to bother with updating the hold node yet since we want to do gerrit key rotation first17:43
clarkbbut I wanted to keep the change up to date with upstream updates17:43
clarkbthe zuul web fix has landed.18:20
clarkbonce images promote we should probably pull and manually update the zuul-web services on zuul01 and zuul02?18:21
corvusyeah, i'll go ahead and restart zuul-web18:21
tonybclarkb, fungi: On my todo list for today is to boot the new mirrorXX.dfw did either/both of you want to do a meetup+screen session18:21
clarkbtonyb: I should be able to. What time works for you?18:22
clarkbI can do nowish or after lunch, but late afternoon I need to go and refill my bins full of leaves since they just collected them18:24
fungithanks again corvus!18:25
fungi is so convenient for monitoring restarts18:26
clarkbcorvus: thanks!18:27
corvus#status log restarted zuul-web to fix js errors18:27
opendevstatuscorvus: finished logging18:27
fungi works now!18:28
clarkbI can browse builds from the builds list to their logs18:28
clarkbmaybe we should do a #status Notice Zuul build urls should be working again18:28
fungifrickler: mrunge: ykarel: dpanech: ^18:28
corvusstatus notice zuul build urls should be working again (browser refresh may be required)18:29
corvushow about that ?18:29
clarkbcorvus: lgtm18:30
corvus#status notice Zuul build urls should be working again (browser refresh may be required)18:30
opendevstatuscorvus: sending notice18:30
-opendevstatus- NOTICE: Zuul build urls should be working again (browser refresh may be required)18:30
clarkbfrickler: tkajinam upstream has merged the stable-3.8 backport of the identified fix. That means when we land and then restart gerrit on the updated version this problem should go away18:30
opendevstatuscorvus: finished sending notice18:33
clarkbhrm the gerrit 3.8 to 3.9 upgrade fails now...18:43
fungiconnection refused from the rest api18:46
clarkbya its happening due to a lucene issue. I think this may be an actual problem with gerrit's upgrade path so I'm reporting it to them now18:47
fungi"This index was initially created with Lucene 7.x while the current version is 9.8.0 and Lucene only supports reading the current and previous major versions. This version of Lucene only supports indexes created with release 8.0 and later by default."18:48
clarkbyup thats the error18:48
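The rule stated in that error message (an index is readable only if it was created by the current or the previous Lucene major version) can be written as a one-line check. This is a sketch of the rule as quoted, not Lucene's or Gerrit's actual code, and `index_is_readable` is a hypothetical name.

```python
def index_is_readable(index_created_major, current_major):
    """True if an index created by `index_created_major` can be read by
    `current_major`, per the "current and previous major versions" rule."""
    return current_major - 1 <= index_created_major <= current_major
```

Under this rule an index created with 7.x is rejected by 9.x because the jump skips a whole major version, while an index created with 8.x would still be accepted.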
tonybclarkb: anytime 11:15 - 15:00 your time.18:48
fungiyeah, looks like they need some sort of intermediate lucene migration, or to blow away the existing index18:49
clarkbfungi: ya either the bug is in their lucene updates or in their release notes saying online reindexing is possible. I hope they stick to online reindexing myself and fix lucene18:49
clarkbtonyb: how about 11:30? I'm working on early lunch/late breakfast right now18:49
tonybclarkb: Yup.  No rush.18:50
clarkbI'm super happy we have this gerrit upgrade test job. Its been super useful18:51
tonybYeah it's really cool.  Does upstream have anything like it?18:52
clarkbI'm not sure. Luca was saying they do run the gatling load tester against release candidates before doing releases as one of their pre release tasks18:52
fungii like that he referred in their channel to "...our ci jobs that test the upgrade..."18:53
fungimaybe that'll get them wanting something similar18:53
tonybwhere is said channel?18:53
clarkbtonyb: it is the gerrit discord channel which is mirrored to ^18:54
fungithough it's a mirror of... yeah what clarkb said18:54
clarkbI'm actually connected to both discord and matrix .... but I really only use the discord server for the monthly community meeting otherwise I prefer to use matrix18:54
clarkbI'm going to stack the 3.8.3 image update under the upgrade change now so that it is mergeable18:55
fungisounds good, no reason to hold it up since we're not going to be upgrading to 3.9 soon anyway18:56
opendevreviewClark Boylan proposed opendev/system-config master: Update Gerrit 3.8 images to 3.8.3
opendevreviewClark Boylan proposed opendev/system-config master: Add gerrit 3.8 to 3.9 upgrade testing
fungitonyb: clarkb: i'll probably skip the call today, i've got to take a gardening break to ready some plants for freezing temperatures predicted tomorrow night, and had to wait until the rain let up so that's basically nowish18:58
tonybfungi: okay.18:58
tonybfungi: I expect it'll be pretty quick/easy I just want extra eyes because it's dfw.19:00
fungimakes sense, and yeah it should be straightforward19:02
corvusi'm going to restart zuul-web a second time in order to run the schema migration that should fix periodic build queries19:23
clarkbtonyb: ready when you are19:29
tonybI figure we can reuse that one19:30
corvus works now19:39
fricklerI like how gerrit is making mode changes more obvious now
fungiooh, that's a nice little warning sign20:42
clarkblooks like gerrit confirms the lucene issue is a problem20:49
fungiapparently they didn't mean to increase the lucene version quite that far in 3.920:50
fungithough now that they have... i wonder what their options are20:50
clarkbthey are discussing it in a different discord channel that isn't bridged to matrix20:50
fungiclearly downgrading lucene for people who have already installed 3.9 would be tricky20:50
clarkbseems like all of the options are not great so will be interesting to see what they end up with20:50
fungieven having not seen the discussion i could pretty much guess that20:51
clarkbthe good news if there is any is that I was lucky enough to catch it less than a week since the 3.9.0 release20:52
clarkbhopefully that means the number of people who may have hit issues due to it is small20:53
fungithe sad news is that if they had an upgrade test like ours in their ci, it could have blocked the change that dragged in that unintentional increase20:54
fungii notice that their merged-as and parent links in the change info have a link to gitiles instead of linking a gerrit search query like we do (so it also works with merge commit parents). i wonder how they do that?20:56
fungiwe could in theory put gitea overrides there20:57
clarkbfungi: I think if you use gitiles its part of the plugin20:57
fungiaha, so we'd need a custom plugin for that20:57
corvusthey do have a zuul, but no one is really dedicated to writing jobs for it. i think they would welcome contributions if someone wanted to set up an upgrade test.20:57
clarkbI'm letting them know that a 25 minute offline reindex for us isn't the end of the world for upgrading20:59
clarkbas a datapoint for them to consider their options20:59
fungiyeah, as long as we know to plan for it, i'd be fine with that20:59
clarkbya the problem they have is anyone that has newly deployed 3.9.0 or blazed ahead with an offline reindex will get stranded if they revert. if they don't revert the rest of us have to do an offline reindex too21:02
opendevreviewTony Breeds proposed opendev/ master: Add DNS records for mirror02.dfw.rax
opendevreviewTony Breeds proposed opendev/system-config master: Add a helper script for doing the LVM setup on mirror nodes.
opendevreviewTony Breeds proposed opendev/system-config master: Add inventory/LE records for and
opendevreviewTony Breeds proposed opendev/system-config master: Add inventory/LE records for mirror02.dfw.rax
opendevreviewTony Breeds proposed opendev/ master: Add DNS records for mirror02.dfw.rax
*** dmellado206 is now known as dmellado2022:08
tonybI am okay with: but I'm hesitant to +W them because I'm not certain what will happen once 901992 merges.22:36
tonybI *think* it will update the gerrit-compose to the new tags after building and publishing the images but it will *not* restart gerrit.22:37
tonyband an infra-root will need to do that manually at a "safe time"22:37
fungiyeah, we usually hold off approving that sort of change until we're ready to perform a controlled restart of the service, just in case something happens to the server that forces it to be rebooted22:41
fungisafer for unexpected reboots from provider outages to result in running a version of gerrit we've been using for a while than one we haven't tried outside of testing22:41
tonybfungi: Thanks for confirming.22:45
clarkbthe 3.9 image creation is probably good to hold off on now too22:55
clarkbupstream is talking about pulling the release22:55
clarkbI think I'll resync on all that tomorrow after it has time to settle and update the changes if necessary22:55
fungioh fun22:56
clarkbI've been trying to do any last debugging on this laptop before I call lenovo for warranty stuff. Booting nomodeset (so disabling the gpu entirely) works but at lower (fuzzy) resolution22:57
clarkbone suggestion I found was disabling the dynamic power management might help but doing that I get no screen after asking grub to boot22:57
clarkbthe issue is present in jammy though. I think I might try focal?22:58
clarkbits fun that this is so complicated and breaks often enough in linux that you can't really tell if it is hardware or software22:58
tonybthat sounds terrible 22:59
tonybI think I'm going to step away for the night.  I'll figure out why the mirror testing is failing tomorrow 23:00
clarkbya about the only good news is that it is broken on ubuntu jammy so maybe I have some hope someone else has run into this if it isn't a hardware issue23:00
