Monday, 2022-03-21

opendevreviewMerged opendev/system-config master: Add (lack of) license information for org logos
opendevreviewIan Wienand proposed openstack/project-config master: grafana afs : remove centos 8 volume tracking
opendevreviewMerged zuul/zuul-jobs master: [multi-node-bridge] Allow to skip openvswitch installation
opendevreviewAlbin Vass proposed zuul/zuul-jobs master: mirror git repos: cleanup submodule checkouts
opendevreviewAlbin Vass proposed zuul/zuul-jobs master: mirror git repos: cleanup submodule checkouts
opendevreviewAlbin Vass proposed zuul/zuul-jobs master: mirror git repos: cleanup submodule checkouts
dpawlikianw: some issue with zuul?09:54
opendevreviewMerged openstack/project-config master: grafana afs : remove centos 8 volume tracking
dpawlikfrickler, fungi: hey, do you have some issue with zuul?10:09
fricklerdpawlik: none that I know of, do you have more details?10:19
dpawlikfrickler: sorry, false alarm. 10:22
dpawlikchrome got some issue, after restart works :)10:22
opendevreviewMarios Andreou proposed openstack/project-config master: Update channel ops for oooq (tripleo ci) channel
fungidpawlik: did you have a zuul status page open over the weekend and found it showing something broken today?12:07
fungiwondering if that's related to saturday's service upgrades. we have multiple zuul-web services with a load balancer directing traffic to them, but i don't think we've tested to see how seamless that is with the status page autorefresh12:08
dpawlikfungi: It stuck on refreshing status builds web page, but it seems to be issue on my side12:53
fungiwell, if it had been up refreshing constantly since before the upgrade on saturday, it's possible it got a bad api response at one point when we were failing over between zuul-web servers. that could persist on the screen, but i would also have expected a hard refresh to clear it if that were the cause12:57
dpawlikaha, good to know13:01
dpawlikI see in OpenSearch that there is a low amount of logs from yesterday 5 PM UTC, but now it seems to be ok13:02
fungidpawlik: it's possible we weren't running many jobs. weekends tend to be especially quiet times for openstack development work, sundays in particular13:06
fungiyou can check the zuul jobs launched graph to see:
fungithose tall spikes are when all the daily periodic and periodic-stable jobs run13:20
fungibut you can see that the activity around emea/americas "business hours" are missing on saturdays and sundays13:21
fungilooks like sunday actually does pick up a little compared to saturday, i guess that's apac monday mornings showing up in the mix13:22
clarkbfungi: looks like the change to build new gerrit images did land14:50
clarkbAnd zuul reports the promote image jobs were successful14:50
clarkbany opinions on when we should try and restart for that?14:51
fungii'm up for it any time, just didn't want to do it until we'd had a chance to discuss14:51
fungimaybe when activity starts to die down later in our day?14:51
clarkbsure that should work for me14:52
clarkbI'll let the openstack release team know due to proximity to the openstack release14:56
*** dviroel|lunch is now known as dviroel16:38
clarkb some updates on that gitea partial clone bug. I just responded as it doesn't quite make sense to me yet16:52
fungii have a feeling something changed in client-side handling of the filter feature between when it was first added and the versions we saw working with gitea17:25
fungiit's really the only explanation which makes sense17:25
fungiclarkb: one detail you elided in the description, which may have led readers to an incorrect assumption, is that these same git clients/options were working fine prior to our upgrade to gitea 1.1617:28
fungiit's possible for someone reading that report to guess (incorrectly) that this is the first version of gitea we've tried to run, rather than that we observed a regression when upgrading17:29
clarkbah yup I meant to mention that when I indicated the commit that changed the behavior but it got missed17:31
clarkbThis is fun. I just had rpm fail because cpio wouldn't replace a directory with a file. Package was updated to change a path from a dir to a file. Had to manually move the dir aside and rerun the installation.17:51
fungiinteresting it doesn't have logic to deal with that18:28
corvusthe zuul docs publishing job had an error preventing a docs builds for a tagged release today; i'm going to manually copy in a build (detailed discussion in #zuul earlier)19:58
clarkbthanks for the heads up19:58
corvus#status log manually published /afs/
opendevstatuscorvus: finished logging20:01
fungiokay, i'll double-check we have the latest image on review21:16
clarkbfungi: thanks I was looking at zuul status and there are some changes that we might wait a few minutes for but then go for it I think21:16
fungilooks like there was a mariadb container update to pull21:16
fungiopendevorg/gerrit   3.4       86c3bcd9fafe   2 days ago      800MB21:17
fungimariadb             10.4      fd1ed4499d48   2 days ago      403MB21:17
clarkbif we want to go a little longer we could update dockerd and do a reboot too. Not sure if we want to try and sneak that in21:17
clarkbthat will definitely be more noticeable21:18
fungiyeah, it's probably not a huge deal to add those in, should go quickly21:18
clarkbprocess for that would be to apt-get update && docker-compose down && apt-get dist-upgrade && reboot then docker-compose up -d21:18
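[editor's note] The sequence clarkb describes can be sketched as two ordered phases, split at the reboot. This is only an illustration of the ordering, not the script used on the server: `PRE_REBOOT`, `POST_REBOOT`, `run_steps`, `dry_run` and `cwd` are hypothetical names, the commands are copied from the message above without extra flags, and the sketch assumes `docker-compose` is run from the directory holding the compose file.

```python
import subprocess
from typing import List, Optional, Sequence

# Commands exactly as listed in the chat, in order. Nothing after
# "reboot" can run in the same session, so the final step is kept
# in a separate phase to be run once the host is back.
PRE_REBOOT: List[List[str]] = [
    ["apt-get", "update"],
    ["docker-compose", "down"],
    ["apt-get", "dist-upgrade"],
    ["reboot"],
]
POST_REBOOT: List[List[str]] = [
    ["docker-compose", "up", "-d"],
]


def run_steps(steps: Sequence[Sequence[str]],
              dry_run: bool = True,
              cwd: Optional[str] = None) -> List[str]:
    """Run each command in order, stopping at the first failure.

    With dry_run=True (the default) nothing is executed; the commands
    are only printed and returned, which makes the plan easy to review.
    """
    planned = []
    for argv in steps:
        line = " ".join(argv)
        planned.append(line)
        if dry_run:
            print("would run:", line)
        else:
            # check=True raises CalledProcessError on a non-zero exit,
            # mirroring the short-circuit behaviour of "&&".
            subprocess.run(list(argv), check=True, cwd=cwd)
    return planned


if __name__ == "__main__":
    run_steps(PRE_REBOOT)
    run_steps(POST_REBOOT)
```

Keeping `dry_run` on by default means the sketch prints the plan rather than touching the host; the `&&` chain in the original message gives the same stop-on-failure behaviour directly in a shell.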
fungithere's new kernel images to install as well21:18
fungii'll get those installed first and hold on the docker package updates just yet while gerrit's still running21:19
fungijust so we don't waste time during the outage21:20
clarkbI think at some point we should consider a docker image prune on that server. But that can happen later not when we're doing a short downtime21:20
clarkbmaybe after we've run the current images for a few days happily we can manually do that21:20
clarkbfungi: the tripleo changes that were close to merging are really close now. Just finished up the paused job then a few changes should be able to land there. Then I think we can proceed?21:25
fungistatus notice The Gerrit service on will be offline momentarily for a Gerrit patch upgrade and kernel update, but should return again shortly21:26
fungithat lgty?21:26
clarkbthe tripleo job is in their collect logs phase which iirc can take a few minutes :/21:27
clarkbI'm putting together the agenda for tomorrow's meeting. Looks like ianw added an item. Anything else to add before I send that out? I'll probably send it out once gerrit work is done21:29
fungithere's a repo rename request i think21:29
clarkbah so there is. Let me make sure that is on the agenda21:29
clarkbthe tripleo job is uploading logs to swift now21:31
clarkbshould be really close21:31
fungi#status notice The Gerrit service on will be offline momentarily for a Gerrit patch upgrade and kernel update, but should return again shortly21:31
opendevstatusfungi: sending notice21:31
fungii'll get that circulating21:31
-opendevstatus- NOTICE: The Gerrit service on will be offline momentarily for a Gerrit patch upgrade and kernel update, but should return again shortly21:31
clarkbthose changes are out of the zuul queue now21:34
clarkbmaybe give it a few seconds just to make sure the gerrit stuff is reported but then I think we can proceed21:34
clarkbgerrit merged changes list shows them as merged. I think we can proceed21:34
clarkbfungi: were you going to drive? you were already doing stuff on the server21:35
fungiyep, on it21:35
fungigerrit down, docker upgrading21:35
clarkbI see the containers have stopped. That lgtm I think we can proceed with docker updates21:35
fungiwill reboot as soon as that's done21:35
clarkbsounds good21:35
clarkband then we'll have to docker-compose up -d when it is up again21:36
fungiit's back21:36
clarkbyup I'm in21:36
fungiLinux review02 5.4.0-105-generic #119-Ubuntu SMP Mon Mar 7 18:49:24 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux21:36
fungistarting gerrit21:36
corvus(btw i rebooted the schedulers on saturday to get the new kernel)21:37
fungiwebui is up21:37
fungithanks corvus!21:37
fungiPowered by Gerrit Code Review (3.4.4-14-g76806c8046-dirty) 21:38
clarkbthere it goes took a minute before web ui would load for me but it is there now and I see the version we expected21:38
clarkbI'm not seeing files listed in the changes. I seem to recall I noticed this last time too and it took a little time to prime the diff caches?21:39
fungioh maybe21:39
clarkbyup just refreshed a change and now the files show up21:39
clarkbI kinda wish they wouldn't report the server is ready before stuff like that is happy21:40
fungii see them in a change i just opened21:40
clarkbBut noting it here as I think this is expected and related to caching21:40
clarkbya it doesn't take long but is just long enough to be noticeable21:40
clarkball seems happy from what I'm looking at. We're a few commits ahead of 3.4.4 proper. Looks like some older schema changes (shouldn't affect us as we're already ahead of those schema changes) to GC repos as they migrated into notedb for stuff, a comment got clarified and an exception gets logged more verbosely21:44
clarkbI don't see anything of concern there21:44
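[editor's note] The version string `3.4.4-14-g76806c8046-dirty` reads like `git describe --dirty` output, which is consistent with the remark about being a few commits ahead of 3.4.4. Assuming that format (tag, commits since the tag, `g` plus abbreviated commit hash, optional `-dirty` suffix), a minimal sketch for splitting it apart could look like this. `Describe` and `parse_describe` are hypothetical helper names, not part of Gerrit or git.

```python
import re
from typing import NamedTuple, Optional


class Describe(NamedTuple):
    tag: str                   # nearest tag, e.g. "3.4.4"
    commits_ahead: int         # commits since that tag (0 if exactly on it)
    short_hash: Optional[str]  # abbreviated commit hash, without the "g"
    dirty: bool                # True if the tree had uncommitted changes


# <tag>[-<n>-g<hash>][-dirty]; the tag is matched lazily so the
# optional suffixes are recognised when they are present.
_PATTERN = re.compile(
    r"^(?P<tag>.+?)"
    r"(?:-(?P<n>\d+)-g(?P<hash>[0-9a-f]+))?"
    r"(?P<dirty>-dirty)?$"
)


def parse_describe(version: str) -> Describe:
    """Split a git-describe style version string into its parts."""
    match = _PATTERN.match(version)
    if match is None:
        raise ValueError(f"not a describe-style version: {version!r}")
    return Describe(
        tag=match.group("tag"),
        commits_ahead=int(match.group("n") or 0),
        short_hash=match.group("hash"),
        dirty=match.group("dirty") is not None,
    )


if __name__ == "__main__":
    print(parse_describe("3.4.4-14-g76806c8046-dirty"))
```

Under that assumption the string decodes as tag 3.4.4, 14 commits ahead, commit 76806c8046, built from a tree with local modifications.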
clarkbfungi: we should probably #status log Updated Gerrit to 3.4.4-14-g76806c8046-dirty ?21:45
fungigo for it, yeah21:45
clarkb#status log Updated Gerrit on review.o.o to 3.4.4-14-g76806c8046-dirty21:45
opendevstatusclarkb: finished logging21:45
clarkbalright time to work on sending the agenda now22:27

