*** rlandy|ruck is now known as rlandy|out | 00:09 | |
clarkb | http://mirror.centos.org/centos/8-stream/BaseOS/x86_64/os/Packages/systemd-239-55.el8.x86_64.rpm exists now but our mirrors don't seem to have it yet | 00:14 |
clarkb | I suspect that iputils is pulled in by our image builds too (but it may not be) and if that is the case we likely need to also rebuild our centos 8 images once our mirrors update | 00:14 |
clarkb | Maybe that means tomorrow we'll be happy | 00:14 |
clarkb | er wait it's systemd, not iputils, we need to update and that definitely gets pulled in at image build time :) | 00:16 |
clarkb | http://mirror.dal10.us.leaseweb.net/centos/8-stream/BaseOS/x86_64/os/Packages/ hasn't updated yet so we are waiting on that, then our mirror, then our images | 00:21 |
clarkb | http://mirror.facebook.net/centos/8-stream/BaseOS/x86_64/os/Packages/ hasn't updated either which means switching mirrors is unlikely to help us right now | 00:22 |
clarkb | time to practice patience. Something I'm not very good at | 00:22 |
opendevreview | Steve Baker proposed openstack/diskimage-builder master: Rename existing BLS entry with the new machine-id https://review.opendev.org/c/openstack/diskimage-builder/+/825695 | 00:25 |
clarkb | re the reload4j changes to gerrit. It looks like they are pushing those changes to stable-3.3 and rolling up to 3.4 and 3.5 and master. We should be careful with our upgrade that we don't miss it if we go with the upstream version (they may update 3.3 and then not 3.4 for some additional time) | 00:43 |
fungi | also we've seen them release a x.y.z with a commit that isn't merged to stable-x.y | 00:44 |
clarkb | well, more that they seem to batch up the merge-forwards, and I'm slightly worried we might miss that 3.3 is fixed upstream but not 3.4. For that reason we might want to just go with our local fix anyway. But I'll look at where things are tomorrow morning and go from there | 00:48 |
*** ysandeep is now known as ysandeep|away | 01:22 | |
fungi | where's our backup pruning documentation? | 02:02 |
fungi | oh, never mind, i didn't read far enough | 02:02 |
fungi | Each backup server has a script /usr/local/bin/prune-borg-backups which can be run to reclaim space. This should be run in a screen instance as it can take a considerable time. It will prompt when run; you can confirm the process with a noop run; confirming the prune will log the output to /opt/backups. | 02:03 |
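A rough sketch of the procedure the quoted documentation describes, run on the backup server; only the script path and the /opt/backups log location come from the text above, the rest is paraphrased:

    # run as root inside a screen session, since pruning can take considerable time
    screen -S prune-backups
    /usr/local/bin/prune-borg-backups
    # the script prompts when run: do a noop pass first, then confirming the
    # actual prune logs its output under /opt/backups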
fungi | it's in process on backup02.ca-ymq-1.vexxhost in a root screen session | 02:06 |
fungi | i'll check up on it periodically | 02:06 |
opendevreview | Steve Baker proposed openstack/diskimage-builder master: Fix the root device in the default BLS entry https://review.opendev.org/c/openstack/diskimage-builder/+/825700 | 02:31 |
opendevreview | Steve Baker proposed openstack/diskimage-builder master: Rename existing BLS entry with the new machine-id https://review.opendev.org/c/openstack/diskimage-builder/+/825695 | 02:42 |
opendevreview | Steve Baker proposed openstack/diskimage-builder master: Fix the root device in the default BLS entry https://review.opendev.org/c/openstack/diskimage-builder/+/825700 | 02:42 |
*** tkajinam is now known as tkajinam|lunch | 03:55 | |
*** tkajinam|lunch is now known as tkajinam | 03:56 | |
opendevreview | Steve Baker proposed openstack/diskimage-builder master: Fix the root device in the default BLS entry https://review.opendev.org/c/openstack/diskimage-builder/+/825700 | 04:27 |
*** hashar_ is now known as hashar | 05:51 | |
*** ysandeep|away is now known as ysandeep | 06:03 | |
*** frenzy_friday is now known as frenzyfriday | 07:05 | |
chkumar|rover | ianw: hello, please have a look when free https://review.opendev.org/c/opendev/system-config/+/825446, thanks! | 07:23 |
*** amoralej|off is now known as amoralej | 08:03 | |
*** jpena|off is now known as jpena | 08:04 | |
*** ysandeep is now known as ysandeep|lunch | 08:12 | |
*** johnsom_ is now known as johnsom | 09:03 | |
*** bhagyashris_ is now known as bhagyashris|ruck | 09:03 | |
*** Tengu_ is now known as Tengu | 09:07 | |
*** kopecmartin_ is now known as kopecmartin | 09:09 | |
*** melwitt is now known as Guest1312 | 09:32 | |
*** ysandeep is now known as ysandeep|coffee | 10:48 | |
*** melwitt is now known as Guest1320 | 10:50 | |
*** ysandeep|coffee is now known as ysandeep | 11:09 | |
*** rlandy is now known as rlandy|ruck | 11:13 | |
*** dviroel|out is now known as dviroel | 11:25 | |
fzzf[m] | Hello, I run "ssh -p 29418 gh@review.opendev.org gerrit stream-events", and the result shows "Received disconnect from 199.204.45.33 port 29418:12: Too many concurrent connections (96) - max. allowed: 96". How should I deal with this? | 11:27 |
fzzf[m] | How can I limit the events my zuul CI listens for? I use it for the manila project. | 11:44 |
rlandy|ruck | fungi: hello ... to follow up on the mirror issues: https://bugs.launchpad.net/tripleo/+bug/1958510 | 12:11 |
rlandy|ruck | please see last two comments from Alfredo Moralejo (amoralej) and the link to issue https://pagure.io/centos-infra/issue/609 | 12:12 |
rlandy|ruck | also patch still open for w+ https://review.opendev.org/c/opendev/system-config/+/825446 | 12:12 |
frickler | rlandy|ruck: I still fail to find any indication in all of this that the facebook mirror is better than the one we are currently using to sync from | 12:19 |
rlandy|ruck | frickler: there is no confirmation of that | 12:29 |
rlandy|ruck | we have had some better stats on c9 this week | 12:29 |
rlandy|ruck | but that is only one week to go by | 12:29 |
rlandy|ruck | whether we switch to the facebook mirror or not is unrelated to https://pagure.io/centos-infra/issue/609 | 12:30 |
rlandy|ruck | which is what I said I would follow up on for fungi | 12:30 |
frickler | rlandy|ruck: maybe if redhat is that interested in this they can provide us direct access to the original repo to mirror from; that would avoid all the issues with public mirrors that tend to come and go and be broken once in a while | 12:32 |
frickler | the c9 mirror was syncing from rackspace originally, not leaseweb, so that's no indication for anything | 12:34 |
rlandy|ruck | I'll look into that suggestion | 12:37 |
*** rcastillo|out is now known as rcastillo | 12:56 | |
fungi | #status log Pruned backups on backup02.ca-ymq-1.vexxhost.opendev.org | 12:57 |
opendevstatus | fungi: finished logging | 12:57 |
fungi | also the problem we ran into with rackspace's mirror is that it started rejecting rsync connections out of the blue, it wasn't related to the content of their mirror | 13:00 |
fungi | though more generally, i am curious why centos insists on looking for different indices than are present on mirrors from time to time. it seems very sensitive to mirror states older than whatever exists on disk. are we not using our mirrors when building our centos images? | 13:01 |
*** rcastillo is now known as rcastillo|rover | 13:02 | |
*** amoralej is now known as amoralej|lunch | 13:11 | |
*** jentoio_ is now known as jentoio | 14:02 | |
*** amoralej|lunch is now known as amoralej | 14:09 | |
*** anbanerj is now known as frenzyfriday | 14:41 | |
*** ykarel_ is now known as ykarel | 14:51 | |
*** dviroel is now known as dviroel|lunch | 14:58 | |
clarkb | fzzf[m]: you need to ensure you have fewer connections made to gerrit on port 29418. One thing to double check is that your clients are properly closing connections | 15:41 |
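One way to check the client side of this (an illustration, not from the log): on the host running the stream-events consumer, count how many connections to Gerrit's SSH port are actually held open.

    # count established connections from this host to Gerrit's SSH port 29418
    ss -tn | grep ':29418' | wc -l
    # if this stays near the per-user limit (96 here), some client is leaking
    # connections instead of closing them when it is done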
fungi | zuul prior to 2.5.2 had an issue with that, right? | 15:48 |
fungi | leading to https://review.openstack.org/447066 | 15:49 |
*** promethe- is now known as prometheanfire | 16:03 | |
*** dviroel|lunch is now known as dviroel | 16:15 | |
frickler | oh, might that be an issue with recent paramiko versions again? 2.9.x broke connections to cirros, which might be a similar scenario. although then I'd expect connections to fail, not to pile up | 16:17 |
fungi | or it might just be a very, very old zuul which doesn't have that fix | 16:18 |
frickler | hmm, gerrit show-connections is very slow, because it tries to reverse resolve every remote IP. if I avoid that with -n, at the same time it also only shows user ids instead of names. not very helpful, thx. | 16:29 |
frickler | anyway I'm not seeing more than a handful of parallel connections currently, so it must only be a sporadic problem | 16:30 |
clarkb | I think in the past we've also seen firewalls ungracefully kill connections that idle. Then zuul reconnects but gerrit doesn't know it can close the old connection. But ya, if it isn't a persistent problem it may be more difficult to debug | 16:31 |
clarkb | when I looked at our mirrors yesterday for centos 8 stream both the facebook and limestone upstream seemed to be roughly up to date at the same level. | 16:57 |
clarkb | Are we sure that facebook will be any better for stream? I suspect some underlying problem with the mirror updates and not the mirrors we pull from | 16:57 |
*** jpena is now known as jpena|off | 17:04 | |
fungi | frickler: yeah, i end up using alternative means to reverse the numeric ids to account names (rest api is probably easiest) | 17:09 |
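For example (a sketch; the account id is a placeholder), the numeric show-connections output can be resolved through the REST API:

    # list connections without reverse DNS; prints numeric account ids
    ssh -p 29418 <youruser>@review.opendev.org gerrit show-connections -n
    # resolve an id to an account name via the REST API (12345 is made up);
    # note Gerrit prefixes JSON responses with a )]}' line
    curl -s https://review.opendev.org/accounts/12345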
fungi | clarkb: if you follow the irc discussion copied into pagure, it sounds like there's more than one backend the mirror sites might get balanced between, so if both aren't kept in sync then downstream mirrors can end up rewinding to an earlier state | 17:14 |
fungi | but there was some assertion that facebook validates their mirror consistency to prevent that from occurring on their copy? that much was not entirely clear to me | 17:15 |
clarkb | ah I see | 17:15 |
clarkb | but also that seems problematic if you are a centos 8 stream user and you can pull from constantly reverting systems | 17:16 |
fungi | also i haven't checked to see if the leaseweb site fell off the official centos mirrors list in the ~2 years since we switched to it | 17:16 |
clarkb | this isn't just an opendev CI problem is what I'm saying :) | 17:16 |
fungi | there was some indication that mirror operators are supposed to have specific blackout windows for updates in order to avoid copying inconsistent state too | 17:17 |
fungi | but it also sounded like there may have been an update to the primary content outside that window | 17:18 |
fungi | which, if so, yeah it's not the mirror site's fault | 17:18 |
fungi | gmann just updated one of the threads on openstack-discuss to say there are more 404 errors for centos stream 8 mirrors | 17:20 |
fungi | gmann: do you have a recent example? | 17:20 |
gmann | fungi: yeah, let me check | 17:20 |
fungi | there was a ~6 hour window yesterday where we'd copied a "bad" state from a centos mirror and were serving that, but it should have cleared up around 13:00z | 17:21 |
gmann | fungi: https://zuul.openstack.org/build/260bc2dea5ec48aab2e9ded2a1afabd4 | 17:22 |
gmann | this is last i have seen https://zuul.openstack.org/build/d6bdd3a87c2c445b98120a5cf5bda294 | 17:22 |
clarkb | I've just remembered I totally spaced on doing the limnoria update yesterday. Oh well. | 17:31 |
fungi | according to https://static.opendev.org/mirror/logs/rsync-mirrors/centos.log we synced in AppStream/x86_64/os/repodata/6cf23d2ab4fded7416b05bddd463d0fba08f77965b091c59a9e057654b2cef99-filelists.xml.gz from leaseweb's mirror and published it at 2022-01-19T06:58:12, then deleted it because it had disappeared from leaseweb and published that state at 2022-01-20T06:55:09, then back again | 17:34 |
fungi | 2022-01-20T12:57:29, refreshed it 2022-01-20T18:45:42 (maybe there was a timestamp change?), deleted 2022-01-21T00:45:01 | 17:34 |
clarkb | is there a change to swap to the facebook mirror? we can test its consistency verification by using it and observing if it does ^ ya? | 17:35 |
fungi | yet there's a job trying to retrieve it at 2022-01-21 04:01:13 | 17:35 |
fungi | gmann's latest example is for a job which started 2022-01-21 06:14:19 | 17:36 |
clarkb | they may have cached index content via the image builds? | 17:38 |
fungi | yeah, that's why i was looking at the job start time, trying to correlate it to image updates | 17:39 |
gmann | fungi: clarkb I can see latest run is passing (at 08:13) https://zuul.openstack.org/build/f7755bae78094c43a38a630fdbf3218d/logs | 17:39 |
fungi | i wonder if we're not using our mirrors when building the centos-8-stream images | 17:40 |
fungi | and at times end up with nodes that think the mirrors have gone backwards in time | 17:40 |
fungi | that build ran in rax-dfw which got a new centos-8-stream image moments ago, but before that a little over a day ago | 17:42 |
fungi | so whatever caused things to start failing or clear up doesn't seem to have been related to an image update | 17:43 |
fungi | the mirror-update log shows a fairly massive stream 8 update which we published at 2022-01-21T06:57:37 so that's probably when it cleared up | 17:45 |
fungi | seems to correlate fairly closely with gmann's observations | 17:45 |
fungi | i've also confirmed mirror.dal10.us.leaseweb.net is still on the list of official mirrors at https://www.centos.org/download/mirrors/ which claims to cover centos linux and stream 8 | 17:53 |
*** amoralej is now known as amoralej|off | 18:09 | |
*** ysandeep is now known as ysandeep|out | 18:20 | |
clarkb | fungi: should I go ahead and approve https://review.opendev.org/c/opendev/system-config/+/825446/ ? I've been distracted by other things so not completely sure where you've landed on that | 18:28 |
opendevreview | Clark Boylan proposed opendev/system-config master: Rebuild gerrit images to pick up slf4j fix https://review.opendev.org/c/opendev/system-config/+/825873 | 18:33 |
clarkb | infra-root ^ if we land that today I can try to help restart gerrit later today though I do have some appointments today | 18:33 |
fungi | clarkb: i don't think 825446 is urgent, but if you're in favor of it then feel free to approve | 18:40 |
clarkb | I imagine the sooner we can rule in or out the flip flopping of the facebook mirror the sooner centos can make reliable updates to their mirroring system? | 18:42 |
clarkb | mostly my concern is that facebook is just as broken and we're pretending it isn't. hard data would be useful | 18:42 |
clarkb | I've approved it | 18:46 |
fungi | thanks, and yeah that's my thinking as well | 18:52 |
opendevreview | Merged opendev/system-config master: Use facebook mirror for CentOS Stream 8 https://review.opendev.org/c/opendev/system-config/+/825446 | 19:12 |
*** Guest1320 is now known as melwitt | 19:18 | |
noonedeadpunk | fwiw irc bot seems stuck | 19:49 |
noonedeadpunk | *gerritbot | 19:49 |
fungi | opendevreview is in channel still, it commented in here at 19:12 (a little over half an hour ago) | 19:51 |
fungi | did it fail to report something more recently than that? | 19:51 |
fungi | maybe it lost contact with gerrit | 19:51 |
fungi | i'll take a look in its logs | 19:51 |
noonedeadpunk | Yeah, I did a bunch of patch updates and like 1 out of 10 was reported | 19:54 |
noonedeadpunk | it last commented for us more than 2 hours ago | 19:55 |
noonedeadpunk | like https://review.opendev.org/c/openstack/ansible-role-qdrouterd/+/824564 was merged just minutes ago | 19:56 |
clarkb | fungi: re 825873 I wonder if we should abandon and restore that so that we hopefully get enqueued to a different cloud for the image builds (currently airship cloud and that one has less capacity hence the queuing) | 20:00 |
fungi | the gerritbot saw the change-merged event for that at 19:54:14 | 20:00 |
clarkb | ya I think I'm going to go ahead and do that now since I don't think it will be any slower | 20:00 |
clarkb | but it has a good chance of being faster | 20:00 |
noonedeadpunk | 824567 has just been reported though | 20:01 |
fungi | Potential channels to receive event notification: set() | 20:02 |
clarkb | probably not subscribed to events for that repo | 20:02 |
noonedeadpunk | oh... | 20:03 |
noonedeadpunk | that could be the case:) | 20:03 |
fungi | noonedeadpunk: i don't see it in the config? | 20:03 |
fungi | that repo, i mean | 20:03 |
noonedeadpunk | I guess I took reporting for granted :) will fix that | 20:03 |
noonedeadpunk | sorry for taking your time | 20:03 |
fungi | noonedeadpunk: no worries, normally you'd add it to gerritbot/channels.yaml in openstack/project-config at the same time you created the project | 20:04 |
noonedeadpunk | yeah, I was relying on it having been done :) | 20:04 |
fungi | but it's easy to overlook even though we have it documented together | 20:04 |
opendevreview | Dmitriy Rabotyagov proposed openstack/project-config master: Add ansible-role-qdrouterd IRC reporting https://review.opendev.org/c/openstack/project-config/+/825878 | 20:10 |
*** dviroel is now known as dviroel|out | 20:53 | |
opendevreview | Ghanshyam proposed openstack/project-config master: Add openstack-skyline irc channel in access an gerrit bot https://review.opendev.org/c/openstack/project-config/+/825881 | 21:17 |
opendevreview | Ghanshyam proposed opendev/system-config master: Add openstack-skyline channel in statusbot/meetbot/logging https://review.opendev.org/c/opendev/system-config/+/825882 | 21:24 |
opendevreview | Merged openstack/project-config master: Add ansible-role-qdrouterd IRC reporting https://review.opendev.org/c/openstack/project-config/+/825878 | 22:12 |
opendevreview | Eduardo Santos proposed openstack/diskimage-builder master: Fix openSUSE images and bump them to 15.3 https://review.opendev.org/c/openstack/diskimage-builder/+/825347 | 22:13 |
opendevreview | Eduardo Santos proposed openstack/diskimage-builder master: General improvements to the ubuntu-minimal docs https://review.opendev.org/c/openstack/diskimage-builder/+/806308 | 22:21 |
opendevreview | Eduardo Santos proposed openstack/diskimage-builder master: Don't run functional tests on doc changes https://review.opendev.org/c/openstack/diskimage-builder/+/825891 | 22:24 |
opendevreview | Merged opendev/system-config master: Rebuild gerrit images to pick up slf4j fix https://review.opendev.org/c/opendev/system-config/+/825873 | 22:38 |
clarkb | fungi: looks like promote for new gerrit images completed successfully. Should we go ahead and restart gerrit real soon now? I'm not sure if you are still around | 22:55 |
fungi | yeah, i'm still around and happy to help with a restart | 22:56 |
clarkb | cool. I'd like to get that done then I'll probably shift gears to zuul reviews | 22:56 |
clarkb | do you want to drive or should I? I'm happy either way | 22:56 |
fungi | i can | 22:57 |
clarkb | cool if you start a screen I can join that | 22:57 |
fungi | i have one going and pulled in it | 22:58 |
clarkb | I've joined | 22:58 |
fungi | opendevorg/gerrit 3.3 8aaea7eace5d 56 minutes ago 790MB | 22:58 |
clarkb | let me double check against docker hub | 22:58 |
fungi | thanks | 22:59 |
clarkb | that one looks correct to me when I inspect it and compare the sha against what is tagged on docker hub | 22:59 |
fungi | okay, i'm downing the container in that case | 23:01 |
clarkb | ++ | 23:01 |
fungi | #status notice The Gerrit service on review.opendev.org is being restarted briefly to apply a bugfix | 23:01 |
opendevstatus | fungi: sending notice | 23:01 |
-opendevstatus- NOTICE: The Gerrit service on review.opendev.org is being restarted briefly to apply a bugfix | 23:01 | |
fungi | and it's on its way back up now | 23:02 |
fungi | [2022-01-21T23:02:13.999Z] [main] INFO com.google.gerrit.pgm.Daemon : Gerrit Code Review 3.3.9-17-g3724a74167-dirty ready | 23:02 |
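The restart sequence in the screen session looked roughly like this (the compose directory is an assumption, not taken from the log):

    cd /etc/gerrit-compose            # hypothetical location of the docker-compose.yaml
    docker-compose pull               # fetch the freshly promoted opendevorg/gerrit:3.3 image
    docker image ls | grep gerrit     # compare the image digest against what is tagged on Docker Hub
    docker-compose down               # brief outage starts here
    docker-compose up -d              # back up; watch the log for "Gerrit Code Review ... ready"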
clarkb | I've got a web ui now | 23:02 |
clarkb | it shows me all the work I should be doing :) | 23:02 |
fungi | hah | 23:02 |
clarkb | loading changes is slow but that is normal | 23:02 |
fungi | it's friday. you can always get your work done on saturday! ;) | 23:03 |
clarkb | I hopped off the screen and will let you decide when it is safe to close it. But from what I see so far it seems happy | 23:04 |
fungi | yeah, lgtm, i'll wrap it up | 23:04 |
fungi | thanks! | 23:05 |
clarkb | and thank you! | 23:05 |
clarkb | I need some nourishment then diving into zuul reviews | 23:05 |