Wednesday, 2023-12-06

frickler	JayF: fwiw I've been thinking to set up a link shortener fed by gerrit reviews, that would be pretty simplistic. for now though, I'm just running one privately for the things I regularly need	05:44
frickler	seems gmail is still at least delaying mails because we have more recipients than they like. I really wonder if proactively blocking mails towards them wouldn't be a better solution	07:24
frickler	if not, we'll likely have to do further work and maybe deploy ARC https://docs.mailman3.org/projects/mailman/en/latest/src/mailman/handlers/docs/arc_sign.html	07:25
*** Guest8552 is now known as layer8		08:37
*** layer8 is now known as Guest9391		08:37
*** Guest9391 is now known as layer9		08:39
opendevreview	Merged opendev/zone-opendev.org master: Remove old mirror nodes from DNS https://review.opendev.org/c/opendev/zone-opendev.org/+/902716	12:31
opendevreview	Bartosz Bezak proposed openstack/diskimage-builder master: Add NetworkManager-config-server to rocky-container https://review.opendev.org/c/openstack/diskimage-builder/+/892893	12:57
opendevreview	Jeremy Stanley proposed opendev/system-config master: Switch install-docker playbook to include_tasks https://review.opendev.org/c/opendev/system-config/+/902775	13:58
fungi	noticed that in this deploy job failure: https://zuul.opendev.org/t/openstack/build/7c9446f8b03345158de4346e946f8f31	13:59
fungi	not sure if it was the reason for the failure or just a red herring, but worth cleaning up either way	13:59
fungi	yeah, it wasn't the cause. found this in the log on bridge:	14:00
fungi	Recreating jaeger-docker_jaeger_1 ... done	14:00
fungi	fatal: [tracing01.opendev.org]: FAILED!	14:00
fungi	Timeout when waiting for 127.0.0.1:16686	14:00
fungi	so the container didn't come up, or didn't come up fast enough	14:01
fungi	the periodic build also failed exactly the same way	14:03
fungi	the last successful periodic run was on friday	14:04
fungi	the container log is complaining about subchannel connectivity problems	14:18
fungi	2023-12-06T12:40:47.191630938Z grpc@v1.59.0/server.go:964 [core][Server #5] grpc: Server.Serve failed to create ServerTransport: connection error: desc = ServerHandshake("127.0.0.1:45852") failed: tls: first record does not look like a TLS handshake	14:19
fungi	i'm going to try manually downing and upping the container just to see if i get anything else out of it	14:20
tonyb	Kk	14:20
fungi	log is still full of connection errors after the restart	14:23
fungi	looks like there were updates for the jaegertracing/all-in-one container image 44 hours ago, 4 days ago, and 7 days ago	14:24
tonyb	can we get the SHA for the last good deploy?	14:25
fungi	i don't know if the failures are related to the images, but the image from 7 days ago was definitely before the periodic job started failing	14:26
fungi	the one from 4 days ago falls inside the window between the last successful run on friday and the first failing run on monday	14:28
fungi	the one from 44 hours ago was after the first failure	14:29
fungi	docker image inspect of the 7-day-old image says it's jaegertracing/all-in-one@sha256:963fed00648f7e797fa15a71c6e693b7ddace2ba7968207bb14f657914dac65b	14:30
fungi	"Created": "2023-11-29T06:06:18.566997546Z"	14:30
fungi	can i replace "latest" with "963fed00648f7e797fa15a71c6e693b7ddace2ba7968207bb14f657914dac65b" in the compose file to test? does that syntax work?	14:32
fungi	not found: manifest unknown: manifest unknown	14:33
fungi	guess not	14:33
fungi	i switched from latest to 1.51 after looking at https://hub.docker.com/r/jaegertracing/all-in-one/tags and am seeing similar connection failures	14:35
tonyb	Try FROM jaegertracing/all-in-one@sha256:963fed00648f7e797fa15a71c6e693b7ddace2ba7968207bb14f657914dac65b	14:35
tonyb	Ah okay so that probably isn't it	14:36
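A note on the digest attempt above: a minimal sketch, assuming the usual container image reference syntax, of how a digest is attached to an image name. A bare hash placed after the colon is looked up as a tag, which would explain the "manifest unknown" response; a digest is attached with `@sha256:` instead, as in tonyb's suggestion. The helper name `pin_by_digest` is hypothetical and only illustrates the string format.

```python
def pin_by_digest(repository: str, digest_hex: str) -> str:
    """Return an image reference pinned to a sha256 content digest."""
    # A sha256 digest is 64 lowercase hexadecimal characters.
    if len(digest_hex) != 64 or any(c not in "0123456789abcdef" for c in digest_hex):
        raise ValueError("expected a 64-character lowercase hex sha256 digest")
    # The digest is joined with "@sha256:", not with the ":" used for tags.
    return f"{repository}@sha256:{digest_hex}"


# Digest taken from the `docker image inspect` output quoted in the log.
ref = pin_by_digest(
    "jaegertracing/all-in-one",
    "963fed00648f7e797fa15a71c6e693b7ddace2ba7968207bb14f657914dac65b",
)
print(ref)
```

The resulting string is what would go in the compose file's `image:` value in place of `jaegertracing/all-in-one:latest`.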
fungi	mmm, though these connection failures are info and warn level only	14:37
fungi	2023-12-06T14:36:11.773226176Z grpc@v1.59.0/clientconn.go:1521 [core][Channel #1 SubChannel #2] grpc: addrConn.createTransport failed to connect to {Addr: "localhost:4317", ServerName: "localhost:4317", }. Err: connection error: desc = transport: Error while dialing: dial tcp 127.0.0.1:4317: connect: connection refused	14:37
frickler	infra-root: forgot to mention yesterday, we still have three old held nodes that don't show up in autoholds, can someone look into cleaning those up?	14:44
frickler	there's also node 0035950265 that seems to be stuck in "deleted" state somehow	14:48
fungi	sometimes `openstack server show ...` will include an error message in those situations	14:49
frickler	I think that's more for "deleting" rather than "deleted"? anyway: No Server found for 01b73bd3-ad22-46a6-a88e-6e33fc2f4b61	15:00
fungi	"stuck in deleted state" according to openstack server list? or according to nodepool list?	15:01
frickler	according to the "nodes" tab in zuul, is that the same as nodepool list?	15:02
fungi	i think so, i haven't relied much on the webui for that	15:02
fungi	so is that where you're also seeing the held nodes you're talking about?	15:03
frickler	yes	15:05
fungi	nodepool list reports 11 nodes in a hold state, which corresponds to what i see at https://zuul.opendev.org/t/openstack/nodes	15:06
frickler	just checked, "nodepool list" doesn't have 0035950265, but 31 other very old nodes in "deleted" state	15:06
frickler	so a) some mismatch in state between zuul and nodepool and b) some cleanup in nodepools zk being broken I guess	15:07
fungi	zuul/nodepool switched from numeric node ids to uuids semi-recently	15:08
fungi	maybe that's the event horizon where it lost track of the old deleted nodes	15:10
frickler	I'm still seeing numeric node ids both in zuul and nodepool, though. I think the switch was only for image IDs?	15:10
fungi	oh, maybe	15:11
frickler	build ids to be more specific	15:11
frickler	image builds, that is, not zuul builds	15:12
fungi	looks like all the ones `nodepool list` reports in a "deleted" state are missing pretty much all data besides the node number, state, time in state, lock status, username, port	15:13
fungi	no provider even	15:13
fungi	so yes, probably will require some manual cleanup with zkcli	15:13
fungi	looks like they're all around 8-12 months old	15:14
JayF	frickler: running one privately is an option, I should do that	15:59
tonyb	I did a little poking at the ensure-pip role for enabling python3.12. Under the hood both pyenv and stow use the python-build tool from pyenv. It's just a question of when. pyenv would build a python3.12 on every job run, stow would build the python3.12 binary when we build the nodepool image (by using the python-stow-versions DIB element)	16:00
tonyb	I guess for now option 1 (pyenv, building in each job run) is the quickest POC	16:01
fungi	looks like the issue with the jaeger role may be that the 60-second timeout is too short	16:16
fungi	it does eventually listen and start responding on port 16686 but takes a while	16:17
fungi	i'll see if i can get a baseline timeframe	16:17
clarkb	to figure out the extra held nodes nodepool can list them with the detail data. I'll take a look shortly	16:20
fungi	clarkb: the "extra" held nodes already show the comment info in the nodes list (both from the nodepool cli and zuul's webui)	16:20
fungi	there just isn't a corresponding autohold zuul is tracking any longer	16:21
clarkb	oh then we just identify if they need to be held any longer and if not delete them	16:21
clarkb	based on the comment	16:22
fungi	empirical testing suggests 60 seconds is way too short of a timeout for jaeger to start listening. my initial test took 80 seconds. i'll run a few more restarts to see how consistent that is	16:22
tonyb	fungi: so we've just gotten really lucky with the 60s timeout to date?	16:24
fungi	or something has recently caused it to get slower	16:24
clarkb	frickler: fungi: the quoted text in the commit messages for the spf record updates implied that spf or dkim were sufficient. Maybe they actually want spf and dkim? In which case arc does seem like a next step	16:24
tonyb	Fair	16:24
fungi	i can try rolling back the version again to see if it speeds up	16:24
fungi	clarkb: the new deferral response from gmail doesn't indicate it has anything to do with message authentication, nor does it imply that additional message authentication would help	16:25
clarkb	fwiw jaeger must sit in front of a db of some sort. I'm not sure how safe rolling backwards will be	16:25
clarkb	fungi: oh they changed the message?	16:25
fungi	clarkb: it says there are too many recipients for the same message-id	16:25
tonyb	I know I had to add SPF and DKIM to my bakeyournoodle.com domain to get google to not send messages directly into spam.	16:25
clarkb	weird that we would do what they ask and then complain about something else	16:25
tonyb	but lists are ... different	16:26
fungi	80 seconds again on my second restart test	16:26
tonyb	I guess the MTA just bailed at the first error so there could be more coming	16:26
tonyb	... Other communities must have hit this	16:27
clarkb	tonyb: likely, but these are also new changes on the google side	16:28
clarkb	so it may be we're all hitting them in the last couple of days and scratching our heads	16:28
fungi	well, gmail wasn't exactly barfing before. it rate-limited deliveries from the server, saying authenticating by adding either spf or dkim would help (and will become mandatory in a couple months). now it's rate limiting again, but because of the number of people receiving the same post (so basically the number of gmail subscribers to lists)	16:28
clarkb	and I'm sure they'll be intentionally vague/obtuse in the name of not giving spammers a leg up	16:29
fungi	this time the implication is that the deferral is per message-id though, so it seems like some gmail subscribers receive the message, but it takes multiple returns back from the deferral queue before everyone does	16:29
clarkb	the incredibly frustrating thing here is that if you use email you'll be aware of the fact that gmail is the source of a significant portion of spam	16:29
fungi	and yeah, i expect this is just gmail ratcheting up their rules to try to block spam messages. i half wonder if it's also restricted to posts from people using gmail or gmail-hosted addresses. i've seen insinuation elsewhere that gmail is cracking down on messages that say they're "from" a gmail account but are originating from servers outside gmail's network. this may be the time we turn on	16:33
fungi	selective dmarc mitigation on openstack-discuss for people posting from gmail.com and any other domains which seem to be hosted at gmail	16:33
fungi	supposedly they added that feature in the latest release specifically because of gmail delivery problems	16:34
fungi	basically, mailman tries to guess how to mitigate messages with existing dkim signatures by evaluating the published dmarc policies for the post author's domain, but gmail doesn't obey its own published dmarc policies so mailman guesses wrong	16:36
tonyb	https://support.google.com/mail/answer/81126#requirements-5k&zippy=%2Crequirements-for-sending-or-more-messages-per-day	16:36
tonyb	It seems to be a reasonably long list of do's	16:36
tonyb	SPF, DKIM and ARC	16:36
fungi	we don't send 5k messages a day, but maybe they mean multiplied by the number of recipients	16:37
tonyb	Yeah I suspect 1 list-email to 100 [gmail]accounts counts as 100 in the context of that document	16:39
opendevreview	Brian Rosmaita proposed openstack/project-config master: Implement ironic-unmaintained-core group https://review.opendev.org/c/openstack/project-config/+/902796	16:39
fungi	all my jaeger restart tests are coming in at 80 or 81 seconds	16:41
clarkb	maybe set the timeout to 160 seconds then?	16:42
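For reference, a minimal sketch of the kind of measurement discussed above: polling a TCP port until it accepts connections and reporting how long that took. This is not the deployment role's own check; the function name `wait_for_port` and its parameters are hypothetical, and only the host, port, and timeout values come from the log.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float, interval: float = 1.0) -> float:
    """Poll until host:port accepts a TCP connection.

    Returns the number of seconds waited. Raises TimeoutError if the port
    is still not accepting connections once `timeout` seconds have passed.
    """
    start = time.monotonic()
    while True:
        try:
            # A successful connect means something is listening.
            with socket.create_connection((host, port), timeout=interval):
                return time.monotonic() - start
        except OSError:
            # Connection refused or timed out: give up or retry after a pause.
            if time.monotonic() - start >= timeout:
                raise TimeoutError(f"{host}:{port} not listening after {timeout}s")
            time.sleep(interval)
```

Calling `wait_for_port("127.0.0.1", 16686, timeout=160)` right after restarting the container would return roughly the 80 seconds measured above if the behaviour is consistent, and would leave about the same again as headroom.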
opendevreview	Jeremy Stanley proposed opendev/system-config master: Increase jaeger startup timeout https://review.opendev.org/c/opendev/system-config/+/902797	16:44
clarkb	infra-root https://review.opendev.org/c/opendev/system-config/+/902490 could use reviews if we still want to restart gerrit and try to use the new key again	16:53
clarkb	I've just come to a realization that that file is not using any templating anymore. Do you want me to make it a normal file copy and stop jinjaing it?	16:53
clarkb	that file == .ssh/config	16:53
fungi	that would probably be better, yes	16:55
clarkb	ok I'll make that update momentarily. Still haven't sorted out ssh keys this morning	16:56
opendevreview	Clark Boylan proposed opendev/system-config master: Reapply "Switch Gerrit replication to a larger RSA key" https://review.opendev.org/c/opendev/system-config/+/902490	17:02
clarkb	fungi: ^ that removes the jinja templating	17:02
fungi	lgtm, thanks!	17:04
tonyb	and me.	17:05
clarkb	we still good to do a restart later today? If so I'll go ahead and approve it now	17:06
tonyb	Yup. I'll be AFK from 1730-1900[UTC] but I'm also optional	17:07
clarkb	frickler: two of the nodes are related to holds I did for testing of the Gerrit bookworm and java 17 update. Those two can be deleted (I'll do this). The third is a kolla octavia debugging hold. I believe all three were leaked because we did a zuul update that included removing the zuul database (but not nodepool's)	17:08
clarkb	frickler: I'll let you delete 0035154490 when you are done with it	17:09
tonyb	We could also merge https://review.opendev.org/q/topic:gerrit-3.7-cleanup+is:open as it doesn't touch prod right?	17:09
fungi	i expect to step out to a late lunch/early dinner 19:30-20:30 utc	17:09
fungi	otherwise i'm available	17:09
frickler	clarkb: how do I delete it? I only know how to delete autoholds	17:10
clarkb	cool the best time for me is probably after 21:00 anyway (since before that I've got lunch and all the stuff from yesterday to catch up on)	17:10
clarkb	frickler: on a nodepool launcher node (nl01-nl04) you can run nodepool commands using this incantation `sudo docker exec nodepool-docker_nodepool-launcher_1 nodepool $subcommand`. In this case I ran the `list --detail` subcommand and piped it to `grep hold` to find the nodes nodepool sees as held	17:12
clarkb	frickler: then you can run the `delete $nodeid` subcommand to delete nodes directly	17:12
clarkb	frickler: the number I pasted above is the node id (0035154490)	17:12
clarkb	it shouldn't generally be necessary but I believe when we cleared out the zuul database entirely (because there was a manual upgrade problem? logs would tell us exactly why) that only cleared out the zuul side of the zk database and nodepool kept its portion of the held records	17:14
clarkb	generally we don't want to delete the nodepool side of the db because it keeps track of state outside of our systems and we want that to be in sync as much as possible	17:15
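The incantation clarkb describes can be sketched as below. This is a hedged illustration only: the container name, the `list --detail` and `delete` subcommands, and the node id are taken from the messages above, while the helper names `nodepool_command` and `held_lines` are hypothetical. Nothing here executes docker; it only builds the command lines and mimics the `grep hold` filter.

```python
import shlex

# Container name as quoted in the log.
CONTAINER = "nodepool-docker_nodepool-launcher_1"


def nodepool_command(*subcommand: str) -> list[str]:
    """Build the argv for running a nodepool subcommand inside the launcher container."""
    return ["sudo", "docker", "exec", CONTAINER, "nodepool", *subcommand]


def held_lines(list_detail_output: str) -> list[str]:
    """Keep only lines mentioning "hold", like piping the listing to `grep hold`."""
    return [line for line in list_detail_output.splitlines() if "hold" in line]


# Print the two commands from the discussion as copy-pasteable shell lines.
print(shlex.join(nodepool_command("list", "--detail")))
print(shlex.join(nodepool_command("delete", "0035154490")))
```

Passing the list from `nodepool_command` to `subprocess.run` on a launcher host would be the obvious next step, though that part is left out here since it depends on the environment.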
opendevreview	Merged opendev/system-config master: Increase jaeger startup timeout https://review.opendev.org/c/opendev/system-config/+/902797	17:22
frickler	clarkb: ok, thx, that seems to have worked fine, node 0035154490 is gone. doing a delete on one of the deleted nodes has put it back into "deleting"	17:29
clarkb	ya the delete command puts nodes in a deleting state in the db then normal nodepool runtime loops process that deletion	17:29
clarkb	when nodes are already deleting and stuck in that state an explicit deletion in nodepool is unlikely to change anything due to this. We can try to delete things manually using the openstack client directly and see if we get any errors back that we can parse and take action on	17:30
clarkb	looking at the ARC stuff that is basically fancy DKIM for mailing lists? We wouldn't need to configure separate DKIM records?	17:33
clarkb	or would we need separate DKIM for the administrative emails that come directly from the server?	17:34
fungi	we'd also have to stop keeping the original from addresses	17:34
tonyb	The doc I linked to indicates that we need SPF and DKIM for all "bulk" mail senders, and ARC for list hosts specifically	17:35
clarkb	hrm I'm confused as to what ARC does then. If we're rewriting the email to say its from us then we would just do normal DKIM ?	17:35
fungi	oh, i have no idea about arc, by "fancy dkim" thought you still meant based on the from address	17:35
tonyb	We're saying it's from us on behalf of "them", and we're good with that	17:36
tonyb	IIUC	17:36
clarkb	ya I think my confusion is that ARC is just DKIM	17:36
fungi	so far what little i've known about arc is that it's yet another attempt by massmail providers to make e-mail impossible for anyone who isn't them	17:36
tonyb	LOL	17:36
clarkb	but we've got another term in play to encapsulate the "remove the source DKIM stuff and replace it with our own"	17:36
clarkb	the dns records used by arc are dkim records	17:37
clarkb	so it is just dkim with maybe extra steps	17:37
tonyb	heading out. I'll be on my phone if needed	17:37
fungi	but anyway, if people want to get list mail in a timely manner, they can subscribe from a proper mail provider. i've maybe got bandwidth to look at what would be involved in adding our own dkim signing sometime next year	17:38
clarkb	ya I mean I gave up on gmail for open source mailing lists in ~2015? I don't remember exactly when I jumped ship due to the problems they were creating back then	17:42
clarkb	Definitely not new problems. What I think has changed is in the intervening period more people (often due to employer choices) have ended up on gmail for work like this and gone the opposite direction	17:43
fungi	yes, well i also don't subscribe to mailing lists with my employer-supplied e-mail address for similar reasons	17:44
clarkb	to followup on the ControlPersist change I can see ssh processes passing in the new values so it applied properly and doesn't seem to have broken anything. On the process cleanup side of things the main ssh process with the -oControlPersist=180s flag does seem to go away when ansible goes away. but there are ssh processes associated with .ansible/cp/$socketid sockets that appear to	19:13
clarkb	hang around for the timeout	19:13
clarkb	so there is "leakage" but I think three minutes is still short enough to not create too much headache	19:13
clarkb	those socket paths are the -o ControlPath values so ssh must fork a child or something to manage the socket because the timeout is meant to last beyond the lifetime of the parent if you start another control persist process with the same path?	19:14
clarkb	frickler: ^ fyi since there was question about this	19:15
opendevreview	Merged opendev/system-config master: Reapply "Switch Gerrit replication to a larger RSA key" https://review.opendev.org/c/opendev/system-config/+/902490	19:19
fungi	oh, as for the jaeger startup timeout change, the deploy job worked when it merged	19:23
frickler	clarkb: ok, thx for checking	19:26
clarkb	the gerrit ssh config change appears to have applied as expected	19:26
clarkb	I need lunch soon and will look at service restarts afterwards	19:27
tonyb	clarkb: Yeah. there is a master pid for each ControlSocket. ps `pidof ssh` | grep -E mux should show you?	19:31
fungi	okay, disappearing for food, back in about an hour	19:34
tonyb	Enjoy	19:35
clarkb	The process for restarting gerrit and using the new key should be something like: docker-compose down; mv id_rsa and id_rsa.pub to new .bak suffixed files; mv the replication waiting queue aside; docker-compose up -d; trigger replication	20:24
clarkb	this is basically the same process as on Friday but we expect different results due to the updated ssh config file with the correct path to the new private key	20:25
clarkb	unfortunately it seems that we only get the "trying key foo" log messages when replication fails and we also get "no more keys"	20:26
clarkb	so there isn't a positive confirmation of the key being used when it succeeds in the replication log. One alternative if we don't want to mv id_rsa aside is to check the sshd logs on the gitea servers as those should log the hashed pubkey value	20:27
clarkb	if we don't move id_rsa aside because we're concerned about a repeat of friday that may be good enough for confirmation	20:27
tonyb	I get that MINA ssh isn't openssh but is it worth doing something like: docker exec -it --user gerrit gerrit_compose_???_1 ssh -vv gitea10.opendev.org before the replication step	20:29
tonyb	that'd verify that the ssh config file is correct and the key is present at each end	20:30
clarkb	ya I think we can do that before we even restart	20:30
clarkb	I'll do that now	20:30
tonyb	is .ssh/config coming from a mounted volume?	20:31
clarkb	yes	20:31
tonyb	Nice	20:31
clarkb	unexpectedly openssh wants me to confirm the ssh host key for gitea09 so I've ^C'd and am trying to sort out why	20:32
clarkb	oh I know the port is wrong	20:32
clarkb	I did `sudo docker exec -it gerrit-compose_gerrit_1 bash` then `ssh -p222 git@gitea09.opendev.org` and that returned "Hi there, gerrit! You've successfully authenticated with the key named Gerrit replication key B, but Gitea does not provide shell access."	20:33
tonyb	if you do include the -vv it'll tell you the Host sections parsed and the key being presented	20:34
clarkb	tonyb: ah ok I can do that again. Though gitea seems to have confirmed it used the correct key	20:35
tonyb	Oh okay	20:35
tonyb	.... Ah I see it there. nevermind	20:36
clarkb	"/var/gerrit/.ssh/config line 1: Applying options for gitea*.opendev.org" and "identity file /var/gerrit/.ssh/replication_id_rsa_B type 0" are in the debug output	20:36
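A small sketch of pulling those two facts out of verbose ssh output, in case the check needs repeating across several gitea backends. It assumes the debug lines look like the two quoted above (ssh normally writes them to stderr with a `debug` prefix, which the patterns tolerate); the function name `summarize_ssh_debug` is hypothetical.

```python
import re

# Matches e.g. "/var/gerrit/.ssh/config line 1: Applying options for gitea*.opendev.org"
APPLY_RE = re.compile(r"\S+ line \d+: Applying options for (?P<pattern>\S+)")
# Matches e.g. "identity file /var/gerrit/.ssh/replication_id_rsa_B type 0"
IDENTITY_RE = re.compile(r"identity file (?P<path>\S+) type -?\d+")


def summarize_ssh_debug(output: str) -> dict:
    """Collect the Host patterns applied and the identity files ssh reported."""
    return {
        "host_patterns": [m.group("pattern") for m in APPLY_RE.finditer(output)],
        "identity_files": [m.group("path") for m in IDENTITY_RE.finditer(output)],
    }


# The two lines quoted in the log, used as sample input.
sample = "\n".join([
    "/var/gerrit/.ssh/config line 1: Applying options for gitea*.opendev.org",
    "identity file /var/gerrit/.ssh/replication_id_rsa_B type 0",
])
print(summarize_ssh_debug(sample))
```

This only reports what ssh says it parsed; Gitea's greeting naming "Gerrit replication key B" remains the stronger confirmation that the key was accepted.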
tonyb	That's sounding good.	20:37
clarkb	openssh at least appears to parse this the way we expect	20:37
tonyb	\o/	20:37
clarkb	in that case I'm inclined to move id_rsa aside just so there is no doubt when we restart since we should be pretty confident that the new key can be used and will be used	20:38
tonyb	Sure.	20:39
fungi	okay, i've returned from food. need to change back into something more comfortable and will be available for gerrit work	20:56
clarkb	infra-root: looking at the gitea09 backup failures, today's backup did not fail, and the failures that did happen appear to have occurred because mysql was being updated at the same time as we tried to do mysqldumps	20:56
clarkb	automated software updates conflicting with automated backups. The good news is we backup to two separate locations and only the one location seems to have conflicted with our mysql updates	20:57
clarkb	I think this has to do with overlap in periodic job runtimes and updates upstream of us	20:57
fungi	that sounds plausible	20:57
clarkb	if it persists we should look at maybe removing the 02:00 to 08:00 window of time from valid hours for automated backups or something like that since that is around when periodic jobs run	20:58
clarkb	fungi: see above for additional gerrit replication validation. I think we can start planning a time to restart the service	20:58
clarkb	maybe 21:30 ish?	20:59
fungi	yeah, already saw the ssh tests, i concur	21:00
fungi	30 minutes from now sounds great	21:00
clarkb	I've started a root screen	21:21
clarkb	also how does this look #status notice We are restarting Gerrit again for replication configuration updates after we failed to make them last week. Gerrit may be unavailable for short periods of time in the near future.	21:24
fungi	a bit wordy. i'm not opposed, but if it were me i'd just repeat the one i sent last week for brevity	21:25
fungi	attached to the screen session	21:26
* clarkb looks that one up		21:26
clarkb	here it is: #status notice The Gerrit service on review.opendev.org will be offline momentarily to restart it onto an updated replication key	21:27
clarkb	I'm good with that	21:27
clarkb	I'm tempted to let the zuul gate clean up since a number of changes are saying they are less than a minute away	21:28
clarkb	but then send that notice and proceed	21:28
fungi	yeah, i don't feel like we need to apologize for the previous attempt. it's not like anybody else volunteered to take care of it	21:29
fungi	takes as many tries as it takes	21:29
fungi	not everything gets done right the first time around	21:30
clarkb	the nova job is finishing up	21:31
clarkb	so ya I can wait a couple minutes	21:31
fungi	i'm in no hurry	21:32
clarkb	I think enough things have happened in zuul we can proceed	21:35
clarkb	I'll send the notice now	21:35
clarkb	#status notice The Gerrit service on review.opendev.org will be offline momentarily to restart it onto an updated replication key	21:35
opendevstatus	clarkb: sending notice	21:35
-opendevstatus- NOTICE: The Gerrit service on review.opendev.org will be offline momentarily to restart it onto an updated replication key		21:35
opendevstatus	clarkb: finished sending notice	21:38
clarkb	I will proceed now	21:38
clarkb	gerrit has been restarted. Now we need someone to push something :)	21:40
fungi	i have something to push, just a sec	21:40
fungi	git review took a minute	21:41
fungi	https://opendev.org/openstack/election/commit/21cd3533ddfa587d16bbceed498c48029ab91fa9	21:41
fungi	that says patchset 3 which has a timestamp from now: https://review.opendev.org/c/openstack/election/+/876738	21:42
clarkb	you pushed as a wip so it didn't show up in my listing of open changes :)	21:42
clarkb	the replication log appears to show replication for openstack/election completing successfully though	21:43
clarkb	now to check the actual gitea content	21:43
clarkb	fungi: I confirm that the latest patchset which was pushed after the restart shows up in gitea	21:43
clarkb	I think we are good	21:43
fungi	yep, lgtm!	21:44
fungi	second time's a charm	21:44
clarkb	I'm going to shutdown the screen now	21:44
clarkb	I'll get a change up to remove the old key and rebase the gitea 1.21 upgrade on that	21:44
opendevreview	Clark Boylan proposed opendev/system-config master: Set both replication gitea ssh keys to the same value https://review.opendev.org/c/opendev/system-config/+/902842	21:47
JayF	Internal server error while editing a wiki page, https://home.jvf.cc/~jay/wiki-openstack-org-error-20231206.png	21:50
opendevreview	Clark Boylan proposed opendev/system-config master: Update gitea to 1.21.1 https://review.opendev.org/c/opendev/system-config/+/897679	21:50
JayF	"Lock wait timeout exceeded"	21:50
JayF	Perhaps just a DB needs a restart?	21:51
JayF	I'll note that the page edit did take.	21:51
clarkb	or there is lock contention for that lock for some reason	21:52
JayF	No idea, but wanted to make sure it got reported since I was able to screenshot the error. I know wiki is barely supported if at all	21:52
clarkb	I was able to edit https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting#Agenda_for_next_meeting without an error	21:53
clarkb	so isn't representative of a general persistent problem (or at least not affecting 100% of edits)	21:53
fungi	the database is remote (rackspace "trove" instance), so network timeouts for database writes aren't unheard of	21:55
opendevreview	Clark Boylan proposed opendev/system-config master: DNM intentional gitea failure to hold a node https://review.opendev.org/c/opendev/system-config/+/848181	21:55
clarkb	I've refreshed the autoholds for ^ and I'll clean up the gerrit replication autoholds tomorrow	21:57
clarkb	I'll approve the gerrit 3.9 image stuff first thing tomorrow	22:42
clarkb	Will be a good conversation item for the gerrit community meeting if nothing else comes up	22:42
tonyb	sounds good	22:43
clarkb	now that EU and NA are both off of DST the meeting should be at 8am pacific time	22:43
tonyb	that isn't terrible, but could be a pain with any school run	22:44
clarkb	it also conflicts with some writing show off thing at school, but it sounds like the kids will come home with their writing and can show it off at home at the end of the day so not a big deal	22:45
tonyb	that's frustrating. but good that you have a backup	23:16
