Saturday, 2026-01-24

@fungicide:matrix.org: lists.o.o is taking forever to respond to web requests. i see uwsgi processes eating up all available processors, and trying to grab /server-status locally on the server with curl has been waiting over a minute for a response even (17:21)
@fungicide:matrix.org: ah yep, all worker slots are full (17:22)
@fungicide:matrix.org: WWWWWWWWWWWWWWWWWW_WWWWWWWWRRWWWWWRWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWRWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWRWWWWWWWWWWWWWWWWWWWWWWWWWW (17:22)
@fungicide:matrix.org: i can't wait until the ai craze crashes and burns, at this point i don't even care if it takes the world economy with it, this insanity can't go on forever (17:23)
@fungicide:matrix.org: seeing as how we've been maxing out swap on that server courtesy of the webui, i don't think it's safe to add more slots in apache unless we resize the vm (17:25)
@fungicide:matrix.org: where resize probably means replace, and then we get new ip addresses and a delivery reputation reset (17:26)
@fungicide:matrix.org: though maybe we could run hyperkitty/postorius on a separate vm from mailman-core? (17:26)
@fungicide:matrix.org: that would probably be less ridiculous than turning the current server into a smarthost (17:28)
@fungicide:matrix.org: getting simultaneously crawled by chatgpt, amazonbot, baidu, semrush, claude, and a bunch more (17:34)
@fungicide:matrix.org: oh, bing's scraping right now too (17:34)
@fungicide:matrix.org: and applebot (17:35)
@fungicide:matrix.org: mj12bot (17:35)
@fungicide:matrix.org: ahrefsbot (17:36)
@fungicide:matrix.org: it's like everybody decided saturday was a good day to redownload all the content from that server (17:36)
@fungicide:matrix.org: dotbot, googlebot, bytespider, petalbot... just watching these scroll by and it's clear that between the search engine and llm training crawlers we're not serving this content to humans with web browsers any more, they're apparently a fraction of a fraction of a percent of the requests (17:42)
@fungicide:matrix.org: and now it's almost completely idle (18:09)
@fungicide:matrix.org: .........................______________________________W_____________W_____........................._____W___________________......................... (18:09)
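For readers unfamiliar with the scoreboard strings pasted above: they come from Apache's mod_status page, where "_" is a worker waiting for a connection, "R" is reading a request, "W" is sending a reply, and "." is an open slot with no current process. A minimal sketch for tallying such a string follows; the function names are made up for illustration and only the four characters seen in this log are given labels.

```python
from collections import Counter

# Meaning of the mod_status scoreboard characters that appear in this log.
# Apache defines more states (keepalive, closing, ...); they are reported
# as "other" here.
SCOREBOARD_KEY = {
    "_": "waiting for connection",
    "R": "reading request",
    "W": "sending reply",
    ".": "open slot with no current process",
}


def summarize_scoreboard(scoreboard: str) -> dict[str, int]:
    """Count how many slots are in each state."""
    counts = Counter(scoreboard.strip())
    return {SCOREBOARD_KEY.get(ch, f"other ({ch})"): n for ch, n in counts.items()}


def busy_fraction(scoreboard: str) -> float:
    """Share of started workers (every slot except '.') that are not idle."""
    started = [ch for ch in scoreboard.strip() if ch != "."]
    if not started:
        return 0.0
    busy = sum(1 for ch in started if ch != "_")
    return busy / len(started)
```

Feeding the 17:22 string to `busy_fraction` gives a value close to 1.0 (a single idle worker), while the 18:09 string gives a value close to 0.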
@clarkb:matrix.org: fungi: note that often times those named bots aren't the problem as they tend to respect things like our crawl delay value (20:08)
@clarkb:matrix.org: fungi: what I find useful is to sort all the apache requests by user agent to get a sense of relative request counts. On gitea we've also identified particularly expensive requests and filter on those. Maybe we need to identify similar requests for mailman? (20:08)
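The per-user-agent count suggested here could be sketched roughly as below. This assumes the access log uses Apache's "combined" format, where the last two quoted fields are the referer and the user agent; the actual log format and log path on the server are not stated in this conversation, and the function names are illustrative only.

```python
import re
from collections import Counter

# Matches the trailing '"referer" "user agent"' pair of a combined-format
# line. Backslash-escaped quotes inside either field are tolerated.
TAIL = re.compile(r'"(?:[^"\\]|\\.)*" "((?:[^"\\]|\\.)*)"\s*$')


def count_user_agents(lines):
    """Return a Counter mapping user agent string to request count."""
    counts = Counter()
    for line in lines:
        match = TAIL.search(line)
        if match:  # lines in some other format are skipped
            counts[match.group(1)] += 1
    return counts


def top_user_agents(lines, n=10):
    """Return the n most frequent user agents as (agent, count) pairs."""
    return count_user_agents(lines).most_common(n)
```

Passing an open log file (a file object iterates line by line) to `top_user_agents` gives the relative request counts. The same loop could be extended to tally request paths as well, to look for the expensive requests mentioned for gitea.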
@clarkb:matrix.org: we do set a crawl delay of 2 here. If we decide these properly named bots are the issue and they are respecting crawl delay we might increase that value (20:09)
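To put the crawl delay in numbers: a bot that honors a delay of d seconds makes at most 86400 / d requests per day, so a delay of 2 still permits up to 43200 requests per day from each compliant crawler. The sketch below reads the delay the way a well-behaved crawler might, using Python's standard `urllib.robotparser`. The robots.txt text is a hypothetical stand-in (only the delay of 2 comes from the discussion), and `max_requests_per_day` is a made-up helper.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; only the Crawl-delay value of 2 is from the log.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 2
Disallow: /private/
"""


def crawl_delay_for(robots_txt: str, user_agent: str):
    """Return the crawl delay that applies to user_agent, or None."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)


def max_requests_per_day(delay_seconds: int) -> int:
    """Upper bound on daily requests from one bot honoring the delay."""
    return 86400 // delay_seconds
```

Raising the delay to 10, for example, would lower that per-bot ceiling to 8640 requests per day, but only for crawlers that actually respect the directive.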
@clarkb:matrix.org: I agree we shouldn't add more slots without changing anything else. Another option may be to try and have apache cache things? I think the main risk there is you'll go to an archive link for a thread and new thread responses may not appear until the cache is cleared? that may be ok considering that list traffic tends to be low and slow like a smoker? (20:11)
@clarkb:matrix.org: anyway we can think about this more on Monday (20:11)
@fungicide:matrix.org: oh, yeah disk cache might help, even a very, very large one. we have plenty of storage (20:22)
