Thursday, 2025-10-16

[01:07] tonyb: fungi, noonedeadpunk: I think gitea13 is getting hit by two crawlers: one (Facebook) going via the load-balancer, and a second (ChatGPT) going direct to gitea13
[01:09] Clark[m]: tonyb: there is also a crawler coming out of Cloudflare IPs with legit-looking UAs, but you can tell it is a crawler due to the URL patterns (it's asking for every file in every repo for every commit hash)
[01:10] Clark[m]: Those tend to be more problematic because they aren't good enough to correctly identify themselves. I noted earlier today that I suspect 13 has been identified by the crawlers on the web but the other 5 haven't been, so it gets direct requests and the others don't, leading to the imbalance. We may need to consider blocking direct access even though it makes debugging more annoying
[01:10] Clark[m]: That should force things to balance out better
[01:12] tonyb: Clark[m]: I'm not seeing that?
[01:12] Clark[m]: tonyb it's possible they moved on since I checked about 10 hours ago
[01:13] tonyb: Clark[m]: `grep -v 38.108.68.97 /var/log/apache2/gitea-ssl-access.log | awk -F\" '{sub(":.*", "", $1); print $1}' | sort | uniq -c | sort` shows two IPs that account for 500+ connections
[01:13] tonyb: they're both ChatGPT (I think)
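Editor's note: the shell pipeline above can be mirrored in Python when the counts need further processing. This is a minimal sketch, not what was run on the server. It assumes each log line starts with the client address followed by a colon (as the `sub(":.*", "", $1)` step implies); the function name `count_clients` and the sample lines are hypothetical.

```python
from collections import Counter


def count_clients(lines, exclude="38.108.68.97"):
    """Count log lines per client, mirroring the grep/awk/uniq pipeline.

    Assumption: the text before the first double quote begins with the
    client address, and everything from the first colon on can be dropped.
    """
    counts = Counter()
    for line in lines:
        # Equivalent of `grep -v`: skip any line mentioning the excluded address.
        if exclude in line:
            continue
        prefix = line.split('"', 1)[0]             # awk -F\" ... $1
        client = prefix.split(":", 1)[0].strip()   # sub(":.*", "", $1)
        if client:
            counts[client] += 1
    return counts


# Hypothetical sample lines using documentation-only addresses.
sample = [
    '203.0.113.7:51234 - - [16/Oct/2025:01:10:00 +0000] "GET /a HTTP/1.1" 200 12',
    '203.0.113.7:51240 - - [16/Oct/2025:01:10:01 +0000] "GET /b HTTP/1.1" 200 12',
    '198.51.100.9:40000 - - [16/Oct/2025:01:10:02 +0000] "GET /c HTTP/1.1" 200 12',
    '38.108.68.97:44321 - - [16/Oct/2025:01:10:03 +0000] "GET /d HTTP/1.1" 200 12',
]

for client, hits in count_clients(sample).most_common():
    print(hits, client)
```

`Counter.most_common()` returns the busiest clients first, which is the reverse of the order the final `sort` in the shell version produces.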
[01:13] Clark[m]: Ya the Cloudflare stuff is someone trying to appear like normal traffic so it's many IPs and many user agents
[01:14] tonyb: Ooooo
[01:14] Clark[m]: If you look for requests that include commit hashes they stand out.
[01:15] Clark[m]: But also look for weird user agents like Android 3 or Firefox 3
[01:15] Clark[m]: They must have systems with massive tables of valid user agents that they iterate through for each request. I've even found typos in the user agents in the past
[01:15] tonyb: Oh yeah there are *lots* of those. I was ignoring them until the 2 I mentioned were dealt with.
[01:15] tonyb: probably a mistake
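Editor's note: the two signals Clark[m] describes (commit hashes in the URL, and implausibly old user agents such as Android 3 or Firefox 3) can be checked per request. The sketch below is illustrative only. The regular expressions, the cut-off at major version 3, and the name `crawler_signals` are assumptions, and a single match is weak evidence; the pattern matters across many requests.

```python
import re

# A full 40-character hex string in the path, e.g. /src/commit/<sha>/file.
COMMIT_HASH = re.compile(r"(?<![0-9a-f])[0-9a-f]{40}(?![0-9a-f])")

# Assumed heuristic: major versions 1-3 of Android or Firefox are far too
# old to be plausible for real users today.
ANCIENT_UA = re.compile(r"Android [1-3]\b|Firefox/[1-3]\b")


def crawler_signals(path, user_agent):
    """Return the list of heuristics a single request trips."""
    signals = []
    if COMMIT_HASH.search(path):
        signals.append("commit-hash URL")
    if ANCIENT_UA.search(user_agent):
        signals.append("implausibly old user agent")
    return signals


# Hypothetical request: a per-commit file view with a Firefox 3 user agent.
path = "/example/repo/src/commit/" + "a" * 40 + "/README.rst"
agent = "Mozilla/5.0 (X11; Linux x86_64; rv:1.9) Gecko/2008 Firefox/3.0"
print(crawler_signals(path, agent))
```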
[01:17] tonyb: I wonder if we should a) block Facebook at the load-balancer (via robots.txt, which they claim to honor); and b) pull gitea13 out of the pool for a while so that real users don't get random slow servers.
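Editor's note: option (a) amounts to a robots.txt rule aimed at one crawler. The sketch below checks such a rule with Python's standard `urllib.robotparser`. The user-agent token `facebookexternalhit` and the paths are assumptions for illustration, and robots.txt is advisory, so it only helps with crawlers that choose to honor it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: refuse one named crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: facebookexternalhit
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# The named crawler is refused everywhere; other agents are still allowed.
blocked = parser.can_fetch("facebookexternalhit/1.1", "/example/repo")
allowed = parser.can_fetch("Mozilla/5.0", "/example/repo")
print(blocked, allowed)
```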
[01:18] Clark[m]: While the ChatGPT and Facebook and so on traffic is not zero impact, I suspect they are generally much better behaved, and it is the botnet crawling every repo, commit and file from many IPs that is actually the problem
[01:18] Clark[m]: Which is why they don't properly identify themselves: they know they are not doing what they should
[01:18] tonyb: Fair enough
[01:19] tonyb: If that's true, we could just do step "b" as the worst behaving client(s) are going direct
[01:20] tonyb: Or we could just ignore things as it's annoying but not all that painful?
[01:25] Clark[m]: Ya I guess I didn't consider just pulling 13 and letting it be a honeypot
[01:26] Clark[m]: I think that works and is simple. I had initially rejected that as I assumed that we would redirect to another backend, but since we're bypassing the load balancer entirely with the bad traffic that may actually work as an interim step
[01:34] tonyb: The Facebook crawler will "migrate" but I think the others will stay where they are
[01:47] tonyb: #status log Removing gitea13 from the load-balancer due to several crawlers hitting gitea13 and bypassing the load-balancer. This leaves the node running as a honeypot and (hopefully) minimising human visible impacts
[01:51] Clark[m]: Thanks!
[01:51] tonyb: As expected Facebook has migrated
[02:14] opendevreview: OpenStack Proposal Bot proposed openstack/project-config master: Normalize projects.yaml  https://review.opendev.org/c/openstack/project-config/+/962557
[07:40] *** ykarel_ is now known as ykarel
[07:55] *** mrunge_ is now known as mrunge
[10:21] fungi: this thread may be worth watching: https://discuss.python.org/t/are-setuptools-abandoned/104390
[14:54] opendevreview: Merged zuul/zuul-jobs master: Make upload-image-s3 hash timeout configurable  https://review.opendev.org/c/zuul/zuul-jobs/+/963663
[14:54] opendevreview: Merged zuul/zuul-jobs master: Allow disabling compression of uploaded images  https://review.opendev.org/c/zuul/zuul-jobs/+/963669
[14:54] opendevreview: Merged zuul/zuul-jobs master: Make upload-image-swift hash timeout configurable  https://review.opendev.org/c/zuul/zuul-jobs/+/963726
[14:57] opendevreview: Merged zuul/zuul-jobs master: Allow disabling compression of uploaded images  https://review.opendev.org/c/zuul/zuul-jobs/+/963727
[14:57] opendevreview: Merged zuul/zuul-jobs master: Allow upload-image-s3 role to export S3 URLS  https://review.opendev.org/c/zuul/zuul-jobs/+/963828
[18:27] opendevreview: Nicolas Hicher proposed zuul/zuul-jobs master: Refactor: multi-node-bridge to use linux bridge  https://review.opendev.org/c/zuul/zuul-jobs/+/959393
[20:12] opendevreview: Tony Breeds proposed openstack/project-config master: [pti-python-tarball] Add compatibility for older wheels  https://review.opendev.org/c/openstack/project-config/+/964251
[20:51] tonyb: fungi: ^^ That's my rough idea for dealing with the wheel dist-info issue(s)
[21:07] opendevreview: Tony Breeds proposed openstack/project-config master: [pti-python-tarball] Add compatibility for older wheels  https://review.opendev.org/c/openstack/project-config/+/964251
[21:10] opendevreview: Vladimir Kozhukalov proposed zuul/zuul-jobs master: [build-container-image] Update buildx change tag  https://review.opendev.org/c/zuul/zuul-jobs/+/964255
