Aggressive AI scrapers are making it kinda suck to run wikis
(weirdgloop.org)
Can you automatically block any user with an unusually high rate of requests?
It's hard because the requests all come from different IPs, at least on my site. 185k "unique visitors" hit my site just yesterday, half from outside of North America, which is odd because my site is pretty local.
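To make the trade-off concrete, here is a minimal sketch of a sliding-window rate limiter in Python. The class name `SlidingWindowLimiter`, the helper `subnet_key`, and the prefix lengths are all made up for illustration; nothing here comes from a specific server or plugin. It also shows why per-IP limits struggle with the traffic described above: if every request arrives from a different address, no single key ever crosses the threshold. Grouping by subnet is one possible workaround, assuming the scraper's addresses cluster in a few ranges, which may not hold for residential proxy traffic.

```python
import ipaddress
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per key within the last window_seconds."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window = window_seconds
        self.hits = defaultdict(deque)  # key -> timestamps of accepted requests

    def allow(self, key, now=None):
        now = time.monotonic() if now is None else now
        q = self.hits[key]
        # Drop timestamps that have aged out of the window.
        while q and now - q[0] >= self.window:
            q.popleft()
        if len(q) >= self.max_requests:
            return False
        q.append(now)
        return True


def subnet_key(ip, v4_prefix=24, v6_prefix=48):
    """Hypothetical grouping: collapse an address to its enclosing subnet."""
    addr = ipaddress.ip_address(ip)
    prefix = v4_prefix if addr.version == 4 else v6_prefix
    return str(ipaddress.ip_network(f"{ip}/{prefix}", strict=False))
```

You would call something like `limiter.allow(subnet_key(client_ip))` per request and return a 429 when it comes back `False`. The catch is that a wide subnet key can also throttle real visitors who share a range with a scraper.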
You could, but it's tricky to get right, I feel. Most small websites handle this with some form of bot detection for visitors, either a service like Cloudflare or an open source tool like Anubis.
There are different ways to tackle this, and it sucks that we're forced to put time and effort into dealing with it.
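For anyone wondering what that kind of bot detection can look like, here is a rough Python sketch of a proof-of-work challenge, the general idea that tools like Anubis build on. This is not Anubis's actual code or protocol; the function names, the challenge format, and the difficulty measure (leading zero hex digits of a SHA-256 hash) are assumptions picked for illustration. The server hands out a random challenge, the visitor's browser burns a little CPU finding a matching nonce, and the server verifies it with a single hash.

```python
import hashlib
import secrets


def make_challenge():
    """Server side: a random string the client must work against."""
    return secrets.token_hex(16)


def is_valid(challenge, nonce, difficulty):
    """Server side: one hash checks that the client did the work."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)


def solve(challenge, difficulty):
    """Client side: brute-force nonces until the hash has enough leading zeros."""
    nonce = 0
    while not is_valid(challenge, nonce, difficulty):
        nonce += 1
    return nonce
```

The point of the asymmetry is that one visitor pays the cost once, while a scraper hitting thousands of pages from fresh sessions pays it over and over. Each extra zero digit multiplies the expected work by 16, so the difficulty needs to stay low enough for phones and old laptops.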
There's a clever trick from Cloudflare:
https://blog.cloudflare.com/ai-labyrinth/
Poisoning the well at scale. I love it.