flamingos@feddit.uk 2 points 1 day ago

Then you can just block the user agent in nginx or whatever you use, same as for all the other AI scrapers that ignore robots.txt (*cough* Amazon).
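
For anyone not on nginx, the same idea can be sketched in a few lines of Python: match the User-Agent header against a blocklist and refuse the request on a hit. This is only a rough illustration, not anyone's actual setup. The names `BLOCKED_AGENTS`, `is_blocked` and `status_for` are made up for the example, and `ExampleAIBot` is a placeholder token, so swap in whatever strings show up in your own logs.

```python
import re

# Hypothetical blocklist of User-Agent tokens to refuse.
# "Amazonbot" is an assumed token for Amazon's crawler; "ExampleAIBot" is a placeholder.
BLOCKED_AGENTS = ["Amazonbot", "ExampleAIBot"]

# One case-insensitive pattern built from the escaped tokens.
_BLOCKED_RE = re.compile(
    "|".join(re.escape(name) for name in BLOCKED_AGENTS),
    re.IGNORECASE,
)


def is_blocked(user_agent: str) -> bool:
    """Return True if the User-Agent header contains a blocklisted token."""
    return bool(_BLOCKED_RE.search(user_agent or ""))


def status_for(user_agent: str) -> int:
    """Status code a server might send: 403 for blocked agents, 200 otherwise."""
    return 403 if is_blocked(user_agent) else 200


if __name__ == "__main__":
    print(status_for("Mozilla/5.0 (compatible; Amazonbot/0.1)"))
    print(status_for("Mozilla/5.0 (X11; Linux x86_64) Firefox/126.0"))
```

This only helps while the scraper keeps sending a string you can recognise.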

smeenz@lemmy.nz 0 points 20 hours ago

Then the user agent string will just quietly become randomised, so you can't match it reliably, because it turns out that honouring robots.txt was always little more than a "gentleman's handshake".

dgerard@awful.systems 1 points 13 hours ago

this is a problem we have had for a while now, i assure you

this post was submitted on 20 May 2026
133 points (99.3% liked)

TechTakes
