smeenz@lemmy.nz 4 points 2 days ago

If sites start blocking googlebot en masse, then googlebot will just start ignoring robots.txt

flamingos@feddit.uk 2 points 1 day ago

Then you can just block the user agent in nginx or whatever you use, like all the other AI scrapers who ignore robots.txt (*cough* Amazon)
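For anyone not on nginx, here is a rough sketch of the same idea as Python WSGI middleware. The names `BLOCKED_AGENTS`, `is_blocked` and `block_scrapers` are made up for illustration, and the two crawler tokens are only examples, not a vetted blocklist.

```python
import re

# Hypothetical blocklist: these tokens are illustrative assumptions.
BLOCKED_AGENTS = re.compile(r"(googlebot|amazonbot)", re.IGNORECASE)


def is_blocked(user_agent):
    """Return True if the User-Agent string contains a blocked crawler token."""
    return bool(BLOCKED_AGENTS.search(user_agent or ""))


def block_scrapers(app):
    """Wrap a WSGI app so blocked agents get a 403 instead of the page."""
    def middleware(environ, start_response):
        if is_blocked(environ.get("HTTP_USER_AGENT", "")):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        # Anything else is passed through to the wrapped application.
        return app(environ, start_response)
    return middleware
```

Wrap your app with `block_scrapers(app)` and matching requests never reach it. This only helps while the crawler sends an honest User-Agent header.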

smeenz@lemmy.nz 0 points 1 day ago

Then the user agent string will just quietly become randomised so you can't match it reliably, because it turns out that honouring robots.txt was always little more than a "gentleman's handshake".

dgerard@awful.systems 1 points 19 hours ago

this is a problem we have had for a while now, i assure you

HK65@sopuli.xyz 4 points 2 days ago

Can they just put an EULA on the site and then sue Google for unauthorized access?

Not in the US of course, but in the EU or something

this post was submitted on 20 May 2026
133 points (99.3% liked)

TechTakes
