How to poison AI data to accelerate model collapse?
(programming.dev)
There are several tarpits and other tools that claim to poison LLM training data or genAI image models... But poisoning isn't effective. It's mainly a waste of time, since the models and the training process have changed and adapted. They'll curate the datasets and just get rid of the outlier information. The crawler itself may already make some decisions to cope. You can do it if you like. But be aware this is mostly for your own entertainment. It won't change anything.
What I do is block their address ranges and be done with it. That can be done with a few allow/deny rules in the web server config, or at the firewall.
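
For anyone who wants to see what that looks like, here's a rough sketch in Python of the range check itself, plus a helper that prints nginx-style `deny` lines. The CIDR blocks are placeholders from the reserved documentation ranges, not real crawler addresses, and the names `BLOCKED_RANGES`, `is_blocked` and `nginx_deny_rules` are made up for the example. You'd swap in whatever ranges the crawler operator actually publishes.

```python
import ipaddress

# ASSUMPTION: placeholder ranges from the reserved documentation blocks
# (RFC 5737 / RFC 3849). These are NOT real crawler address ranges.
BLOCKED_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def is_blocked(client_ip: str) -> bool:
    """Return True if client_ip falls inside any blocked range."""
    addr = ipaddress.ip_address(client_ip)
    # An IPv4 address is never "in" an IPv6 network (and vice versa),
    # so mixing both kinds in one list is safe.
    return any(addr in net for net in BLOCKED_RANGES)


def nginx_deny_rules() -> list[str]:
    """Render the ranges as nginx access-module 'deny' directives."""
    return [f"deny {net};" for net in BLOCKED_RANGES]


if __name__ == "__main__":
    for ip in ("192.0.2.15", "203.0.113.9", "2001:db8::1"):
        print(ip, "blocked" if is_blocked(ip) else "allowed")
    print("\n".join(nginx_deny_rules()))
```

In practice you'd paste the generated `deny` lines into the server config (or the equivalent firewall rules) rather than checking in application code, so the request gets dropped before it costs you anything.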