Wikipedia has banned AI-generated text, with two exceptions
(www.howtogeek.com)
Wikipedia probably wants to sell access to its content for LLM training. That content is only valuable if Wikipedia remains a high-quality, slop-free source.
I think even AI zealots think there should be silos of content to train from that are fully human generated. Training slop on slop makes the slop even worse.
Sell licenses to what? It's all already under Creative Commons, iirc.
The content is CC licensed, but they are trying to block AI scraping because it overloads their servers. They have a paid API that uses a lot less compute for both Wikipedia and the AI, as well as being a revenue source for Wikipedia.
Yes, but...
https://en.wikipedia.org/wiki/Wikipedia%3ADatabase_download
That's because viewing the page uses server resources, as does API access. If you want the data, you can download the database directly.
This was only done because the editors pushed to minimize AI involvement. There's a comment here already mentioning that: https://lemmy.world/comment/22826863