this post was submitted on 18 Jun 2023
179 points (100.0% liked)
Reddit Migration
### About Community
Tracking and helping #redditmigration to Kbin and the Fediverse. Say hello to the decentralized and open future. For the latest Reddit blackout info, see here: https://reddark.untone.uk/
founded 1 year ago
He's mad that a large part of the training corpus for LLMs (incl. ChatGPT) is just Reddit posts and their comments, and that he saw no money from it.
Of course, LLM training data only needs to be pulled once, and it can be heavily compressed. Reddit used to offer a way of downloading it like that: you'd just get a large zip file with all of Reddit up to that point, for free. They're retiring this, which means that the guys making the datasets will either stop (not a good thing, Reddit is too nice of a data source) or start scraping the website, which generates additional costs for Reddit for no reason.
Basically spez is pretending this one-time, heavily optimized download is costing him billions, so he takes it out on third-party apps instead, because it's too late to go after OpenAI, since they already got their copy.