submitted 1 year ago* (last edited 1 year ago) by take_five_seconds@hexbear.net to c/news@hexbear.net

They also didn't seed.

Supposedly, Meta tried to conceal the seeding by not using Facebook servers while downloading the dataset to "avoid" the "risk" of anyone "tracing back the seeder/downloader" from Facebook servers, an internal message from Meta researcher Frank Zhang said, while describing the work as in "stealth mode." Meta also allegedly modified settings "so that the smallest amount of seeding possible could occur," a Meta executive in charge of project management, Michael Clark, said in a deposition.

all 47 comments
[-] AernaLingus@hexbear.net 103 points 1 year ago

Meta also allegedly modified settings "so that the smallest amount of seeding possible could occur,"

and to top it all off, they're goddamn leechers!

[-] buckykat@hexbear.net 62 points 1 year ago

81.7TB is so many fucking books

[-] context@hexbear.net 38 points 1 year ago

half of it is the complete works of chuck tingle

[-] buckykat@hexbear.net 19 points 1 year ago

SWIM has a folder of 9GB of books and it's a lot. This is almost ten thousand times that many.

[-] context@hexbear.net 18 points 1 year ago
[-] Nacarbac@hexbear.net 7 points 1 year ago

"Pounded in the butt by the AI graduated from Facebook's pirate training operation, but not very well compared to the lean efficiency of the pounding provided by DeepSeek with significantly less illegal torrenting, despite the eyepatch and parrot."

[-] context@hexbear.net 3 points 1 year ago

one of my personal favorites from chuck's january 2025 oeuvre

[-] deforestgump@hexbear.net 62 points 1 year ago

They should be getting a cease and desist letter any day now

[-] Tachanka@hexbear.net 57 points 1 year ago

Corporation pirates millions of books to train AI: No charge

Bourgeois individual commits billions of dollars in fraud: 40 months in country club prison

Homeless man steals $100 and gives it back: 15 years in general population prison

Any questions?

[-] communism@lemmy.ml 7 points 1 year ago

jfc, please tell me he appealed the sentence

[-] PolandIsAStateOfMind@lemmygrad.ml 51 points 1 year ago

“so that the smallest amount of seeding possible could occur,”

They can't even pirate ethically, fucking landlubbers

[-] boiledfrog@hexbear.net 19 points 1 year ago

Another reason for the wall

[-] Aquilae@hexbear.net 49 points 1 year ago

Meta also allegedly modified settings "so that the smallest amount of seeding possible could occur,"

pathetic

[-] MidnightPocket@hexbear.net 43 points 1 year ago

Imagine having all that corporate funding and still cutting costs on...stealing information.

[-] combat_brandonism@hexbear.net 38 points 1 year ago* (last edited 1 year ago)

torrenting and seeding of pirated books

nerd

downloading them from libgen over http

soviet-chad

[-] Embargo@lemm.ee 34 points 1 year ago
[-] BountifulEggnog@hexbear.net 32 points 1 year ago

Wish I had 81tb of disk :sadness:

[-] Frogmanfromlake@hexbear.net 27 points 1 year ago

It “read” more books than most ever will and yet it still fails to write a decent story

[-] JasonDJ@lemmy.zip 8 points 1 year ago

People love stories. And who has a better story than Frogman from lake?

[-] Frogmanfromlake@hexbear.net 2 points 1 year ago

JasonDJ, of course. A literary giant among men.

[-] NephewAlphaBravo@hexbear.net 7 points 1 year ago

I can't write for shit either, where's my trillion dollar stock valuation?

[-] dRLY@hexbear.net 25 points 1 year ago

much less that Plaintiffs’ books were somehow distributed by Meta.

I guess Meta may have used settings to be leech-only. But unless they show that they did that (which is of course poor practice if torrenting), the nature of torrenting by default means that even one piece of a file being seeded to another user is "distribution."

[-] glans@hexbear.net 20 points 1 year ago

expropriation now

[-] Flocklesscrow@lemm.ee 19 points 1 year ago

So this will result in criminal charges against all involved, right?

Right?

[-] FuckyWucky@hexbear.net 14 points 1 year ago

:no-waying:

[-] nullpotential@lemmy.dbzer0.com 12 points 1 year ago

We all knew Meta was evil, but damn.

[-] mctoasterson@reddthat.com 12 points 1 year ago* (last edited 1 year ago)

Joke's on them, they could've easily connected to a number of IRC servers/channels through a basic proxy and used scripts to download at least as many books with relative anonymity... albeit slower.

[-] BeBrave@midwest.social 1 points 1 year ago

They don't follow the law.

this post was submitted on 07 Feb 2025
215 points (99.5% liked)
