this post was submitted on 28 Sep 2024
644 points (99.2% liked)

Science Memes


Welcome to c/science_memes @ Mander.xyz!

 
all 41 comments
[–] [email protected] 116 points 1 month ago* (last edited 1 month ago) (2 children)

How many of these books will be total garbage, written just to fulfill a prearranged quota?

Now the LLMs are filled with a good amount of nonsense.

[–] [email protected] 60 points 1 month ago (2 children)

Just use the LLM to make the books that the LLM then uses. What could go wrong?

[–] [email protected] 31 points 1 month ago (5 children)

Someone's probably already coined the term, but I'm going to call it LLM inbreeding.

[–] [email protected] 18 points 1 month ago

I suggested this term in academic circles, as a joke.

I also suggested hallucinations ~3-6 years ago only to find out it was ALSO suggested in the 1970s.

Inbreeding, lol

[–] [email protected] 4 points 1 month ago (1 children)

The real term is "synthetic data".

[–] [email protected] 3 points 1 month ago (1 children)

But it amounts to about the same thing.

[–] [email protected] 4 points 1 month ago

In computer science, garbage in, garbage out (GIGO) is the concept that flawed, biased or poor quality ("garbage") information or input produces a result or output of similar ("garbage") quality. The adage points to the need to improve data quality in, for example, programming.

There was some research article applying this 70s computer science concept to LLMs. It was published in Nature and hit major news outlets. Basically they further trained GPT on its output for a couple generations, until the model degraded terribly. Sounded obvious to me, but seeing it happen on the www is painful nonetheless...
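The degradation described above can be sketched with a toy example (my own illustration, not the paper's actual setup): treat "training" as fitting a one-parameter Gaussian, and let every generation train only on samples produced by the previous generation's fit. The fitted spread drifts toward zero, i.e. the model collapses onto a degenerate distribution.

```python
import random
import statistics

# Toy sketch of model collapse (an assumed, simplified analogue of the
# experiment described above): generation 0 is "real" data from a standard
# Gaussian; every later generation is fitted purely on synthetic samples
# drawn from the previous generation's fitted model.
random.seed(42)

mu, sigma = 0.0, 1.0      # the "real" data distribution
n_samples = 20            # small training sets make the collapse fast
sigmas = [sigma]

for generation in range(1000):
    synthetic = [random.gauss(mu, sigma) for _ in range(n_samples)]
    mu = statistics.mean(synthetic)     # refit only on synthetic output
    sigma = statistics.stdev(synthetic)
    sigmas.append(sigma)

print(f"fitted sigma: gen 0 = {sigmas[0]:.4f}, gen 1000 = {sigmas[-1]:.3g}")
```

Each refit slightly underestimates the spread on average, and with no fresh real data those small errors compound multiplicatively, so the fitted sigma shrinks toward zero over the generations, which is the toy version of "garbage in, garbage out" applied recursively.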

[–] [email protected] 3 points 1 month ago

It's quite similar to another situation known as data incest

[–] [email protected] 3 points 1 month ago

Soylent AI? Auto-infocannibalism

[–] [email protected] 3 points 1 month ago

It can only go right because corporations must be punished for trying to replace people with machines.

[–] [email protected] 4 points 1 month ago

That would be terrible because they are both some of the best academic publishers in the humanities.

[–] [email protected] 49 points 1 month ago (4 children)

And they expect you to do this for free?

[–] [email protected] 25 points 1 month ago (1 children)

Do they not have to pay for the privilege? Or is this not referring to academic publishing? (It’s not super clear, but context indicates academic?)

[–] [email protected] 27 points 1 month ago

If it is, that makes it even worse. Academic publishers need to be abolished.

[–] [email protected] 11 points 1 month ago

Nah, they get “Exposure”!

/s

[–] [email protected] 7 points 1 month ago (2 children)

Anyone who reviews for the major publishers is part of the problem.

[–] [email protected] 7 points 1 month ago

For-profit corporations don't deserve your volunteer work.

[–] [email protected] 2 points 1 month ago (1 children)

And yet if you aren't a reviewer it makes your CV look worse.

[–] [email protected] 2 points 1 month ago (1 children)

Agreed that you should have some kind of "service" on your CV, but reviewing is pretty low impact. And if you want to review, you can choose something other than the predatory publishers.

[–] [email protected] 1 points 1 month ago

Such as? They're all predatory, just to varying degrees.

[–] [email protected] 48 points 1 month ago

Feed the LLM with LLM-generated books. No resentment at all!

[–] [email protected] 46 points 1 month ago

Jfc that's gross

[–] [email protected] 17 points 1 month ago

So what you're saying is, don't beat the targets because fuck those guys. Understood.

[–] [email protected] 4 points 1 month ago

What's the academic terminology for "go pound sand"?

[–] [email protected] 2 points 1 month ago* (last edited 1 month ago) (1 children)

Soylent Green is a lie anyway. You'd need to "soylentify" half the population every year to feed the other half if it were the only source of calories.

[–] [email protected] 29 points 1 month ago

No, the point is that they're just recycling the dissidents they were going to murder anyway.