this post was submitted on 13 May 2025
467 points (100.0% liked)
TechTakes
Hallucinations become almost a non-issue when working with newer models, custom inference, multi-shot prompting, and RAG
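For anyone who doesn't speak the jargon in that claim, a rough sketch of the kind of "RAG plus multi-shot prompting" pipeline it's describing might look like the following. Every name here is a hypothetical placeholder, not any particular vendor's API:

```python
# Rough sketch of the setup the quote describes: retrieval-augmented
# generation (RAG) plus multi-shot prompting. Every function here is a
# hypothetical placeholder -- plug in whatever retriever/model you use.

def retrieve_documents(query: str, k: int = 5) -> list[str]:
    """Hypothetical retriever: return the k most relevant text chunks
    from an external knowledge store (vector DB, search index, ...)."""
    raise NotImplementedError("supply your own retrieval backend")

def call_llm(prompt: str) -> str:
    """Hypothetical wrapper around whatever model endpoint you use."""
    raise NotImplementedError("supply your own model client")

def answer_with_rag(question: str, shots: list[tuple[str, str]]) -> str:
    # Multi-shot prompting: prepend a few worked question/answer examples.
    examples = "\n\n".join(f"Q: {q}\nA: {a}" for q, a in shots)
    # RAG: ground the prompt in retrieved text so the model has concrete
    # material to draw facts from instead of free-associating.
    context = "\n".join(retrieve_documents(question))
    prompt = (
        f"{examples}\n\n"
        "Answer using only the context below.\n"
        f"Context:\n{context}\n\n"
        f"Q: {question}\nA:"
    )
    return call_llm(prompt)
```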
But the models themselves fundamentally can't write good, new code, even if they're perfectly factual
The promptfarmers can push the hallucination rate incrementally lower by spending 10x the compute on training (and training on 10x the data and spending 10x on runtime cost), but they're already consuming a plurality of all VC funding, so they can't 10x many more times without going bust entirely. And they aren't going to get it down to 0%: hallucinations are intrinsic to how LLMs operate, and no patch of run-time inference tricks, multiple tries, or RAG will eliminate that.
And as for newer models... o3 actually had a higher hallucination rate than its predecessor, because trying to squeeze rational logic out of the models with fine-tuning just breaks them in a different direction.
I will acknowledge that in domains with analytically verifiable answers you can check the LLM's output that way, but at that point it's no longer primarily an LLM: you've got an entire expert system or proof assistant or whatever that can operate independently of the LLM, and the LLM is just providing creative input.
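That division of labour is easy to make concrete. A minimal sketch, assuming a hypothetical model client and a hypothetical checker (neither is a real library):

```python
# Minimal sketch of the "analytically verifiable answers" setup described
# above: the LLM only proposes candidates, and an independent checker
# (test suite, proof assistant, SAT solver, ...) decides what is accepted.
# Both helpers are hypothetical stand-ins, not any real API.

def propose_candidate(problem: str) -> str:
    """Hypothetical LLM call that returns a candidate solution."""
    raise NotImplementedError("supply your own model client")

def verify(problem: str, candidate: str) -> bool:
    """Hypothetical verifier that checks the candidate analytically,
    e.g. by running unit tests or replaying a machine-checked proof.
    It works without trusting the LLM at all."""
    raise NotImplementedError("supply your own checker")

def solve(problem: str, max_tries: int = 10) -> str | None:
    # The LLM supplies the creative input; correctness comes entirely
    # from the verifier, which is free to reject every candidate.
    for _ in range(max_tries):
        candidate = propose_candidate(problem)
        if verify(problem, candidate):
            return candidate
    return None  # no verified answer beats a confident hallucination
```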
We should maximise hallucinations, actually. That is, we should hack the environmental controls of the data centers to be conducive to fungal growth, and flood them with magic mushroom spores. We can probably get the rats on board by selling it as a different version of nuking the data centers.
What if [tokes joint] hallucinations are actually, like, proof the models are almost at human level man!
Sadly I have seen people make that exact point
stopping this bit here because I don't want to continue writing a JRE episode
@swlabr @scruiser Java Runtime Environment?
no the worse one
Doesn't really narrow it down, sorry
(jk I don't have beef with the JavaRE)
Joe Rogan Experience!
...side note: my most prominent IRL conversation about Joe Rogan was with a relative who was trying to convince me it was a good thing that Joe Rogan platformed a celebrity (Terrence Howard) who claims that 1x1=2. Literally beyond parody.
Ok, we are already very far off topic. I actually have heard an interesting take about Terrence Howard and his “New Math”.
(NB: I have an undergrad major in mathematics)
So, one of my favourite comedy podcasts is “My Momma Told Me,” a podcast that talks about black conspiracy theories. In each episode, the topic is framed as “my momma told me <insert conspiracy theory here>”. Terrence Howard comes up as a sort of mythological icon on the pod; it’s very funny. On a recent episode they actually get to FaceTime him, it rules.
Anyway, the take comes from the host Langston Kerman (who does not have a major in mathematics, nor any background in science). It’s a very charitable interpretation: the whole 1x1=2 stuff isn’t so much about creating a new math with different rules as it is an expression of frustration toward the algorithms and calculations that are part of the power structures in society that disadvantage and disenfranchise minority populations. Because this is all being thought up and formulated by people outside the discipline, anything that looks like arithmetic reads as “math,” so a “new math” is needed.
Basically: it’s kafkaesque. But yeah more likely than not Terrence Howard has gone off the deep end.
I hadn’t encountered the Howard person or heard of this podcast, but imma find that episode and listen because it sounds like quite an experience!