
Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all. They just memorize patterns really well. (archive.is)

submitted 21 hours ago* (last edited 21 hours ago) by [email protected] to c/[email protected]


[-] [email protected] 2 points 19 hours ago

The difference between reasoning models and normal models is that reasoning models work in two steps. To oversimplify a little: they first prompt "how would you go about responding to this?", then prompt "write the response".

It's still predicting the most likely thing to come next, but the difference is that it gives the model the chance to write the most likely instructions to follow for the task, then the most likely result of following those instructions - both of which conform much more closely to patterns than a single jump from prompt to response.
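
A rough sketch of that two-step idea in Python, sticking to the oversimplified picture above. It is not how any particular vendor implements reasoning models: `complete` is a hypothetical stand-in for "send a prompt, get text back", and `two_step_response` and `fake_model` are names made up for illustration.

```python
from typing import Callable

# Hypothetical stand-in for a model call: takes a prompt, returns text.
Complete = Callable[[str], str]


def two_step_response(task: str, complete: Complete) -> str:
    """Ask for a plan first, then ask for the response with the plan in context."""
    # Step 1: the model writes the most likely instructions for the task.
    plan = complete(f"How would you go about responding to this?\n\n{task}")
    # Step 2: the model writes the most likely result of following them.
    return complete(
        f"Task:\n{task}\n\nPlan:\n{plan}\n\nWrite the response, following the plan."
    )


def fake_model(prompt: str) -> str:
    """Toy model for demonstration only; a real one would predict text."""
    if prompt.startswith("How would you go about"):
        return "1. Restate the question. 2. Answer it."
    return f"(response conditioned on {len(prompt)} characters of prompt)"


if __name__ == "__main__":
    print(two_step_response("Explain yarn workspaces.", fake_model))
```

The point is only that the second call sees the plan as part of its prompt, so the final answer is predicted from "task plus plan" rather than from the task alone.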

[-] [email protected] 6 points 19 hours ago

But it still manages to fuck it up.

I've been experimenting with using Claude's Sonnet model in Copilot in agent mode for my job, and one of the things that's become abundantly clear is that it has certain types of behavior that are heavily represented in the model, so it assumes you want that behavior even if you explicitly tell it you don't.

Say you're working in a yarn workspaces project, and you instruct Copilot to build and test a new dashboard using an instruction file. You'll need to include explicit and repeated reminders all throughout the file to use yarn, not npm, because even though yarn is very popular today, there are so many older examples of using npm in its model that it's just going to assume that's what you actually want - thereby fucking up your codebase.

I've also had lots of cases where I tell it I don't want it to edit any code, just to analyze and explain something that's there and how to update it... and then I have to stop it from editing code anyway, because halfway through it forgot that I didn't want edits, just explanations.

[-] [email protected] 2 points 16 hours ago* (last edited 16 hours ago)

To be fair, the world of JavaScript is such a clusterfuck... Can you really blame the LLM for needing constant reminders about the specifics of your project?

When a programming language has five hundred bazillion absolutely terrible ways of accomplishing a given thing—and endless absolutely awful code examples on the Internet to "learn from"—you're just asking for trouble. Not just from trying to get an LLM to produce what you want but also trying to get humans to do it.

This is why LLMs are so fucking good at writing Rust and Python: There's only so many ways to do a thing and the larger community pretty much always uses the same solutions.

JavaScript? How can it even keep up? You're using yarn today but in a year you'll probably be like, "fuuuuck this code is garbage... I need to convert this all to [new thing]."

[-] [email protected] 2 points 16 hours ago

That's only part of the problem. Yes, JavaScript is a fragmented clusterfuck. TypeScript is leagues better, but by no means perfect. Still, that doesn't explain why the LLM can't recall that I'm using Yarn while it's processing the instruction that specifically told it to use Yarn. Or why it tries to start editing code when I tell it not to. Those are still issues that aren't specific to the language.


this post was submitted on 08 Jun 2025

578 points (95.9% liked)
