
Car Wash Test on 53 leading AI models: "I want to wash my car. The car wash is 50 meters away. Should I walk or drive?" (opper.ai)

submitted 19 hours ago by fubarx@lemmy.world to c/technology@lemmy.world


Screenshot of this question was making the rounds last week. But this article covers testing against all the well-known models out there.

Also includes outtakes on the 'reasoning' models.


[-] TankovayaDiviziya@lemmy.world 4 points 5 hours ago* (last edited 53 minutes ago)

We poked fun at this meme, but it goes to show that an LLM is still like a child that needs to be taught to make implicit assumptions and possess contextual knowledge. Current LLMs need a lot more input and instruction to do specifically what you want them to do, like a child.

Edit: I know Lemmy scoffs at LLMs, but people probably also used to scoff at Verbiest's steam machine, saying it would never amount to anything. Give it time and it will improve. I'm not endorsing AI, by the way. I am on the fence about its long-term consequences, but whether people like it or not, AI will impact human lives.

[-] rob_t_firefly@lemmy.world 11 points 3 hours ago* (last edited 3 hours ago)

LLMs are not children. Children can have experiences, learn things, know things, and grow. Spicy autocomplete will never actually do any of these things.

[-] TankovayaDiviziya@lemmy.world 2 points 51 minutes ago

I'm sure AI will do those things at some point. Nobody expected the same of our microorganism ancestors.

[-] rob_t_firefly@lemmy.world 2 points 16 minutes ago* (last edited 16 minutes ago)

Our microorganism ancestors also did all those things, and they were far beyond anything an LLM can do. Turning a given list of words into numbers, doing a string of math to those numbers, and turning the resulting numbers back into words is not consciousness or wisdom and never will be.

[-] kshade@lemmy.world 13 points 3 hours ago

We have already thrown just about all of the Internet and then some at them. It shows that LLMs cannot think or reason, which isn't surprising: they weren't meant to.

[-] eronth@lemmy.world -4 points 3 hours ago

Or at least they can't reason the way we do about our physical world.

[-] Nalivai@lemmy.world 4 points 2 hours ago

You're falling into the same trap. When the letters on the screen tell you something, it's not necessarily the truth. When "I'm reasoning" is written in a chatbot window, it doesn't mean that there is something that's reasoning.

[-] zalgotext@sh.itjust.works 13 points 3 hours ago

No, they cannot reason, by any definition of the word. LLMs are statistics-based autocomplete tools. They don't understand what they generate, they're just really good at guessing how words should be strung together based on complicated statistics.

[-] sturmblast@lemmy.world 1 points 2 hours ago

LLMs are a long long way from primetime

[-] Nalivai@lemmy.world 5 points 2 hours ago

By now it's getting clear that fundamentally this is the best version of the thing that we'll get. This is the primetime.
For some time, there was a legit question of "if we give it enough data, will there be a qualitative jump", and as far as we can see right now, we're way past this jump. A predictive algorithm can form grammatically correct sentences that are related to the context. That's it, that's the jump.
Now a bunch of salespeople are trying to convince us that if there was one jump, there necessarily will be others, while there is no real indication of that.

[-] prole@lemmy.blahaj.zone 6 points 5 hours ago

I'm sure it'll be worth it at some point 🙄

this post was submitted on 23 Feb 2026

490 points (97.3% liked)
