submitted 2 years ago* (last edited 2 years ago) by [email protected] to c/[email protected]

Deleted

[-] [email protected] 5 points 2 years ago* (last edited 2 years ago)

LLMs, IIRC, are really bad at IQ-test type questions that require abstract reasoning, especially if they require multiple steps. So, something like

The box is yellow and red.
If the box is yellow, it is good.
If the box is blue, it is unhappy.
If the box is good and happy, the box is awesome.
If the box is red, it is happy.
Is the box awesome?

is what I'd use.
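
For what it's worth, a puzzle like this is a small forward-chaining problem: start from the known facts and keep applying rules until nothing new can be derived. Here is a minimal sketch in Python, assuming each rule is encoded as a (set of premises, conclusion) pair. The names `forward_chain`, `FACTS`, and `RULES` are made up for illustration and are not from any library.

```python
# Facts and rules copied from the puzzle; the encoding itself is an assumption.
FACTS = {"yellow", "red"}
RULES = [
    ({"yellow"}, "good"),
    ({"blue"}, "unhappy"),
    ({"good", "happy"}, "awesome"),
    ({"red"}, "happy"),
]


def forward_chain(facts, rules):
    """Apply rules repeatedly until no new property can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # A rule fires when all of its premises are already known.
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


derived = forward_chain(FACTS, RULES)
print("awesome" in derived)  # True
```

Note that "awesome" only shows up on the second pass over the rules, because "happy" is derived after the "good and happy" rule has already been checked once. That is the multi-step part of the question.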

[-] [email protected] 7 points 2 years ago

From ChatGPT 4:

Yes, the box is awesome.

According to the provided statements, the box is yellow and red.

"If the box is yellow, it is good." - So, since the box is yellow, it is good.

"If the box is red, it is happy." - And since the box is red, it is happy.

Finally, "If the box is good and happy, the box is awesome." - Therefore, because the box is both good (due to being yellow) and happy (due to being red), the box is indeed awesome.

[-] [email protected] 4 points 2 years ago* (last edited 2 years ago)

Bard, however, does not seem to get the answer right:

The answer to the trick question is indeterminate.

The box is both yellow and red, so it is both good and happy. However, the box is not blue, so it is not unhappy. Therefore, the box could be awesome, but it could also not be awesome.

The trick question relies on the ambiguity of the word "awesome." In some contexts, "awesome" means "extremely good." In other contexts, "awesome" means "inspiring awe." The box could be either of these things, or it could be neither.

Without more information, it is impossible to say definitively whether or not the box is awesome.

Seems like it got kind of close with "The box is both yellow and red, so it is both good and happy"... but then it falls apart afterwards.

Edit: I tried to debate with it:

Me:

To me, the answer is "Yes, the box is awesome" because the box is yellow thus it is "good" and it is also red, thus it is "happy". Because the box is "good" and "happy", the question defines the box as "awesome" - if that all makes sense!

Bard:

Yes, your interpretation makes sense. The box is yellow, so it is good. The box is red, so it is happy. Therefore, the box is both good and happy, which means that the box is awesome.

However, as I mentioned before, the word "awesome" is ambiguous. It could mean "extremely good" or "inspiring awe." In the context of this trick question, it is not clear which definition of "awesome" is intended. Therefore, the answer to the question is still indeterminate.

Ultimately, the interpretation of the trick question is up to the individual. If you believe that the box is awesome because it is both good and happy, then that is a valid interpretation. However, if you believe that the word "awesome" has a more specific meaning, then you may interpret the question differently.

Which is interesting to say the least; it's almost like it's looking a bit too deeply into the question lol.

[-] [email protected] 2 points 2 years ago

Bing had no trouble

The box is yellow and red.
If the box is yellow, it is good.
If the box is blue, it is unhappy.
If the box is good and happy, the box is awesome.
If the box is red, it is happy.
Is the box awesome?

To answer this question, we need to check if the box is both good and happy.

We know that the box is yellow and red.
We also know that if the box is yellow, it is good.
Therefore, we can conclude that the box is good.

We also know that if the box is red, it is happy.
Therefore, we can conclude that the box is happy.

Since the box is both good and happy, we can use the rule that if the box is good and happy, the box is awesome.

Therefore, the answer is yes, the box is awesome.

[-] [email protected] 2 points 2 years ago

Bing is GPT4-based, though I don't think it's the same version as ChatGPT. Either way, GPT4 can solve these types of problems all day.

[-] [email protected] 2 points 2 years ago

Not surprised. I got access to Bard a while back and it does quite a lot more hallucinating than even GPT3.5.

Though doubling down on a wrong answer even after being corrected is something I've seen GPT4 do in some cases too. It seems like once it says something, it usually sticks to it.

[-] [email protected] 3 points 2 years ago

That's terrifyingly good wtf

[-] [email protected] 2 points 2 years ago

I was going to say you could give it a math problem that uses big numbers, but I tried one on GPT4 and it succeeded. GPT3, though, will absolutely fail at nontrivial math every time.

[-] [email protected] 2 points 2 years ago

Darn, foiled again. You win this round, ChatGPT.

[-] [email protected] 7 points 2 years ago

That's just an algebraic equation, but with words.

[-] [email protected] 6 points 2 years ago

This sounds like something ChatGPT could easily convert. Although that is computationally expensive.

It would also have the side effect of blocking all math-illiterate people from creating accounts.

[-] [email protected] 1 points 2 years ago* (last edited 2 years ago)

A + B = C
A = X
D != Y
X + Y = Z
B = Y
Does C = Z?
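
The substitution can be spot-checked by brute force: try every small integer assignment, keep the ones that satisfy the premises, and confirm C equals Z in all of them. This is a rough sketch, not a proof; the function name `premises_imply_conclusion` and the search bound `limit` are made up for illustration.

```python
from itertools import product


def premises_imply_conclusion(limit=4):
    """Check C == Z for every small assignment that satisfies the premises."""
    for a, b, d, x, y in product(range(limit), repeat=5):
        c = a + b  # A + B = C
        z = x + y  # X + Y = Z
        # Remaining premises: A = X, B = Y, D != Y
        if a == x and b == y and d != y:
            if c != z:
                return False
    return True


print(premises_imply_conclusion())  # True
```

The D != Y line never affects the result, which matches the "blue" rule in the word version: it is a distractor.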

[-] [email protected] 1 points 2 years ago

exactly

[-] [email protected] 5 points 2 years ago

Um wtf, I'm starting to doubt if I'm a human. 🤔

this post was submitted on 26 Jun 2023

119 points (97.6% liked)
