this post was submitted on 27 Jun 2024
126 points (98.5% liked)

the_dunk_tank


[–] [email protected] 4 points 3 months ago (3 children)

Didn't they manage to make it somewhat good at solving certain math competition problems? Regardless, it's a pretty big jump from that to making a breakthrough in physics.

[–] [email protected] 5 points 3 months ago (1 children)

maybe certain ones, but it's generally bad with numbers and mathematical reasoning. he also gets paid to make it fail at math, and it's arguably worse at basic math than physics.

[–] [email protected] 3 points 3 months ago

Very excited to someday have a computer that can do math problems

[–] [email protected] 3 points 3 months ago

I think they had to connect it to Wolfram Alpha

[–] [email protected] 3 points 3 months ago

Yeah, DeepMind had good results with IMO problems, but only geometry problems. They scored almost at the level of a gold medalist. That's only a fraction of IMO problems, though. They did it by combining a formal verification system with an LLM to propose solution paths, and then doing some tree search, I think.
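To make the "propose, verify, search" idea concrete, here's a toy sketch in Python. It is not DeepMind's system: `propose`, `verify`, and `search` are made-up names, the "problem" is just reaching a target number from a positive start, and the proposer is a hard-coded stand-in for an LLM that sometimes suggests a wrong step.

```python
from collections import deque

def propose(state):
    """Stand-in for the LLM: suggest candidate next steps, one of them wrong."""
    return [
        ("add 3", state + 3),
        ("double", state * 2),
        ("double", state * 2 + 1),  # deliberately incorrect claim
    ]

def verify(state, step, claimed):
    """Stand-in for the formal checker: recompute the step exactly."""
    if step == "add 3":
        return claimed == state + 3
    if step == "double":
        return claimed == state * 2
    return False

def search(start, goal, max_depth=6):
    """Breadth-first tree search that only expands verified steps.

    Assumes start > 0, so every step increases the state and anything
    above the goal can be pruned.
    """
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if state == goal:
            return path
        if len(path) >= max_depth:
            continue
        for step, claimed in propose(state):
            if not verify(state, step, claimed):
                continue  # the checker rejects the proposer's bad step
            if claimed in seen or claimed > goal:
                continue
            seen.add(claimed)
            queue.append((claimed, path + [step]))
    return None

print(search(1, 11))
print(search(1, 3))
```

The point of the split is that the proposer is allowed to be wrong, because nothing gets into the final path unless the checker accepts it.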

This is one way to improve large AI systems and will probably be incorporated in some way in the future, for example by integrating with a language like Lean (for math proofs).
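For anyone who hasn't seen it, this is roughly what a machine-checkable statement looks like in Lean 4. These are trivial examples I picked for illustration (the theorem name is made up); the relevant part is that the proof either type-checks or it doesn't, so a model can't bluff its way through.

```lean
-- Checked by computation: both sides reduce to the same number.
example : 2 + 2 = 4 := rfl

-- Checked by appeal to an existing library lemma.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```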

They will also be improved by combining with tool use like calculators, code interpreters, web search, calendars, etc. This is already starting to happen to some extent.
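A minimal sketch of what calculator tool use can look like, assuming a made-up convention where the model writes `CALC[expression]` instead of guessing the number itself. The marker format and the function names `calculator` and `fill_tool_calls` are my own for illustration, not any real product's API.

```python
import ast
import operator
import re

# Only plain arithmetic is allowed; anything else is rejected.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.USub: operator.neg,
}

def calculator(expr):
    """Evaluate an arithmetic expression exactly, without using eval()."""
    def ev(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.left), ev(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.operand))
        raise ValueError("unsupported expression")
    return ev(ast.parse(expr, mode="eval").body)

def fill_tool_calls(text):
    """Replace every CALC[...] marker in the model output with the tool's result."""
    return re.sub(r"CALC\[(.*?)\]", lambda m: str(calculator(m.group(1))), text)

print(fill_tool_calls("12345 * 6789 = CALC[12345 * 6789]"))
```

The model only has to decide when to call the tool and what to pass it; the arithmetic itself is done by ordinary code.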

LLMs by themselves, at least with current architectures using transformers, are not great at reasoning (counting, arithmetic, symbolic reasoning).