this post was submitted on 12 May 2026
71 points (98.6% liked)
But, but, but, AI coding is the future and devs who don't use AI are gonna get left behind!!!!! You're just a stupid Luddite whose job will be replaced anyways!!!!!!
This benchmark presents AI with a challenge that's greater than what human devs normally face. It's supposed to be really hard, so it's not surprising that current models get 0%.
The point is that over time models will continue to improve and this benchmark will measure that improvement. A lot of current benchmarks have been saturated; once models are scoring near 100%, there's no point to them anymore.
It’s incredible how we went from everyone laughing at the YNGMI crypto bros to the entire economy being built on top of YNGMI AI bros.
No it's not, and part of that is the current legislative laissez-faire in the US that put its regulatory bodies on hiatus. Under normal circumstances, this stuff should have been under much more scrutiny and regulation. I'm not saying that the state should control what LLMs do or who has access to them, but regulators could very much tackle the deceptive marketing, environmental and societal impact, unsound financing, and abnormal market consolidation, and mitigate the overall financial risk.