"Just a few more trillion dollars bro, then it'll be ready..." Like a junkie.
It's safe to assume that any metric they don't disclose is quite damning to them. Plus, these guys don't really care about the environmental impact, or what us tree-hugging environmentalists think. I'm assuming the only group they are scared of upsetting right now is investors. The thing is, even if you don't care about the environment, the problem with LLMs is how poorly they scale.
An important concept when evaluating how something scales is marginal values, chiefly marginal utility and marginal expenses. Marginal utility is how much utility you get from one more unit of whatever. Marginal expenses is how much it costs to get one more unit. And what LLMs produce is the probability that a token, T, follows a prefix, Q. So P(T|Q) (read: probability of T, given Q). This is done for all known tokens, and then based on these probabilities, one token is chosen at random. This token is then appended to the prefix, and the process repeats, until the LLM produces a sequence which indicates that it's done talking.
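A minimal sketch of that sampling loop, in case it helps. The lookup table below is a made-up stand-in for P(T|Q) that only looks at the last token; a real LLM computes these probabilities with a neural network over the whole prefix. The names `TOY_PROBS`, `next_token_probs`, `generate` and the `<end>` token are all hypothetical.

```python
import random

END = "<end>"  # hypothetical marker meaning "done talking"

# Toy stand-in for P(T|Q). Assumption: it depends only on the last token,
# which is far simpler than what a real model does.
TOY_PROBS = {
    None: {"the": 0.7, "a": 0.3},
    "the": {"cat": 0.5, "dog": 0.5},
    "a": {"cat": 0.5, "dog": 0.5},
    "cat": {"sleeps": 0.6, END: 0.4},
    "dog": {"sleeps": 0.6, END: 0.4},
    "sleeps": {END: 1.0},
}


def next_token_probs(prefix):
    """Return a dict mapping each candidate token T to P(T|Q) for prefix Q."""
    last = prefix[-1] if prefix else None
    return TOY_PROBS[last]


def generate(rng, max_tokens=20):
    """Sample one token at a time and append it until the end marker shows up."""
    prefix = []
    for _ in range(max_tokens):
        probs = next_token_probs(prefix)
        tokens = list(probs)
        weights = [probs[t] for t in tokens]
        # The weighted "dice roll": pick one token according to its probability.
        token = rng.choices(tokens, weights=weights, k=1)[0]
        if token == END:
            break
        prefix.append(token)
    return prefix


if __name__ == "__main__":
    print(" ".join(generate(random.Random())))
```

Because each step is a weighted random pick, two runs over the same prefix can give different continuations.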
If we now imagine the best possible LLM, then the calculated value for P(T|Q) would be the actual value. However, it's worth noting that this already displays a limitation of LLMs. Namely, even if we use this ideal LLM, we're just a few bad dice rolls away from saying something dumb, which then pollutes the context. And the larger we make the LLM, the closer its results get to the actual value. A potential way to measure this precision would be to subtract P(T|Q) from P_calc(T|Q) and count the leading zeroes, essentially counting the number of digits we got right. Now, the thing is that each additional digit only provides a tenth of the utility of the digit before it, while the cost for additional digits goes up exponentially.
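The "count the leading zeroes" measure could be sketched roughly like this. It is just one way to write it down; the function name `correct_digits` is made up, and the true P(T|Q) is of course not something anyone can actually look up.

```python
import math


def correct_digits(p_true, p_calc):
    """Count the zeroes after the decimal point in |p_calc - p_true|.

    Returns math.inf when the two values match exactly.
    """
    error = abs(p_calc - p_true)
    if error == 0:
        return math.inf
    # -log10(error) grows by one for every extra leading zero in the error.
    return max(0, math.floor(-math.log10(error)))


if __name__ == "__main__":
    # The error is about 0.0037, so two digits after the decimal point are right.
    print(correct_digits(0.25, 0.2537))
```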
So, exponentially decaying marginal utility meets exponentially growing marginal expenses. Which is really bad for companies that try to market LLMs.
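In symbols, taking the "a tenth per digit" claim at face value and assuming (my assumption, not a measured figure) that the n-th correct digit costs c times b to the n for some constant c > 0 and base b > 1:

```latex
\[
  MU(n) = 10^{-n}, \qquad
  MC(n) = c\, b^{\,n}, \qquad
  \frac{MU(n)}{MC(n)} = \frac{1}{c}\,(10\,b)^{-n}
\]
```

So under those assumptions the utility you get per unit of cost shrinks by a factor of 10b with every extra digit.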
Well I mean also that they kinda suck, I feel like I spend more time debugging AI code than I get working code.
I only use it if I’m stuck. Even if the AI code is wrong, it often pushes me in the right direction to find the correct solution for my problem. Like pair programming but a bit shitty.
The best way to use these LLMs with coding is to never use the generated code directly and atomize your problem into smaller questions you ask the LLM.
So duck programming right?
And fancier intellisense
Obviously it's higher. If it was any lower, they would've made a huge announcement out of it to prove they're better than the competition.
I’m thinking otherwise. I think GPT5 is a much smaller model - with some fallback to previous models if required.
Since it’s running on the exact same hardware with a mostly similar algorithm, using less energy would directly mean it’s a “less intense” model, which translates into an inferior quality in American Investor Language (AIL).
And 2025’s investors don’t give a flying fuck about energy efficiency.
And they don't want to disclose the energy efficiency becaaaause ... ?
Because the AI industry is a bubble that exists to sell more GPUs and drive fossil fuel demand
They probably wouldn't really care how efficient it is, but they certainly would care that the costs are lower.
I get the distinct impression that most of the focus for GPT5 was making it easier to divert their overflowing volume of queries to less expensive routes.
If anyone has ever wondered what it would look like if tech giants went all in on "brute force" programming, this is it. This is what it looks like.
They literally don't know. "GPT-5" is several models, with a gating model in front to choose which model to use depending on how "hard" it thinks the question is. They've already been tweaking the front-end to change how it cuts over. They're definitely going to keep changing it.
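For anyone wondering what a gate like that looks like in principle, here is a toy sketch. To be clear, this is not how OpenAI does it: the real router is not public, and the heuristic, the threshold, and the model names below are all invented for illustration.

```python
# Hypothetical keywords that make a prompt look "hard" to this toy gate.
HARD_HINTS = ("prove", "debug", "step by step")


def estimate_difficulty(prompt):
    """Crude made-up difficulty score in [0, 1] based on length and keywords."""
    score = min(len(prompt.split()) / 100, 1.0)
    if any(hint in prompt.lower() for hint in HARD_HINTS):
        score = max(score, 0.8)
    return score


def route(prompt, threshold=0.5):
    """Send 'hard' prompts to the expensive model and everything else to the cheap one."""
    if estimate_difficulty(prompt) >= threshold:
        return "large-model"  # hypothetical name
    return "small-model"  # hypothetical name


if __name__ == "__main__":
    print(route("What is 2+2?"))
    print(route("Please debug this race condition"))
```

Moving `threshold` is the kind of "cut over" tweak that shifts how many queries land on the cheaper route.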
Is it this?
intense electricity demands, and WATER for cooling.
Sam Altman looks like an SNL actor impersonating Sam Altman.
Duh. Every company like this "suddenly" starts withholding public progress reports once their progress fucking goes downhill. Stop giving these parasites handouts.
Pump it Sammy, pump it harder!!
I have to test it with Copilot for work. So far, in my experience its "enhanced capabilities" mostly involve doing things I didn't ask it to do extremely quickly. For example, it massively fucked up the CSS in an experimental project when I instructed it to extract a React element into its own file.
That's literally all I wanted it to do, yet it took it upon itself to make all sorts of changes to styling for the entire application. I ended up reverting all of its changes and extracting the element myself.
Suffice it to say, I will not be recommending GPT 5 going forward.
Sounds like you forgot to instruct it to do a good job.
That's my problem with "AI" in general. It's seemingly impossible to "engineer" a complete piece of software when using LLMs in any capacity that isn't editing a line or two inside singular functions. Too many times I've asked GPT/Gemini to make a small change to a file and had to revert the request because it'd take it upon itself to re-engineer the architecture of my entire application.
We moved to M365 and were encouraged to try new elements. I gave Copilot an Excel sheet, told it to add 5% to each percent in column B and not to go over 100%. It spat out jumbled up data all reading 6000%.
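For reference, the task itself is about one line of code. A sketch, assuming "add 5%" means adding 5 percentage points (and not multiplying by 1.05) and that column B is already available as a list of numbers; the function name is made up.

```python
def add_five_points(percentages, bump=5.0, cap=100.0):
    """Add `bump` percentage points to each value, never going above `cap`."""
    return [min(p + bump, cap) for p in percentages]


if __name__ == "__main__":
    column_b = [10.0, 96.0, 100.0]  # hypothetical sample values
    print(add_five_points(column_b))
```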
When will genAI be so good, it'll solve its own energy crisis?
Most certainly it won't happen until after AI has developed a self-preservation bias. It's too bad the solution is turning off the AI.
When you want to create the shiniest honeypot, you need high power consumption.
It's the same tech. It would have to be bigger or chew through "reasoning" tokens to beat benchmarks. So yeah, of course it is.
How can anyone look at that face and trust anything that madman could have to say?
Photographer1: Sam, could you give us a goofier face?
*click* *click*
Photographer2: Goofier!!
*click* *click* *click* *click*
Is there any picture of the guy without his hand up like that?