FaceDeer@fedia.io 2 points 8 hours ago

You can predict how much a task will take in tokens. The accuracy of the prediction may not be perfect, but if you can ballpark it that can tell you a lot about what models to make use of.

Also, not all tokens are the same. Different models require different amounts and kinds of computing power to run. Using a very large context costs more per token because you need a computer with a lot of memory to fit it all. If you need it fast, that's more expensive than if you can take your time. Does the task involve vision or audio? Does the context need to be saved for an ongoing chat? Does it need to wait for tool calls to return between rounds? There are a lot of variables that can be tweaked to change what an AI call will cost, and a lot of those variables can be predicted without having to actually run the whole thing first.
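To make the ballparking idea concrete, here is a minimal sketch of a pre-run cost estimate. Everything in it is an assumption for illustration: the `ModelPrice` type, the `estimate_cost` helper, the per-million-token prices, and the `priority_multiplier` standing in for "fast costs more" are all made up, not any vendor's real rates or API.

```python
from dataclasses import dataclass


@dataclass
class ModelPrice:
    # Hypothetical prices in dollars per million tokens (not real vendor rates).
    input_per_mtok: float
    output_per_mtok: float


def estimate_cost(price, input_tokens, output_tokens, rounds=1, priority_multiplier=1.0):
    """Ballpark the cost of a task before running it.

    Simplification: every round (e.g. each tool-call round trip) is assumed
    to use the same number of input and output tokens.
    """
    per_round = (
        input_tokens * price.input_per_mtok
        + output_tokens * price.output_per_mtok
    ) / 1_000_000
    return per_round * rounds * priority_multiplier


# Two made-up price points: a small model and a large one.
small = ModelPrice(input_per_mtok=0.10, output_per_mtok=0.40)
large = ModelPrice(input_per_mtok=15.0, output_per_mtok=75.0)

# A guessed task size: 4k tokens in, 1k tokens out, three tool-call rounds.
task = dict(input_tokens=4_000, output_tokens=1_000, rounds=3)
small_cost = estimate_cost(small, **task)
large_cost = estimate_cost(large, **task)

print(f"small: ${small_cost:.4f}  large: ${large_cost:.4f}")
```

Even with rough token guesses, the gap between the two estimates is the kind of signal that tells you which model is worth considering for a task.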

The "cranking up" part has not even started yet, and we already have stories like Uber which blew through their complete AI budget for the year,

This is exactly what I'm talking about. Current LLM usage patterns tend to be pretty inefficient because people just throw tasks at the biggest and bestest models. Those models handle them, sure, because they're the biggest and bestest. But most tasks don't need that much.

I've used coding agents a fair bit along with the various other AI applications I've fiddled with, and often I ask them to do things that are dead simple. Create a function to sort some data and select whatever fits certain criteria. Add type checking to a file. Create a unit test for a function. Stuff like that could easily be done by a small local model, but the coding agent sends it off to Opus or whatever just like every other task. That can change.
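As a rough illustration of what that change could look like, here is a toy router that sends simple-looking tasks to a small model and everything else to a large one. This is only a sketch under stated assumptions: the keyword list, the token threshold, and the model names `small-local-model` and `large-hosted-model` are hypothetical, and a real coding agent would need a much better way to judge task difficulty.

```python
# Hypothetical hints that a task is simple, taken from the examples above.
SIMPLE_KEYWORDS = ("sort", "type checking", "unit test")


def pick_model(task_description, estimated_tokens, small_context_limit=8_000):
    """Choose a model tier from a task description and a token estimate.

    The names returned are placeholders, not real model identifiers.
    """
    text = task_description.lower()
    looks_simple = any(keyword in text for keyword in SIMPLE_KEYWORDS)
    # Only use the small model when the task looks simple AND fits its context.
    if looks_simple and estimated_tokens <= small_context_limit:
        return "small-local-model"
    return "large-hosted-model"


print(pick_model("Create a unit test for a function", 1_500))
print(pick_model("Redesign the service architecture", 1_500))
```

The point of the sketch is just the shape of the decision: a cheap check up front, with the expensive model as the fallback rather than the default.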

There still was no guarantee that the output was useable (and there can't be such a guarantee, since hallucinations are a statistical fact, increasing in occurrence with smaller amounts of training Data available).

I don't think you've used modern coding AIs much.

Or, for that matter, worked with human coders.

Remember, this is the "killer" application for LLMs.

There is no one single "killer" application for LLMs. They're about as general a computing platform as you can get.

Wildmimic@anarchist.nexus 2 points 8 hours ago

I used to think like you, and I am still pro local LLMs - I use them as tutors for areas I don't know much about, and since I use the output just as a guide and implement it on my own I quickly realize if something isn't right.

We will see - when OpenAI and Anthropic rush towards IPO this year, which was made very likely because SpaceX has upped the tempo - what the real costs are. If this article and others I've read in the last year are correct, and the prices have to go up x10 to break even, then we are in for a wild ride. I'm only grateful that for now they don't get lumped into the index funds.

this post was submitted on 07 Jun 2026
506 points (98.8% liked)

Technology
