Well, I did say it would be nice if people made reasoned comments instead of just repeating whatever makes them popular.
But ultimately the question this thread is about is "What's behind the growing backlash towards AI data centers?" And I'm answering it. I think the backlash is a moral panic and it's magnified by the nature of fora like this one.
You can predict how many tokens a task will take. The prediction won't be perfect, but if you can ballpark it, that tells you a lot about which models to use.
Also, not all tokens are the same. Different models require different amounts and kinds of computing power to run. Using a very large context costs more per token because you need a computer with a lot of memory to fit it all. If you need it fast, that's more expensive than if you can take your time. Does the task involve vision or audio? Does the context need to be saved for an ongoing chat? Does it need to wait for tool calls to return between rounds? There are a lot of variables that can be tweaked to change what an AI call costs, and a lot of those variables can be predicted without having to actually run the whole thing first.
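To make that concrete, here's a rough sketch of what a ballpark estimate could look like. Everything in it is an assumption for illustration: the prices, the multipliers, the field names, and the "about four characters per token" rule of thumb are made up or approximate, not any provider's actual pricing or API.

```python
from dataclasses import dataclass


def rough_token_count(text: str) -> int:
    # Rule-of-thumb assumption: roughly 4 characters per token for English text.
    # A real tokenizer would give a more accurate number.
    return max(1, len(text) // 4)


@dataclass
class CallEstimate:
    input_tokens: int
    output_tokens: int
    price_in_per_mtok: float             # hypothetical price per million input tokens
    price_out_per_mtok: float            # hypothetical price per million output tokens
    long_context_multiplier: float = 1.0  # assumed surcharge for very large contexts
    batch_discount: float = 0.0           # assumed discount if you can wait for the result
    tool_rounds: int = 0                  # assume each tool round resends the input context


def estimate_cost(e: CallEstimate) -> float:
    # The input context is paid for once, plus once more per tool-call round.
    rounds = 1 + e.tool_rounds
    input_cost = e.input_tokens * rounds * e.price_in_per_mtok / 1_000_000
    output_cost = e.output_tokens * e.price_out_per_mtok / 1_000_000
    total = (input_cost + output_cost) * e.long_context_multiplier
    return total * (1.0 - e.batch_discount)
```

None of this needs the model to run first. You plug in the prompt size, a guess at the output length, and whether you need it fast, and you get a number you can compare across models.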
This is exactly what I'm talking about. Current LLM usage patterns tend to be pretty inefficient because people just throw tasks at the biggest and bestest models. Those models handle them, sure, because they're the biggest and bestest. But most tasks don't need that much.
I've used coding agents a fair bit along with the various other AI applications I've fiddled with, and often I ask them to do things that are dead simple. Create a function to sort some data and select whatever fits certain criteria. Add type checking to a file. Create a unit test for a function. Stuff like that could easily be done by a small local model, but the coding agent sends it off to Opus or whatever just like every other task. That can change.
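As a toy illustration of what that change could look like, here's a minimal routing sketch. The model names, the keyword list, and the context threshold are all hypothetical; a real router would presumably use something smarter than substring matching, but the idea is the same: decide up front whether the task needs the big model at all.

```python
# Hypothetical hints that a request is a small, self-contained coding chore.
SIMPLE_HINTS = ("sort", "type check", "unit test", "rename", "docstring")


def pick_model(task: str, context_tokens: int, small_context_limit: int = 8000) -> str:
    """Return a made-up model tier name for the given task description."""
    text = task.lower()
    looks_simple = any(hint in text for hint in SIMPLE_HINTS)
    # Only use the small model when the task looks simple AND the context fits.
    if looks_simple and context_tokens <= small_context_limit:
        return "small-local"
    return "large-hosted"


print(pick_model("Create a unit test for a function", 1200))
print(pick_model("Redesign the auth architecture", 1200))
```

With these made-up rules, the unit test request stays on the small local model and the architecture redesign goes to the big hosted one.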
I don't think you've used modern coding AIs much.
Or, for that matter, worked with human coders.
There is no one single "killer" application for LLMs. They're about as general a computing platform as you can get.