78
submitted 2 days ago by [email protected] to c/[email protected]

Is there any way to make it use less as it gets more advanced, or will there be huge power plants just dedicated to AI all over the world soon?

top 38 comments
[-] [email protected] 7 points 22 hours ago* (last edited 22 hours ago)

Imagine someone said "make a machine that can peel an orange". You have a thousand shoeboxes full of Meccano. You give them a shake and tip out the contents and check which of the resulting scrap piles can best peel an orange. Odds are none of them can, so you repeat again. And again. And again. Eventually, one of the boxes produces a contraption that can kinda, maybe, sorta touch the orange. That's the best you've got, so you copy bits of it into the other 999 shoeboxes and give them another shake. It'll probably produce worse outcomes, but maybe one of them will be slightly better still, and that becomes the basis of the next generation. You do this a trillion times and eventually you get a machine that can peel an orange. You don't know if it can peel an egg, or a banana, or even how it peels an orange, because it wasn't designed but born through inefficient, random, brute-force evolution.

Now imagine that it's not a thousand shoeboxes, but a billion. And instead of shoeboxes, it's files containing hundreds of gigabytes of utterly incomprehensible abstract connections between meaningless data points. And instead of a few generations a day, it's a thousand a second. And instead of "peel an orange" it's "sustain a facsimile of sentience capable of instantly understanding arbitrary, highly abstracted knowledge and generating creative works to a standard approaching the point of being indistinguishable from humanity, such that it can manipulate those that it interacts with to support the views of a billionaire nazi nepo-baby even against their own interests". When someone asks an LLM to generate a picture of a fucking cat astronaut or whatever, the unholy mess of scraps that behaves like a mind spits out a result, and no-one knows how it does it aside from broad-stroke generalisation. The iteration that gets the most thumbs up from its users gets to be the basis of the next generation; the rest die, millions of times a day.

What I just described is NEAT (NeuroEvolution of Augmenting Topologies), which is pretty primitive by modern standards, but it's a flavour of what's going on.

[-] [email protected] 50 points 2 days ago

It's mostly the training/machine learning that is power hungry.

AI is essentially a giant equation that is generated via machine learning. You give it a prompt with an expected answer, it gets run through the equation, and you get an output. That output gets an error score based on how far it is from the expected answer. The variables of the equation are then modified so that the prompt will lead to a better output (one with a lower error).

The issue is that current AI models have billions of variables and will be trained on billions of prompts. Each variable will be tuned based on each prompt. That's billions times billions of calculations. It takes a while. AI researchers are of course looking for ways to speed up this process, but so far it's mostly come down to dividing up these billions of calculations over millions of computers. Powering millions of computers is where the energy costs come from.

Unless AI models can be trained in a way that doesn't require running a billion squared calculations, they're only going to get more power hungry.
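To make that concrete, here's a minimal sketch of that tune-the-variables loop in Python (a toy with two variables and a handful of made-up training pairs; real models do the same thing with billions of both):

```python
import numpy as np

# The "model" is literally just an equation: y = w * x + b.
# Real models have billions of variables, not two.
w, b = 0.0, 0.0
learning_rate = 0.01

# Training pairs: prompt -> expected answer (here, y = 3x + 1).
xs = np.array([0.0, 1.0, 2.0, 3.0])
ys = np.array([1.0, 4.0, 7.0, 10.0])

for step in range(2000):
    preds = w * xs + b           # run the prompts through the equation
    errors = preds - ys          # how far off the expected answers we are
    loss = (errors ** 2).mean()  # the "error score"
    # Nudge each variable in the direction that lowers the error.
    w -= learning_rate * 2 * (errors * xs).mean()
    b -= learning_rate * 2 * errors.mean()

print(f"w={w:.2f}, b={b:.2f}, loss={loss:.6f}")  # approaches w=3, b=1
```

Every training step touches every variable, which is why scaling both the variable count and the prompt count into the billions makes the energy bill explode.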

[-] [email protected] 10 points 2 days ago

This is a pretty great explanation/simplification.

I'll add that because the calculations rely on floating point math in many cases, graphics chips do most of the heavy processing, since they were already designed with this kind of pipeline in mind for video games.

That means there's a lot of power hungry graphics chips running in these data centers. It's also why Nvidia stock is so insane.

[-] [email protected] 1 points 1 day ago

It's kinda interesting how the most power-consuming uses of graphics chips — crypto and AI/ML — have nothing to do with graphics.

(Except for AI-generated graphics, I suppose)

[-] [email protected] 1 points 23 hours ago

Thinking of a modern GPU as a "graphics processor" is a bit misleading. GPUs haven't been purely graphics processors for 15 years or so, they've morphed into general-purpose parallel compute processors with a few graphics-specific things implemented in hardware as separate components (e.g. rasterization, fragment blending).

Those hardware stages generally take so little time compared to the rest of the graphics pipeline that it normally makes the most sense to have far more silicon dedicated to general-purpose shader cores than the fixed-function graphics hardware. A single rasterizer unit might be able to produce up to 16 shader threads worth of fragments per cycle, so even if your fragment shader is very simple and only takes 8 cycles per pixel, you can keep 8x16 cores busy with only one rasterizer in this example.

The result is that GPUs are basically just a chip packed full of a staggering number of fully programmable floating-point and integer ALUs, with only a little bit of fixed hardware dedicated to graphics squeezed in between. Any application which doesn't need the graphics stuff and just wants to run a program on thousands of threads in parallel can simply ignore the graphics hardware and stick to the programmable shader cores, and still be able to leverage nearly all of the chip's computational power. Heck, a growing number of games are bypassing the fixed-function hardware for some parts of rendering (e.g. compositing with compute shaders instead of drawing screen-sized rectangles, etc.) because it's faster to simply start a bunch of threads and read+write a bunch of pixels in software.
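As a tiny illustration of that "ignore the graphics hardware" point, here's a sketch in Python using PyTorch (one of several libraries that expose GPU compute this way); nothing in it touches rasterizers or blending, it's just a big pile of arithmetic scheduled across the shader cores:

```python
import torch

# Use the GPU if one is available; otherwise fall back to CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

a = torch.randn(4096, 4096, device=device)
b = torch.randn(4096, 4096, device=device)

# A single 4096x4096 matrix multiply is roughly 137 billion
# floating-point operations, spread across thousands of GPU threads,
# while the fixed-function graphics hardware sits idle the whole time.
c = a @ b
print(c.sum().item())
```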

[-] [email protected] 1 points 1 day ago

Would AI inferencing or training be better suited to a quantum computer? I recall those not being great at conventional math, but massively accelerating computations that sounded similar to machine learning.

[-] [email protected] 2 points 1 day ago

My understanding of quantum computers is that they're great at brute-forcing stuff, but machine learning is just a lot of calculations, not brute-forcing.

If you want to know the square root of 25, you don't need to brute force it. There's a direct way to calculate the answer and traditional computers can do it just fine. It's still going to take a long time if you need to calculate the square root of a billion numbers.

That's basically machine learning. The individual calculations aren't difficult, there's just a lot to calculate. However, if you have 2 computers doing the calculations, it'll take half the time. It'll take even less time if you fill a data center with a cluster of 100,000 GPUs.
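A toy version of that "lots of easy calculations" idea in Python (the worker count and problem size are made up for illustration):

```python
import numpy as np
from multiprocessing import Pool

def sqrt_chunk(chunk):
    # Each individual calculation is trivial; there are just a lot of them.
    return np.sqrt(chunk)

if __name__ == "__main__":
    numbers = np.arange(1, 10_000_001, dtype=np.float64)
    # Split the work across 4 workers. Two workers roughly halves the
    # wall time; a cluster of 100,000 GPUs divides it by 100,000 (minus
    # coordination overhead). Same principle, bigger electricity bill.
    chunks = np.array_split(numbers, 4)
    with Pool(processes=4) as pool:
        results = pool.map(sqrt_chunk, chunks)
    print(np.concatenate(results)[:3])  # [1. 1.41421356 1.73205081]
```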

[-] [email protected] 15 points 2 days ago* (last edited 2 days ago)

will there be huge power plants just dedicated to AI all over the world soon?

Construction has started (or will start soon) to convert a retired coal power plant in Pennsylvania to gas power, specifically for data centers. Upon completion in 2027, it will likely be the third most powerful plant in the US.

The largest coal plant in North Dakota was considering shutting down in 2022 over financial issues, but is now approved to power a new data-center park.

Location has been laid out for a new power plant in Texas, from a single AI company you've probably never heard of.

And on it goes.

[-] [email protected] 10 points 2 days ago

Data is the new oil. Collecting it, refining it, and distributing it.

[-] [email protected] 6 points 2 days ago

The current algorithmic approach to AI hit a wall in 2022.

Since then they've had to pump exponentially more electricity into these systems for exponentially diminishing returns.

We should have stopped in 2022, but marketing teams had other plans.

There's no way to do AI with less electricity than the current models use, and there most likely won't be any more advances in AI until someone invents a fundamentally different approach.

[-] [email protected] 21 points 2 days ago

My understanding is that traditional AI essentially takes a brute-force approach to learning, and because it is hardwired, its ability to learn and make logical connections is impaired.

Newer technologies like organic computers using neurons can change and adapt as they learn, forming new pathways for information to travel along, which reduces processing requirements and, in turn, power requirements.

https://www.techradar.com/pro/a-breakthrough-in-computing-cortical-labs-cl1-is-the-first-living-biocomputer-and-costs-almost-the-same-as-apples-best-failure

https://corticallabs.com/cl1.html

[-] [email protected] 7 points 2 days ago

Machine learning has always felt like a very wasteful way to use data. Even with ridiculous quantities of it, the results are still kinda meh, so you just dump in even more data, and you get something that can work.

[-] [email protected] 3 points 2 days ago

Lower-voltage chip advancements, along with better cooling options, may come along some day.

They should consider building their super centers underwater in places like Iceland.

[-] [email protected] 8 points 2 days ago

That's only a short-term solution; global warming will negate those benefits.

Southern Ocean currents just reversed, which will likely cause rapid warming of water temps.

Southern Ocean circulation reversed

Southern Ocean current reverses for first time, signalling risk of climate system collapse

France and Switzerland just had to shut down their nuclear reactors because the water sources they use for cooling were too warm.

France and Switzerland shut down nuclear power plants amid scorching heatwave

When heat halts power: Europe’s nuclear dilemma

[-] [email protected] 4 points 2 days ago

None of that is terrifying at all /s.

[-] [email protected] 15 points 2 days ago

OpenAI noticed that Generative Pre-trained Transformers get better when you make them bigger. GPT-1 had 120 million parameters. GPT-2 bumped it up to 1.5 billion. GPT-3 grew to 175 billion. Now we have models with over 300 billion.

To run, every generated word requires doing math with every parameter, which nowadays is a massive amount of work, running on the most power-hungry, top-of-the-line chips.

There are efforts to make smaller models that are still effective, but we're still in the range of 7-30 billion parameters to get anything useful out of them.
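A rough back-of-envelope for why that's so expensive, using the common approximation of about 2 floating-point operations per parameter per generated token (treat the exact numbers as illustrative):

```python
# ~2 FLOPs per parameter per token: one multiply and one add.
params = 175e9                 # a GPT-3-sized model
flops_per_token = 2 * params   # 3.5e11 FLOPs for every single token

tokens = 500                   # a longish answer
total = flops_per_token * tokens
print(f"{total:.2e} FLOPs per answer")  # ~1.75e14

# A GPU sustaining 100 TFLOP/s (1e14 FLOPs per second) still needs
# a couple of seconds of flat-out compute for one answer; now
# multiply that by millions of concurrent users.
```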

[-] [email protected] 11 points 2 days ago

will there be huge power plants just dedicated to AI all over the world soon?

It takes time to build a power plant. A more realistic scenario is that we'll continue as we have: AI centers will be built wherever local governments approve them for the taxes, without regard for the strain they put on the aging electrical grid. Given the massive amount of electricity they need, everyone's electrical bill will just massively increase.

They've been building a large number of data centers and AI centers in Virginia, and it's been straining the grid and raising prices across the entire PJM Interconnection region, to the point where at least a couple of states are considering leaving it. Microsoft has bought the rights to and is reactivating part of the Three Mile Island nuclear plant because they wanted dedicated power, and they're still going to be pulling power from the grid.

[-] [email protected] 10 points 2 days ago

Also water: they consume heaps of fresh water, which is used for important meat-bag things like, oh I don't know, eating and drinking perhaps.

No one is really challenging them on this, but water scarcity is going to be a big deal as climate change worsens.

Cook the planet and take all the water.

[-] [email protected] 1 points 2 days ago

And growing plants that we eat.

[-] [email protected] 1 points 2 days ago

Isn't water mainly used for cooling? I think you can still drink that water, unless they pump it full of chemicals.

[-] [email protected] 4 points 2 days ago* (last edited 2 days ago)

the current cooling paradigm is to basically spray mist into the air inlets of a data center to make the air able to carry more heat. the hot, moist air is then vented to atmosphere. so the water is lost until it rains again.

[-] [email protected] -3 points 2 days ago

So... the water is never lost.

[-] [email protected] 6 points 2 days ago

Water is never lost.

The problem is getting, filtering, and purifying the water so it can be used again.

[-] [email protected] 4 points 2 days ago* (last edited 2 days ago)

sure, just like how water is not lost when you take a piss in the woods. it's just not reusable without significant energy expenditure.

[-] [email protected] 7 points 2 days ago

Imagine that to type one letter, you need to manually read all Unicode code points several thousand times. When you're done, you select one letter to type.

Then you start rereading all Unicode code points thousands of times again for the next letter.

That's how LLMs work. When they say 175 billion parameters, it means at least that many calculations per token generated.

[-] [email protected] 3 points 2 days ago

That's how LLMs work. When they say 175 billion parameters, it means at least that many calculations per token generated.

I don't get it. How is it possible that so many people all over the world use this concurrently, doing all kinds of lengthy chats, problem solving, code generation, image generation and so on?

[-] [email protected] 6 points 2 days ago

that's why they need huge datacenters and thousands of GPUs. And, pretty soon, dedicated power plants. It is insane just how wasteful this all is.

[-] [email protected] 1 points 2 days ago

So do they load all those matrices (totalling 175B params in this case) into the available GPUs for every token of every user?

[-] [email protected] 1 points 2 days ago* (last edited 2 days ago)

yep. you could of course swap weights in and out, but that would slow things down to a crawl. So they get lots of vram (edit: for example, an H100 has 80gb of vram)
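Quick math on that, assuming 16-bit weights (2 bytes per parameter; the exact precision varies by deployment):

```python
import math

params = 175e9            # GPT-3-sized model
bytes_per_param = 2       # 16-bit (half-precision) weights
total_gb = params * bytes_per_param / 1e9
h100_vram_gb = 80

print(total_gb)                              # 350.0 GB of weights
print(math.ceil(total_gb / h100_vram_gb))    # at least 5 H100s just to hold them
```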

[-] [email protected] -1 points 2 days ago

I also asked ChatGPT itself, and it listed a number of approaches. One that sounded good to me is to pin layers to GPUs. For example, with 500 GPUs: cards 1-100 permanently hold layers 1-30 of the AI, cards 101-200 hold layers 31-60, and so on. That way there's no need to repeatedly load the huge matrices, since they stay on their GPUs permanently; user prompts are basically just pipelined through the appropriate sequence of GPUs.

[-] [email protected] 3 points 2 days ago

I can confirm as a human with domain knowledge that this is indeed a commonly used approach when a model doesn't fit into a single GPU.
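For the curious, a bare-bones sketch of that layer-pinning (pipeline-parallel) idea in Python with PyTorch; the layer sizes, counts, and two-GPU split are all made up, and real serving stacks add batching, KV caches, and much more:

```python
import torch
import torch.nn as nn

# Pretend the model is 8 layers and we have 2 GPUs: pin layers 0-3 to
# the first card and layers 4-7 to the second. The weights are loaded
# once and never move; only the comparatively tiny activations hop
# between cards as each token flows through.
dev0 = "cuda:0" if torch.cuda.device_count() >= 1 else "cpu"
dev1 = "cuda:1" if torch.cuda.device_count() >= 2 else dev0

stage0 = nn.Sequential(*[nn.Linear(1024, 1024) for _ in range(4)]).to(dev0)
stage1 = nn.Sequential(*[nn.Linear(1024, 1024) for _ in range(4)]).to(dev1)

def forward(x: torch.Tensor) -> torch.Tensor:
    x = stage0(x.to(dev0))  # layers 0-3 stay resident on the first card
    x = stage1(x.to(dev1))  # layers 4-7 stay resident on the second card
    return x

print(forward(torch.randn(1, 1024)).shape)  # torch.Size([1, 1024])
```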

[-] [email protected] 8 points 2 days ago

Supercomputers once required large power plants to operate, and now we carry around computing devices in our pockets that are more powerful than those supercomputers.

There’s plenty of room to further shrink the computers, simplify the training sets, formalize and optimize the training algorithms, and add optimized layers to the AI compute systems and the I/O systems.

But at the end of the day, you can either simplify or throw lots of energy at a system when training.

Just look at how much time and energy goes into training a child… and it’s using a training system that’s been optimized over hundreds of thousands of years (and is still being tweaked).

AI as we see it today (as far as generative AI goes) is much simpler, just setting up and executing probability sieves with a fancy instruction parser to feed it its inputs. But it is using hardware that's barely optimized at all for the task, and the task is performed in a way that's far from the most efficient way to process data to determine an output.

[-] [email protected] 1 points 1 day ago

Your answer is intuitively correct, but unfortunately has a couple of flaws

Supercomputers once required large power plants to operate

They didn't, not that much anyway; a Cray-1 used 115 kW to produce 160 MFLOPS of calculations. And while 115 kW is a LOT, it's not in the "needs its own power plant to operate" category, since even a small coal power plant (the least efficient electricity generation method) would produce a couple of orders of magnitude more than that.

and now we carry around computing devices in our pockets that are more powerful than those supercomputers.

Indeed, our phones are in the Teraflops range for just a couple of watts.

There’s plenty of room to further shrink the computers,

Unfortunately there isn't; we've reached the end of Moore's law. Processors can't get any smaller, because transistors work by blocking electrons from passing under given conditions, and if we built transistors much smaller than the current ones, electrons would quantum-tunnel across them, making them useless.

There might be a revolution in computing by using light instead of electricity (which would completely and utterly revolutionize computers as we know them), but until that happens computers are as small as they're going to get, or more specifically they're as space efficient as they're going to get, i.e. to have more processing power you will need more space.

[-] [email protected] 6 points 2 days ago

Supercomputers once required large power plants to operate, and now we carry around computing devices in our pockets that are more powerful than those supercomputers.

This is false. Supercomputers never required large [dedicated] power plants to operate.

Yes, they used a lot of power; yes, that has been reduced significantly; but it's not at the same magnitude as AI.

[-] [email protected] 5 points 2 days ago

It is also a very large data set it has to go through. The average English speaker knows 40k-ish words, and it has to pull from a large data set and attempt to predict the most likely word to come next, doing that a hundred or so times per response. Then most people want the result in a very short period of time and with very high accuracy (smaller tolerances on the convergence and divergence criteria), so sure, there is some hardware optimization that can be done, but it will always be at least somewhat taxing.
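A toy of that next-word step in Python (the vocabulary size is from the comment above; the hidden-state size and random weights are invented purely for illustration):

```python
import numpy as np

VOCAB = 40_000   # roughly an average English speaker's vocabulary
HIDDEN = 512     # size of the model's internal state (made up here)

rng = np.random.default_rng(0)
output_layer = rng.standard_normal((HIDDEN, VOCAB)).astype(np.float32)

def next_word_probs(state: np.ndarray) -> np.ndarray:
    # Every generated word means scoring ALL 40,000 candidates,
    # then normalizing the scores into probabilities (softmax).
    logits = state @ output_layer
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

state = rng.standard_normal(HIDDEN).astype(np.float32)
probs = next_word_probs(state)
print(probs.argmax(), probs.max())  # likeliest word's index + probability

# A 100-word reply repeats this scoring (plus every other layer of the
# model) a hundred times, which is where the "somewhat taxing" comes in.
```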

[-] [email protected] -1 points 2 days ago

This is an astute answer. Bravo.

[-] [email protected] 1 points 1 day ago

If people continue investing in AI and computing power keeps growing, we'll need more than dedicated power plants.

[-] [email protected] -3 points 1 day ago

It takes a lot of energy to do something you are not meant to do, whether that’s a computer acting like a person or an introvert acting like an extrovert
