LocalLLaMA
Welcome to LocalLLaMA! Here we discuss running and developing machine learning models at home. Let's explore cutting-edge open-source neural network technology together.
Get support from the community! Ask questions, share prompts, discuss benchmarks, and get hyped about the latest and greatest model releases! Enjoy talking about our awesome hobby.
As ambassadors of the self-hosting machine learning community, we strive to support each other and share our enthusiasm in a positive, constructive way.
Rules:
Rule 1 - No harassment or personal character attacks on community members, i.e., no name-calling, no generalizing about entire groups of people that make up our community, and no baseless personal insults.
Rule 2 - No comparing artificial intelligence/machine learning models to cryptocurrency, i.e., no comparing the usefulness of models to that of NFTs, no claiming the resource usage required to train a model is anything close to that of maintaining a blockchain or mining crypto, and no implying it's just a fad/bubble that will leave people with nothing of value when it bursts.
Rule 3 - No comparing artificial intelligence/machine learning to simple text prediction algorithms, i.e., no statements such as "LLMs are basically just simple text prediction like what your phone keyboard's autocorrect uses, and they're still using the same algorithms as <over 10 years ago>."
Rule 4 - No implying that models are devoid of purpose or potential for enriching people's lives.
Yeah, the article says as much. If this isn't vaporware, it'll most likely have a 256-bit bus, which would be a damn shame for inference speed. Just saying, if they doubled the bus width and sold it for ≤ $1000, they'd eat the 5090 alive, generate a lot of goodwill in the influential local LLM community, and probably get a lot of free ROCm development out of it. It'd be a damn smart move, but how often can you accuse AMD of that?
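Rough back-of-envelope on why bus width matters so much here, assuming decode is memory-bandwidth-bound (each generated token streams roughly all the weights from VRAM once). The 20 Gbps GDDR6 pin speed and the ~19 GB Q4 quant of a 32B model are made-up illustrative numbers, not specs for this card:

```python
def bandwidth_gbs(bus_width_bits: int, gbps_per_pin: float) -> float:
    """Peak memory bandwidth in GB/s: pins * Gbit/s per pin, bits -> bytes."""
    return bus_width_bits * gbps_per_pin / 8

def tokens_per_sec(bandwidth_gb_s: float, model_gb: float) -> float:
    """Ceiling on decode speed if every token streams all weights once."""
    return bandwidth_gb_s / model_gb

# Hypothetical: ~19 GB of quantized weights on 20 Gbps GDDR6.
for bus in (256, 512):
    bw = bandwidth_gbs(bus, 20.0)
    print(f"{bus}-bit: {bw:.0f} GB/s -> ~{tokens_per_sec(bw, 19.0):.0f} tok/s ceiling")
```

Doubling the bus roughly doubles the tokens/sec ceiling (~33 vs. ~67 tok/s in this toy example), which is why a 512-bit card at that price would be such a big deal for local inference.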
I misread this part and thought you were implying a bus width increase is necessary.
For a 512-bit memory bus, AMD would either have to use 1+8 dies if they follow the 7900 XTX scheme (one GCD plus eight 64-bit MCDs) or build a monolithic behemoth like GB202. The former would mean higher power draw but lower manufacturing cost, while the latter is more power efficient but more prone to defects, as it's getting close to the reticle size limit.
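Quick sanity check on that 1+8 figure, using Navi 31's published 64-bit-per-MCD configuration:

```python
# Each MCD on Navi 31 (7900 XTX) carries a 64-bit GDDR6 memory interface,
# so total bus width scales with the MCD count.
MCD_BUS_BITS = 64

def bus_width(num_mcds: int) -> int:
    """Total memory bus width for one GCD paired with `num_mcds` MCDs."""
    return num_mcds * MCD_BUS_BITS

print(bus_width(6))  # 384 -> the shipping 7900 XTX (1 GCD + 6 MCDs)
print(bus_width(8))  # 512 -> hence the "1+8 dies" above
```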
I'd guess Nvidia will soon have to switch to chiplet-based GPUs. Maybe AMD stopped (for now?) because their whole product stack wasn't using chiplet-based designs, so they had far less flexibility with allocation and binning than they do with Ryzen chiplets.
Has monolithic vs. chiplet been confirmed for the 9070? A narrow bus width implemented on a much smaller process node (compared to the previous I/O dies) would matter a lot for the surface area left over for stream processors.