this post was submitted on 27 Jan 2025
885 points (98.2% liked)
Technology
you are viewing a single comment's thread
Yep
Huh. Everything I'm reading seems to imply it's more like a DSP ASIC than an FPGA (even down to the fact that it's a VLIW processor), but maybe that's wrong.
I'm curious what kind of work has led you to this conclusion about FPGAs. I'm guessing you specifically use FPGAs for this task in your work? I'd love to hear which ops you actually find speedups in. I can imagine many exist; otherwise there wouldn't be a need for features like tensor cores and transformer acceleration on the latest NVIDIA GPUs, since those features must be exploiting some inefficiency in general-purpose GPGPU architectures (up to the limits of memory bandwidth, of course). But I also wonder how much benefit you can get in practice, since a lot of workloads end up limited by memory bandwidth, and unless you have a gigantic FPGA I imagine that's going to be an issue there as well.
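The memory-bandwidth worry can be made concrete with a quick roofline-style check: an op only benefits from extra compute (tensor cores, a big FPGA fabric, whatever) if its arithmetic intensity, FLOPs per byte moved, exceeds the hardware's compute-to-bandwidth ratio. The hardware numbers below are round illustrative assumptions, not specs for any real part:

```python
def arithmetic_intensity_matmul(n, bytes_per_elem=2):
    # n x n matmul: 2*n^3 FLOPs over 3*n^2 elements moved (read A and B, write C)
    flops = 2 * n**3
    bytes_moved = 3 * n**2 * bytes_per_elem
    return flops / bytes_moved

def arithmetic_intensity_elementwise(n, bytes_per_elem=2):
    # elementwise add: n FLOPs over 3*n elements moved (read two inputs, write one output)
    return n / (3 * n * bytes_per_elem)

peak_flops = 100e12   # assumed 100 TFLOP/s of compute
bandwidth = 1e12      # assumed 1 TB/s of memory bandwidth
ridge = peak_flops / bandwidth  # 100 FLOPs/byte: below this you're bandwidth-bound

print(arithmetic_intensity_matmul(4096))      # ~1365 FLOPs/byte: compute-bound, accelerators help
print(arithmetic_intensity_elementwise(4096)) # ~0.17 FLOPs/byte: memory-bound on any hardware
```

By this rough accounting, big matmuls are where specialized compute pays off, while elementwise ops are stuck at the memory wall whether they run on a GPU or an FPGA.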
I haven't seriously touched FPGAs in a while, but I work in ML research (namely CV), and I don't know anyone on the research side bothering with FPGAs. Even dedicated accelerators are still mostly niche products because, in practice, the software stack needed to run them takes a lot more time to configure. On the academic side, you're usually looking at experiments that take at most a day or a few days to run. If you're spending an extra day or two writing RTL instead of just slapping together a few lines of python that implicitly call CUDA kernels, you're not really benefiting from the potential speedup of FPGAs. On the other hand, I know accelerators are handy for production environments (and in general they're more popular for inference than training).
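To make the "few lines of python" point concrete: a single framework call dispatches to a vendor-tuned kernel, and that's the whole programming effort. The sketch below uses NumPy (which hands the matmul to BLAS) just to keep it self-contained; with a framework like PyTorch on a GPU, the same one-line expression implicitly launches CUDA kernels, while the equivalent hand-written RTL would be a far larger undertaking:

```python
import numpy as np

# One line expresses a whole linear + ReLU layer; the heavy lifting is
# dispatched to a tuned kernel (BLAS here, CUDA kernels in a GPU framework).
rng = np.random.default_rng(0)
x = rng.standard_normal((256, 512)).astype(np.float32)  # batch of activations
w = rng.standard_normal((512, 128)).astype(np.float32)  # layer weights

h = np.maximum(x @ w, 0.0)
print(h.shape)  # (256, 128)
```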
I suspect it's much easier to find someone who can write quality CUDA or PTX than someone who can write quality RTL, especially with CS being much more popular than ECE nowadays. At a minimum, the whole FPGA skillset seems much less common among my peers. Maybe it'll be more crucial in the future (which will definitely be interesting!) but it's not something I've seen yet.
Looking forward to hearing your perspective!