this post was submitted on 02 Oct 2024
Technology
[email protected] 7 points 2 months ago

One thing I'd push back on in the article is:

"That cost-per-user doesn’t decrease as you add more customers. You need more servers. More GPUs."

This assumes constant use, which is not the case. Say I have a server handling LLM prompt requests and, for illustrative purposes, each request uses 100% of its single discrete GPU. If I only have 1 customer, and that customer only uses it 5% of the day (which would actually be pretty high in real terms), I can still add more customers without buying more servers. The question is whether the revenue from a single server outweighs its cost to run.
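As a rough back-of-the-envelope sketch of that argument: the function names, the $20 per-customer revenue, and the $300 server cost below are all made up for illustration, and it assumes the idealized case where customers' requests never overlap.

```python
import math


def customers_per_gpu(busy_fraction: float) -> int:
    """Max customers one GPU can carry if each keeps it 100% busy for
    `busy_fraction` of the day and requests never overlap (idealized)."""
    if not 0 < busy_fraction <= 1:
        raise ValueError("busy_fraction must be in (0, 1]")
    # The small epsilon guards against float results like 19.999999999999996.
    return math.floor(1 / busy_fraction + 1e-9)


def server_pays_for_itself(customers: int,
                           revenue_per_customer: float,
                           server_cost: float) -> bool:
    """True if revenue from the customers on one server exceeds its running cost."""
    return customers * revenue_per_customer > server_cost


capacity = customers_per_gpu(0.05)  # the 5%-of-the-day customer from above
print(capacity)  # 20

# Hypothetical monthly figures, purely for illustration.
print(server_pays_for_itself(capacity, 20.0, 300.0))  # True
print(server_pays_for_itself(1, 20.0, 300.0))         # False
```

In practice requests do overlap, so the real number of customers per server would be lower than this ceiling, but the cost per user still drops as idle time gets filled.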

And when it comes to training, that is an upfront cost that you could (once you get a model to where you want it) stop paying whenever you want. I'm pretty surprised they haven't really been leaning into training models for medical diagnoses, because once you have a model that can e.g. spot a type of tumor with accuracy n% beyond a human's, you don't really have to refine it further if you don't want to (after all, it's not like the humans can choose to do better themselves at that point, like they can with writing prompts).

[email protected] 6 points 2 months ago

I'd say they've probably long since reached the point where they have enough customers around the world to keep the load on their servers fairly constant. The example with one user only taking 5% of a server's load only works for low customer counts, similar to how you can't count on one wind turbine or solar plant to provide all of your energy, but if you have enough of them you can provide a baseline of fairly constant energy.
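A quick way to put numbers on that smoothing effect, as a sketch only: assume each customer is independently busy 5% of the time, so the total load follows a binomial distribution. The function name is made up, and the independence assumption is a big simplification (real usage follows time-of-day patterns, which a worldwide customer base only partly evens out).

```python
import math


def relative_load_swing(customers: int, busy_fraction: float) -> float:
    """Standard deviation of total load divided by mean load, assuming each
    customer is independently busy with probability `busy_fraction`."""
    mean = customers * busy_fraction
    std = math.sqrt(customers * busy_fraction * (1 - busy_fraction))
    return std / mean


for n in (1, 100, 10_000, 1_000_000):
    print(n, round(relative_load_swing(n, 0.05), 3))
# 1 4.359
# 100 0.436
# 10000 0.044
# 1000000 0.004
```

With one customer the load swings by several times its average, while at a million customers the swing is well under 1% of the average, which is the "fairly constant" regime described here.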