this post was submitted on 16 Oct 2023
260 points (90.1% liked)


It is 'nearly unavoidable' that AI will cause a financial crash within a decade, SEC head says

[–] [email protected] 3 points 11 months ago (1 children)

Part of what makes local model engines and custom ML chips interesting is precisely that they enable small, custom, local models. Right now LLMs require so much computational power and so much data to train and operate that even the most expensive options lose money on every prompt.

So the reason every tutorial starts with "download this model" is that there's a good chance you don't have the hundreds of supercomputer-cluster chips, or the hundreds of exabytes of scraped and curated data, needed to train a natural language processing model from scratch. There's a reason there are only big players in this game.
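
For illustration, grabbing a ready-made pretrained model is basically a one-liner; all the data and compute cost is already baked into the downloaded weights. A minimal sketch, assuming the Hugging Face transformers library and the public "gpt2" checkpoint:

```python
# Minimal sketch: loading a pretrained model instead of training one.
# Assumes the Hugging Face transformers library and the public "gpt2" checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")       # downloads weights someone else paid to train
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("The reason every tutorial starts here is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```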

[–] [email protected] 2 points 11 months ago* (last edited 11 months ago)

Facts.

Even if you could design your own model... how do you acquire a dataset that's even a fraction of the size of what the corps used to pretrain theirs?

Then how do you train the model in a reasonable time, other than relying on cloud computing, which leads back to the same problem: only corps can really play this game right now.

I designed, and collected/labeled the data for, a relatively small deep CNN for my master's thesis, and training it on 60,000 images took over a dozen hours (this was 5 years ago at this point, so that part may be misremembered) on a 1080 Ti.
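
For a sense of scale, this is roughly what that kind of workload looks like. Not the thesis code, just a minimal PyTorch sketch of training a small CNN, with MNIST standing in as a hypothetical 60,000-image labeled dataset:

```python
# Minimal PyTorch sketch of training a small deep CNN on ~60,000 labeled images.
# Not the commenter's thesis code; MNIST stands in for a custom labeled dataset.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

device = "cuda" if torch.cuda.is_available() else "cpu"

model = nn.Sequential(                       # deliberately small deep CNN
    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(64 * 7 * 7, 128), nn.ReLU(),
    nn.Linear(128, 10),
).to(device)

train_data = datasets.MNIST("data", train=True, download=True,
                            transform=transforms.ToTensor())   # 60,000 images
loader = DataLoader(train_data, batch_size=128, shuffle=True)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(10):                      # even a few epochs keeps a single consumer GPU busy
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimizer.step()
    print(f"epoch {epoch}: loss {loss.item():.4f}")
```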