this post was submitted on 10 Dec 2024

A Boring Dystopia

Pictures, Videos, Articles showing just how boring it is to live in a dystopic society, or with signs of a dystopic society.

[–] [email protected] 2 points 2 weeks ago (1 children)

Regulating it does nothing. Only rich people get to have deepfakes? Nah, let it be public, so everyone can have some vigilance.

[–] [email protected] 4 points 1 week ago (1 children)

vigilance

Vigilance is like, not drinking the water that comes out of a nuclear reactor.

What we’re talking about here is letting everyone run their own reactor and dump the waste into the street.

You don’t gain vigilance, you lose all habitable public space.

[–] [email protected] 1 points 1 week ago (1 children)

It's a bit late for that. This particular nuclear reactor is open source, free to download, and runs on consumer hardware. Can't really unfry that egg, and the quality is getting better all the time. Identity fraud is already illegal in most places, so I'm not sure exactly what regulation would be appropriate here.

[–] [email protected] 1 points 1 week ago (1 children)

First of all: you need giant data centres to train the models.

Identity fraud is illegal, copyright theft is illegal as well — put the blame on the owner of the data centres.

I know from reliable sources that governments know who these folks are.

[–] [email protected] 1 points 1 week ago

Not entirely true. You don't need your own personal data centre; you can rent GPU cloud instances for a lot of that stuff. It's expensive, but not so expensive that only a huge tech company could afford it (thousands of dollars, not billions). Anyone with a credit card and some cash to burn can do this. Also, you don't need to train a model from scratch: you can build on existing models that others have published to cut down on training.

However, to impersonate someone's voice you don't need any of that. You only need about 5-10 seconds of audio for a zero-shot impersonation with a pre-trained model. A minute or so for few-shot. This runs on consumer hardware and in some cases even in real time.

Even to build your own model from scratch for high quality voice audio, you don't need a huge amount of initial training data. Something like XTTS was trained with about 10-15K hours of English audio, which is actually pretty easy to come by in the public domain. There are a lot of open and public research datasets specifically for this kind of thing, no copyright infringement necessary. If a big tech company wants more audio data than what's publicly available, they just pay people to record audio. They have no need to steal it or risk copyright claims and breaking surveillance laws; they have the budget to exploit people to record whatever they want.

This tech wasn't invented by some evil giant tech company stealing everybody's data, it was mostly geeky computer scientists presenting things at computer speech synthesis conferences. That's not to say there aren't a bunch of huge evil tech companies profiting from this or contributing to this kind of tech, but in the context of audio deepfakes being accessible to scammers, it's not on them and I don't think that some kind of extra copyright regulation on data centres would do anything about it.

The current industry leader among companies trying to monetize speech synthesis is ElevenLabs, which is a private start-up with only a few dozen employees.

The current tech is not perfect, but it's definitely good enough to fool someone who isn't thinking too hard over a noisy phone call, and a scammer doesn't need server time or access to a data centre to do it.