this post was submitted on 05 Nov 2023
218 points (94.3% liked)

Technology


AI companies have all kinds of arguments against paying for copyrighted content

The companies building generative AI tools like ChatGPT say updated copyright laws could interfere with their ability to train capable AI models. Here are comments from OpenAI, StabilityAI, Meta, Google, Microsoft, and more.

[–] [email protected] 60 points 1 year ago (2 children)

Then feel free to give your copyrighted AI code a free software license :3

[–] [email protected] 31 points 1 year ago (1 children)

This. If the model and its parameters are open source and under an unrestricted license, they can scrape anything they want in my opinion. But if they make money with someone's years of work writing a book, then please give that author some money as well.

[–] [email protected] 2 points 1 year ago* (last edited 1 year ago) (3 children)

But if they make money with someone’s years of work writing a book, then please give that author some money as well.

Why? I've read many books on programming, and now I work as a programmer. The authors of those books don't get a percentage of my income just because they spent years writing the book. I've also read (and written) plenty of open source code over the years, and learned from that code. That doesn't mean I have to give money to all the people who contributed to those projects.

[–] [email protected] 19 points 1 year ago (1 children)
[–] [email protected] 4 points 1 year ago

He might've borrowed them from a library.

OpenAI could've trained on borrowed ebooks as well

[–] [email protected] 5 points 1 year ago

Like with most things, consent and intent matter. I went out on Halloween when I was a kid and got free candy, so why is it bad if I break in and steal other people's candy?

[–] [email protected] 9 points 1 year ago

I will never be totally happy with this situation until they're required to offer a free version of all the models that were created with unlicensed content.

[–] [email protected] 36 points 1 year ago (2 children)

As I’ve said before: Sure, go ahead, train on copyrighted material, but anything AI generates shouldn’t be copyrightable either - it wasn’t created by a human.

[–] [email protected] 13 points 1 year ago (1 children)

That's exactly the way it is now.

[–] [email protected] 15 points 1 year ago

Not where I live. In the UK, copyright for AI-generated works is owned by the person who caused the AI to create the work. See s9(3) and s178 of the Copyright, Designs and Patents Act 1988.

[–] [email protected] 10 points 1 year ago

Problem is that small modifications already make it copyrightable again.

[–] [email protected] 33 points 1 year ago

Well I mean...so do I.

[–] [email protected] 24 points 1 year ago

Stock image companies probably have the strongest copyright claim here, IMO. An AI trained on their images without paying for a licence could act as a market replacement for their service.

[–] [email protected] 21 points 1 year ago (3 children)

Tough tits. Imagine all the books, movies and games that could have been made if copyright didn't exist. Nobody else gets to ignore the rules just because it's inconvenient.

[–] [email protected] 10 points 1 year ago (1 children)

Honestly, if tech companies have to battle it out in court with Disney, imma grab some popcorn and watch the money go brrrrr.

[–] [email protected] 2 points 1 year ago

Epic lawsuits round 2... "There are no heroes here"

[–] [email protected] 4 points 1 year ago

And if it's OK, then what's the limit on what an AI is? Do you have to prove an AI made it? Or can you just write some repetitive work, say it was made by AI, and dodge copyright?

[–] [email protected] 1 points 11 months ago (3 children)

That's the key: humans can take inspiration from other people's copyrighted work, so why can't AI?

[–] [email protected] 16 points 1 year ago (1 children)

Here is what experts way smarter and more knowledgeable than I am are saying (on TechDirt):

TL;DR: Letting AI be freely trained on human-made artistic content may be dangerous. We may decide to stop it, at least so long as capitalists control who eats and lives. But copyright is not the legal means to stop it - that's a separate issue from how badly broken IP law is. And precedents stopping software from training on copyrighted work will be used to stop humans from training on copyrighted work. And that's bad.

[–] [email protected] 6 points 1 year ago (2 children)

Agreed, it's not much different from a human learning from all these sources and then applying that knowledge.

[–] [email protected] 9 points 1 year ago* (last edited 1 year ago) (2 children)

Scale matters. For example

  • A bunch of random shops having security cameras, where their employees can review footage

  • Every business in a country having a camera connected to a central surveillance network with facial recognition and search capabilities

Those two things are not the same, even though you could say they're "not much different" - it's just a bunch of cameras after all.

Also, the similarity between human learning and AI training is highly debatable.

[–] [email protected] 4 points 1 year ago (1 children)

Both of your examples are governed by the same set of privacy laws, which talk about consent, purpose, and necessity, but not about scale. Legislating around scale opens up the inevitable legal quagmires of "what scale is acceptable" and "should activity X be counted the same as activity Y to meet the scale threshold defined in the law".

Scale makes a difference, but it shouldn't make a legal difference w.r.t. the legality of the activity.

[–] [email protected] 2 points 1 year ago (2 children)

Scale makes a difference, but it shouldn't make a legal difference w.r.t. the legality of the activity.

What do you think the difference between normal internet traffic and a ddos attack is?
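To make that concrete: in practice the difference is usually enforced as a rate threshold. Here's a minimal sliding-window rate limiter sketch - the 60-second window, the 100-request limit, and all names are invented for illustration, not taken from any real service:

```python
import time
from collections import defaultdict

# Hypothetical numbers, purely for illustration; real services tune these.
WINDOW_SECONDS = 60
MAX_REQUESTS = 100

_recent = defaultdict(list)  # client ip -> timestamps of recent requests

def allow(client_ip, now=None):
    """Sliding-window limiter: the same kind of request that is fine at
    low volume gets rejected at high volume - scale alone flips the verdict."""
    if now is None:
        now = time.monotonic()
    window = _recent[client_ip]
    # Keep only timestamps inside the current window.
    window[:] = [t for t in window if now - t < WINDOW_SECONDS]
    if len(window) >= MAX_REQUESTS:
        return False  # over the threshold: treated as abusive
    window.append(now)
    return True
```

Note that nothing about an individual request changes between the allowed and rejected cases; only the count does.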

[–] [email protected] 2 points 1 year ago (3 children)

Intent is part of it as well. If you have too many people who want to use your service, you're not being attacked, you have an actual shortage of ability to service requests and need to adjust accordingly.

[–] [email protected] 2 points 1 year ago (1 children)

Lack of consent and the intent to cause harm.

[–] [email protected] 1 points 1 year ago (1 children)

Ok, then how about automated cold calling vs "live" cold calling?

[–] [email protected] 1 points 1 year ago (1 children)

Falls under unwanted calls, you should be able to opt out of both (though I believe both are currently legal in the US).

[–] [email protected] 1 points 11 months ago

You can opt out of both, but automated cold calling is straight up illegal in the UK (and it's a good thing it is).

[–] [email protected] 1 points 1 year ago

Legally no difference

[–] [email protected] 6 points 1 year ago* (last edited 1 year ago)

When Google trained their StarCraft II-playing neural network (AlphaStar), it became better at the game than professional players. It trained by watching 100 years of play - that's 36,500 days, or 876,000 hours.

Can a human do that? We both know it's impossible. As the other person said, the issue is scale.
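The arithmetic is easy to check; a quick sketch comparing the commenter's 100-year figure to a generous human practice budget (the 8-hours-a-day, 20-year career is my own assumption, not from the thread):

```python
# The commenter's figures: 100 years of play, expressed in days and hours.
HOURS_PER_DAY = 24
DAYS_PER_YEAR = 365

ai_years = 100
ai_days = ai_years * DAYS_PER_YEAR      # 36500
ai_hours = ai_days * HOURS_PER_DAY      # 876000

# A human pro practicing 8 hours a day, every day, for a 20-year career
# (an assumed, deliberately generous budget):
human_hours = 8 * DAYS_PER_YEAR * 20    # 58400

print(ai_hours // human_hours)  # 15 -> fifteen full human careers of play
```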

[–] [email protected] 16 points 1 year ago (10 children)

The way I see it, if training on copyrighted content is forbidden, then that should apply universally.

Since all people mix together ideas they've learned from their own input to create new things, just like AI does, then all people-produced content should also be inherently uncopyrightable, unless produced by a person who has never been exposed to copyrighted content.

Oh, also all copyrighted content should lose its copyright. The only copyrighted content should be the original cave paintings by the first cavemen to develop art, since all art since then uses its influence.

And if this sounds ridiculous, then it's no less so than arguments that AI shouldn't be allowed to learn.

[–] [email protected] 19 points 1 year ago (2 children)

Copyright is broken, but that's not an argument to let these companies do whatever they want. They're functionally arguing that copyright should remain broken but also they should be exempt. That's the worst of both worlds.

[–] [email protected] 10 points 1 year ago

Yes it seems they want copyright when it suits them and not when it doesn’t.

[–] [email protected] 2 points 1 year ago* (last edited 1 year ago) (1 children)

Who said anything about "do whatever they want"? They should obviously comply with the law.

When a human reads a comment here on Lemmy and learns something they didn't know before - copyright law doesn't stop them from using that knowledge. The same rule should apply to AI.

In my opinion if you don't want AI to learn from your work, then you shouldn't allow humans to learn from it either. That's fine - everyone has the right to keep their work private if they choose to do so... but if you make it publicly available, then you don't get to control who learns from it.

You can control who makes exact replicas of it, and if AI is doing that then sure - charge the company with copyright infringement - but generally that's not how these systems work. They generally don't produce exact copies except for highly structured content where there isn't much creative flexibility (and those tend to not be protected under copyright by the way - they would be protected by patents).

[–] [email protected] 4 points 1 year ago (1 children)

Computers aren't people. AI "learning" is a metaphorical usage of that word. Human learning is a complex mystery we've barely begun to understand, whereas we know exactly what these computer systems are doing; though we use the word "learning" for both, it is a fundamentally different process. Conflating the two is fine for normal conversation, but for technical questions like this, it's silly.

It's perfectly consistent to decide that computer "learning" breaks the rules but human learning doesn't, because they're different things. Computer "learning" is a new thing, and it's a lot more like creating replicas than human learning is. I think we should treat it as such.

[–] [email protected] 2 points 1 year ago (1 children)

I'm so fed up trying to explain this to people. People think LLMs are real AGI and are treating them as such.

Computers do not learn like humans. They cannot, and should not, be regulated in the same way.

[–] [email protected] 2 points 1 year ago

Yes 100%. Once you drop the false equivalence, the argument boils down to X does Y and therefore Z should be able to do Y, which is obviously not true, because sometimes we need different rules for different things.

[–] [email protected] 7 points 1 year ago* (last edited 1 year ago) (3 children)

Since all people mix together ideas they've learned from their own input to create new things, just like AI does, then all people-produced content should also be inherently uncopyrightable, unless produced by a person who has never been exposed to copyrighted content.

While copyright and IP law at present is massively broken, this is a very poor interpretation of the core argument at play.

Let me break it down:

  • Yes, all human-created art takes significant influence - purposefully and accidentally - from the work that has come before it
  • To have been influenced by that piece, legally, the human will have had to pay the copyright holder: go to the cinema, buy the Blu-ray, see the performance, go to the gallery, etc. Works out of copyright obviously don't apply here.
  • To be trained in a discipline, the human likely pays for teaching by others, and those others have also paid copyright holders to view the media that influenced them as well
  • Even though the vast majority of art is influenced by all other art, humans are capable of novel invention - i.e. things which have not come before - but GenAI fundamentally isn't

Separately, but related, see the arguments the Pirate Parties used to make about personal piracy being OK, which were fundamentally down to an argument of scale:

  • A teenager pirating some films to watch because they're interested in cinema, and being inspired to go to film school, is very limited in scope. Even if they pirate hundreds of films, it can't be argued that those are hundreds of lost sales, because the person may never have bought them anyway.
  • A GenAI company consuming literally all the artistic output of humanity, with no payment to the artists whatsoever, "learning" to create "new" art without paying for teaching, and spitting out whatever is asked of it, is massive copyright infringement on the consumption side, and an existential threat to the arts on the generation side

That's the reason people are complaining: they aren't being paid today, and they won't be paid tomorrow.

[–] [email protected] 10 points 1 year ago (1 children)

Most of these companies are just arguing that they shouldn't have to license the works they're using because that would be hard and inconvenient, which isn't terribly compelling to me. But Adobe actually has a novel take I hadn't heard before: they equate AI development to reverse engineering software, which also involves copying things you don't own in order to create a compatible thing you do own. They even cite a related legal case, which is unusual in this pile of sour grapes. I don't know that I'm convinced by Adobe's argument - I still think artists should have a say in whether their works go into an AI, and a chance to get paid for it - but it's the first argument I've seen in a long while that's actually given me something to think about.

[–] [email protected] 9 points 1 year ago (2 children)

This thread is interesting reading. Normally, people here complain about capitalism left and right. But when an actual policy choice comes up, the opinions become firmly pro-capitalist. I wonder how that works.

[–] [email protected] 4 points 1 year ago

Human beings are funny characters. They only care when it starts to affect them personally; otherwise they say all kinds of shit.

[–] [email protected] 4 points 1 year ago (1 children)

Everyone's always up for changing things until it comes to making the actual sacrifices necessary to enact the changes

[–] [email protected] 5 points 1 year ago

That's the thing. I don't see how there is sacrifice involved in this. I would guess that the average user here has personally more to lose than to gain from expanded copyrights.

[–] [email protected] 2 points 1 year ago

This is the best summary I could come up with:


The US Copyright Office is taking public comment on potential new rules around generative AI’s use of copyrighted materials, and the biggest AI companies in the world had plenty to say.

We’ve collected the arguments from Meta, Google, Microsoft, Adobe, Hugging Face, StabilityAI, and Anthropic below, as well as a response from Apple that focused on copyrighting AI-written code.

There are some differences in their approaches, but the overall message for most is the same: They don’t think they should have to pay to train AI models on copyrighted work.

The Copyright Office opened the comment period on August 30th, with an October 18th due date for written comments regarding changes it was considering around the use of copyrighted data for AI model training, whether AI-generated material can be copyrighted without human involvement, and AI copyright liability.

There’s been no shortage of copyright lawsuits in the last year, with artists, authors, developers, and companies alike alleging violations in different cases.

Here are some snippets from each company’s response.


The original article contains 168 words, the summary contains 168 words. Saved 0%. I'm a bot and I'm open source!
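For the curious, the bot's "Saved 0%" line is just a word-count ratio; a sketch of that arithmetic (the bot's actual implementation may differ):

```python
def savings(original_words, summary_words):
    """Percentage of words 'saved' by the summary, as the bot reports it."""
    saved = 1 - summary_words / original_words
    return f"Saved {saved:.0%}"

print(savings(168, 168))  # Saved 0%
```

With a 168-word article and a 168-word "summary", the saving is exactly zero, which is why the bot's output looks absurd here.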
