this post was submitted on 22 Aug 2023
767 points (95.7% liked)

OpenAI now tries to hide that ChatGPT was trained on copyrighted books, including J.K. Rowling's Harry Potter series

A new research paper laid out ways in which AI developers should try to avoid showing that LLMs have been trained on copyrighted material.

[–] [email protected] 17 points 1 year ago (1 children)

I am sure they have patched it by now, but at one point I was able to get ChatGPT to give me copyrighted text from books by asking for ever larger quotations. It seemed more willing to do this with books that are out of print.

[–] [email protected] 5 points 1 year ago (1 children)

Yeah, it refuses to give you the first sentence from Harry Potter now.

Which is kinda lame; you can find that on thousands of webpages, many of which the system indexed.

If someone was looking to pirate the book there are way easier ways than issuing thousands of queries to ChatGPT. Type "Harry Potter torrent" into Google and you will have them all in 30 seconds.

[–] [email protected] 1 points 1 year ago

ChatGPT has a ton of extra query qualifiers added behind the scenes to ensure that specific outputs can’t happen.