this post was submitted on 31 Aug 2023
I work in this field a good bit, and you're largely correct. Trying to remove salt from a stew is a great analogy. The only issue with it is that removal is still technically possible: you could distill the stew and recover the salt, even though that would destroy the stew.
Once PII is in the model, it's fully baked. It'd be like trying to get the eggs out of a baked cake. The chemical composition has changed into something else completely.
That's how building a model works today. Like baking a cake.
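To make the cake point a bit more concrete, here's a rough toy sketch in plain Python. It is my own illustration, not how any production LLM is trained: the two-weight linear model, the made-up `records`, and the `train` helper are all assumptions for the example. It fits the same tiny model with and without one "sensitive" record. The result is just two numbers either way, and both of them shift when the record is included, so there's no slot in the weights you could point at and delete.

```python
def train(records, steps=2000, lr=0.2):
    """Fit y ~ w0*x0 + w1*x1 with plain gradient descent on squared error.

    Hypothetical toy trainer for illustration only.
    """
    w = [0.0, 0.0]
    n = len(records)
    for _ in range(steps):
        grad = [0.0, 0.0]
        for (x0, x1), y in records:
            err = w[0] * x0 + w[1] * x1 - y
            grad[0] += err * x0
            grad[1] += err * x1
        # Every record contributes to every weight on every step.
        w = [w[0] - lr * grad[0] / n, w[1] - lr * grad[1] / n]
    return w


# Made-up training data: ((x0, x1), y)
records = [
    ((1.0, 0.0), 2.0),
    ((0.0, 1.0), 3.0),
    ((2.0, 1.0), 7.5),
    ((1.0, 2.0), 9.0),  # pretend this one contains PII
]
sensitive = records[3]

w_with = train(records)
w_without = train([r for r in records if r is not sensitive])

print("trained with the record:   ", w_with)
print("trained without the record:", w_without)
# Both weights differ between the two runs. Nothing in w_with is labeled
# "this part came from the sensitive record"; in this toy, the way to get
# its influence out is to bake again without it.
```

With two weights you could still work the math backwards by hand. The assumption I'm illustrating is that with billions of weights and many passes over the data, that kind of bookkeeping isn't something today's tooling gives you.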
In order to remove or even identify PII in ML models or LLMs today, we'd need a whole new way of baking a cake, one that keeps the eggs separate from the cake until just before you take a bite. The tools today don't allow anything like that. They bake you a complete cake.