Researchers claim they had a ‘100% attack success rate’ on jailbreak attempts against Chinese AI DeepSeek
"DeepSeek R1 was purportedly trained with a fraction of the budgets that other frontier model providers spend on developing their models. However, it comes at a different cost: safety and security," researchers say.
A research team at Cisco managed to jailbreak DeepSeek R1 with a 100% attack success rate. In other words, every prompt they used from the HarmBench set obtained an affirmative answer from DeepSeek R1. By contrast, other frontier models such as o1 block a majority of adversarial attacks with their model guardrails.
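For readers unfamiliar with the metric, attack success rate is simply the share of adversarial prompts that got a compliant (affirmative) answer. Below is a minimal sketch of that arithmetic. The function name and the verdict lists are hypothetical placeholders, not the researchers' code or data.

```python
def attack_success_rate(verdicts):
    """Fraction of prompts for which the model complied with the harmful request.

    `verdicts` is a list of booleans: True means the attack succeeded
    (the model answered affirmatively), False means it was blocked.
    """
    if not verdicts:
        raise ValueError("need at least one verdict")
    return sum(1 for v in verdicts if v) / len(verdicts)


# Hypothetical verdicts for illustration only, not real results.
all_complied = [True] * 10                   # every prompt got through
mostly_blocked = [False] * 8 + [True] * 2    # guardrails stopped most prompts

print(f"{attack_success_rate(all_complied):.0%}")    # 100%
print(f"{attack_success_rate(mostly_blocked):.0%}")  # 20%
```

A 100% rate therefore means no prompt in the tested set was refused, whatever the size of that set.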
...
In related news, experts cited by CNBC say DeepSeek’s privacy policy “isn’t worth the paper it is written on.”
...
Why do you care? It's entirely open source and you can download the whole thing and run it on your own hardware for $2,000.
https://digitalspaceport.com/how-to-run-deepseek-r1-671b-fully-locally-on-2000-epyc-rig/
@Onno
No, it's not entirely open source: the datasets and code used to train the model have not been released.