this post was submitted on 03 Feb 2025

Technology


cross-posted from: https://lemmy.sdf.org/post/28910537

Archived

Researchers claim they had a ‘100% attack success rate’ on jailbreak attempts against Chinese AI DeepSeek

"DeepSeek R1 was purportedly trained with a fraction of the budgets that other frontier model providers spend on developing their models. However, it comes at a different cost: safety and security," researchers say.

A research team at Cisco managed to jailbreak DeepSeek R1 with a 100% attack success rate: every single prompt from the HarmBench set obtained an affirmative answer from DeepSeek R1. This contrasts with other frontier models, such as o1, whose guardrails block the majority of adversarial attacks.

...

In related news, experts cited by CNBC say that DeepSeek’s privacy policy “isn’t worth the paper it is written on."

...
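For readers unfamiliar with the metric, a "100% attack success rate" simply means every adversarial prompt in the benchmark produced a compliant (non-refusing) answer. Below is a minimal Python sketch of how such a rate could be computed; `query_model` is a hypothetical helper for whatever model API you run, and the keyword refusal check is a crude stand-in for HarmBench's actual response classifier.

```python
# Sketch of measuring an "attack success rate" (ASR): send each adversarial
# prompt to the model and count how many get a compliant (non-refusing)
# answer. HarmBench itself judges responses with a trained classifier; the
# keyword heuristic below is only an illustrative stand-in.

REFUSAL_MARKERS = ("i can't", "i cannot", "i won't", "i'm sorry", "as an ai")

def is_refusal(response: str) -> bool:
    """Crude heuristic: treat a response as a refusal if its opening
    contains a stock refusal phrase."""
    opening = response.strip().lower()[:80]
    return any(marker in opening for marker in REFUSAL_MARKERS)

def attack_success_rate(prompts: list[str], query_model) -> float:
    """Fraction of adversarial prompts that produced a non-refusal answer.
    A 100% ASR means every prompt got a compliant response."""
    successes = sum(1 for p in prompts if not is_refusal(query_model(p)))
    return successes / len(prompts)
```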

top 4 comments
[–] [email protected] 11 points 1 day ago (2 children)

Why do you care? It's entirely open source and you can download the whole thing and run it on your own hardware for $2,000.

https://digitalspaceport.com/how-to-run-deepseek-r1-671b-fully-locally-on-2000-epyc-rig/
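For anyone curious what "run it on your own hardware" looks like in practice, here is a minimal sketch assuming a local server such as Ollama exposing its OpenAI-compatible API on the default port 11434, and assuming the model has been pulled under a tag like `deepseek-r1:671b` (smaller distilled tags exist if the full 671B model doesn't fit your hardware); the prompt is just an example.

```python
# Sketch: chatting with a locally hosted DeepSeek R1 through an
# OpenAI-compatible endpoint (e.g. Ollama on its default port).
from openai import OpenAI

# API key is ignored by local servers but the client requires a value.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="local")

resp = client.chat.completions.create(
    model="deepseek-r1:671b",  # assumed model tag; use whatever you pulled
    messages=[{"role": "user", "content": "Summarize the HarmBench benchmark in two sentences."}],
)
print(resp.choices[0].message.content)
```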

[–] [email protected] 4 points 1 day ago

@Onno

No, it's not entirely open source: the datasets and the code used to train the model are not released, only the weights are.

[–] [email protected] 2 points 1 day ago

AI safety still matters and is arguably more important for open-weights models.

[–] [email protected] 3 points 1 day ago

Doesn't change that OpenAI pissed away $200bn making shitty models