Interesting. I should have read the Cloudflare article, not just linked it. Of course, Anthropic does the bullshit it’s known for.
But I heard several security researchers are experimenting with their own harnesses. Seems to make quite a difference.
Cloudflare, for instance, shared that Mythos Preview was particularly adept at exploit chain construction, which is basically spotting how several bugs can be used to create a series of attacks that do more damage than a single exploited flaw.
But Anthropic also revealed that Mythos isn’t necessarily ready for primetime, which might also be part of why it’s actually keeping it to such a small and controlled base of users.
Cloudflare also noted that other models found a lot of the same bugs as Mythos—an observation that has been made elsewhere. A security company called Aisle tested several small, open-source models and was able to find the same vulnerabilities that Anthropic highlighted when it announced Mythos—vulnerabilities that went unnoticed by humans for decades.
My take is that Mythos is quite good, but is of course overhyped by Anthropic. Current models are quite capable now and experts can deploy them effectively. Good news is that a script kiddie will still have a hard time finding and creating workable exploits.
This is a question I keep asking myself about, well, the entire internet lately. “What if this all just doesn’t work this time?” What if AI companies can’t replace search? What if streaming video can’t replace Hollywood? What if short-form video apps can’t replace social media users? What if Silicon Valley can’t brute force their products — and ideology — onto the masses again? To take it even further, what if the dot com crash and the AI crash are actually part of the same 25 years epoch of technological stagnation? I’m not going to lie, I find this line of questioning exhilarating.
What if OP is high on drugs? What if he just strings a list of hot takes together that are getting more and more nonsensical? What if the only reason he talks about technological stagnation is because he still doesn’t have his flying car yet?? What if a meteor kills us all tomorrow??? I’m not going to lie, I find this line of questioning mind-numbingly boring.
Fair point. But is it even more of an ADHD amplifier than a modern smartphone connected to social media?
Sorry, I’m curious: what’s your workflow looking like when you’re dealing with LLMs?
Because I’m just tinkering with them as a hobby, and while I consider them erratic and certainly limited in many regards, I still find them useful. Even fun, but on the other hand I’m not forced to use them.
Somebody voted you down, let me upvote you for balance.
Well, I say: put the development of AI in the hands of academia and put regulations in place for the building of data centers. Consider environmental impact, force the use of renewable energy and prevent hardware shortages on the market. Let companies and organisations host their own models. Ban the use of phones in schools and shift the weight of exams in colleges and universities to oral exams that confirm the student actually knows this stuff.
Yeah lol. That could be done in Europe, I’m skeptical about the US.
Please send me some looted hardware if you are destroying data centers.
Exactly. I run some local models on my graphics card for fun and honestly, those things are getting kinda good.
But I guess if you’re on Lemmy, have a hateboner for LLMs and are deep in your media confirmation bubble, then you probably don’t care. A bit disappointing, I always thought the fediverse was full of people who are interested in new things. I really can’t blame them however, the techbros do a good job of pushing everyone away from their technology.
Sigh. This comic is as delusional as the rambling of Sam Altman about AGI.
AI is a tool and, despite all its shortcomings, not a disease. It’s important to wrest it out of the hands of the techbros and give it back to the people.
But it’s here to stay and no amount of yelling at clouds will change that.
Confidence is a huge deal and very important. Now just talk like a confident you, not like Donald Trump.
Make a comment saying “Luigi is good” or something like that; that should get rid of your Reddit account.
From what I gather, a different harness can make quite a difference. A model can work better or worse depending on the harness, that’s at least what I’ve heard from the community.
A harness for coding is probably different from a harness for agentic tasks like Hermes or opencode. … It probably also helps if you don’t vibe code your harness with little or no supervision. (Cough, Claude Code, cough)