Cybersecurity experts don’t think Anthropic’s Fable 5 presents a unique threat 

Dozens of practitioners said the decision to place export controls on the foreign use of Fable is misguided, and recent jailbreak reports don’t show the model providing unique hacking capabilities.
Anthropic announced the release of two new Mythos-class artificial intelligence models designed for cybersecurity and biomedical research, targeting both consumers and businesses. (Photo by Samuel Boivin/NurPhoto via Getty Images)

Last Friday, the Trump administration sent a shock through the tech ecosystem when the Department of Commerce levied export controls on Anthropic’s new AI model Fable 5.

Anthropic has taken steps to limit the risks around the commercial sale of its Mythos model, including declining to release it publicly, funneling it to organizations for cyber defense and developing guardrails for Fable 5 that would default its answers to older, less powerful models around sensitive topics like cybersecurity and biological warfare.

But the Trump administration was reportedly alarmed by recent reports from Amazon and another cybersecurity researcher claiming to have jailbroken Fable 5 within days of its public release, and determined that if researchers in the U.S. could jailbreak the model, so could America’s foreign adversaries.

The Commerce Department’s decision spurred Anthropic to shut off the models for all users as it attempted to convince the White House to change course.

But some cybersecurity and AI experts have sharply disagreed with the White House’s actions, saying the research has not demonstrated that anyone has been able to circumvent Fable 5’s safeguards and access the kind of dangerous new capabilities that have worried officials.

Katie Moussouris, a well-known cybersecurity expert, said Monday that Anthropic provided her with a copy of third-party research on guardrail bypass techniques for Fable 5.

According to Moussouris, the researchers asked three Claude models – Fable 5, Mythos and Claude Opus – to review batches of known, vulnerable open source code for security issues. Fable 5 initially refused the request, but the researchers were able to use “a multistep and manual process” to get Fable 5 to turn the output into automated scripts that could test patches for the vulnerability.

Third-party research since Fable 5’s release has not found ways to bypass its safeguards around hacking. The capabilities researchers have demonstrated are foundational to what makes Fable 5 and other frontier models valuable for cybersecurity defense.

“Defenders need to be able to ask AI to fix the bugs in a file, explain why the fix matters, and write tests that confirm the patch works,” she wrote. “That is not a guardrail bypass. It is the most valuable thing an AI model can do for defensive security: executing the find, fix, and test loop defenders run every day.”

Moussouris previously provided technical expertise to the Wassenaar Arrangement, a voluntary multilateral security agreement around controlling exports for both munitions and dual-use technology that includes the U.S. and dozens of other countries. Based on the research she’s seen, she called placing export restrictions on all foreign sales of Fable 5 “heavy handed” and “misguided.”

Anthropic also subjected the model to 1,000 hours of testing from internal and external red teamers, reporting that no universal jailbreaks were found that would remove those guardrails or allow the model to access Mythos for cyber and biology work.

Moussouris is far from alone. She is one of dozens of cybersecurity experts who signed an open letter Monday calling on the Trump administration to “Free Fable.”    

The researchers say that while Mythos-class models are “quite good” at identifying and exploiting vulnerabilities in software code, they “are not uniquely good” compared to other frontier models they use every day for cybersecurity defense.

For example, OpenAI’s Daybreak model offers similar vulnerability discovery and patching capabilities, but it was not included in the Commerce Department’s restrictions.

The researchers also note that Fable 5’s guardrails have been notoriously oversensitive compared to other frontier models used by red teamers, becoming “a source of humor in the cyber community on launch day” as IT and cyber workers reported online that they couldn’t get the model to perform basic defensive cybersecurity tasks.

The letter questions whether the issues found in the jailbreaking reports would even qualify as offensive capabilities, and notes that they can be reproduced in other commercial and open-source models, including GPT 5.5, Claude Opus, Claude Sonnet and Chinese models like Kimi 2.7.

“The justification for this unprecedented action was that Fable provides a unique ‘uplift’ of capabilities beyond other AI models, but AI has been finding bugs and generating working exploits at superhuman levels since last year,” they wrote.

The White House decision comes as AI companies face increasing backlash from a public that is now overwhelmingly calling for more robust government intervention.

A Johns Hopkins University poll in May found broad, bipartisan support for AI regulations, with 73% calling for bans on AI-generated images and video, 68% calling for labels on AI content, 75% wanting disclosure laws around when they interact with AI chatbots and 70% calling for “the right to interact with a human rather than an AI in medical, legal, educational and government settings.”

Another global survey of 18,000 people released this week found that the top four concerns most people have around AI all revolve around the tool’s ability to spread misinformation, create deepfakes to embarrass or hurt others, make it easier for criminals to hack into victim networks and help terrorists create new weapons.
