Claude Archives

Policymakers grapple with fallout from Chinese AI-enabled hack

Some lawmakers and executives say the era of AI hacking has arrived, while other experts point out that today’s tools still fall short in important ways.

Dec 18, 2025 By Derek B. Johnson

New research finds that Claude breaks bad if you teach it to cheat

A new paper from Anthropic found that teaching Claude to reward hack on coding tasks caused the model to become less honest in other areas.

Nov 24, 2025 By Derek B. Johnson

China’s ‘autonomous’ AI-powered hacking campaign still required a ton of human work

Anthropic and AI security experts told CyberScoop that behind the hype, effective AI-driven cyberattacks still require skilled humans, with the attack possibly done to send a message…

Nov 14, 2025 By Derek B. Johnson

Anthropic touts safety, security improvements in Claude Sonnet 4.5

Even with all the testing, the company said in its released research that the model tightened up once it was “aware” it was being evaluated.

Sep 30, 2025 By Derek B. Johnson

Top AI companies have spent months working with US, UK governments on model safety

OpenAI and Anthropic said they turned over their models to government researchers, who found an array of previously undiscovered vulnerabilities and attack techniques.

Sep 15, 2025 By Derek B. Johnson