China’s AI-Powered Anthropic Hack Is Just the Beginning

Chinese state-sponsored hackers turned Anthropic’s own AI against 30+ organizations — with AI agents autonomously executing up to 90% of the attack. It’s the dawn of a new era in cyber warfare, and democratic nations are dangerously unprepared.

Chief Product Officer Mike Krieger at the Code with Claude conference in May 2025 in San Francisco. (Don Feria/AP)

By experts and staff

Frontier AI company Anthropic announced last week that Chinese state-sponsored hackers had manipulated its AI model, Claude, in an attack that heralded a new era for AI and cybersecurity. The hackers used Claude to target approximately 30 organizations, including technology companies, banks, chemical manufacturers and government agencies, of which “a small number” were successfully infiltrated. While human hackers were overseeing the attack, Anthropic estimated that 80 percent to 90 percent of it was accomplished through AI agents working independently.

Anthropic’s announcement serves as a warning: The breakneck pace of AI advances means the future is consistently arriving ahead of schedule. It had long been clear that a significant AI-powered cyberattack would come at some point; what was not clear was when it would happen and who would be targeted. Anthropic alluded to this rapid evolution just last month when it stated, “We are now at an inflection point for AI’s impact on cybersecurity,” based on research findings that AI capabilities were doubling every six months. Yet the speed at which adversaries weaponized these advances was unexpected. “While we predicted these capabilities would continue to evolve, what has stood out to us is how quickly they have done so at scale,” Anthropic said in its report.

In the past two years, advanced AI tools have turbocharged the speed and complexity of both offensive and defensive cyber capabilities: Attacks are smarter and faster, but automation has also allowed cybersecurity teams to spot and respond to threats more quickly and accurately. In August 2024, experts predicted that by 2027, only 17 percent of total cyberattacks and data leaks would involve generative AI. And until this year, the consensus among cybersecurity professionals was that AI tools were improving defensive capacity faster than they were scaling offensive attacks. This year has seen a shift. In IBM’s 2025 Cost of a Data Breach report, 97 percent of the organizations that reported an AI-related security incident lacked proper AI access controls. Meanwhile, the European Union Agency for Cybersecurity (ENISA) reported that by early 2025, “AI-supported phishing campaigns reportedly represented more than 80 percent of observed social engineering activity worldwide.”

The Anthropic attack represents the dawn of a new era for AI agents in cyberattacks. With autonomous systems working quickly and without pause, a small team of human operators can dramatically multiply its hacking capacity. To be clear, this was not an entirely autonomous attack: Human hackers were monitoring the AI agents’ work, reviewing it for accuracy and redirecting the agents accordingly. By breaking the attack into tiny pieces and making each piece seem like a reasonable, legitimate request, the hackers and their agents successfully overcame Anthropic’s internal safeguards. The hackers directed the agents to search for vulnerabilities, write malicious code, steal passwords and extract data at speeds impossible for humans, making thousands of requests, often several per second.

Think of it this way: Instead of investing hundreds of hours figuring out how to manually break into their targets’ systems, the hackers tricked Anthropic’s own AI model into doing the work for them. (Ironically, the very hallucinations and errors that continue to plague all frontier AI models also plagued the hackers: They had to repeatedly double-check Claude’s work to make sure its claims were true.)

This article appears in part. To read the full piece, visit World Politics Review.