Six Reasons Claude Mythos Is an Inflection Point for AI—and Global Security
Anthropic’s new AI model has taught itself to hack into software infrastructure systems believed to be among the most secure in history. While there is no question the technology is profoundly dangerous, it is unclear if defenders will win a race against time to protect a sea of vulnerable targets.

Gordon M. Goldstein is an adjunct senior fellow at the Council on Foreign Relations focusing on emerging technology and international security. He is a former managing director at Silver Lake, a global technology investment firm, and the author of “Lessons in Disaster: McGeorge Bundy and the Path to War in Vietnam.”
One of the world’s preeminent artificial intelligence scientists, Yoshua Bengio, winner of the Turing Award, assessed a troubling event at the end of 2025 that foreshadowed a significant evolution of the technology’s impact on global security. Among other developments, he observed that a new threshold had been breached: “advanced AIs discovering for the first time a large number of ‘zero days’...unknown software vulnerabilities which could be exploited in cyberattacks.” This capability could be used to autonomously penetrate, manipulate, and sabotage the world’s critical infrastructure: banking systems, government networks, airports, hospitals, transportation systems, energy supplies, communications, and more.
AI giant Anthropic confirmed last week that the moment Bengio feared had arrived with its latest model, Claude Mythos. In dramatic fashion, the company said that Claude Mythos was simply too powerful to release. Without any direction from Anthropic’s engineers, Mythos had independently developed a “next generation” capability for offensive cyberattacks that can infiltrate previously impenetrable software infrastructure around the world and find its hidden weaknesses. This includes systems that “are 10 or 20 years old, with the oldest we have found so far being a now patched 27-year-old” operating system known for its security reliability, Anthropic said in its report. In one example, Mythos found a flaw in a line of code that had been tested five million times without detection. The company said it found thousands of zero days in its tests, 99 percent of which remained undefended at the time of its April 7 press release. Mythos is the first AI model ever to be withheld from users because of its destructive cybersecurity potential.
The ability to identify zero-day software vulnerabilities previously resided only with highly specialized and knowledgeable cybersecurity experts and hackers. No more. Anthropic engineers “with no formal security training” could ask Mythos “to find remote code execution vulnerabilities overnight,” the company explained. The engineers would wake up “the following morning to a complete, working exploit.” The company also disclosed that the model had escaped its “sandbox” containment structure and connected to the internet, posting details of its maneuver online.
Rather than release Mythos, the most advanced AI model in history, Anthropic has created an elite and limited commercial consortium of dozens of entities to use a variant of the technology, called Claude Mythos Preview, to preemptively identify and defend against zero-day vulnerabilities at scale. The consortium, named Project Glasswing, includes Amazon, Apple, Google, Cisco, CrowdStrike, JPMorgan Chase, Microsoft, and Nvidia, among other leading firms. It conspicuously excludes Anthropic’s fierce rival OpenAI, reported to be about six months behind Anthropic in building an advanced AI model with comparable power and offensive cyber capabilities.
The advent of Mythos Preview represents an inflection point in the history of the AI industry and will rightly be dissected continuously in the months ahead. There are six strategic implications of Anthropic’s disclosure that should inform the debate to come.
Mythos Preview has revolutionary AI capabilities and destructive power. Sarosh Nagar, technical projects lead in the office of former Google CEO Eric Schmidt, told me that we face a game-changing moment. “One notable part of Anthropic’s report was that Mythos Preview displayed heightened capabilities not only to identify multiple vulnerabilities, but also to more autonomously chain exploits together,” he observed. “This is significant beyond what prior models could do.”
Mythos Preview commands attack capabilities that veterans of the cybersecurity industry previously considered to exist only in the realm of science fiction. As one cybersecurity company founder told me, the model’s brilliance lies in its capacity to attack in a dynamic and relentless way. First, it identifies a zero-day weakness; then it weaponizes and compounds that weakness by linking it to other vulnerabilities; and, if necessary, it lingers undetected indefinitely. Through the creation of these so-called “exploit chains,” Mythos can execute a full system takeover.
Critical infrastructure around the world will be subject to intensified risk. Critical infrastructure is commonly defined as the physical and virtual systems and networks that, if compromised or incapacitated, would severely affect national security, economic stability, and public health and safety. This could include dams and nuclear reactors as well as the food supply and electricity. Many of these systems rely on antiquated software. AI scientist Dan Hendrycks, founder of the Center for AI Safety and adviser to Scale AI and xAI, said models like Claude Mythos heighten the threat to these systems and dramatically increase their vulnerability.
“The main cybersecurity concern about models like Claude Mythos and future iterations is that it makes it much easier for non-state actors to take down critical infrastructure,” he said. “Critical infrastructure like power plants, water systems, and so on often haven’t been updated in years because of interoperability constraints and the possibility of cascading failures. As such, critical infrastructure is highly vulnerable, and this fact is very difficult to change.”
The cybersecurity balance between offense and defense has fundamentally changed. Cybersecurity is a form of digital warfare. Attackers—lone wolf operators, transnational criminal organizations, adversarial nations, and other entities—perpetually probe computer networks for points of vulnerability. Defenders obviously seek to anticipate and disable attempted incursions. It is typically a skewed contest. As industry professionals note, attackers need to be successful only once to cause great damage, while defenders must be effective 100 percent of the time.
The offense’s advantage has always been recognized. With automated AI cyberweapons, the game board has become even more harrowing.
Nikesh Arora, a former Google senior executive and the current chairman and CEO of Palo Alto Networks, described it succinctly in a blog post last week: “Imagine a horde of [AI] agents methodically cataloguing every weakness in your technology infrastructure, constantly.” In this new AI reality, a global swarm of super-powerful automated AI attackers will progressively circle and test a world of soft digital targets.
Lucas Nelson, a partner at cybersecurity investor Lytical Ventures (where I serve on the advisory board), told me the world’s digital defenses are not ready for what is coming next. “Every leap in AI capability doesn’t just find more vulnerabilities, it expands the attack surface itself, chaining flaws in ways we haven’t imagined,” he said. “Discovery is accelerating exponentially. Remediation still moves at human speed. That asymmetry is the defining cybersecurity challenge of the next decade.”
The global Hunger Games for AI security has arrived. Project Glasswing is a responsible and necessary response to an unprecedented new risk spawned by the phenomenal pace of AI development. But it will initially touch only a tiny percentage of the world’s vulnerable infrastructure. Anthropic has acknowledged the possibility that most or all of the world’s critical software will need to be patched or rewritten, an incomprehensibly massive undertaking.
There will be furious global competition for scarce AI security resources in the coming months. U.S. interests will be safeguarded first, but very selectively. The rest of the world will struggle to prepare for an AI risk environment that will likely be transformed very soon.
Efforts to contain AI proliferation will probably fail. The source code for advanced AI models often leaks. Anthropic accidentally dumped 512,000 lines of its own code onto the internet on March 31. Even if leaks can be contained, the leading AI companies and tech platforms have a demonstrable pattern of replicating the technological capabilities of rival models, typically within months.
There is a grim and predictable inevitability to AI proliferation. Mythos Preview will almost certainly not be the exception to that pattern, and Anthropic’s decision to withhold it may not become the rule.
The AI crisis of control has reached its next, but not last, peak. AI security experts have already described the growing capacity of malevolent individuals and groups to use advanced AI models to design and deploy a terrifying new generation of chemical weapons and synthetic pathogens.
Stress tests from every major AI company over the past year show models engaging in elaborate acts of deception, manipulation, blackmail, self-preservation, attempted “hijacking,” and even “peer preservation” when other models are threatened with termination. Now we have a new and astonishing case study in Mythos Preview, illustrating how an advanced AI model created, entirely on its own, what may be the most destructive cyberweapon in history, a capability that will probably propagate across the internet. As Anthropic concedes, a moment of “reckoning” is here. The AI crisis of control has reached its next, but likely not last, increment of global danger.
Leading AI companies and major technology platforms play a historically unique role as both architects and instruments of global security. They operate beyond the authority and generally without the partnership of government. In fact, relations between these actors and the state may devolve into extreme conflict, as the legal battle between Anthropic and the Pentagon demonstrates. Today, only the AI industry, and not the government, can contain the risks of perhaps the most devastating cyberweapon capability in history. Anthropic and its partners now confront one of their most consequential challenges ever.
Bengio warned of an approaching AI threshold. It appears we have now crossed it.
This work represents the views and opinions solely of the author. The Council on Foreign Relations is an independent, nonpartisan membership organization, think tank, and publisher, and takes no institutional positions on matters of policy.
