
 Why Anthropic Is Sounding the Alarm on the Next Generation of AI

Anthropic is urging its rivals to pursue an unprecedented regime of AI arms control. The explosive advances in the company’s technology illustrate why that effort may be imperative even though it will be difficult to achieve.

The Claude Fable 5 app logo is seen on a smartphone in this illustration photo taken in Italy on June 10, 2026. Matteo Della Torre/ Getty Images

Gordon M. Goldstein’s work focuses on emerging technology and international security. He is a former managing director at Silver Lake, a global technology investment firm, and the author of “Lessons in Disaster: McGeorge Bundy and the Path to War in Vietnam.”

The ascending artificial intelligence (AI) giant Anthropic is no longer simply a global technology power. Its cutting-edge AI models are increasingly central to U.S. national security. Four recent episodes illustrate this growing reality.

In April, Anthropic withheld the release of its model Mythos Preview, which self-created the most powerful cyber weapon in history, capable of finding more than ten thousand software vulnerabilities in computer networks believed to be highly secure. Earlier this month it was reported that the company had embedded half a dozen “forward deployed engineers” with the National Security Agency to conduct offensive AI cyber operations, presumably against China and Iran. Late last Friday afternoon, the Commerce Department ordered Anthropic to cut off access for all foreign nationals to its two most recent “frontier” models, citing undefined national security concerns. The dramatic dispute with the company, now playing out in the press, is yet another twist in Anthropic’s seemingly tortured relationship with the U.S. national security establishment.

But arguably the most important development came on June 4, when Anthropic issued a significant report on the pace of the AI race titled “When AI builds itself: Our progress toward recursive self-improvement, and its implications.”

Composed using breezy and sometimes casual prose that obscures its remarkable thesis, the company warned that the next AI breakthrough—perhaps two years away—could create an advanced model so powerful that it evades human control entirely. Anthropic urged its rivals and partners to come together and embark on an unprecedented effort to build a viable multilateral regime of AI arms control.

“Recursive self-improvement” is the anodyne term used by computer scientists to describe the next paradigm of AI. When it arrives, AI will have the capability to perfect and propagate itself, creating future iterations of ever more dynamic models that can prioritize their own survival and potentially self-exfiltrate across the Internet to computer networks around the globe. “If it were possible to effectively slow the development of this technology to give ourselves more time to deal with its immense implications, we think that would likely be a good thing,” Anthropic stated in its report.

Anthropic is absolutely right to issue a warning. But the company has understated both the risks of the new technology and the extraordinary barriers to controlling what promises to be a revolutionary next paradigm in AI.

Acceleration without limits

Anthropic’s report makes clear that progress toward recursive self-improvement is accelerating at an astonishing rate. In the second quarter of 2026, the typical engineer at Anthropic produced eight times as much code per day as they did just two years earlier. Eighty percent of the code Anthropic generates today is created by AI models, not human engineers.

These developments have occurred because the models that the Anthropic lab is creating have dramatically increased in speed. By April, the latest iteration of its Claude model could run its operating code fifty-two times faster than just eleven months earlier. The autonomous capabilities of its new models are perpetually growing. “The length of tasks that they can reliably complete on their own has been doubling roughly every four months,” Anthropic reports.

AI with the capacity for recursive self-improvement may be a game-changer for global security. The implicit risks of this technology should alarm even the most optimistic observer of the AI transition and serve as a wake-up call for the public.

AI attackers may be massively empowered. Although Anthropic does not discuss it in its report, an autonomous self-improving AI technology could simulate and design unique biological weapons and lethal chemical agents that no human has ever discovered or even contemplated. Future cyber weapons could have the capacity to autonomously generate, assign, and mutate “zero-day” attacks in real time, executing complex network infiltration at an unprecedented scale and speed. A recursive self-modifying AI cyber weapon could design a way to penetrate elaborately defended military networks and breach command-and-control operations.

Human oversight of AI models may be fatally weakened. The next generation of AI is designed to operate autonomously, without human direction, commands, or guidance. “Alignment” is the term computer scientists use as a semantic proxy—a misleading and deficient proxy—to describe the operational control of advanced AI models. “How the alignment problems get solved—or not—in this future is something we are the least certain about,” Anthropic concedes. The company offers a stark warning: “The rare occurrences of misalignment present in today’s models could compound as the models build their successors, growing more frequent but less understood until we lose control of them.”

Computational speed will be revolutionized. Although Anthropic does not discuss it in its report, the pace of AI innovation, already dramatically accelerated by access to mass-produced next-generation AI chips, will increase exponentially. The innovation timeline will be compressed from months to seconds because of AI’s capacity to continuously modify and perfect its own code. Advanced model development will be effectively instantaneous.

AI may communicate in an opaque language. Although Anthropic does not discuss it in its report, recursive self-improvement may allow AI to communicate with other AI models in ways incomprehensible to human operators. Because the system will dynamically and continuously rewrite its own algorithms, the resulting architecture may be mathematically illegible, preventing human operators from understanding, monitoring, and influencing the models’ behavior.

Why the AI race can’t slow down

Anthropic is proposing an extremely complex process of AI arms control. “A meaningful slowdown or pause,” Anthropic concludes, “would require multiple well-resourced labs at or near the frontier, in multiple countries, agreeing to stop under the same conditions. It would also require that each can verify that the others have actually stopped.”

The company explicitly acknowledges four great challenges to the proposition of AI arms control, and there is a fifth that it does not.

Time is the enemy of action. Anthropic notes that “the world has built verification regimes for other complex technologies,” such as “the Intermediate Range Nuclear Forces Treaty…but those regimes took decades to build both the infrastructure and the trust. We don’t have that long.”

The history of arms control is an inadequate model for the future. “Due to the unique characteristics of AI systems…this arms control problem is much more challenging than with other technologies,” Anthropic explains. “Training runs are far easier to conceal than missile silos, their inputs are general purpose, and the incentive to defect quietly is enormous, because whoever continues while others pause could inherit the lead.”

Verification mechanisms would need to account for the totality of actors in the global AI race. These systems of verification, the company argues, “would enable frontier AI developers to verify that others globally have stopped or slowed, and that a bad actor could not use the auspices of a coordinated slowdown to jump ahead in secret.”

Chip designers and manufacturers are essential to implementing a coordinated development pause. “In this world,” Anthropic asserts, “the pace of progress in AI development becomes determined entirely by the availability of compute.”

Total available computing capacity from AI chips across all major designers has grown by more than 300 percent per year since 2022. Nvidia, AMD, and Intel lead the global market. With $165 billion in annual revenue, TSMC of Taiwan dominates overall semiconductor manufacturing, including the fabrication of custom AI processors. Without control of the AI industry supply chain, including monitoring through physical verification mechanisms, enforcing a pause in advanced AI model development would be infeasible.

China is unlikely to play ball. Anthropic is silent on perhaps the single greatest barrier to AI arms control. The word “China” never appears in Anthropic’s analysis of managing the recursive self-improvement transition. The company barely acknowledges the broader geopolitical environment, a major driver of the current AI competition.

The United States appears to be ahead in developing advanced AI models, overshadowing Chinese AI labs such as DeepSeek, Alibaba Qwen, and ByteDance Seed. But that advantage may be evanescent because it is primarily based on the greater access U.S. AI companies presently have to industrial “compute” capacity, a lead China is determined to erase. Without Beijing, a global AI development pause will be out of reach. China has expressed some interest in security safeguards, but largely to dull the U.S. edge in the global AI race.

Will explosive growth spark a collective response?

Just a few years ago AI scientists regarded recursive self-improvement as an intriguing but hypothetical breakthrough. The locus of expert opinion has shifted, reflecting the spectacular advances in new AI models, which are pumped into the world on average every four months. When AI can refine, perfect, and replicate itself—and models can communicate in an opaque mathematical language while evading termination by self-exfiltrating across global computer networks—fundamental human control over the technology could evaporate.

Anthropic, alone among its rivals so far, has persuasively demonstrated through its own explosive growth and very recent history that this next paradigm of AI may arrive quickly. Logic suggests that two choices await. Industry leaders can be passive, allowing the future to unfold without attempting to shape it. Alternatively, a collective effort, even one confronting steep odds, can be catalyzed to prepare coherently for a very dangerous tomorrow. Anthropic seems to be committed to the latter path. As the company would say, pursuing this mission, despite its severe challenges, seems “likely to be a good thing.”

This work represents the views and opinions solely of the author. The Council on Foreign Relations is an independent, nonpartisan membership organization, think tank, and publisher, and takes no institutional positions on matters of policy.