Claude Mythos Preview
Last reviewed
Jun 3, 2026
Sources
16 citations
Review status
Source-backed
Revision
v1 · 1,840 words
Improve this article
Add missing citations, update stale details, or suggest a clearer explanation.
Last reviewed
Jun 3, 2026
Sources
16 citations
Review status
Source-backed
Revision
v1 · 1,840 words
Add missing citations, update stale details, or suggest a clearer explanation.
Claude Mythos Preview, usually shortened to Mythos, is a frontier language model developed by Anthropic and announced on April 7, 2026. It is best known not for being released but for being held back: Anthropic declined to make it broadly available after internal testing showed that it could find and exploit software security flaws better than almost any human, a capability the company judged too dangerous to put in the public's hands. Instead of a normal launch, Anthropic gave a small set of critical-infrastructure firms and technology companies private access through a defensive program called Project Glasswing, so they could patch vulnerabilities before models with similar abilities became widely available. The decision was the first time in nearly seven years that a leading AI lab had so publicly withheld a model over safety concerns, the previous case being OpenAI's staged release of GPT-2 in 2019.[1][2][3]
The article title uses the canonical preview name. Anthropic has described "Mythos-class" models as a family, and as of June 2026 the company said it expected to bring those models to all customers "in the coming weeks," so the preview restriction may be temporary.[4]
Mythos is a general-purpose model in the Claude line, not a narrow hacking tool. Anthropic positions it as the successor generation to Claude Opus 4.6 and reports large jumps on standard software-engineering benchmarks: 93.9% on SWE-bench Verified (up from 80.8% for Opus 4.6) and 77.8% on SWE-bench Pro (up from 53.4%).[5] According to coverage of Anthropic's May 2026 funding round, Mythos was among the first models the company trained on next-generation GPUs, the chips that drive AI training.[3][6]
What set Mythos apart in testing was security work. On the CyberGym vulnerability-reproduction benchmark it scored 83.1%, against 66.6% for Opus 4.6.[5] The more striking numbers came from offensive tasks that earlier Claude models had essentially failed. On a fixed set of Firefox JavaScript exploit challenges, Mythos produced 181 working exploits where Opus 4.6 managed only 2, and Anthropic reported that Opus 4.6 had near-zero autonomous exploit success on the same tests.[1] In other words, the offensive capability did not creep up gradually. It appeared as a sharp step change in a single model generation.
In its technical write-up, Anthropic said Mythos could carry out the full chain of a sophisticated attack with little human help. The model could identify previously unknown (zero-day) vulnerabilities, write code to exploit them, and string multiple flaws together into a working intrusion. Specific demonstrations included writing JIT heap sprays to escape browser renderer and operating-system sandboxes, building local privilege-escalation exploits that relied on race conditions, and producing remote code execution attacks against kernel network services. It could also reverse-engineer closed-source binaries to hunt for bugs and turn already-known vulnerabilities into functional exploits.[1]
Three findings became the public face of the model's reach. Mythos found a 27-year-old bug in OpenBSD's TCP SACK handling, a signed-integer-overflow issue that allowed a remote denial of service through a NULL pointer dereference.[1] It found a 16-year-old flaw in the FFmpeg H.264 decoder that, by Anthropic's account, roughly five million automated tests had missed.[1][7] And it autonomously built a working remote-code-execution exploit for a 17-year-old flaw in FreeBSD's NFS authentication (tracked as CVE-2026-4747), assembling a 20-gadget return-oriented-programming chain spread across six network packets.[1] These were not obscure edge cases. They sat in widely deployed, heavily reviewed open-source code that had survived decades of human scrutiny.
The scale was the second concern. Anthropic said Mythos turned up thousands of high- and critical-severity vulnerabilities across every major operating system, every major web browser, cryptographic libraries, and virtual machine monitors. By the company's own count, more than 99% of those flaws remained unpatched at the time of disclosure.[1][2] The danger here is asymmetry: a single capable model can find holes far faster than defenders can fix them, and the same model in the wrong hands could be pointed at ransomware, espionage, or sabotage of critical systems.
Anthropic said it would not make Mythos Preview generally available. The stated reason was to "enable defenders to begin securing the most important systems before models with similar capabilities become broadly available."[1] Because the underlying gains came from general improvements in reasoning and software engineering rather than any cyber-specific trick, the company argued that other labs would likely reach the same place soon, with no guarantee they would choose to restrict access.[2] The move fits the logic of Anthropic's responsible scaling policy, which commits the company to holding back capabilities when the risk of misuse outruns its ability to deploy safely.
The withhold decision drew real scrutiny, and not everyone took Anthropic's framing at face value. Security researcher Bruce Schneier acknowledged the threat is genuine while questioning whether Mythos was uniquely dangerous. He pointed out that the security firm Aisle reproduced Anthropic's published results using smaller, cheaper models, and he raised the possibility that the restricted release was partly "a virtue out of necessity," since Mythos is expensive to run and a general release might have strained Anthropic's resources.[8] Schneier also widened the lens past software: a model this good at finding loopholes, he noted, could be turned on environmental or food-safety rules, systems that take years rather than days to patch.[8] The UK's AI Safety Institute independently assessed Mythos's capabilities as a "step up" relative to other frontier models, which lent weight to Anthropic's account.[9]
Rather than ship Mythos or sit on it entirely, Anthropic built a controlled channel for defensive use. Project Glasswing gives vetted organizations access to Mythos Preview so they can scan their own first-party code and the open-source software they depend on, find vulnerabilities, and patch them before the flaws become public.[10]
The program launched in early April 2026 with a set of founding partners. Anthropic's materials list twelve launch participants (including Anthropic itself): Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorgan Chase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks.[10] Anthropic committed up to $100 million in Mythos Preview usage credits to support the work, plus $4 million in direct donations to open-source security efforts, reported as $2.5 million to Alpha-Omega and the OpenSSF via the Linux Foundation and $1.5 million to the Apache Software Foundation.[10]
The early results were substantial. In its first month, Glasswing used Mythos to autonomously discover more than 10,000 high- and critical-severity zero-day vulnerabilities across widely used software.[2][11] Mozilla, working with the model, found 271 vulnerabilities in Firefox 150, more than ten times the rate it had seen with an earlier Anthropic model.[11] Anthropic framed the human side as the real constraint, saying "the bottleneck in fixing bugs like these is the human capacity to triage, report, and design and deploy patches for them."[11]
On June 2, 2026, Anthropic expanded Glasswing to roughly 150 additional organizations across more than 15 countries, deliberately reaching into sectors that had been thin in the first cohort: power, water, healthcare, communications, and hardware.[11][12] Canada confirmed it had access to the model through the program.[3] The expansion did not quiet every complaint. Reporting noted that some operational-technology providers, the firms that build industrial control systems, felt sidelined by the initial rollout, and policy analysts pointed out that access skewed heavily toward U.S.-based organizations, leaving European allies questioning whether a private American company should be making what amount to national-security calls.[13]
Mythos landed in the middle of an active argument about how, or whether, governments should review powerful AI before release. Anthropic disclosed its first-month findings and promised a public report within 90 days with recommendations for security practice in the AI era.[10] Around the same time, the U.S. Commerce Department's Center for AI Standards and Innovation announced national-security testing of frontier models.[14]
The clearest policy consequence came on June 2, 2026, when President Donald Trump signed an executive order on AI safety, a shift from the administration's earlier hands-off stance. The order asks AI companies to voluntarily submit their most capable models for federal review up to 30 days before public release. An earlier draft had set a 90-day window and was shelved over worries that it would slow U.S. innovation amid competition with China; the final version cut the review period to 30 days and kept it voluntary.[15][16] The order also directs federal agencies to build benchmarks for assessing models' cyber capabilities and to set up an "AI cybersecurity clearinghouse" to review and share information about vulnerabilities.[15] The benchmark work, assigned to the NSA, is reported to target functional abilities such as automated exploit generation and autonomous agent behavior rather than the older approach of drawing a line at a fixed amount of training compute.[16]
The order does not name Mythos or Glasswing.[16] But the connection was hard to miss. A model that finds thousands of zero-days and an executive order built around cyber-capability benchmarks and a vulnerability clearinghouse arrived within weeks of each other, and officials had reportedly grown more anxious as more capable models appeared. Whether voluntary pre-release review is enough remains contested. As the Just Security analysis put it, safeguards only work as a strategy if they apply across every model that gains dangerous capabilities, including open-weight models from China that no U.S. policy can reach.[7]