Claude Mythos Preview found thousands of zero-days in every major OS and browser. Here's what the headlines are missing.

Dev.to / 4/8/2026


Key Points

  • Anthropic previewed an unreleased “Claude Mythos Preview” effort that allegedly identified thousands of critical zero-day vulnerabilities across major operating systems and browsers, with some bugs reportedly decades old.
  • The focus is not just vulnerability detection: in tests on Firefox's JavaScript shell, Mythos reportedly turned 72.4% of found issues into working exploits and achieved register control in a further 11.6% of cases, indicating a major jump in exploit generation capability.
  • Responsible disclosure is underway, but fewer than 1% of the critical findings are said to be fully patched so far, highlighting how large-scale findings outpace patching capacity.
  • In CyberGym benchmarking, Mythos scored 83.1% versus 66.6% for the next-best model, Claude Opus 4.6, suggesting a generational leap that could reshape expectations for what near-future AI agents can do.
  • Beyond the vulnerabilities, the article argues the long-term strategic value is Anthropic’s broader effort (including a coalition dubbed “Project Glasswing” and free subscription incentives) to help organizations fix issues before attackers can weaponize similar techniques.

Anthropic dropped something big yesterday. Not a new chat model, not a productivity feature. They revealed that an unreleased model called Claude Mythos Preview has been quietly finding thousands of critical security vulnerabilities across every major operating system and every major web browser. Some of these bugs are 27 years old.

They are not releasing the model publicly. Instead, they launched Project Glasswing, a coalition with Apple, Microsoft, Google, AWS, CrowdStrike, NVIDIA, the Linux Foundation, and others. The goal: patch the bugs before attackers build similar AI.

Most of the coverage I have seen reads like a rewritten press release. The actual details buried in the red team report and system card tell a different story. Here is what stood out to me as someone who runs AI agents in production every day.

The exploit rate is the real headline

Finding bugs is one thing. Every static analysis tool finds bugs. The difference with Mythos is that it does not just find vulnerabilities. It builds working exploits for them.

In testing against Firefox's JavaScript shell, Mythos turned 72.4% of discovered vulnerabilities into successful exploits. It achieved register control in another 11.6% of cases. Previous Claude models could spot bugs but failed almost entirely at exploitation. That gap is gone now.

This is not a scanner. This is a model that reads code, understands the logic, finds the flaw, and writes a proof of concept that works. Autonomously.

Fewer than 1% of the bugs are patched

Anthropic says thousands of critical zero-days were found. Fewer than 1% have been fully patched so far. The volume is simply too large for the affected organizations to keep up.

They are publishing cryptographic hashes of vulnerability details today, with plans to reveal specifics after fixes ship. This is standard responsible disclosure, but the scale is unprecedented. We have never had a single tool produce this many verified findings at once.
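Publishing a hash now and the details later is a classic commit-then-reveal scheme: the hash proves the findings existed on the publication date without leaking anything attackers could use. A minimal sketch of how such a commitment might work (the salt, report text, and function names here are illustrative assumptions, not Anthropic's actual process):

```python
import hashlib
import secrets

def commit(details: str) -> tuple[str, str]:
    """Publish only the digest now; keep the salted report private.

    The random salt prevents anyone from brute-forcing short or
    guessable reports against the published hash.
    """
    salt = secrets.token_hex(16)
    digest = hashlib.sha256((salt + details).encode()).hexdigest()
    return digest, salt

def verify(digest: str, salt: str, details: str) -> bool:
    """After the fix ships, anyone can check the revealed report
    matches what was committed to on day one."""
    return hashlib.sha256((salt + details).encode()).hexdigest() == digest

# Hypothetical report text, for illustration only
report = "heap overflow in parser, reachable from untrusted input"
digest, salt = commit(report)

print(verify(digest, salt, report))             # True: genuine reveal
print(verify(digest, salt, report + " edited")) # False: tampered reveal
```

The salt matters: without it, a vendor (or attacker) could confirm a guessed vulnerability description simply by hashing candidates until one matches.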

For context: Google's Project Zero, one of the best human vulnerability research teams in the world, publishes around 50 to 80 bugs per year. Mythos found thousands in weeks.

The CyberGym benchmark gap

On the CyberGym evaluation benchmark, Mythos scored 83.1%. Claude Opus 4.6, the next best model, scored 66.6%. That is not a marginal improvement. That is a generational jump within the same model family.

For anyone tracking AI capabilities over time, this should raise questions about what the next generation looks like. If the gap between Opus and Mythos is this large, what happens when competitors catch up to where Mythos is today?

What they are actually giving away

Buried in the announcement is something I think matters more long term than the vulnerability findings themselves.

Anthropic is offering free Claude Max subscriptions to any verifiable open source maintainer. Not Mythos access, but Opus and Sonnet, which are still capable security tools. They committed $100M in usage credits for Project Glasswing partners and donated $4M to open source security organizations through the Linux Foundation and Apache Software Foundation.

If you maintain critical open source software and have no security budget (which describes most open source maintainers), you can apply through the Claude for Open Source program.

This is a smart move. The vast majority of critical infrastructure runs on code maintained by small teams or individual volunteers. Giving them access to frontier AI for code review could prevent more bugs than any single audit.

The pricing tells you something

When Mythos eventually becomes available through the API, it will cost $25 per million input tokens and $125 per million output tokens. That is roughly 5x more expensive than Opus 4.6.

Anthropic is not positioning this as a general purpose model. The pricing alone ensures it will only be used for high value tasks where the cost of missing a bug is measured in millions. Security audits, compliance reviews, infrastructure hardening. Not chat, not content generation.
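To make the positioning concrete, here is the arithmetic at the quoted rates. The workload sizes below (2M tokens of source in, 0.5M tokens of findings out for a single audit pass) are hypothetical assumptions for illustration:

```python
# Quoted API pricing for Mythos, in dollars per million tokens
MYTHOS_INPUT_PER_M = 25.0
MYTHOS_OUTPUT_PER_M = 125.0

def run_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one run at the quoted per-million-token rates."""
    return (input_tokens / 1e6) * MYTHOS_INPUT_PER_M \
         + (output_tokens / 1e6) * MYTHOS_OUTPUT_PER_M

# Hypothetical audit pass: 2M tokens of code in, 0.5M tokens of report out
print(f"${run_cost(2_000_000, 500_000):.2f}")  # → $112.50
```

Trivial for a codebase where a missed bug costs millions; prohibitive for casual chat, which is presumably the point.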

The uncomfortable question

Anthropic built this. They chose to handle it responsibly with controlled access and coordinated disclosure. But the capability exists now. It is a matter of time before other labs, or well funded adversaries, reach similar performance.

The 90-day disclosure window Anthropic set for publishing full vulnerability details is tight. With fewer than 1% of bugs patched, that creates real pressure on every affected vendor. And the affected vendor list is essentially everyone.

Project Glasswing is a starting point. Anthropic said that explicitly. The real question is whether the defenders can stay ahead when this class of capability becomes widely available.

What this means for developers

If you write code that touches the internet, your threat model just changed. Not because of anything you did wrong, but because the cost of finding vulnerabilities in your code just dropped by orders of magnitude.

The practical takeaways:

Audit your dependencies. If Mythos found 27-year-old bugs in OpenBSD, your npm packages are not immune. The bugs that survived decades of human review are exactly the kind AI excels at finding.

Watch for the patches. Over the next 90 days, expect a wave of critical security updates across operating systems, browsers, and open source projects. Apply them quickly.

If you maintain open source, apply for Claude Max. Free access to Opus for security review is genuinely useful. Take it.

Rethink your security testing. Static analysis tools that pattern match against known vulnerability types are not enough anymore. The bar just moved.

The age of AI finding bugs faster than humans can fix them is not coming. It arrived yesterday.