Logo
FrontierNews.ai

Claude Mythos Found 10,000 Security Flaws in One Month. Only 75 Have Been Fixed.

Anthropic's Claude Mythos Preview has uncovered more than 10,000 security flaws in just one month, exposing a critical gap between AI-powered vulnerability detection and the human capacity to fix them. While the discovery rate represents a major breakthrough for cybersecurity, the backlog of unfixed critical bugs is raising alarms about the safety of code being generated by AI tools like Claude Code.

What Is Claude Mythos and Why Does It Matter?

Claude Mythos Preview is Anthropic's latest AI model designed specifically to hunt for security vulnerabilities in software. As part of Project Glasswing, Anthropic partnered with roughly fifty major software vendors, including Cloudflare, Mozilla, and Microsoft, to test the model's ability to find bugs that human testers might miss.

The results have been staggering. Cloudflare alone discovered 2,000 security flaws in its systems, including 400 classified as high or critical severity. Mozilla found and fixed 271 vulnerabilities for Firefox 150 using Mythos Preview research. The false-positive rate was better than human testers, according to Cloudflare's team.

In the open source world, Mythos Preview identified a critical flaw in wolfSSL, a cryptographic library used by billions of devices worldwide. The vulnerability would have allowed attackers to forge certificates and create fake websites that appeared legitimate to users, potentially compromising banking and messaging platforms.

Why Is There Such a Massive Backlog of Unfixed Vulnerabilities?

Here's where the story takes a troubling turn. Anthropic scanned more than 1,000 popular open source projects and detected 6,202 vulnerabilities considered important or critical. However, the pace of discovery has far outstripped the pace of repair. To date, Anthropic has reported 530 critical flaws to open source project maintainers, but only 75 have been fixed, including 65 that resulted in a public security advisory.

The bottleneck is human labor. Fixing a critical bug identified by Mythos Preview requires two weeks of work by a human. Several maintainers with limited resources have asked Anthropic to slow down its disclosures so they have time to develop patches. Software vendors are also struggling to keep up with the volume of reports.

This creates a paradox: the same AI capabilities that are accelerating code generation through Claude Code may also be accelerating the discovery of security flaws faster than teams can address them. The validation process itself is rigorous. Anthropic, along with six independent security research firms, reviewed 1,752 flaws initially identified as important or critical. Of those, 90.6% were validated as genuine vulnerabilities, and 62.4% were confirmed as actually high or critical severity.

How Should Development Teams Respond to This Security Wave?

  • Prioritize Critical Patches: Focus remediation efforts on the 62.4% of vulnerabilities confirmed as high or critical severity rather than attempting to address all reported issues simultaneously.
  • Coordinate with Security Researchers: Establish communication channels with Anthropic and other security researchers to negotiate disclosure timelines that allow adequate time for patch development and testing.
  • Audit AI-Generated Code: When using Claude Code or similar tools, implement mandatory security reviews of generated code before deployment, particularly for systems handling sensitive data or user authentication.
  • Monitor Vulnerability Databases: Track published security advisories related to your dependencies, as the volume of disclosed vulnerabilities is expected to continue rising as AI-powered security tools become more prevalent.

What Does This Mean for Claude Code Users?

The timing of these discoveries is significant. Claude Code has become extraordinarily popular among developers since Anthropic released Opus 4.5 in November 2025, a model upgrade that could manage entire teams of AI sub-agents working on different parts of a program. Some developers reported coding at rates equivalent to teams of dozens or hundreds of engineers.

However, the surge in AI-generated code coincides with a surge in discovered vulnerabilities. According to a recent report from VulnCheck, the number of vulnerabilities discovered in products from vendors such as Google, Mozilla, and VMware has exploded. The UK AI Security Institute confirmed that Mythos Preview is the first model to end-to-end solve all of their cyber-games, which are multi-stage cyberattack simulations.

There's also a rumor circulating that Anthropic may integrate its Claude Mythos Preview model into Claude Code and Claude Security, though timing and access details remain unclear. If this integration occurs, it could provide developers with built-in vulnerability detection as they write code, potentially closing the gap between generation and security validation.

The broader context is that AI agents are fundamentally changing how code is written. Engineers who previously needed a week to ship a feature are now reporting deliveries in hours. Small teams are competing in speed with entire engineering departments. But this acceleration has created a new challenge: ensuring that the code being generated at superhuman speeds is also secure at superhuman standards.

"It is the most underestimated and massive launch I have ever witnessed in technology," said Thomas Reardon, a former Microsoft and Meta executive.

Thomas Reardon, Former Microsoft and Meta Executive

The security community is watching closely. The discovery of 10,000 vulnerabilities in a single month demonstrates both the power and the peril of AI-driven development. As Claude Code adoption accelerates, the pressure to fix vulnerabilities faster will only intensify. For now, development teams should treat AI-generated code with the same scrutiny they would apply to any third-party library, and security teams should prepare for an ongoing wave of vulnerability disclosures that will require sustained attention and resources.