Glasswing's First Month: 10,000 Critical Vulnerabilities and a Preview of What's Coming
Project Glasswing partners found more than 10,000 high- or critical-severity vulnerabilities in 30 days across roughly 50 organizations. The coming surge will overwhelm any team still remediating manually.

Anthropic published its first progress report on Project Glasswing last week. The numbers deserve attention from anyone responsible for a vulnerability management program.
In the project's first month, Glasswing's approximately 50 partner organizations used Claude Mythos Preview to find more than 10,000 high- or critical-severity vulnerabilities across their software. Separately, Anthropic scanned over 1,000 open-source projects with Mythos and surfaced an estimated 6,200 additional high- or critical-severity findings.
To put that in perspective: the entire CVE ecosystem has been publishing roughly 2,000 to 3,000 high- and critical-severity vulnerabilities per month in 2026. Mythos, working with a limited set of partners over 30 days, generated a volume of critical findings roughly three to five times what the global vulnerability disclosure process produces in a typical month, and that is before counting the open-source findings.
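As a rough check on that comparison, the ratio can be computed directly from the figures quoted above. This is only an illustrative sketch: the variable and function names are made up for the example, and the 10,000 figure is a lower bound, since the report says "more than."

```python
# Figures quoted in this article.
partner_findings = 10_000   # high/critical findings from ~50 partners (lower bound)
oss_findings = 6_200        # estimated additional findings from open-source scans
cve_monthly_low, cve_monthly_high = 2_000, 3_000  # high/critical CVEs per month in 2026


def ratio_range(findings, low, high):
    """Return (min_ratio, max_ratio) of findings against a monthly baseline range."""
    return findings / high, findings / low


partners_only = ratio_range(partner_findings, cve_monthly_low, cve_monthly_high)
combined = ratio_range(partner_findings + oss_findings, cve_monthly_low, cve_monthly_high)

print(f"partners only: {partners_only[0]:.1f}x to {partners_only[1]:.1f}x")
print(f"with open-source scans: {combined[0]:.1f}x to {combined[1]:.1f}x")
# partners only: 3.3x to 5.0x
# with open-source scans: 5.4x to 8.1x
```

The partner findings alone land at roughly 3.3x to 5x a typical month. Adding the open-source scan results pushes the range to roughly 5.4x to 8.1x.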

The chart above makes the scale visible. Six years of monthly CVE data, with a gradual upward trend that accelerated through 2024 and 2025, followed by a single red bar that dwarfs everything before it. That red bar is one model, scanning for one month, across a fraction of the world's software.
Why this understates the problem
Three things make the Glasswing numbers a floor, not a ceiling.
First, the partner set is small. Approximately 50 organizations, selected because they build and maintain systemically important software. Thousands of enterprise software vendors, SaaS providers, and open-source projects were not included. When broader scanning begins, the total volume of newly discovered vulnerabilities will be significantly larger.
Second, Mythos Preview is not publicly available. Anthropic has restricted access specifically because safeguards against misuse aren't mature enough for a general release. The findings in the Glasswing update come from a controlled, defensive program. Comparable capabilities will emerge from other AI labs. Anthropic acknowledges this directly in the update, noting that models with similar cybersecurity skills "will soon be more broadly available." Historically, leading-lab capabilities have tended to appear in open-source models within 12 to 18 months. When that happens, the vulnerability discovery rate will increase again, and this time without the access controls.
Third, the quality bar is high. Of the findings that have been triaged by independent security research firms, 90.6% were confirmed as true positives. This is not a flood of false alarms. These are real vulnerabilities, the kind that become the next round of CVEs, patches, and urgent remediation tickets.
Each of these factors points in the same direction: the volume of critical vulnerabilities that security teams will need to process is about to increase by an order of magnitude relative to what they've handled over the past several years.
The partner-level detail is consistent
Individual partner results reinforce the aggregate picture. Cloudflare found 2,000 bugs across its critical-path systems in a month, 400 of them high- or critical-severity, with a false positive rate its team considers better than that of human testers. Mozilla found and fixed 271 vulnerabilities in Firefox 150 using Mythos, over 10x the yield it saw from Opus 4.6 on a previous Firefox release. Palo Alto Networks' latest release included five times as many patches as a typical cycle. Microsoft has stated publicly that patch volumes will "continue trending larger for some time."
These aren't projections or extrapolations. They're operational results from the first organizations to use a frontier vulnerability discovery model. And the results are remarkably consistent across very different codebases: browser code, network infrastructure, cloud systems, enterprise security products.
The UK's AI Security Institute reported that Mythos Preview is the first model to solve both of its simulated cyberattack ranges end to end. XBOW, an independent security evaluation platform, described the model's precision as "absolutely unprecedented" on a token-for-token basis. The capability step-function that we described in April is now quantified.
The remediation math doesn't work at current speed
These findings all eventually land on someone's remediation queue. The question is whether current remediation workflows can absorb the volume.
Right now, they can't absorb the existing volume. The average enterprise takes over 60 days to close a critical vulnerability. Large organizations carry backlogs where 45% of identified vulnerabilities remain unpatched after 12 months. These numbers describe the baseline, before an order-of-magnitude increase in inbound critical findings.
Meanwhile, the time attackers need to weaponize a vulnerability has compressed to hours. Cogent Research published data last week showing that average time to exploit collapsed from 125 days in January 2025 to under one day in April 2026. At the same time, 62% of critical vulnerabilities with a known exploit had that exploit circulating before any scanner released a detection signature. For the highest-risk vulnerabilities, attackers can act before most security teams even know they're exposed.
The convergence of these two trends creates a problem that human-speed remediation workflows cannot solve. Inbound volume is increasing by a factor of five to ten. Exploitation timelines have compressed from months to hours. The 60-day remediation cycle that was already too slow for current CVE volumes becomes impossible when the inbound rate multiplies and the window to act shrinks to a fraction of a day.
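A toy backlog model shows why fixed remediation capacity breaks under a surge. The numbers below are hypothetical and chosen only for illustration: a team that currently keeps pace with 100 critical findings a month, then sees inbound volume rise fivefold (the low end of the range above) while its closure rate stays flat. Real programs are messier, so treat this as a sketch of the arithmetic rather than a forecast.

```python
def backlog_over_time(inbound_per_month, closed_per_month, months, start=0):
    """Track open findings month by month with fixed inbound and closure rates."""
    backlog = start
    history = []
    for _ in range(months):
        # Backlog grows by the gap between inbound and closed; it cannot go negative.
        backlog = max(0, backlog + inbound_per_month - closed_per_month)
        history.append(backlog)
    return history


# Hypothetical team: closes 100 criticals a month.
baseline = backlog_over_time(inbound_per_month=100, closed_per_month=100, months=12)
surge = backlog_over_time(inbound_per_month=500, closed_per_month=100, months=12)

# Months needed to clear the surge backlog if inbound stopped entirely.
months_to_clear = surge[-1] / 100

print(baseline[-1])      # 0
print(surge[-1])         # 4800
print(months_to_clear)   # 48.0
```

In this sketch the team that was breaking even ends the year 4,800 findings behind, a backlog that would take four years to clear at the same pace even if nothing new arrived.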
The scanner gap makes it worse
Glasswing's findings also highlight a structural limitation of scanner-based detection that compounds the volume problem.
Vulnerability scanners require a detection signature before they can identify affected systems. Those signatures take time to develop. Cogent Research's Q2 2026 Detection Gap Report analyzed 69,159 CVEs published since January 2025 and found that median scanner detection lag ranges from 0.1 days (Tenable) to 5.1 days (Rapid7). More than half of critical CVEs (55.7%) never received a scanner detection signature at all.
When AI-generated vulnerability disclosures arrive at the rate Glasswing previews, scanner vendors will face their own capacity constraints in writing new detection plugins. If signature development already lags exploitation at current volumes, the lag will widen as the rate of new disclosures accelerates.
Organizations that rely on scanner output as their sole source of vulnerability awareness will have growing blind spots during the exact window when exposure is highest.
What this means for security teams
The Glasswing data puts concrete numbers on a transition that security teams need to plan for now, not after it arrives in full.
The current CVE ecosystem, where a few thousand high- and critical-severity vulnerabilities appear each month, is the low-water mark. AI-driven vulnerability discovery at the scale Glasswing demonstrates will push that number significantly higher over the next 12 to 18 months. The organizations that absorb this increase without a proportional increase in breaches will be the ones that built two capabilities before the surge hit: detection that doesn't depend on scanner signatures, and remediation that operates autonomously where policy allows.
This is why we launched Zero Day Response and Autonomous Remediation earlier this week. Zero Day Response identifies exposure in customer environments within minutes of a new vulnerability disclosure by matching against software inventory rather than waiting for scanner signatures. Autonomous Remediation builds a contextualized fix, assesses change impact through pre-flight checks, executes through the customer's existing patch management and ITSM systems, and confirms the fix held through independent verification.
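To make the idea of inventory-based matching concrete, here is a minimal sketch. It is not the product's implementation: the `Advisory` and `InstalledPackage` types, the `exposed_assets` function, the product names, and the simple dotted-integer version scheme are all assumptions made for the example. The point is only that exposure can be computed from an advisory plus a software inventory, with no scanner signature involved.

```python
from dataclasses import dataclass


def parse_version(text):
    """Parse a dotted version like '2.4.1' into a comparable tuple (assumed scheme)."""
    return tuple(int(part) for part in text.split("."))


@dataclass(frozen=True)
class Advisory:
    product: str
    first_affected: str   # inclusive lower bound
    fixed_in: str         # exclusive upper bound


@dataclass(frozen=True)
class InstalledPackage:
    asset: str
    product: str
    version: str


def exposed_assets(advisory, inventory):
    """Return assets running an affected version of the advisory's product."""
    lo = parse_version(advisory.first_affected)
    hi = parse_version(advisory.fixed_in)
    return sorted({
        pkg.asset
        for pkg in inventory
        if pkg.product == advisory.product and lo <= parse_version(pkg.version) < hi
    })


# Hypothetical inventory and advisory.
inventory = [
    InstalledPackage("web-01", "examplelib", "2.4.1"),
    InstalledPackage("web-02", "examplelib", "2.6.0"),
    InstalledPackage("db-01", "otherlib", "2.4.1"),
]
advisory = Advisory("examplelib", first_affected="2.0.0", fixed_in="2.5.0")

print(exposed_assets(advisory, inventory))  # ['web-01']
```

A real system would also need to handle vendor-specific version formats, package aliases, and backported fixes, which this sketch ignores.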
Together, they compress the lifecycle from vulnerability disclosure to confirmed resolution from weeks to hours, and for straightforward cases, minutes. A vendor advisory published at 2 AM triggers ingestion. By 2:10 AM, affected assets are identified and risk-scored against the customer's actual environment. For assets in an autonomous remediation zone, the fix begins deploying before the security team's morning standup. For assets requiring human approval, the remediation plan is waiting in the queue with full context when the team arrives.
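The split between autonomous and approval-gated assets can be sketched as a simple routing step. Everything here is hypothetical and for illustration only: the `Finding` type, the 0-10 `risk_score`, the `autonomous_zone` flag, and the asset names are assumptions, not a description of how the product models policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    asset: str
    risk_score: float       # hypothetical 0-10 score, higher is riskier
    autonomous_zone: bool   # True if policy allows unattended remediation


def route(findings):
    """Split findings into an auto-remediation queue and a human-approval queue,
    each ordered from highest to lowest risk."""
    ranked = sorted(findings, key=lambda f: f.risk_score, reverse=True)
    auto = [f.asset for f in ranked if f.autonomous_zone]
    needs_approval = [f.asset for f in ranked if not f.autonomous_zone]
    return auto, needs_approval


findings = [
    Finding("build-agent-07", 6.5, True),
    Finding("payments-db-01", 9.8, False),
    Finding("edge-proxy-03", 9.1, True),
]
auto_queue, approval_queue = route(findings)

print(auto_queue)      # ['edge-proxy-03', 'build-agent-07']
print(approval_queue)  # ['payments-db-01']
```

Keeping the policy decision as an explicit flag on each asset means the highest-risk system can still sit behind human approval, as `payments-db-01` does here.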
That operational cadence was built for what Glasswing describes. Ten thousand new critical findings in a month, with exploitation windows measured in hours, require a response that matches the tempo of the threat. Manual triage, manual prioritization, and manual patch deployment at the current pace will leave organizations exposed during the period when the attack surface is growing fastest.
The vulnerability management programs that come through this transition will be the ones that recognized the volume increase as a structural shift and invested in detection and remediation infrastructure that operates at the speed the threat demands.






