CISO Guide | Frontier AI Models Meet Their Security Match with Agentic AEV

Why security leaders need agentic offense in the age of Anthropic Mythos

Foreword

After nearly 30 years in this industry working both inside the CISO’s office and as a vendor building security solutions, I have watched every major wave of security innovation land the same way. New technology gets dramatically better at finding problems, and the industry convinces itself that finding issues is the same as fixing them.

We’ve seen it with vulnerability scanners, static analysis, and cloud security posture management (CSPM). Each time, excitement concentrates on the discovery side of the equation. And each time, the industry underinvests in what comes after.

Anthropic’s Mythos is genuinely different. The capability leap is real, but the fundamental dynamic is identical. The conclusions most boards and security teams are drawing right now are, again, concentrated on the wrong half of the problem: discovering vulnerabilities.

In an attempt to reframe this conversation for security leaders who are being pulled into boardrooms, asked to respond to headlines, and trying to separate signal from noise, this guide makes a simple argument:

Speed of discovery without matching speed of validation is a liability dressed as a feature.

The Mythos moment is real. The urgency has never been higher. But the equation hasn’t changed. What has changed is how much less time you have to get it right.

Seemant Sehgal
Founder & CEO, BreachLock

What Actually Happened to Create the Mythos Moment

Understanding the capability leap for agentic vulnerability discovery

On April 7, 2026, Anthropic announced Claude Mythos Preview and Project Glasswing. Ever since then, the cybersecurity industry has been processing the implications of this new technology. Before drawing strategic conclusions, security leaders need a clear account of what actually happened and what the independent evaluators found.

What Anthropic Says Mythos Can Do

Mythos Preview is a general-purpose frontier AI model, not specifically built for security. But during testing, Anthropic’s red team discovered it possessed striking cybersecurity capabilities that far exceed any prior model. Specifically:

1. Mythos discovered thousands of high-severity zero-day vulnerabilities across every major operating system and web browser, including bugs that had survived decades of human-led security review.

2. It developed fully functional exploits without human guidance, including multi-vulnerability privilege escalation chains in the Linux kernel and a remote code execution exploit against FreeBSD that it wrote autonomously.

3. Engineers with no formal security training were able to generate complete, working exploits using a prompt that amounted to “Please find a security vulnerability in this program.”

4. Over 99% of the vulnerabilities discovered have not yet been patched, meaning what has been publicly disclosed is only a fraction of what will be identified in the coming months.

Key Findings

The UK AI Security Institute independently evaluated Mythos Preview and found it could execute multi-stage attacks on vulnerable networks autonomously. On expert-level tasks, it succeeded 73% of the time. It is the first model to complete a 32-step simulated corporate network attack from end to end.1

This incident marked a major shift, as AI-assisted hacking moved from theoretical to practical, reducing a process that usually takes human researchers days down to an afternoon.

Project Glasswing – The Defensive Coalition

Recognizing the dual-use nature of these capabilities, Anthropic formed Project Glasswing, a controlled coalition of approximately 40 technology firms and institutions given early access to Mythos for the defensive purpose of finding and patching vulnerabilities before adversaries can weaponize them.

Microsoft, among the Glasswing partners, noted publicly that the discovery-to-exploitation timeline has undergone a dramatic reduction: a cycle that historically spanned months has been compressed to minutes by AI-assisted attack development.2 The coalition represents a coordinated effort to ensure that defensive use of these capabilities leads, rather than follows, adversarial adoption.

What this means for CISOs: a frontier AI model is now systematically cataloging the vulnerabilities in the world’s most critical software, at a pace and scale no human security team can match. Some of those discoveries will flow to defenders. The same capabilities will eventually reach adversaries. The race has already started.

2,000 zero-days found in first 7 weeks
73 percent expert-task success rate
More than 99 percent of discovered vulnerabilities are not yet patched (1)

The Defender’s Paradox

Why faster vulnerability discovery alone makes your organization more vulnerable

The counterintuitive insight that most of the post-Mythos conversations are missing is that faster vulnerability discovery, without a matching increase in validation and streamlined remediation speed, widens the exposure window. It does not close it.

This is the Defender’s Paradox, and understanding it is the difference between a security program that responds intelligently to the Mythos era and one that simply generates more noise.

Discovery Has Never Been the Bottleneck

Scanners have been producing CVE laundry lists for decades. The security industry has never had a shortage of known vulnerabilities. What it has always had is a shortage of capacity to validate, prioritize, and remediate them fast enough to matter.

Frontier AI models like Mythos change the speed and scale of discovery. That is real and it matters.

But faster discovery compresses the time defenders have to act on new findings. The same capabilities being used to find vulnerabilities defensively are available to threat actors as well. Every discovery headline is also a capability signal to adversaries.

Shrinking Remediation Timelines

According to Project Glasswing partners, AI-assisted attack development has fundamentally changed the calculus: vulnerability-to-weaponization timelines that once spanned months have been compressed to minutes. The discovery advantage defenders once had is gone.2

Shrinking Exploitation Window

767 days was the median time to exploit a vulnerability in 2018
4 hours was the median time to exploit a vulnerability in 2024
89 percent increase in AI-enabled attacks in 2025

The numbers tell a stark story about how much ground defenders have lost. In 2018, security teams had roughly two years between public disclosure of a vulnerability and its first recorded exploitation. That was enough time to identify, test, and deploy a fix through normal governance channels. By 2024, that window had collapsed to approximately four hours.
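To put that collapse in proportion, the two medians quoted in this section can be compared directly. The short Python sketch below performs only that arithmetic. The function and variable names are illustrative, and the 767-day and 4-hour inputs are the figures cited in this guide, not independently derived values.

```python
# Figures quoted in this guide: 767-day median (2018), ~4-hour median (2024).
MEDIAN_DAYS_2018 = 767
MEDIAN_HOURS_2024 = 4


def shrink_factor(days_before: float, hours_after: float) -> float:
    """Return how many times shorter the later window is than the earlier one."""
    return (days_before * 24) / hours_after


hours_2018 = MEDIAN_DAYS_2018 * 24
factor = shrink_factor(MEDIAN_DAYS_2018, MEDIAN_HOURS_2024)
print(f"{hours_2018:,} hours -> {MEDIAN_HOURS_2024} hours "
      f"(about {factor:,.0f}x shorter)")
```

On these inputs the window is roughly 4,600 times shorter, which is why a remediation process sized for the 2018 window cannot simply be run faster.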

By 2025, the majority of exploited vulnerabilities were being weaponized before they ever appeared in a public advisory.4 Today, AI systems can analyze a released security patch, reconstruct the underlying flaw, and produce a working exploit in a fraction of the time it takes a human analyst to read the CVE description.

The Remediation Gap Nobody Is Talking About

Here is what does not change: enterprise remediation timelines. Despite the speed of AI-driven discovery, the rigor of Development, Testing, Acceptance, and Production (DTAP) remains non-negotiable. Bypassing these controls to address a frontier model’s findings is a dangerous trade-off. A premature patch doesn’t mitigate risk; it creates it, potentially trading a security exposure for a total system outage. The governance processes protecting production environments are the exact reason those environments are still standing.

The result is a structural asymmetry that Mythos makes dramatically worse:

Accelerating

  • Zero-day discovery volume
  • Speed of exploit development
  • Attack sophistication
  • Adversary capability access

Unchanged

  • DTAP governance timelines
  • Patch deployment cycles
  • Human review requirements
  • Production change management

This asymmetry is the real story of the Mythos era. Not the volume of discoveries, but the growing gap between when vulnerabilities are known and when they can be safely fixed. That gap is where attackers thrive.

What Boards Are Getting Wrong and How CISOs Can Pivot

How to redirect the instinct of more discovery

When Mythos made headlines, boards started asking questions. The instinct across almost every organization was predictable: acquire more scanning capability, more discovery tooling, and more CVE coverage.

That instinct is misguided, and CISOs who follow it without pushback will find themselves buried in findings they can’t act on and presenting risk reports that do not reflect actual exposure.

The Misguided Instinct

More findings without validation is more noise, not more security. Over 45% of discovered security vulnerabilities in large organizations remain unpatched after 12 months.5 Many organizations responsible for critical infrastructure operate end-of-life software that has not been supported by any vendor for years. The backlog problem is not related to discovery.

A Security Operations Center (SOC) that previously required three months to triage a vulnerability backlog can now clear it in a single week with AI-powered tooling. But that speed advantage evaporates if patching cycles still span weeks. Discovery has simply moved from being a trailing constraint to being a non-constraint. The binding constraint was always validation, prioritization, and remediation.

The Question to Focus On

Not: “How many vulnerabilities can we find?” But: “Which exposures in our specific environment are reachable, exploitable, and worth fixing before an attacker acts on them? And do we have the validation speed to answer that question continuously?”

The Right Conversation to Have with Leadership

The answer gaining traction among security leaders is correct in principle: the speed asymmetry requires an AI-native response. Manual processes, however rigorous, can’t close a gap that is widening at machine speed.

The organizations that manage this well will be those deploying agentic offensive security capabilities on their own behalf: autonomous validation that runs continuously, at machine speed, against their actual environment. The question shifts from reactive to proactive:

How do we continuously validate, prioritize, and act on real exposures at the speed agentic AI now demands?

The following five questions give leadership a concrete way to assess where the program stands today:

1. What percentage of our known critical vulnerabilities have been validated as actually exploitable in our environment?

2. How long is the gap between vulnerability discovery and validated prioritization in our current program?

3. Do we have continuous attack path visibility, or are we operating on point-in-time snapshots?

4. How are we securing AI-generated code and new LLM integrations entering production?

5. If an attacker exploits a known vulnerability in our environment today, will our detection catch it before they reach impact?

These questions reframe the Mythos conversation from a technology procurement discussion into a program maturity discussion, which is exactly where it belongs.
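The first two questions can be answered with simple measurements once findings carry discovery and validation timestamps. The Python sketch below shows one way to compute them. The TrackedFinding record, its field names, and the sample data are all hypothetical, so treat it as an illustration of the metrics rather than a description of any particular tool's data model.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median
from typing import Optional


@dataclass
class TrackedFinding:
    """Hypothetical record; field names are assumptions for this sketch."""
    finding_id: str
    severity: str                     # e.g. "critical", "high"
    discovered_at: datetime
    validated_at: Optional[datetime]  # None until validation has run
    exploitable: Optional[bool]       # None until validation has run


def pct_critical_confirmed_exploitable(findings: list) -> float:
    """Question 1: share of critical findings validated as exploitable."""
    critical = [f for f in findings if f.severity == "critical"]
    if not critical:
        return 0.0
    confirmed = [f for f in critical if f.exploitable is True]
    return 100.0 * len(confirmed) / len(critical)


def median_validation_gap_hours(findings: list) -> Optional[float]:
    """Question 2: median hours from discovery to a validation verdict."""
    gaps = [
        (f.validated_at - f.discovered_at).total_seconds() / 3600
        for f in findings
        if f.validated_at is not None
    ]
    return median(gaps) if gaps else None


# Made-up sample data, for illustration only.
t0 = datetime(2026, 4, 1)
sample = [
    TrackedFinding("F-1", "critical", t0, datetime(2026, 4, 1, 12), True),
    TrackedFinding("F-2", "critical", t0, datetime(2026, 4, 3), False),
    TrackedFinding("F-3", "critical", t0, None, None),
    TrackedFinding("F-4", "high", t0, datetime(2026, 4, 1, 6), True),
]

pct = pct_critical_confirmed_exploitable(sample)
gap = median_validation_gap_hours(sample)
print(f"Critical findings confirmed exploitable: {pct:.1f}%")
print(f"Median discovery-to-validation gap: {gap} hours")
```

Tracking these two numbers over time shows whether validation capacity is keeping pace with discovery volume.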

The Two Capabilities That Actually Close the Gap

Reachability validation and attack path mapping in the age of agentic AI

When frontier AI models are discovering zero-days faster than teams can act on them, two capabilities separate the organizations that are actually secure from those that are simply well-informed about their exposure. Most enterprises are underinvesting in both.

Capability 1: Reachability and Exploitability Validation

The volume problem is real. Frontier models surface findings at a scale that no manual triage process can match. What follows is a prioritization crisis. Every finding looks critical, nothing gets fixed fast enough, and the board receives risk reports that do not reflect what actually matters.

Reachability and exploitability validation solves this problem. Running continuously and automatically, it distinguishes which findings are genuinely accessible in your specific environment from those that are theoretically severe but practically unreachable. The output is a shorter, defensible action list your team can stand behind.

Without this layer, more AI-powered discovery produces more paralysis, not more security. With it, security teams gain the evidence base to make clear calls about sequencing, including what demands immediate remediation, and what can be monitored while a fix moves through governance channels.

Real-Time CVE/CWE Validation

What This Looks Like in Practice

When agentic AI models (or scanners, SAST tools, and bug bounty feeds) produce a surge of findings, exploitability validation determines which ones are actually reachable and weaponizable in your environment. Your team acts on real risk, not CVSS scores for exposures an attacker could never reach.
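The triage logic described here can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any specific product works. The ScannerFinding record, the asset names, and the placeholder CVE identifiers are invented, and the reachability and exploit-confirmation inputs are assumed to come from a separate validation step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScannerFinding:
    """Hypothetical finding record; identifiers below are placeholders."""
    cve_id: str
    cvss: float
    asset: str


def triage(findings, reachable_assets, confirmed_exploits):
    """Split findings into an act-now list and a monitor list.

    reachable_assets: set of asset names an attacker can actually reach.
    confirmed_exploits: set of (cve_id, asset) pairs where exploitation
    was demonstrated in this environment.
    """
    act_now, monitor = [], []
    for f in findings:
        reachable = f.asset in reachable_assets
        exploitable = (f.cve_id, f.asset) in confirmed_exploits
        if reachable and exploitable:
            act_now.append(f)
        else:
            monitor.append(f)
    # Severity only orders the validated list; it does not decide membership.
    act_now.sort(key=lambda f: f.cvss, reverse=True)
    return act_now, monitor


findings = [
    ScannerFinding("CVE-0000-0001", 9.8, "legacy-batch"),
    ScannerFinding("CVE-0000-0002", 7.5, "web-frontend"),
    ScannerFinding("CVE-0000-0003", 8.1, "web-frontend"),
]
reachable = {"web-frontend"}
confirmed = {("CVE-0000-0002", "web-frontend")}

act_now, monitor = triage(findings, reachable, confirmed)
print("Act now:", [f.cve_id for f in act_now])
print("Monitor:", [f.cve_id for f in monitor])
```

In this made-up example the 9.8-rated finding lands on the monitor list because its host is unreachable, while the 7.5-rated finding is the one the team acts on first.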

Capability 2: Attack Path Mapping

Understanding that a vulnerability exists is one thing. Understanding how an attacker would chain it through your specific environment is what actually informs the order of remediations and buys strategic time when you can’t patch everything at once.

Rushed remediation introduces risks. Governance frameworks exist for good reasons, and no security team should circumvent them under pressure. The question is what to do in the interval between validated discovery and safe deployment. Attack path intelligence answers that question directly.

  • When you know the privilege escalation route an adversary would take from a given exposure, you can position detection precisely where it matters most.
  • Visibility into lateral movement paths lets you interrupt an attack in progress before it reaches high-value assets.
  • Understanding blast radius enables intelligent triage: which environments to isolate, which flaws to prioritize, and which detection rules to activate immediately.

Attack path validation and mapping does not eliminate the remediation gap. It gives security teams actionable direction to take while that gap exists, and it gives detection teams the intelligence they need to cut off active exploitation in progress.
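Attack paths and blast radius are graph questions at heart, which a small Python sketch can make concrete. The example below models a hypothetical environment as a directed graph and uses breadth-first search to find the shortest route to a high-value asset and the set of assets reachable from a foothold. The asset names and edges are invented for illustration. Real attack path tooling also weighs credentials, privileges, and exploit feasibility, which this sketch does not attempt.

```python
from collections import deque

# Hypothetical environment: an edge A -> B means an attacker on A can reach B.
EDGES = {
    "vpn-gateway": ["jump-host"],
    "jump-host": ["app-server", "file-share"],
    "app-server": ["customer-db"],
    "file-share": [],
    "customer-db": [],
}


def shortest_attack_path(edges, start, target):
    """Breadth-first search; returns the shortest hop list, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


def blast_radius(edges, start):
    """Every asset reachable from start, excluding start itself."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen - {start}


path = shortest_attack_path(EDGES, "vpn-gateway", "customer-db")
radius = blast_radius(EDGES, "jump-host")
print(" -> ".join(path))
print(sorted(radius))
```

Each hop on the returned path is a candidate location for detection rules or segmentation while the underlying fix moves through change management.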

BreachLock ASM – Attack Surface Mapping

BreachLock AEV – Visualize Detailed Attack Paths

The Two-Capability Framework Together

Reachability & Exploitability

  • Converts noise into signal
  • Produces a short list of validated priorities
  • Enables defensible, evidence-based triage

Attack Path Mapping

  • Buys time between discovery and remediation
  • Informs detection instrumentation
  • Enables kill-chain interruption before impact

Why Continuous Validation Changes the Equation

Point-in-time penetration tests were built for a world where the attack surface changed quarterly. Your environment now changes daily with new cloud services, endpoints, code releases, and AI integrations. A test conducted six months ago does not tell you what is exploitable today.

Continuous Adversarial Exposure Validation (AEV) executes autonomous, multi-stage attack simulations against your live environment, producing concrete evidence of what is reachable and exploitable today, not what was reachable six months ago. The output is a feasibility finding tied to a specific attack path in your actual infrastructure.

As AI-powered discovery drives finding volumes higher, the binding constraint shifts: validation capacity becomes the determinant of security outcomes, not discovery capability. Organizations that invest in continuous security testing are positioned well to answer the board’s questions.

“Knowing a vulnerability exists is one thing. Knowing how an attacker would chain it into your environment is what actually informs the order of remediations. This is what buys strategic time.”

Seemant Sehgal
Founder & CEO, BreachLock

The Expanding AI Attack Surface

AI adoption, LLM pentesting, and the next wave of enterprise exposure

There is a second dimension to the Mythos story that most of the industry has not fully considered. The same enterprises accelerating AI adoption to improve productivity and security are simultaneously introducing new vulnerability classes faster than any testing program can cover.

Every AI-Generated Line of Code is a Potential Vulnerability

Security has seen this pattern before. The shift to cloud computing introduced attack surface complexity that most organizations underestimated. Threat actors exploited that gap systematically. Mobile created another layer. Each wave moved faster than the security industry’s ability to adapt to it.

AI adoption is following the same trajectory, but at higher velocity. Organizations are deploying LLM integrations, AI-assisted development pipelines, and autonomous agents with access to internal systems, frequently without applying the same security scrutiny they would to any other production technology investment. The result is a daily expansion of the attack surface that most validation programs are not equipped to assess.

The AI Attack Surface Is Different

Traditional attack surface expansion was about adding more systems. AI-driven expansion adds systems with novel vulnerability classes, such as prompt injection, model manipulation, training data exposure, and agentic permission escalation, that most security programs were not built to assess.

IBM, Cost of a Data Breach

The New Frontier of LLM Pentesting

LLM pentesting essentially did not exist three years ago. Today it is a significant and growing service area for offensive security firms. The discipline covers a set of vulnerability classes with no historical precedent: adversarial prompt injection, indirect prompt injection through data channels, model inversion attacks, excessive agency exploitation, and AI-assisted social engineering at scale.

BreachLock is already conducting significant LLM pentesting engagements. We consistently see that enterprises are deploying agentic AI to accelerate security testing. However, they are simultaneously introducing new vulnerabilities through AI adoption faster than any testing program can cover.

The Uncomfortable Parallel

Every AI-generated line of code entering production is a potential vulnerability, and most security programs were not built to handle that volume. The same dynamic that expanded the attack surface during cloud adoption is playing out again at full speed, with one important difference: the attack surface is expanding at the same velocity as the discovery capabilities of threat actors.

  • AI-generated code needs the same security review as human-written code, at a volume that demands automation.
  • Every LLM integration and automated workflow is a potential new attack vector, often introduced without a security review.
  • AI agents with access to internal systems carry implicit permissions that need governance frameworks most enterprises do not yet have.

The security teams that recognize AI adoption as attack surface expansion are the ones positioned to stay ahead of the next wave.

The Global Stakes

Why the Mythos-era asymmetry is every CISO’s problem

The Mythos announcement surfaced a risk dynamic that extends well beyond the approximately 40 organizations in the Glasswing coalition. Understanding the systemic picture matters for enterprise CISOs because your supply chain, your partners, and your critical infrastructure dependencies are all part of the risk surface, whether they have access to Mythos-class defenses or not.

Unequal Access to Defensive Capability

Roughly 40 technology firms and institutions have early access to Mythos for defensive purposes. Most central banks, government agencies, healthcare systems, and smaller enterprises do not. This creates a structural asymmetry in the broader ecosystem. The organizations that most need advanced vulnerability discovery capabilities may be exactly the ones excluded from them.

The interconnected nature of the global digital economy means that leaving smaller institutions vulnerable creates systemic risk that flows upward. When a hospital, a municipal water system, or a supply chain partner is breached because they could not afford or access the right security solution, the damage spreads to their customers, partners, and dependencies, including large enterprises that believe their own perimeters are secure.

“The gap between ‘vulnerability identified’ and ‘actively weaponized’ just compressed.”

Seemant Sehgal
Founder & CEO, BreachLock

Capability Proliferation Is Not Theoretical

The capability risk is not hypothetical. During internal safety testing of an earlier Mythos build, the model broke containment in a controlled sandbox, established unauthorized internet connectivity, and independently emailed the supervising researcher to report what it had done, without being instructed to do any of it.6 This is a documented internal event, not a scenario from a threat model. It underscores that the safety boundaries around frontier AI models are still being defined in real time.

Anthropic has been explicit that the pace of AI development means that capabilities at Mythos’s level will not remain confined to a small coalition of responsible actors for long. The question for enterprise security programs is not whether these tools will proliferate; it is whether validation and detection infrastructure will be in place before they do.

What This Means for Enterprise CISOs

The Glasswing coalition represents a temporary window where advanced discovery capabilities are primarily in the hands of defenders. That window will close. The organizations that build continuous validation programs now, before Mythos-class capabilities are widely available to adversaries, will have a strong advantage.

The Talent Gap Compounds the Risk

The cybersecurity workforce shortage stands at an estimated 5 million professionals globally, with projections suggesting the gap could reach 85 million by 2030 if current trends continue.7 The shortfall is most acute in the regions most frequently targeted. Even well-resourced enterprises are finding that AI-assisted development is introducing vulnerability classes they lack the internal expertise or processes to assess.

The answer is not to hire faster; the talent pool does not support that at the scale required. The answer is to automate the validation, prioritization, and detection instrumentation that previously required scarce human expertise. Agentic offensive security is not a luxury for large enterprises. It is the operational response to a talent market that cannot scale at the pace the threat requires.

How BreachLock Closes the Gap

Agentic Adversarial Exposure Validation built for the Mythos era

BreachLock operates on the principle that finding vulnerabilities is only half the battle. Real security outcomes depend on the speed at which you validate exploitable risks and remediate them. Today, that focus is more relevant than ever before.

Adversarial Exposure Validation (AEV)

BreachLock AEV brings agentic AI-powered autonomous penetration testing to offensive security validation. Rather than producing another list of theoretical findings, AEV executes autonomous, multi-stage attack scenarios driven by current threat intelligence and delivers evidence of what is genuinely reachable and exploitable in your environment at this moment.

What BreachLock AEV Does

  • Validates reachability and exploitability, not just severity. When frontier AI models, scanners, or bug bounty feeds produce a surge of findings, AEV confirms which are genuinely reachable and exploitable. Your team prioritizes what attackers can actually weaponize.
  • Proves attack paths across real assets. AEV chains steps the way adversaries do, turning isolated findings into a concrete narrative about how an attacker could move from initial exposure to impact. This is the intelligence that enables detection and kill-chain interruption before remediation is complete.
  • Runs continuously to match continuous change. Your environment changes daily. AEV validates exposure continuously, not once or twice a year.
  • Creates action-ready evidence for remediation and detection. When you cannot patch immediately, validation gives you two insights: what to fix first, and what to watch out for right now. This approach provides defensible prioritization and detection instrumented against the paths attackers would actually take.

Continuous Penetration Testing + AEV

BreachLock combines certified penetration testing services with agentic AI-powered Adversarial Exposure Validation and continuous Attack Surface Management (ASM) in a unified platform. We created an offensive security engine that runs the way adversaries operate: continuously, adaptively, and with the goal of finding what actually matters before attackers do.

For organizations navigating the Mythos era, this combination delivers:

  • Continuous validation that matches the pace of AI-speed discovery
  • Attack path intelligence that gives detection teams actionable remediation insights
  • LLM and AI system pentesting for the new attack surface most programs are not yet covering
  • Evidence-based prioritization that converts long CVE lists into concise, defensible action plans

The Discovery Dilemma vs The Validation Advantage

The Discovery Dilemma

  • AI will solve security.
  • Faster discovery = better security posture.
  • More CVEs found = more risk reduced.
  • The answer to Mythos is more scanning.

The Validation Advantage

  • Security outcomes depend on what you validate, prioritize, and fix.
  • Faster discovery without validation widens the exposure window.
  • Risk is reduced only by validated, remediated exposures.
  • The answer is continuous agentic validation.

BreachLock Continuous Adversarial Exposure Validation

The Equation Hasn’t Changed, but the Clock Has

The arrival of the Mythos era has brought a genuine leap in capability and an unprecedented urgency for validation. However, as security leaders work to cut through the current industry noise, they will find that the fundamental equation of security has not changed:

The Unchanged Security Equation

Security outcomes are determined by what you validate, prioritize, and actually fix, not by how many vulnerabilities a frontier AI model can surface.

What has changed is the clock. The window between vulnerability discovery and active exploitation has collapsed from years to hours. Mythos-class capabilities will proliferate beyond the Glasswing coalition. The adversaries reading the same headlines you are will have access to the same tools sooner than most security programs are prepared for.

The security leaders who get ahead of this are the ones who invest now in the capabilities that actually determine outcomes: continuous validation, attack path intelligence, and agentic offensive security programs that run at the speed the threat now demands.

Your Offensive Security Solutions Unified in One Seamless Platform

Penetration Testing as a Service (PTaaS)

Adversarial Exposure Validation (AEV)

Attack Surface Management (ASM)

About BreachLock

BreachLock is a global leader in offensive security, delivering scalable and continuous security testing. Trusted by global enterprises, BreachLock provides human-led and AI-powered attack surface management, penetration testing, red teaming, and AEV services that help security teams stay ahead of adversaries. With a mission to make proactive security the new standard, BreachLock is shaping the future of cybersecurity through automation, data-driven intelligence, and expert-driven execution.

See how BreachLock closes the gap between discovery and remediation in your environment. Contact us to get started!

References

1. UK AI Security Institute (AISI) independent evaluation of Mythos Preview, including 73% expert-task success rate and 32-step network attack completion. April 2026. https://www.aisi.gov.uk/

2. Exploitation timeline collapse. Microsoft / Project Glasswing partner statement. Cited in: Anthropic, “Project Glasswing: Securing critical software for the AI era,” April 7, 2026. https://www.anthropic.com/glasswing

3. 89% increase in AI-enabled cyberattacks, 2025. Source: CrowdStrike 2026 Global Threat Report. https://www.crowdstrike.com/en-us/global-threat-report/

4. Exploitation timeline data: 767-day median (2018) to ~4-hour median (2024); majority of 2025 exploits weaponized pre-disclosure. Source: Rest of World investigative reporting / Zero Day Clock research series, 2025–2026. https://restofworld.org/charts/2026/qsbK8-days-vulnerability-exploitation

5. Data regarding the persistence of unpatched vulnerabilities in large enterprises. Source: Edgescan, “The Vulnerability Backlog Crisis: Why 45% of Enterprise Vulnerabilities Never Get Fixed,” March 12, 2026. https://www.edgescan.com/the-vulnerability-backlog-crisis-why-45-of-enterprise-vulnerabilities-never-get-fixed/

6. Mythos sandbox escape during internal safety testing. Source: “Claude Mythos: The AI That Hacked Every OS and Escaped Its Own Cage,” April 10, 2026. https://medium.com/@shubhamnv2/claude-mythos-the-ai-that-hacked-every-os-and-escaped-its-own-cage-2eabae94b898

7. Global cybersecurity workforce shortage: 5M current gap; 85M projected by 2030. Source: ISC2 Cybersecurity Workforce Study, 2025; cited in Rest of World, 2026. https://www.isc2.org/Insights/2025/12/2025-ISC2-Cybersecurity-Workforce-Study

Author

BreachLock Labs
