The Case for Autonomous Validation

May 13, 2026

By Sila Ozeren Hacioglu, Security Research Engineer at Picus Security.

In April 2026, Anthropic released its newest frontier model, codename Mythos, to 12 partners under a gated preview. Not general availability; the company explicitly held it back because it was (correctly) deemed too dangerous for open release.

In its first 14 days inside that sandbox, it wrote 181 working Firefox exploits. The previous state-of-the-art model managed two. Uh oh.

It surfaced hundreds of zero-days across every major OS and browser, including a 27-year-old bug in OpenBSD, an operating system whose entire reputation is built on not having bugs like this.

Over 99% of what Mythos found is still unpatched in production today.

That’s not a forecast. That happened.

Now pair it with what’s already in the wild.

Let’s back up a bit. In February, AWS Threat Intelligence published a postmortem on a FortiGate campaign run by a single operator. One person, low skill, no hands on keyboard.

The AI did the work, and it hit 2,516 devices across 106 countries in parallel, taking just minutes per target. Zero-days weren’t required. Known CVEs and misconfigurations were enough; the AI simply operated faster than anyone could respond.

Figure 1. AWS Threat Intelligence FortiGate campaign hits 2,516 devices in 106 countries

Two data points, one message: offense now runs at machine speed. And the question every defender should be asking is not “are we compliant?” or “are we covered?” It’s more granular, and more pressing:

“What’s actually getting through my controls today, and how far?”

If the honest answer involves a quarterly pentest report and a few dashboard screenshots, consider the rest of this piece required reading.

How Fast Can Attackers Exploit a Published CVE in 2026?

A decade ago, the median time from a CVE’s publication to a working exploit appearing in the wild was measured in months, long enough for a real patch cycle. By 2024, that window had shrunk to about 56 days. By 2025, it was down to 23 days.

Recent CVE-to-exploit pairings from CISA KEV, VulnCheck KEV, and exploit databases now show a median delta of roughly 10 hours.

Figure 2. Average CVE-to-exploit window: 2.3 years (2018) vs. ~10 hours (2026).
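As a rough illustration of how a median CVE-to-exploit delta like the one cited above could be computed, here is a minimal Python sketch. The timestamp pairs and the function name `median_delta_hours` are hypothetical placeholders, not real CISA KEV or VulnCheck records, and a real analysis would also need to handle exploits observed before publication.

```python
from datetime import datetime
from statistics import median

# Hypothetical (CVE published, first exploit observed) timestamp pairs.
# These are illustrative values only, not real KEV data.
PAIRS = [
    ("2026-01-05T08:00", "2026-01-05T14:00"),
    ("2026-01-09T12:00", "2026-01-09T22:00"),
    ("2026-02-01T00:00", "2026-02-02T06:00"),
]


def median_delta_hours(pairs):
    """Return the median publication-to-exploit gap in hours."""
    deltas = [
        (datetime.fromisoformat(seen) - datetime.fromisoformat(published)).total_seconds() / 3600
        for published, seen in pairs
    ]
    return median(deltas)


print(f"median delta: {median_delta_hours(PAIRS):.1f} hours")
```

The median is used rather than the mean so that a handful of slow-burning CVEs does not mask how quickly the typical one is weaponized.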

Reversing a published fix into a working exploit is no longer a specialist craft; it’s now a prompt.

This is why the comfortable assumptions of vulnerability management (that CVSS scores meaningfully prioritize, that “exploitability” is a useful filter, that you have time between disclosure and weaponization) have all quietly broken.

The safer operating assumption is now: every vulnerability has an exploit, or will, before you finish your next change-management meeting.

Unfortunately, autoimmunity for defense doesn’t exist yet.

And blue-side AI without validation is just guesswork at machine speed, and that’s an expensive hunch to deploy into production.

Over 99% of Mythos findings remain unpatched. The Glasswing public report lands in July.

This guide from Picus Labs covers the 12 operational recommendations security teams need to close the gap between AI-speed offense and human-speed defense, including 5 actions for week one.

The Real Bottleneck Isn’t Tooling: It’s the Spaghetti Handoff

Let’s start with the attacker.

At second zero, the AI script kicks off. By second five, a CVE is exploited. MFA bypassed by twenty. Web shell dropped at thirty. Credentials dumped at forty-five. By second seventy-three, the compromise is complete.

No human in the loop, no hesitation, no team meetings, no coffee breaks.

Now picture the defender.

The SIEM alert fires at one minute, after the attacker is already done. A Tier 1 analyst picks it up around minute five. Someone triggers a SOAR playbook, by hand, at minute fifteen. A Jira ticket gets filed an hour in. Four hours later, it lands in the IT ops queue.

The patch goes out the next day, twenty-four hours after the breach that took seventy-three seconds to complete.

Figure 3. The agility gap: AI compromise (73s) vs. patching (24h) due to cross-team friction.

Notice where the time goes. It’s not inside any one tool. The EDR is fast. The SIEM is fast. The vulnerability scanner is fast. The time dies between the tools: the Slack messages, the copy-pasted hash, the PDF report emailed for review, the ticket waiting for approval, the red team script being rebuilt by hand for the blue team.

This is the spaghetti handoff, and it’s as messy as it sounds.

You can buy a faster scanner, plug in a smarter EDR, even bolt an LLM onto your SIEM, and none of them will markedly speed up your response, because the gap isn’t inside any of your tools. It lives between teams and between systems. Accelerating one node in a graph doesn’t accelerate the graph.
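To make the "one node versus the graph" point concrete, here is a small Python sketch of the defender timeline described above. The cumulative milestones (1 minute, 5 minutes, 15 minutes, 1 hour, 5 hours, 24 hours) come from the article; the split of each stage into tool time and handoff wait is an illustrative assumption, as are the names `STAGES` and `total_seconds`.

```python
ATTACK_SECONDS = 73  # time to full compromise, from the article

# (stage, tool_seconds, handoff_wait_seconds)
# Cumulative totals match the article's milestones; the tool/wait split
# inside each stage is assumed purely for illustration.
STAGES = [
    ("SIEM alert fires",        60,     0),   # cumulative: 1 min
    ("Tier 1 analyst pickup",   30,   210),   # cumulative: 5 min
    ("SOAR playbook (manual)",  60,   540),   # cumulative: 15 min
    ("Jira ticket filed",       60,  2640),   # cumulative: 1 h
    ("Lands in IT ops queue",    0, 14400),   # cumulative: 5 h
    ("Patch deployed",         600, 67800),   # cumulative: 24 h
]


def total_seconds(stages, tool_speedup=1.0, handoff_speedup=1.0):
    """End-to-end response time when tools and/or handoffs get faster."""
    return sum(
        tool / tool_speedup + wait / handoff_speedup
        for _, tool, wait in stages
    )


baseline = total_seconds(STAGES)
faster_tools = total_seconds(STAGES, tool_speedup=10)
faster_handoffs = total_seconds(STAGES, handoff_speedup=10)

print(f"baseline:            {baseline / 3600:.2f} h")
print(f"10x faster tools:    {faster_tools / 3600:.2f} h")
print(f"10x faster handoffs: {faster_handoffs / 3600:.2f} h")
print(f"defender/attacker ratio at baseline: {baseline / ATTACK_SECONDS:.0f}x")
```

Under these assumed numbers, making every tool ten times faster trims less than 1% off the 24-hour response, while making the handoffs ten times faster brings it down to under three hours.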

This is a big part of why this conversation has moved out of the CISO’s office.

Six months ago, AI-driven cyber risk was a technical problem to delegate. Today, boards are treating it as existential and governing it directly. Budgets are unlocked, but not for ‘more of the same.’ They’re funding credible, evidence-based plans.

What Are the Three Pillars of Cyber Resilience in the Age of AI-Powered Attacks?

The fundamentals that made organizations resilient before Mythos still apply. There are three:

Pillar 1: Identify. You can’t defend what you can’t see. Even with comprehensive exposure visibility across network, endpoint, cloud, and identity, and aggressive attack surface management, the blind spots (orphaned remote access, missing segmentation, MFA gaps) are where machine-speed attackers live.

Pillar 2: Protect. Effective network and endpoint controls, properly tuned. Tailored detection focused on credential access, lateral movement, and privilege escalation rather than generic vendor rules.

Pillar 3: Validate. This is the one most programs undervalue, and it’s the one that actually answers the question we started with. Validation has two halves, and yes, you need both.

Defensive validation: Breach and Attack Simulation (BAS). Are my prevention and detection controls actually catching what’s hitting me right now? Which assets do my controls fail to protect? What’s the residual risk after my stack runs?
Offensive validation: Autonomous Pentesting. Can an attacker actually breach us? Which exposures chain together into a real path to our crown jewels? What’s actually exploitable in our environment, not just theoretically vulnerable?

Figure 4. BAS and Automated Penetration Testing Together

Run only BAS, and you’ll know your controls work in isolation but not whether an attacker can route around them. Run only autonomous pentesting, and you’ll find attack paths but won’t know which controls are silently failing on the assets the pentest never touched. Run them as one continuous loop, where each informs the other, and you’ll finally have an answer to “what gets through, and how far” that’s grounded in evidence rather than hypothetical opinion.
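One way to picture how the two halves inform each other is to cross-reference their findings. The Python sketch below is only an illustration under assumed data shapes: `bas_gaps`, `attack_paths`, `merge_findings`, the asset names, and the result keys are all hypothetical, and the technique IDs are MITRE ATT&CK identifiers used purely as example labels. It does not represent any vendor’s actual output format.

```python
# Hypothetical BAS output: asset -> simulated techniques the controls failed to stop.
bas_gaps = {
    "web-01": {"T1190"},        # exploit of a public-facing application
    "hr-laptop-07": {"T1003"},  # OS credential dumping
    "db-01": set(),             # every simulated technique was blocked
    "print-srv": {"T1021"},     # remote services (lateral movement)
}

# Hypothetical autonomous pentest output: validated paths to a crown-jewel asset.
attack_paths = [
    ["web-01", "hr-laptop-07", "db-01"],
]


def merge_findings(bas_gaps, attack_paths):
    """Cross-reference control gaps (BAS) with proven attack paths (pentest)."""
    on_path = {asset for path in attack_paths for asset in path}
    gapped = {asset for asset, techniques in bas_gaps.items() if techniques}
    return {
        # Control gaps that sit on a proven route to the crown jewels.
        "fix_first": sorted(gapped & on_path),
        # Control gaps on assets the pentest never touched.
        "silent_failures": sorted(gapped - on_path),
        # Assets on a path where BAS found no gap: worth re-testing.
        "path_assets_without_known_gap": sorted(on_path - gapped),
    }


for bucket, assets in merge_findings(bas_gaps, attack_paths).items():
    print(f"{bucket}: {assets}")
```

The "silent_failures" bucket is the piece a pentest alone would miss, and "fix_first" is the piece BAS alone could not prioritize.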

But evidence isn’t enough on its own. When offense runs at machine speed, the loop itself has to run at machine speed.

How Picus Approaches Autonomous Validation in a Post-Mythos World

A continuous loop is the right answer. But “continuous” still implies a human pacing it. In a post-Mythos world, the gap that matters isn’t between seeing and detecting; it’s between detecting and proving, fast enough that an AI-driven adversary doesn’t find out for you first.

This is where validation goes from continuous to autonomous: agents reading the alert, scoping the test, running the simulation, pushing the fix, and writing the report, while the SOC catches up on some much-needed sleep.

We’ll be unpacking exactly what that looks like (the architecture, the agentic workflows, the operational reality of running it inside a real enterprise) at the Autonomous Validation Summit on May 12 & 14, hosted with Frost & Sullivan and featuring practitioners from Kraft Heinz and Glow Financial Services, alongside Picus CTO Volkan Erturk.

>> See it in action at the summit.

Sponsored and written by Picus Security.
