Why Enterprise Security Is Failing Agentic AI

In our previous analysis, we examined why software-only security architectures fall short for autonomous AI agents. That argument was architectural. This one is empirical.

On March 9, 2026, security firm CodeWall disclosed that an autonomous AI agent had gained full read-write access to McKinsey's internal AI platform, Lilli, in under two hours. The agent required no credentials, no insider knowledge, and no human intervention. It found its own way in.

The breach exposed 46.5 million internal chat messages, 728,000 confidential files, 57,000 user accounts, and 95 system prompts governing how the AI behaved for McKinsey's 43,000 consultants. Not just read access — write access. An attacker could have silently rewritten Lilli's instructions, poisoning the strategic and financial advice flowing to clients across every engagement.

This was not a sophisticated zero-day exploit. It was a SQL injection vulnerability on an unauthenticated API endpoint. The kind of flaw that has appeared in the OWASP Top 10 for two decades.

If McKinsey — a firm that advises the world's largest enterprises on digital transformation and risk management — shipped an AI platform with 22 unauthenticated API endpoints and plaintext storage of 46.5 million messages, the question is no longer whether enterprise security is failing agentic AI. It is why.

The Breach Anatomy

The technical chain is worth understanding in full, because it reveals how an autonomous agent thinks about attack surfaces differently from human penetration testers.

CodeWall's AI agent began by discovering publicly exposed API documentation containing over 200 endpoints. Of these, 22 required no authentication whatsoever. One unprotected endpoint accepted user search queries where the JSON field names — not the values — were concatenated directly into SQL. Standard security scanners, including OWASP ZAP, missed the flaw entirely. The agent did not.

Through 15 blind iterations of error-based SQL injection, the agent enumerated the database structure, then accessed production data. Within two hours it had mapped the entire system:

46.5 million chat messages — strategy discussions, M&A deliberations, and client engagements, all in plaintext
728,000 files — including 192,000 PDFs, 93,000 spreadsheets, 93,000 presentations, and 58,000 Word documents containing confidential client data
57,000 user accounts with associated credentials
95 system prompts across 12 model types, controlling all of Lilli's behavior, guardrails, and citation logic
3.68 million RAG document chunks and 266,000 OpenAI vector stores with exposed S3 storage paths

The write access is what elevates this from a data breach to a strategic weapon. An adversary with UPDATE access to system prompts could silently alter how Lilli answers consultant queries, embed data exfiltration into client-facing outputs, remove guardrails, or plant persistent backdoors — all without triggering deployment logs or code changes.

CodeWall's researchers put it directly: "AI prompts are the new Crown Jewel assets" — yet they operated without access controls, version history, or integrity monitoring.

Why This Keeps Happening

The instinct is to treat the Lilli breach as an implementation failure — a team that shipped without basic authentication. That framing is comfortable, because it implies the fix is simple: better code review, more security testing, tighter deployment gates.

But this framing misses the structural problem. Enterprise security architectures were not designed for the threat model that agentic AI introduces. They were designed for a world where:

Users are human — they authenticate with credentials, operate within session boundaries, and interact through defined UIs
Applications are stateless — request-response cycles with predictable data flows
Access patterns are bounded — humans read documents one at a time, not 46.5 million messages in a single session
Attack surfaces are static — the API footprint is known, documented, and tested before deployment

Agentic AI breaks every one of these assumptions. An AI agent is not a user. It is persistent, autonomous, capable of operating across hundreds of API endpoints simultaneously, and able to chain vulnerabilities faster than any human red team. It does not need a UI. It does not respect session boundaries. It does not get tired after the fifteenth blind SQL injection iteration.

The McKinsey breach was not discovered by a human. It was discovered by another AI agent. This is the new reality: AI agents are now on both sides of the security perimeter, and the architectures designed to mediate between human users and software systems are caught in between.

Zero-Trust Was Built for Humans

Zero-trust architecture has become the gold standard for enterprise security, and rightly so. The principle — never trust, always verify — has fundamentally improved how organizations handle identity, access, and network segmentation.

But zero-trust as implemented today was designed around human identity and human access patterns. Consider the core pillars:

Identity verification — tied to human credentials: passwords, MFA tokens, biometrics, SSO sessions
Least-privilege access — scoped to human roles: analyst, manager, administrator
Continuous validation — monitors human behavioral baselines: login times, geolocations, typing patterns
Micro-segmentation — segments network access based on human team structures and organizational boundaries

None of these translate cleanly to autonomous AI agents.

An AI agent does not have a password. It has an API key, a service account, or a bearer token — credentials that are static, shareable, and trivially replayable. MFA is meaningless for a non-human entity. Behavioral baselines calibrated to human patterns will either miss agent anomalies entirely or drown security teams in false positives, because an agent's normal operating behavior — thousands of API calls per minute, lateral traversal across services, autonomous decision-making — looks like an attack to systems tuned for human traffic.

Least privilege, as typically implemented, assigns permissions based on job function. But an AI agent's "function" is fluid. A research agent may need to read financial documents, cross-reference market data, generate analysis, and push results to a client portal — a scope that, in human terms, would span analyst, researcher, and communications roles simultaneously. Cramming this into a static RBAC model either over-provisions the agent (creating unnecessary risk) or under-provisions it (breaking functionality).

Gartner has identified this gap directly. Their 2026 cybersecurity trends report lists agentic AI oversight as the number-one trend, noting that traditional IAM models fail with autonomous AI agents operating at scale without human intervention. Spending on AI-amplified security is projected to reach $160 billion by 2029 — but the $244.2 billion in total security spend for 2026 is still overwhelmingly allocated to architectures built for human users.

The Architectural Mismatch

The Lilli breach exposes a specific failure pattern that will repeat across every enterprise deploying agentic AI without rethinking its security architecture. The pattern has three stages:

Stage 1: The AI tool is treated as an application. Security teams apply standard application security controls — WAFs, API gateways, RBAC, logging. The tool passes an AppSec review because it looks like a web application.

Stage 2: The tool becomes an agent. As capabilities expand — RAG pipelines, tool use, multi-step reasoning, autonomous action — the system's actual threat surface diverges from its assessed one. It now has persistent database access, external API integrations, and the ability to chain actions autonomously. But the security architecture has not evolved to match.

Stage 3: The agent's attack surface is discovered — by another agent. An AI attacker does not need to find one critical vulnerability. It can iterate through hundreds of endpoints, correlate error messages, and chain low-severity findings into high-impact exploit paths — exactly as CodeWall's agent did with McKinsey.

This three-stage pattern is not unique to McKinsey. It is the default trajectory for any enterprise that bolts agentic capabilities onto existing application infrastructure without re-evaluating the security model from the ground up.

Forrester saw this coming. Their 2026 cybersecurity predictions explicitly forecast that an agentic AI deployment would cause a publicly disclosed breach this year, warning that without proper guardrails, autonomous AI systems "may sacrifice accuracy for speed of delivery." Analyst Paddy Harrington emphasized that such breaches represent "a cascade of failures" — not a single point of blame, but a systemic architectural mismatch between what organizations are deploying and what they are securing.

What Zero-Trust for AI Agents Requires

The solution is not to abandon zero-trust. It is to extend its principles to account for non-human autonomous entities. This requires rethinking each pillar:

1. Agent Identity Must Be Cryptographic, Not Credential-Based

Human zero-trust verifies identity through credentials that a person knows or possesses. Agent identity must be anchored to something that cannot be shared, replayed, or escalated through software exploits. This means cryptographic identity tied to the execution environment — not API keys stored in environment variables or service account tokens passed between microservices.

Every agent action should be attributable to a verified identity, and that identity should be inseparable from the agent's runtime context.

2. Least Privilege Must Be Dynamic, Not Role-Based

Static RBAC cannot model the fluid, multi-domain access patterns of autonomous agents. Agent permissions should be:

Task-scoped — granted per operation, not per role
Time-bounded — automatically expired after task completion
Context-aware — evaluated against what the agent is doing, not just what it is allowed to do
Revocable in real time — with enforcement that does not depend on the agent's cooperation

This is fundamentally different from assigning an agent a service account with broad permissions and hoping that application-layer guardrails constrain its behavior.

3. Behavioral Baselines Must Be Agent-Native

Monitoring systems calibrated for human traffic patterns will either miss agent threats or generate unmanageable noise. Security teams need anomaly detection models trained on agent-specific baselines: expected API call volumes, legitimate data access patterns, normal tool-use chains, and permissible decision sequences.

When CodeWall's agent iterated through 15 blind SQL injection attempts on a single endpoint, every attempt should have triggered an alert. The fact that it did not suggests the monitoring — if it existed — was looking for human-shaped threats.

4. Segmentation Must Extend Below the Application Layer

Network micro-segmentation in a zero-trust model prevents lateral movement between human-accessible zones. For AI agents, segmentation must extend deeper — isolating agent execution environments, separating data planes from control planes, and ensuring that a compromised agent in one context cannot reach data or systems in another.

In the Lilli breach, a single SQL injection on one endpoint gave access to the entire production database — 46.5 million messages, 728,000 files, every user account. There was no segmentation between the query interface and the data store. The blast radius was total.

5. System Prompts Must Be Treated as Security-Critical Assets

Perhaps the most overlooked finding in the Lilli breach: the system prompts controlling the AI's behavior were stored in the same database, with the same write access, as the chat logs. An attacker could modify how the AI thinks, what it reveals, and what guardrails it follows — without touching a single line of application code.

System prompts, RAG configurations, and behavioral policies must be stored in integrity-protected environments with version control, access auditing, and tamper detection. They are not configuration files. They are the agent's operating instructions, and compromising them compromises every output the agent produces.

The Pattern

The Lilli breach was not a failure of sophistication. It was a failure of category. McKinsey's security team secured Lilli as an application. But Lilli was an agent — with persistent data access, autonomous reasoning, and 43,000 users relying on its outputs for strategic decisions. The security architecture never caught up to what the system actually was.

The Window Is Closing

The analyst community has been remarkably direct about the timeline. Gartner projects that by the end of 2026, "death by AI" legal claims will exceed 2,000 due to insufficient AI risk guardrails. Forrester has already seen its prediction of a public agentic AI breach validated — less than three months into the year. The ISC2 documents a 4.8 million cybersecurity professional gap growing at 19% year-over-year, meaning the teams responsible for securing these systems are already stretched beyond capacity.

Meanwhile, agentic AI investment is accelerating. Gartner forecasts that agentic AI spending will overtake chatbot spending by 2027, growing at a 119% compound annual growth rate to $752.7 billion by 2029. Forty percent of enterprise applications will feature task-specific AI agents by the end of this year, up from less than 5% in 2025.

The gap between deployment velocity and security readiness is not narrowing. It is widening. Every enterprise deploying agentic AI today is building on security architectures designed for a pre-agent world, and the Lilli breach is what happens when that mismatch meets reality.

This is not a theoretical risk. It is not a prediction. It is a pattern that is already playing out, and the CISOs who recognize it now have a rapidly shrinking window to act before the next breach — or the next agent — finds the gaps they have not yet closed.

Security was designed for a world where humans were the users and software was the tool. In the agentic era, software is the user. And our security architectures have not made that shift.