Skip to content

AI agent security — betting against the models is a losing strategy

Insight: Much of the ~$300M+ investment in "Security for AI" startups is building on a flawed premise: that current model weaknesses (prompt injection, jailbreaks) will persist. Three predictions: (1) Foundation Model Providers will solve their own syntactic security flaws — third-party patches become obsolete with each model release cycle, (2) restricting an agent's context defeats its purpose — a heavily restricted agent is a useless agent, (3) the durable security challenge isn't syntactic exploits but semantic threats — malicious prompts from seemingly trusted sources.

Detail: Shrivu distinguishes between syntactic technical exploits (prompt injection — a solvable model flaw) and context failures (agent fed malicious prompt from trusted source — a durable, architectural problem). The enterprise security dilemma: CISOs may prefer a "secure but useless" agent, but this misaligns with business goals. Agent utility is directly proportional to the context it's given. This frames the MCP security discourse differently than Simon Willison's "Lethal Trifecta" — Shrivu is more optimistic about model-level fixes but more concerned about the architectural trust problem.

Sources