Agents Rule of Two: a practical framework for LLM agent security beyond the lethal trifecta


Source type	Practitioner
Confidence	Medium
Last verified	2026-03-20
Topics	`mcp`

Insight: Meta AI published the "Agents Rule of Two": within a session, an agent should satisfy no more than two of three properties: (A) processing untrustworthy inputs, (B) accessing sensitive systems or private data, (C) changing state or communicating externally. If a task requires all three, the agent should not run autonomously and needs supervision, such as human-in-the-loop approval. This extends Willison's "lethal trifecta", which covers only data exfiltration, by widening the third property from external communication to any state change. That brings in tool-use risks such as modifying files, sending messages, or making purchases.

Detail: Willison's post pairs the Rule of Two with a second paper, "The Attacker Moves Second", which tests published prompt-injection and jailbreak defenses against adaptive attacks. Willison notes that the Rule of Two addresses a limitation of the lethal trifecta: untrustworthy inputs can trigger harms beyond data exfiltration, such as unwanted state changes. For developers building MCP-connected agents, the framework gives a simple heuristic: if an agent needs all three properties, add human oversight.
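
The heuristic can be sketched as a check over the tools exposed to one agent session. This is a minimal illustration, not code from Meta or from Willison's post. The `Property` labels, the `Tool` descriptor, the tool names, and both helper functions are hypothetical, and it assumes each tool has been tagged by hand with the properties it contributes.

```python
from dataclasses import dataclass
from enum import Enum


class Property(Enum):
    """The three Rule of Two properties (labels are this sketch's own)."""
    UNTRUSTED_INPUT = "A"   # processes untrustworthy inputs
    SENSITIVE_ACCESS = "B"  # accesses sensitive systems or data
    STATE_CHANGE = "C"      # changes state or communicates externally


@dataclass(frozen=True)
class Tool:
    """Hypothetical tool descriptor; properties are assigned manually."""
    name: str
    properties: frozenset


def session_properties(tools):
    """Return the union of properties across all tools in a session."""
    combined = set()
    for tool in tools:
        combined |= tool.properties
    return combined


def requires_human_approval(tools):
    """True when the session combines all three properties."""
    return len(session_properties(tools)) > 2


# Hypothetical tools, tagged by hand for illustration.
web_fetch = Tool("web_fetch", frozenset({Property.UNTRUSTED_INPUT}))
read_inbox = Tool(
    "read_inbox",
    frozenset({Property.UNTRUSTED_INPUT, Property.SENSITIVE_ACCESS}),
)
send_email = Tool("send_email", frozenset({Property.STATE_CHANGE}))

print(requires_human_approval([web_fetch, read_inbox]))              # False
print(requires_human_approval([web_fetch, read_inbox, send_email]))  # True
```

The check runs over the whole session rather than per tool because the risk comes from the combination: here no single tool holds all three properties, but the three together do.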

Sources

Simon Willison — "New prompt injection papers: Agents Rule of Two and The Attacker Moves Second" (2025-11-03)

Related: existing entry "MCP has fundamental security and UX problems" in external/mcp.md — COMPLEMENTS