Agents Rule of Two: a practical framework for LLM agent security beyond the lethal trifecta
Insight: Meta AI published the "Agents Rule of Two", which says that within a single session an agent should satisfy no more than two of three properties: (A) processing untrustworthy inputs, (B) accessing sensitive systems or private data, (C) changing state or communicating externally. If a task needs all three, the agent should not run autonomously and requires supervision, such as human-in-the-loop approval. This extends Willison's "lethal trifecta", which only covered data exfiltration, by adding "changing state" to property C, which brings in tool-use risks such as modifying files, sending messages, or making purchases.
Detail: Discussed alongside a second paper, "The Attacker Moves Second", which analyzes prompt injection from a game-theoretic perspective. Willison notes that the Rule of Two neatly addresses a limitation of the lethal trifecta: there are plenty of nastier risks beyond data exfiltration, such as state changes triggered by untrustworthy inputs. For developers building MCP-connected agents, the framework gives a simple heuristic: if your agent needs all three properties, add human oversight.
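Example: the heuristic can be sketched as a small check run when an agent session is configured. This is a minimal illustration, not code from Meta or Willison. The names `AgentCapabilities` and `requires_human_approval` are hypothetical, and deciding which inputs count as untrustworthy or which data counts as sensitive is assumed to happen elsewhere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentCapabilities:
    """Hypothetical description of one agent session (illustrative names)."""
    processes_untrusted_input: bool      # property (A)
    accesses_sensitive_data: bool        # property (B)
    changes_state_or_communicates: bool  # property (C)


def requires_human_approval(caps: AgentCapabilities) -> bool:
    """Return True when the session holds all three properties at once."""
    enabled = sum([
        caps.processes_untrusted_input,
        caps.accesses_sensitive_data,
        caps.changes_state_or_communicates,
    ])
    # The Rule of Two allows at most two of the three properties
    # for an agent that runs without supervision.
    return enabled > 2


# Hypothetical email assistant: reads inbound mail (A), sees a private
# inbox (B), and can send replies (C).
email_agent = AgentCapabilities(True, True, True)

# Hypothetical web summarizer: reads arbitrary pages (A) and posts
# summaries (C), but has no access to private data.
summarizer = AgentCapabilities(True, False, True)

print(requires_human_approval(email_agent))  # True
print(requires_human_approval(summarizer))   # False
```

A check like this only flags the configuration. It does not defend against prompt injection itself, so the oversight step still has to be implemented wherever the agent takes actions.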
Sources
Related: existing entry "MCP has fundamental security and UX problems" in external/mcp.md — COMPLEMENTS