Claude Code's auto mode uses a Sonnet 4.6 classifier to evaluate action safety without explicit approval


Source type	Official Docs
Confidence	High
Last verified	2026-03-25
Topics	`claude-code`

Insight: Claude Code's auto mode uses Claude Sonnet 4.6 as a separate classifier that evaluates the safety of each action before it runs. It blocks actions that escalate beyond the task's scope, target unrecognized infrastructure, or appear hostile. By default, it allows local file operations within project scope, read-only API requests, and declared package installations, and it blocks force pushes and mass deletions.
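
The default categories listed above can be pictured as a small decision table. This is only an illustrative sketch: the real classifier is a model, not a lookup, and the `Action` categories and the `default_decision` function are hypothetical names that are not part of Claude Code.

```python
from enum import Enum


class Action(Enum):
    # Hypothetical category names, mirroring the defaults described in this note.
    LOCAL_FILE_OP_IN_PROJECT = "local file operation within project scope"
    READ_ONLY_API_REQUEST = "read-only API request"
    DECLARED_PACKAGE_INSTALL = "declared package installation"
    FORCE_PUSH = "force push"
    MASS_DELETION = "mass deletion"
    OTHER = "anything not named in this note"


ALLOWED_BY_DEFAULT = frozenset({
    Action.LOCAL_FILE_OP_IN_PROJECT,
    Action.READ_ONLY_API_REQUEST,
    Action.DECLARED_PACKAGE_INSTALL,
})

BLOCKED_BY_DEFAULT = frozenset({
    Action.FORCE_PUSH,
    Action.MASS_DELETION,
})


def default_decision(action: Action) -> str:
    """Return "allow", "block", or "unspecified" for a hypothetical action category."""
    if action in BLOCKED_BY_DEFAULT:
        return "block"
    if action in ALLOWED_BY_DEFAULT:
        return "allow"
    # This note does not say how other actions are handled by default.
    return "unspecified"
```

A table like this is deterministic, which is exactly what the model-based classifier is not: the same action can in principle be judged differently across runs.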

Detail: Auto mode reduces friction by removing the need to approve each action. However, it relies on AI-based classification, which is inherently non-deterministic. Simon Willison is skeptical of AI-based prompt injection protection and argues that deterministic sandboxing at the infrastructure level (file access restrictions, network controls) provides more reliable security than LLM-based evaluation.
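
To make the contrast concrete, a deterministic file access restriction might look like the minimal sketch below. It assumes a single project root directory, and `is_within_project` is a hypothetical helper, not anything from Claude Code. A real sandbox would enforce this at the OS or container level rather than in application code; the point here is only that the same input always yields the same answer.

```python
from pathlib import Path


def is_within_project(project_root: str, candidate: str) -> bool:
    """Return True only if `candidate` resolves to a path inside `project_root`.

    Hypothetical helper for illustration. Relative candidates are interpreted
    relative to the project root; absolute candidates are checked as given.
    """
    root = Path(project_root).resolve()
    # Joining with an absolute candidate discards `root`, so absolute paths
    # outside the project are rejected by the containment check below.
    target = (root / candidate).resolve()
    return target == root or root in target.parents
```

For example, with a project root of `/tmp/project`, the candidate `src/main.py` passes the check, while `../secrets.txt` and `/etc/passwd` do not.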

Sources

Simon Willison Blog — "Auto mode for Claude Code" (2026-03-24)