AI coding hits a "complexity cliff" — excels at isolated tasks, struggles with integration
Insight: Benchmarks reveal a predictable "complexity cliff" in AI coding: models achieve ~40% success on isolated React tasks (scaffolding, implementing explicit specs) but drop to ~25% on multi-step integrations involving state management and design taste. The gap between "AI helped me ship" and "AI gave me a mess" comes down to context engineering, prompt specificity, workflow structure, and guardrails — all within the developer's control.
Detail: Based on data from Design Arena, Web Dev Arena, SWE-Bench, and Web-Bench. Osmani's core thesis: AI's usefulness for React is extremely uneven but predictable. Understanding where models fail (complex state, design judgment, multi-module coordination) lets developers structure prompts and workflows to avoid wasted effort. Deep React and domain knowledge are what enable a developer to spot when AI goes off the rails.
Sources
Related: existing entry "AI coding quality is a skill issue" in external/claude-code.md — COMPLEMENTS