Claude SubAgents
Claude SubAgents are isolated Claude instances spawned by a parent (orchestrator) agent to perform bounded, well-defined tasks. They are the primary architectural mechanism for solving context window exhaustion and token cost explosion in production-scale agentic workflows within Claude Code.
The Core Problem They Solve
A single Claude Code session accumulates context linearly: system prompt, CLAUDE.md, memory files, tool outputs, and conversation history all compete for the same finite window (typically 200k tokens, up to 1M on some models and tiers). As tasks grow in scope:
- Context Blindness occurs when early instructions or file content fall outside the active window.
- Auto-Compaction creates lossy summaries that can drop critical technical constraints.
- Token Costs compound non-linearly because every round-trip carries the full accumulated context.
SubAgents break this pathology by enforcing isolated context boundaries at the agent level.
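The compounding cost can be made concrete with a rough model. The sketch below assumes (for illustration only, not as a billing formula) that every round-trip resends a fixed base context plus all earlier turns; the function name and the figures of 10,000 base tokens and 2,000 tokens per turn are hypothetical.

```python
def cumulative_input_tokens(base: int, per_turn: int, turns: int) -> int:
    """Total input tokens sent over `turns` round-trips.

    Assumes each call resends the base context plus every turn so far.
    """
    total = 0
    for turn in range(1, turns + 1):
        total += base + per_turn * turn
    return total


if __name__ == "__main__":
    # Hypothetical figures: 10k-token base context, 2k tokens added per turn.
    for turns in (10, 20, 40):
        print(turns, cumulative_input_tokens(10_000, 2_000, turns))
```

Under these assumed figures, going from 20 to 40 turns raises the total from 620,000 to 2,040,000 input tokens: the turn count doubles, but the cost more than triples.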
Architecture
┌──────────────────────────────────────────────┐
│ Orchestrator (Parent Agent)                  │
│ Full session context: CLAUDE.md + history    │
│                                              │
│ Dispatches tasks ──► SubAgent A (task 1)     │
│                  ──► SubAgent B (task 2)     │
│                  ──► SubAgent C (task 3)     │
│                                              │
│ ◄── Receives concise summaries only          │
└──────────────────────────────────────────────┘
Key Properties
- Isolated Context Window: Each subagent starts fresh — it receives only a focused prompt and the specific files/tools required for its task, not the parent’s accumulated history.
- Bounded Scope: One subagent, one discrete objective.
- Summarized Return: The subagent returns a concise result to the parent, not a raw context dump, so the parent window does not inherit the subagent’s token usage.
- Parallelism: Multiple subagents can execute concurrently on independent tasks.
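The properties above can be sketched as a small orchestrator. This is a minimal illustration, not Claude Code's implementation: `run_subagent`, `SubagentResult`, and `orchestrate` are hypothetical names, and the subagent call is simulated rather than wired to a real model.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class SubagentResult:
    task: str
    summary: str  # the only thing the parent context absorbs


async def run_subagent(task: str, files: list[str]) -> SubagentResult:
    """Hypothetical stand-in for a real subagent call.

    A real worker would read `files` inside its own context window;
    here the work is simulated and only a short summary is returned.
    """
    await asyncio.sleep(0)  # placeholder for model and tool latency
    return SubagentResult(task, f"{task}: reviewed {len(files)} file(s)")


async def orchestrate(tasks: dict[str, list[str]]) -> list[str]:
    # Independent tasks run concurrently; the parent keeps summaries only.
    results = await asyncio.gather(
        *(run_subagent(task, files) for task, files in tasks.items())
    )
    return [result.summary for result in results]


if __name__ == "__main__":
    summaries = asyncio.run(orchestrate({
        "audit auth": ["auth.py", "session.py"],
        "audit billing": ["billing.py"],
    }))
    print(summaries)
```

Each simulated worker receives only its task and file list, and the parent never sees file contents, which mirrors the isolation and summarized-return properties.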
Dispatch Patterns
1. Task Decomposition
The orchestrator breaks a large feature into atomic sub-tasks and dispatches one subagent per task. Each returns a status; the orchestrator synthesizes the whole.
2. Parallel File Analysis
When multiple large files must be analyzed simultaneously, one subagent per file is spawned. Only summaries flow back to the parent, preventing bulk file content from bloating the parent context.
3. Sandboxed Experimentation
For exploratory changes (“try three approaches”), the orchestrator spawns parallel subagents and promotes the best result — without polluting the main session with failed attempts.
4. Specialist Delegation
The orchestrator delegates to role-specialized subagents:
- Security Reviewer — loaded with security guidelines only
- Test Generator — loaded with test framework docs and target module
- Documentation Writer — loaded with style guide and public API
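For the Sandboxed Experimentation pattern, the "promote the best result" step needs some selection rule. The sketch below assumes one possible rule, the number of passing tests reported by each subagent; the `Attempt` type and `promote_best` function are hypothetical, and the source does not prescribe a scoring method.

```python
from dataclasses import dataclass


@dataclass
class Attempt:
    approach: str
    tests_passed: int  # assumed scoring signal returned by each subagent
    summary: str


def promote_best(attempts: list[Attempt]) -> Attempt:
    """Return the attempt with the most passing tests.

    On a tie, the earliest attempt in the list wins.
    """
    if not attempts:
        raise ValueError("no attempts to choose from")
    return max(attempts, key=lambda attempt: attempt.tests_passed)


if __name__ == "__main__":
    best = promote_best([
        Attempt("recursive parser", 14, "fails on nested quotes"),
        Attempt("state machine", 18, "all cases pass"),
        Attempt("regex based", 11, "misses escapes"),
    ])
    print(best.approach)
```

Only the winning attempt's summary needs to enter the parent session; the discarded attempts stay in their own contexts.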
Token Cost Model
| Session Type | Per-Call Cost Formula |
|---|---|
| Monolithic | System + Memory + Full History + New Input + Output |
| With SubAgents | Minimal System + Task Prompt + Relevant Files + Output (per subagent) |
At scale, subagent architecture can reduce total session token cost by 60–80% for complex multi-file projects, because the parent only pays for summary returns rather than compounding full-context round-trips.
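The two formulas in the table can be written out directly. The token figures in this sketch are hypothetical, chosen only to exercise the formulas; they are not measurements and do not validate the percentage quoted above.

```python
def monolithic_call_tokens(system: int, memory: int, history: int,
                           new_input: int, output: int) -> int:
    """Per-call tokens when one session carries everything."""
    return system + memory + history + new_input + output


def subagent_call_tokens(minimal_system: int, task_prompt: int,
                         relevant_files: int, output: int) -> int:
    """Per-call tokens for one subagent with an isolated context."""
    return minimal_system + task_prompt + relevant_files + output


if __name__ == "__main__":
    # All figures are hypothetical.
    mono = monolithic_call_tokens(3_000, 5_000, 120_000, 1_000, 2_000)
    sub = subagent_call_tokens(1_000, 500, 20_000, 2_000)
    print(mono, sub)
```

In this example the gap comes almost entirely from the history term, which the subagent never receives.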
Integration Points
- Skills: A Skill can internally dispatch subagents for heavy sub-tasks while keeping the Skill’s own context lean.
- Context Window Diagnostics: Use /context to monitor parent context saturation; high saturation is the trigger for subagent delegation.
- Plan Mode: Plan Mode can draft a decomposition strategy before the orchestrator dispatches subagents during execution.
- Spec-Driven Development: Each task in a spec’s task list maps naturally to a subagent invocation.
When NOT to Use SubAgents
- Simple, single-file tasks: Spawning a subagent adds latency overhead. Use direct tool calls instead.
- Tightly-coupled atomic edits: If changes must be transactionally consistent across many files, a single agent with careful compaction may be safer.
- Strict sequential dependencies: SubAgent results arrive asynchronously; if step N+1 depends critically on step N’s exact output, orchestrate sequentially rather than in parallel.
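One way to respect sequential dependencies while still parallelizing where it is safe is to group tasks into waves. The sketch below uses Python's standard `graphlib` module; the `dispatch_waves` function and the task names are hypothetical.

```python
from graphlib import TopologicalSorter


def dispatch_waves(dependencies: dict[str, set[str]]) -> list[list[str]]:
    """Group tasks into waves.

    `dependencies` maps each task to the tasks it depends on. Tasks in
    one wave are mutually independent and could run as parallel
    subagents; each wave starts only after the previous one finishes.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    print(dispatch_waves({
        "schema": set(),
        "api": {"schema"},
        "ui": {"schema"},
        "e2e": {"api", "ui"},
    }))
```

A chain in which every step depends on the previous one degenerates to one task per wave, which is the case where a single sequential agent is the simpler choice.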
Design Philosophy
SubAgents apply microservices architecture principles to LLM cognition: small, well-defined interfaces between cognitive units prevent the “monolith” failure mode of a single overloaded context window. Each agent does one thing, knows only what it needs to know, and returns a minimal signal — exactly the same contract as a well-designed service boundary.
This mirrors the Specialized Agent Roles pattern (Architect Agent, Implementation Agent, QA Agent) at the microscale of individual Claude Code sessions.
Related
- Claude-Code
- Context-Engineering
- Claude-Skills
- Claude-Code-Plan-Mode
- Agentic-Workflows
- Agent-Native-SDLC
- LLM-Observability
- Custom Subagent Design Patterns — file format, frontmatter reference, and canonical pattern library for building AI workers
Custom Subagent Configuration
Custom subagents are defined as Markdown files with YAML frontmatter stored in .claude/agents/ (project-scoped) or ~/.claude/agents/ (user-scoped). The file body becomes the system prompt.
---
name: code-reviewer
description: Reviews code quality. Use proactively after code changes.
tools: Read, Grep, Glob, Bash
model: sonnet
memory: project
---
You are a senior code reviewer...
Key Frontmatter Fields
| Field | Purpose |
|---|---|
| name | Unique identifier; also the @-mention handle |
| description | Routing signal; Claude reads this to decide when to delegate |
| tools | Allowlist; inherits all tools if omitted |
| disallowedTools | Denylist; applied before tools |
| model | haiku / sonnet / opus / full model ID / inherit |
| permissionMode | default, acceptEdits, auto, dontAsk, bypassPermissions, plan |
| memory | user, project, or local; enables cross-session persistent knowledge |
| mcpServers | Scope MCP servers exclusively to this agent |
| hooks | PreToolUse / PostToolUse / Stop lifecycle hooks |
| isolation | worktree for git-isolated execution |
| background | true to always run concurrently |
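To illustrate the file layout (frontmatter between `---` delimiters, system prompt as the body), here is a minimal parsing sketch. It is not how Claude Code loads agents: `parse_agent_file` is a hypothetical helper that handles only flat `key: value` lines, and a real loader would use a full YAML parser.

```python
EXAMPLE = """\
---
name: code-reviewer
description: Reviews code quality. Use proactively after code changes.
tools: Read, Grep, Glob, Bash
model: sonnet
memory: project
---
You are a senior code reviewer...
"""


def parse_agent_file(text: str) -> tuple[dict[str, str], str]:
    """Split an agent definition into (frontmatter fields, system prompt)."""
    lines = [line.rstrip() for line in text.strip().splitlines()]
    if not lines or lines[0] != "---":
        raise ValueError("missing opening frontmatter delimiter")
    end = lines.index("---", 1)  # raises ValueError if never closed
    fields = {}
    for line in lines[1:end]:
        if not line.strip():
            continue
        # Split on the first colon only, so values may contain colons.
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    return fields, body


if __name__ == "__main__":
    fields, prompt = parse_agent_file(EXAMPLE)
    tools = [tool.strip() for tool in fields["tools"].split(",")]
    print(fields["name"], tools, prompt)
```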
Scope Priority (highest to lowest)
- Managed settings (org-wide)
- --agents CLI flag (current session)
- .claude/agents/ (project-level)
- ~/.claude/agents/ (user-level)
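The priority order can be read as a first-match lookup across scopes. The sketch below assumes that interpretation; the scope labels and the `resolve_agent` function are hypothetical and do not reflect Claude Code internals.

```python
# Highest to lowest priority, following the list above.
SCOPE_ORDER = ["managed", "cli", "project", "user"]


def resolve_agent(name: str, scopes: dict[str, dict[str, str]]) -> tuple[str, str]:
    """Return (scope, definition) from the highest-priority scope defining `name`."""
    for scope in SCOPE_ORDER:
        definition = scopes.get(scope, {}).get(name)
        if definition is not None:
            return scope, definition
    raise KeyError(f"no agent named {name!r} in any scope")


if __name__ == "__main__":
    scopes = {
        "user": {"code-reviewer": "user-level definition"},
        "project": {"code-reviewer": "project-level definition"},
    }
    print(resolve_agent("code-reviewer", scopes))
```

Here the project-level definition shadows the user-level one with the same name.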
Built-in Subagents
Claude Code ships three built-in workers:
- Explore — read-only, Haiku-powered codebase search
- Plan — read-only context-gathering during plan mode (cannot spawn sub-subagents)
- general-purpose — full-tool multi-step operations
Invocation Patterns
- Natural language: name the agent in the request; Claude decides whether to delegate
- @-mention: for example, @code-reviewer look at auth changes; this guarantees delegation
- claude --agent <name>: the entire session runs under that agent’s system prompt and tool restrictions
Fork Mode (Experimental)
Set CLAUDE_CODE_FORK_SUBAGENT=1 to allow subagents that inherit the full parent conversation context rather than starting fresh. Forks share the parent’s prompt cache (cheaper) but sacrifice context isolation. Use a fork when the subagent would need too much background to reconstruct from scratch.
Cost Routing via Model Selection
Routing tasks to cheaper models is the primary cost lever:
- Exploration/search → Haiku
- Balanced analysis → Sonnet
- Complex reasoning → Opus
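The routing above amounts to a small lookup table. In this sketch the task-kind labels and the `pick_model` function are hypothetical; the model aliases and the `inherit` fallback are the values listed for the model frontmatter field.

```python
# Task kinds are illustrative labels, not a Claude Code concept.
MODEL_BY_TASK = {
    "exploration": "haiku",
    "analysis": "sonnet",
    "reasoning": "opus",
}


def pick_model(task_kind: str, default: str = "inherit") -> str:
    """Map a task kind to a model alias, falling back to the parent's model."""
    return MODEL_BY_TASK.get(task_kind, default)


if __name__ == "__main__":
    for kind in ("exploration", "analysis", "reasoning", "unknown"):
        print(kind, pick_model(kind))
```

The result would go in the model field of the corresponding agent definition.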