Claude SubAgents

Claude SubAgents are isolated Claude instances spawned by a parent (orchestrator) agent to perform bounded, well-defined tasks. They are the primary architectural mechanism for solving context window exhaustion and token cost explosion in production-scale agentic workflows within Claude Code.

The Core Problem They Solve

A single Claude Code session accumulates context linearly: system prompt, CLAUDE.md, memory files, tool outputs, and conversation history all compete for the same finite window (typically 200k tokens, up to 1M+ on higher tiers). As tasks grow in scope:

Context Blindness occurs when early instructions or file content fall outside the active window.
Auto-Compaction creates lossy summaries that can drop critical technical constraints.
Token Costs compound non-linearly because every round-trip carries the full accumulated context.

SubAgents break this pathology by enforcing isolated context boundaries at the agent level.

Architecture

┌──────────────────────────────────────────────┐
│          Orchestrator (Parent Agent)          │
│  Full session context: CLAUDE.md + history    │
│                                              │
│  Dispatches tasks ──► SubAgent A (task 1)    │
│                  ──► SubAgent B (task 2)    │
│                  ──► SubAgent C (task 3)    │
│                                              │
│  ◄── Receives concise summaries only          │
└──────────────────────────────────────────────┘

Key Properties

Isolated Context Window: Each subagent starts fresh — it receives only a focused prompt and the specific files/tools required for its task, not the parent’s accumulated history.
Bounded Scope: One subagent, one discrete objective.
Summarized Return: The subagent returns a concise result to the parent, not a raw context dump, so the parent window does not inherit the subagent’s token usage.
Parallelism: Multiple subagents can execute concurrently on independent tasks.

Dispatch Patterns

1. Task Decomposition

The orchestrator breaks a large feature into atomic sub-tasks and dispatches one subagent per task. Each returns a status; the orchestrator synthesizes the whole.

2. Parallel File Analysis

When multiple large files must be analyzed simultaneously, one subagent per file is spawned. Only summaries flow back to the parent, preventing bulk file content from bloating the parent context.

3. Sandboxed Experimentation

For exploratory changes (“try three approaches”), the orchestrator spawns parallel subagents and promotes the best result — without polluting the main session with failed attempts.

4. Specialist Delegation

The orchestrator delegates to role-specialized subagents:

Security Reviewer — loaded with security guidelines only
Test Generator — loaded with test framework docs and target module
Documentation Writer — loaded with style guide and public API

Token Cost Model

Session Type	Per-Call Cost Formula
Monolithic	`System + Memory + Full History + New Input + Output`
With SubAgents	`Minimal System + Task Prompt + Relevant Files + Output` (per subagent)

At scale, subagent architecture can reduce total session token cost by 60–80% for complex multi-file projects, because the parent only pays for summary returns rather than compounding full-context round-trips.

Integration Points

Skills: A Skill can internally dispatch subagents for heavy sub-tasks while keeping the Skill’s own context lean.
Context Window Diagnostics: Use /context to monitor parent context saturation — that is the trigger for subagent delegation.
Plan Mode: Plan Mode can draft a decomposition strategy before the orchestrator dispatches subagents in Act Mode.
Spec-Driven Development: Each task in a spec’s task list maps naturally to a subagent invocation.

When NOT to Use SubAgents

Simple, single-file tasks: Spawning a subagent adds latency overhead. Use direct tool calls instead.
Tightly-coupled atomic edits: If changes must be transactionally consistent across many files, a single agent with careful compaction may be safer.
Strict sequential dependencies: SubAgent results arrive asynchronously; if step N+1 depends critically on step N’s exact output, orchestrate sequentially rather than in parallel.

Design Philosophy

SubAgents apply microservices architecture principles to LLM cognition: small, well-defined interfaces between cognitive units prevent the “monolith” failure mode of a single overloaded context window. Each agent does one thing, knows only what it needs to know, and returns a minimal signal — exactly the same contract as a well-designed service boundary.

This mirrors the Specialized Agent Roles pattern (Architect Agent, Implementation Agent, QA Agent) at the microscale of individual Claude Code sessions.

Claude-Code
Context-Engineering
Claude-Skills
Claude-Code-Plan-Mode
Agentic-Workflows
Agent-Native-SDLC
LLM-Observability
Custom Subagent Design Patterns — file format, frontmatter reference, and canonical pattern library for building AI workers

Custom Subagent Configuration

Custom subagents are defined as Markdown files with YAML frontmatter stored in .claude/agents/ (project-scoped) or ~/.claude/agents/ (user-scoped). The file body becomes the system prompt.

---
name: code-reviewer
description: Reviews code quality. Use proactively after code changes.
tools: Read, Grep, Glob, Bash
model: sonnet
memory: project
---
 
You are a senior code reviewer...

Key Frontmatter Fields

Field	Purpose
`name`	Unique identifier; also the `@-mention` handle
`description`	Routing signal — Claude reads this to decide when to delegate
`tools`	Allowlist; inherits all tools if omitted
`disallowedTools`	Denylist; applied before `tools`
`model`	`haiku` / `sonnet` / `opus` / full model ID / `inherit`
`permissionMode`	`default`, `acceptEdits`, `auto`, `dontAsk`, `bypassPermissions`, `plan`
`memory`	`user`, `project`, or `local` — enables cross-session persistent knowledge
`mcpServers`	Scope MCP servers exclusively to this agent
`hooks`	`PreToolUse` / `PostToolUse` / `Stop` lifecycle hooks
`isolation`	`worktree` for git-isolated execution
`background`	`true` to always run concurrently

Scope Priority (highest to lowest)

Managed settings (org-wide)
--agents CLI flag (current session)
.claude/agents/ (project-level)
~/.claude/agents/ (user-level)

Built-in Subagents

Claude Code ships three built-in workers:

Explore — read-only, Haiku-powered codebase search
Plan — read-only context-gathering during plan mode (cannot spawn sub-subagents)
general-purpose — full-tool multi-step operations

Invocation Patterns

Natural language — name the agent; Claude decides whether to delegate
@-mention — @code-reviewer look at auth changes — guarantees delegation
claude --agent <name> — entire session runs under that agent’s system prompt and tool restrictions

Fork Mode (Experimental)

Enable CLAUDE_CODE_FORK_SUBAGENT=1 to allow subagents that inherit the full parent conversation context rather than starting fresh. Forks share the parent’s prompt cache (cheaper) but sacrifice context isolation. Use when the subagent would need too much background to reconstruct from scratch.

Cost Routing via Model Selection

Routing tasks to cheaper models is the primary cost lever:

Exploration/search → Haiku
Balanced analysis → Sonnet
Complex reasoning → Opus

Rakesh's Brain

Explorer

Claude SubAgents

Claude SubAgents

The Core Problem They Solve

Architecture

Key Properties

Dispatch Patterns

1. Task Decomposition

2. Parallel File Analysis

3. Sandboxed Experimentation

4. Specialist Delegation

Token Cost Model

Integration Points

When NOT to Use SubAgents

Design Philosophy

Custom Subagent Configuration

Key Frontmatter Fields

Scope Priority (highest to lowest)

Built-in Subagents

Invocation Patterns

Fork Mode (Experimental)

Cost Routing via Model Selection

References

Table of Contents

Graph View

Latest Blog Posts

Backlinks

Rakesh's Brain

Explorer

Claude SubAgents

Claude SubAgents

The Core Problem They Solve

Architecture

Key Properties

Dispatch Patterns

1. Task Decomposition

2. Parallel File Analysis

3. Sandboxed Experimentation

4. Specialist Delegation

Token Cost Model

Integration Points

When NOT to Use SubAgents

Design Philosophy

Related

Custom Subagent Configuration

Key Frontmatter Fields

Scope Priority (highest to lowest)

Built-in Subagents

Invocation Patterns

Fork Mode (Experimental)

Cost Routing via Model Selection

References

Table of Contents

Graph View

Latest Blog Posts

Backlinks