
Claude Code Context Management

How the 200K context window fills up, when to use /compact vs /clear, and strategies to stay productive on large codebases.

How Claude Code Uses Context

Claude Code appends everything to a running conversation context: your messages, Claude's responses, tool call results (file reads, grep outputs, bash stdout), and CLAUDE.md contents. Every new request sends the entire accumulated context to the API as input tokens.
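Because the whole context is resent on each request, the total input sent over a session grows much faster than the context itself. The sketch below illustrates this under simplifying assumptions: a fixed starting context, a constant number of tokens added per turn, and no prompt caching. The function name and the numbers are hypothetical, chosen only for illustration.

```typescript
// Minimal sketch: total input tokens sent across a session when every
// request resends the full accumulated context.
// Assumptions (hypothetical): constant growth per turn, no prompt caching.
function cumulativeInputTokens(
  baseTokens: number,    // starting context, e.g. system prompt plus CLAUDE.md
  tokensPerTurn: number, // tokens appended by each round-trip
  turns: number
): number {
  let context = baseTokens;
  let totalSent = 0;
  for (let turn = 0; turn < turns; turn++) {
    totalSent += context;     // the whole context goes out as input
    context += tokensPerTurn; // the reply and tool results are appended
  }
  return totalSent;
}

const tenTurns = cumulativeInputTokens(5000, 3000, 10);    // 185000
const twentyTurns = cumulativeInputTokens(5000, 3000, 20); // 670000
```

Doubling the number of turns more than triples the input sent in this example, which is why trimming context early pays off more than trimming it late.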

With Claude Sonnet or Opus, the context window is 200,000 tokens, roughly 150,000 words. That sounds enormous, but it fills quickly during real coding sessions:

| Operation | Approx. tokens added |
| --- | --- |
| Read a 500-line file | ~4,000–6,000 tokens (at roughly 8–12 tokens per line of code) |
| Read a 3,000-line file | ~25,000–35,000 tokens |
| Run test suite with 100 failures output | ~3,000–8,000 tokens |
| Large git diff (50 files) | ~15,000–40,000 tokens |
| CLAUDE.md (500 lines) | ~4,000–6,000 tokens per request |
| Average Claude reply | ~300–800 tokens |
| One full coding round-trip | ~2,000–5,000 tokens added |

A session with 20–30 round-trips, some file reads, and test output can easily reach 80,000–120,000 tokens. Claude Code warns you when you approach the limit.
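To get a feel for how quickly a session approaches the limit, you can add up rough estimates before you start. This is a back-of-the-envelope sketch, not a tokenizer: the interface, the constant names, and the 3,500-token average per round-trip (the midpoint of the range above) are assumptions for illustration.

```typescript
// Rough session budget estimate. All averages are assumptions; tune them
// for your own project.
interface SessionPlan {
  roundTrips: number;       // ordinary prompt/response exchanges
  fileReadTokens: number;   // estimated tokens from files read into context
  testOutputTokens: number; // estimated tokens of test or build output
}

const WINDOW_TOKENS = 200000;       // context window size discussed above
const TOKENS_PER_ROUND_TRIP = 3500; // assumed midpoint of 2,000–5,000

function estimateUsage(plan: SessionPlan): { tokens: number; fractionOfWindow: number } {
  const tokens =
    plan.roundTrips * TOKENS_PER_ROUND_TRIP +
    plan.fileReadTokens +
    plan.testOutputTokens;
  return { tokens, fractionOfWindow: tokens / WINDOW_TOKENS };
}

// 25 round-trips, some file reads, one noisy test run
const typical = estimateUsage({
  roundTrips: 25,
  fileReadTokens: 15000,
  testOutputTokens: 8000,
});
// typical.tokens is 110500, a little over half the window
```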

/compact vs /clear — Which to Use

/compact — Summarize & Continue

Claude reads the full context, writes a structured summary of progress, decisions, and current state, then replaces the history with that summary. Session continues from where you left off.

✅ Use when: mid-task, you want to keep context of what's been done

/clear — Reset Context

Discards all conversation history entirely. Returns to a clean slate with only CLAUDE.md loaded. Faster than /compact — no summarization step.

⚠️ Use when: switching to a completely different task, or after a dead-end investigation

Tip: Run /compact proactively when you notice context growing large, not after hitting the limit. A good time is after completing a discrete subtask — the summary then accurately captures a complete unit of work.
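The guidance above can be written down as a small decision rule. This is only a sketch of the heuristic: the type names and both thresholds (60% and 85% of the window) are hypothetical, and Claude Code does not expose a function like this.

```typescript
// Sketch of the /compact vs /clear heuristic described above.
// Thresholds are assumptions, not values used by Claude Code itself.
type Action = "continue" | "compact" | "clear";

interface SessionState {
  usedTokens: number;
  windowTokens: number;
  subtaskJustFinished: boolean; // a discrete unit of work is complete
  nextTaskUnrelated: boolean;   // switching to a completely different task
}

const COMPACT_AFTER_SUBTASK = 0.6; // compact proactively at a natural break
const COMPACT_REGARDLESS = 0.85;   // too close to the limit to wait

function suggestAction(s: SessionState): Action {
  if (s.nextTaskUnrelated) return "clear";
  const usage = s.usedTokens / s.windowTokens;
  if (usage >= COMPACT_REGARDLESS) return "compact";
  if (usage >= COMPACT_AFTER_SUBTASK && s.subtaskJustFinished) return "compact";
  return "continue";
}

const atNaturalBreak = suggestAction({
  usedTokens: 120000,
  windowTokens: 200000,
  subtaskJustFinished: true,
  nextTaskUnrelated: false,
}); // "compact"
```

The rule prefers compacting at a subtask boundary because the summary then describes finished work rather than a half-done step.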

Strategies to Extend Context for Large Repos

1. Read Selectively

Instead of "read the codebase and tell me what's wrong," say "I think the bug is in src/auth. Read src/auth/middleware.ts and src/auth/session.ts." Every unnecessary file read consumes tokens from a finite pool that only /compact or /clear can free up.

2. Prefer Grep Over File Reads

When searching for a symbol, a grep search returns only matching lines (cheap). Reading the file to find it returns everything (expensive). Instruct Claude: "grep for the function, then read only the relevant section."
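To see why a targeted search is cheaper, here is a minimal sketch of "grep, then read only the relevant section": it returns just the matching lines plus a little surrounding context instead of the whole file. The function name and the sample source are hypothetical, and this is not how Claude Code's own search tool is implemented.

```typescript
// Return only the lines around each match, prefixed with 1-based line numbers.
function grepWithContext(source: string, pattern: RegExp, contextLines: number): string[] {
  const lines = source.split("\n");
  const keep: boolean[] = lines.map(() => false);
  lines.forEach((line, i) => {
    // search() always scans from the start, so a /g flag cannot cause skipped matches
    if (line.search(pattern) >= 0) {
      const start = Math.max(0, i - contextLines);
      const end = Math.min(lines.length - 1, i + contextLines);
      for (let j = start; j <= end; j++) keep[j] = true;
    }
  });
  const result: string[] = [];
  lines.forEach((line, i) => {
    if (keep[i]) result.push(`${i + 1}: ${line}`);
  });
  return result;
}

// Hypothetical file contents
const sample = [
  "import { db } from './db';",
  "",
  "export function loadUser(id: string) {",
  "  return db.users.find(id);",
  "}",
  "",
  "export function validateSession(token: string) {",
  "  return token.length > 0;",
  "}",
].join("\n");

// 3 of 9 lines come back: the match on line 7 plus one line either side
const excerpt = grepWithContext(sample, /validateSession/, 1);
```

On a 3,000-line file the same call would still return only a handful of lines, which is the saving the strategy relies on.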

3. Trim Bash Output

Long test failures or build logs are token-expensive. Ask Claude to pipe output:

# Instead of: npm test (dumps full output)
# Ask Claude to run: npm test 2>&1 | head -100
# Or: npm test 2>&1 | grep -E "FAIL|Error|×" | head -50
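If you drive commands from a script rather than a shell pipeline, the same trimming can be done in code. The sketch below mirrors the grep-then-head idea and also reports how many matching lines were dropped, so nobody mistakes a truncated log for a complete one. The function name and the sample log are made up for illustration.

```typescript
// Keep only lines matching `keep`, cap them at `maxLines`, and note what was dropped.
function trimOutput(output: string, keep: RegExp, maxLines: number): string {
  const matches = output.split("\n").filter((line) => line.search(keep) >= 0);
  const shown = matches.slice(0, maxLines);
  const dropped = matches.length - shown.length;
  if (dropped > 0) shown.push(`... ${dropped} more matching lines omitted`);
  return shown.join("\n");
}

// Hypothetical test runner output
const rawLog = ["PASS a.test.ts", "FAIL b.test.ts", "FAIL c.test.ts", "Error: d failed", "PASS e.test.ts"].join("\n");

const trimmed = trimOutput(rawLog, /FAIL|Error/, 2);
// "FAIL b.test.ts\nFAIL c.test.ts\n... 1 more matching lines omitted"
```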

4. Break Into Sessions

For large multi-day tasks, break into self-contained subtasks across separate sessions. Use CLAUDE.md or a TODO.md file to carry context between sessions — notes committed to disk survive session boundaries; in-context notes don't.
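One way to make the handoff concrete is to render a short, structured note at the end of each session and save it as TODO.md. The sketch below only builds the text; writing it to disk is left to your own tooling. The `Handoff` shape, the section names, and the sample task are assumptions, not a format Claude Code requires.

```typescript
// Build a handoff note that a later session can read from disk.
interface Handoff {
  goal: string;
  done: string[];
  next: string[];
  decisions: string[];
}

function renderHandoff(h: Handoff): string {
  // Each section is a heading, one line per item, then a blank line
  const section = (title: string, items: string[], mark: string): string[] =>
    [`## ${title}`].concat(items.map((item) => `${mark} ${item}`), [""]);

  return [`# Goal: ${h.goal}`, ""]
    .concat(section("Done", h.done, "- [x]"))
    .concat(section("Next", h.next, "- [ ]"))
    .concat(section("Decisions", h.decisions, "-"))
    .join("\n");
}

// Hypothetical example task
const note = renderHandoff({
  goal: "Migrate billing to v2",
  done: ["Add v2 schema"],
  next: ["Port invoice job"],
  decisions: ["Keep v1 endpoints read-only"],
});
```

Keeping decisions in their own section matters most: progress can be recovered from git history, but the reasons behind a choice usually cannot.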

5. Use Sub-Agents for Isolated Work

Each sub-agent gets its own fresh 200K context. Delegate self-contained subtasks — "analyze this module and return a summary" — to a sub-agent. The orchestrating agent only consumes the returned summary, not the agent's full working context.

// Pattern: delegate analysis, consume only the result.
// Illustrative pseudocode: the call shape below is a sketch, not a literal API signature.
Agent({
  prompt: "Read src/billing/ and summarize the pricing logic in under 200 words",
  // Sub-agent gets its own fresh context; returns only the summary
})
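The saving from delegation is easy to quantify with a toy model. In the sketch below, each subtask has a working cost (the file reads and tool output needed to do it) and a summary cost (what comes back). The names and token counts are hypothetical, and the model ignores the small cost of the delegation prompt itself.

```typescript
// Toy model: tokens added to the orchestrator's context, inline vs delegated.
interface SubTask {
  name: string;
  workingTokens: number; // file reads and tool output needed to do the work
  summaryTokens: number; // size of the summary the sub-agent returns
}

function orchestratorTokens(tasks: SubTask[], delegate: boolean): number {
  return tasks.reduce(
    (total, t) => total + (delegate ? t.summaryTokens : t.workingTokens),
    0
  );
}

// Hypothetical subtasks
const tasks: SubTask[] = [
  { name: "analyze billing", workingTokens: 40000, summaryTokens: 300 },
  { name: "analyze auth", workingTokens: 25000, summaryTokens: 250 },
];

const inlineCost = orchestratorTokens(tasks, false);   // 65000
const delegatedCost = orchestratorTokens(tasks, true); // 550
```

The work still costs tokens overall, since each sub-agent spends its own budget. What changes is that the orchestrator's window stays free for coordination.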

6. Use /compact Before Switching Tasks

If you finish one subtask and want to start another in the same session, run /compact first. The summary replaces the detailed history, so the next task starts with more headroom and without the full record of everything you did before.

What /compact Includes in Its Summary

The auto-compaction summary typically covers progress so far, the decisions made, the current state of the work, and what comes next.

After compaction, Claude picks up from this summary — it will know what was done and what's next, but won't remember specific code snippets from the compressed history (those are in the files on disk).

Limitation: After /compact, Claude cannot quote from earlier in the conversation or recall specific outputs from discarded turns. For exact recall, commit work to files before compacting — the disk is always available; the context is not.

Context Management in Automated / CI Runs

When using Claude Code via the SDK non-interactively (e.g. in CI), auto-compaction fires when the context reaches a threshold. Your orchestrator receives a compaction notification. Best practice:

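// In SDK usage: handle compaction events.
// Illustrative pseudocode: on() and resume() are placeholder names, not documented SDK functions.
// Check the SDK reference for the actual compaction hook or message type.
on('compact', (summary) => {
  console.log('Context compacted:', summary.text)
  // Optionally inject a follow-up instruction to re-orient the model
  resume("Continue from the compact summary above. Next step: run the tests.")
})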

Frequently Asked Questions

What is the Claude Code context window limit?
Claude Code uses Claude models with a 200,000-token context window (~150,000 words). It fills quickly during long sessions because every tool result — file reads, grep outputs, bash stdout — is appended to the context and resent on every request.
What does /compact do in Claude Code?
/compact triggers automatic summarization of the current conversation. Claude writes a structured summary of progress, decisions, and state, then replaces the full history with that summary. You continue working in the same session with a much smaller context. Use it proactively mid-task before hitting the limit.
What is the difference between /compact and /clear in Claude Code?
/compact summarizes and compresses (retains a distilled history). /clear resets completely (all history discarded). Use /compact to continue a task; use /clear when you are switching to an entirely different task and don't need any memory of the previous work.
How do I avoid running out of context?
Read files selectively, prefer grep over full file reads, trim bash output with head/grep, break large tasks into separate sessions with state carried through committed files, and use sub-agents for isolated subtasks so each gets a fresh context budget.
Does large context cost more?
Yes. Input tokens are billed on every request. A 100,000-token context costs ~$0.30 per message at Claude Sonnet rates. Prompt caching reduces this by 90% for unchanged prefixes, but you still pay for the first request and for new content. /compact and /clear reduce input token costs for subsequent messages.
Can Claude Code automatically compact its own context?
Yes. When approaching the context limit, Claude Code auto-compacts and notifies you. You can also trigger /compact manually at any point. In SDK/non-interactive mode, the auto-compaction fires a notification your orchestrator can handle to log the summary or inject follow-up context before resuming.
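The cost figures in the answer on context cost can be reproduced with a few lines of arithmetic. This sketch assumes an input price of $3 per million tokens (consistent with the ~$0.30 per 100,000 tokens quoted above) and a 90% discount on cached tokens. It ignores output tokens and any surcharge for writing to the cache, so treat the results as rough estimates and check current pricing before relying on them.

```typescript
// Rough per-message input cost, with and without prompt caching.
// Prices are assumptions; check current pricing before relying on them.
const INPUT_PRICE_PER_MTOK = 3.0; // USD per million input tokens (assumed)
const CACHE_READ_DISCOUNT = 0.9;  // cached prefix billed at 10% of base (assumed)

function inputCost(contextTokens: number, cachedTokens: number): number {
  const perToken = INPUT_PRICE_PER_MTOK / 1000000;
  const freshTokens = contextTokens - cachedTokens;
  return (
    freshTokens * perToken +
    cachedTokens * perToken * (1 - CACHE_READ_DISCOUNT)
  );
}

const uncached = inputCost(100000, 0);         // about $0.30
const mostlyCached = inputCost(100000, 95000); // about $0.0435
const afterCompact = inputCost(10000, 0);      // about $0.03
```

Caching and compaction help in different ways: caching discounts the unchanged prefix, while /compact shrinks the context that every later message has to carry.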
