How Claude Code Uses Context
Claude Code appends everything to a running conversation context: your messages, Claude's responses, tool call results (file reads, grep outputs, bash stdout), and CLAUDE.md contents. Every new request sends the entire accumulated context to the API as input tokens.
With Claude Sonnet or Opus, the context window is 200,000 tokens, roughly 150,000 words. That sounds enormous, but it fills quickly during real coding sessions:
| Operation | Approx. tokens added |
|---|---|
| Read a 500-line file | ~750 tokens |
| Read a 3,000-line file | ~4,500 tokens |
| Run test suite with 100 failures output | ~3,000–8,000 tokens |
| Large git diff (50 files) | ~15,000–40,000 tokens |
| CLAUDE.md (500 lines) | ~750 tokens per request |
| Average Claude reply | ~300–800 tokens |
| One full coding round-trip | ~2,000–5,000 tokens added |
A session with 20–30 round-trips, some file reads, and test output can easily reach 80,000–120,000 tokens. Claude Code warns you when you approach the limit.
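The arithmetic behind that estimate can be sketched in a few lines of TypeScript. The per-operation costs are midpoints of the ranges in the table, and the session mix (25 round-trips, two large file reads, three failing test runs) is an assumed example rather than a measurement.

```typescript
// Rough per-operation costs: midpoints of the ranges in the table above.
const COST = {
  roundTrip: 3500,      // one full coding round-trip (~2,000 to 5,000)
  largeFileRead: 4500,  // reading a 3,000-line file
  failingTestRun: 5500, // test output with many failures (~3,000 to 8,000)
} as const;

type Operation = keyof typeof COST;

const CONTEXT_WINDOW = 200000;

// Sum the estimated tokens for a session described as operation counts.
function estimateSessionTokens(counts: Partial<Record<Operation, number>>): number {
  let total = 0;
  for (const op of Object.keys(counts) as Operation[]) {
    total += COST[op] * (counts[op] ?? 0);
  }
  return total;
}

// Assumed session mix, for illustration only.
const sessionTokens = estimateSessionTokens({
  roundTrip: 25,
  largeFileRead: 2,
  failingTestRun: 3,
});
const fractionUsed = sessionTokens / CONTEXT_WINDOW;

console.log(`${sessionTokens} estimated tokens (fraction of window: ${fractionUsed})`);
```

With these assumed numbers the session lands at 113,000 tokens, inside the 80,000–120,000 range quoted above and already past half of the window.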
/compact vs /clear — Which to Use
/compact — Summarize & Continue
Claude reads the full context, writes a structured summary of progress, decisions, and current state, then replaces the history with that summary. The session continues from where you left off.
✅ Use when: mid-task, you want to keep context of what's been done
/clear — Reset Context
Discards all conversation history entirely. Returns to a clean slate with only CLAUDE.md loaded. Faster than /compact — no summarization step.
⚠️ Use when: switching to a completely different task, or after a dead-end investigation
Run /compact proactively when you notice the context growing large, not after hitting the limit. A good time is right after completing a discrete subtask, because the summary then captures a complete unit of work.
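The choice between the two commands can be written down as a small decision function. This is only a sketch of the guidance above: the `SessionState` fields and the 50% threshold are assumptions made for illustration, not values Claude Code exposes or recommends.

```typescript
type Command = "/compact" | "/clear" | "continue";

interface SessionState {
  switchingToUnrelatedTask: boolean; // the next task shares nothing with this one
  hitDeadEnd: boolean;               // the investigation so far is not worth keeping
  subtaskJustCompleted: boolean;     // a discrete unit of work has just finished
  contextUsedFraction: number;       // 0..1, share of the window already used
}

// Hypothetical threshold: the text only says "when context is growing large".
const COMPACT_THRESHOLD = 0.5;

function recommendCommand(state: SessionState): Command {
  // The history is not worth summarizing: drop it entirely.
  if (state.switchingToUnrelatedTask || state.hitDeadEnd) {
    return "/clear";
  }
  // Mid-task with a growing context: summarize at a clean boundary.
  if (state.subtaskJustCompleted && state.contextUsedFraction >= COMPACT_THRESHOLD) {
    return "/compact";
  }
  return "continue";
}

const midTask: SessionState = {
  switchingToUnrelatedTask: false,
  hitDeadEnd: false,
  subtaskJustCompleted: true,
  contextUsedFraction: 0.6,
};

console.log(recommendCommand(midTask));
```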
Strategies to Extend Context for Large Repos
1. Read Selectively
Instead of "read the codebase and tell me what's wrong," say "I think the bug is in src/auth — read src/auth/middleware.ts and src/auth/session.ts." Every unnecessary file read consumes tokens from a limited pool, and they stay in the context until you compact or clear it.
2. Prefer Grep Over File Reads
When searching for a symbol, a grep search returns only matching lines (cheap). Reading the file to find it returns everything (expensive). Instruct Claude: "grep for the function, then read only the relevant section."
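To make the cost difference concrete, the sketch below compares a grep-style search with a full read of the same file. It uses character counts as a stand-in for tokens; the file contents and the `validateSession` symbol are synthetic and exist only for this example.

```typescript
interface Match {
  lineNumber: number; // 1-based, like grep -n
  text: string;
}

// Return only the lines that contain `symbol` (a plain substring search).
function grepLines(fileText: string, symbol: string): Match[] {
  return fileText
    .split("\n")
    .map((text, i) => ({ lineNumber: i + 1, text }))
    .filter((line) => line.text.includes(symbol));
}

// Synthetic 1,000-line file with one definition and one call site.
const lines = Array.from({ length: 1000 }, (_, i) => `const value${i} = ${i};`);
lines[119] = "function validateSession(token: string) {";
lines[640] = "  validateSession(req.token);";
const fileText = lines.join("\n");

const matches = grepLines(fileText, "validateSession");
const grepChars = matches.reduce((sum, m) => sum + m.text.length, 0);

// The matches are on lines 120 and 641.
console.log(matches.map((m) => m.lineNumber));
console.log(`grep returned ${grepChars} chars; a full read returns ${fileText.length} chars`);
```

Once the line numbers are known, Claude can read a narrow range around line 120 instead of the whole file.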
3. Trim Bash Output
Long test failures or build logs are token-expensive. Ask Claude to pipe output:
# Instead of plain `npm test`, which dumps the full output,
# ask Claude to run a truncated version:
npm test 2>&1 | head -100
# Or keep only the failure lines:
npm test 2>&1 | grep -E "FAIL|Error|×" | head -50
4. Break Into Sessions
For large multi-day tasks, break into self-contained subtasks across separate sessions. Use CLAUDE.md or a TODO.md file to carry context between sessions — notes committed to disk survive session boundaries; in-context notes don't.
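One way to carry notes across sessions is to append a short handoff section to TODO.md at the end of each session. The sketch below assumes Node.js; the `HandoffNote` fields and the Markdown layout are made up for illustration and are not a Claude Code convention.

```typescript
import { appendFileSync, mkdtempSync, readFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

interface HandoffNote {
  done: string[];          // what this session finished
  nextStep: string;        // where the next session should start
  openQuestions: string[]; // anything still unresolved
}

// Render a note as a Markdown section (hypothetical layout).
function formatHandoffNote(date: string, note: HandoffNote): string {
  const bullets = (items: string[]) => items.map((item) => `- ${item}`).join("\n");
  return [
    `## Session ${date}`,
    "### Done",
    bullets(note.done),
    "### Next step",
    `- ${note.nextStep}`,
    "### Open questions",
    bullets(note.openQuestions),
    "",
  ].join("\n");
}

// Append to the notes file so the content survives /clear and new sessions.
function saveHandoffNote(todoPath: string, date: string, note: HandoffNote): void {
  appendFileSync(todoPath, formatHandoffNote(date, note) + "\n");
}

// Demo against a temporary directory instead of a real repository.
const todoPath = join(mkdtempSync(join(tmpdir(), "handoff-")), "TODO.md");
saveHandoffNote(todoPath, "2025-01-15", {
  done: ["Fixed token refresh in src/auth/session.ts"],
  nextStep: "Add regression tests for src/auth/middleware.ts",
  openQuestions: ["Should expired sessions be purged on login?"],
});

const savedNotes = readFileSync(todoPath, "utf8");
console.log(savedNotes);
```

At the start of the next session, asking Claude to read TODO.md restores this state for the cost of a few hundred tokens.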
5. Use Sub-Agents for Isolated Work
Each sub-agent gets its own fresh 200K context. Delegate self-contained subtasks — "analyze this module and return a summary" — to a sub-agent. The orchestrating agent only consumes the returned summary, not the agent's full working context.
// Pattern (illustrative pseudocode): delegate the analysis, consume only the result
Agent({
  prompt: "Read src/billing/ and summarize the pricing logic in under 200 words",
  // The sub-agent works in its own fresh context and returns only the summary
})
6. Use /compact Before Switching Tasks
If you finish one subtask and want to start a related one in the same session, run /compact first. The summary replaces the full history, so the next task starts with more headroom and without the detailed transcript of everything you did before. If the next task is unrelated, /clear is the better fit.
What /compact Includes in Its Summary
The compaction summary, whether triggered manually with /compact or automatically, typically covers:
- What task was being worked on and its current completion status
- Files that were created or modified (and the nature of the change)
- Key decisions made and their rationale
- Tests that were run and their results
- Any blockers or open questions
- The next step, if one was stated
After compaction, Claude picks up from this summary — it will know what was done and what's next, but won't remember specific code snippets from the compressed history (those are in the files on disk).
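As a mental model, the summary can be pictured as a record with one field per item in the list above. The real summary is free text, so the `CompactionSummary` type and the example values below are purely hypothetical; the sketch only illustrates what survives compaction and which files Claude would need to re-read to see exact code again.

```typescript
// Hypothetical shape: Claude Code's actual summary is prose, not a typed object.
interface CompactionSummary {
  task: string;
  status: "in-progress" | "blocked" | "done";
  filesChanged: { path: string; change: string }[];
  decisions: { decision: string; rationale: string }[];
  testResults: string[];
  openQuestions: string[];
  nextStep?: string; // present only if one was stated
}

// Code snippets are not kept in the summary, so exact code has to come from disk.
function filesToReread(summary: CompactionSummary): string[] {
  return summary.filesChanged.map((file) => file.path);
}

const exampleSummary: CompactionSummary = {
  task: "Fix session expiry bug",
  status: "in-progress",
  filesChanged: [
    { path: "src/auth/session.ts", change: "refresh token before expiry check" },
    { path: "src/auth/middleware.ts", change: "return 401 instead of 500" },
  ],
  decisions: [
    { decision: "Keep cookie-based sessions", rationale: "No client changes needed" },
  ],
  testResults: ["auth suite: 2 failures remaining"],
  openQuestions: [],
  nextStep: "Fix the remaining failing auth tests",
};

console.log(filesToReread(exampleSummary));
```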
Context Management in Automated / CI Runs
When using Claude Code via the SDK non-interactively (e.g. in CI), auto-compaction fires when the context reaches a threshold, and your orchestrator can observe that a compaction happened. A reasonable practice is to log the event and, where needed, re-orient the model with an explicit next step:
// Illustrative pseudocode: `on` and `resume` are placeholder names, not actual SDK functions.
// Check the SDK reference for how compaction is surfaced in your version.
on('compact', (summary) => {
  console.log('Context compacted:', summary.text)
  // Optionally inject a follow-up instruction to re-orient the model
  resume("Continue from the compact summary above. Next step: run the tests.")
})