Token Efficiency Tactics

Tier 1 “easy win” hacks

start fresh conversations with /clear between unrelated tasks
disconnect unused MCP servers
batch multiple instructions into a single prompt
use a “plan mode” so Claude clarifies requirements before doing heavy work
/context and /cost for visibility, adding a status line in the terminal, keeping the usage dashboard open, being selective when pasting large code or documents, and watching Claude run long tasks so you can stop wasteful loops early.

Tier 2 intermediate strategies

Tier 2 focuses on structural optimizations: keeping your claude.md file lean (under ~200 lines) and using it as an index that points to larger resources instead of embedding them. He advises referring Claude to specific files or functions, manually compacting around 60% context instead of waiting for auto-compact, clearing or compacting before >5‑minute breaks to avoid cache resets, and limiting noisy command outputs that silently bloat context.

Tier 3 advanced tactics and mindset

Tier 3 covers model selection (Sonnet for most coding, Haiku for sub-agents and light tasks, Opus only for deep planning), careful use of sub-agents because they are 7–10x more token-hungry, scheduling heavy sessions for off-peak hours, and treating claude.md as a “systems constitution” that stores stable decisions. He reframes hitting your limit as a sign of power usage—as long as you apply these practices—and closes with a checklist of actions like running /context, setting a status line, disconnecting MCP servers, compacting at 60%, and timing heavy work before resets or during off-peak windows.