Unverified

Fix Claude Code Conversation History Cache Invalidation (cch Sentinel Replacement) Causing 2-5x Token Burn in CLI, VS Code, and Standalone Binary — v2.1.69-v2.1.90

A native-layer bug in Anthropic's custom Bun fork causes Claude Code to permanently invalidate prompt cache across conversation turns. The root cause, identified through MITM proxy capture and Ghidra reverse engineering (jmarianski, Issue #40524), is a sentinel replacement mechanism that rewrites the billing attribution header (cch=00000) into historical tool results, breaking the cache prefix and forcing full cache rewrites on every subsequent turn. Symptoms include: 200K-300K cache_create token spikes per turn, weekly quotas exhausted in 1-2 hours instead of 5+, and API billing invoices 40-100% above key-level usage. Affected versions: v2.1.69 through v2.1.90 (v2.1.82 and v2.1.88 were never published to npm). A partial tool-schema-bytes fix landed in v2.1.89 (changelog commit 2d5c1ba). Multiple workarounds exist: set CLAUDE_CODE_ATTRIBUTION_HEADER=false (dead-simple, non-developer), downgrade to v2.1.68 (last known-good), or use npx instead of the binary. The issue was closed as completed on Apr 4, 2026. Latest Claude Code version is v2.1.173 (npm, verified 2026-06-11). For heavy skill/hook users, residual cache issues may still occur even on latest — use ANTHROPIC_LOG=debug to monitor.

Symptoms

Cache_read drops to 0 mid-session, followed by 200K-300K cache_create token writes on next turn
Weekly subscription quota exhausted in 1-2 hours instead of 5+
API billing invoices 40-100% higher than API key usage tracking
Claude Code session JSONL files show cache_creation spikes after every resume or model switch
Standalone binary users see 'cch=00000' sentinel in tool results within serialized request body

Error signatures

ephemeral_1h_input_tokens: 0, ephemeral_5m_input_tokens: 305735 (cache rewrite spike)

cache_read: 0, cache_creation: 307626 (total cache drop and rebuild)

cch=00000 sentinel in serialized conversation history

Input tokens spike from ~300 to 300K+ in single turn

x-anthropic-billing-header contains cch=00000 in system prompt block

Possible causes

Native-layer sentinel replacement in Anthropic's custom Bun fork rewrites billing attribution header (cch=00000) into tool results, changing the conversation hash and invalidating the entire cache prefix
Since v2.1.69, cli.js injects x-anthropic-billing-header with cch=00000 as the first system prompt block; when this sentinel appears in tool results, the native binary rewrites it, changing historical messages
Session JSONL writer in v2.1.69+ strips deferred_tools_delta attachment records before writing to disk, causing tool schema bytes to change on resume and break cache
Standalone binary (228MB ELF) uses a custom Bun fork with native caching logic that triggers sentinel replacement when cch= pattern detected in messages[] — npm/npx versions bypass this binary layer
Model switches mid-session (Opus→Sonnet throttle at 20% for Max 5x) trigger cache re-evaluation, which combined with the sentinel bug causes permanent cache misses

Solutions

Supplementary (After Primary Fix): Disable Dynamic Tool Search and Compact Frequently

risk: lowgithubpublished

After applying the primary fix (CLAUDE_CODE_ATTRIBUTION_HEADER=false), add ENABLE_TOOL_SEARCH=false to prevent runtime tool injection from shifting the conversation context. Also use /compact before pauses > 5 minutes. These are supplementary safety measures — apply the primary fix FIRST.

Confirm Solution 1 is already applied: grep CLAUDE_CODE_ATTRIBUTION_HEADER ~/.claude/settings.json → should show 'false'
Add ENABLE_TOOL_SEARCH=false to the same env block in settings.json
Use /compact before any pause longer than 5 minutes during coding sessions
Keep sessions short: one major task per Claude Code session reduces cache impact
Front-load CLAUDE.md with the most critical project context

Commands

# Add ENABLE_TOOL_SEARCH=false alongside the attribution header fix:

python3 -c "import json; p='$HOME/.claude/settings.json'; d=json.load(open(p)); d.setdefault('env',{})['ENABLE_TOOL_SEARCH']='false'; json.dump(d,open(p,'w'),indent=2)"

# Verify both env vars are set:

grep -E 'CLAUDE_CODE_ATTRIBUTION_HEADER|ENABLE_TOOL_SEARCH' ~/.claude/settings.json

Config examples

{
  "env": {
    "CLAUDE_CODE_ATTRIBUTION_HEADER": "false",
    "ENABLE_TOOL_SEARCH": "false"
  }
}

Risks

ENABLE_TOOL_SEARCH=false may reduce runtime tool discovery; most projects don't need dynamic tool search
/compact may lose some conversation context — avoid excessive use; once per major task boundary is sufficient

Verification

Step 1: grep ENABLE_TOOL_SEARCH ~/.claude/settings.json → expect: '"ENABLE_TOOL_SEARCH": "false"'
Step 2: grep CLAUDE_CODE_ATTRIBUTION_HEADER ~/.claude/settings.json → expect: '"CLAUDE_CODE_ATTRIBUTION_HEADER": "false"' (BOTH mitigations confirmed active)
Step 3: ANTHROPIC_LOG=debug claude -p 'test' 2>&1 | grep -c 'cch=00000' → expect: 0

✓ 0 verified✕ 0 failed

Diagnostic: Run /check-cache-bugs to Identify Which Cache Bugs Are Active in Your Setup

risk: lowgithubpublished

The community-developed /check-cache-bugs slash command (FlorianBruniaux) diagnoses all three known Claude Code cache bugs: Bug 1 (cch=00000 sentinel), Bug 2 (--resume deferred_tools_delta stripping), Bug 3 (attribution header absent on binary). Run this at the very START of a fresh session to get a clean diagnosis. If all three show CLEAN, your setup is safe.

mkdir -p ~/.claude/commands && curl -fsSL <URL> -o ~/.claude/commands/check-cache-bugs.md
Start a BRAND NEW Claude Code session — do NOT use --resume or --continue
As the VERY FIRST action: /check-cache-bugs
Review the three-line output for CLEAN/ACTIVE status
Safe alternative (no session pollution): claude -p "$(cat ~/.claude/commands/check-cache-bugs.md)"

Commands

mkdir -p ~/.claude/commands

curl -fsSL https://raw.githubusercontent.com/FlorianBruniaux/claude-code-ultimate-guide/main/examples/commands/check-cache-bugs.md -o ~/.claude/commands/check-cache-bugs.md

# START A NEW SESSION, then run: /check-cache-bugs

Risks

CRITICAL: Running mid-session injects cch= strings into context → triggers Bug 1 on standalone binary. Always use NEW session or print mode.
Community tool, not officially supported by Anthropic

Verification

Step 1: After /check-cache-bugs, output must show exactly: 'Bug 1: CLEAN', 'Bug 2: CLEAN', 'Bug 3: CLEAN' → your setup is fully protected
Step 2: If any bug shows ACTIVE, the output includes the exact fix command — apply it and re-run in a NEW session
Step 3: Print-mode alternative: claude -p "$(cat ~/.claude/commands/check-cache-bugs.md)" → expect diagnostic output to appear without errors

✓ 0 verified✕ 0 failed

Fallback Fix: Downgrade to v2.1.68 (Last Known-Good) or Use Latest npm + Workaround

risk: lowgithubpublished

The cch sentinel replacement bug lives in the standalone binary's native Bun fork. v2.1.68 predates the bug entirely. If you need modern features, use the latest npm version (v2.1.173 as of June 2026) with the CLAUDE_CODE_ATTRIBUTION_HEADER workaround. This solution gives you two clear paths: safety (v2.1.68) or recency (latest + workaround).

PATH A (safest): npm install -g @anthropic-ai/claude-code@2.1.68
PATH B (latest): npm install -g @anthropic-ai/claude-code@latest, then apply Solution 1 from above
Verify version: claude --version should show your chosen version
CRITICAL: Confirm you are NOT running the standalone binary — run file $(which claude)

Commands

# PATH A: downgrade to last known-good (zero sentinel bug)

npm install -g @anthropic-ai/claude-code@2.1.68

claude --version

# PATH B: latest npm with workaround

npm install -g @anthropic-ai/claude-code@latest

npx @anthropic-ai/claude-code --version

# Confirm not using the standalone binary:

file $(which claude)

Risks

v2.1.68 lacks features from 100+ subsequent releases; but for most coding tasks, it is fully functional
Latest npm may still have residual cache issues for users with heavy skill/hook configurations — monitor with ANTHROPIC_LOG=debug

Verification

Step 1: claude --version → expect: '2.1.68 (Claude Code)' OR '2.1.173 (Claude Code)' (matching your chosen path)
Step 2: file $(which claude) → expect output containing 'script' or 'symbolic link', NOT 'ELF 64-bit LSB executable' (standalone binary confirmed absent)
Step 3: npm list -g @anthropic-ai/claude-code 2>/dev/null | grep @anthropic-ai → expect: version matches your chosen release
Step 4: ANTHROPIC_LOG=debug npx @anthropic-ai/claude-code -p 'test' 2>&1 | grep -c 'cch=00000' → expect: 0

✓ 0 verified✕ 0 failed

Primary Fix (Non-Developer, Dead-Simple): Set CLAUDE_CODE_ATTRIBUTION_HEADER=false in Settings

risk: lowgithubpublished

Add CLAUDE_CODE_ATTRIBUTION_HEADER=false to your Claude Code settings. This is a DEAD-SIMPLE fix requiring zero programming — just add one line to your settings file. This stops the cch=00000 sentinel from entering the conversation and prevents the native binary from rewriting tool results. This is a proper feature toggle in the Claude Code source and is used by multiple third-party projects. No version change needed.

Check if settings file exists: ls ~/.claude/settings.json 2>/dev/null || echo '{}' > ~/.claude/settings.json
Add the env block (use the config_example below — copy-paste it directly)
If you prefer a one-liner: python3 -c "exec(open('/dev/stdin').read())" and paste the short script OR manually edit the file
Restart Claude Code by closing ALL sessions and opening a fresh one
Run the ANTHROPIC_LOG=debug verification command to confirm the fix

Commands

# One-liner: create settings.json if missing, then set the env var

ls ~/.claude/settings.json 2>/dev/null || echo '{}' > ~/.claude/settings.json

# Python approach (requires python3):

python3 -c "import json; p='$HOME/.claude/settings.json'; d=json.load(open(p)); d.setdefault('env',{})['CLAUDE_CODE_ATTRIBUTION_HEADER']='false'; json.dump(d,open(p,'w'),indent=2)"

# OR manually: open ~/.claude/settings.json and add the env block from config_examples below

# Verify the setting took effect:

grep 'CLAUDE_CODE_ATTRIBUTION_HEADER' ~/.claude/settings.json

# Debug: check if sentinel is present in a quick one-shot session

ANTHROPIC_LOG=debug claude -p 'test' 2>&1 | grep -c 'cch=00000'

Config examples

{
  "env": {
    "CLAUDE_CODE_ATTRIBUTION_HEADER": "false"
  }
}

Risks

Disabling the attribution header may affect Anthropic's ability to track and debug your sessions; this is a supported feature toggle — not an unsupported hack
If hooks/skills reference cch= values, update them: grep -r 'cch=' ~/.claude/ to check

Verification

Step 1: grep -r 'cch=00000' ~/.claude/settings.json 2>/dev/null | wc -l → expect: 0 (no sentinel in settings file)
Step 2: grep 'CLAUDE_CODE_ATTRIBUTION_HEADER' ~/.claude/settings.json → expect: '"CLAUDE_CODE_ATTRIBUTION_HEADER": "false"' (setting is active)
Step 3: ANTHROPIC_LOG=debug claude -p 'test' 2>&1 | grep -c 'cch=00000' → expect: 0 (no sentinel in API request debug output)
Step 4: After a 10+ turn session, analyze the latest JSONL: ls ~/.claude/projects/*/agent-*.jsonl 2>/dev/null | tail -1 | xargs tail -3 | python3 -c "import sys,json; lines=[json.loads(l) for l in sys.stdin if l.strip()]; print('Last 3 turns — cache_read:', [l.get('usage',{}).get('cache_read',{}).get('ephemeral_5m_input_tokens',0) for l in lines], 'cache_create:', [l.get('usage',{}).get('cache_creation',{}).get('ephemeral_5m_input_tokens',0) for l in lines])" → expect: cache_read consistently > 100K, cache_create < 10K per turn (no 200K-300K spikes)

✓ 0 verified✕ 0 failed

Agent JSON

Canonical machine-readable representation of this issue:

{
  "issue_id": "6814f9f3-4c8b-4de5-9e4f-7d1d29c3a887",
  "slug": "fix-claude-code-conversation-history-cache-invalidation-cch-sentinel-replacement-causing-2-5x-token-burn-in-cli-vs-code--gbhruw",
  "verification_status": "unverified",
  "canonical_json": "https://codekb.dev/v1/issues/fix-claude-code-conversation-history-cache-invalidation-cch-sentinel-replacement-causing-2-5x-token-burn-in-cli-vs-code--gbhruw"
}