
Progressive Disclosure: Claude-Mem’s Context Priming Philosophy

Core Principle

Show what exists and its retrieval cost first. Let the agent decide what to fetch based on relevance and need.

What is Progressive Disclosure?

Progressive disclosure is an information architecture pattern where you reveal complexity gradually rather than all at once. In the context of AI agents, it means:
  1. Layer 1 (Index): Show lightweight metadata (titles, dates, types, token counts)
  2. Layer 2 (Details): Fetch full content only when needed
  3. Layer 3 (Deep Dive): Read original source files if required
This mirrors how humans work: we scan headlines before reading articles, review a table of contents before diving into chapters, and check file names before opening files.
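As a rough sketch, the three layers might look like this in code (the type and method names here are illustrative, not Claude-Mem's actual API):

// Illustrative sketch of the three disclosure layers.
// Type and method names are hypothetical, not Claude-Mem's actual API.

interface IndexEntry {            // Layer 1: lightweight metadata
  id: number;
  title: string;
  type: string;                   // e.g. "gotcha", "decision"
  timestamp: string;
  approxTokens: number;           // retrieval cost, visible up front
}

interface Observation extends IndexEntry {  // Layer 2: full content
  narrative: string;
  facts: string[];
  filesModified: string[];
}

interface MemoryStore {
  listIndex(): Promise<IndexEntry[]>;        // cheap: scan first
  fetch(id: number): Promise<Observation>;   // targeted: pay per item
  readSource(path: string): Promise<string>; // Layer 3: original source files
}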

The Problem: Context Pollution

Traditional RAG (Retrieval-Augmented Generation) systems fetch everything upfront:
❌ Traditional Approach:
┌─────────────────────────────────────┐
│ Session Start                        │
│                                      │
│ [15,000 tokens of past sessions]    │
│ [8,000 tokens of observations]      │
│ [12,000 tokens of file summaries]   │
│                                      │
│ Total: 35,000 tokens                │
│ Relevant: ~2,000 tokens (6%)        │
└─────────────────────────────────────┘
Problems:
  • Wastes 94% of attention budget on irrelevant context
  • User prompt gets buried under mountain of history
  • Agent must process everything before understanding task
  • No way to know what’s actually useful until after reading

Claude-Mem’s Solution: Progressive Disclosure

✅ Progressive Disclosure Approach:
┌─────────────────────────────────────┐
│ Session Start                        │
│                                      │
│ Index of 50 observations: ~800 tokens│
│ ↓                                    │
│ Agent sees: "🔴 Hook timeout issue"  │
│ Agent decides: "Relevant!"           │
│ ↓                                    │
│ Fetch observation #2543: ~155 tokens│
│                                      │
│ Total: 955 tokens                   │
│ Relevant: 955 tokens (100%)         │
└─────────────────────────────────────┘
Benefits:
  • Agent controls its own context consumption
  • Directly relevant to current task
  • Can fetch more if needed
  • Can skip everything if not relevant
  • Clear cost/benefit for each retrieval decision

How It Works in Claude-Mem

The Index Format

Every SessionStart hook provides a compact index:
### Oct 26, 2025

**General**
| ID | Time | T | Title | Tokens |
|----|------|---|-------|--------|
| #2586 | 12:58 AM | 🔵 | Context hook file exists but is empty | ~51 |
| #2587 | ″ | 🔵 | Context hook script file is empty | ~46 |
| #2589 | ″ | 🟡 | Investigated hook debug output docs | ~105 |

**src/hooks/context-hook.ts**
| ID | Time | T | Title | Tokens |
|----|------|---|-------|--------|
| #2591 | 1:15 AM | ⚖️ | Stderr messaging abandoned | ~155 |
| #2592 | 1:16 AM | ⚖️ | Web UI strategy redesigned | ~193 |
What the agent sees:
  • What exists: Observation titles give semantic meaning
  • When it happened: Timestamps for temporal context
  • What type: Icons indicate observation category
  • Retrieval cost: Token counts for informed decisions
  • Where to get it: MCP search tools referenced at bottom
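Rendering one of these rows is mechanical; a minimal sketch, reusing the hypothetical `IndexEntry` shape from earlier:

// Render one observation as a markdown index row (sketch).
// `iconFor` maps an observation type to its legend emoji (see below).
function renderIndexRow(e: IndexEntry, iconFor: (type: string) => string): string {
  return `| #${e.id} | ${e.timestamp} | ${iconFor(e.type)} | ${e.title} | ~${e.approxTokens} |`;
}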

The Legend System

🎯 session-request  - User's original goal
🔴 gotcha          - Critical edge case or pitfall
🟡 problem-solution - Bug fix or workaround
🔵 how-it-works    - Technical explanation
🟢 what-changed    - Code/architecture change
🟣 discovery       - Learning or insight
🟠 why-it-exists   - Design rationale
🟤 decision        - Architecture decision
⚖️ trade-off       - Deliberate compromise
Purpose:
  • Visual scanning (humans and AI both benefit)
  • Semantic categorization
  • Priority signaling (🔴 gotchas are more critical)
  • Pattern recognition across sessions
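In code, the legend reduces to a small lookup table; a sketch (the ⚪ fallback for unknown types is an assumption):

// Observation type → emoji icon, mirroring the legend above.
const TYPE_ICONS: Record<string, string> = {
  "session-request": "🎯",
  "gotcha": "🔴",
  "problem-solution": "🟡",
  "how-it-works": "🔵",
  "what-changed": "🟢",
  "discovery": "🟣",
  "why-it-exists": "🟠",
  "decision": "🟤",
  "trade-off": "⚖️",
};

const iconFor = (type: string): string => TYPE_ICONS[type] ?? "⚪";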

Progressive Disclosure Instructions

The index includes usage guidance:
💡 **Progressive Disclosure:** This index shows WHAT exists and retrieval COST.
- Use MCP search tools to fetch full observation details on-demand
- Prefer searching observations over re-reading code for past decisions
- Critical types (🔴 gotcha, 🟤 decision, ⚖️ trade-off) often worth fetching immediately
What this does:
  • Teaches the agent the pattern
  • Suggests when to fetch (critical types)
  • Recommends search over code re-reading (efficiency)
  • Makes the system self-documenting

The Philosophy: Context as Currency

Mental Model: Token Budget as Money

Think of the context window as a bank account:

| Approach | Metaphor | Outcome |
|----------|----------|---------|
| Dump everything | Spending your entire paycheck on groceries you might need someday | Waste, clutter, can’t afford what you actually need |
| Fetch nothing | Refusing to spend any money | Starvation, can’t accomplish tasks |
| Progressive disclosure | Check your pantry, make a shopping list, buy only what you need | Efficiency, room for unexpected needs |

The Attention Budget

LLMs have finite attention:
  • Every token attends to every other token (n² relationships)
  • 100,000 token window ≠ 100,000 tokens of useful attention
  • Context “rot” happens as window fills
  • Later tokens get less attention than earlier ones
Claude-Mem’s approach:
  • Start with ~1,000 tokens of index
  • Agent has 99,000 tokens free for task
  • Agent fetches ~200 tokens when needed
  • Final budget: ~98,000 tokens for actual work
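The arithmetic is simple enough to sketch (these are the illustrative numbers from the list above, not measured values):

// Illustrative budget math using the numbers above.
const WINDOW = 100_000;    // total context window
const indexCost = 1_000;   // upfront index
const fetchCost = 200;     // one targeted retrieval

const freeForTask = WINDOW - indexCost - fetchCost; // ≈ 98,800 tokens
console.log(`Remaining budget: ~${freeForTask} tokens for actual work`);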

Design for Autonomy

“As models improve, let them act intelligently”
Progressive disclosure treats the agent as an intelligent information forager, not a passive recipient of pre-selected context.

Traditional RAG:
System → [Decides relevance] → Agent
          ("Hope this helps!")

Progressive Disclosure:
System → [Shows index] → Agent → [Decides relevance] → [Fetches details]
                                  ("You know best!")
The agent knows:
  • The current task context
  • What information would help
  • How much budget to spend
  • When to stop searching
We don’t.

Implementation Principles

1. Make Costs Visible

Every item in the index shows token count:
| #2591 | 1:15 AM | ⚖️ | Stderr messaging abandoned | ~155 |
                                                        ^^^^
                                                    Retrieval cost
Why:
  • Agent can make informed ROI decisions
  • Small observations (~50 tokens) are “cheap” to fetch
  • Large observations (~500 tokens) require stronger justification
  • Matches how humans think about effort
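The counts only need to communicate scale, so a cheap approximation suffices; a sketch using the common rule of thumb of roughly four characters per token (an assumption for illustration, not Claude-Mem's documented counting method):

// Rough token estimate: ~4 characters per token for English text.
// Heuristic assumed for illustration, not Claude-Mem's actual method.
function approxTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

approxTokens("Hook timeout: 60s too short for npm install"); // ≈ 11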

2. Use Semantic Compression

Titles compress full observations into ~10 words.

Bad title:
Observation about a thing
Good title:
🔴 Hook timeout issue: 60s default too short for npm install
What makes a good title:
  • Specific: Identifies exact issue
  • Actionable: Clear what to do
  • Self-contained: Doesn’t require reading observation
  • Searchable: Contains key terms (hook, timeout, npm)
  • Categorized: Icon indicates type

3. Group by Context

Observations are grouped by:
  • Date: Temporal context
  • File path: Spatial context (work on specific files)
  • Project: Logical context
**src/hooks/context-hook.ts**
| ID | Time | T | Title | Tokens |
|----|------|---|-------|--------|
| #2591 | 1:15 AM | ⚖️ | Stderr messaging abandoned | ~155 |
| #2594 | 1:17 AM | 🟠 | Removed stderr section from docs | ~93 |
Benefit: If agent is working on src/hooks/context-hook.ts, related observations are already grouped together.
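A grouping pass along these lines is easy to sketch (`filePath` is a hypothetical field on the `IndexEntry` from earlier):

// Group index entries by file path; entries without a path go under
// "General", matching the index layout above. Sketch only.
function groupByFile(entries: (IndexEntry & { filePath?: string })[]): Map<string, IndexEntry[]> {
  const groups = new Map<string, IndexEntry[]>();
  for (const e of entries) {
    const key = e.filePath ?? "General";
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key)!.push(e);
  }
  return groups;
}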

4. Provide Retrieval Tools

The index is useless without retrieval mechanisms:
*Use claude-mem MCP search to access records with the given ID*
Available tools:
  • search_observations - Full-text search
  • find_by_concept - Concept-based retrieval
  • find_by_file - File-based retrieval
  • find_by_type - Type-based retrieval
  • get_recent_context - Recent session summaries
Each tool supports format: "index" (default) and format: "full".
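All of these share one call shape; a sketch inferred from the examples later in this document (field names beyond `query`, `format`, and `limit` should be treated as assumptions):

// Parameter shape shared by the search tools, inferred from the
// examples in this document; field names are illustrative.
interface SearchParams {
  query: string;
  format?: "index" | "full"; // "index" is the default
  limit?: number;            // cap the number of results
}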

Real-World Example

Scenario: Agent asked to fix a bug in hooks

Without progressive disclosure:
SessionStart injects 25,000 tokens of past context
Agent reads everything
Agent finds 1 relevant observation (buried in middle)
Total tokens consumed: 25,000
Relevant tokens: ~200
Efficiency: 0.8%
With progressive disclosure:
SessionStart shows index: ~800 tokens
Agent sees title: "🔴 Hook timeout issue: 60s too short"
Agent thinks: "This looks relevant to my bug!"
Agent fetches observation #2543: ~155 tokens
Total tokens consumed: 955
Relevant tokens: 955
Efficiency: 100%

The Index Entry

| #2543 | 2:14 PM | 🔴 | Hook timeout: 60s too short for npm install | ~155 |
What the agent learns WITHOUT fetching:
  • There’s a known gotcha (🔴) about hook timeouts
  • It’s related to npm install taking too long
  • Full details are ~155 tokens (cheap)
  • Happened at 2:14 PM (recent)
Decision tree:
Is my task related to hooks? → YES
Is my task related to timeouts? → YES
Is my task related to npm? → YES
155 tokens is cheap → FETCH IT
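That decision tree reduces to a cost-aware relevance check; a sketch (the keyword-overlap test and the 200-token threshold are arbitrary illustrations, not Claude-Mem's logic):

// Cost-aware fetch decision: fetch when the title overlaps the task's
// keywords and the retrieval cost is small. Thresholds are illustrative.
function shouldFetch(entry: IndexEntry, taskKeywords: string[]): boolean {
  const title = entry.title.toLowerCase();
  const hits = taskKeywords.filter((k) => title.includes(k.toLowerCase())).length;
  return hits >= 2 && entry.approxTokens <= 200; // relevant and cheap → fetch
}

// e.g. shouldFetch(entry2543, ["hook", "timeout", "npm"]) → true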

The Two-Tier Search Strategy

Claude-Mem implements progressive disclosure in search results too:

Tier 1: Index Format (Default)

search_observations({
  query: "hook timeout",
  format: "index"  // Default
})
Returns:
Found 3 observations matching "hook timeout":

| ID | Date | Type | Title | Tokens |
|----|------|------|-------|--------|
| #2543 | Oct 26 | gotcha | Hook timeout: 60s too short | ~155 |
| #2891 | Oct 25 | how-it-works | Hook timeout configuration | ~203 |
| #2102 | Oct 20 | problem-solution | Fixed timeout in CI | ~89 |
Cost: ~100 tokens for 3 results
Value: Agent can scan and decide which to fetch

Tier 2: Full Format (On-Demand)

search_observations({
  query: "hook timeout",
  format: "full",
  limit: 1  // Fetch just the most relevant
})
Returns:
#2543 🔴 Hook timeout: 60s too short for npm install
─────────────────────────────────────────────────
Date: Oct 26, 2025 2:14 PM
Type: gotcha
Project: claude-mem

Narrative:
Discovered that the default 60-second hook timeout is insufficient
for npm install operations, especially with large dependency trees
or slow network conditions. This causes SessionStart hook to fail
silently, preventing context injection.

Facts:
- Default timeout: 60 seconds
- npm install with cold cache: ~90 seconds
- Configured timeout: 120 seconds in plugin/hooks/hooks.json:25

Files Modified:
- plugin/hooks/hooks.json

Concepts: hooks, timeout, npm, configuration
Cost: ~155 tokens for full details
Value: Complete understanding of the issue

Cognitive Load Theory

Progressive disclosure is grounded in Cognitive Load Theory:

Intrinsic Load

The inherent difficulty of the task itself. Example: “Fix authentication bug”
  • Must understand auth system
  • Must understand the bug
  • Must write the fix
This load is unavoidable.

Extraneous Load

The cognitive burden of poorly presented information. Traditional RAG adds extraneous load:
  • Scanning irrelevant observations
  • Filtering out noise
  • Remembering what to ignore
  • Re-contextualizing after each section
Progressive disclosure minimizes extraneous load:
  • Scan titles (low effort)
  • Fetch only relevant (targeted effort)
  • Full attention on current task

Germane Load

The effort of building mental models and schemas. Progressive disclosure supports germane load:
  • Consistent structure (legend, grouping)
  • Clear categorization (types, icons)
  • Semantic compression (good titles)
  • Explicit costs (token counts)

Anti-Patterns to Avoid

❌ Verbose Titles

Bad:
| #2543 | 2:14 PM | 🔴 | Investigation into the issue where hooks time out | ~155 |
Good:
| #2543 | 2:14 PM | 🔴 | Hook timeout: 60s too short for npm install | ~155 |

❌ Hiding Costs

Bad:
| #2543 | 2:14 PM | 🔴 | Hook timeout issue |
Good:
| #2543 | 2:14 PM | 🔴 | Hook timeout issue | ~155 |

❌ No Retrieval Path

Bad:
Here are 10 observations. [No instructions on how to get full details]
Good:
Here are 10 observations.
*Use MCP search tools to fetch full observation details on-demand*

❌ Defaulting to Full Format

Bad:
search_observations({
  query: "hooks",
  format: "full"  // Fetches everything
})
Good:
search_observations({
  query: "hooks",
  format: "index",  // Scan first
  limit: 20
})

// Then, if needed:
search_observations({
  query: "hooks",
  format: "full",
  limit: 1  // Just the most relevant
})

Key Design Decisions

Why Token Counts?

Decision: Show approximate token counts (~155, ~203) rather than exact counts.

Rationale:
  • Communicates scale (50 vs 500) without false precision
  • Maps to human intuition (small/medium/large)
  • Allows agent to budget attention
  • Encourages cost-conscious retrieval

Why Icons Instead of Text Labels?

Decision: Use emoji icons (🔴, 🟡, 🔵) rather than text labels (GOTCHA, PROBLEM, HOWTO).

Rationale:
  • Visual scanning (pattern recognition)
  • Token efficient (1 char vs 10 chars)
  • Language-agnostic
  • Aesthetically distinct
  • Works for both humans and AI

Why Index-First, Not Smart Pre-Fetch?

Decision: Always show the index first, even if we “know” what’s relevant.

Rationale:
  • We can’t know what’s relevant better than the agent
  • Pre-fetching assumes we understand the task
  • Agent knows current context, we don’t
  • Respects agent autonomy
  • Fails gracefully (can always fetch more)

Why Group by File Path?

Decision: Group observations by file path in addition to date.

Rationale:
  • Spatial locality: Work on file X likely needs context about file X
  • Reduces scanning effort
  • Matches how developers think
  • Clear semantic boundaries

Measuring Success

Progressive disclosure is working when:

✅ Low Waste Ratio

Relevant Tokens / Total Context Tokens > 80%
Most of the context consumed is actually useful.
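Measured directly, this is a one-line ratio; a sketch using the numbers from the examples above:

// Share of consumed context that was actually useful; target > 0.8.
const wasteRatioOK = (relevantTokens: number, totalTokens: number): boolean =>
  relevantTokens / totalTokens > 0.8;

wasteRatioOK(920, 920);      // true  (progressive disclosure example)
wasteRatioOK(2_000, 35_000); // false (traditional dump example)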

✅ Selective Fetching

Index Shown: 50 observations
Details Fetched: 2-3 observations
Agent is being selective, not fetching everything.

✅ Fast Task Completion

Session with index: 30 seconds to find relevant context
Session without: 90 seconds scanning all context
Time-to-relevant-information is faster.

✅ Appropriate Depth

Simple task: Only index needed
Medium task: 1-2 observations fetched
Complex task: 5-10 observations + code reads
Depth scales with task complexity.

Future Enhancements

Adaptive Index Size

// Vary index size based on session type (sketch of a future enhancement)
function sessionsToIndex(source: "startup" | "resume" | "compact"): number {
  switch (source) {
    case "startup": return 10; // last 10 sessions (small index)
    case "resume":  return 1;  // only current session (micro index)
    case "compact": return 20; // last 20 sessions (larger index)
  }
}

Relevance Scoring

// Use embeddings to pre-sort index by relevance
search_observations({
  query: "authentication bug",
  format: "index",
  sort: "relevance"  // Based on semantic similarity
})

Cost Forecasting

💡 **Budget Estimate:**
- Fetching all 🔴 gotchas: ~450 tokens
- Fetching all file-related: ~1,200 tokens
- Fetching everything: ~8,500 tokens
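Such a forecast is just a sum over the per-entry costs already in the index; a sketch:

// Sum retrieval costs for a slice of the index (sketch).
function forecastCost(entries: IndexEntry[], pred: (e: IndexEntry) => boolean): number {
  return entries.filter(pred).reduce((sum, e) => sum + e.approxTokens, 0);
}

// e.g. forecastCost(index, (e) => e.type === "gotcha") // ~450 tokens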

Progressive Detail Levels

Layer 1: Index (titles only)
Layer 2: Summaries (2-3 sentences)
Layer 3: Full details (complete observation)
Layer 4: Source files (referenced code)

Key Takeaways

  1. Show, don’t tell: Index reveals what exists without forcing consumption
  2. Cost-conscious: Make retrieval costs visible for informed decisions
  3. Agent autonomy: Let the agent decide what’s relevant
  4. Semantic compression: Good titles make or break the system
  5. Consistent structure: Patterns reduce cognitive load
  6. Two-tier everything: Index first, details on-demand
  7. Context as currency: Spend wisely on high-value information

Remember

“The best interface is one that disappears when not needed, and appears exactly when it is.”
Progressive disclosure respects the agent’s intelligence and autonomy. We provide the map; the agent chooses the path.

This philosophy emerged from real-world usage of Claude-Mem across hundreds of coding sessions. The pattern works because it aligns with both human cognition and LLM attention mechanics.