
Progressive Disclosure: Claude-Mem’s Context Priming Philosophy

Core Principle

Show what exists and its retrieval cost first. Let the agent decide what to fetch based on relevance and need.

What is Progressive Disclosure?

Progressive disclosure is an information architecture pattern where you reveal complexity gradually rather than all at once. In the context of AI agents, it means:
  1. Layer 1 (Index): Show lightweight metadata (titles, dates, types, token counts)
  2. Layer 2 (Details): Fetch full content only when needed
  3. Layer 3 (Deep Dive): Read original source files if required
This mirrors how humans work: we scan headlines before reading articles, review a table of contents before diving into chapters, and check file names before opening files.
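As a rough sketch, the three layers might look like this in code (the type and method names here are illustrative, not Claude-Mem's actual API):

// Illustrative sketch of the three disclosure layers.
// Type and method names are hypothetical, not Claude-Mem's actual API.

interface IndexEntry {            // Layer 1: lightweight metadata
  id: number;
  title: string;
  type: string;                   // e.g. "gotcha", "decision"
  timestamp: string;
  approxTokens: number;           // retrieval cost, visible up front
}

interface Observation extends IndexEntry {  // Layer 2: full content
  narrative: string;
  facts: string[];
  filesModified: string[];
}

interface MemoryStore {
  listIndex(): Promise<IndexEntry[]>;        // cheap: scan first
  fetch(id: number): Promise<Observation>;   // targeted: pay per item
  readSource(path: string): Promise<string>; // Layer 3: original source files
}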

The Problem: Context Pollution

Traditional RAG (Retrieval-Augmented Generation) systems fetch everything upfront:
❌ Traditional Approach:
┌─────────────────────────────────────┐
│ Session Start                        │
│                                      │
│ [15,000 tokens of past sessions]    │
│ [8,000 tokens of observations]      │
│ [12,000 tokens of file summaries]   │
│                                      │
│ Total: 35,000 tokens                │
│ Relevant: ~2,000 tokens (6%)        │
└─────────────────────────────────────┘
Problems:
  • Wastes 94% of attention budget on irrelevant context
  • User prompt gets buried under mountain of history
  • Agent must process everything before understanding task
  • No way to know what’s actually useful until after reading

Claude-Mem’s Solution: Progressive Disclosure

✅ Progressive Disclosure Approach:
┌─────────────────────────────────────┐
│ Session Start                        │
│                                      │
│ Index of 50 observations: ~800 tokens│
│ ↓                                    │
│ Agent sees: "🔴 Hook timeout issue"  │
│ Agent decides: "Relevant!"           │
│ ↓                                    │
│ Fetch observation #2543: ~155 tokens│
│                                      │
│ Total: 955 tokens                   │
│ Relevant: 955 tokens (100%)         │
└─────────────────────────────────────┘
Benefits:
  • Agent controls its own context consumption
  • Directly relevant to current task
  • Can fetch more if needed
  • Can skip everything if not relevant
  • Clear cost/benefit for each retrieval decision

How It Works in Claude-Mem

The Index Format

Every SessionStart hook provides a compact index:
### Oct 26, 2025

**General**
| ID | Time | T | Title | Tokens |
|----|------|---|-------|--------|
| #2586 | 12:58 AM | 🔵 | Context hook file exists but is empty | ~51 |
| #2587 | ″ | 🔵 | Context hook script file is empty | ~46 |
| #2589 | ″ | 🟡 | Investigated hook debug output docs | ~105 |

**src/hooks/context-hook.ts**
| ID | Time | T | Title | Tokens |
|----|------|---|-------|--------|
| #2591 | 1:15 AM | ⚖️ | Stderr messaging abandoned | ~155 |
| #2592 | 1:16 AM | ⚖️ | Web UI strategy redesigned | ~193 |
What the agent sees:
  • What exists: Observation titles give semantic meaning
  • When it happened: Timestamps for temporal context
  • What type: Icons indicate observation category
  • Retrieval cost: Token counts for informed decisions
  • Where to get it: MCP search tools referenced at bottom
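Rendering one of these rows is mechanical; a minimal sketch, reusing the hypothetical `IndexEntry` shape from earlier:

// Render one observation as a markdown index row (sketch).
// `iconFor` maps an observation type to its legend emoji (see below).
function renderIndexRow(e: IndexEntry, iconFor: (type: string) => string): string {
  return `| #${e.id} | ${e.timestamp} | ${iconFor(e.type)} | ${e.title} | ~${e.approxTokens} |`;
}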

The Legend System

🎯 session-request  - User's original goal
🔴 gotcha          - Critical edge case or pitfall
🟡 problem-solution - Bug fix or workaround
🔵 how-it-works    - Technical explanation
🟢 what-changed    - Code/architecture change
🟣 discovery       - Learning or insight
🟠 why-it-exists   - Design rationale
🟤 decision        - Architecture decision
⚖️ trade-off       - Deliberate compromise
Purpose:
  • Visual scanning (humans and AI both benefit)
  • Semantic categorization
  • Priority signaling (🔴 gotchas are more critical)
  • Pattern recognition across sessions
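In code, the legend reduces to a small lookup table; a sketch (the ⚪ fallback for unknown types is an assumption):

// Observation type → emoji icon, mirroring the legend above.
const TYPE_ICONS: Record<string, string> = {
  "session-request": "🎯",
  "gotcha": "🔴",
  "problem-solution": "🟡",
  "how-it-works": "🔵",
  "what-changed": "🟢",
  "discovery": "🟣",
  "why-it-exists": "🟠",
  "decision": "🟤",
  "trade-off": "⚖️",
};

const iconFor = (type: string): string => TYPE_ICONS[type] ?? "⚪";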

Progressive Disclosure Instructions

The index includes usage guidance:
💡 **Progressive Disclosure:** This index shows WHAT exists and retrieval COST.
- Use MCP search tools to fetch full observation details on-demand
- Prefer searching observations over re-reading code for past decisions
- Critical types (🔴 gotcha, 🟤 decision, ⚖️ trade-off) often worth fetching immediately
What this does:
  • Teaches the agent the pattern
  • Suggests when to fetch (critical types)
  • Recommends search over code re-reading (efficiency)
  • Makes the system self-documenting

The Philosophy: Context as Currency

Mental Model: Token Budget as Money

Think of the context window as a bank account:

| Approach | Metaphor | Outcome |
|----------|----------|---------|
| Dump everything | Spending your entire paycheck on groceries you might need someday | Waste, clutter, can’t afford what you actually need |
| Fetch nothing | Refusing to spend any money | Starvation, can’t accomplish tasks |
| Progressive disclosure | Check your pantry, make a shopping list, buy only what you need | Efficiency, room for unexpected needs |

The Attention Budget

LLMs have finite attention:
  • Every token attends to every other token (n² relationships)
  • 100,000 token window ≠ 100,000 tokens of useful attention
  • Context “rot” happens as window fills
  • Later tokens get less attention than earlier ones
Claude-Mem’s approach:
  • Start with ~1,000 tokens of index
  • Agent has 99,000 tokens free for task
  • Agent fetches ~200 tokens when needed
  • Final budget: ~98,000 tokens for actual work
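The arithmetic is simple enough to sketch (these are the illustrative numbers from the list above, not measured values):

// Illustrative budget math using the numbers above.
const WINDOW = 100_000;    // total context window
const indexCost = 1_000;   // upfront index
const fetchCost = 200;     // one targeted retrieval

const freeForTask = WINDOW - indexCost - fetchCost; // ≈ 98,800 tokens
console.log(`Remaining budget: ~${freeForTask} tokens for actual work`);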

Design for Autonomy

“As models improve, let them act intelligently”
Progressive disclosure treats the agent as an intelligent information forager, not a passive recipient of pre-selected context.

Traditional RAG:
System → [Decides relevance] → Agent
          ("Hope this helps!")

Progressive Disclosure:
System → [Shows index] → Agent → [Decides relevance] → [Fetches details]
                                  ("You know best!")
The agent knows:
  • The current task context
  • What information would help
  • How much budget to spend
  • When to stop searching
We don’t.

Implementation Principles

1. Make Costs Visible

Every item in the index shows token count:
| #2591 | 1:15 AM | ⚖️ | Stderr messaging abandoned | ~155 |
                                                        ^^^^
                                                    Retrieval cost
Why:
  • Agent can make informed ROI decisions
  • Small observations (~50 tokens) are “cheap” to fetch
  • Large observations (~500 tokens) require stronger justification
  • Matches how humans think about effort
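The counts only need to communicate scale, so a cheap approximation suffices; a sketch using the common rule of thumb of roughly four characters per token (an assumption for illustration, not Claude-Mem's documented counting method):

// Rough token estimate: ~4 characters per token for English text.
// Heuristic assumed for illustration, not Claude-Mem's actual method.
function approxTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

approxTokens("Hook timeout: 60s too short for npm install"); // ≈ 11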

2. Use Semantic Compression

Titles compress full observations into ~10 words.

Bad title:
Observation about a thing
Good title:
🔴 Hook timeout issue: 60s default too short for npm install
What makes a good title:
  • Specific: Identifies exact issue
  • Actionable: Clear what to do
  • Self-contained: Doesn’t require reading observation
  • Searchable: Contains key terms (hook, timeout, npm)
  • Categorized: Icon indicates type

3. Group by Context

Observations are grouped by:
  • Date: Temporal context
  • File path: Spatial context (work on specific files)
  • Project: Logical context
**src/hooks/context-hook.ts**
| ID | Time | T | Title | Tokens |
|----|------|---|-------|--------|
| #2591 | 1:15 AM | ⚖️ | Stderr messaging abandoned | ~155 |
| #2594 | 1:17 AM | 🟠 | Removed stderr section from docs | ~93 |
Benefit: If agent is working on src/hooks/context-hook.ts, related observations are already grouped together.
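A grouping pass along these lines is easy to sketch (`filePath` is a hypothetical field on the `IndexEntry` from earlier):

// Group index entries by file path; entries without a path go under
// "General", matching the index layout above. Sketch only.
function groupByFile(entries: (IndexEntry & { filePath?: string })[]): Map<string, IndexEntry[]> {
  const groups = new Map<string, IndexEntry[]>();
  for (const e of entries) {
    const key = e.filePath ?? "General";
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key)!.push(e);
  }
  return groups;
}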

4. Provide Retrieval Tools

The index is useless without retrieval mechanisms:
*Use claude-mem MCP search to access records with the given ID*
Available tools:
  • search_observations - Full-text search
  • find_by_concept - Concept-based retrieval
  • find_by_file - File-based retrieval
  • find_by_type - Type-based retrieval
  • get_recent_context - Recent session summaries
Each tool supports format: "index" (default) and format: "full".
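All of these share one call shape; a sketch inferred from the examples later in this document (field names beyond `query`, `format`, and `limit` should be treated as assumptions):

// Parameter shape shared by the search tools, inferred from the
// examples in this document; field names are illustrative.
interface SearchParams {
  query: string;
  format?: "index" | "full"; // "index" is the default
  limit?: number;            // cap the number of results
}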

Real-World Example

Scenario: Agent asked to fix a bug in hooks

Without progressive disclosure:
SessionStart injects 25,000 tokens of past context
Agent reads everything
Agent finds 1 relevant observation (buried in middle)
Total tokens consumed: 25,000
Relevant tokens: ~200
Efficiency: 0.8%
With progressive disclosure:
SessionStart shows index: ~800 tokens
Agent sees title: "🔴 Hook timeout issue: 60s too short"
Agent thinks: "This looks relevant to my bug!"
Agent fetches observation #2543: ~155 tokens
Total tokens consumed: 955
Relevant tokens: 955
Efficiency: 100%

The Index Entry

| #2543 | 2:14 PM | 🔴 | Hook timeout: 60s too short for npm install | ~155 |
What the agent learns WITHOUT fetching:
  • There’s a known gotcha (🔴) about hook timeouts
  • It’s related to npm install taking too long
  • Full details are ~155 tokens (cheap)
  • Happened at 2:14 PM (recent)
Decision tree:
Is my task related to hooks? → YES
Is my task related to timeouts? → YES
Is my task related to npm? → YES
155 tokens is cheap → FETCH IT
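That decision tree reduces to a cost-aware relevance check; a sketch (the keyword-overlap test and the 200-token threshold are arbitrary illustrations, not Claude-Mem's logic):

// Cost-aware fetch decision: fetch when the title overlaps the task's
// keywords and the retrieval cost is small. Thresholds are illustrative.
function shouldFetch(entry: IndexEntry, taskKeywords: string[]): boolean {
  const title = entry.title.toLowerCase();
  const hits = taskKeywords.filter((k) => title.includes(k.toLowerCase())).length;
  return hits >= 2 && entry.approxTokens <= 200; // relevant and cheap → fetch
}

// e.g. shouldFetch(entry2543, ["hook", "timeout", "npm"]) → true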

The Two-Tier Search Strategy

Claude-Mem implements progressive disclosure in search results too:

Tier 1: Index Format (Default)

search_observations({
  query: "hook timeout",
  format: "index"  // Default
})
Returns:
Found 3 observations matching "hook timeout":

| ID | Date | Type | Title | Tokens |
|----|------|------|-------|--------|
| #2543 | Oct 26 | gotcha | Hook timeout: 60s too short | ~155 |
| #2891 | Oct 25 | how-it-works | Hook timeout configuration | ~203 |
| #2102 | Oct 20 | problem-solution | Fixed timeout in CI | ~89 |
Cost: ~100 tokens for 3 results
Value: Agent can scan and decide which to fetch

Tier 2: Full Format (On-Demand)

search_observations({
  query: "hook timeout",
  format: "full",
  limit: 1  // Fetch just the most relevant
})
Returns:
#2543 🔴 Hook timeout: 60s too short for npm install
─────────────────────────────────────────────────
Date: Oct 26, 2025 2:14 PM
Type: gotcha
Project: claude-mem

Narrative:
Discovered that the default 60-second hook timeout is insufficient
for npm install operations, especially with large dependency trees
or slow network conditions. This causes SessionStart hook to fail
silently, preventing context injection.

Facts:
- Default timeout: 60 seconds
- npm install with cold cache: ~90 seconds
- Configured timeout: 120 seconds in plugin/hooks/hooks.json:25

Files Modified:
- plugin/hooks/hooks.json

Concepts: hooks, timeout, npm, configuration
Cost: ~155 tokens for full details
Value: Complete understanding of the issue

Cognitive Load Theory

Progressive disclosure is grounded in Cognitive Load Theory:

Intrinsic Load

The inherent difficulty of the task itself. Example: “Fix authentication bug”
  • Must understand auth system
  • Must understand the bug
  • Must write the fix
This load is unavoidable.

Extraneous Load

The cognitive burden of poorly presented information. Traditional RAG adds extraneous load:
  • Scanning irrelevant observations
  • Filtering out noise
  • Remembering what to ignore
  • Re-contextualizing after each section
Progressive disclosure minimizes extraneous load:
  • Scan titles (low effort)
  • Fetch only relevant (targeted effort)
  • Full attention on current task

Germane Load

The effort of building mental models and schemas. Progressive disclosure supports germane load:
  • Consistent structure (legend, grouping)
  • Clear categorization (types, icons)
  • Semantic compression (good titles)
  • Explicit costs (token counts)

Anti-Patterns to Avoid

❌ Verbose Titles

Bad:
| #2543 | 2:14 PM | 🔴 | Investigation into the issue where hooks time out | ~155 |
Good:
| #2543 | 2:14 PM | 🔴 | Hook timeout: 60s too short for npm install | ~155 |

❌ Hiding Costs

Bad:
| #2543 | 2:14 PM | 🔴 | Hook timeout issue |
Good:
| #2543 | 2:14 PM | 🔴 | Hook timeout issue | ~155 |

❌ No Retrieval Path

Bad:
Here are 10 observations. [No instructions on how to get full details]
Good:
Here are 10 observations.
*Use MCP search tools to fetch full observation details on-demand*

❌ Defaulting to Full Format

Bad:
search_observations({
  query: "hooks",
  format: "full"  // Fetches everything
})
Good:
search_observations({
  query: "hooks",
  format: "index",  // Scan first
  limit: 20
})

// Then, if needed:
search_observations({
  query: "hooks",
  format: "full",
  limit: 1  // Just the most relevant
})

Key Design Decisions

Why Token Counts?

Decision: Show approximate token counts (~155, ~203) rather than exact counts.

Rationale:
  • Communicates scale (50 vs 500) without false precision
  • Maps to human intuition (small/medium/large)
  • Allows agent to budget attention
  • Encourages cost-conscious retrieval

Why Icons Instead of Text Labels?

Decision: Use emoji icons (🔴, 🟡, 🔵) rather than text labels (GOTCHA, PROBLEM, HOWTO).

Rationale:
  • Visual scanning (pattern recognition)
  • Token efficient (1 char vs 10 chars)
  • Language-agnostic
  • Aesthetically distinct
  • Works for both humans and AI

Why Index-First, Not Smart Pre-Fetch?

Decision: Always show the index first, even if we “know” what’s relevant.

Rationale:
  • We can’t know what’s relevant better than the agent
  • Pre-fetching assumes we understand the task
  • Agent knows current context, we don’t
  • Respects agent autonomy
  • Fails gracefully (can always fetch more)

Why Group by File Path?

Decision: Group observations by file path in addition to date.

Rationale:
  • Spatial locality: Work on file X likely needs context about file X
  • Reduces scanning effort
  • Matches how developers think
  • Clear semantic boundaries

Measuring Success

Progressive disclosure is working when:

✅ Low Waste Ratio

Relevant Tokens / Total Context Tokens > 80%
Most of the context consumed is actually useful.
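Measured directly, this is a one-line ratio; a sketch using the numbers from the examples above:

// Share of consumed context that was actually useful; target > 0.8.
const wasteRatioOK = (relevantTokens: number, totalTokens: number): boolean =>
  relevantTokens / totalTokens > 0.8;

wasteRatioOK(920, 920);      // true  (progressive disclosure example)
wasteRatioOK(2_000, 35_000); // false (traditional dump example)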

✅ Selective Fetching

Index Shown: 50 observations
Details Fetched: 2-3 observations
Agent is being selective, not fetching everything.

✅ Fast Task Completion

Session with index: 30 seconds to find relevant context
Session without: 90 seconds scanning all context
Time-to-relevant-information is faster.

✅ Appropriate Depth

Simple task: Only index needed
Medium task: 1-2 observations fetched
Complex task: 5-10 observations + code reads
Depth scales with task complexity.

Future Enhancements

Adaptive Index Size

// Vary index size based on session type (sketch of a future enhancement)
function sessionsToIndex(source: "startup" | "resume" | "compact"): number {
  switch (source) {
    case "startup": return 10; // last 10 sessions (small index)
    case "resume":  return 1;  // only current session (micro index)
    case "compact": return 20; // last 20 sessions (larger index)
  }
}

Relevance Scoring

// Use embeddings to pre-sort index by relevance
search_observations({
  query: "authentication bug",
  format: "index",
  sort: "relevance"  // Based on semantic similarity
})

Cost Forecasting

💡 **Budget Estimate:**
- Fetching all 🔴 gotchas: ~450 tokens
- Fetching all file-related: ~1,200 tokens
- Fetching everything: ~8,500 tokens
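Such a forecast is just a sum over the per-entry costs already in the index; a sketch:

// Sum retrieval costs for a slice of the index (sketch).
function forecastCost(entries: IndexEntry[], pred: (e: IndexEntry) => boolean): number {
  return entries.filter(pred).reduce((sum, e) => sum + e.approxTokens, 0);
}

// e.g. forecastCost(index, (e) => e.type === "gotcha") // ~450 tokens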

Progressive Detail Levels

Layer 1: Index (titles only)
Layer 2: Summaries (2-3 sentences)
Layer 3: Full details (complete observation)
Layer 4: Source files (referenced code)

Key Takeaways

  1. Show, don’t tell: Index reveals what exists without forcing consumption
  2. Cost-conscious: Make retrieval costs visible for informed decisions
  3. Agent autonomy: Let the agent decide what’s relevant
  4. Semantic compression: Good titles make or break the system
  5. Consistent structure: Patterns reduce cognitive load
  6. Two-tier everything: Index first, details on-demand
  7. Context as currency: Spend wisely on high-value information

Remember

“The best interface is one that disappears when not needed, and appears exactly when it is.”
Progressive disclosure respects the agent’s intelligence and autonomy. We provide the map; the agent chooses the path.

This philosophy emerged from real-world usage of Claude-Mem across hundreds of coding sessions. The pattern works because it aligns with both human cognition and LLM attention mechanics.