Memory Search with MCP Tools

Claude-mem provides persistent memory across sessions through 4 MCP tools that follow a token-efficient 3-layer workflow pattern.

Overview

Instead of fetching all historical data upfront (expensive), claude-mem uses a progressive disclosure approach:
  1. Search → Get a compact index with IDs (~50-100 tokens/result)
  2. Timeline → Get context around interesting results
  3. Get Observations → Fetch full details ONLY for filtered IDs
This achieves up to ~10x token savings compared to traditional RAG approaches.
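Put together, a typical pass looks like this sketch (the query and observation IDs are hypothetical):
search(query="authentication bug", type="bugfix", limit=10)
# → index surfaces observations #245, #312, #489 as candidates
timeline(anchor=312, depth_before=2, depth_after=2)
# → context confirms #312 is the fix in question
get_observations(ids=[312, 489])
# → full details for only the two relevant observations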

The 3-Layer Workflow

Layer 1: Search (Index)

Start by searching to get a lightweight index of results:
search(query="authentication bug", type="bugfix", limit=10)
Returns: Compact table with IDs, titles, dates, types
Cost: ~50-100 tokens per result
Purpose: Survey what exists before fetching details
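A hypothetical rendering of the index (exact formatting may vary by version):
ID   | Date       | Type   | Title
245  | 2025-10-03 | bugfix | Fix connection pool exhaustion
312  | 2025-10-12 | bugfix | Retry logic for auth token refresh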

Layer 2: Timeline (Context)

Get chronological context around specific observations:
timeline(anchor=<observation_id>, depth_before=3, depth_after=3)
Or search and get timeline in one step:
timeline(query="authentication", depth_before=2, depth_after=2)
Returns: Chronological view showing what was happening before/after
Cost: Variable, depends on depth
Purpose: Understand the narrative arc and context
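A hypothetical rendering of a timeline around an anchor (formatting may vary):
311  2025-10-12  discovery  Auth failures traced to clock skew
312  2025-10-12  bugfix     Retry logic for auth token refresh   ← anchor
313  2025-10-13  change     Tighten token TTL on refresh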

Layer 3: Get Observations (Details)

Fetch full details only for relevant observations:
get_observations(ids=[123, 456, 789])
Returns: Complete observation details (narrative, facts, files, concepts)
Cost: ~500-1,000 tokens per observation
Purpose: Deep dive on specific, validated items

Why This Works

Traditional Approach:
  • Fetch everything upfront: 20,000 tokens
  • Relevance: ~10% (2,000 tokens actually useful)
  • Waste: 18,000 tokens on irrelevant context
3-Layer Approach:
  • Search index: 1,000 tokens (10 results)
  • Timeline context: 500 tokens (around 2 key results)
  • Fetch details: 1,500 tokens (3 observations)
  • Total: 3,000 tokens, 100% relevant

Available Tools

__IMPORTANT - Workflow Documentation

An always-visible reminder of the 3-layer workflow pattern that helps Claude use the search tools efficiently.
Usage: Shown automatically; no need to invoke it.

search - Search Memory Index

Search your memory and get a compact index with IDs.
Parameters:
  • query - Full-text search query (supports AND, OR, NOT, phrase searches)
  • limit - Maximum results (default: 20)
  • offset - Skip first N results for pagination
  • type - Filter by observation type (bugfix, feature, decision, discovery, refactor, change)
  • obs_type - Filter by record type (observation, session, prompt)
  • project - Filter by project name
  • dateStart - Filter by start date (YYYY-MM-DD)
  • dateEnd - Filter by end date (YYYY-MM-DD)
  • orderBy - Sort order (date_desc, date_asc, relevance)
Returns: Compact index table with IDs, titles, dates, types
Example:
search(query="database migration", type="bugfix", limit=5, orderBy="date_desc")

timeline - Get Chronological Context

Get a chronological view of observations around a specific point or query.
Parameters:
  • anchor - Observation ID to center timeline around (optional if query provided)
  • query - Search query to find anchor automatically (optional if anchor provided)
  • depth_before - Number of observations before anchor (default: 3)
  • depth_after - Number of observations after anchor (default: 3)
  • project - Filter by project name
Returns: Chronological list showing what happened before/during/after
Example:
timeline(anchor=12345, depth_before=5, depth_after=5)
Or search-based:
timeline(query="implemented JWT auth", depth_before=3, depth_after=3)

get_observations - Fetch Full Details

Fetch complete observation details by IDs. Always batch multiple IDs in a single call for efficiency.
Parameters:
  • ids - Array of observation IDs (required)
  • orderBy - Sort order (date_desc, date_asc)
  • limit - Maximum observations to return
  • project - Filter by project name
Returns: Complete observation details including narrative, facts, files, concepts
Example:
get_observations(ids=[123, 456, 789, 1011])
Important: Always batch IDs instead of making separate calls per observation.
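For example, this contrast (with hypothetical IDs) shows the difference:
# Avoid: one call per observation
get_observations(ids=[123])
get_observations(ids=[456])
get_observations(ids=[789])

# Prefer: a single batched call
get_observations(ids=[123, 456, 789])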

Common Use Cases

Debugging Issues

Scenario: Find what went wrong with database connections
Step 1: search(query="error database connection", type="bugfix", limit=10)
  → Review index, identify observations #245, #312, #489

Step 2: timeline(anchor=312, depth_before=3, depth_after=3)
  → See what was happening around the fix

Step 3: get_observations(ids=[312, 489])
  → Get full details on relevant fixes

Understanding Decisions

Scenario: Review architectural choices about authentication
Step 1: search(query="authentication", type="decision", limit=5)
  → Find decision observations

Step 2: get_observations(ids=[<relevant_ids>])
  → Get full decision rationale, trade-offs, facts

Code Archaeology

Scenario: Find when a specific file was modified
Step 1: search(query="worker-service.ts", limit=20)
  → Get all observations mentioning that file

Step 2: timeline(query="worker-service.ts refactor", depth_before=2, depth_after=2)
  → See what led to and followed from the refactor

Step 3: get_observations(ids=[<specific_observation_ids>])
  → Get implementation details

Feature History

Scenario: Track how a feature evolved
Step 1: search(query="dark mode", type="feature", orderBy="date_asc")
  → Chronological view of feature work

Step 2: timeline(anchor=<first_observation_id>, depth_after=10)
  → See the full development timeline

Step 3: get_observations(ids=[<key_milestones>])
  → Deep dive on critical implementation points

Learning from Past Work

Scenario: Review refactoring patterns
Step 1: search(type="refactor", limit=10, orderBy="date_desc")
  → Recent refactoring work

Step 2: get_observations(ids=[<interesting_ids>])
  → Study the patterns and approaches used

Context Recovery

Scenario: Restore context after time away from project
Step 1: search(query="project-name", limit=10, orderBy="date_desc")
  → See recent work

Step 2: timeline(anchor=<most_recent_id>, depth_before=10)
  → Understand what led to current state

Step 3: get_observations(ids=[<critical_observations>])
  → Refresh memory on key decisions

Search Query Syntax

The query parameter supports SQLite FTS5 full-text search syntax:

Boolean Operators

query="authentication AND JWT"           # Both terms must appear
query="OAuth OR JWT"                      # Either term can appear
query="security NOT deprecated"           # Exclude deprecated items

Phrase Searches

query='"database migration"'             # Exact phrase match

Column-Specific Searches

query="title:authentication"             # Search in title only
query="content:database"                  # Search in content only
query="concepts:security"                 # Search in concepts only

Combining Operators

query='"user auth" AND (JWT OR session) NOT deprecated'

Token Management

Token Efficiency Best Practices

  1. Always start with search - Get index first (~50-100 tokens/result)
  2. Use small limits - Start with 3-5 results, increase if needed
  3. Filter before fetching - Use type, date, project filters (see the combined example after this list)
  4. Batch get_observations - Always group multiple IDs in one call
  5. Use timeline strategically - Get context only when narrative matters
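These practices compose in a single call; a sketch with hypothetical values, using only the parameters documented above:
search(
  query="cache invalidation",
  type="bugfix",
  project="my-app",
  dateStart="2025-10-01",
  dateEnd="2025-10-31",
  limit=5,
  orderBy="date_desc"
)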

Token Cost Estimates

Operation                          Tokens per Result
search (index)                     50-100
timeline (per observation)         100-200
get_observations (full details)    500-1,000
Example Comparison:
Inefficient:
# Fetching 20 full observations upfront: 10,000-20,000 tokens
get_observations(ids=[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20])
Efficient:
# Search index: ~1,000 tokens
search(query="bug fix", limit=20)

# Review IDs, identify 3 relevant observations

# Fetch only relevant: ~1,500-3,000 tokens
get_observations(ids=[5, 12, 18])

# Total: 2,500-4,000 tokens (vs 10,000-20,000)

Advanced Filtering

Date Ranges

search(
  query="performance optimization",
  dateStart="2025-10-01",
  dateEnd="2025-10-31"
)

Multiple Types

To cover multiple observation types, run one search per type or use a broader query:
search(query="database", type="bugfix", limit=10)
search(query="database", type="feature", limit=10)

Project-Specific

search(query="API", project="my-app", limit=15)

Pagination

# First page
search(query="refactor", limit=10, offset=0)

# Second page
search(query="refactor", limit=10, offset=10)

# Third page
search(query="refactor", limit=10, offset=20)

Result Metadata

All observations include rich metadata (a sample rendering follows the list):
  • ID - Unique observation identifier
  • Type - bugfix, feature, decision, discovery, refactor, change
  • Date - When the work occurred
  • Title - Concise description
  • Concepts - Tagged themes (e.g., security, performance, architecture)
  • Files Read - Files examined during work
  • Files Modified - Files changed during work
  • Narrative - Story of what happened and why
  • Facts - Key factual points (decisions made, patterns used, metrics)
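A hypothetical rendering of one fetched observation (field layout and content are illustrative and may vary by version):
ID: 312
Type: bugfix
Date: 2025-10-12
Title: Retry logic for auth token refresh
Concepts: security, reliability
Files Read: src/auth/token.ts
Files Modified: src/auth/refresh.ts
Narrative: Token refresh raced with in-flight requests, causing intermittent 401s; added a bounded retry around refresh.
Facts: Chose exponential backoff with 3 attempts; intermittent auth failures eliminated.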

Troubleshooting

No Results Found

  1. Broaden your search:
    # Too specific
    search(query="JWT authentication implementation with RS256")
    
    # Better
    search(query="authentication")
    
  2. Check database has data:
    curl "http://localhost:37777/api/search?query=test"
    
  3. Try without filters:
    # Remove type/date filters to see if data exists
    search(query="your-search-term")
    

IDs Not Found in get_observations

Error: "Observation IDs not found: [123, 456]"
Causes:
  • IDs from different project (use project parameter)
  • IDs were deleted
  • Typo in ID numbers
Solution:
# Verify IDs exist
search(query="<related-search>")

# Use correct project filter
get_observations(ids=[123, 456], project="correct-project-name")

Token Limit Errors

Error: Response exceeds token limits
Solution: Use the 3-layer workflow to reduce upfront costs:
# Instead of fetching 50 full observations:
# get_observations(ids=[1,2,3,...,50])  # 25,000-50,000 tokens!

# Do this:
search(query="<your-query>", limit=50)  # ~2,500-5,000 tokens
# Review index, identify 5 relevant observations
get_observations(ids=[<5-most-relevant>])  # ~2,500-5,000 tokens
# Total: 5,000-10,000 tokens (~80% savings)

Search Performance

If searches seem slow:
  1. Be more specific in queries (helps FTS5 index)
  2. Use date range filters to narrow scope
  3. Specify project filter when possible
  4. Use smaller limit values

Best Practices

  1. Index First, Details Later - Always start with search to survey options
  2. Filter Before Fetching - Use search parameters to narrow results
  3. Batch ID Fetches - Group multiple IDs in one get_observations call
  4. Use Timeline for Context - When narrative matters, timeline shows the story
  5. Specific Queries - More specific = better relevance
  6. Small Limits Initially - Start with 3-5 results, expand if needed
  7. Review Before Deep Dive - Check index before fetching full details

Technical Details

Architecture: MCP tools are a thin wrapper over the Worker HTTP API (localhost:37777). The MCP server translates tool calls into HTTP requests to the worker service, which handles all business logic, database queries, and Chroma vector search.
MCP Server: Located at ~/.claude/plugins/marketplaces/thedotmack/plugin/scripts/mcp-server.cjs
Worker Service: Express API on port 37777, managed by Bun
Database: SQLite FTS5 full-text search on ~/.claude-mem/claude-mem.db
Vector Search: Chroma embeddings for semantic search (underlying implementation)
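To make the wrapper relationship concrete, an MCP search call corresponds to an HTTP request against the worker, as in the Troubleshooting example above. A sketch (the limit query-string mapping is an assumption; verify against your worker version):
curl "http://localhost:37777/api/search?query=authentication&limit=5"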
