Memory Search with MCP Tools

Claude-mem provides persistent memory across sessions through 4 MCP tools that follow a token-efficient 3-layer workflow pattern.

Overview

Instead of fetching all historical data upfront (expensive), claude-mem uses a progressive disclosure approach:
  1. Search → Get a compact index with IDs (~50-100 tokens/result)
  2. Timeline → Get context around interesting results
  3. Get Observations → Fetch full details ONLY for filtered IDs
This achieves up to ~10x token savings compared to traditional RAG approaches.
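Put together, a typical pass looks like this sketch (the query and observation IDs are hypothetical):
search(query="authentication bug", type="bugfix", limit=10)
# → index surfaces observations #245, #312, #489 as candidates
timeline(anchor=312, depth_before=2, depth_after=2)
# → context confirms #312 is the fix in question
get_observations(ids=[312, 489])
# → full details for only the two relevant observations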

The 3-Layer Workflow

Layer 1: Search (Index)

Start by searching to get a lightweight index of results:
search(query="authentication bug", type="bugfix", limit=10)
Returns: Compact table with IDs, titles, dates, types
Cost: ~50-100 tokens per result
Purpose: Survey what exists before fetching details
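A hypothetical rendering of the index (exact formatting may vary by version):
ID   | Date       | Type   | Title
245  | 2025-10-03 | bugfix | Fix connection pool exhaustion
312  | 2025-10-12 | bugfix | Retry logic for auth token refresh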

Layer 2: Timeline (Context)

Get chronological context around specific observations:
timeline(anchor=<observation_id>, depth_before=3, depth_after=3)
Or search and get timeline in one step:
timeline(query="authentication", depth_before=2, depth_after=2)
Returns: Chronological view showing what was happening before/after
Cost: Variable, depends on depth
Purpose: Understand the narrative arc and context
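A hypothetical rendering of a timeline around an anchor (formatting may vary):
311  2025-10-12  discovery  Auth failures traced to clock skew
312  2025-10-12  bugfix     Retry logic for auth token refresh   ← anchor
313  2025-10-13  change     Tighten token TTL on refresh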

Layer 3: Get Observations (Details)

Fetch full details only for relevant observations:
get_observations(ids=[123, 456, 789])
Returns: Complete observation details (narrative, facts, files, concepts)
Cost: ~500-1,000 tokens per observation
Purpose: Deep dive on specific, validated items

Why This Works

Traditional Approach:
  • Fetch everything upfront: 20,000 tokens
  • Relevance: ~10% (2,000 tokens actually useful)
  • Waste: 18,000 tokens on irrelevant context
3-Layer Approach:
  • Search index: 1,000 tokens (10 results)
  • Timeline context: 500 tokens (around 2 key results)
  • Fetch details: 1,500 tokens (3 observations)
  • Total: 3,000 tokens, 100% relevant

Available Tools

__IMPORTANT - Workflow Documentation

An always-visible reminder of the 3-layer workflow pattern that helps Claude use the search tools efficiently.
Usage: Shown automatically; no need to invoke it.

search - Search Memory Index

Search your memory and get a compact index with IDs.
Parameters:
  • query - Full-text search query (supports AND, OR, NOT, phrase searches)
  • limit - Maximum results (default: 20)
  • offset - Skip first N results for pagination
  • type - Filter by observation type (bugfix, feature, decision, discovery, refactor, change)
  • obs_type - Filter by record type (observation, session, prompt)
  • project - Filter by project name
  • dateStart - Filter by start date (YYYY-MM-DD)
  • dateEnd - Filter by end date (YYYY-MM-DD)
  • orderBy - Sort order (date_desc, date_asc, relevance)
Returns: Compact index table with IDs, titles, dates, types
Example:
search(query="database migration", type="bugfix", limit=5, orderBy="date_desc")

timeline - Get Chronological Context

Get a chronological view of observations around a specific point or query.
Parameters:
  • anchor - Observation ID to center timeline around (optional if query provided)
  • query - Search query to find anchor automatically (optional if anchor provided)
  • depth_before - Number of observations before anchor (default: 3)
  • depth_after - Number of observations after anchor (default: 3)
  • project - Filter by project name
Returns: Chronological list showing what happened before/during/after
Example:
timeline(anchor=12345, depth_before=5, depth_after=5)
Or search-based:
timeline(query="implemented JWT auth", depth_before=3, depth_after=3)

get_observations - Fetch Full Details

Fetch complete observation details by IDs. Always batch multiple IDs in a single call for efficiency.
Parameters:
  • ids - Array of observation IDs (required)
  • orderBy - Sort order (date_desc, date_asc)
  • limit - Maximum observations to return
  • project - Filter by project name
Returns: Complete observation details including narrative, facts, files, concepts
Example:
get_observations(ids=[123, 456, 789, 1011])
Important: Always batch IDs instead of making separate calls per observation.
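For example, this contrast (with hypothetical IDs) shows the difference:
# Avoid: one call per observation
get_observations(ids=[123])
get_observations(ids=[456])
get_observations(ids=[789])

# Prefer: a single batched call
get_observations(ids=[123, 456, 789])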

Common Use Cases

Debugging Issues

Scenario: Find what went wrong with database connections
Step 1: search(query="error database connection", type="bugfix", limit=10)
  → Review index, identify observations #245, #312, #489

Step 2: timeline(anchor=312, depth_before=3, depth_after=3)
  → See what was happening around the fix

Step 3: get_observations(ids=[312, 489])
  → Get full details on relevant fixes

Understanding Decisions

Scenario: Review architectural choices about authentication
Step 1: search(query="authentication", type="decision", limit=5)
  → Find decision observations

Step 2: get_observations(ids=[<relevant_ids>])
  → Get full decision rationale, trade-offs, facts

Code Archaeology

Scenario: Find when a specific file was modified
Step 1: search(query="worker-service.ts", limit=20)
  → Get all observations mentioning that file

Step 2: timeline(query="worker-service.ts refactor", depth_before=2, depth_after=2)
  → See what led to and followed from the refactor

Step 3: get_observations(ids=[<specific_observation_ids>])
  → Get implementation details

Feature History

Scenario: Track how a feature evolved
Step 1: search(query="dark mode", type="feature", orderBy="date_asc")
  → Chronological view of feature work

Step 2: timeline(anchor=<first_observation_id>, depth_after=10)
  → See the full development timeline

Step 3: get_observations(ids=[<key_milestones>])
  → Deep dive on critical implementation points

Learning from Past Work

Scenario: Review refactoring patterns
Step 1: search(type="refactor", limit=10, orderBy="date_desc")
  → Recent refactoring work

Step 2: get_observations(ids=[<interesting_ids>])
  → Study the patterns and approaches used

Context Recovery

Scenario: Restore context after time away from project
Step 1: search(query="project-name", limit=10, orderBy="date_desc")
  → See recent work

Step 2: timeline(anchor=<most_recent_id>, depth_before=10)
  → Understand what led to current state

Step 3: get_observations(ids=[<critical_observations>])
  → Refresh memory on key decisions

Search Query Syntax

The query parameter supports SQLite FTS5 full-text search syntax:

Boolean Operators

query="authentication AND JWT"           # Both terms must appear
query="OAuth OR JWT"                      # Either term can appear
query="security NOT deprecated"           # Exclude deprecated items

Phrase Searches

query='"database migration"'             # Exact phrase match

Column-Specific Searches

query="title:authentication"             # Search in title only
query="content:database"                  # Search in content only
query="concepts:security"                 # Search in concepts only

Combining Operators

query='"user auth" AND (JWT OR session) NOT deprecated'

Token Management

Token Efficiency Best Practices

  1. Always start with search - Get index first (~50-100 tokens/result)
  2. Use small limits - Start with 3-5 results, increase if needed
  3. Filter before fetching - Use type, date, project filters (see the combined example after this list)
  4. Batch get_observations - Always group multiple IDs in one call
  5. Use timeline strategically - Get context only when narrative matters
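These practices compose in a single call; a sketch with hypothetical values, using only the parameters documented above:
search(
  query="cache invalidation",
  type="bugfix",
  project="my-app",
  dateStart="2025-10-01",
  dateEnd="2025-10-31",
  limit=5,
  orderBy="date_desc"
)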

Token Cost Estimates

Operation                          Tokens per Result
search (index)                     50-100
timeline (per observation)         100-200
get_observations (full details)    500-1,000
Example Comparison:
Inefficient:
# Fetching 20 full observations upfront: 10,000-20,000 tokens
get_observations(ids=[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20])
Efficient:
# Search index: ~1,000 tokens
search(query="bug fix", limit=20)

# Review IDs, identify 3 relevant observations

# Fetch only relevant: ~1,500-3,000 tokens
get_observations(ids=[5, 12, 18])

# Total: 2,500-4,000 tokens (vs 10,000-20,000)

Advanced Filtering

Date Ranges

search(
  query="performance optimization",
  dateStart="2025-10-01",
  dateEnd="2025-10-31"
)

Multiple Types

To cover multiple observation types, run one search per type or use a broader query:
search(query="database", type="bugfix", limit=10)
search(query="database", type="feature", limit=10)

Project-Specific

search(query="API", project="my-app", limit=15)

Pagination

# First page
search(query="refactor", limit=10, offset=0)

# Second page
search(query="refactor", limit=10, offset=10)

# Third page
search(query="refactor", limit=10, offset=20)

Result Metadata

All observations include rich metadata (a sample rendering follows the list):
  • ID - Unique observation identifier
  • Type - bugfix, feature, decision, discovery, refactor, change
  • Date - When the work occurred
  • Title - Concise description
  • Concepts - Tagged themes (e.g., security, performance, architecture)
  • Files Read - Files examined during work
  • Files Modified - Files changed during work
  • Narrative - Story of what happened and why
  • Facts - Key factual points (decisions made, patterns used, metrics)
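A hypothetical rendering of one fetched observation (field layout and content are illustrative and may vary by version):
ID: 312
Type: bugfix
Date: 2025-10-12
Title: Retry logic for auth token refresh
Concepts: security, reliability
Files Read: src/auth/token.ts
Files Modified: src/auth/refresh.ts
Narrative: Token refresh raced with in-flight requests, causing intermittent 401s; added a bounded retry around refresh.
Facts: Chose exponential backoff with 3 attempts; intermittent auth failures eliminated.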

Troubleshooting

No Results Found

  1. Broaden your search:
    # Too specific
    search(query="JWT authentication implementation with RS256")
    
    # Better
    search(query="authentication")
    
  2. Check database has data:
    curl "http://localhost:37777/api/search?query=test"
    
  3. Try without filters:
    # Remove type/date filters to see if data exists
    search(query="your-search-term")
    

IDs Not Found in get_observations

Error: "Observation IDs not found: [123, 456]"
Causes:
  • IDs from different project (use project parameter)
  • IDs were deleted
  • Typo in ID numbers
Solution:
# Verify IDs exist
search(query="<related-search>")

# Use correct project filter
get_observations(ids=[123, 456], project="correct-project-name")

Token Limit Errors

Error: Response exceeds token limits
Solution: Use the 3-layer workflow to reduce upfront costs:
# Instead of fetching 50 full observations:
# get_observations(ids=[1,2,3,...,50])  # 25,000-50,000 tokens!

# Do this:
search(query="<your-query>", limit=50)  # ~2,500-5,000 tokens
# Review index, identify 5 relevant observations
get_observations(ids=[<5-most-relevant>])  # ~2,500-5,000 tokens
# Total: 5,000-10,000 tokens (~80% savings)

Search Performance

If searches seem slow:
  1. Be more specific in queries (helps FTS5 index)
  2. Use date range filters to narrow scope
  3. Specify project filter when possible
  4. Use smaller limit values

Best Practices

  1. Index First, Details Later - Always start with search to survey options
  2. Filter Before Fetching - Use search parameters to narrow results
  3. Batch ID Fetches - Group multiple IDs in one get_observations call
  4. Use Timeline for Context - When narrative matters, timeline shows the story
  5. Specific Queries - More specific = better relevance
  6. Small Limits Initially - Start with 3-5 results, expand if needed
  7. Review Before Deep Dive - Check index before fetching full details

Technical Details

Architecture: MCP tools are a thin wrapper over the Worker HTTP API (localhost:37777). The MCP server translates tool calls into HTTP requests to the worker service, which handles all business logic, database queries, and Chroma vector search.
MCP Server: Located at ~/.claude/plugins/marketplaces/thedotmack/plugin/scripts/mcp-server.cjs
Worker Service: Express API on port 37777, managed by Bun
Database: SQLite FTS5 full-text search on ~/.claude-mem/claude-mem.db
Vector Search: Chroma embeddings for semantic search (underlying implementation)
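To make the wrapper relationship concrete, an MCP search call corresponds to an HTTP request against the worker, as in the Troubleshooting example above. A sketch (the limit query-string mapping is an assumption; verify against your worker version):
curl "http://localhost:37777/api/search?query=authentication&limit=5"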
