Search Architecture
Claude-mem uses an MCP-based search architecture that provides intelligent memory retrieval through 4 streamlined tools following a 3-layer workflow pattern.
Overview
Architecture: MCP Tools → MCP Protocol → HTTP API → Worker Service
Key Components:
- MCP Tools (4 tools) - search, timeline, get_observations, __IMPORTANT
- MCP Server (plugin/scripts/mcp-server.cjs) - Thin wrapper over HTTP API
- HTTP API Endpoints - Fast search operations on Worker Service (port 37777)
- Worker Service - Express.js server with FTS5 full-text search
- SQLite Database - Persistent storage with FTS5 virtual tables
- Chroma Vector DB - Semantic search with hybrid retrieval
How It Works
1. User Query
Claude has access to 4 MCP tools. When searching memory, Claude follows the 3-layer workflow.
2. MCP Protocol
The MCP server receives the tool call via JSON-RPC over stdio.
3. HTTP API Call
The MCP server translates the tool call into an HTTP request.
4. Worker Processing
The worker service executes the FTS5 query.
5. Results Returned
The worker returns structured data → MCP server → Claude.
6. Claude Processes Results
Claude reviews the index, decides which observations are relevant, and can:
- Use timeline to get context
- Use get_observations to fetch full details for selected IDs
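Steps 2 and 3 above can be sketched as a small translation function. This is a minimal illustration, not the code in mcp-server.cjs: the ToolCallRequest shape follows the JSON-RPC 2.0 envelope that MCP uses for tools/call, while toWorkerUrl, GET_ENDPOINTS, and the example arguments are hypothetical names chosen for this sketch.

```typescript
// Shape of an MCP "tools/call" request as it arrives over stdio (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, string | number> };
}

// Host and port as used elsewhere on this page.
const WORKER_BASE_URL = "http://localhost:37777";

// Hypothetical lookup table: tool name -> GET endpoint documented on this page.
const GET_ENDPOINTS: Record<string, string> = {
  search: "/api/search",
  timeline: "/api/timeline",
};

// Translate a tool call into the URL the MCP server would request.
function toWorkerUrl(request: ToolCallRequest): string {
  const path = GET_ENDPOINTS[request.params.name];
  if (path === undefined) {
    throw new Error(`No GET endpoint for tool: ${request.params.name}`);
  }
  const query = new URLSearchParams();
  for (const [key, value] of Object.entries(request.params.arguments)) {
    query.set(key, String(value));
  }
  return `${WORKER_BASE_URL}${path}?${query.toString()}`;
}

// Example: a search call as Claude might issue it.
const exampleCall: ToolCallRequest = {
  jsonrpc: "2.0",
  id: 1,
  method: "tools/call",
  params: { name: "search", arguments: { query: "auth bug", limit: 5 } },
};

// "http://localhost:37777/api/search?query=auth+bug&limit=5"
const exampleUrl = toWorkerUrl(exampleCall);
```

Keeping the mapping in a plain table reflects the "thin wrapper" idea: the MCP layer only rewrites the call, and all search behavior stays in the worker.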
The 4 MCP Tools
__IMPORTANT - Workflow Documentation
Always visible to Claude. Explains the 3-layer workflow pattern.
Description:
search - Search Memory Index
Tool Definition:
GET /api/search
Parameters:
- query - Full-text search query
- limit - Maximum results (default: 20)
- type - Filter by observation type
- project - Filter by project name
- dateStart, dateEnd - Date range filters
- offset - Pagination offset
- orderBy - Sort order
timeline - Get Chronological Context
Tool Definition:
GET /api/timeline
Parameters:
- anchor - Observation ID to center timeline around (optional if query provided)
- query - Search query to find anchor automatically (optional if anchor provided)
- depth_before - Number of observations before anchor (default: 3)
- depth_after - Number of observations after anchor (default: 3)
- project - Filter by project name
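The parameter rules above (either anchor or query must be present, and both depths default to 3) can be sketched as follows. TimelineArgs and buildTimelineQuery are hypothetical names, and treating the observation ID as a number is an assumption. The sketch sends the defaults explicitly, whereas the real worker may apply them server-side.

```typescript
// Hypothetical argument type mirroring the parameter list above.
interface TimelineArgs {
  anchor?: number; // assumed numeric observation ID
  query?: string;
  depth_before?: number;
  depth_after?: number;
  project?: string;
}

// Build the path and query string for GET /api/timeline.
function buildTimelineQuery(args: TimelineArgs): string {
  // At least one of anchor or query is needed to locate the center point.
  if (args.anchor === undefined && args.query === undefined) {
    throw new Error("timeline requires either anchor or query");
  }
  const params = new URLSearchParams();
  if (args.anchor !== undefined) params.set("anchor", String(args.anchor));
  if (args.query !== undefined) params.set("query", args.query);
  // Documented defaults: 3 observations on each side of the anchor.
  params.set("depth_before", String(args.depth_before ?? 3));
  params.set("depth_after", String(args.depth_after ?? 3));
  if (args.project !== undefined) params.set("project", args.project);
  return `/api/timeline?${params.toString()}`;
}

// "/api/timeline?anchor=123&depth_before=5&depth_after=3"
const timelinePath = buildTimelineQuery({ anchor: 123, depth_before: 5 });
```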
get_observations - Fetch Full Details
Tool Definition:
POST /api/observations/batch
Body:
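A minimal sketch of a batch request, assuming the body carries the selected IDs in an ids array. That field name is an assumption and is not confirmed by this page; BatchRequest and buildBatchRequest are likewise hypothetical.

```typescript
// Hypothetical description of the HTTP request to send.
interface BatchRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build one POST request that fetches several observations at once.
function buildBatchRequest(ids: number[]): BatchRequest {
  // Drop duplicate IDs so each observation is requested once (order is preserved).
  const uniqueIds = Array.from(new Set(ids));
  if (uniqueIds.length === 0) {
    throw new Error("get_observations needs at least one ID");
  }
  return {
    url: "http://localhost:37777/api/observations/batch",
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ ids: uniqueIds }), // "ids" is an assumed field name
  };
}

// body is '{"ids":[11,42]}'
const batchRequest = buildBatchRequest([11, 42, 11]);
```

Sending all selected IDs in one call keeps the third layer to a single round trip, which is the batching behavior this page describes for get_observations.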
MCP Server Implementation
Location: /Users/YOUR_USERNAME/.claude/plugins/marketplaces/thedotmack/plugin/scripts/mcp-server.cjs
Role: Thin wrapper that translates MCP protocol to HTTP API calls
Key Characteristics:
- ~312 lines of code (reduced from ~2,718 lines in old implementation)
- No business logic - just protocol translation
- Single source of truth: Worker HTTP API
- Simple schemas with additionalProperties: true
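To illustrate what a simple schema with additionalProperties: true means in practice, here is a hedged sketch of a tool definition. The name, description, and inputSchema fields follow the general MCP tool format; the description text, the two declared properties, and the rejectedKeys helper are hypothetical and are not taken from mcp-server.cjs.

```typescript
// General shape of an MCP tool definition (JSON Schema for the input).
interface ToolDefinition {
  name: string;
  description: string;
  inputSchema: {
    type: "object";
    properties: Record<string, { type: string; description: string }>;
    additionalProperties: boolean;
  };
}

// Hypothetical definition of the search tool with only two declared properties.
const searchTool: ToolDefinition = {
  name: "search",
  description: "Search the memory index (step 1 of the 3-layer workflow)",
  inputSchema: {
    type: "object",
    properties: {
      query: { type: "string", description: "Full-text search query" },
      limit: { type: "number", description: "Maximum results (default: 20)" },
    },
    additionalProperties: true,
  },
};

// List the argument keys that a schema check would reject.
// With additionalProperties: true nothing is rejected, so undeclared
// parameters such as project pass straight through to the HTTP API.
function rejectedKeys(tool: ToolDefinition, args: Record<string, unknown>): string[] {
  if (tool.inputSchema.additionalProperties) return [];
  return Object.keys(args).filter((key) => !(key in tool.inputSchema.properties));
}

// [] because the schema is open
const rejected = rejectedKeys(searchTool, { query: "auth", project: "demo" });
```

The trade-off is that validation moves to the worker: the schema stays small, but the HTTP API has to handle unknown or malformed parameters itself.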
Worker HTTP API
Location: src/services/worker-service.ts
Port: 37777
Search Endpoints:
- Uses SessionSearch service for FTS5 queries
- Uses SessionStore for structured queries
- Hybrid search with ChromaDB for semantic similarity
The 3-Layer Workflow Pattern
Design Philosophy
The 3-layer workflow embodies progressive disclosure - a core principle of claude-mem’s architecture.
Layer 1: Index (Search)
- What: Compact table with IDs, titles, dates, types
- Cost: ~50-100 tokens per result
- Purpose: Survey what exists before committing tokens
- Decision Point: “Which observations are relevant?”
Layer 2: Context (Timeline)
- What: Chronological view of observations around a point
- Cost: Variable based on depth
- Purpose: Understand narrative arc, see what led to/from a point
- Decision Point: “Do I need full details?”
Layer 3: Details (Get Observations)
- What: Complete observation data (narrative, facts, files, concepts)
- Cost: ~500-1,000 tokens per observation
- Purpose: Deep dive on validated, relevant observations
- Decision Point: “Apply knowledge to current task”
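The per-layer costs above can be compared with a short calculation. The 20-result and 3-observation counts are illustrative assumptions, not measurements, and the variable timeline cost is left out.

```typescript
// A token estimate expressed as a low/high range.
interface TokenRange {
  min: number;
  max: number;
}

// Per-item costs quoted in the layer descriptions above.
const INDEX_COST: TokenRange = { min: 50, max: 100 }; // per search result
const DETAIL_COST: TokenRange = { min: 500, max: 1000 }; // per full observation

// Multiply a per-item cost by an item count.
function scale(cost: TokenRange, count: number): TokenRange {
  return { min: cost.min * count, max: cost.max * count };
}

// Add two ranges together.
function add(a: TokenRange, b: TokenRange): TokenRange {
  return { min: a.min + b.min, max: a.max + b.max };
}

// Fetching full details for all 20 candidates up front: 10,000-20,000 tokens.
const fetchEverything = scale(DETAIL_COST, 20);

// Index of 20 results, then full details for the 3 that look relevant: 2,500-5,000 tokens.
const progressive = add(scale(INDEX_COST, 20), scale(DETAIL_COST, 3));
```

The size of the saving depends on how selective the third step is: the fewer observations fetched in full relative to the index size, the larger the gap.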
Token Efficiency
Traditional RAG Approach:
Architecture Evolution
Before: Complex MCP Implementation
Approach: 9 MCP tools with detailed parameter schemas
Token Cost: ~2,500 tokens in tool definitions per session
Tools:
- search_observations - Full-text search
- find_by_type - Filter by type
- find_by_file - Filter by file
- find_by_concept - Filter by concept
- get_recent_context - Recent sessions
- get_observation - Fetch single observation
- get_session - Fetch session
- get_prompt - Fetch prompt
- help - API documentation
Problems:
- Overlapping operations (search_observations vs find_by_type)
- Complex parameter schemas
- No built-in workflow guidance
- High token cost at session start
After: Streamlined MCP Implementation
Approach: 4 MCP tools following 3-layer workflow
Token Cost: ~312 lines of code, simplified tool definitions
Tools:
- __IMPORTANT - Workflow guidance (always visible)
- search - Step 1 (index)
- timeline - Step 2 (context)
- get_observations - Step 3 (details)
Benefits:
- Progressive disclosure built into tool design
- No overlapping operations
- Simple schemas (additionalProperties: true)
- Clear workflow pattern
- ~10x token savings
Key Insight
Before: Progressive disclosure was something Claude had to remember
After: Progressive disclosure is enforced by tool design itself
The 3-layer workflow pattern makes it structurally difficult to waste tokens:
- Can’t fetch details without first getting IDs from search
- Can’t search without seeing workflow reminder (__IMPORTANT)
- Timeline provides middle ground between index and full details
Configuration
Claude Desktop
Add to claude_desktop_config.json:
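A minimal sketch of what such an entry might look like, written as a TypeScript object and serialized to JSON. The mcpServers, command, and args keys follow the general Claude Desktop MCP configuration format. The server key claude-mem and the use of node as the command are assumptions, and the path is the placeholder location given earlier on this page.

```typescript
// Hypothetical config entry; adjust the key name and path to your installation.
const desktopConfig = {
  mcpServers: {
    "claude-mem": {
      command: "node", // assumed runtime for the .cjs script
      args: [
        "/Users/YOUR_USERNAME/.claude/plugins/marketplaces/thedotmack/plugin/scripts/mcp-server.cjs",
      ],
    },
  },
};

// The JSON text to merge into claude_desktop_config.json.
const configJson = JSON.stringify(desktopConfig, null, 2);
```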
Claude Code
MCP server is automatically configured via plugin installation. No manual setup required.
Both clients use the same MCP tools - the architecture works identically for Claude Desktop and Claude Code.
Security
FTS5 Injection Prevention
All search queries are escaped before FTS5 processing.
MCP Protocol Security
- Stdio transport (no network exposure)
- Local-only HTTP API (localhost:37777)
- No authentication needed (local development only)
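The FTS5 escaping mentioned under FTS5 Injection Prevention could look like the sketch below. It is one common approach (quote every term and double any embedded quotes), not necessarily what the worker does; escapeFts5Query is a hypothetical name.

```typescript
// Turn raw user input into an FTS5 query made only of quoted string literals.
// Inside an FTS5 string, a double quote is escaped by doubling it, and quoted
// text is treated as literal terms, so operators such as OR, NOT, or * lose
// their special meaning.
function escapeFts5Query(input: string): string {
  return input
    .split(/\s+/) // split on whitespace
    .filter((term) => term.length > 0) // drop empty pieces from leading/trailing spaces
    .map((term) => `"${term.replace(/"/g, '""')}"`) // quote each term, doubling embedded quotes
    .join(" "); // adjacent terms are implicitly ANDed by FTS5
}

// Produces: "auth*" "OR" """drop"
const safeQuery = escapeFts5Query('auth* OR "drop');
```

The escaped string should still be passed to SQLite as a bound parameter rather than concatenated into the SQL text; quoting only protects the FTS5 query syntax.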
Performance
- FTS5 Full-Text Search: Sub-10ms for typical queries
- MCP Overhead: Minimal - simple protocol translation
- Caching: HTTP layer allows response caching (future enhancement)
- Pagination: Efficient with offset/limit
- Batching: get_observations accepts multiple IDs in a single call
Benefits Over Alternative Approaches
vs. Traditional RAG
Traditional RAG:
- Fetches everything upfront
- High token cost
- Low relevance ratio
3-Layer Workflow:
- Fetches only what’s needed
- ~10x token savings
- 100% relevance (Claude chooses what to fetch)
vs. Previous MCP Implementation (v5.x)
Previous (9 tools):
- Complex schemas
- Overlapping operations
- No workflow guidance
- ~2,500 tokens in definitions
Current (4 tools):
- Simple schemas
- Clear workflow
- Built-in guidance
- ~312 lines of code
vs. Skill-Based Approach (Previously)
Skill approach:
- Required separate skill files
- HTTP API called directly via curl
- Progressive disclosure through skill loading
MCP approach:
- Native MCP protocol (better Claude integration)
- Cleaner architecture (protocol translation layer)
- Works with both Claude Desktop and Claude Code
- Simpler to maintain (no skill files)
Troubleshooting
MCP Server Not Connected
Symptoms: Tools not appearing in Claude
Solution:
- Check MCP server path in configuration
- Verify worker service is running: curl http://localhost:37777/api/health
- Restart Claude Desktop/Code
Worker Service Not Running
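Symptoms: MCP tools fail with connection errors
Solution: Start or restart the worker service, then confirm it responds at http://localhost:37777/api/health
Empty Search Results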
Symptoms: search() returns no results
Troubleshooting:
- Test API directly: curl "http://localhost:37777/api/search?query=test"
- Check database: ls ~/.claude-mem/claude-mem.db
- Verify observations exist: curl "http://localhost:37777/api/health"
Next Steps
- Memory Search Usage - User guide with examples
- Progressive Disclosure - Philosophy behind 3-layer workflow
- Worker Service Architecture - HTTP API details
- Database Schema - FTS5 tables and indexes

