> ## Documentation Index
> Fetch the complete documentation index at: https://docs.claude-mem.ai/llms.txt
> Use this file to discover all available pages before exploring further.

# Search Architecture

> MCP tools with 3-layer workflow for token-efficient memory retrieval

# Search Architecture

Claude-mem uses an **MCP-based search architecture** that provides intelligent memory retrieval through 4 streamlined tools following a 3-layer workflow pattern.

## Overview

**Architecture**: MCP Tools → MCP Protocol → HTTP API → Worker Service

**Key Components**:

1. **MCP Tools** (4 tools) - `search`, `timeline`, `get_observations`, `__IMPORTANT`
2. **MCP Server** (`plugin/scripts/mcp-server.cjs`) - Thin wrapper over HTTP API
3. **HTTP API Endpoints** - Fast search operations on Worker Service (port 37777)
4. **Worker Service** - Express.js server with FTS5 full-text search
5. **SQLite Database** - Persistent storage with FTS5 virtual tables
6. **Chroma Vector DB** - Semantic search with hybrid retrieval

**Token Efficiency**: \~10x savings through 3-layer workflow pattern

## How It Works

### 1. User Query

Claude has access to 4 MCP tools. When searching memory, Claude follows the 3-layer workflow:

```
Step 1: search(query="authentication bug", type="bugfix", limit=10)
Step 2: timeline(anchor=<observation_id>, depth_before=3, depth_after=3)
Step 3: get_observations(ids=[123, 456, 789])
```

### 2. MCP Protocol

MCP server receives tool call via JSON-RPC over stdio:

```json theme={null}
{
  "method": "tools/call",
  "params": {
    "name": "search",
    "arguments": {
      "query": "authentication bug",
      "type": "bugfix",
      "limit": 10
    }
  }
}
```

### 3. HTTP API Call

MCP server translates to HTTP request:

```typescript theme={null}
const url = `http://localhost:37777/api/search?query=authentication%20bug&type=bugfix&limit=10`;
const response = await fetch(url);
```

### 4. Worker Processing

Worker service executes FTS5 query:

```sql theme={null}
SELECT * FROM observations_fts
WHERE observations_fts MATCH ?
AND type = 'bugfix'
ORDER BY rank
LIMIT 10
```

### 5. Results Returned

Worker returns structured data → MCP server → Claude:

```json theme={null}
{
  "content": [{
    "type": "text",
    "text": "| ID | Time | Title | Type |\n|---|---|---|---|\n| #123 | 2:15 PM | Fixed auth token expiry | bugfix |"
  }]
}
```

### 6. Claude Processes Results

Claude reviews the index, decides which observations are relevant, and can:

* Use `timeline` to get context
* Use `get_observations` to fetch full details for selected IDs

## The 4 MCP Tools

### `__IMPORTANT` - Workflow Documentation

Always visible to Claude. Explains the 3-layer workflow pattern.

**Description:**

```
3-LAYER WORKFLOW (ALWAYS FOLLOW):
1. search(query) → Get index with IDs (~50-100 tokens/result)
2. timeline(anchor=ID) → Get context around interesting results
3. get_observations([IDs]) → Fetch full details ONLY for filtered IDs
NEVER fetch full details without filtering first. 10x token savings.
```

**Purpose:** Ensures Claude follows token-efficient pattern

### `search` - Search Memory Index

**Tool Definition:**

```typescript theme={null}
{
  name: 'search',
  description: 'Step 1: Search memory. Returns index with IDs. Params: query, limit, project, type, obs_type, dateStart, dateEnd, offset, orderBy',
  inputSchema: {
    type: 'object',
    properties: {},
    additionalProperties: true  // Accepts any parameters
  }
}
```

**HTTP Endpoint:** `GET /api/search`

**Parameters:**

* `query` - Full-text search query
* `limit` - Maximum results (default: 20)
* `type` - Filter by observation type
* `project` - Filter by project name
* `dateStart`, `dateEnd` - Date range filters
* `offset` - Pagination offset
* `orderBy` - Sort order

**Returns:** Compact index with IDs, titles, dates, types (\~50-100 tokens per result)

### `timeline` - Get Chronological Context

**Tool Definition:**

```typescript theme={null}
{
  name: 'timeline',
  description: 'Step 2: Get context around results. Params: anchor (observation ID) OR query (finds anchor automatically), depth_before, depth_after, project',
  inputSchema: {
    type: 'object',
    properties: {},
    additionalProperties: true
  }
}
```

**HTTP Endpoint:** `GET /api/timeline`

**Parameters:**

* `anchor` - Observation ID to center timeline around (optional if query provided)
* `query` - Search query to find anchor automatically (optional if anchor provided)
* `depth_before` - Number of observations before anchor (default: 3)
* `depth_after` - Number of observations after anchor (default: 3)
* `project` - Filter by project name

**Returns:** Chronological view showing what happened before/during/after

### `get_observations` - Fetch Full Details

**Tool Definition:**

```typescript theme={null}
{
  name: 'get_observations',
  description: 'Step 3: Fetch full details for filtered IDs. Params: ids (array of observation IDs, required), orderBy, limit, project',
  inputSchema: {
    type: 'object',
    properties: {
      ids: {
        type: 'array',
        items: { type: 'number' },
        description: 'Array of observation IDs to fetch (required)'
      }
    },
    required: ['ids'],
    additionalProperties: true
  }
}
```

**HTTP Endpoint:** `POST /api/observations/batch`

**Body:**

```json theme={null}
{
  "ids": [123, 456, 789],
  "orderBy": "date_desc",
  "project": "my-app"
}
```

**Returns:** Complete observation details (\~500-1,000 tokens per observation)

## MCP Server Implementation

**Location:** `/Users/YOUR_USERNAME/.claude/plugins/marketplaces/thedotmack/plugin/scripts/mcp-server.cjs`

**Role:** Thin wrapper that translates MCP protocol to HTTP API calls

**Key Characteristics:**

* \~312 lines of code (reduced from \~2,718 lines in old implementation)
* No business logic - just protocol translation
* Single source of truth: Worker HTTP API
* Simple schemas with `additionalProperties: true`

**Handler Example:**

```typescript theme={null}
{
  name: 'search',
  handler: async (args: any) => {
    const endpoint = '/api/search';
    const searchParams = new URLSearchParams();

    for (const [key, value] of Object.entries(args)) {
      searchParams.append(key, String(value));
    }

    const url = `http://localhost:37777${endpoint}?${searchParams}`;
    const response = await fetch(url);
    return await response.json();
  }
}
```

## Worker HTTP API

**Location:** `src/services/worker-service.ts`

**Port:** 37777

**Search Endpoints:**

```typescript theme={null}
GET  /api/search           # Main search (used by MCP search tool)
GET  /api/timeline         # Timeline context (used by MCP timeline tool)
POST /api/observations/batch  # Fetch by IDs (used by MCP get_observations tool)
GET  /api/health           # Health check
```

**Database Access:**

* Uses `SessionSearch` service for FTS5 queries
* Uses `SessionStore` for structured queries
* Hybrid search with ChromaDB for semantic similarity

**FTS5 Full-Text Search:**

```typescript theme={null}
// search tool → HTTP GET → FTS5 query
SELECT * FROM observations_fts
WHERE observations_fts MATCH ?
AND type = ?
AND date >= ? AND date <= ?
ORDER BY rank
LIMIT ? OFFSET ?
```

## The 3-Layer Workflow Pattern

### Design Philosophy

The 3-layer workflow embodies **progressive disclosure** - a core principle of claude-mem's architecture.

**Layer 1: Index (Search)**

* **What:** Compact table with IDs, titles, dates, types
* **Cost:** \~50-100 tokens per result
* **Purpose:** Survey what exists before committing tokens
* **Decision Point:** "Which observations are relevant?"

**Layer 2: Context (Timeline)**

* **What:** Chronological view of observations around a point
* **Cost:** Variable based on depth
* **Purpose:** Understand narrative arc, see what led to/from a point
* **Decision Point:** "Do I need full details?"

**Layer 3: Details (Get Observations)**

* **What:** Complete observation data (narrative, facts, files, concepts)
* **Cost:** \~500-1,000 tokens per observation
* **Purpose:** Deep dive on validated, relevant observations
* **Decision Point:** "Apply knowledge to current task"

### Token Efficiency

**Traditional RAG Approach:**

```
Fetch 20 observations upfront: 10,000-20,000 tokens
Relevance: ~10% (only 2 observations actually useful)
Waste: 18,000 tokens on irrelevant context
```

**3-Layer Workflow:**

```
Step 1: search (20 results)        ~1,000-2,000 tokens
Step 2: Review index, filter to 3 relevant IDs
Step 3: get_observations (3 IDs)   ~1,500-3,000 tokens
Total: 2,500-5,000 tokens (50-75% savings)
```

**10x Savings:** By filtering at index level before fetching full details

## Architecture Evolution

### Before: Complex MCP Implementation

**Approach:** 9 MCP tools with detailed parameter schemas

**Token Cost:** \~2,500 tokens in tool definitions per session

* `search_observations` - Full-text search
* `find_by_type` - Filter by type
* `find_by_file` - Filter by file
* `find_by_concept` - Filter by concept
* `get_recent_context` - Recent sessions
* `get_observation` - Fetch single observation
* `get_session` - Fetch session
* `get_prompt` - Fetch prompt
* `help` - API documentation

**Problems:**

* Overlapping operations (search\_observations vs find\_by\_type)
* Complex parameter schemas
* No built-in workflow guidance
* High token cost at session start

**Code Size:** \~2,718 lines in mcp-server.ts

### After: Streamlined MCP Implementation

**Approach:** 4 MCP tools following 3-layer workflow

**Token Cost:** \~312 lines of code, simplified tool definitions

**Tools:**

1. `__IMPORTANT` - Workflow guidance (always visible)
2. `search` - Step 1 (index)
3. `timeline` - Step 2 (context)
4. `get_observations` - Step 3 (details)

**Benefits:**

* Progressive disclosure built into tool design
* No overlapping operations
* Simple schemas (`additionalProperties: true`)
* Clear workflow pattern
* \~10x token savings

**Code Size:** \~312 lines in mcp-server.ts (88% reduction)

### Key Insight

**Before:** Progressive disclosure was something Claude had to remember

**After:** Progressive disclosure is enforced by tool design itself

The 3-layer workflow pattern makes it structurally difficult to waste tokens:

* Can't fetch details without first getting IDs from search
* Can't search without seeing workflow reminder (`__IMPORTANT`)
* Timeline provides middle ground between index and full details

## Configuration

### Claude Desktop

Add to `claude_desktop_config.json`:

```json theme={null}
{
  "mcpServers": {
    "mcp-search": {
      "command": "node",
      "args": [
        "/Users/YOUR_USERNAME/.claude/plugins/marketplaces/thedotmack/plugin/scripts/mcp-server.cjs"
      ]
    }
  }
}
```

### Claude Code

MCP server is automatically configured via plugin installation. No manual setup required.

**Both clients use the same MCP tools** - the architecture works identically for Claude Desktop and Claude Code.

## Security

### FTS5 Injection Prevention

All search queries are escaped before FTS5 processing:

```typescript theme={null}
function escapeFTS5Query(query: string): string {
  return query.replace(/"/g, '""');
}
```

**Testing:** 332 injection attack tests covering special characters, SQL keywords, quote escaping, and boolean operators.

### MCP Protocol Security

* Stdio transport (no network exposure)
* Local-only HTTP API (localhost:37777)
* No authentication needed (local development only)

## Performance

**FTS5 Full-Text Search:** Sub-10ms for typical queries

**MCP Overhead:** Minimal - simple protocol translation

**Caching:** HTTP layer allows response caching (future enhancement)

**Pagination:** Efficient with offset/limit

**Batching:** `get_observations` accepts multiple IDs in single call

## Benefits Over Alternative Approaches

### vs. Traditional RAG

**Traditional RAG:**

* Fetches everything upfront
* High token cost
* Low relevance ratio

**3-Layer MCP:**

* Fetches only what's needed
* \~10x token savings
* 100% relevance (Claude chooses what to fetch)

### vs. Previous MCP Implementation (v5.x)

**Previous (9 tools):**

* Complex schemas
* Overlapping operations
* No workflow guidance
* \~2,500 tokens in definitions

**Current (4 tools):**

* Simple schemas
* Clear workflow
* Built-in guidance
* \~312 lines of code

### vs. Skill-Based Approach (Previously)

**Skill approach:**

* Required separate skill files
* HTTP API called directly via curl
* Progressive disclosure through skill loading

**MCP approach:**

* Native MCP protocol (better Claude integration)
* Cleaner architecture (protocol translation layer)
* Works with both Claude Desktop and Claude Code
* Simpler to maintain (no skill files)

**Migration:** Skill-based search was removed in favor of streamlined MCP architecture.

## Troubleshooting

### MCP Server Not Connected

**Symptoms:** Tools not appearing in Claude

**Solution:**

1. Check MCP server path in configuration
2. Verify worker service is running: `curl http://localhost:37777/api/health`
3. Restart Claude Desktop/Code

### Worker Service Not Running

**Symptoms:** MCP tools fail with connection errors

**Solution:**

```bash theme={null}
npm run worker:status       # Check status
npm run worker:restart      # Restart worker
npm run worker:logs         # View logs
```

### Empty Search Results

**Symptoms:** search() returns no results

**Troubleshooting:**

1. Test API directly: `curl "http://localhost:37777/api/search?query=test"`
2. Check database: `ls ~/.claude-mem/claude-mem.db`
3. Verify observations exist: `curl "http://localhost:37777/api/health"`

## Next Steps

* [Memory Search Usage](/usage/search-tools) - User guide with examples
* [Progressive Disclosure](/progressive-disclosure) - Philosophy behind 3-layer workflow
* [Worker Service Architecture](/architecture/worker-service) - HTTP API details
* [Database Schema](/architecture/database) - FTS5 tables and indexes