Current State of Endless Mode
Core Concept
Endless Mode is a biomimetic memory architecture designed to solve Claude’s context window exhaustion problem. Instead of keeping full tool outputs in the context window (O(N²) complexity), it:
- Captures compressed observations after each tool use
- Replaces transcripts with low-token summaries
- Achieves O(N) linear complexity
- Maintains two-tier memory: working memory (compressed) + archive memory (full transcript on disk, maintained by Claude Code's default functionality)
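To make the cost comparison concrete, the sketch below totals the tokens re-read across a session under a deliberately simplified model in which every turn re-sends everything kept in context so far. The function name `cumulativeTokens` and the token sizes `fullOutputTokens` and `summaryTokens` are hypothetical illustrations, not measurements from Endless Mode.

```typescript
// Hypothetical model: on each turn, the whole context accumulated so far
// is sent again, so turn k pays for k entries of `tokensPerEntry` tokens.
function cumulativeTokens(toolUses: number, tokensPerEntry: number): number {
  let total = 0;
  for (let turn = 1; turn <= toolUses; turn++) {
    total += turn * tokensPerEntry; // context holds `turn` entries on this turn
  }
  return total;
}

// Assumed sizes, chosen only for illustration.
const fullOutputTokens = 2000; // a raw tool output kept verbatim
const summaryTokens = 200; // a compressed observation

const withFullOutputs = cumulativeTokens(50, fullOutputTokens);
const withSummaries = cumulativeTokens(50, summaryTokens);

console.log({ withFullOutputs, withSummaries });
```

With these assumed sizes, a 50-tool-use session totals 2,550,000 tokens with full outputs and 255,000 with summaries. The model ignores prompt caching, system prompts, and user messages, so it is only a rough way to reason about where savings could come from, not evidence of actual gains.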
Implementation Status
Status: FUNCTIONAL BUT EXPERIMENTAL
Current Branch: beta/endless-mode (ahead of main)
Recent Activity:
- Merged main branch changes
- Resolved merge conflicts in save-hook, SessionStore, SessionRoutes
- Updated documentation to remove misleading token reduction claims
- Added important caveats about beta status
Key Architecture Components
- Pre-Tool-Use Hook - Tracks tool execution start, sends tool_use_id to worker
- Save Hook (PostToolUse) - CRITICAL: Blocks until observation is generated (110s timeout), injects compressed observation back into context
- SessionManager.waitForNextObservation() - Event-driven wait mechanism (no polling)
- SDKAgent - Generates observations via Agent SDK, emits completion events
- Database - Added tool_use_id column for observation correlation
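The event-driven wait with a timeout fallback described in the components above can be sketched as follows. This is a minimal illustration built on Node's `EventEmitter`, not the project's actual `SessionManager` code: the `ObservationBus` class, the `Observation` shape, and the `obs:` event prefix are all assumptions introduced for the example.

```typescript
import { EventEmitter } from "node:events";

// Hypothetical shape of a compressed observation.
interface Observation {
  toolUseId: string;
  summary: string;
}

class ObservationBus {
  private readonly emitter = new EventEmitter();

  // Called by the (hypothetical) agent once an observation has been generated.
  publish(obs: Observation): void {
    this.emitter.emit(`obs:${obs.toolUseId}`, obs);
  }

  // Resolves with the observation for `toolUseId`, or with null if
  // `timeoutMs` elapses first. No polling: the promise settles from
  // either the event listener or the timer, whichever fires first.
  waitFor(toolUseId: string, timeoutMs: number): Promise<Observation | null> {
    const eventName = `obs:${toolUseId}`;
    return new Promise((resolve) => {
      const onObservation = (obs: Observation): void => {
        clearTimeout(timer); // observation arrived in time
        resolve(obs);
      };
      const timer = setTimeout(() => {
        this.emitter.off(eventName, onObservation); // avoid a leaked listener
        resolve(null); // caller falls back to the uncompressed output
      }, timeoutMs);
      this.emitter.once(eventName, onObservation);
    });
  }
}
```

A hook built this way would call `waitFor(toolUseId, 110_000)` after the tool finishes and inject the returned summary, keeping the original output when the result is `null`. One limitation of this sketch is that an observation published before `waitFor` is called is missed, so a real implementation would need to register the wait first or buffer results.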
Configuration
Flow
Known Limitations
From the documentation:
- ⚠️ Slower than standard mode - Blocking adds latency
- ⚠️ Still in development - May have bugs
- ⚠️ Not battle-tested - New architecture
- ⚠️ Theoretical projections - Efficiency gains not yet validated in production
What’s Working
- ✅ Synchronous observation injection
- ✅ Event-driven wait mechanism
- ✅ Token reduction via input clearing
- ✅ Database schema with tool_use_id
- ✅ Web UI for version switching
- ✅ Graceful timeout fallbacks
What’s Not Ready
- ❌ Production validation of token savings
- ❌ Comprehensive test coverage
- ❌ Stable channel release
- ❌ Performance benchmarks
- ❌ Long-running session data

