Gemini Provider
Claude-mem supports Google’s Gemini API as an alternative to the Claude Agent SDK for extracting observations from your sessions. This can significantly reduce costs since Gemini offers a generous free tier.
Free Tier Rate Limits: Without billing enabled, Gemini has strict rate limits (5-10 RPM). Enable billing on your Google Cloud project to unlock 1000-4000 RPM while still using the free quota.
Why Use Gemini?
- Cost savings: The free tier covers most individual usage patterns
- Same output format: Gemini extracts observations using the same XML format as Claude
- Seamless fallback: Automatically falls back to Claude if Gemini is unavailable
- Hot-swappable: Switch providers without restarting the worker
Getting a Free API Key
- Go to the Google AI Studio API Key page
- Sign in with your Google account
- Accept the Terms of Service and privacy policies
- Click the Create API key button
- Choose a Google Cloud project or create a new one
- Copy and securely store the generated API key
No billing required to get started, but we recommend enabling billing to unlock higher rate limits (1000-4000 RPM vs 5-10 RPM) while still using the free quota.
Configuration
Settings
| Setting | Values | Default | Description |
|---|---|---|---|
| CLAUDE_MEM_PROVIDER | claude, gemini | claude | AI provider for observation extraction |
| CLAUDE_MEM_GEMINI_API_KEY | string | — | Your Gemini API key |
| CLAUDE_MEM_GEMINI_MODEL | gemini-2.5-flash-lite, gemini-2.5-flash, gemini-3-flash | gemini-2.5-flash-lite | Gemini model to use |
| CLAUDE_MEM_GEMINI_BILLING_ENABLED | true, false | false | Skip rate limiting if billing is enabled on Google Cloud |
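Before editing settings by hand, it can help to check values against the table above. The following is a minimal sketch, not part of claude-mem: `validate_settings` and `ALLOWED_VALUES` are hypothetical names, and the check assumes values are stored as strings, as in the manual configuration example.

```python
# Allowed values copied from the settings table above.
# The API key is a free-form string, so it is not checked here.
ALLOWED_VALUES = {
    "CLAUDE_MEM_PROVIDER": ("claude", "gemini"),
    "CLAUDE_MEM_GEMINI_MODEL": (
        "gemini-2.5-flash-lite",
        "gemini-2.5-flash",
        "gemini-3-flash",
    ),
    "CLAUDE_MEM_GEMINI_BILLING_ENABLED": ("true", "false"),
}


def validate_settings(settings: dict) -> list:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    for key, allowed in ALLOWED_VALUES.items():
        # Settings that are absent fall back to their defaults, so only
        # values that are present are compared against the allowed list.
        if key in settings and settings[key] not in allowed:
            problems.append(
                f"{key}={settings[key]!r} is not one of {list(allowed)}"
            )
    return problems
```

For example, `validate_settings({"CLAUDE_MEM_PROVIDER": "openai"})` returns a one-item list naming the offending key.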
Using the Settings UI
- Open the viewer at http://localhost:37777
- Click the gear icon to open Settings
- Under AI Provider, select Gemini
- Enter your Gemini API key
- Optionally select a different model
Settings are applied immediately—no restart required.
Manual Configuration
Edit ~/.claude-mem/settings.json:
{
  "CLAUDE_MEM_PROVIDER": "gemini",
  "CLAUDE_MEM_GEMINI_API_KEY": "your-api-key-here",
  "CLAUDE_MEM_GEMINI_MODEL": "gemini-2.5-flash-lite",
  "CLAUDE_MEM_GEMINI_BILLING_ENABLED": "true"
}
Alternatively, set the API key via environment variable:
export GEMINI_API_KEY="your-api-key-here"
The settings file takes precedence over the environment variable.
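The precedence rule can be sketched as a small lookup. This is an illustration only, not claude-mem's actual code: the function names are hypothetical, and treating an empty string as "not set" is an assumption.

```python
import json
from pathlib import Path

SETTINGS_PATH = Path.home() / ".claude-mem" / "settings.json"


def load_settings(path: Path = SETTINGS_PATH) -> dict:
    """Read the settings file, treating a missing file as empty settings."""
    try:
        return json.loads(path.read_text(encoding="utf-8"))
    except FileNotFoundError:
        return {}


def resolve_gemini_api_key(settings: dict, env: dict):
    """Prefer the settings file; fall back to the environment variable.

    Assumption: an empty string counts as "not set" in both places.
    Returns None when no key is configured anywhere.
    """
    return (
        settings.get("CLAUDE_MEM_GEMINI_API_KEY")
        or env.get("GEMINI_API_KEY")
        or None
    )
```

Calling `resolve_gemini_api_key(load_settings(), os.environ)` (after `import os`) shows which key would win on your machine under these assumptions.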
Available Models
| Model | Free Tier RPM | Notes |
|---|---|---|
| gemini-2.5-flash-lite | 10 | Default, recommended for free tier (highest RPM) |
| gemini-2.5-flash | 5 | Higher capability, lower rate limit |
| gemini-3-flash | 5 | Latest model, lower rate limit |
Provider Switching
You can switch between Claude and Gemini at any time:
- No restart required: Changes take effect on the next observation
- Conversation history preserved: When switching mid-session, the new provider sees the full conversation context
- Seamless transition: Both providers use the same observation format
Switching via UI
- Open Settings in the viewer
- Change the AI Provider dropdown
- The next observation will use the new provider
Switching via Settings File
{
  "CLAUDE_MEM_PROVIDER": "gemini"
}
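If you script the switch, take care not to overwrite other keys in the file. A minimal sketch, assuming the settings file is a flat JSON object as in the examples on this page (`set_provider` is a hypothetical helper, not a claude-mem command):

```python
import json
from pathlib import Path


def set_provider(path: Path, provider: str) -> dict:
    """Set CLAUDE_MEM_PROVIDER in the settings file, keeping all other keys."""
    if provider not in ("claude", "gemini"):
        raise ValueError(f"unknown provider: {provider!r}")
    # Start from the existing settings so the API key, model, etc. survive.
    if path.exists():
        settings = json.loads(path.read_text(encoding="utf-8"))
    else:
        settings = {}
    settings["CLAUDE_MEM_PROVIDER"] = provider
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(settings, indent=2) + "\n", encoding="utf-8")
    return settings
```

For example, `set_provider(Path.home() / ".claude-mem" / "settings.json", "claude")` switches back to Claude while leaving a stored Gemini key in place for later.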
Fallback Behavior
If Gemini is selected but encounters errors, claude-mem automatically falls back to the Claude Agent SDK:
Triggers fallback:
- Rate limiting (HTTP 429)
- Server errors (HTTP 5xx)
- Network issues (connection refused, timeout)
Does not trigger fallback:
- Missing API key (logs warning, uses Claude from start)
- Invalid API key (fails with error)
When fallback occurs:
- A warning is logged
- Any in-progress messages are reset to pending
- Claude SDK takes over with the full conversation context
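The trigger rules above amount to a simple classification of failures. The sketch below is illustrative and not taken from claude-mem's source: `should_fall_back` is a hypothetical name, and treating every other status (including the 4xx returned for an invalid key) as non-triggering is an assumption based on the lists above.

```python
from typing import Optional


def should_fall_back(status: Optional[int] = None, network_error: bool = False) -> bool:
    """Return True for the failure kinds listed above as fallback triggers."""
    if network_error:
        # Connection refused, timeout, and similar transport failures.
        return True
    if status is None:
        return False
    if status == 429:
        # Rate limited by the Gemini API.
        return True
    # Server errors (HTTP 5xx). Anything else, such as a 4xx for an
    # invalid API key, surfaces as an error instead of falling back.
    return 500 <= status <= 599
```

Under these assumptions, `should_fall_back(429)` and `should_fall_back(503)` are true, while `should_fall_back(400)` is false.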
Troubleshooting
If the Gemini API key is not configured, either:
- Set CLAUDE_MEM_GEMINI_API_KEY in ~/.claude-mem/settings.json, or
- Set the GEMINI_API_KEY environment variable
Rate Limiting
Google has two rate limit tiers for free usage:
Without billing (API key only):
| Model | RPM | TPM |
|---|---|---|
| gemini-2.5-flash-lite | 10 | 250K |
| gemini-2.5-flash | 5 | 250K |
| gemini-3-flash | 5 | 250K |
Claude-mem enforces these limits automatically with built-in delays between requests. Processing may be slower but stays within limits.
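To get a feel for how much slower processing can be, the RPM figures translate into a minimum spacing between requests of 60 / RPM seconds. This is a back-of-the-envelope sketch; the delays claude-mem actually applies may differ (for example, by adding a safety margin), and `min_delay_seconds` is a hypothetical name.

```python
# Requests per minute without billing, from the table above.
FREE_TIER_RPM = {
    "gemini-2.5-flash-lite": 10,
    "gemini-2.5-flash": 5,
    "gemini-3-flash": 5,
}


def min_delay_seconds(model: str, billing_enabled: bool = False) -> float:
    """Spacing between requests that keeps a steady stream under the RPM limit."""
    if billing_enabled:
        # CLAUDE_MEM_GEMINI_BILLING_ENABLED=true skips rate limiting entirely.
        return 0.0
    return 60.0 / FREE_TIER_RPM[model]
```

With these numbers, the default model allows one request every 6 seconds, and the two 5 RPM models allow one every 12 seconds.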
With billing enabled (still free tier):
| Model | RPM | TPM |
|---|---|---|
| gemini-2.5-flash-lite | 4,000 | 4M |
| gemini-2.5-flash | 1,000 | 1M |
| gemini-3-flash | 1,000 | 1M |
Recommended: Enable billing on your Google Cloud project to unlock much higher rate limits. You won’t be charged unless you exceed the free quota. With CLAUDE_MEM_GEMINI_BILLING_ENABLED set to true, claude-mem skips its built-in delays and processes observations without waiting between requests.
If you hit rate limits:
- Claude-mem automatically falls back to the Claude SDK
- You can also switch back to Claude as your primary provider
Observation Quality
If observations seem lower quality with Gemini:
- Note that Claude typically produces slightly higher quality observations
- Consider using Gemini for cost savings and Claude for important projects
Next Steps