
Gemini Provider

Claude-mem supports Google’s Gemini API as an alternative to the Claude Agent SDK for extracting observations from your sessions. This can significantly reduce costs since Gemini offers a generous free tier.
Free Tier Rate Limits: Without billing enabled, Gemini has strict rate limits of 5-10 requests per minute (RPM), depending on the model. Enable billing on your Google Cloud project to unlock 1000-4000 RPM while still using the free quota.

Why Use Gemini?

  • Cost savings: The free tier covers most individual usage patterns
  • Same output format: Gemini extracts observations using the same XML format as Claude
  • Seamless fallback: Automatically falls back to Claude if Gemini is unavailable
  • Hot-swappable: Switch providers without restarting the worker

Getting a Free API Key

  1. Go to the Google AI Studio API Key page
  2. Sign in with your Google account
  3. Accept the Terms of Service and privacy policies
  4. Click the Create API key button
  5. Choose a Google Cloud project or create a new one
  6. Copy and securely store the generated API key
No billing required to get started, but we recommend enabling billing to unlock higher rate limits (1000-4000 RPM vs 5-10 RPM) while still using the free quota.

Configuration

Settings

| Setting | Values | Default | Description |
|---|---|---|---|
| CLAUDE_MEM_PROVIDER | claude, gemini | claude | AI provider for observation extraction |
| CLAUDE_MEM_GEMINI_API_KEY | string | | Your Gemini API key |
| CLAUDE_MEM_GEMINI_MODEL | gemini-2.5-flash-lite, gemini-2.5-flash, gemini-3-flash | gemini-2.5-flash-lite | Gemini model to use |
| CLAUDE_MEM_GEMINI_BILLING_ENABLED | true, false | false | Skip rate limiting if billing is enabled on Google Cloud |
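
To catch typos before they reach the worker, the documented values can be checked with a few lines of Python. This is a minimal sketch, not part of claude-mem: `check_settings` and `ALLOWED` are hypothetical names, and the only rules it knows are the value lists from the table above.

```python
# Allowed values copied from the settings table; nothing here is read
# from claude-mem itself.
ALLOWED = {
    "CLAUDE_MEM_PROVIDER": {"claude", "gemini"},
    "CLAUDE_MEM_GEMINI_MODEL": {
        "gemini-2.5-flash-lite",
        "gemini-2.5-flash",
        "gemini-3-flash",
    },
    "CLAUDE_MEM_GEMINI_BILLING_ENABLED": {"true", "false"},
}


def check_settings(settings: dict) -> list:
    """Return a list of problems; an empty list means nothing was flagged."""
    problems = []
    for key, allowed in ALLOWED.items():
        if key not in settings:
            continue  # missing keys fall back to the documented defaults
        value = settings[key]
        if isinstance(value, bool):
            # Accept JSON booleans as well as the strings "true"/"false".
            value = "true" if value else "false"
        if value not in allowed:
            problems.append(f"{key}: {value!r} is not one of {sorted(allowed)}")
    return problems
```

For example, `check_settings({"CLAUDE_MEM_PROVIDER": "openai"})` returns one problem, while a dictionary that only uses documented values returns an empty list.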

Using the Settings UI

  1. Open the viewer at http://localhost:37777
  2. Click the gear icon to open Settings
  3. Under AI Provider, select Gemini
  4. Enter your Gemini API key
  5. Optionally select a different model
Settings are applied immediately—no restart required.

Manual Configuration

Edit ~/.claude-mem/settings.json:
{
  "CLAUDE_MEM_PROVIDER": "gemini",
  "CLAUDE_MEM_GEMINI_API_KEY": "your-api-key-here",
  "CLAUDE_MEM_GEMINI_MODEL": "gemini-2.5-flash-lite",
  "CLAUDE_MEM_GEMINI_BILLING_ENABLED": "true"
}
Alternatively, set the API key via environment variable:
export GEMINI_API_KEY="your-api-key-here"
The settings file takes precedence over the environment variable.
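
The precedence rule can be illustrated in Python. This is a sketch under stated assumptions rather than claude-mem's actual code: `resolve_gemini_api_key` is a hypothetical helper, and treating an empty string in the settings file as "not set" is an assumption.

```python
import os


def resolve_gemini_api_key(settings: dict, env=None):
    """Pick the API key, preferring the settings file over the environment."""
    env = os.environ if env is None else env
    # Assumption: an empty string in settings.json counts as unset.
    from_settings = settings.get("CLAUDE_MEM_GEMINI_API_KEY")
    if from_settings:
        return from_settings
    # Fall back to the GEMINI_API_KEY environment variable, if present.
    return env.get("GEMINI_API_KEY") or None
```

With both sources set, the value from `settings.json` is returned; the environment variable is only consulted when the settings file has no key.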

Available Models

| Model | Free Tier RPM | Notes |
|---|---|---|
| gemini-2.5-flash-lite | 10 | Default, recommended for free tier (highest RPM) |
| gemini-2.5-flash | 5 | Higher capability, lower rate limit |
| gemini-3-flash | 5 | Latest model, lower rate limit |

Provider Switching

You can switch between Claude and Gemini at any time:
  • No restart required: Changes take effect on the next observation
  • Conversation history preserved: When switching mid-session, the new provider sees the full conversation context
  • Seamless transition: Both providers use the same observation format

Switching via UI

  1. Open Settings in the viewer
  2. Change the AI Provider dropdown
  3. The next observation will use the new provider

Switching via Settings File

{
  "CLAUDE_MEM_PROVIDER": "gemini"
}
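
If you script the switch, take care to change only the provider key and leave the rest of the file intact. The following Python sketch shows one way to do that; `set_provider` is a hypothetical helper, and it assumes the settings file is a flat JSON object like the one shown under Manual Configuration.

```python
import json
from pathlib import Path


def set_provider(provider: str, path=None) -> dict:
    """Set CLAUDE_MEM_PROVIDER in the settings file, keeping other keys."""
    if provider not in ("claude", "gemini"):
        raise ValueError(f"unknown provider: {provider!r}")
    if path is None:
        # Documented location of the settings file.
        path = Path.home() / ".claude-mem" / "settings.json"
    path = Path(path)
    # Start from the existing settings so unrelated keys survive the edit.
    settings = json.loads(path.read_text()) if path.exists() else {}
    settings["CLAUDE_MEM_PROVIDER"] = provider
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(settings, indent=2) + "\n")
    return settings
```

Calling `set_provider("claude")` switches back. According to the section above, the change takes effect on the next observation.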

Fallback Behavior

If Gemini is selected but encounters certain errors, claude-mem automatically falls back to the Claude Agent SDK.
Triggers fallback:
  • Rate limiting (HTTP 429)
  • Server errors (HTTP 5xx)
  • Network issues (connection refused, timeout)
Does not trigger fallback:
  • Missing API key (logs warning, uses Claude from start)
  • Invalid API key (fails with error)
When fallback occurs:
  1. A warning is logged
  2. Any in-progress messages are reset to pending
  3. Claude SDK takes over with the full conversation context
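
The trigger rules above amount to a small decision function. The sketch below is illustrative only: `should_fall_back_to_claude` and its parameters are hypothetical names, and it encodes nothing beyond the two lists in this section.

```python
from typing import Optional


def should_fall_back_to_claude(
    http_status: Optional[int] = None,
    network_error: bool = False,
) -> bool:
    """Return True for the error classes documented as fallback triggers."""
    if network_error:
        # Connection refused, timeouts, and similar transport failures.
        return True
    if http_status is None:
        return False
    # Rate limiting (429) and server errors (5xx) trigger fallback.
    # Other statuses, such as a rejected API key, do not.
    return http_status == 429 or 500 <= http_status <= 599
```

An invalid API key is left out on purpose: retrying with a different provider would hide a configuration problem that the user needs to fix.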

Troubleshooting

“Gemini API key not configured”

Either:
  • Set CLAUDE_MEM_GEMINI_API_KEY in ~/.claude-mem/settings.json, or
  • Set the GEMINI_API_KEY environment variable

Rate Limiting

Google has two rate limit tiers for free usage.
Without billing (API key only):
| Model | RPM | TPM |
|---|---|---|
| gemini-2.5-flash-lite | 10 | 250K |
| gemini-2.5-flash | 5 | 250K |
| gemini-3-flash | 5 | 250K |
Claude-mem enforces these limits automatically with built-in delays between requests. Processing may be slower but stays within limits.
With billing enabled (still free tier):
| Model | RPM | TPM |
|---|---|---|
| gemini-2.5-flash-lite | 4,000 | 4M |
| gemini-2.5-flash | 1,000 | 1M |
| gemini-3-flash | 1,000 | 1M |
Recommended: Enable billing on your Google Cloud project to unlock much higher rate limits. You won’t be charged unless you exceed the free quota. With billing enabled and CLAUDE_MEM_GEMINI_BILLING_ENABLED set to true, claude-mem skips the built-in delay and processes observations without waiting between requests.
If you hit rate limits:
  • Claude-mem automatically falls back to the Claude Agent SDK
  • You can also switch back to Claude as your primary provider
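
To get a feel for how much the free-tier limits slow processing, the delay between requests can be estimated from the RPM figures. This is a back-of-the-envelope sketch, not claude-mem's actual throttling code: the function names are hypothetical, and spacing requests evenly at 60 / RPM seconds is an assumption about how the built-in delay works.

```python
# Free-tier (no billing) RPM values from the rate limit table.
FREE_TIER_RPM = {
    "gemini-2.5-flash-lite": 10,
    "gemini-2.5-flash": 5,
    "gemini-3-flash": 5,
}


def min_delay_seconds(model: str, billing_enabled: bool) -> float:
    """Minimum spacing between requests that stays within the RPM limit."""
    if billing_enabled:
        # CLAUDE_MEM_GEMINI_BILLING_ENABLED is documented as skipping rate limiting.
        return 0.0
    return 60.0 / FREE_TIER_RPM[model]


def seconds_to_wait(model: str, billing_enabled: bool,
                    last_request_at: float, now: float) -> float:
    """How long to sleep before the next request, given the previous one."""
    elapsed = now - last_request_at
    return max(0.0, min_delay_seconds(model, billing_enabled) - elapsed)
```

Under these assumptions, gemini-2.5-flash-lite needs about 6 seconds between requests and the other two models about 12 seconds, which is why processing is slower without billing.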

Observation Quality

If observations seem lower quality with Gemini:
  • Note that Claude typically produces slightly higher quality observations
  • Consider using Gemini for cost savings and Claude for important projects

Next Steps