Memory for AI agents · learns on its own · lower running cost
Make your ChatGPT smarter, and cut what it costs to run, every day.
Self-learning, self-improving memory for your agent — so it stops repeating itself, stops making things up, and stops burning your bill. No more “What are you talking about… again?”
Add it in about a minute — one connector, no terminal. Works with ChatGPT, Claude, Gemini, Hermes & OpenClaw.
query: Which database backs the production stack?
RDS Postgres + pgvector, multi-AZ. (conf 93%)
Watch it decide: answer, double-check, or admit it doesn't know — instead of guessing.
Before / after
You've felt all three of these.
These are the things people actually say about their agents — pulled from public forums, not made up. Here's what changes the day you add UltraMemory.
“It re-learns my whole project for four minutes every new session, then asks me things I already answered.”
It remembers across sessions and apps. Your agent picks up where it left off — no re-explaining, no starting over.
“I burned through my credits — paying 10× more, just for the agent to relearn what it already knew.”
It reuses what’s already proven instead of re-deriving it, so you stop paying twice for the same work. Smaller bill at the end of the month.
“It confidently made something up — and I didn’t catch it until it had already caused a mess.”
When it’s not sure, it says so and double-checks first — instead of guessing with a straight face.
“Before” quotes are real pain points from public forums (Hacker News, GitHub, Reddit) — paraphrased, not testimonials. UltraMemory is in early access.
Why it's not just another memory add-on
Smarter agent. Smaller bill. No guessing.
Storing and searching memories is the easy part — everyone does that. UltraMemory is built around the three things that actually change your day.
Costs you less to run
Your agent keeps paying to re-derive what it already worked out. UltraMemory hands it the proven answer instead, so it stops paying to relearn the same thing and your monthly bill comes down.
less rework = lower spend
Gets smarter on its own
It keeps the moves that work and quietly drops the ones that don’t. The more you use it, the sharper it gets — you never tune a thing. Most memory tools just store; this one improves.
learns from real outcomes, automatically
Knows when it doesn’t know
Instead of confidently making something up, it tells you when it’s unsure and double-checks first. Fewer wrong turns, less cleanup, and an agent you can actually trust.
admits it, instead of guessing
And we'll prove it: our internal eval already shows the guess rate dropping toward zero once the memory is calibrated, and we're now running the public LoCoMo and LongMemEval benchmarks. We'll publish our own numbers, not borrow anyone's.
How we compare
Everyone benchmarks recall. We compete one layer up.
On the public long-memory benchmarks the leaders sit inside a band — and we'll publish our own measured number rather than borrow one. The row that actually decides whether your agent is trustworthy is the one no one else fills.
LongMemEval (GPT-4o judge). Competitor figures are self-reported or from 2026 third-party write-ups (vectorize.io, agentmarketcap, supermemory.ai); long-memory benchmarks are highly methodology-dependent and disputed between vendors.
| Capability | UltraMemory | Mem0 | Zep | Supermemory |
|---|---|---|---|---|
| Finds the right memory by meaning | ✓ | ✓ | ✓ | ✓ |
| Keeps facts straight as they change over time | ✓ | — | ✓ | ◐ |
| Says “I’m not sure” instead of guessing | ✓ | — | — | ◐ |
| Learns which approaches work — and keeps them | ✓ | — | — | — |
| Gets sharper on its own, every night | ✓ | ◐ | ◐ | ◐ |
| Plugs into any agent (open standard) | ✓ | ✓ | ◐ | ✓ |
✓ offered · ◐ partial or different mechanism · — not offered. Reflects publicly documented capabilities as of mid-2026; vendors ship fast — tell us if something's out of date. Zep leads on temporal modeling; we don't claim otherwise.
Two ways in, one brain
A universal front door, and a deep integration.
Both talk to the same Postgres-backed memory. What each one can do differs by design, and that difference is what separates the tiers.
MCP server
free / starter
The universal front door. Paste a URL and a key into Claude, Cursor, Claude Desktop, Hermes, or anything else that speaks MCP. Pull-based: your agent calls recall and store when it decides to.
- + memory_recall · memory_write
- + recall_gated · playbook_recall
- – no automatic capture or prefetch
Hermes provider
premium
The deep integration. Drops into Hermes' lifecycle: injects relevant memory before every turn, captures after, and consolidates on session end. This is the “it just works and self-learns” experience MCP structurally can't offer.
- + auto-inject + auto-capture per turn
- + reflect & consolidate on session end
- + ground-truth instruction injection
# point any MCP client at one URL + key
{ "url": "https://api.ultramemory.us/mcp",
"headers": { "Authorization": "Bearer um_live_…" } }
# or call it directly
curl https://api.ultramemory.us/api/v1/recall/gated \
-H "Authorization: Bearer um_live_…" \
-d '{"query": "what did we decide about auth?"}'
# → { "decision": "abstain",
#     "context_block": "No grounded memory. Retrieve before asserting." }

Works with your stack
One URL and a key. It plugs into the tools your agents already use.
UltraMemory is a remote MCP server (Streamable HTTP), so any MCP-capable client connects in a line. Hermes users get the deeper path: a first-class memory provider that auto-injects and consolidates on its own.
All listed clients support remote (Streamable HTTP) MCP servers per their own docs. “First-class Hermes provider” = a native memory-provider plugin with lifecycle hooks, not just MCP.
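For code that calls the HTTP endpoint directly instead of going through MCP, the gated recall request from the curl example can be sketched in Python. This is a minimal illustration, not official client code: the function names are hypothetical, the `Content-Type` header is an assumption (the curl example sets none), and only the `abstain` decision and the `context_block` field appear in the example response, so every other decision value is treated generically.

```python
import json
import urllib.request
from typing import Optional

# Endpoint taken from the curl example; the API key is supplied by the caller.
RECALL_GATED_URL = "https://api.ultramemory.us/api/v1/recall/gated"


def build_gated_recall_request(query: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for a gated recall."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        RECALL_GATED_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            # Assumption: the API accepts a JSON content type.
            "Content-Type": "application/json",
        },
        method="POST",
    )


def grounded_context(response: dict) -> Optional[str]:
    """Return memory context to ground an answer, or None on abstain.

    A missing "decision" field is treated as abstain, the cautious default.
    """
    if response.get("decision", "abstain") == "abstain":
        return None
    return response.get("context_block")


def recall_gated(query: str, api_key: str, timeout: float = 10.0) -> dict:
    """Send the request and decode the JSON reply (performs a network call)."""
    request = build_gated_recall_request(query, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        return json.load(reply)


# Offline demo using the response shape shown in the curl example.
sample = {
    "decision": "abstain",
    "context_block": "No grounded memory. Retrieve before asserting.",
}
print(grounded_context(sample))  # prints None: nothing grounded to assert
```

Returning `None` on abstain lets the calling agent choose between retrieving more context and telling the user it does not know, rather than asserting something ungrounded.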
- 1. Get a key
Sign up for early access and copy your Bearer API key.
- 2. Connect the URL
Add one MCP endpoint to Claude Code, Cursor, or your client: URL + key, done.
- 3. Hermes? Drop in the provider
Install the ultramemory memory provider and set memory.provider: ultramemory for auto-inject + session-end consolidation.
claude mcp add --transport http ultramemory \
  https://api.ultramemory.us/mcp \
  --header "Authorization: Bearer um_live_…"
Cursor: one url + headers block in mcp.json · Claude Desktop: add the URL in Settings → Connectors.
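As a rough sketch of what the Cursor entry might look like, the snippet below generates an `mcp.json` body from the URL and header shown above. It assumes Cursor's `mcpServers` layout with `url` and `headers` fields for remote servers; the server name `ultramemory` and the helper function are illustrative choices, so check Cursor's own docs before relying on the exact shape.

```python
import json

# MCP endpoint from the connection example above.
MCP_URL = "https://api.ultramemory.us/mcp"


def cursor_mcp_config(api_key: str, server_name: str = "ultramemory") -> str:
    """Render an mcp.json body with one remote MCP server entry.

    Assumption: Cursor reads remote servers from "mcpServers" entries
    that carry a "url" and optional "headers".
    """
    config = {
        "mcpServers": {
            server_name: {
                "url": MCP_URL,
                "headers": {"Authorization": f"Bearer {api_key}"},
            }
        }
    }
    return json.dumps(config, indent=2)


# Print a config with a placeholder key; substitute your own before saving.
print(cursor_mcp_config("<your-key>"))
```

Generating the file rather than hand-editing it keeps the key out of copy-pasted snippets, for example by reading it from an environment variable at setup time.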
Early-access partners
The first names here are still being written.
We'd rather reserve this space for real teams than fill it with stock quotes. These seats are open — be one of the first to put UltraMemory in front of your agents.
Your testimonial here — what changed once your agent stopped forgetting, stopped guessing, and cost you less to run.
Pricing
Pay for memory that learns, not for storage.
Simple monthly plans. The self-learning — the part that actually makes your agent sharper and lowers what it costs to run — unlocks as you move up. Every plan keeps your memories private to you.
Prices indicative while billing is finalized · your memories stay private and isolated to you on every plan.
Questions
Straight answers.
Why not just use a regular memory tool?
Regular memory finds what’s similar, not what’s actually right — so old or wrong facts come back sounding just as confident. UltraMemory can say “I’m not sure,” and it replaces old facts so your agent acts on what’s true now.
Is my data private?
Yes. Your memories are walled off to you and encrypted — no mixing with anyone else, and no second copy floating around to fall out of sync.
Will it slow my agent down?
Looking something up doesn’t trigger another AI call, so it’s fast. We’ll publish real speed numbers rather than promise one we haven’t measured.
Am I stuck with you?
No. It uses an open standard and your memories live in a plain database — connect any agent, and export everything whenever you want.
How is this different from Mem0, Zep, or Supermemory?
They focus on storing and finding memories well. We add the parts they don’t: it learns what works, gets sharper on its own, and admits when it’s unsure instead of guessing. See the chart above.
What’s in early access?
A working product, the free tier, and direct access to us. Bring an agent and tell us what breaks.
Give your agent memory it can be honest about.
Early access is open. Bring an agent, leave with memory that learns and knows its limits.