Octobrain: Long-Term Memory for AI Assistants

You spend an hour debugging with Claude. You explore the codebase, make architectural decisions, find edge cases. You exit. The next session? You're explaining your project structure from scratch.

This isn't how AI should work. Research from Arize shows LLM performance degrades sharply past 100K tokens — even models advertising million-token windows. Diffray's testing found 11 of 12 models dropped below 50% accuracy at just 32K tokens. The "lost in the middle" effect buries information in long contexts. And Qodo's survey found 65% of developers cite missing context as their biggest barrier during refactoring.

The problem isn't that AI forgets. It's that we had no good way to make it remember.

Octobrain is long-term memory for AI assistants. It persists insights, decisions, and knowledge across sessions — not just within them. Today we're releasing version 0.5.0 with local file indexing, git-aware stale reference cleanup, and full MCP (Model Context Protocol) compliance.

What Octobrain Actually Does

Octobrain runs as an MCP server that exposes memory tools to any compatible client — Claude Desktop, Zed, Cursor, or anything that speaks MCP. It's also a CLI when you need direct control.

# Start the MCP server
octobrain mcp

# Or use it directly
octobrain memory memorize --title "API Pattern" \
  --content "Use REST for CRUD, GraphQL for complex queries" \
  --memory-type architecture --tags "api,design"

The server exposes seven tools: memorize, remember, forget, relate, auto_link, memory_graph, and knowledge_search. Your AI stores insights during conversation, retrieves them in future sessions, and explores connections between related memories.

No manual summary files. No context window bloat. Just memory that persists.

Why We Built This

We didn't start with memory. We started with Octomind — a runtime for specialist AI agents that gives you a senior developer in five seconds via the tap system: octomind run developer:general. Zero configuration, fully tooled.

But our agents kept hitting the same wall. They'd spend hours on a task — exploring the codebase, making architectural decisions, finding solutions. Then the session would end. Context window full. Next session started at zero. Same patterns rediscovered. Same decisions remade. Groundhog Day with a terminal.

We tried the workarounds. Manual memory banks that required discipline we didn't have. Session summaries that lost the nuance of why decisions were made. Static knowledge bases that went stale the moment we renamed a file. Cursor rules, custom scripts, various extensions. All static workarounds for a dynamic problem.

So we built Octobrain.

What Existing Tools Get Wrong

When we looked at memory solutions, we kept finding the same gaps:

Single-strategy systems. Most tools pick one approach — vector search, or BM25, or graph traversal — and that's it. But different queries need different strategies. Sometimes you need semantic similarity. Sometimes you need exact keyword matches. Sometimes you need to traverse relationships. We wanted all of them, fused intelligently.

Static importance. Existing systems treat every memory as equally important forever. That's not how memory works. Some things matter for a day. Some things matter for years. We needed temporal decay — memories that fade naturally unless reinforced by access.

Dead reference accumulation. File paths in memories rot. You refactor, rename, delete. Most systems don't notice. The memories linger, pointing to ghosts. We needed git-aware cleanup that detects renames, penalizes stale references, and deletes the truly obsolete.

No relationships. Real knowledge is connected. A decision implies dependencies. A bug fix supersedes a workaround. A pattern implements an architecture. We needed a system that could model these relationships, not just store isolated facts.

The SOTA Approaches We Actually Use

We didn't invent these techniques. We assembled what works from the state of the art, made it configurable, and let the use case decide.

Hybrid Search with Native RRF

Octobrain combines BM25 full-text search with vector similarity using Reciprocal Rank Fusion via LanceDB's native execute_hybrid(). Same algorithm production search systems use. You find "authentication middleware" when you search "auth layer" — but you also find exact matches when you need them.
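
To make the fusion step concrete, here is a minimal sketch of Reciprocal Rank Fusion in plain Python. It is not Octobrain's or LanceDB's code. The function name, the sample document IDs, and the constant k=60 (a commonly used default) are assumptions for illustration.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed, commonly used default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID so output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

# Hypothetical results from a keyword search and a vector search.
bm25_hits = ["auth_middleware", "login_handler", "session_store"]
vector_hits = ["auth_layer_notes", "auth_middleware", "token_refresh"]

fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)
```

A document that appears in both lists collects two contributions, so it outranks documents that only one strategy found. That is why the keyword match and the semantic match reinforce each other instead of competing.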

Optional Reranking

For critical retrieval, enable the cross-encoder reranker. It scores query-document pairs with a dedicated model, which improves accuracy by 20–35% and reduces hallucinations by 35%. It's off by default because speed matters, and available when you need precision.
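
The two-stage shape of retrieve-then-rerank can be sketched as below. This is an illustration under assumptions, not Octobrain's implementation: `rerank` and `toy_overlap_score` are hypothetical names, and the word-overlap scorer is only a stand-in for a real cross-encoder model so the example runs without downloading anything.

```python
def rerank(query, candidates, score_pair, top_k=3):
    """Re-order first-stage candidates by a pairwise relevance score.

    score_pair(query, doc) is any callable returning a float. In a real
    setup it would wrap a cross-encoder model (assumption).
    """
    scored = [(score_pair(query, doc), doc) for doc in candidates]
    # Stable sort: candidates with equal scores keep their first-stage order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]

def toy_overlap_score(query, doc):
    """Stand-in scorer: fraction of query words that appear in the document."""
    query_words = set(query.lower().split())
    doc_words = set(doc.lower().split())
    return len(query_words & doc_words) / len(query_words) if query_words else 0.0

# Hypothetical first-stage results, in the order hybrid search returned them.
candidates = [
    "Rate limiting is handled by the gateway",
    "Auth middleware validates the session token",
    "The auth layer rejects expired tokens",
]
top = rerank("auth token validation", candidates, toy_overlap_score, top_k=2)
print(top)
```

The reranker only sees the short candidate list, which is why it can afford a slower, more precise scorer than the first-stage search.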

Temporal Decay (Ebbinghaus-Style)

Memories fade. That's not a bug — it's how cognition works. Octobrain implements an Ebbinghaus forgetting curve: memories naturally decay unless accessed. Every retrieval boosts importance. Unused memories drift down, important ones stay visible. Configure the half-life (default: 90 days) and floor value.
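
A rough sketch of half-life decay with a floor and access reinforcement follows. The exponential form is a common simplification of the forgetting curve, not necessarily the exact formula Octobrain uses. The 90-day half-life comes from the default above. The floor of 0.1, the boost of 0.1, and the function names are assumptions.

```python
def decayed_importance(base, days_since_access, half_life_days=90.0, floor=0.1):
    """Halve importance every half_life_days, but never drop below floor.

    floor=0.1 is an assumed value; the 90-day half-life is the stated default.
    """
    decay = 0.5 ** (days_since_access / half_life_days)
    return max(floor, base * decay)

def reinforce(importance, boost=0.1, cap=1.0):
    """Bump importance on retrieval, capped at 1.0 (boost size is assumed)."""
    return min(cap, importance + boost)

for days in (0, 90, 180, 900):
    print(days, round(decayed_importance(0.8, days), 4))
```

After one half-life an untouched memory carries half its original weight. After many half-lives it rests at the floor, so it can still surface for a very specific query rather than vanishing.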

Auto-Linking (Zettelkasten for AI)

When you store a memory, Octobrain finds semantically similar memories and creates bidirectional relationships. The result is an emergent knowledge graph. Query one memory, get its context — memories that extend it, contradict it, implement it, depend on it.
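
One way to picture auto-linking is a similarity threshold over embeddings, with every link recorded in both directions. The sketch below assumes cosine similarity, a threshold of 0.8, and tiny hand-written vectors. None of these are confirmed details of Octobrain, and the function names are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def auto_link(new_id, new_vec, memories, links, threshold=0.8):
    """Store a memory and link it both ways to sufficiently similar ones.

    memories: dict of id -> embedding; links: dict of id -> set of linked ids.
    threshold=0.8 is an assumed cutoff.
    """
    for other_id, other_vec in memories.items():
        if other_id == new_id:
            continue
        if cosine_similarity(new_vec, other_vec) >= threshold:
            links.setdefault(new_id, set()).add(other_id)
            links.setdefault(other_id, set()).add(new_id)
    memories[new_id] = new_vec
    return links

# Toy 3-dimensional embeddings; real ones have hundreds of dimensions.
memories = {"jwt_decision": [1.0, 0.0, 0.1], "css_grid_note": [0.0, 1.0, 0.0]}
links = auto_link("auth_refactor", [0.9, 0.1, 0.1], memories, {})
print(links)
```

Because links are written in both directions, looking up either memory later surfaces the other without a second similarity search.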

Git-Aware Stale Reference Cleanup

Memory systems accumulate references to files that no longer exist. Octobrain checks for this on startup: if all of a memory's referenced files are gone, the memory is deleted. If only some are gone, its importance is penalized. If files were renamed, Octobrain detects the rename via git history and updates the references.

Your memory stays as clean as your codebase.
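
The delete-or-penalize rule described above can be sketched as a small decision function. This is an illustration only: the function name, the penalty factor of 0.5, and the choice to keep memories that reference no files are assumptions, and git rename detection is left out for brevity.

```python
import os

def review_memory(importance, referenced_paths, exists=os.path.exists, penalty=0.5):
    """Decide what to do with a memory based on which referenced files remain.

    Returns (action, new_importance), where action is "keep", "penalize",
    or "delete". penalty=0.5 is an assumed factor.
    """
    if not referenced_paths:
        # A memory without file references cannot go stale this way.
        return "keep", importance
    missing = [path for path in referenced_paths if not exists(path)]
    if len(missing) == len(referenced_paths):
        return "delete", 0.0
    if missing:
        return "penalize", importance * penalty
    return "keep", importance

# Simulate a working tree that contains a single file.
present = {"src/auth.py"}
print(review_memory(0.8, ["src/auth.py", "src/legacy.py"], exists=present.__contains__))
```

Passing the existence check in as a parameter keeps the rule testable without touching the filesystem. With the default it checks real paths.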

What's New in 0.5.0

This release reflects what we learned running memory at scale:

Local File Indexing: The knowledge system now indexes local files, not just URLs. Point it at documentation, PDFs, Markdown. It chunks intelligently, tracks parent content for context, and reindexes automatically when content goes stale.

Session-Scoped Storage: Knowledge chunks can be session-scoped — temporary storage that cleans itself up after 120 hours. For one-off research that shouldn't pollute long-term memory.

Product and Workflow Memory Types: Two new categories. product for feature specs, requirements, design decisions. workflow for processes, playbooks, recurring patterns.

Official rmcp SDK: Migrated from custom MCP implementation to the official Rust SDK. Better protocol compliance, better stability, better future-proofing.

Stale Reference Cleanup: Automatic detection of dead file references, with git rename tracking.
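
As an illustration of the session-scoped expiry mentioned above, here is a minimal sketch of a 120-hour time-to-live check. The record layout (`session_scoped`, `created_at`) and the function names are hypothetical. Only the 120-hour window comes from the release notes.

```python
from datetime import datetime, timedelta, timezone

SESSION_TTL = timedelta(hours=120)  # window stated in the release notes

def is_expired(created_at, now=None, ttl=SESSION_TTL):
    """True once a chunk has lived for at least the TTL."""
    now = now or datetime.now(timezone.utc)
    return now - created_at >= ttl

def purge_expired(chunks, now=None):
    """Drop session-scoped chunks past their TTL; keep everything else."""
    return [
        chunk for chunk in chunks
        if not (chunk["session_scoped"] and is_expired(chunk["created_at"], now))
    ]

now = datetime(2025, 1, 10, tzinfo=timezone.utc)
chunks = [
    {"id": "old-session", "session_scoped": True, "created_at": now - timedelta(hours=121)},
    {"id": "new-session", "session_scoped": True, "created_at": now - timedelta(hours=2)},
    {"id": "permanent", "session_scoped": False, "created_at": now - timedelta(days=400)},
]
kept = purge_expired(chunks, now=now)
print([chunk["id"] for chunk in kept])
```

Long-term chunks are never touched by the purge, no matter how old they are. Only the session-scoped flag opts a chunk into expiry.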

How It Fits Together: Octomind + Octobrain

Octobrain works standalone. But it's designed for Octomind — our runtime for specialist AI agents.

The flow: Octomind writes code, makes decisions, encounters bugs. Each insight is stored in Octobrain with the current git commit hash. Three months later, refactoring that module? Octomind retrieves the original decision, plus related memories, conflicting approaches that were rejected, and the reasoning behind trade-offs.

Memory scopes to your project via git remote URL hash. Cross-project queries work when you need shared context. Role filtering (developer, reviewer, etc.) keeps perspectives organized.
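
Scoping by a hash of the git remote URL might look roughly like this. The hash function (SHA-256), the 16-character truncation, the `.git` suffix normalization, and the example URLs are all assumptions for illustration. Octobrain's actual scheme may differ.

```python
import hashlib

def project_id(remote_url):
    """Derive a stable project key from a git remote URL.

    Normalizing away a trailing ".git" and hashing with SHA-256 are
    assumed choices, not confirmed details.
    """
    normalized = remote_url.strip()
    if normalized.endswith(".git"):
        normalized = normalized[: -len(".git")]
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:16]

# Hypothetical remotes: two spellings of one repository, plus another repo.
a = project_id("https://example.com/acme/widgets.git")
b = project_id("https://example.com/acme/widgets")
c = project_id("https://example.com/acme/gadgets.git")
print(a, b, c)
```

Keying on the remote rather than the local path means a fresh clone in a different directory still maps to the same memories.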

Octomind handles the agent. Octobrain handles the memory. Together they solve the full problem: an AI that actually knows your domain, across sessions, without the setup headache.

Why We Didn't Use Mem0, Zep, or Letta

We evaluated the memory space before building. Here's what we found:

Mem0: Fastest path to production and an excellent managed service, but a flat vector architecture with no graph relationships. We needed the graph.
Zep: Graph-based, strong on temporal queries, and leads the LoCoMo benchmark, but its architecture is complex and heavier than we wanted for local-first usage.
Letta: OS-inspired memory tiers where the LLM manages retrieval, but it requires an LLM for all memory operations. We wanted something lighter, faster, more predictable.

Octobrain is different. We're not optimizing for one approach — we're making multiple SOTA strategies available and configurable. Hybrid search or pure vector? Your choice. Temporal decay or static importance? Configurable. Auto-linking or manual relationships? Both work. Local embeddings or API-based? Either way.

The goal isn't the simplest memory system. It's the most accurate one for your specific use case.

Open Source, Local-First

Octobrain is Apache-2.0 licensed and runs entirely on your machine. Your code, your memories, and your data never leave your system unless you explicitly configure an external embedding API. We built it this way because we were tired of choosing between "convenient" managed services that required shipping our codebase to someone else's server and fragile DIY solutions that fell over under load.

Self-hosting shouldn't mean compromising on capability. Octobrain gives you production-grade hybrid search, temporal decay, and relationship graphs — all running locally, with zero data exposure.

Getting Started

# macOS via Homebrew (recommended)
brew install muvon/tap/octobrain

# Any platform via crates.io
cargo install octobrain

# Start the MCP server
octobrain mcp

Add to your Claude Desktop config:

{
	"mcpServers": {
		"octobrain": {
			"command": "/path/to/octobrain",
			"args": ["mcp"]
		}
	}
}

Restart Claude. Your AI now has long-term memory.

FAQ

Does Octobrain work with other AI assistants besides Claude?

Yes. Any MCP-compatible client works — Claude Desktop, Zed, Cursor, Windsurf, and any custom implementation that speaks the protocol.

Where is my data stored?

Locally on your machine. Octobrain uses SQLite for the graph and LanceDB for vectors. No cloud service, no external storage unless you configure an API-based embedding provider.

How does this compare to just using Claude's built-in memory?

Claude's memory is great for personal preferences and recurring context. Octobrain is for project knowledge — architectural decisions, codebase patterns, bug fixes, design trade-offs. It persists across projects and survives account changes.

Can I import existing notes or documentation?

Yes. The knowledge system in 0.5.0 indexes local files — Markdown, PDFs, text files. Point it at your docs and it chunks, embeds, and makes them searchable.

What happens if I rename or delete files mentioned in memories?

Octobrain detects this on startup. If all of a memory's referenced files are gone, the memory is deleted. If only some are gone, its importance is penalized. Renames are detected via git history, and references are updated automatically.

Is there a hosted version?

Not currently. Octobrain is designed local-first. A hosted version would require significant architectural changes and security considerations. If you need this, let us know.

What's Next

We're experimenting with retrieval strategies. Multi-vector representations. Learned sparse retrieval. Query expansion. The memory space evolves fast — we're committed to shipping what works, not what sounds good in a paper.

If you're building AI agents that need to remember things across sessions, try Octobrain. We built it because we needed it. If you're hitting the same wall we did, it might be what you need too.

Octobrain is developed by Muvon Un Limited. Open source under Apache-2.0. Get it on GitHub.