Octobrain 0.7.0: Your AI's Memory Now Sleeps, Forgets, and Focuses

A memory that only ever grows isn't a memory. It's a junk drawer.

Every AI memory tool so far has been additive: store, store, store. The pile gets bigger, your searches get noisier, and the thing you actually need is buried under five near-duplicate notes you wrote about it last Tuesday. Past a certain point, having more memories stops meaning better recall.

Brains don't work that way. They consolidate while you sleep. They let unused things fade. They organize around what you were trying to do, not the order things happened. That pruning isn't a bug in human memory — it's the entire reason recall stays sharp.

0.6.0 was about what Octobrain could find — reading full documents, grepping across indexed content. 0.7.0 is about what Octobrain's memory does when you're not looking. It stops being a passive log and starts behaving like a brain.

Sleep Consolidation — Memory That Tidies Itself

This is the headline. Octobrain now runs a "sleep" pass that finds clusters of recent, similar memories and folds each cluster into a single consolidated insight.

Say that over one week your agent stored five separate notes while chasing the same rate-limiting bug. Before 0.7.0, a search would return five fuzzy, overlapping hits and leave the ranking to luck. After a sleep pass, those five collapse into one clean, higher-confidence insight — and the originals aren't deleted, they're dampened and kept as provenance, so the trail back to the raw notes survives.

The important part: it's autonomous. No cron job. No tool your agent has to remember to call. It triggers itself lazily — the next time Octobrain starts up, if a day has passed since the last pass, it runs and quietly gets out of the way. A slow or failed pass never blocks anything.

[memory]
sleep_consolidation_enabled          = true   # on by default; set to false to opt out
sleep_consolidation_interval_hours   = 24     # once a day
sleep_consolidation_threshold        = 0.85   # how similar memories must be to cluster
sleep_consolidation_min_cluster_size = 3      # need at least 3 to bother
sleep_consolidation_max_age_days     = 7      # only recent memories

If you want to force a pass — for testing, or right before a big retrieval — the CLI override is still there:

octobrain memory sleep-consolidate

But the default is that you never touch it. The memory keeps itself clean.
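
To make the knobs above concrete, here is a minimal sketch of what a sleep pass could look like. It is not Octobrain's actual implementation: the function names, the tuple layout, and the greedy seed-based clustering are assumptions for illustration. Only the parameter names and defaults come from the config.

```python
from datetime import datetime, timedelta
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def is_due(last_run, now, interval_hours=24):
    """Lazy trigger: a pass is due if none ran within the interval."""
    return last_run is None or now - last_run >= timedelta(hours=interval_hours)


def find_clusters(memories, now, threshold=0.85, min_cluster_size=3, max_age_days=7):
    """Group recent memories whose embeddings are close to a cluster seed.

    `memories` is a list of (id, created_at, embedding) tuples (assumed shape).
    Returns lists of ids, one list per cluster that is large enough to fold.
    """
    cutoff = now - timedelta(days=max_age_days)
    recent = [m for m in memories if m[1] >= cutoff]

    clusters = []  # each cluster is a list of memories; the first one is the seed
    for mem in recent:
        for cluster in clusters:
            if cosine(mem[2], cluster[0][2]) >= threshold:
                cluster.append(mem)
                break
        else:
            clusters.append([mem])  # nothing similar enough: start a new cluster

    return [[m[0] for m in c] for c in clusters if len(c) >= min_cluster_size]


now = datetime(2025, 1, 10)
memories = [
    ("a", datetime(2025, 1, 9), [1.0, 0.0]),
    ("b", datetime(2025, 1, 8), [0.99, 0.1]),
    ("c", datetime(2025, 1, 7), [0.95, 0.2]),
    ("d", datetime(2025, 1, 9), [0.0, 1.0]),   # recent but unrelated
    ("e", datetime(2024, 12, 1), [1.0, 0.0]),  # similar but too old
]
clusters = find_clusters(memories, now)
```

In this toy data only a, b, and c end up in a cluster: d is too dissimilar, and e falls outside the seven-day window. Comparing against the seed rather than every member keeps the sketch short; a real pass may well use a different clustering strategy.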

Half-Life Decay — Forgetting, On Purpose

Octobrain now applies a forgetting curve. Every memory's importance decays over time on a half-life schedule — but the moment you access a memory, it gets reinforced. Frequently-recalled memories stay strong; ones nothing ever touches gently fade.

This is the Ebbinghaus forgetting curve, borrowed straight from how human memory works. The effect is that ranking improves on its own. You don't curate anything. The memories you actually use float up; the ones you stored once and never needed sink — without ever being thrown away.

Two guardrails matter here:

Nothing gets zeroed out. There's an importance floor; a low-access memory fades toward the back but never disappears. Decay is a re-ranking signal, not a delete.
Decay persists. Access counts and last-accessed timestamps are stored and survive restarts. Your memory's sense of "what's hot" doesn't reset every session.

[memory]
decay_enabled        = true
decay_half_life_days = 90     # higher = memories stay relevant longer

You can also tune decay per memory — a fast-moving note can be told to fade twice as fast, a foundational decision twice as slow — but for almost everyone the default just works.
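
If the half-life idea feels abstract, the arithmetic is small. The sketch below assumes plain exponential decay measured from the last access, a fixed importance floor, and a per-memory multiplier. The field names, the floor value of 0.05, and the "reset the clock on access" model of reinforcement are all assumptions for illustration, not Octobrain's internals.

```python
from dataclasses import dataclass


@dataclass
class Memory:
    importance: float
    days_since_access: float
    access_count: int = 0
    decay_multiplier: float = 1.0  # hypothetical knob: 2.0 fades twice as fast


def effective_importance(mem, half_life_days=90.0, floor=0.05):
    """Halve importance for every half-life that passes without an access."""
    half_life = half_life_days / mem.decay_multiplier
    decayed = mem.importance * 0.5 ** (mem.days_since_access / half_life)
    return max(decayed, floor)  # the floor keeps a memory from vanishing


def touch(mem):
    """Reinforce on access: count the hit and restart the decay clock."""
    mem.access_count += 1
    mem.days_since_access = 0.0


note = Memory(importance=0.8, days_since_access=90)
after_one_half_life = effective_importance(note)  # 0.8 halved once
touch(note)
after_access = effective_importance(note)         # back to full strength
```

With the default 90-day half-life, a memory untouched for 90 days ranks at half its stored importance, and at a quarter after 180 days. The floor is what makes decay a re-ranking signal rather than a delete.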

Goal-Anchored Consolidation — Memory That Has a Point

Here's the idea agent-memory research keeps circling: consolidate on goal, not on clock. Memories shouldn't just be summarized because time passed — they should collapse around the thing you were trying to accomplish.

0.7.0 makes that a first-class workflow. There's a new Goal memory type and a real lifecycle for every memory — Working → Consolidated → Archived. Memories that contribute toward a goal link to it with an achieves relationship. When the goal is done, you close it, and Octobrain folds every contributing memory into one consolidated insight: importance boosted above its sources, the sources dampened and linked underneath it as the record of how you got there.

# Five scattered findings, one closed goal → one durable insight
octobrain memory consolidate goal_4f2a -s "Shipped rate limiting: token bucket, 429 + Retry-After"

Skip the summary and Octobrain synthesizes one from the source titles. Either way, what was five loose notes becomes one memory that means something — anchored to the intent that produced it.

And this is the same machinery sleep consolidation runs on underneath: each cluster gets a synthetic goal, and the goal pipeline does the rest. One consolidation engine, two ways to trigger it — by hand when you finish something, automatically while you sleep.
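
Here is a rough model of what closing a goal does to the memories behind it. Treat it as a sketch under stated assumptions: the class and field names, the boost and dampen factors, and the choice to mark sources as Archived are all hypothetical. The source only says that the insight ranks above its sources and that the sources are dampened and kept as linked provenance.

```python
from dataclasses import dataclass, field
from enum import Enum


class Lifecycle(Enum):
    WORKING = "working"
    CONSOLIDATED = "consolidated"
    ARCHIVED = "archived"


@dataclass
class Memory:
    id: str
    title: str
    importance: float
    state: Lifecycle = Lifecycle.WORKING
    sources: list = field(default_factory=list)  # ids of contributing memories


def consolidate(goal_id, contributors, summary=None, boost=0.1, dampen=0.5):
    """Fold the memories that contributed to a goal into one insight."""
    if not contributors:
        raise ValueError("a goal needs at least one contributing memory")

    # With no summary given, fall back to one built from the source titles.
    text = summary or "; ".join(m.title for m in contributors)
    top = max(m.importance for m in contributors)

    insight = Memory(
        id=f"{goal_id}:insight",
        title=text,
        importance=min(1.0, top + boost),  # above every source, capped at 1.0
        state=Lifecycle.CONSOLIDATED,
        sources=[m.id for m in contributors],
    )
    for m in contributors:
        m.importance *= dampen        # dampened, never deleted
        m.state = Lifecycle.ARCHIVED  # assumption: sources end up archived
    return insight


notes = [
    Memory("m1", "token bucket", 0.6),
    Memory("m2", "429 + Retry-After", 0.4),
]
insight = consolidate("goal_4f2a", notes)
```

The insight keeps the ids of its sources, so the trail back to the raw notes survives even though they now rank lower.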

Recall Got Smarter — HyDE-lite, On by Default

Vague queries are where semantic search usually falls down. "That thing about retries" doesn't embed anywhere near the note you actually wrote.

0.7.0 ships HyDE-lite query expansion and turns it on by default. The mechanism is simple: do a first-pass search, take the top results, average them into a centroid, and blend that back into your original query before the real search. The query effectively learns what its own answers look like and aims better.

In practice that's roughly 10–30% better recall on long-tail and underspecified queries, the exact ones humans actually type. It costs one extra vector lookup per search, needs no LLM (it's pure math), and the remember tool applies it automatically. The keyword (BM25) half of search still uses your literal text, so exact-match queries are unaffected.

You don't configure anything. Vague questions just find the right memory more often.
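
For the curious, the centroid blend described above fits in a few lines. This is a sketch, not the shipped code: the `search` callable, the `top_k` of 5, and the 50/50 blend weight are assumptions, and the final normalization is there because cosine-style search usually expects unit vectors.

```python
from math import sqrt


def centroid(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def normalize(v):
    """Scale a vector to unit length (a zero vector is returned as-is)."""
    norm = sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else list(v)


def expand_query(query_vec, search, top_k=5, alpha=0.5):
    """Blend the query with the centroid of its own first-pass results.

    `search(vec, k)` is a hypothetical callable returning the embeddings
    of the top-k hits for `vec`.
    """
    first_pass = search(query_vec, top_k)
    if not first_pass:
        return list(query_vec)  # nothing to learn from: keep the query as it was
    c = centroid(first_pass)
    blended = [(1 - alpha) * q + alpha * x for q, x in zip(query_vec, c)]
    return normalize(blended)


# Toy index: every hit points along the second axis.
def fake_search(vec, k):
    return [[0.0, 1.0], [0.0, 1.0]]


expanded = expand_query([1.0, 0.0], fake_search)
```

In the toy example the query starts out pointing along the first axis and ends up halfway between its original direction and the direction of its hits. The expanded vector is what the second, real search would use.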

A Leaner Tool Surface — 8 MCP Tools Down to 5

Every tool you expose over MCP costs your agent context on every single turn — roughly 500–2000 tokens of schema it has to read before it does anything. A bloated tool surface is a tax on every request.

So we cut Octobrain's MCP surface from eight tools to five distinct verbs:

memorize     store a memory   (now with optional related_to[])
remember     recall           (HyDE expansion + 1-hop neighbors, automatic)
forget       delete           (with confirmation)
consolidate  close a goal     (fold N memories into 1 insight)
knowledge    documents        (index, read, match, search)

Three tools left MCP because they no longer earned their schema cost:

sleep_consolidate — now autonomous, so the agent never needs to call it.
relate — folded into memorize. You can pass related_to[] and link a memory to others in the same call that stores it. Contributing toward a goal is now one round-trip: store the finding and mark it achieves the goal together.
memory_graph — remember already returns a memory's immediate neighbors, which covers the common case.

None of them are gone — they all still live in the CLI for admin and debugging. They're just no longer eating your agent's context for features it rarely uses. Five tools, five verbs, no overloaded mode flags.
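
As an illustration of the one-round-trip flow, here is roughly what a memorize call with a link could look like on the wire. The `tools/call` envelope follows the MCP JSON-RPC convention; the argument field names (`content`, and the `id` and `relation` keys inside `related_to`) are guesses for illustration, so check the tool's published schema before relying on them.

```python
import json


def memorize_call(content, related_to=None, request_id=1):
    """Build an MCP tools/call request for the memorize tool.

    The argument names are hypothetical; only the tool name and the
    existence of an optional related_to list come from the release notes.
    """
    arguments = {"content": content}
    if related_to:
        arguments["related_to"] = related_to
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "memorize", "arguments": arguments},
    }


# Store a finding and mark it as contributing to a goal in the same call.
request = memorize_call(
    "429 responses must include Retry-After",
    related_to=[{"id": "goal_4f2a", "relation": "achieves"}],
)
payload = json.dumps(request, indent=2)
```

Before 0.7.0 the same outcome took two calls, one to store and one to relate. Folding the link into the store is what lets the relate tool leave the MCP surface.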

We Started Measuring Memory Quality — LongMemEval

You can't improve what you don't measure. 0.7.0 adds a LongMemEval benchmarking harness — with checkpointing for long runs — so we can score Octobrain's long-term recall across releases on a real benchmark instead of vibes. It's plumbing you'll never run, but it's why the recall and consolidation claims above are claims we can stand behind, and how the next ones will be too.

Under the Hood

A handful of quieter changes keep the lights on as the memory grows:

Periodic LanceDB maintenance runs automatically to keep the vector store compact and fast over time.
Async auto-linking and graph retrieval — related memories connect in the background without slowing down the write that triggered them.
A consolidated SQL module with centralized literal escaping and hardening — fewer sharp edges, one place to reason about queries.

You won't notice any of these directly. That's the point.

What 0.7.0 Looks Like in Practice

A week in the life of a memory, end to end:

Monday. Your agent stores three notes while debugging a flaky webhook — each via memorize, one of them marked achieves the "fix webhook retries" goal in the same call.
All week. You keep asking about it. Each remember reinforces the memories you touch and quietly fades the ones you don't.
Friday. You ship the fix and consolidate the goal. Three notes become one durable insight: "Webhook retries: exponential backoff, idempotency keys, dead-letter after 5."
Overnight. Sleep consolidation sweeps up the near-duplicate observations you never explicitly linked and folds those too.
Next month. Someone asks "how did we handle webhook failures?" — a vague query that HyDE expands, landing on the consolidated insight instead of five half-remembered fragments.

Nobody curated anything. The memory organized itself.

Upgrading

From 0.6.x, the migration is small:

Config — one rename. If your [embedding] section sets text_model, rename it to model. That's the only breaking config change.

[embedding]
model = "fastembed:nomic-ai/nomic-embed-text-v1.5"   # was: text_model
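
If you manage several config files, the rename is easy to script. The helper below is a hypothetical convenience, not something Octobrain ships: it does a line-based rewrite that only touches text_model inside the [embedding] section, and it assumes section headers sit on their own line without trailing comments.

```python
def rename_text_model(toml_text):
    """Rename text_model to model, but only inside the [embedding] section."""
    out, section = [], None
    for line in toml_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            section = stripped.strip("[]").strip()  # track the current section
        elif section == "embedding":
            key, sep, rest = line.partition("=")
            if sep and key.strip() == "text_model":
                # Swap the key name and keep the value and any comment intact.
                line = key.replace("text_model", "model", 1) + sep + rest
        out.append(line)
    result = "\n".join(out)
    # splitlines() drops a final newline, so restore it if it was there.
    return result + "\n" if toml_text.endswith("\n") else result


old_config = (
    "[embedding]\n"
    'text_model = "fastembed:nomic-ai/nomic-embed-text-v1.5"\n'
    "[memory]\n"
    "decay_enabled = true\n"
)
new_config = rename_text_model(old_config)
```

Keys named text_model in other sections are left alone, and a file that already uses model passes through unchanged, so the helper is safe to run twice.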

MCP clients — three tools moved to CLI-only. If your client config or agent prompts call sleep_consolidate, relate, or memory_graph over MCP, update them:

Drop sleep_consolidate calls entirely — it's autonomous now.
Replace relate with memorize's related_to[] to link as you store (or use octobrain memory relate in the CLI for existing-to-existing links).
memory_graph's common case is covered by remember's neighbor results; deeper traversal stays in octobrain memory graph.
One rename rides along: consolidate_goal is now consolidate. Same schema, cleaner verb.

Storage — nothing to do. New lifecycle and access-tracking columns are added on first run, and legacy memories default to fully active. No manual migration, no downtime.

Defaults — sleep consolidation and HyDE are now on. If you want the old behavior, set sleep_consolidation_enabled = false or [search.hyde] enabled = false. Most people should leave them on.

Source and binaries at github.com/muvon/octobrain. If something breaks, open an issue — we read them.