Octobrain 0.6.0: Your AI Finally Remembers What It Read

Most AI memory tools store what you tell them to store. 0.6.0 changes what Octobrain can find.

This release is about knowledge — not just the memories your AI accumulates from conversations, but the documents, pages, and files you point it at. Read, search, match. Full content extraction. Regex grep across everything it's indexed. And a cleaner MCP surface that removes a tool you never needed to call manually.


Read Anything, Extract Everything

The biggest addition in 0.6.0 is the read command. Give Octobrain a URL or a local file path, and it pulls the full text — no chunking, no summarization, no truncation. Raw content, returned directly.

It handles HTML, PDF, DOCX, and plain text files. Remote URLs and local sources both work the same way.
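Conceptually, format handling is a dispatch on the source type. A minimal Python sketch of that idea (the extractor names and the `pick_extractor` helper are hypothetical; Octobrain's actual internals are not shown here):

```python
from pathlib import Path

# Hypothetical extension -> extractor mapping, illustrating the formats
# listed above. Unknown extensions fall back to plain-text extraction.
EXTRACTORS = {
    ".html": "html",
    ".htm": "html",
    ".pdf": "pdf",
    ".docx": "docx",
    ".txt": "text",
}

def pick_extractor(source: str) -> str:
    """Choose an extractor by file extension, defaulting to plain text."""
    return EXTRACTORS.get(Path(source).suffix.lower(), "text")
```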

octobrain knowledge read https://docs.example.com/api-reference
octobrain knowledge read ./spec.pdf

Via MCP, this is the knowledge tool with command: "read". It's the fallback when semantic search isn't precise enough — when you need the whole thing, not just the relevant chunk.
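As a sketch, an MCP client's call to this tool might carry a payload like the one below. Only the tool name (knowledge) and the command value ("read") come from this release; the "source" argument name is an assumption:

```python
import json

# Hypothetical MCP tool-call payload. The tool name and command value are
# documented above; the exact argument keys ("source") are assumed.
call = {
    "name": "knowledge",
    "arguments": {
        "command": "read",
        "source": "https://docs.example.com/api-reference",
    },
}

print(json.dumps(call, indent=2))
```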

The use case is exactly what it sounds like: your AI agent hits a wall on a vague memory, you point it at the source, it reads the whole document and works from there. No copy-paste, no context window gymnastics.


Regex Match Across Indexed Content

The new match command is aimed squarely at developer workflows.

It runs a regex pattern across everything in the knowledge index — or just a specific source if you pass one — and returns matching lines with their line numbers and source paths.

octobrain knowledge match "error_code|timeout"
octobrain knowledge match "fn\s+handle_" --source ./src/main.rs

This is grep, but over your AI's knowledge base. If you've indexed a codebase, a set of docs, or a collection of URLs, you can search them by exact pattern instead of semantic similarity. The two modes complement each other: search for "find me something about authentication", match for "find every line that mentions auth_token".

The MCP knowledge tool supports this as command: "match". Patterns are validated before execution — bad regex fails fast with a clear error, not a silent empty result.
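The fail-fast behavior amounts to compiling the pattern before touching the index. A minimal Python sketch of the principle (Octobrain itself is not Python, and `validate_pattern` is an illustrative name, not its API):

```python
import re

def validate_pattern(pattern: str) -> re.Pattern:
    """Compile the pattern up front so a bad regex fails immediately
    with a clear message instead of producing a silent empty result."""
    try:
        return re.compile(pattern)
    except re.error as exc:
        raise ValueError(f"invalid regex {pattern!r}: {exc}") from exc
```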


Streaming Query Results

Under the hood, knowledge queries now stream results from LanceDB instead of collecting everything into memory first.

The old approach had an arbitrary 10,000-row cap and would spike memory on large tables. Streaming removes the cap entirely and keeps peak memory flat regardless of index size. For most users this is invisible, but if you index large codebases or document collections, you'll notice results are no longer truncated and memory no longer spikes mid-query.
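The difference between the two approaches can be sketched in a few lines of Python (a conceptual illustration only, not LanceDB's API; the row cap is the 10,000 mentioned above):

```python
from typing import Iterable, Iterator

def collect_rows(batches: Iterable[list[dict]], cap: int = 10_000) -> list[dict]:
    """Old approach: buffer every batch in memory, truncate at an arbitrary cap."""
    rows: list[dict] = []
    for batch in batches:
        rows.extend(batch)
        if len(rows) >= cap:
            return rows[:cap]
    return rows

def stream_rows(batches: Iterable[list[dict]]) -> Iterator[dict]:
    """New approach: yield rows as batches arrive; no cap, peak memory
    stays at one batch regardless of table size."""
    for batch in batches:
        yield from batch
```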


auto_link Is Gone (It Still Works, You Just Don't Call It)

The auto_link MCP tool has been removed. This is the only breaking change.

Auto-linking — automatically connecting related memories based on semantic similarity — still happens. It runs on every memorize and update_memory call. You just can't trigger it manually anymore, because there was never a good reason to.

If you had any MCP client config or agent prompts that called auto_link, remove those calls. Everything else stays the same.


Vector Index: No More Dimension Errors

A subtle but annoying bug: LanceDB's PQ index requires that the sub-vector count divides evenly into the embedding dimension. When it didn't, you'd get an opaque indexing error.

0.6.0 fixes this by snapping the sub-vector count down to the nearest valid divisor automatically, capped at 96. The optimizer handles it — you configure nothing, it just works.
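One way to compute such a value, assuming the constraints as stated (the sub-vector count must divide the dimension, capped at 96); this is an illustrative sketch, not Octobrain's actual code:

```python
def snap_sub_vectors(dim: int, cap: int = 96) -> int:
    """Largest divisor of `dim` that does not exceed `cap`.

    LanceDB's PQ index needs the sub-vector count to divide the embedding
    dimension evenly; snapping down avoids the opaque indexing error."""
    for n in range(min(cap, dim), 0, -1):
        if dim % n == 0:
            return n
    return 1
```

For a 1536-dimensional embedding, 96 already divides evenly, so the cap is used as-is; for an awkward dimension like 1000, the count snaps down to the nearest divisor (50).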


Knowledge Sources Are Stricter

Two small but important validation changes:

Directory paths are rejected. If you pass a directory to knowledge index or knowledge read, Octobrain now returns an error instead of silently doing nothing. Files only.

Source URIs are normalized. Trailing slashes are stripped during indexing, so https://example.com/docs/ and https://example.com/docs are treated as the same source. Previously they'd create duplicate entries.
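Both checks are simple in principle. A Python sketch of the behavior described above (the function names are illustrative, not Octobrain's API):

```python
from pathlib import Path

def validate_local_source(path_str: str) -> Path:
    """Reject directories: only individual files can be indexed or read."""
    p = Path(path_str)
    if p.is_dir():
        raise ValueError(f"{path_str} is a directory; pass a file")
    return p

def normalize_source(uri: str) -> str:
    """Strip trailing slashes so equivalent URIs collapse to one source entry."""
    return uri.rstrip("/")
```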


What 0.6.0 Looks Like in Practice

Here's a real agent workflow with this release:

  1. Search your indexed project docs: octobrain knowledge search "rate limiting" — finds the relevant chunk
  2. Need the full spec? octobrain knowledge read ./docs/api.md — full text, no truncation
  3. Looking for a specific error code across all indexed sources? octobrain knowledge match "ERR_4[0-9]{2}" — every matching line, with source and line number
  4. Store a key insight: octobrain memory memorize — auto-links to related memories automatically
  5. Agent picks it up on the next remember call — no manual wiring

The knowledge layer and memory layer work independently but complement each other. Knowledge is for external sources. Memory is for accumulated context and decisions. 0.6.0 makes the knowledge side substantially more capable.


Upgrading

If you're on 0.5.x:

  • Remove any auto_link calls from MCP client configs or agent prompts
  • Everything else is backward compatible

Config format is unchanged. Storage format is unchanged. The knowledge MCP tool gains two new command values (read and match) — existing calls to search, store, and delete are unaffected.

Source and binaries at github.com/muvon/octobrain. If you run into anything, open an issue — we read them.