It's release time.

Two weeks ago we introduced Octofs, Octobrain, Octolib, and the Octomind cloud preview. Since then every one of them has shipped at least one release — most of them several.


Octofs 0.4.3 — regex search, parallel walking, and a delete command

github.com/muvon/octofs · 0.4.0 → 0.4.3

The MCP filesystem server we introduced on May 3 is built around one promise: catch the failure modes AI agents produce when they touch code. The 0.4.x patch line keeps tightening that grip.

What's new since 0.4.0:

  • Regex content search. view now accepts a regex pattern in addition to literal strings. Agents that need to find call sites, references, or structural matches don't have to fall back to shell + ripgrep — it's a first-class search mode.
  • Parallel file walking. Directory traversal is now multi-threaded with a shared worker pool. Large monorepos that took seconds to index now respond in hundreds of milliseconds. Still gitignore-aware on every directory.
  • delete command in text_editor. Agents can now delete lines, ranges, or files atomically through the same tool they use to edit them — no more shelling out to rm. Same atomic-write contract: the file is either there or it isn't, never in between.
  • Preserved file permissions in atomic_write. Before 0.4.2 a write would silently reset the file mode to the umask default. Editing an executable script and losing the +x bit broke things. Fixed: permissions are read before the temp file write and applied before the rename.
  • Stable lock keys for non-existent files. Concurrent creates with path aliasing (./a.rs vs a.rs) used to be a race. Now both resolve to the same canonical lock before the file exists.
  • JSON-encoded array params. Some MCP clients serialize array arguments as JSON strings. Octofs now accepts either shape transparently, so agents stop failing on a "[[1,50]]" vs [[1,50]] mismatch.
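The permission-preserving atomic write is worth sketching, since it's the pattern that fixes the lost +x bit. This is a minimal illustration of the sequence described above — read the existing mode, write a temp file, restore the mode, rename — not Octofs's actual code; the temp-file naming is an assumption.

```rust
use std::fs;
use std::io::Write;
use std::path::Path;

/// Minimal sketch of an atomic write that preserves the target's file mode
/// (Unix assumed). The file is either the old version or the new one,
/// never half-written, and an executable stays executable.
fn atomic_write(path: &Path, contents: &[u8]) -> std::io::Result<()> {
    // Read permissions BEFORE touching anything; None if the file is new.
    let perms = fs::metadata(path).map(|m| m.permissions()).ok();
    let tmp = path.with_extension("tmp"); // naming scheme is illustrative
    {
        let mut f = fs::File::create(&tmp)?;
        f.write_all(contents)?;
        f.sync_all()?; // flush to disk before the rename makes it visible
    }
    if let Some(p) = perms {
        fs::set_permissions(&tmp, p)?; // keep the +x bit on scripts
    }
    fs::rename(&tmp, path) // atomic on the same filesystem
}

fn main() -> std::io::Result<()> {
    use std::os::unix::fs::PermissionsExt;
    let path = std::env::temp_dir().join("octofs_demo.sh");
    fs::write(&path, "#!/bin/sh\necho old\n")?;
    fs::set_permissions(&path, fs::Permissions::from_mode(0o755))?;
    atomic_write(&path, b"#!/bin/sh\necho new\n")?;
    let mode = fs::metadata(&path)?.permissions().mode() & 0o777;
    println!("{:o}", mode); // 755 — the executable bit survived the rewrite
    Ok(())
}
```

Without the set_permissions step, the temp file would carry the umask default and the rename would silently swap it in — exactly the pre-0.4.2 bug.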

If you were on 0.4.0, you can drop in 0.4.3 with no config changes. The hash-based line mode, fuzzy matching, batch conflict detection — all still there, just faster and with sharper edges.

cargo install octofs --version 0.4.3
# or grab a binary at https://github.com/muvon/octofs/releases

Octobrain 0.6.1 — a quiet release on top of a loud one

github.com/muvon/octobrain · 0.6.0 → 0.6.1

0.6.0 was the big one — full document reading via knowledge read, regex match across indexed content, streaming query results, and the auto_link tool finally removed. 0.6.1 is the maintenance pass on top: dependency upgrades and a tuned release profile that trims the binary size and shaves milliseconds off cold starts.

Nothing in 0.6.1 changes the API. If you already shipped 0.6.0 into your stack, this is a drop-in upgrade. If you haven't, the 0.6.0 release post walks through what the read and match commands actually do — and why your AI's "memory" should include the documents it's read, not just the conversations it's had.


Octolib 0.21.5 — reasoning effort, prompt cache keepalive, two new providers

github.com/muvon/octolib · 0.19.0 → 0.21.5

Octolib is the engine behind every LLM call we make — Octomind, Octocode, Octofs, the agents in production, the scripts in our terminals. Since the intro post on April 27 it's gone from 0.19.0 to 0.21.5 — two minor bumps and a dozen patch releases in between. Here's what changed:

  • Reasoning effort across providers. A unified effort parameter that maps onto Anthropic's adaptive thinking, OpenAI's reasoning tiers, and the equivalent knobs on every provider that supports them. One call, every backend. The /effort slash command in Octomind hooks straight into this.
  • Anthropic adaptive thinking. Pass effort = "high" and Claude allocates more thinking budget; "low" keeps things fast. Per-TTL cache creation pricing is now honored correctly too.
  • Prompt cache keepalive policy. Long-running agents kept losing their cached prompts because the 5-minute TTL expired between tool calls. Octolib now keeps the cache warm with a background heartbeat. The result: conversations get dramatically cheaper once the context is built, because cached tokens are billed at a fraction of the fresh-input rate.
  • Two new providers.
    • Fireworks AI — fast OSS inference for Llama, Qwen, Mixtral.
    • Featherless — community-hosted models with pay-per-token billing.
  • DeepSeek tool calling. DeepSeek's tool-call format was non-standard; Octolib now parses it correctly and supports the full thinking + tool-use loop.
  • Image and video attachments via URL. Previously you had to base64-encode media; now provider-aware URL passthrough is the default where supported.
  • HuggingFace reranker via XLM-RoBERTa. Multi-lingual cross-encoder reranking on top of the existing dense + sparse retrieval. Sigmoid normalization on the scores.
  • HTTP/2 keep-alive + compression on every outgoing call. Connection reuse for high-throughput agent workloads. Stale-connection retry. Lower tail latencies under sustained load.

If you're building anything that calls LLMs from Rust, this is the layer to standardize on. Anthropic, OpenAI, Google Gemini, DeepSeek, Moonshot, MiniMax, Z.ai, OpenRouter, NVIDIA NIM, Cerebras, Together, Cloudflare Workers AI, Fireworks, Featherless, Ollama, custom endpoints — same trait, same retry logic, same cost accounting.
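"Same trait, same retry logic" is an architectural shape more than a feature, so here's a toy version of it. Every name below is hypothetical — this is not Octolib's trait, just the pattern: providers implement one interface, and cross-cutting concerns like retries live in a single wrapper.

```rust
/// Hypothetical provider interface: every backend implements the same
/// methods, so retry and cost accounting are written exactly once.
trait ChatProvider {
    fn complete(&self, prompt: &str) -> Result<String, String>;
    fn cost_per_1k_tokens(&self) -> f64;
}

/// A stand-in backend for the sketch; a real one would make an HTTP call.
struct MockProvider;

impl ChatProvider for MockProvider {
    fn complete(&self, prompt: &str) -> Result<String, String> {
        Ok(format!("echo: {prompt}"))
    }
    fn cost_per_1k_tokens(&self) -> f64 {
        0.0
    }
}

/// Shared retry wrapper: identical for Anthropic, OpenAI, Ollama, or any
/// other implementor of the trait.
fn complete_with_retry(p: &dyn ChatProvider, prompt: &str, tries: u32) -> Result<String, String> {
    let mut last = Err("no attempt made".to_string());
    for _ in 0..tries {
        last = p.complete(prompt);
        if last.is_ok() {
            return last;
        }
    }
    last
}

fn main() {
    let out = complete_with_retry(&MockProvider, "hi", 3).unwrap();
    println!("{out}"); // echo: hi
}
```

Swapping providers then means swapping one trait object — the retry policy, cost math, and call sites don't change.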


Octomind 0.29.0 — five releases in three weeks

github.com/muvon/octomind · 0.25.0 → 0.29.0

The agent runtime hit 0.29.0 today. Five releases in three weeks — that pace wasn't planned, it was the cloud preview surfacing what needed fixing in real time. We fixed it. Full story at octomind.run; here's the short list:

  • Schedule persistence and the /schedule command — recurring agent runs that survive restarts.
  • Intent-based MCP capability auto-activation — tools turn themselves on when the conversation needs them, off when it doesn't.
  • Domain-based agent gating — capabilities are filtered to what the agent's domain declares it needs.
  • Persistent vector cache + local embedding engine — pre-embedded vectors loaded on startup; no first-call cold path.
  • Parallel tool calls by default — the model is now explicitly instructed to batch independent calls in one turn.
  • /effort slash command — wired straight to the reasoning-effort plumbing in Octolib 0.21.x.
  • Project-local shebang tools — drop a script in .agents/tools/ and it's an MCP tool. No registration, no manifest.
  • ACP token usage and cost reporting — every host that speaks ACP (Octorun, Octoweb, your own UI) gets per-message cost metadata out of the box.
  • Prompt cache keepalive — same plumbing as Octolib, surfaced as a session-level setting.

The small stuff adds up: continuous left rail for user input, persistent status line with cost delta, highlighted submitted input in history, fixed terminal rendering deadlocks, suppressed Ctrl+C echo. The chat experience in 0.29.0 is noticeably calmer than 0.25.0.


How they fit together

The stack didn't change. Everything in it did:

  • Octolib 0.21.5 — every LLM call, with reasoning effort and prompt cache keepalive
  • Octobrain 0.6.1 — persistent memory across sessions
  • Octofs 0.4.3 — safe filesystem access with regex search and a delete command
  • Octocode — semantic + structural code search (0.14.1 in the wild)
  • Octomind 0.29.0 — the runtime that orchestrates all of it

Single binaries, Apache-2.0, all of them. And all of them moved.

We've already started the next one. If something here unblocks a thing you've been wanting to build, open an issue — features requested in May tend to ship in June.

— Don