Octocode 0.15.0: Local-First by Default, Hybrid Search Everywhere

Octocode 0.15.0 turns on hybrid search and reranking by default, ships a local model stack that needs no API key, and makes structural grep smarter about the patterns LLMs tend to get wrong.

May 23, 2026 • 5 min read

Don Karter

• Translated by AI


The first time you ran octocode index on a new machine, you got a config error: a Voyage API key was required. That was fine for teams that already had one, but annoying for anyone trying Octocode for the first time.

0.15.0 fixes this. You no longer need an API key to get a working, fast index. Hybrid search and reranking are on by default, and structural grep now recovers automatically from the malformed patterns LLMs often produce, instead of silently returning zero results.

42 commits since 0.14.1. Here is what matters.

No API Key Required

The default config in 0.15.0 works fully offline. Models download to the system cache on first use. After that: no network calls, no rate limits, no monthly bill.

[embedding]
code_model = "fastembed:jinaai/jina-embeddings-v2-base-code"
text_model = "fastembed:nomic-ai/nomic-embed-text-v1.5"

[search.reranker]
enabled = true
model   = "fastembed:jina-reranker-v2-base-multilingual"

[search.hybrid]
enabled = true
default_vector_weight  = 0.6
default_keyword_weight = 0.4

The default experience is now: clone the repo, run octocode index, get results. The Voyage, Cohere, and Jina API options are still there, commented out in the template, if you want them.

Note: the first octocode index after upgrading downloads the new local models, a few hundred MB in total. They are downloaded once and then cached.

Hybrid Search On by Default

Octocode used to choose between dense vector search and BM25. 0.15.0 combines both on every query, using Weighted Reciprocal Rank Fusion running inside LanceDB.

Dense vectors lose on identifier-heavy searches like "find parse_remote". BM25 loses on paraphrased intent like "the function that handles remote pull setup". Fusing the two means one method covers the other's misses, so you get the best of both on every query.

The balance is tunable:

# Code-heavy project: identifiers dominate
default_vector_weight  = 0.3
default_keyword_weight = 0.7

# Docs-heavy project: semantic intent dominates
default_vector_weight  = 0.8
default_keyword_weight = 0.2
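
To make the effect of the two weights concrete, here is a minimal sketch of weighted Reciprocal Rank Fusion. It is not Octocode's or LanceDB's actual code: the function name weighted_rrf, the sample hit lists, and the smoothing constant k = 60 (a commonly used value) are all assumptions for illustration.

```python
from collections import defaultdict


def weighted_rrf(ranked_lists, weights, k=60):
    """Fuse ranked lists: score(d) = sum over lists of weight / (k + rank of d).

    k = 60 is an assumed smoothing constant, not a documented Octocode setting.
    """
    scores = defaultdict(float)
    for name, ranking in ranked_lists.items():
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += weights[name] / (k + rank)
    # Highest fused score first; ties broken by id so the output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical hit lists for one query, best match first.
vector_hits = ["setup_remote_pull", "parse_remote", "init_repo"]
keyword_hits = ["parse_remote", "parse_local", "setup_remote_pull"]
lists = {"vector": vector_hits, "keyword": keyword_hits}

fused = weighted_rrf(lists, {"vector": 0.6, "keyword": 0.4})
code_heavy = weighted_rrf(lists, {"vector": 0.3, "keyword": 0.7})
print(fused)
print(code_heavy)
```

In this toy data, a result found by both methods outranks one found by only a single method. Shifting the weights toward keywords moves the keyword-only hit parse_local above the vector-only hit init_repo.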

Cleanup note: old config fields such as keyword_path_weight, keyword_content_weight, and similar ones have been removed. They had no effect anyway. Old configs that contain those keys still load, but the values are ignored. Use default_vector_weight / default_keyword_weight instead.

Structural Grep No Longer Returns Empty Results

The problem with structural search until now: LLMs very often produce the wrong node kind. Python uses function_definition, not function_declaration. Rust uses function_item. The LLM picks whatever looks natural, gets zero results, and gets no hint as to why.

Now, when a pattern matches nothing, Octocode automatically tries progressively more lenient interpretations, and it knows the correct kind for each language. function_declaration in Python becomes function_definition. func, fn, and function all resolve correctly, whatever the LLM typed.
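
The per-language kind resolution can be pictured as a small lookup step. The sketch below is an assumption about the general shape, not Octocode's implementation: the tables FUNCTION_KIND and FUNCTION_ALIASES and the function resolve_kind are hypothetical. It covers only the two languages whose node kinds are named in this post.

```python
from typing import Optional, Tuple

# Node kinds named in this post; a real table would cover every supported language.
FUNCTION_KIND = {"python": "function_definition", "rust": "function_item"}

# Hypothetical set of spellings an LLM might use for "a function".
FUNCTION_ALIASES = {
    "func", "fn", "function",
    "function_declaration", "function_definition", "function_item",
}


def resolve_kind(language: str, kind: str) -> Tuple[str, Optional[str]]:
    """Return (kind to search for, correction hint or None)."""
    canonical = FUNCTION_KIND.get(language)
    if canonical is None or kind == canonical or kind not in FUNCTION_ALIASES:
        return kind, None  # nothing to correct, or nothing we know how to correct
    hint = f"In {language.capitalize()}, use {canonical}, not {kind}"
    return canonical, hint


print(resolve_kind("python", "function_declaration"))
print(resolve_kind("rust", "fn"))
```

Keeping the hint next to the corrected kind means the caller can retry the search and still tell the LLM what it got wrong.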

When nothing works, you get a useful error instead of silence:

"In Python, use function_definition, not function_declaration"

Another fix: large matches (an entire class body, a big function block) used to dump everything and flood the context window. Now they show the first few lines and a summary:

src/foo.rs:42:  pub fn handle_request(req: Request) -> Result<Response> {
                    let user = authenticate(&req)?;
                    let payload = req.json()?;
                    if !validate(&payload) {
... (24 more lines)

Short matches pass through unchanged.
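
The truncation behavior can be sketched in a few lines. This is an illustration only: the function name summarize_match and the cutoff of 4 lines are assumptions read off the sample output above, not documented Octocode settings.

```python
def summarize_match(text: str, max_lines: int = 4) -> str:
    """Keep the first max_lines lines of a match and summarize the rest."""
    lines = text.splitlines()
    if len(lines) <= max_lines:
        return text  # short matches pass through unchanged
    hidden = len(lines) - max_lines
    noun = "line" if hidden == 1 else "lines"
    return "\n".join(lines[:max_lines] + [f"... ({hidden} more {noun})"])


# A hypothetical 28-line match body.
body = "\n".join(f"line {i}" for i in range(28))
print(summarize_match(body))
```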

GraphRAG Now Tracks Who Extends What

GraphRAG already built a call graph across files and functions. Since 0.15.0 it also extracts inheritance and interface relationships in C++, Go, Java, JavaScript, TypeScript, PHP, Python, Ruby, and Rust.

Each function/class entry now has:

extends: superclasses, parent traits/interfaces, Go struct embedding
implements: interface implementations, trait satisfaction, impl Trait for Type in Rust

octocode graphrag get-relationships --node_id src/auth/middleware.rs

This now returns inheritance and impl edges alongside the existing imports and calls graph. AI agents can now answer "who implements Validator?" or "which classes extend BaseHandler?" without scanning files manually. Type names are normalized so that cross-file resolution works even with generic and namespaced types.
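
As a rough picture of why normalization matters, the sketch below builds a reverse index from type name to the nodes that reference it. Only the field names extends and implements come from this post. The node shape, the ids, and the functions normalize_type and reverse_index are hypothetical, and the real normalization rules may well differ.

```python
import re
from collections import defaultdict


def normalize_type(name: str) -> str:
    """Drop generic arguments and namespace qualifiers (assumed rules)."""
    name = re.sub(r"<.*>", "", name)        # 'Validator<Token>' -> 'Validator'
    name = re.split(r"::|\.|\\", name)[-1]  # 'crate::auth::Validator' -> 'Validator'
    return name.strip()


def reverse_index(nodes, field):
    """Map each normalized type name to the ids of nodes that reference it."""
    index = defaultdict(list)
    for node in nodes:
        for target in node[field]:
            index[normalize_type(target)].append(node["id"])
    return index


# Hypothetical graph entries.
nodes = [
    {"id": "src/auth/token.rs::TokenValidator", "extends": [],
     "implements": ["crate::auth::Validator<Token>"]},
    {"id": "src/http/api.py::ApiHandler", "extends": ["handlers.BaseHandler"],
     "implements": []},
    {"id": "src/forms/email.ts::EmailValidator", "extends": [],
     "implements": ["Validator"]},
]

implementers = reverse_index(nodes, "implements")
subclasses = reverse_index(nodes, "extends")
print(implementers["Validator"])
print(subclasses["BaseHandler"])
```

Without the normalization step, crate::auth::Validator<Token> and Validator would land under different keys, and the "who implements Validator?" lookup would miss one of them.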

Export and Import Your Index

Moving to a new machine used to mean rerunning the embedder from scratch. With a large repo and local models, that takes a while.

# On machine A
$ octocode export
Exported 142.30 MB
/Users/dk/Work/myproject/octocode-abc123-20260520-203708.tar.zst

# Transfer the file, then on machine B
$ cd /path/to/same/project
$ octocode import octocode-abc123-20260520-203708.tar.zst

The archive includes both the main index and the branch overlays. Import is atomic: a failed import leaves no partial state. Export takes the same index lock, so concurrent operations wait rather than race.
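
One common way to get "no partial state on failure" is to build the new data in a staging directory and only swap it into place once it is complete. The sketch below shows that general pattern under stated assumptions. It is not Octocode's import code, the name import_atomically and the populate callback are hypothetical, and index locking is left out entirely.

```python
import os
import shutil
import tempfile
from pathlib import Path
from typing import Callable


def import_atomically(target: Path, populate: Callable[[Path], None]) -> None:
    """Build the new index in a staging directory, then swap it into place."""
    target = Path(target)
    staging = Path(tempfile.mkdtemp(prefix=target.name + ".staging-", dir=target.parent))
    try:
        populate(staging)  # e.g. unpack the archive here; may raise
    except BaseException:
        shutil.rmtree(staging, ignore_errors=True)  # failure: target is untouched
        raise

    # Assumes no leftover '<name>.old' directory exists from an earlier run.
    backup = target.with_name(target.name + ".old")
    if target.exists():
        os.rename(target, backup)  # move the old index aside
    os.rename(staging, target)     # publish the fully built index
    shutil.rmtree(backup, ignore_errors=True)
```

Staging on the same filesystem as the target keeps the final rename cheap. Any failure during populate never touches the existing index.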

Everything Else

Branch delta coherence: branch overlays are no longer silently applied on top of a stale main index. Octocode detects the mismatch and warns explicitly instead of returning inconsistent results.
Methods carry their class name: searching for Suppression.mark_set or Foo.bar now finds the method directly. Previously this worked only if the description happened to mention the receiver.
Better splitting of large classes: Python, TypeScript, C++, and Ruby now index methods separately. No more single giant chunk for a 2,000-line class.
Multi-query cap raised from 5 to 10: an agent that used to need 2 requests can now do it in 1.
C++20 module files recognized: .cppm, .ixx, .mxx, .ccm, .cxxm, .cc, .cxx, .c++, .hxx are indexed as code.
More text formats indexed: yaml, toml, dockerfile, makefile, ini, conf, env, xml, html, sql, csv, tsv, log, and others are now indexed as text blocks.
Automatic database maintenance: every indexing run compacts the database automatically. Search stays fast as the index grows.
Static binaries for Alpine and musl Linux: no more libonnxruntime.so errors on minimal containers.
Smaller binary, faster embedding: the build profile changed from opt-level = 3 to opt-level = "z". The binary is noticeably smaller and, surprisingly, local embedding runs are faster too. Smaller code fits better in the CPU instruction cache, which matters more for this workload than aggressive inlining.
Cross-modality fusion for --mode all: the old fixed 1/3 split between code/text/docs is replaced by per-modality RRF. Each input list contributes according to rank within its own list, which gives a better mix that adapts to where the matches actually are, without scale-incompatibility problems between embedding models.

Upgrade Path

Back up your existing index first: octocode export

Upgrade:

brew upgrade muvon/tap/octocode
# or
curl -fsSL https://raw.githubusercontent.com/Muvon/octocode/master/install.sh | sh

Decide on models: accept the new local defaults (no config changes needed), or paste your old [embedding] and [search.reranker] config back to keep using Voyage/Cohere/Jina.
If you want a clean reindex (recommended, because the new chunking does not apply to files that are already indexed): octocode clear && octocode index
Optional: tune default_vector_weight / default_keyword_weight for your project type.

If you want pure vector ranking so that results match 0.14.x, set [search.hybrid] enabled = false. The reranker model download on first search can be skipped with [search.reranker] enabled = false.

Octocode is open source (Apache 2.0) at github.com/Muvon/octocode. It powers code search inside Octomind, and the MCP server is how the two talk to each other.
