feat: hybrid BM25 + vector RAG for diagnose — pattern recognition and red herring suppression #15
Reference: Circuit-Forge/turnstone#15
Updated: Hybrid BM25 + Vector Search Architecture
Design spec:
circuitforge-plans/turnstone/superpowers/specs/2026-05-24-hybrid-rag-multiagent-diagnose-design.md
The gap
The current BM25 (FTS5) search misses log entries that are semantically equivalent to the query but use different vocabulary. For example, a search for "database connection failed" does not match ECONNREFUSED, backend gone away, max retries exceeded, or connection reset by peer.
Hybrid score
Start with alpha=0.6, beta=0.4 (both tunable). The existing pattern-tag boost is preserved; the vector score is additive.
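As a minimal sketch of how these weights could combine, assuming alpha weights a normalized BM25 relevance, beta weights a cosine similarity, and the pattern-tag boost is added unchanged on top. The function names, the min-max normalization, and the assignment of alpha and beta to the two signals are illustrative assumptions, not taken from the design spec.

```python
def min_max_normalize(scores):
    """Scale scores to [0, 1]; a constant or empty list maps to zeros."""
    if not scores:
        return []
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def hybrid_scores(bm25_relevance, vector_sims, pattern_boosts,
                  alpha=0.6, beta=0.4):
    """Combine lexical and vector signals for a list of candidates.

    bm25_relevance: higher-is-better lexical scores. SQLite FTS5's bm25()
        returns lower-is-better values, so negate those before passing
        them in (assumption: that is the source of the lexical score).
    vector_sims: cosine similarities between the query and each candidate.
    pattern_boosts: the existing pattern-tag boost per candidate, added as-is.
    """
    bm25_norm = min_max_normalize(bm25_relevance)
    return [
        alpha * lexical + beta * semantic + boost
        for lexical, semantic, boost in zip(bm25_norm, vector_sims, pattern_boosts)
    ]
```

Under this sketch, an entry with no lexical overlap but a high cosine similarity still receives up to beta of the score, which is the case the gap above describes.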
Vector index — implementation options (preference order)
- context_chunks.embedding BLOB, compute in Python. Zero new dependencies. Fast for <100K entries. Start here.
- Chroma / Qdrant / Weaviate: unnecessary infra for Turnstone scale. Do not use.
RAG beyond search
Vector retrieval also drives the multi-agent diagnose pipeline (#29):
Embedding infrastructure (prerequisite)
- app/services/embeddings.py: embed text, persist to context_chunks.embedding
- context_chunks.embedding BLOB already exists in schema; no migration needed.
Relates to: #29, #32
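The persistence step and the in-Python scan over context_chunks.embedding could look roughly like the sketch below. It uses only the standard library to stay within the zero-new-dependencies constraint. The id column, the float32 little-endian BLOB layout, and every function name here are assumptions for illustration; the embedding model is out of scope, so vectors are passed in as plain lists of floats.

```python
import math
import sqlite3
import struct


def pack_embedding(vector):
    """Serialize a list of floats as little-endian float32 bytes for a BLOB."""
    return struct.pack(f"<{len(vector)}f", *vector)


def unpack_embedding(blob):
    """Inverse of pack_embedding: 4 bytes per float32 component."""
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def store_embedding(conn, chunk_id, vector):
    """Persist a vector to context_chunks.embedding (id column is assumed)."""
    conn.execute(
        "UPDATE context_chunks SET embedding = ? WHERE id = ?",
        (pack_embedding(vector), chunk_id),
    )


def top_k(conn, query_vector, k=5):
    """Brute-force scan: score every stored embedding, return the best k."""
    rows = conn.execute(
        "SELECT id, embedding FROM context_chunks WHERE embedding IS NOT NULL"
    )
    scored = [
        (cosine(query_vector, unpack_embedding(blob)), chunk_id)
        for chunk_id, blob in rows
    ]
    scored.sort(reverse=True)
    return [(chunk_id, score) for score, chunk_id in scored[:k]]
```

A full scan like this is linear in the number of stored chunks, which is the trade-off accepted above for small tables in exchange for avoiding a separate vector store.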