Multi-agent diagnose pipeline: specialize into timeline, root-cause, and false-positive-suppressor stages #29

Closed
opened 2026-05-24 22:02:57 -07:00 by pyr0ball · 0 comments
Owner

Updated: Hybrid RAG + Multi-Agent Architecture

Design spec: circuitforge-plans/turnstone/superpowers/specs/2026-05-24-hybrid-rag-multiagent-diagnose-design.md

5-stage pipeline

| Stage | Task | Implementation |
|---|---|---|
| 1 | Timeline reconstructor | Pure Python: sort, gap-annotate, group by source |
| 2 | Severity classifier | k-NN over embeddings (Option A), then fine-tuned BERT-class via Avocet (Option B) |
| 3 | Root-cause hypothesizer | TextGen LLM (Haiku/local Qwen) + RAG-retrieved runbook + past incident docs |
| 4 | False-positive suppressor | Cosine similarity vs. known-good corpus (embeddings) |
| 5 | Summary synthesizer | TextGen LLM + cross-incident retrieval ("similar to INC-YYYY-MM-DD") |
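
A minimal sketch of what stage 1 could look like, since it is the only stage described as pure Python. The `LogEntry` fields, the `reconstruct_timeline` name, and the 60-second gap threshold are all assumptions for illustration; the design spec may define these differently.

```python
from collections import defaultdict
from dataclasses import dataclass, replace
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class LogEntry:
    # Hypothetical record shape; the real schema is not given in this issue.
    timestamp: datetime
    source: str
    message: str
    gap_before: Optional[timedelta] = None  # set by reconstruct_timeline


def reconstruct_timeline(entries, gap_threshold=timedelta(seconds=60)):
    """Sort entries by time, annotate large gaps, and group by source.

    Returns (ordered, by_source). The input entries are not mutated.
    """
    ordered = sorted(entries, key=lambda e: e.timestamp)
    annotated = []
    for i, entry in enumerate(ordered):
        gap = None
        if i > 0:
            delta = entry.timestamp - ordered[i - 1].timestamp
            # Only gaps at or above the threshold are recorded.
            if delta >= gap_threshold:
                gap = delta
        annotated.append(replace(entry, gap_before=gap))

    by_source = defaultdict(list)
    for entry in annotated:
        by_source[entry.source].append(entry)
    return annotated, dict(by_source)
```

Because the function is deterministic and has no model dependency, it can be unit-tested with hand-built entries, which matches the intent of phase 3.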

Model assignment rationale

  • Stages 1 and 4: no LLM. Stage 1 is deterministic; stage 4 uses embedding similarity.
  • Stage 2: start with embedding k-NN (zero training overhead); graduate to fine-tuned classifier once labeled data accumulates via Avocet. HuggingFace search for existing syslog/AIOps fine-tunes in progress — results will inform stage 2 model selection.
  • Stages 3 and 5 share one TextGen model. Stage 3 gets anomaly+critical entries only (pre-filtered by stage 2); stage 5 gets the ranked hypothesis list. Both prompts are short.
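
The two embedding-only stages (2 and 4) can be sketched together, since both reduce to cosine similarity. This is an illustration under stated assumptions: the function names, `k=3`, the `0.9` suppression threshold, and the severity labels are hypothetical, and embeddings are plain lists of floats rather than whatever the embedding infrastructure will actually return.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_severity(query, labeled, k=3):
    """Stage 2, Option A: majority vote among the k most similar labeled embeddings.

    `labeled` is a list of (embedding, label) pairs.
    """
    ranked = sorted(
        labeled,
        key=lambda pair: cosine_similarity(query, pair[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def is_known_good(query, known_good, threshold=0.9):
    """Stage 4: suppress an entry if it closely matches any known-good embedding."""
    return any(cosine_similarity(query, ref) >= threshold for ref in known_good)
```

A brute-force scan like this is fine for small labeled sets; a vector index would be the likely replacement once the corpus grows, but that choice is outside what this issue specifies.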

Implementation phases

  1. Embedding infrastructure (prerequisite for stages 2-5 and issue #15)
  2. Hybrid search (#15) — BM25 + vector, alpha/beta tunable
  3. Stage 1 + 2 pipeline skeleton (deterministic, unit-testable without LLM)
  4. Stage 3 + 4 RAG injection + suppressor
  5. Stage 5 + cross-incident retrieval
  6. Domain-view mapping (#32)
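
For phase 2, one plausible reading of "BM25 + vector, alpha/beta tunable" is a weighted sum of the two scores. The sketch below assumes the BM25 and vector scores have already been computed per document, and it uses min-max normalization so the two scales are comparable; both the normalization choice and the names `hybrid_rank` and `min_max_normalize` are assumptions, not something issue #15 is known to specify.

```python
def min_max_normalize(scores):
    """Rescale a {doc_id: score} mapping to the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal (including the single-document case): treat as full weight.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def hybrid_rank(bm25_scores, vector_scores, alpha=0.5, beta=0.5):
    """Fuse lexical and vector scores as alpha * bm25 + beta * vector.

    Documents missing from one score set contribute 0.0 for that component.
    Returns (doc_id, fused_score) pairs, best first.
    """
    bm25_norm = min_max_normalize(bm25_scores)
    vec_norm = min_max_normalize(vector_scores)
    doc_ids = set(bm25_norm) | set(vec_norm)
    fused = {
        doc: alpha * bm25_norm.get(doc, 0.0) + beta * vec_norm.get(doc, 0.0)
        for doc in doc_ids
    }
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)
```

Setting `alpha=1.0, beta=0.0` degrades to pure lexical ranking, which gives a convenient baseline when tuning the weights.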

Relates to: #15, #32

Reference: Circuit-Forge/turnstone#29