Multi-agent diagnose pipeline: specialize into timeline, root-cause, and false-positive-suppressor stages #29


Closed

opened 2026-05-24 22:02:57 -07:00 by pyr0ball · 0 comments

pyr0ball (Owner) commented 2026-05-24 22:02:57 -07:00

## Updated: Hybrid RAG + Multi-Agent Architecture

Design spec: `circuitforge-plans/turnstone/superpowers/specs/2026-05-24-hybrid-rag-multiagent-diagnose-design.md`

### 5-stage pipeline

| Stage | Task | Implementation |
|---|---|---|
| 1 | Timeline reconstructor | Pure Python: sort, gap-annotate, group by source |
| 2 | Severity classifier | k-NN over embeddings (Option A) then fine-tuned BERT-class via Avocet (Option B) |
| 3 | Root-cause hypothesizer | TextGen LLM (Haiku/local Qwen) + RAG-retrieved runbook + past incident docs |
| 4 | False-positive suppressor | Cosine similarity vs. known-good corpus (embeddings) |
| 5 | Summary synthesizer | TextGen LLM + cross-incident retrieval ("similar to INC-YYYY-MM-DD") |
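A minimal sketch of what stage 1 could look like, assuming log entries arrive as simple records with a timestamp, a source, and a message. The `Entry` type, the `reconstruct_timeline` name, and the 5-minute gap threshold are illustrative assumptions, not taken from the design spec.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Entry:
    # Hypothetical record shape; the real schema lives in the design spec.
    ts: datetime
    source: str
    message: str
    # Filled in when the pause since the previous entry reaches the threshold.
    gap_before: Optional[timedelta] = None


def reconstruct_timeline(entries, gap_threshold=timedelta(minutes=5)):
    """Sort entries by time, annotate large gaps, and group them by source."""
    ordered = sorted(entries, key=lambda e: e.ts)

    # Gap-annotate: compare each entry with its predecessor in time order.
    for prev, cur in zip(ordered, ordered[1:]):
        delta = cur.ts - prev.ts
        cur.gap_before = delta if delta >= gap_threshold else None

    # Group by source, preserving chronological order inside each group.
    by_source = defaultdict(list)
    for entry in ordered:
        by_source[entry.source].append(entry)

    return ordered, dict(by_source)
```

Because this stage is deterministic, it can be unit-tested with fixed timestamps and no model in the loop.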

### Model assignment rationale

- Stages 1 and 4: no LLM. Stage 1 is deterministic; stage 4 uses embedding similarity.
- Stage 2: start with embedding k-NN (zero training overhead); graduate to a fine-tuned classifier once labeled data accumulates via Avocet. A HuggingFace search for existing syslog/AIOps fine-tunes is in progress; the results will inform stage 2 model selection.
- Stages 3 and 5 share one TextGen model. Stage 3 receives only the anomaly and critical entries (pre-filtered by stage 2); stage 5 receives the ranked hypothesis list. Both prompts are short.
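The two non-LLM embedding steps (stage 2, Option A, and stage 4) can be sketched roughly as follows. This assumes embeddings are plain lists of floats produced elsewhere; the function names, `k=3`, and the `0.9` similarity threshold are placeholders for illustration, not values from the spec.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_severity(query, labeled, k=3):
    """Stage 2 sketch: majority label among the k most similar labeled embeddings.

    `labeled` is a list of (embedding, label) pairs.
    """
    nearest = sorted(labeled, key=lambda item: cosine(query, item[0]), reverse=True)[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def is_known_good(query, known_good, threshold=0.9):
    """Stage 4 sketch: suppress an entry that closely matches the known-good corpus."""
    return any(cosine(query, ref) >= threshold for ref in known_good)
```

A brute-force scan like this is only reasonable for a small labeled set; a larger corpus would likely need a vector index, which is outside this sketch.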

### Implementation phases

1. Embedding infrastructure (prerequisite for stages 2-5 and issue #15)
2. Hybrid search (#15): BM25 + vector, alpha/beta tunable
3. Stage 1 + 2 pipeline skeleton (deterministic, unit-testable without LLM)
4. Stage 3 + 4 RAG injection + suppressor
5. Stage 5 + cross-incident retrieval
6. Domain-view mapping (#32)
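For phase 2, one possible reading of "alpha/beta tunable" is a weighted sum of normalized scores. The sketch below assumes `alpha` weights the BM25 score and `beta` the vector score, and that both score sets are already computed per document ID. That mapping, the min-max normalization, and the function names are assumptions for illustration; #15 may define the fusion differently.

```python
def min_max(scores):
    """Rescale a {doc_id: score} mapping to the range [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def hybrid_rank(bm25_scores, vector_scores, alpha=0.5, beta=0.5):
    """Fuse lexical and vector scores as alpha * bm25 + beta * vector.

    Assumption: alpha weights BM25 and beta weights vector similarity.
    A document missing from one score set contributes 0.0 for that part.
    """
    bm25_n = min_max(bm25_scores) if bm25_scores else {}
    vec_n = min_max(vector_scores) if vector_scores else {}
    docs = set(bm25_n) | set(vec_n)
    fused = {
        doc: alpha * bm25_n.get(doc, 0.0) + beta * vec_n.get(doc, 0.0)
        for doc in docs
    }
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)
```

Normalizing before fusing keeps the unbounded BM25 scale from swamping cosine similarities, so the two weights stay comparable when tuned.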

Relates to: #15, #32
