Commit graph

9 commits

Author SHA1 Message Date
a4f97e5a79 refactor: extract _score_hypothesis helper, fix exception types, pass device in suppressor 2026-05-25 14:41:33 -07:00
e3da2ab514 feat: Stage 4 — FalsePositiveSuppressor for multi-agent diagnose pipeline (issue #29)
- Implements FalsePositiveSuppressor using embedding cosine similarity
- Lazy corpus embedding via get_embedder() with module-level cache keyed by db_path
- Cache invalidated automatically when the resolved incident corpus changes
- Suppresses hypotheses with novelty_score below configurable threshold (default 0.85)
- Full fallback path (novelty=1.0, no suppression) when model_id empty, embedding
  service unavailable, or no resolved incidents found in DB
- Graceful handling of missing incidents table and DB query failures
- Numpy bool_ leakage prevented by explicit float()/bool() coercion at assignment
- Pure-Python cosine fallback for environments without numpy
- 9 new tests (all mocked, no real model downloads): passthrough, suppress, no-suppress,
  empty list, ranking, empty corpus, DB failure, service unavailable, cache invalidation
- 350 total tests passing (341 pre-existing + 9 new)

Closes: #29
2026-05-25 14:28:31 -07:00
5a4ee3f912 fix: defensive coercion for LLM confidence and cluster fields in hypothesizer
- Add _coerce_float() module-level helper: catches TypeError/ValueError from
  non-numeric LLM output (e.g. 'high', 'N/A') and returns a caller-supplied
  default instead of raising.
- Replace float(item.get('confidence', 0.5)) with
  _coerce_float(item.get('confidence'), 0.5) in _parse_response.
- Guard supporting_cluster_ids: tuple(item.get(...) or []) so a JSON null
  from the LLM does not cause TypeError('NoneType is not iterable').
- runbook_refs is hardcoded as () and not sourced from LLM output; no change
  needed there.
- Add test_non_numeric_confidence_uses_default (Test 10) to cover the 'high'
  string case: asserts no exception and confidence == 0.5.
- 341 tests passing (+1).

Closes: #29
2026-05-25 14:00:30 -07:00
6a4fc40a7d feat: Stage 3 — RootCauseHypothesizer for multi-agent diagnose pipeline (issue #29)
- Add app/services/diagnose/hypothesizer.py with RootCauseHypothesizer class
- Stage 3 of the multi-agent diagnose pipeline: accepts ClassifiedTimeline +
  RetrievedContext, builds a structured JSON prompt, calls the LLM via the
  same cf-orch task → OpenAI-compat fallback pattern used by llm.py
- Parses JSON array response into list[Hypothesis] dataclasses with UUID ids,
  severity validation (WARNING→WARN, unknown→ERROR), confidence coercion
- Gracefully returns [] when llm_url/llm_model absent or clusters empty
- Add tests/test_diagnose_hypothesizer.py: 12 tests, all mocked, no LLM I/O
  covering: valid response, UUID generation, malformed JSON, non-list JSON,
  empty clusters, missing URL/model, max_hypotheses cap, severity mapping,
  confidence string coercion
- 340 tests passing (328 prior + 12 new)

Closes: #29
2026-05-25 13:49:18 -07:00
5c95bc0e96 feat: Stage 2 — SeverityClassifier for multi-agent diagnose pipeline (issue #29)
Three-path classification: ML (transformers pipeline, lazy singleton) →
pattern_tags (YAML pattern severity dict) → regex (detect_severity).

- Path A: HF text-classification pipeline loaded lazily on first classify()
  call via module-level singleton; shim promotes ERROR+keyword hits to CRITICAL
  and demotes low-confidence INFO to DEBUG.
- Path B: maps cluster.pattern_tags through the loaded pattern severity dict;
  picks the highest severity across matching tags.
- Path C: falls back to detect_severity() regex scan on representative_text;
  defaults to INFO when no keyword matches.
- Pattern file resolved from constructor arg or TURNSTONE_PATTERNS env var
  (mirrors app/rest.py convention).
- No crash when transformers is not installed; ImportError on per-cluster ML
  inference triggers clean per-cluster fallback to pattern_tags/regex.
- ClassifiedTimeline.classifier_used reflects the primary session path.

Tests (10 new, 328 total, all passing):
- ML ERROR, CRITICAL promotion, DEBUG demotion, WARNING→WARN
- pattern_tags resolution from YAML fixture
- regex ERROR detection and INFO default
- ImportError clean fallback
- empty timeline no-crash
- ClassifiedTimeline FrozenInstanceError on mutation

Closes: #29
2026-05-25 13:27:17 -07:00
65bbc438de refactor: split TimelineReconstructor.reconstruct into helpers, fix magic number + error handling
- Add gap_significance_seconds constructor param (default 30) to replace hardcoded magic number in gap_count computation
- _parse_iso now returns datetime | None with try/except on ValueError; all callers handle None return by treating malformed timestamps as absent
- Extract reconstruct into four private helpers: _sort_entries, _group_into_raw_clusters, _build_cluster, _dominant_sources_tuple
- Promote _sort_key to module-level function (was nested inside reconstruct)
- Rename old module-level _build_cluster to _make_event_cluster to avoid name collision with new instance method
- Add explanatory comment to type: ignore[arg-type] at _highest_severity call site
- Black-formatted
2026-05-25 13:22:18 -07:00
a6e65ec95d feat: Stage 1 — TimelineReconstructor for multi-agent diagnose pipeline (issue #29)
- Add app/services/diagnose/timeline.py: pure-Python TimelineReconstructor
  - Sorts entries by timestamp_iso (None entries appended at end)
  - Sliding-window clustering anchored to first entry in each cluster
  - Computes cluster_id (sha1[:12]), severity (highest wins), burst flag,
    gap_before_seconds, representative_text (highest rank, longest text tiebreak)
  - Builds TimelineResult with dominant_sources sorted by entry count descending
- Update pipeline.py stub to import TimelineReconstructor (Task 6 wiring prep)
- Add tests/test_diagnose_timeline.py: 15 tests covering all 13 required cases
  plus null-timestamp edge case variant; all 318 tests passing

Closes: #29
2026-05-25 12:54:15 -07:00
7ee87a2982 fix: frozen dataclasses, clean __all__, improve exception logging in diagnose package 2026-05-25 12:31:07 -07:00
d833febe77 refactor: convert diagnose module to package for multi-agent pipeline (issue #29)
- Move app/services/diagnose.py verbatim to app/services/diagnose/legacy.py
- Create app/services/diagnose/__init__.py with full implementation so that
  patch('app.services.diagnose._HAS_DATEPARSER') targets the correct namespace
  and all 303 existing tests continue to pass without modification
- Add app/services/diagnose/models.py with 5 pipeline dataclasses:
  EventCluster, TimelineResult, ClassifiedTimeline, Hypothesis, RankedHypothesis
- Add app/services/diagnose/pipeline.py with run_pipeline() stub (Task 6)
- Add MULTI_AGENT_ENABLED feature flag (off by default via env var)
- Zero behavior change; ruff clean

Closes: #29
2026-05-25 11:12:39 -07:00