Commit graph

2 commits

Author SHA1 Message Date
2c83247f1e feat(classifier): add Hybrid-BERT label mapping shim (#41)
Adds _HYBRID_BERT_LABEL_MAP to translate the 7-class output vocabulary of
krishnas4415/log-anomaly-detection-models (Hybrid-BERT, MIT) to Turnstone
SeverityLabel. _map_label now checks the Hybrid-BERT map before the standard
map so either model family works via TURNSTONE_CLASSIFIER_MODEL without any
additional code path.

Mapping (confirmed from model config.json):
  normal            → INFO
  security_anomaly  → ERROR
  system_failure    → CRITICAL
  performance_issue → WARN
  network_anomaly   → WARN
  config_error      → ERROR
  hardware_issue    → CRITICAL

Keyword-based CRITICAL promotion and low-confidence DEBUG demotion apply on
top of the base mapping (same rules as the standard vocabulary).

11 new tests covering all 7 Hybrid-BERT labels, case-insensitivity, and
regression on standard-vocabulary labels. 372 tests passing total.

Note: custom loading code for the non-standard .pt checkpoint format is
explicitly out of scope — evaluate better-packaged HF alternatives first
(see #41 for candidate list).

Closes: #41
2026-06-01 16:20:31 -07:00
5c95bc0e96 feat: Stage 2 — SeverityClassifier for multi-agent diagnose pipeline (issue #29)
Three-path classification: ML (transformers pipeline, lazy singleton) →
pattern_tags (YAML pattern severity dict) → regex (detect_severity).

- Path A: HF text-classification pipeline loaded lazily on first classify()
  call via module-level singleton; shim promotes ERROR+keyword hits to CRITICAL
  and demotes low-confidence INFO to DEBUG.
- Path B: maps cluster.pattern_tags through the loaded pattern severity dict;
  picks the highest severity across matching tags.
- Path C: falls back to detect_severity() regex scan on representative_text;
  defaults to INFO when no keyword matches.
- Pattern file resolved from constructor arg or TURNSTONE_PATTERNS env var
  (mirrors app/rest.py convention).
- No crash when transformers is not installed; ImportError on per-cluster ML
  inference triggers clean per-cluster fallback to pattern_tags/regex.
- ClassifiedTimeline.classifier_used reflects the primary session path.

Tests (10 new, 328 total, all passing):
- ML ERROR, CRITICAL promotion, DEBUG demotion, WARNING→WARN
- pattern_tags resolution from YAML fixture
- regex ERROR detection and INFO default
- ImportError clean fallback
- empty timeline no-crash
- ClassifiedTimeline FrozenInstanceError on mutation

Closes: #29
2026-05-25 13:27:17 -07:00