feat(classifier): Hybrid-BERT label mapping shim (#41) #62

Closed
pyr0ball wants to merge 0 commits from feat/41-hybrid-bert-shim into main
Owner

Summary

  • Adds _HYBRID_BERT_LABEL_MAP to classifier.py translating the 7-class Hybrid-BERT vocabulary to SeverityLabel
  • _map_label now checks the Hybrid-BERT map before the standard map, so either model family works via TURNSTONE_CLASSIFIER_MODEL without an additional code path
  • 11 new tests covering all 7 labels, case-insensitivity, promotion/demotion rules, and standard vocabulary regression

Test plan

  • 372 tests passing
  • Set TURNSTONE_CLASSIFIER_MODEL=krishnas4415/log-anomaly-detection-models and confirm labels translate correctly (blocked until usable checkpoint loading is implemented; see #41 for tracking)
  • Confirm byviz/bylastic_classification_logs (standard vocab) still classifies correctly
pyr0ball added 1 commit 2026-06-01 16:23:37 -07:00
Adds _HYBRID_BERT_LABEL_MAP to translate the 7-class output vocabulary of
krishnas4415/log-anomaly-detection-models (Hybrid-BERT, MIT) to Turnstone
SeverityLabel. _map_label now checks the Hybrid-BERT map before the standard
map so either model family works via TURNSTONE_CLASSIFIER_MODEL without any
additional code path.

Mapping (confirmed from model config.json):
  normal            → INFO
  security_anomaly  → ERROR
  system_failure    → CRITICAL
  performance_issue → WARN
  network_anomaly   → WARN
  config_error      → ERROR
  hardware_issue    → CRITICAL

Keyword-based CRITICAL promotion and low-confidence DEBUG demotion apply on
top of the base mapping (same rules as the standard vocabulary).

11 new tests covering all 7 Hybrid-BERT labels, case-insensitivity, and
regression on standard-vocabulary labels. 372 tests passing total.

Note: custom loading code for the non-standard .pt checkpoint format is
explicitly out of scope — evaluate better-packaged HF alternatives first
(see #41 for candidate list).

Closes: #41
pyr0ball closed this pull request 2026-06-01 20:02:52 -07:00

Reference: Circuit-Forge/turnstone#62