Adds scripts/gen_corpus.py that produces realistic-but-artificial log
files across all four supported formats (journald JSON, docker envelope,
qBittorrent hotio, AVCX plaintext). Output feeds directly into
glean_corpus.py for demo environments and parser regression tests with
no production data required.
- Seed-based RNG with independent per-source sub-streams (same seed =
same sequence for each file regardless of source count changes)
- Controllable time range, event density, and error injection rate
- Severity distribution mirrors real infrastructure (70% INFO, ~6% ERROR,
~2% CRITICAL) with adjustable boost via --error-rate
- 17 tests covering output structure, reproducibility, format correctness,
parser round-trip, and CLI acceptance criteria
Also fixes a latent bug in app/glean/plaintext.py: ISO 8601 timestamps
were silently failing to parse because the T separator was normalised to
space in the input string but the strptime format string still contained T.
Fix: apply the same normalisation to the format before calling strptime.
Closes: #46
Second-pass cybersec classifier using DeBERTa-v3-base-mnli (already
cached — no download required). Runs after each anomaly scoring pass on
entries flagged by the anomaly scorer or with pattern matches.
Architecture:
- app/services/cybersec.py: zero-shot-classification pipeline with 5
cybersec candidate labels (auth failure, privilege escalation, network
intrusion, malware, data exfiltration). Writes ml_score/ml_label/
ml_scored_at to log_entries; inserts high-confidence hits into
detections with scorer='cybersec'.
- app/tasks/cybersec_scorer.py: async background task (same shape as
anomaly_scorer.py).
- REST: GET/POST /turnstone/api/cybersec/status|run|detections.
GET /turnstone/api/anomaly/detections now accepts scorer= filter.
Schema: ml_score, ml_label, ml_scored_at added to log_entries; scorer
column added to detections (idempotent migrations + DDL for both SQLite
and Postgres).
UI: Security Alerts view gains Source dropdown (All / Anomaly / Cybersec)
and cybersec scorer status badge. Label dropdown split into optgroups.
Deployment: TURNSTONE_CYBERSEC_MODEL/DEVICE/THRESHOLD vars added to
.env.example, docker-compose.yml, docker-standalone.sh.
Tests: 10 new tests — no model, no eligible entries, scoring, detection
creation, normal label suppression, threshold filtering, pattern-tag
filtering, idempotency, list filtering, scorer column filter.
416/416 passing.
Closes: #9
- Add app/services/anomaly.py: batch scorer using HF text-classification
pipeline; rewrites anomaly_score/anomaly_label/anomaly_scored_at on
log_entries; inserts high-confidence hits into detections table
- Add app/tasks/anomaly_scorer.py: background task (same shape as
glean_scheduler); triggered after each glean cycle when
TURNSTONE_ANOMALY_MODEL is set
- DB schema: add anomaly_score/anomaly_label/anomaly_scored_at columns to
log_entries (idempotent ALTER TABLE migration); add detections table
- Wire scorer into scheduler_loop and glean_scheduler.run_once; no-op when
model env var is empty (safe to leave unconfigured)
- REST endpoints: GET/POST /api/anomaly/status, /api/anomaly/run,
GET /api/anomaly/detections, POST /api/anomaly/detections/{id}/acknowledge
- Reuses Hybrid-BERT label map from diagnose/classifier.py; works with any
HF text-classification model
- 12 new tests; 406/406 passing
Closes: #10
- Add app/db/ abstraction layer: Backend enum, DbConn wrapper,
dialect helper (q() for ? vs %s paramstyle), get_conn(), tenant_id()
- Auto-detect backend from DATABASE_URL; SQLite remains default when
unset — no config change for local deployments
- Add tenant_id column to all three logical DBs (main, context, incidents);
idempotent ALTER TABLE migration runs before schema scripts on existing DBs
- All INSERTs inject tenant_id; SELECTs use (tenant_id = ? OR tenant_id = '')
for backward compat with pre-namespacing rows
- Add docker-compose.yml with named volume turnstone_pgdata (survives rebuilds)
and optional external Postgres support via DATABASE_URL override
- Add scripts/migrate_sqlite_to_postgres.py — one-shot idempotent migration
for existing SQLite data; ON CONFLICT DO NOTHING for safe re-runs
- Fix SSH glean path in pipeline.py to use ensure_schema + get_conn
(was still using raw sqlite3.connect + old _SCHEMA without tenant_id)
- Fix FTS5 JOIN ambiguity: qualify repeat_count as f.repeat_count in search
- Update all tests to use ensure_*_schema fixtures; add row_factory where needed
- 394/394 tests passing
Closes: #42
Closes: #50
Adds a domain: field to the pattern taxonomy and surfaces per-domain
hit counts in diagnose summaries for faster triage.
Changes:
- LogPattern gains domain: str = "" (backward-compatible default)
- load_patterns() reads domain from YAML via p.get("domain", "")
- All 42 patterns in default.yaml annotated across 10 domains:
service_health | networking | auth | storage | memory |
kernel | power | web_proxy | media | gpu
- _pattern_domain dict built at startup from compiled patterns
- _domain_counts() helper: maps matched_patterns tags to domains,
counts hits per domain across a result set
- diagnose POST: summary includes by_domain: {domain: count}
- diagnose stream: summary SSE event includes by_domain when
pattern_domain is provided (passed from rest.py at startup)
- /api/search gains ?domain= filter: post-filters results to entries
whose matched_patterns include at least one tag in the given domain
Test fixtures: patch _pattern_domain={} and CONTEXT_DB_PATH in
test_blocklist_endpoints.py and test_glean_tautulli.py (worktree
has no data/ dir; same fix as feat/60-incidents-db).
372 tests passing.
Closes: #32
Adds late-fusion hybrid search to Turnstone's log retrieval layer:
hybrid_score = 0.6 * bm25_normalized + 0.4 * cosine_similarity
Implementation:
- _bm25_search() extracts the existing FTS5 BM25 path as a named helper
- _hybrid_search() fetches an oversized BM25 candidate pool (5x limit,
min 100), embeds the query and each candidate text in-process via the
existing embeddings service, normalizes BM25 rank to [0,1], combines
with cosine similarity, and re-ranks
- search() gets semantic=False param that dispatches to _hybrid_search()
when True; pure BM25 remains the default for all existing call sites
- diagnose_stream() enables semantic=True so symptom-based queries
("database connection failed") surface semantically equivalent entries
("ECONNREFUSED", "backend gone away", "max retries exceeded")
- /api/search REST endpoint exposes ?semantic=true query param
Graceful degradation: falls back silently to pure BM25 when the embedding
backend is unavailable (EMBEDDING_AVAILABLE=False) or when embed_batch
raises an exception. No new infra — in-process numpy cosine, no vector DB.
11 new tests: BM25 helper, hybrid re-ranking, fallback paths, dispatcher.
372 + 11 = 383 tests passing.
Closes: #15
Adds _HYBRID_BERT_LABEL_MAP to translate the 7-class output vocabulary of
krishnas4415/log-anomaly-detection-models (Hybrid-BERT, MIT) to Turnstone
SeverityLabel. _map_label now checks the Hybrid-BERT map before the standard
map so either model family works via TURNSTONE_CLASSIFIER_MODEL without any
additional code path.
Mapping (confirmed from model config.json):
normal → INFO
security_anomaly → ERROR
system_failure → CRITICAL
performance_issue → WARN
network_anomaly → WARN
config_error → ERROR
hardware_issue → CRITICAL
Keyword-based CRITICAL promotion and low-confidence DEBUG demotion apply on
top of the base mapping (same rules as the standard vocabulary).
11 new tests covering all 7 Hybrid-BERT labels, case-insensitivity, and
regression on standard-vocabulary labels. 372 tests passing total.
Note: custom loading code for the non-standard .pt checkpoint format is
explicitly out of scope — evaluate better-packaged HF alternatives first
(see #41 for candidate list).
Closes: #41
FTS5 bulk-insert write locks starved the incident API and bundle endpoints
during log bursts (sonarr/radarr, high-volume docker sources). Fix mirrors
the context_facts split (context -> turnstone-context.db):
- Add INCIDENTS_DB_PATH / TURNSTONE_INCIDENTS_DB env var in rest.py
- Add _INCIDENTS_SCHEMA, ensure_incidents_schema(), and
migrate_incidents_to_dedicated_db() in glean/pipeline.py
- Stub out incidents/received_bundles/sent_bundles in _SCHEMA (no-op
CREATE IF NOT EXISTS) so legacy single-file deployments still open
- Thread incidents_db_path through diagnose_stream -> run_pipeline ->
FalsePositiveSuppressor.suppress -> _fetch_resolved_incidents
- One-shot migration on startup: copy existing rows from main DB to
incidents DB via INSERT OR IGNORE (idempotent, safe to re-run)
- Fix test_blocklist_endpoints fixtures to patch CONTEXT_DB_PATH and
INCIDENTS_DB_PATH alongside DB_PATH (worktree has no data/ dir)
372 tests passing.
Closes: #60
Was suppressing when novelty_score < 0.85 (i.e. similarity > 0.15), which
would suppress nearly every hypothesis once embeddings are active.
Now suppresses when max_sim >= similarity_threshold (0.85), meaning only
hypotheses that are 85%+ similar to a resolved incident are suppressed.
Also renames suppress_threshold → similarity_threshold for clarity and
adds a borderline boundary test (0.85 suppressed, 0.84 not suppressed).
Closes: #29
- Implements FalsePositiveSuppressor using embedding cosine similarity
- Lazy corpus embedding via get_embedder() with module-level cache keyed by db_path
- Cache invalidated automatically when the resolved incident corpus changes
- Suppresses hypotheses with novelty_score below configurable threshold (default 0.85)
- Full fallback path (novelty=1.0, no suppression) when model_id empty, embedding
service unavailable, or no resolved incidents found in DB
- Graceful handling of missing incidents table and DB query failures
- Numpy bool_ leakage prevented by explicit float()/bool() coercion at assignment
- Pure-Python cosine fallback for environments without numpy
- 9 new tests (all mocked, no real model downloads): passthrough, suppress, no-suppress,
empty list, ranking, empty corpus, DB failure, service unavailable, cache invalidation
- 350 total tests passing (341 pre-existing + 9 new)
Closes: #29
- Add _coerce_float() module-level helper: catches TypeError/ValueError from
non-numeric LLM output (e.g. 'high', 'N/A') and returns a caller-supplied
default instead of raising.
- Replace float(item.get('confidence', 0.5)) with
_coerce_float(item.get('confidence'), 0.5) in _parse_response.
- Guard supporting_cluster_ids: tuple(item.get(...) or []) so a JSON null
from the LLM does not cause TypeError('NoneType is not iterable').
- runbook_refs is hardcoded as () and not sourced from LLM output; no change
needed there.
- Add test_non_numeric_confidence_uses_default (Test 10) to cover the 'high'
string case: asserts no exception and confidence == 0.5.
- 341 tests passing (+1).
Closes: #29
Three-path classification: ML (transformers pipeline, lazy singleton) →
pattern_tags (YAML pattern severity dict) → regex (detect_severity).
- Path A: HF text-classification pipeline loaded lazily on first classify()
call via module-level singleton; shim promotes ERROR+keyword hits to CRITICAL
and demotes low-confidence INFO to DEBUG.
- Path B: maps cluster.pattern_tags through the loaded pattern severity dict;
picks the highest severity across matching tags.
- Path C: falls back to detect_severity() regex scan on representative_text;
defaults to INFO when no keyword matches.
- Pattern file resolved from constructor arg or TURNSTONE_PATTERNS env var
(mirrors app/rest.py convention).
- No crash when transformers is not installed; ImportError on per-cluster ML
inference triggers clean per-cluster fallback to pattern_tags/regex.
- ClassifiedTimeline.classifier_used reflects the primary session path.
Tests (10 new, 328 total, all passing):
- ML ERROR, CRITICAL promotion, DEBUG demotion, WARNING→WARN
- pattern_tags resolution from YAML fixture
- regex ERROR detection and INFO default
- ImportError clean fallback
- empty timeline no-crash
- ClassifiedTimeline FrozenInstanceError on mutation
Closes: #29
- New app/services/embeddings.py: TURNSTONE_EMBED_* env vars, multi-backend support
- embedder.py delegates to service layer; re-exports EMBEDDING_AVAILABLE for compat
- retriever.py updated to use service layer
- Test coverage updated in tests/context/test_embedder.py
Adds SSH-based log collection from remote hosts via Paramiko.
One SSH connection per host, multiple log types per connection.
New files:
- app/glean/ssh.py: SSHTransport context manager + command builders
for journald, syslog, plaintext, and docker log types
- tests/test_glean_ssh.py: 18 tests for transport layer (all mocked)
- tests/test_glean_pipeline_ssh.py: 15 tests for pipeline integration
Pipeline changes (app/glean/pipeline.py):
- glean_sources() now splits sources into local-file and SSH categories
- SSH sources use transport: ssh + glean: list schema in sources.yaml
- _glean_ssh_source(): one SSHTransport per host, N commands per connection
- _stream_and_write(): SSHCommandError caught per-item so one bad
command does not abort the rest of the host's glean items
- SSHConnectionError skips the entire host with a warning log
SSH source schema (sources.yaml):
- id: rack01
transport: ssh
host: 192.168.1.10
user: admin
key_path: ~/.ssh/id_ed25519
glean:
- type: journald
args: [--since, 2 hours ago]
- type: syslog
path: /var/log/syslog
- type: plaintext
path: /var/log/app/error.log
- type: docker
containers: [myapp, nginx]
Key design decisions:
- Key-based auth only (no password prompts in daemon context)
- exit-status check fires after all stdout lines yielded; callers
drain the iterator to trigger it
- Local file sources path unchanged; SSH sources co-exist in same yaml
- Docker multi-container: one exec_stream call per container,
source_id scoped as host_id/type/container_name
Remaining for #22: REST endpoint, SourcesView UI, sources.yaml docs.
285 → 285 tests passing (33 new SSH tests).
Add blocklist candidate listing, scan trigger, status update,
push/unblock to Pi-hole, and connection test endpoints.
Add pihole_url/version/api_key and router_source_ids/device_names
fields to SettingsBody and prefs handling in patch_settings.
Add PiholeClient.__post_init__ validation so 503 fires naturally
when url/api_key are unconfigured (mock-safe: bypassed in tests).
PiholeClient dataclass supporting both Pi-hole v5 (PHP /admin/api.php)
and v6 (REST /api/) with public block/unblock/test_connection methods.
9 tests covering both API versions, auth flow, and error handling.
Adds patterns/telemetry.yaml with 6 rule groups (samsung, belkin, roku, lg, amazon, advertising).
Adds app/services/blocklist.py with TelemetryRule and BlocklistCandidate dataclasses, load_telemetry_rules(), and matches_telemetry() with exact and subdomain matching.
6 new TestTelemetry tests pass; 199 total passing.
Add blocklist_candidates table and indexes to _SCHEMA in pipeline.py.
Add TestSchema tests verifying table existence, column set, and status/hit_count defaults.
All 193 tests pass.
POST /turnstone/api/ingest/tautulli accepts Tautulli notification agent
payloads and stores them as log_entries under source 'tautulli'. Severity
maps error->CRITICAL, buffer->WARN, all others->None. Optional bearer token
auth via X-Tautulli-Token header + tautulli_token pref. FTS index rebuilt
as a background task after each write. 28 new tests, all passing.
All five parsers (plex, syslog, servarr, qbittorrent, plaintext) were
using .replace(tzinfo=timezone.utc) on naive datetimes parsed from log
files, which slaps a UTC label on what is actually local-time data.
On a UTC-7 system a 2pm entry was stored as 14:00Z instead of 21:00Z,
causing time-window searches to return zero results.
Fix: use .astimezone(timezone.utc) instead, which treats the naive
datetime as local time and converts correctly.
Tests updated to round-trip back to local time for assertion so they
pass on any timezone, not just UTC.
- Add context_block param to summarize() and thread it into _PROMPT_TEMPLATE
- Wire retrieve_context/format_context_block into diagnose_stream() before
log search; emit context SSE event (facts + chunks) to the client
- 3 new tests covering prompt injection and SSE event emission (155 total, all pass)
- Implement document type detection for yaml/json/markdown/text
- Extract service facts from docker-compose YAML (names, images, ports)
- Split text into overlapping word chunks (300-word default with 50-word overlap)
- Enforce 5 MB file size limit
- Comprehensive TDD test suite: 15 tests passing
Turnstone now calls /v1/chat/completions instead of Ollama's /api/generate.
This format works with both local Ollama (>=0.1.24) and a remote cf-orch
coordinator, enabling GPU-less nodes like Xander's to route diagnoses through
the cluster without any local model.
- llm.py: OpenAI-compat messages format, optional Bearer auth header
- diagnose.py: thread llm_api_key through the call chain
- rest.py: llm_api_key pref (default empty), SettingsBody field, passed to diagnose
- SettingsView.vue: API Key field, label updated from "Ollama URL" to "LLM Endpoint URL"
- tests: updated mocks for new response shape; added bearer token assertion test
Watermark-based batch export script (scripts/export_corpus.py) pushes up to 500
ERROR/CRITICAL entries and labeled incidents per run to AVOCET_CORPUS_ENDPOINT.
Uses SQLite rowid watermark (entry log) and ISO timestamp watermark (incidents).
Skips silently when AVOCET_CORPUS_ENDPOINT is not set. 19 tests. Closes turnstone#6.
- type: file uses tail -F (handles rotation) with auto-format detection
- _parse_lines dispatches to journald/servarr/qbit/caddy/syslog/plaintext
based on first-line format detection — same logic as batch ingest
- watch.yaml updated with file type docs and example-node-specific example
- scripts/journal-bridge.sh + .service written directly to example-node
Xander's watch.yaml covers: system-journal-live (via bridge file),
sonarr, radarr, lidarr, prowlarr, bazarr, qbittorrent, nzbget, tautulli
Adds background watcher that tails active log sources and ingests entries
in near-real-time, keeping the DB fresh without manual ingest runs.
- app/watch/watcher.py: Watcher + WatchSource using subprocess + select
loop; flushes every 10s or 100 lines; syncs FTS index every 3 flushes
- patterns/watch.yaml: declarative source config (journald/docker/podman)
- app/rest.py: lifespan context manager starts/stops watcher on app
startup/shutdown; GET /api/watch/status + POST /api/watch/reload
- web/src/views/DashboardView.vue: live/manual indicator chip + stale
banner copy adapts to whether live watching is active
- tests/test_watch_watcher.py: 16 tests covering config load, command
building, docker timestamp stripping, orchestrator lifecycle
Closes#4