Commit graph

7 commits

Author SHA1 Message Date
7c275e4f05 feat: domain-view mapping for patterns and diagnose output (#32)
Adds a domain: field to the pattern taxonomy and surfaces per-domain
hit counts in diagnose summaries for faster triage.

Changes:
- LogPattern gains domain: str = "" (backward-compatible default)
- load_patterns() reads domain from YAML via p.get("domain", "")
- All 42 patterns in default.yaml annotated across 10 domains:
    service_health | networking | auth | storage | memory |
    kernel | power | web_proxy | media | gpu
- _pattern_domain dict built at startup from compiled patterns
- _domain_counts() helper: maps matched_patterns tags to domains,
  counts hits per domain across a result set
- diagnose POST: summary includes by_domain: {domain: count}
- diagnose stream: summary SSE event includes by_domain when
  pattern_domain is provided (passed from rest.py at startup)
- /api/search gains ?domain= filter: post-filters results to entries
  whose matched_patterns include at least one tag in the given domain

Test fixtures: patch _pattern_domain={} and CONTEXT_DB_PATH in
test_blocklist_endpoints.py and test_glean_tautulli.py (worktree
has no data/ dir; same fix as feat/60-incidents-db).

372 tests passing.

Closes: #32
2026-06-01 19:57:16 -07:00
ee39ffbd44 fix(glean): add timeout=30s to all pipeline DB connections; add --force flag; new patterns
pipeline.py:
- Add timeout=30.0 to all sqlite3.connect() calls (5 total).
  Previously only ensure_context_schema() had it. The main glean
  writers would fail immediately under lock contention from the live
  watcher or concurrent manual glean runs.

glean_corpus.py:
- Add --force flag (passed through to glean_sources/glean_file/glean_dir).
  Without it, unchanged-fingerprint files were silently skipped even
  after pattern updates. Use after editing patterns/default.yaml.

patterns/default.yaml:
- Add 9 new patterns for Muninn / cluster-wide coverage:
    vpn_tunnel_fail     WireGuard/tunnel service failures
    vpn_handshake       WireGuard peer handshake events
    dns_degraded        systemd-resolved DNS fallback/degradation
    nvidia_api_mismatch NVIDIA kernel module vs userspace mismatch
    nvidia_xid          NVIDIA Xid GPU hardware faults
    nvidia_gpu_reset    NVIDIA GPU reset / NVLink faults
    acpi_error          ACPI firmware _DSM evaluation failures
    thermal_throttle    CPU/GPU thermal throttling / RAPL unavailable
    undervoltage        PSU undervoltage / brownout events
- Sync from /devl/turnstone-cluster/patterns/default.yaml (authoritative
  live copy updated first; repo copy was stale)
2026-05-26 22:36:45 -07:00
aa80f307fe refactor: rename ingest → glean throughout codebase
Renames the app/ingest/ package to app/glean/ and updates all
references across Python modules, shell scripts, Vue components,
tests, and documentation.

Intentionally preserved:
- SQLite column name ingest_time (avoids schema migration)
- RetrievedEntry.ingest_time field (maps to the column above)
- Any public-facing JSON keys that reference ingest_time

Changes by category:
- app/ingest/ → app/glean/ (full package move, all parsers)
- app/tasks/ingest_scheduler.py → app/tasks/glean_scheduler.py
- scripts/ingest_corpus.py → scripts/glean_corpus.py
- tests/test_ingest_*.py → tests/test_glean_*.py
- Docstrings, log messages, comments: ingest → glean
- Env var: TURNSTONE_INGEST_INTERVAL → TURNSTONE_GLEAN_INTERVAL
- Shell scripts: glean.log, glean_corpus.py references
- README.md: multi-source ingest → multi-source glean
- .env.example: updated env var name
- patterns/: new diagnostic patterns from 2026-05-20 SSH incident
  (service_crash_loop, pkg_daemon_restart, ssh_forward_conflict)
- SourcesView.vue: pipeline label updated
- All test import paths updated to app.glean.*

285 tests passing.
2026-05-20 23:02:55 -07:00
1b6482701c feat: journald export + system failure patterns
- Add scripts/export_journal.sh — dumps recent journal (priority 0-5,
  20min window) to /opt/turnstone/data/journal-export.jsonl; idempotent
  via entry_id deduplication so overlap is safe
- Add system-journal source to sources.yaml pointing at the export file
- Add 9 system-level patterns to default.yaml:
  systemd_fail, oom_kill, disk_hw_error, fs_error, kernel_error,
  ssh_brute, container_crash, smart_error, nfs_error
2026-05-11 06:54:42 -07:00
2d148e4f9e feat: qBittorrent log ingestor with 8 diagnostic patterns
Adds app/ingest/qbittorrent.py — auto-detected by the pipeline on the
(YYYY/MM/DD HH:MM:SS) timestamp fingerprint. Handles both slash and dash
date separators, optional [Warning|Critical] bracket levels, and
multi-line continuations (Qt stack traces).

patterns/default.yaml: 8 new qbit_ patterns covering tracker errors,
port bind failures, disk errors, hash check failures, peer bans, download
completion, ratio limits, and session errors.

manage.sh: ingest-qbit [HOST] command mirrors ingest-plex — probes known
default log paths locally or via SSH, ingests, restarts server.

14 tests covering format detection, severity mapping, multiline handling,
and timestamp normalization.
2026-05-10 08:21:16 -07:00
8db8810667 feat: plex EAE watchdog and plex_eae_failure pattern
Add plex_eae_failure pattern to default.yaml targeting the EasyAudioEncoder
crash signature (EAE timeout + I/O error pair, 5s cadence). Pattern fires
when EAE's WAV handoff files stop appearing in the pms temp directory.

Add watch_plex.py: tail-based watchdog that counts EAE timeout events and
auto-restarts plexmediaserver after N consecutive hits (default 3, ~15s of
failure). Includes cooldown, dry-run mode, and a systemd unit template.
2026-05-08 13:41:34 -07:00
64c3996aa1 feat: initial Turnstone POC — ingest, FTS search, MCP server
Ingest pipeline (journald / Caddy / Docker-wrapped formats) with
per-source state tracking (repeat dedup, out-of-order detection),
named pattern tagging at ingest time, and idempotent SHA1-keyed writes.

FTS5 search layer with porter stemmer, severity/source/pattern/time
filters, and BM25 ranking. MCP server (FastMCP stdio) with three tools:
search_logs, diagnose, list_log_sources — compatible with both
Claude Code and Copilot CLI.

WAL mode enabled on all connections. FTS index auto-built after ingest.
MCP configs included for Claude Code (.mcp.json) and Copilot CLI
(.github/copilot/mcp.json).
2026-05-08 12:12:34 -07:00