Commit graph

25 commits

Author SHA1 Message Date
0013ae916d feat(blocklist): telemetry YAML list + loader + domain matcher
Adds patterns/telemetry.yaml with 6 rule groups (samsung, belkin, roku, lg, amazon, advertising).
Adds app/services/blocklist.py with TelemetryRule and BlocklistCandidate dataclasses, load_telemetry_rules(), and matches_telemetry() with exact and subdomain matching.
6 new TestTelemetry tests pass; 199 total passing.
2026-05-15 20:54:40 -07:00
bbb4605829 feat(blocklist): blocklist_candidates schema + tests
Add blocklist_candidates table and indexes to _SCHEMA in pipeline.py.
Add TestSchema tests verifying table existence, column set, and status/hit_count defaults.
All 193 tests pass.
2026-05-15 20:51:00 -07:00
42636673e8 fix: tautulli — hmac token compare, public pattern loader, startup cache, endpoint tests 2026-05-13 19:08:49 -07:00
9c5647bc68 fix: tautulli — entry_id collision on missing ts, token settings, test coverage 2026-05-13 19:04:07 -07:00
24dd4bc568 feat: Tautulli webhook ingest endpoint — plex events -> log_entries
POST /turnstone/api/ingest/tautulli accepts Tautulli notification agent
payloads and stores them as log_entries under source 'tautulli'. Severity
maps error->CRITICAL, buffer->WARN, all others->None. Optional bearer token
auth via X-Tautulli-Token header + tautulli_token pref. FTS index rebuilt
as a background task after each write. 28 new tests, all passing.
2026-05-13 18:41:03 -07:00
f64b834177 fix: ingestors treat naive log timestamps as local time, not UTC
All five parsers (plex, syslog, servarr, qbittorrent, plaintext) were
using .replace(tzinfo=timezone.utc) on naive datetimes parsed from log
files, which slaps a UTC label on what is actually local-time data.
On a UTC-7 system a 2pm entry was stored as 14:00Z instead of 21:00Z,
causing time-window searches to return zero results.

Fix: use .astimezone(timezone.utc) instead, which treats the naive
datetime as local time and converts correctly.

Tests updated to round-trip back to local time for assertion so they
pass on any timezone, not just UTC.
2026-05-13 18:16:33 -07:00
f361c86019 feat: optional sqlite-vec embedding pipeline for Paid-tier RAG 2026-05-13 16:32:57 -07:00
2c408907ac feat: inject environment context into diagnose pipeline and LLM prompt
- Add context_block param to summarize() and thread it into _PROMPT_TEMPLATE
- Wire retrieve_context/format_context_block into diagnose_stream() before
  log search; emit context SSE event (facts + chunks) to the client
- 3 new tests covering prompt injection and SSE event emission (155 total, all pass)
2026-05-13 16:29:26 -07:00
9c8c60e461 feat: wizard state machine — structured Q&A writes context facts and source config 2026-05-13 16:25:52 -07:00
9a4931b0ba feat: context retriever — keyword fact lookup and chunk search 2026-05-13 16:23:54 -07:00
70c8a7deea feat: doc upload adapter — writes facts, document, and chunks to context store 2026-05-13 16:21:55 -07:00
c62b0bb12a feat: context chunker — type detection, YAML extraction, text chunking
- Implement document type detection for yaml/json/markdown/text
- Extract service facts from docker-compose YAML (names, images, ports)
- Split text into overlapping word chunks (300-word default with 50-word overlap)
- Enforce 5 MB file size limit
- Comprehensive TDD test suite: 15 tests passing
2026-05-13 15:54:51 -07:00
dd977f0bf1 feat: context store — fact and document CRUD 2026-05-13 15:53:03 -07:00
bae889ddf2 feat: add context_facts, context_documents, context_chunks tables to schema 2026-05-13 15:51:19 -07:00
cae9cd7eee feat: switch LLM backend to OpenAI-compat; add cf-orch remote inference support
Turnstone now calls /v1/chat/completions instead of Ollama's /api/generate.
This format works with both local Ollama (>=0.1.24) and a remote cf-orch
coordinator, enabling GPU-less nodes like Xander's to route diagnoses through
the cluster without any local model.

- llm.py: OpenAI-compat messages format, optional Bearer auth header
- diagnose.py: thread llm_api_key through the call chain
- rest.py: llm_api_key pref (default empty), SettingsBody field, passed to diagnose
- SettingsView.vue: API Key field, label updated from "Ollama URL" to "LLM Endpoint URL"
- tests: updated mocks for new response shape; added bearer token assertion test
2026-05-12 12:58:38 -07:00
4f93c30c01 feat: periodic corpus export — push ERROR/CRITICAL entries and incidents to Avocet
Watermark-based batch export script (scripts/export_corpus.py) pushes up to 500
ERROR/CRITICAL entries and labeled incidents per run to AVOCET_CORPUS_ENDPOINT.
Uses SQLite rowid watermark (entry log) and ISO timestamp watermark (incidents).
Skips silently when AVOCET_CORPUS_ENDPOINT is not set. 19 tests. Closes turnstone#6.
2026-05-11 17:08:35 -07:00
4151c98f23 feat: add file tail source type; configure example-node watchers
- type: file uses tail -F (handles rotation) with auto-format detection
- _parse_lines dispatches to journald/servarr/qbit/caddy/syslog/plaintext
  based on first-line format detection — same logic as batch ingest
- watch.yaml updated with file type docs and example-node-specific example
- scripts/journal-bridge.sh + .service written directly to example-node

Xander's watch.yaml covers: system-journal-live (via bridge file),
sonarr, radarr, lidarr, prowlarr, bazarr, qbittorrent, nzbget, tautulli
2026-05-11 15:44:10 -07:00
3ebdd4aef0 feat: live watch mode — tail journald/docker/podman sources continuously (#4)
Adds background watcher that tails active log sources and ingests entries
in near-real-time, keeping the DB fresh without manual ingest runs.

- app/watch/watcher.py: Watcher + WatchSource using subprocess + select
  loop; flushes every 10s or 100 lines; syncs FTS index every 3 flushes
- patterns/watch.yaml: declarative source config (journald/docker/podman)
- app/rest.py: lifespan context manager starts/stops watcher on app
  startup/shutdown; GET /api/watch/status + POST /api/watch/reload
- web/src/views/DashboardView.vue: live/manual indicator chip + stale
  banner copy adapts to whether live watching is active
- tests/test_watch_watcher.py: 16 tests covering config load, command
  building, docker timestamp stripping, orchestrator lifecycle

Closes #4
2026-05-11 15:34:13 -07:00
0a4d877ba7 feat: LLM reasoning layer — Ollama summarization on diagnose results 2026-05-11 11:35:07 -07:00
dd48ed1369 fix: correct time_detected logic, immutable sort pattern, add diagnose() test 2026-05-11 09:08:24 -07:00
51c3b769aa feat: add diagnose service with NL time extraction via dateparser
Adds app/services/diagnose.py with parse_time_window() (dateparser-backed
NL time phrase extraction with 60-min fallback) and diagnose() (layered
FTS + window search returning severity/source summary). Includes 5 TDD tests.
2026-05-11 09:04:50 -07:00
9ec60ea7ff feat: syslog and dmesg parsers with graceful journald fallback
- Add syslog.py — RFC 3164 parser for /var/log/syslog, /var/log/messages,
  auth.log, kern.log; ident prepended to message text for searchability
- Add dmesg_log.py — handles both relative [secs.usecs] and human-readable
  [Dow Mon DD HH:MM:SS YYYY] formats; relative timestamps preserved as raw
- Wire both into pipeline.py auto-detection (before plaintext fallback)
- Update export_journal.sh: checks for journalctl availability, falls back
  gracefully on non-systemd systems; adds dmesg -T export (falls back to
  plain dmesg on older kernels)
- Add syslog entries (commented) + dmesg source to sources.yaml
- 30 tests covering both parsers (detection + parse correctness)
2026-05-11 06:57:38 -07:00
50bae0d393 fix: support hotio qBittorrent 5.x log format (N/I/W/C single-char level)
Xander's ghcr.io/hotio/qbittorrent:latest container uses a different format
than the classic GUI build: `(N) 2026-04-26T03:32:59 - message` with a
single-char level code before an ISO timestamp, not inside parens.

Added _HOTIO_RE alongside _CLASSIC_RE; unified via _match_line() helper so
parse() loop is unchanged. 28 tests passing, both formats covered.
2026-05-11 05:55:40 -07:00
2d148e4f9e feat: qBittorrent log ingestor with 8 diagnostic patterns
Adds app/ingest/qbittorrent.py — auto-detected by the pipeline on the
(YYYY/MM/DD HH:MM:SS) timestamp fingerprint. Handles both slash and dash
date separators, optional [Warning|Critical] bracket levels, and
multi-line continuations (Qt stack traces).

patterns/default.yaml: 8 new qbit_ patterns covering tracker errors,
port bind failures, disk errors, hash check failures, peer bans, download
completion, ratio limits, and session errors.

manage.sh: ingest-qbit [HOST] command mirrors ingest-plex — probes known
default log paths locally or via SSH, ingests, restarts server.

14 tests covering format detection, severity mapping, multiline handling,
and timestamp normalization.
2026-05-10 08:21:16 -07:00
64c3996aa1 feat: initial Turnstone POC — ingest, FTS search, MCP server
Ingest pipeline (journald / Caddy / Docker-wrapped formats) with
per-source state tracking (repeat dedup, out-of-order detection),
named pattern tagging at ingest time, and idempotent SHA1-keyed writes.

FTS5 search layer with porter stemmer, severity/source/pattern/time
filters, and BM25 ranking. MCP server (FastMCP stdio) with three tools:
search_logs, diagnose, list_log_sources — compatible with both
Claude Code and Copilot CLI.

WAL mode enabled on all connections. FTS index auto-built after ingest.
MCP configs included for Claude Code (.mcp.json) and Copilot CLI
(.github/copilot/mcp.json).
2026-05-08 12:12:34 -07:00