Commit graph

112 commits

Author SHA1 Message Date
783edbe496 feat: wizard state machine — structured Q&A writes context facts and source config 2026-05-13 16:25:52 -07:00
ef8d164188 feat: context retriever — keyword fact lookup and chunk search 2026-05-13 16:23:54 -07:00
ebbb1af32d feat: doc upload adapter — writes facts, document, and chunks to context store 2026-05-13 16:21:55 -07:00
b23a60a602 feat: context chunker — type detection, YAML extraction, text chunking
- Implement document type detection for yaml/json/markdown/text
- Extract service facts from docker-compose YAML (names, images, ports)
- Split text into overlapping word chunks (300-word default with 50-word overlap)
- Enforce 5 MB file size limit
- Comprehensive TDD test suite: 15 tests passing
2026-05-13 15:54:51 -07:00
54c756dfe8 feat: context store — fact and document CRUD 2026-05-13 15:53:03 -07:00
7461953021 feat: add context_facts, context_documents, context_chunks tables to schema 2026-05-13 15:51:19 -07:00
625b55324a fix: a11y foundation — text-dim contrast, focus-visible, prefers-reduced-motion 2026-05-13 15:48:12 -07:00
784a4072b4 feat: SSE streaming diagnose, severity filter pills, per-source-cap search
- diagnose_stream() async generator: status/summary/entries/reasoning/done events
- POST /api/diagnose/stream SSE endpoint wired in rest.py
- entries_in_window() gains per_source_cap to prevent high-volume sources crowding results
- QuickCapture: severity filter pills, filtered entries view, pipeline status spinner
- llm.py: remove overly broad HTTPStatusError re-raise
2026-05-13 15:45:35 -07:00
b70c89e7b5 feat: try cf-orch task endpoint first; fall back to direct model call
POST /api/inference/task with product=turnstone task=log_analysis routes to
the security reasoning model assigned in cf-orch. Falls back to the OpenAI-
compat /v1/chat/completions path on 404 (no assignment) or if the task
endpoint is absent (local instances, example-node).
2026-05-13 08:20:29 -07:00
caa85b3d30 feat: source-scoped diagnose; multi-node Docker log collection
- Diagnose: add source_filter param threaded through entries_in_window,
  search, _diagnose, and DiagnoseRequest — clicking diagnose on a
  dashboard source now scopes both keyword and window hits to that source
- QuickCapture: read route.query.source; show scope badge with clear ✕;
  auto-run when source param is present without a query
- DashboardView: pass source= (not q=) when navigating to diagnose
- collect_cluster_logs.sh: auto-discover Docker containers on all nodes
  (Heimdall non-watched, Navi, Strahl via SSH); collect Cass Plex logs
  via SSH; write to per-node dirs for directory-mode ingest
- turnstone-cluster.service: add --reload for hot-reload during dev
2026-05-13 08:10:42 -07:00
53fa350adf fix: correct cf-orch port to 7700; fix relative time parsing in diagnose; fix syslog PRI prefix 2026-05-13 05:33:41 -07:00
c1fa5ef70d fix: write ingest log to data dir (alan lacks /var/log write access) 2026-05-13 05:20:56 -07:00
e9faabc07f fix: run collect service as alan user; call ingest directly without Docker 2026-05-13 05:17:43 -07:00
a4ec5a6951 feat: add UDP syslog receiver for network device log collection
scripts/syslog_receiver.py: asyncio UDP server listening on port 5140,
appends raw syslog lines to network-syslog.txt for the Turnstone live
watcher to tail. Requires no root — port 5140 is non-privileged.

scripts/turnstone-syslog-receiver.service: systemd unit for auto-start.

app/ingest/syslog.py: strip optional RFC 3164 <PRI> prefix before
parsing so network-forwarded syslog (OpenWRT logd, Arista EOS, etc.)
is handled correctly without the PRI value breaking the regex.
2026-05-13 04:58:51 -07:00
d769be04d4 refactor: use live watcher + systemd timer instead of cron for cluster ingest
Local Heimdall sources (journal, Docker containers, network syslog) are now
tailed continuously by the built-in watcher via watch.yaml — no periodic
collection needed for those.

SSH collection of remote node journals is now handled by a systemd timer
(turnstone-cluster-collect.service/.timer) instead of cron.
collect_cluster_logs.sh simplified to only SSH-collect remote nodes and
trigger ingest directly.

docker-cluster.sh updated to mount:
  - /var/run/docker.sock (so watcher can run docker logs -f)
  - /run/systemd/journal (so watcher can run journalctl -f)
  - /devl/turnstone-cluster/patterns/ (cluster-specific watch.yaml)
2026-05-13 04:55:25 -07:00
1e8a118f71 feat: add cluster-wide log collection and Heimdall Turnstone deployment
- scripts/collect_cluster_logs.sh: collects journals from Heimdall (local),
  Navi, Sif, Cass, Strahl (SSH), Docker services, and a network syslog
  placeholder; designed for 15-min cron before ingest
- patterns/sources-cluster.yaml: ingest sources config for the full
  CircuitForge cluster stack; points at /devl/turnstone-cluster/data/
- scripts/docker-cluster.sh: Docker deployment for Heimdall cluster monitor;
  seeds preferences.json with cf-orch coordinator URL (localhost:7701) so
  LLM summarization works on first ingest without manual UI config
2026-05-12 18:53:58 -07:00
a21c158917 fix: increase LLM summarize timeout to 120s for remote cf-orch routing
20s was too tight for first-request model swaps in Ollama (model cold load
can take 30-60s). 120s matches coordinator inference timeout.
2026-05-12 18:27:52 -07:00
bb94f8251c fix: podman-standalone.sh builds image and regenerates systemd unit on each run
Running the script after a git pull previously left a stale image in place.
Now: build → run → regenerate systemd unit → daemon-reload, all in one step.
2026-05-12 16:18:37 -07:00
7d46314e86 feat: switch LLM backend to OpenAI-compat; add cf-orch remote inference support
Turnstone now calls /v1/chat/completions instead of Ollama's /api/generate.
This format works with both local Ollama (>=0.1.24) and a remote cf-orch
coordinator, enabling GPU-less nodes like Contributor2's to route diagnoses through
the cluster without any local model.

- llm.py: OpenAI-compat messages format, optional Bearer auth header
- diagnose.py: thread llm_api_key through the call chain
- rest.py: llm_api_key pref (default empty), SettingsBody field, passed to diagnose
- SettingsView.vue: API Key field, label updated from "Ollama URL" to "LLM Endpoint URL"
- tests: updated mocks for new response shape; added bearer token assertion test
2026-05-12 12:58:38 -07:00
afcac6ff05 feat: periodic corpus export — push ERROR/CRITICAL entries and incidents to Avocet
Watermark-based batch export script (scripts/export_corpus.py) pushes up to 500
ERROR/CRITICAL entries and labeled incidents per run to AVOCET_CORPUS_ENDPOINT.
Uses SQLite rowid watermark (entry log) and ISO timestamp watermark (incidents).
Skips silently when AVOCET_CORPUS_ENDPOINT is not set. 19 tests. Closes turnstone#6.
2026-05-11 17:08:35 -07:00
85785a3f76 chore: add update.sh deploy script; gitignore patterns/watch.yaml
update.sh pulls a named branch (default: main), preserves the local
watch.yaml around the pull, rebuilds the image, restarts the service,
and polls health until ready.

Usage: sudo bash /opt/turnstone/scripts/update.sh [branch]

patterns/watch.yaml is site-specific config — gitignored so host
customizations survive git pulls. The template is preserved in git
history (feat/live-watch) for reference.
2026-05-11 16:07:07 -07:00
bb8206d5a1 Merge pull request 'feat: live watch mode — tail journald/docker/podman continuously (#4)' (#16) from feat/live-watch into main 2026-05-11 15:45:30 -07:00
9cc8bf3662 feat: add file tail source type; configure example-node watchers
- type: file uses tail -F (handles rotation) with auto-format detection
- _parse_lines dispatches to journald/servarr/qbit/caddy/syslog/plaintext
  based on first-line format detection — same logic as batch ingest
- watch.yaml updated with file type docs and example-node-specific example
- scripts/journal-bridge.sh + .service written directly to example-node

Contributor2's watch.yaml covers: system-journal-live (via bridge file),
sonarr, radarr, lidarr, prowlarr, bazarr, qbittorrent, nzbget, tautulli
2026-05-11 15:44:10 -07:00
3fd81e5ab1 feat: live watch mode — tail journald/docker/podman sources continuously (#4)
Adds background watcher that tails active log sources and ingests entries
in near-real-time, keeping the DB fresh without manual ingest runs.

- app/watch/watcher.py: Watcher + WatchSource using subprocess + select
  loop; flushes every 10s or 100 lines; syncs FTS index every 3 flushes
- patterns/watch.yaml: declarative source config (journald/docker/podman)
- app/rest.py: lifespan context manager starts/stops watcher on app
  startup/shutdown; GET /api/watch/status + POST /api/watch/reload
- web/src/views/DashboardView.vue: live/manual indicator chip + stale
  banner copy adapts to whether live watching is active
- tests/test_watch_watcher.py: 16 tests covering config load, command
  building, docker timestamp stripping, orchestrator lifecycle

Closes #4
2026-05-11 15:34:13 -07:00
3c758d3626 Merge pull request 'feat: LLM reasoning, severity overrides, dashboard freshness' (#14) from feat/llm-reasoning into main 2026-05-11 13:00:52 -07:00
c12cc6d68a feat: severity overrides + last_ingested timestamp on dashboard 2026-05-11 13:00:11 -07:00
b3c02eebf7 docs: add README — diagnostic log intelligence layer 2026-05-11 12:57:32 -07:00
0882083755 feat: LLM reasoning layer — Ollama summarization on diagnose results 2026-05-11 11:35:07 -07:00
18d80cbfad Merge pull request 'feat: frictionless incident capture' (#13) from feat/frictionless-capture into main 2026-05-11 09:53:25 -07:00
728134321f feat: wire QuickCaptureBar/FAB into App.vue, add Settings route 2026-05-11 09:25:49 -07:00
426b4207eb feat: add SettingsView with entry point style toggle 2026-05-11 09:23:27 -07:00
184ad07f80 refactor: remove incident form from IncidentsView — now in DiagnoseView Structured tab 2026-05-11 09:22:25 -07:00
bf276aeb50 feat: add Quick/Structured tabs to DiagnoseView 2026-05-11 09:20:47 -07:00
3b00ad8b47 fix: surface save errors in QuickCapture via error.value 2026-05-11 09:19:38 -07:00
ba6995e279 feat: add QuickCaptureBar and QuickCaptureFab entry point components 2026-05-11 09:16:33 -07:00
6f17f74fe1 feat: add QuickCapture and IncidentForm components 2026-05-11 09:15:25 -07:00
05ad314ed5 feat: add POST /api/diagnose and GET/PATCH /api/settings endpoints 2026-05-11 09:10:58 -07:00
ca0cb1361e fix: correct time_detected logic, immutable sort pattern, add diagnose() test 2026-05-11 09:08:24 -07:00
abd142addf feat: add diagnose service with NL time extraction via dateparser
Adds app/services/diagnose.py with parse_time_window() (dateparser-backed
NL time phrase extraction with 60-min fallback) and diagnose() (layered
FTS + window search returning severity/source summary). Includes 5 TDD tests.
2026-05-11 09:04:50 -07:00
8840809a85 chore: add .worktrees to gitignore 2026-05-11 08:26:35 -07:00
346ea6e0c6 feat: syslog and dmesg parsers with graceful journald fallback
- Add syslog.py — RFC 3164 parser for /var/log/syslog, /var/log/messages,
  auth.log, kern.log; ident prepended to message text for searchability
- Add dmesg_log.py — handles both relative [secs.usecs] and human-readable
  [Dow Mon DD HH:MM:SS YYYY] formats; relative timestamps preserved as raw
- Wire both into pipeline.py auto-detection (before plaintext fallback)
- Update export_journal.sh: checks for journalctl availability, falls back
  gracefully on non-systemd systems; adds dmesg -T export (falls back to
  plain dmesg on older kernels)
- Add syslog entries (commented) + dmesg source to sources.yaml
- 30 tests covering both parsers (detection + parse correctness)
2026-05-11 06:57:38 -07:00
286778d6a9 feat: journald export + system failure patterns
- Add scripts/export_journal.sh — dumps recent journal (priority 0-5,
  20min window) to /opt/turnstone/data/journal-export.jsonl; idempotent
  via entry_id deduplication so overlap is safe
- Add system-journal source to sources.yaml pointing at the export file
- Add 9 system-level patterns to default.yaml:
  systemd_fail, oom_kill, disk_hw_error, fs_error, kernel_error,
  ssh_brute, container_crash, smart_error, nfs_error
2026-05-11 06:54:42 -07:00
f46aba8165 feat: multi-source ingest via sources.yaml + servarr parser
- Add servarr.py parser for all *arr services (sonarr/radarr/lidarr/
  prowlarr/readarr/whisparr/bazarr) — pipe-delimited format with
  component prefix prepended for searchability
- Add ingest_sources() to pipeline.py; reads sources.yaml, skips
  missing paths with a warning so cron keeps running if a service
  is down
- Add --sources mode to ingest_corpus.py CLI; legacy positional args
  unchanged for backward compat
- Add patterns/sources.yaml with all of Contributor2's discovered service
  log paths (qbit, 7 servarr services, nzbget, tautulli, jellyseerr)
- Replace per-service volume mounts in podman-standalone.sh with
  /opt:/opt:ro + /var/log:/var/log:ro; adding a new source now
  requires only editing sources.yaml — no container restart
2026-05-11 06:26:32 -07:00
30729826a4 fix: use sudo podman for rootful Podman systems 2026-05-11 06:08:19 -07:00
40bbc9225d fix: support hotio qBittorrent 5.x log format (N/I/W/C single-char level)
Contributor2's ghcr.io/hotio/qbittorrent:latest container uses a different format
than the classic GUI build: `(N) 2026-04-26T03:32:59 - message` with a
single-char level code before an ISO timestamp, not inside parens.

Added _HOTIO_RE alongside _CLASSIC_RE; unified via _match_line() helper so
parse() loop is unchanged. 28 tests passing, both formats covered.
2026-05-11 05:55:40 -07:00
457b4fd7ae feat: incident labeling, bundle export, and push/receive flow
Turnstone incidents now carry an issue_type tag (free-text with datalist
suggestions) used to categorize patterns for signature building.

Backend:
- Incident model gains issue_type; additive ALTER TABLE migration keeps
  existing DBs working without a full schema rebuild
- New received_bundles table stores incoming JSON bundles with indexes on
  bundled_at and issue_type
- build_bundle() assembles incident + related log entries into a versioned
  bundle dict; store_bundle()/list_bundles()/get_bundle() for the receiver
- POST /api/incidents/{id}/send — pushes bundle to TURNSTONE_BUNDLE_ENDPOINT
- GET  /api/incidents/{id}/bundle — export without sending
- POST /api/bundles — receive and store an incoming bundle
- GET  /api/bundles — list all received bundles
- TURNSTONE_SOURCE_HOST and TURNSTONE_BUNDLE_ENDPOINT env vars; auto-set
  source host from hostname in podman-standalone.sh

Frontend:
- Incidents form: issue_type field with datalist suggestions; Type column
  in the table; Send Bundle button + status feedback in the detail drawer
- New BundlesView: collapsible bundle rows, inline JSON parse (no extra
  round-trip), Export JSON download button
- Router and nav updated with /bundles route
2026-05-11 05:23:55 -07:00
df87f22416 feat: Podman container deployment for Contributor2's system
- Dockerfile: multi-stage build (node:22-alpine builds Vue SPA, python:3.12-slim
  runs uvicorn) — no Node.js required on the host
- requirements.txt: minimal runtime deps (fastapi, uvicorn, pyyaml, aiofiles,
  pydantic, python-multipart)
- podman-standalone.sh: mirrors Peregrine's deployment pattern; binds
  /opt/turnstone/data + /opt/turnstone/patterns + /opt/qbittorrent/config/data/logs
  (ro); includes cron and Caddy config instructions in comments
- .gitignore: add log/, .turnstone-api.pid (generated by manage.sh dev mode)
2026-05-11 05:13:58 -07:00
f5893c6003 feat: dashboard view, stats API, and composite index for query perf
- Add GET /api/stats endpoint with 24h windowed aggregation (criticals,
  errors, per-source health, recent criticals list)
- Fix timestamp format bug: strftime('%Y-%m-%dT%H:%M:%S', ...) to match
  stored ISO-8601 T-separated timestamps (datetime('now') uses space)
- Add composite index idx_ts_repeat(timestamp_iso, repeat_count) — drops
  stats query from 3.5 s to <1 ms by resolving both WHERE conditions
  from the index without table row fetches
- New DashboardView: 3 stat cards, source health table with health dots,
  diagnose-per-source button, recent criticals panel, zero-state card
- Router default / → /dashboard; Dashboard first in nav
- DiagnoseView: reads ?q= query param on mount and auto-runs; shows
  formatted LLM summary block
- LogEntryRow: expand/collapse for long entries (>200 chars or multiline)
2026-05-11 03:41:55 -07:00
a3c0962277 feat: qBittorrent log ingestor with 8 diagnostic patterns
Adds app/ingest/qbittorrent.py — auto-detected by the pipeline on the
(YYYY/MM/DD HH:MM:SS) timestamp fingerprint. Handles both slash and dash
date separators, optional [Warning|Critical] bracket levels, and
multi-line continuations (Qt stack traces).

patterns/default.yaml: 8 new qbit_ patterns covering tracker errors,
port bind failures, disk errors, hash check failures, peer bans, download
completion, ratio limits, and session errors.

manage.sh: ingest-qbit [HOST] command mirrors ingest-plex — probes known
default log paths locally or via SSH, ingests, restarts server.

14 tests covering format detection, severity mapping, multiline handling,
and timestamp normalization.
2026-05-10 08:21:16 -07:00
19d3827e2d fix: bypass FTS ranking for named-source error retrieval
When diagnose() auto-detects a source name, FTS keyword scoring can
bury real errors whose text doesn't match the symptom query. Add
recent_source_errors() — a plain-SQL scan ordered by timestamp — so
the most recent errors from a known service always surface regardless
of keyword overlap.
2026-05-10 08:14:23 -07:00