Commit graph

120 commits

Author SHA1 Message Date
6f9cfb8018 fix: add error handling to context doc/fact load functions 2026-05-13 17:00:29 -07:00
4096f890be feat: Context view — document and fact management with accessible tables
Adds /context route with tabbed UI for managing uploaded documents and
manually-entered environment facts. Includes inline confirm-before-delete,
add-fact form with category/key/value fields, wizard CTA panel, and
stub components for DocUploadZone and WizardOverlay (Task 14).
2026-05-13 16:57:38 -07:00
88b27a1454 fix: a11y — tab panels v-show, radio roving-tabindex, table header label 2026-05-13 16:53:41 -07:00
29fb31d76c fix: a11y — tablist, health dots, table headers, switch roles, nav landmark 2026-05-13 16:48:38 -07:00
b7e71b0e78 fix: a11y — QuickCapture label/role/aria-live/spinner, LogEntryRow expand button 2026-05-13 16:42:46 -07:00
e0bfa11642 feat: optional sqlite-vec embedding pipeline for Paid-tier RAG 2026-05-13 16:32:57 -07:00
d8c3eba0f8 feat: context REST API — docs, facts, wizard, and debug endpoints
Wires the context/RAG layer into FastAPI via a dedicated _ctx router
(/turnstone/api/context/*): document upload (POST/GET/DELETE /docs),
fact CRUD (POST/GET/DELETE /facts), wizard state machine
(/wizard/schema, /wizard/step, /wizard/apply), and a debug search
endpoint (/debug/search). All blocking DB calls are dispatched via
asyncio.to_thread to keep the event loop free.
2026-05-13 16:31:07 -07:00
b5ce0a24b2 feat: inject environment context into diagnose pipeline and LLM prompt
- Add context_block param to summarize() and thread it into _PROMPT_TEMPLATE
- Wire retrieve_context/format_context_block into diagnose_stream() before
  log search; emit context SSE event (facts + chunks) to the client
- 3 new tests covering prompt injection and SSE event emission (155 total, all pass)
2026-05-13 16:29:26 -07:00
783edbe496 feat: wizard state machine — structured Q&A writes context facts and source config 2026-05-13 16:25:52 -07:00
ef8d164188 feat: context retriever — keyword fact lookup and chunk search 2026-05-13 16:23:54 -07:00
ebbb1af32d feat: doc upload adapter — writes facts, document, and chunks to context store 2026-05-13 16:21:55 -07:00
b23a60a602 feat: context chunker — type detection, YAML extraction, text chunking
- Implement document type detection for yaml/json/markdown/text
- Extract service facts from docker-compose YAML (names, images, ports)
- Split text into overlapping word chunks (300-word default with 50-word overlap)
- Enforce 5 MB file size limit
- Comprehensive TDD test suite: 15 tests passing
2026-05-13 15:54:51 -07:00
54c756dfe8 feat: context store — fact and document CRUD 2026-05-13 15:53:03 -07:00
7461953021 feat: add context_facts, context_documents, context_chunks tables to schema 2026-05-13 15:51:19 -07:00
625b55324a fix: a11y foundation — text-dim contrast, focus-visible, prefers-reduced-motion 2026-05-13 15:48:12 -07:00
784a4072b4 feat: SSE streaming diagnose, severity filter pills, per-source-cap search
- diagnose_stream() async generator: status/summary/entries/reasoning/done events
- POST /api/diagnose/stream SSE endpoint wired in rest.py
- entries_in_window() gains per_source_cap to prevent high-volume sources crowding results
- QuickCapture: severity filter pills, filtered entries view, pipeline status spinner
- llm.py: remove overly broad HTTPStatusError re-raise
2026-05-13 15:45:35 -07:00
b70c89e7b5 feat: try cf-orch task endpoint first; fall back to direct model call
POST /api/inference/task with product=turnstone task=log_analysis routes to
the security reasoning model assigned in cf-orch. Falls back to the OpenAI-
compat /v1/chat/completions path on 404 (no assignment) or if the task
endpoint is absent (local instances, example-node).
2026-05-13 08:20:29 -07:00
caa85b3d30 feat: source-scoped diagnose; multi-node Docker log collection
- Diagnose: add source_filter param threaded through entries_in_window,
  search, _diagnose, and DiagnoseRequest — clicking diagnose on a
  dashboard source now scopes both keyword and window hits to that source
- QuickCapture: read route.query.source; show scope badge with clear ✕;
  auto-run when source param is present without a query
- DashboardView: pass source= (not q=) when navigating to diagnose
- collect_cluster_logs.sh: auto-discover Docker containers on all nodes
  (Heimdall non-watched, Navi, Strahl via SSH); collect Cass Plex logs
  via SSH; write to per-node dirs for directory-mode ingest
- turnstone-cluster.service: add --reload for hot-reload during dev
2026-05-13 08:10:42 -07:00
53fa350adf fix: correct cf-orch port to 7700; fix relative time parsing in diagnose; fix syslog PRI prefix 2026-05-13 05:33:41 -07:00
c1fa5ef70d fix: write ingest log to data dir (alan lacks /var/log write access) 2026-05-13 05:20:56 -07:00
e9faabc07f fix: run collect service as alan user; call ingest directly without Docker 2026-05-13 05:17:43 -07:00
a4ec5a6951 feat: add UDP syslog receiver for network device log collection
scripts/syslog_receiver.py: asyncio UDP server listening on port 5140,
appends raw syslog lines to network-syslog.txt for the Turnstone live
watcher to tail. Requires no root — port 5140 is non-privileged.

scripts/turnstone-syslog-receiver.service: systemd unit for auto-start.

app/ingest/syslog.py: strip optional RFC 3164 <PRI> prefix before
parsing so network-forwarded syslog (OpenWRT logd, Arista EOS, etc.)
is handled correctly without the PRI value breaking the regex.
2026-05-13 04:58:51 -07:00
d769be04d4 refactor: use live watcher + systemd timer instead of cron for cluster ingest
Local Heimdall sources (journal, Docker containers, network syslog) are now
tailed continuously by the built-in watcher via watch.yaml — no periodic
collection needed for those.

SSH collection of remote node journals is now handled by a systemd timer
(turnstone-cluster-collect.service/.timer) instead of cron.
collect_cluster_logs.sh simplified to only SSH-collect remote nodes and
trigger ingest directly.

docker-cluster.sh updated to mount:
  - /var/run/docker.sock (so watcher can run docker logs -f)
  - /run/systemd/journal (so watcher can run journalctl -f)
  - /devl/turnstone-cluster/patterns/ (cluster-specific watch.yaml)
2026-05-13 04:55:25 -07:00
1e8a118f71 feat: add cluster-wide log collection and Heimdall Turnstone deployment
- scripts/collect_cluster_logs.sh: collects journals from Heimdall (local),
  Navi, Sif, Cass, Strahl (SSH), Docker services, and a network syslog
  placeholder; designed for 15-min cron before ingest
- patterns/sources-cluster.yaml: ingest sources config for the full
  CircuitForge cluster stack; points at /devl/turnstone-cluster/data/
- scripts/docker-cluster.sh: Docker deployment for Heimdall cluster monitor;
  seeds preferences.json with cf-orch coordinator URL (localhost:7701) so
  LLM summarization works on first ingest without manual UI config
2026-05-12 18:53:58 -07:00
a21c158917 fix: increase LLM summarize timeout to 120s for remote cf-orch routing
20s was too tight for first-request model swaps in Ollama (model cold load
can take 30-60s). 120s matches coordinator inference timeout.
2026-05-12 18:27:52 -07:00
bb94f8251c fix: podman-standalone.sh builds image and regenerates systemd unit on each run
Running the script after a git pull previously left a stale image in place.
Now: build → run → regenerate systemd unit → daemon-reload, all in one step.
2026-05-12 16:18:37 -07:00
7d46314e86 feat: switch LLM backend to OpenAI-compat; add cf-orch remote inference support
Turnstone now calls /v1/chat/completions instead of Ollama's /api/generate.
This format works with both local Ollama (>=0.1.24) and a remote cf-orch
coordinator, enabling GPU-less nodes like Contributor2's to route diagnoses through
the cluster without any local model.

- llm.py: OpenAI-compat messages format, optional Bearer auth header
- diagnose.py: thread llm_api_key through the call chain
- rest.py: llm_api_key pref (default empty), SettingsBody field, passed to diagnose
- SettingsView.vue: API Key field, label updated from "Ollama URL" to "LLM Endpoint URL"
- tests: updated mocks for new response shape; added bearer token assertion test
2026-05-12 12:58:38 -07:00
afcac6ff05 feat: periodic corpus export — push ERROR/CRITICAL entries and incidents to Avocet
Watermark-based batch export script (scripts/export_corpus.py) pushes up to 500
ERROR/CRITICAL entries and labeled incidents per run to AVOCET_CORPUS_ENDPOINT.
Uses SQLite rowid watermark (entry log) and ISO timestamp watermark (incidents).
Skips silently when AVOCET_CORPUS_ENDPOINT is not set. 19 tests. Closes turnstone#6.
2026-05-11 17:08:35 -07:00
85785a3f76 chore: add update.sh deploy script; gitignore patterns/watch.yaml
update.sh pulls a named branch (default: main), preserves the local
watch.yaml around the pull, rebuilds the image, restarts the service,
and polls health until ready.

Usage: sudo bash /opt/turnstone/scripts/update.sh [branch]

patterns/watch.yaml is site-specific config — gitignored so host
customizations survive git pulls. The template is preserved in git
history (feat/live-watch) for reference.
2026-05-11 16:07:07 -07:00
bb8206d5a1 Merge pull request 'feat: live watch mode — tail journald/docker/podman continuously (#4)' (#16) from feat/live-watch into main 2026-05-11 15:45:30 -07:00
9cc8bf3662 feat: add file tail source type; configure example-node watchers
- type: file uses tail -F (handles rotation) with auto-format detection
- _parse_lines dispatches to journald/servarr/qbit/caddy/syslog/plaintext
  based on first-line format detection — same logic as batch ingest
- watch.yaml updated with file type docs and example-node-specific example
- scripts/journal-bridge.sh + .service written directly to example-node

Contributor2's watch.yaml covers: system-journal-live (via bridge file),
sonarr, radarr, lidarr, prowlarr, bazarr, qbittorrent, nzbget, tautulli
2026-05-11 15:44:10 -07:00
3fd81e5ab1 feat: live watch mode — tail journald/docker/podman sources continuously (#4)
Adds background watcher that tails active log sources and ingests entries
in near-real-time, keeping the DB fresh without manual ingest runs.

- app/watch/watcher.py: Watcher + WatchSource using subprocess + select
  loop; flushes every 10s or 100 lines; syncs FTS index every 3 flushes
- patterns/watch.yaml: declarative source config (journald/docker/podman)
- app/rest.py: lifespan context manager starts/stops watcher on app
  startup/shutdown; GET /api/watch/status + POST /api/watch/reload
- web/src/views/DashboardView.vue: live/manual indicator chip + stale
  banner copy adapts to whether live watching is active
- tests/test_watch_watcher.py: 16 tests covering config load, command
  building, docker timestamp stripping, orchestrator lifecycle

Closes #4
2026-05-11 15:34:13 -07:00
3c758d3626 Merge pull request 'feat: LLM reasoning, severity overrides, dashboard freshness' (#14) from feat/llm-reasoning into main 2026-05-11 13:00:52 -07:00
c12cc6d68a feat: severity overrides + last_ingested timestamp on dashboard 2026-05-11 13:00:11 -07:00
b3c02eebf7 docs: add README — diagnostic log intelligence layer 2026-05-11 12:57:32 -07:00
0882083755 feat: LLM reasoning layer — Ollama summarization on diagnose results 2026-05-11 11:35:07 -07:00
18d80cbfad Merge pull request 'feat: frictionless incident capture' (#13) from feat/frictionless-capture into main 2026-05-11 09:53:25 -07:00
728134321f feat: wire QuickCaptureBar/FAB into App.vue, add Settings route 2026-05-11 09:25:49 -07:00
426b4207eb feat: add SettingsView with entry point style toggle 2026-05-11 09:23:27 -07:00
184ad07f80 refactor: remove incident form from IncidentsView — now in DiagnoseView Structured tab 2026-05-11 09:22:25 -07:00
bf276aeb50 feat: add Quick/Structured tabs to DiagnoseView 2026-05-11 09:20:47 -07:00
3b00ad8b47 fix: surface save errors in QuickCapture via error.value 2026-05-11 09:19:38 -07:00
ba6995e279 feat: add QuickCaptureBar and QuickCaptureFab entry point components 2026-05-11 09:16:33 -07:00
6f17f74fe1 feat: add QuickCapture and IncidentForm components 2026-05-11 09:15:25 -07:00
05ad314ed5 feat: add POST /api/diagnose and GET/PATCH /api/settings endpoints 2026-05-11 09:10:58 -07:00
ca0cb1361e fix: correct time_detected logic, immutable sort pattern, add diagnose() test 2026-05-11 09:08:24 -07:00
abd142addf feat: add diagnose service with NL time extraction via dateparser
Adds app/services/diagnose.py with parse_time_window() (dateparser-backed
NL time phrase extraction with 60-min fallback) and diagnose() (layered
FTS + window search returning severity/source summary). Includes 5 TDD tests.
2026-05-11 09:04:50 -07:00
8840809a85 chore: add .worktrees to gitignore 2026-05-11 08:26:35 -07:00
346ea6e0c6 feat: syslog and dmesg parsers with graceful journald fallback
- Add syslog.py — RFC 3164 parser for /var/log/syslog, /var/log/messages,
  auth.log, kern.log; ident prepended to message text for searchability
- Add dmesg_log.py — handles both relative [secs.usecs] and human-readable
  [Dow Mon DD HH:MM:SS YYYY] formats; relative timestamps preserved as raw
- Wire both into pipeline.py auto-detection (before plaintext fallback)
- Update export_journal.sh: checks for journalctl availability, falls back
  gracefully on non-systemd systems; adds dmesg -T export (falls back to
  plain dmesg on older kernels)
- Add syslog entries (commented) + dmesg source to sources.yaml
- 30 tests covering both parsers (detection + parse correctness)
2026-05-11 06:57:38 -07:00
286778d6a9 feat: journald export + system failure patterns
- Add scripts/export_journal.sh — dumps recent journal (priority 0-5,
  20min window) to /opt/turnstone/data/journal-export.jsonl; idempotent
  via entry_id deduplication so overlap is safe
- Add system-journal source to sources.yaml pointing at the export file
- Add 9 system-level patterns to default.yaml:
  systemd_fail, oom_kill, disk_hw_error, fs_error, kernel_error,
  ssh_brute, container_crash, smart_error, nfs_error
2026-05-11 06:54:42 -07:00