turnstone

Author	SHA1	Message	Date
pyr0ball	e543ab70f7	feat: dual-backend SQLite/Postgres + multi-tenant source namespacing - Add app/db/ abstraction layer: Backend enum, DbConn wrapper, dialect helper (q() for ? vs %s paramstyle), get_conn(), tenant_id() - Auto-detect backend from DATABASE_URL; SQLite remains default when unset — no config change for local deployments - Add tenant_id column to all three logical DBs (main, context, incidents); idempotent ALTER TABLE migration runs before schema scripts on existing DBs - All INSERTs inject tenant_id; SELECTs use (tenant_id = ? OR tenant_id = '') for backward compat with pre-namespacing rows - Add docker-compose.yml with named volume turnstone_pgdata (survives rebuilds) and optional external Postgres support via DATABASE_URL override - Add scripts/migrate_sqlite_to_postgres.py — one-shot idempotent migration for existing SQLite data; ON CONFLICT DO NOTHING for safe re-runs - Fix SSH glean path in pipeline.py to use ensure_schema + get_conn (was still using raw sqlite3.connect + old _SCHEMA without tenant_id) - Fix FTS5 JOIN ambiguity: qualify repeat_count as f.repeat_count in search - Update all tests to use ensure_*_schema fixtures; add row_factory where needed - 394/394 tests passing Closes: #42 Closes: #50	2026-06-08 08:37:54 -07:00
pyr0ball	3fd9b6d5a2	feat(diagnose): tech-level post-processor, offline mode, API auth, context harvest - synthesizer: 3 system prompts (sysadmin/homelab/executive) selected by tech_level pref - settings: tech_level selector (UI + backend) persisted in preferences.json - QuickCapture: shows active level label in diagnosis card header - TURNSTONE_OFFLINE_MODE=1: sets HF_HUB_OFFLINE + TRANSFORMERS_OFFLINE before lib load - TURNSTONE_API_KEY: bearer token auth on all /api/ routes (hmac.compare_digest) - /health always open; unset key = no auth (backward compatible) - docs/air-gapped-deployment.md: full offline deployment guide - scripts/harvest_docs.py: generalized context doc bulk-uploader with manifest support - scripts/manifests/: heimdall-devops.yaml (10 docs ingested) + example.yaml template - fix: _ingest_upload -> _glean_upload in context doc upload endpoint (was 500) Closes: #56 Closes: #45 Closes: #47 Closes: #49 Closes: #21	2026-05-28 08:51:05 -07:00
pyr0ball	854818ca1a	fix(db): add timeout=30s to all sqlite3.connect() calls across app Watcher, REST endpoints, services (search, incidents, blocklist), MCP server, context retriever, embedder, glean_scheduler, and doc_upload all used the default 5-second SQLite busy timeout. During collect glean write phases, watcher flush threads were hitting 'database is locked' errors when the glean held the write lock longer than 5 seconds. All connections now use timeout=30.0, matching the pipeline fix from commit `ee39ffb`. No logic changes.	2026-05-26 23:12:48 -07:00
pyr0ball	ee39ffbd44	fix(glean): add timeout=30s to all pipeline DB connections; add --force flag; new patterns pipeline.py: - Add timeout=30.0 to all sqlite3.connect() calls (5 total). Previously only ensure_context_schema() had it. The main glean writers would fail immediately under lock contention from the live watcher or concurrent manual glean runs. glean_corpus.py: - Add --force flag (passed through to glean_sources/glean_file/glean_dir). Without it, unchanged-fingerprint files were silently skipped even after pattern updates. Use after editing patterns/default.yaml. patterns/default.yaml: - Add 9 new patterns for Muninn / cluster-wide coverage: vpn_tunnel_fail WireGuard/tunnel service failures vpn_handshake WireGuard peer handshake events dns_degraded systemd-resolved DNS fallback/degradation nvidia_api_mismatch NVIDIA kernel module vs userspace mismatch nvidia_xid NVIDIA Xid GPU hardware faults nvidia_gpu_reset NVIDIA GPU reset / NVLink faults acpi_error ACPI firmware _DSM evaluation failures thermal_throttle CPU/GPU thermal throttling / RAPL unavailable undervoltage PSU undervoltage / brownout events - Sync from /devl/turnstone-cluster/patterns/default.yaml (authoritative live copy updated first; repo copy was stale)	2026-05-26 22:36:45 -07:00
pyr0ball	27a1bea0f7	fix(cluster): add Muninn to SSH collection, fix ingest_corpus → glean_corpus rename - Add [muninn] to NODES map in collect_cluster_logs.sh Muninn is accessible via WireGuard (ssh muninn). One-time 7-day backfill already gleaned: 262,659 entries. - Fix broken script reference: ingest_corpus.py was renamed to glean_corpus.py — ongoing cluster glean was silently broken since the rename	2026-05-26 17:02:53 -07:00
pyr0ball	aa80f307fe	refactor: rename ingest → glean throughout codebase Renames the app/ingest/ package to app/glean/ and updates all references across Python modules, shell scripts, Vue components, tests, and documentation. Intentionally preserved: - SQLite column name ingest_time (avoids schema migration) - RetrievedEntry.ingest_time field (maps to the column above) - Any public-facing JSON keys that reference ingest_time Changes by category: - app/ingest/ → app/glean/ (full package move, all parsers) - app/tasks/ingest_scheduler.py → app/tasks/glean_scheduler.py - scripts/ingest_corpus.py → scripts/glean_corpus.py - tests/test_ingest_*.py → tests/test_glean_*.py - Docstrings, log messages, comments: ingest → glean - Env var: TURNSTONE_INGEST_INTERVAL → TURNSTONE_GLEAN_INTERVAL - Shell scripts: glean.log, glean_corpus.py references - README.md: multi-source ingest → multi-source glean - .env.example: updated env var name - patterns/: new diagnostic patterns from 2026-05-20 SSH incident (service_crash_loop, pkg_daemon_restart, ssh_forward_conflict) - SourcesView.vue: pipeline label updated - All test import paths updated to app.glean.* 285 tests passing.	2026-05-20 23:02:55 -07:00
pyr0ball	729b78e40f	feat: source-scoped diagnose; multi-node Docker log collection - Diagnose: add source_filter param threaded through entries_in_window, search, _diagnose, and DiagnoseRequest — clicking diagnose on a dashboard source now scopes both keyword and window hits to that source - QuickCapture: read route.query.source; show scope badge with clear ✕; auto-run when source param is present without a query - DashboardView: pass source= (not q=) when navigating to diagnose - collect_cluster_logs.sh: auto-discover Docker containers on all nodes (Heimdall non-watched, Navi, Strahl via SSH); collect Cass Plex logs via SSH; write to per-node dirs for directory-mode ingest - turnstone-cluster.service: add --reload for hot-reload during dev	2026-05-13 08:10:42 -07:00
pyr0ball	c7f1a27ee0	fix: correct cf-orch port to 7700; fix relative time parsing in diagnose; fix syslog PRI prefix	2026-05-13 05:33:41 -07:00
pyr0ball	8838653288	fix: write ingest log to data dir (alan lacks /var/log write access)	2026-05-13 05:20:56 -07:00
pyr0ball	ad66d58ad6	fix: run collect service as alan user; call ingest directly without Docker	2026-05-13 05:17:43 -07:00
pyr0ball	f8e86254bb	feat: add UDP syslog receiver for network device log collection scripts/syslog_receiver.py: asyncio UDP server listening on port 5140, appends raw syslog lines to network-syslog.txt for the Turnstone live watcher to tail. Requires no root — port 5140 is non-privileged. scripts/turnstone-syslog-receiver.service: systemd unit for auto-start. app/ingest/syslog.py: strip optional RFC 3164 <PRI> prefix before parsing so network-forwarded syslog (OpenWRT logd, Arista EOS, etc.) is handled correctly without the PRI value breaking the regex.	2026-05-13 04:58:51 -07:00
pyr0ball	07e151b01f	refactor: use live watcher + systemd timer instead of cron for cluster ingest Local Heimdall sources (journal, Docker containers, network syslog) are now tailed continuously by the built-in watcher via watch.yaml — no periodic collection needed for those. SSH collection of remote node journals is now handled by a systemd timer (turnstone-cluster-collect.service/.timer) instead of cron. collect_cluster_logs.sh simplified to only SSH-collect remote nodes and trigger ingest directly. docker-cluster.sh updated to mount: - /var/run/docker.sock (so watcher can run docker logs -f) - /run/systemd/journal (so watcher can run journalctl -f) - /devl/turnstone-cluster/patterns/ (cluster-specific watch.yaml)	2026-05-13 04:55:25 -07:00
pyr0ball	5f2130caf6	feat: add cluster-wide log collection and Heimdall Turnstone deployment - scripts/collect_cluster_logs.sh: collects journals from Heimdall (local), Navi, Sif, Cass, Strahl (SSH), Docker services, and a network syslog placeholder; designed for 15-min cron before ingest - patterns/sources-cluster.yaml: ingest sources config for the full CircuitForge cluster stack; points at /devl/turnstone-cluster/data/ - scripts/docker-cluster.sh: Docker deployment for Heimdall cluster monitor; seeds preferences.json with cf-orch coordinator URL (localhost:7701) so LLM summarization works on first ingest without manual UI config	2026-05-12 18:53:58 -07:00
pyr0ball	4f93c30c01	feat: periodic corpus export — push ERROR/CRITICAL entries and incidents to Avocet Watermark-based batch export script (scripts/export_corpus.py) pushes up to 500 ERROR/CRITICAL entries and labeled incidents per run to AVOCET_CORPUS_ENDPOINT. Uses SQLite rowid watermark (entry log) and ISO timestamp watermark (incidents). Skips silently when AVOCET_CORPUS_ENDPOINT is not set. 19 tests. Closes turnstone#6.	2026-05-11 17:08:35 -07:00
pyr0ball	00f0b0951c	chore: add update.sh deploy script; gitignore patterns/watch.yaml update.sh pulls a named branch (default: main), preserves the local watch.yaml around the pull, rebuilds the image, restarts the service, and polls health until ready. Usage: sudo bash /opt/turnstone/scripts/update.sh [branch] patterns/watch.yaml is site-specific config — gitignored so host customizations survive git pulls. The template is preserved in git history (feat/live-watch) for reference.	2026-05-11 16:07:07 -07:00
pyr0ball	9ec60ea7ff	feat: syslog and dmesg parsers with graceful journald fallback - Add syslog.py — RFC 3164 parser for /var/log/syslog, /var/log/messages, auth.log, kern.log; ident prepended to message text for searchability - Add dmesg_log.py — handles both relative [secs.usecs] and human-readable [Dow Mon DD HH:MM:SS YYYY] formats; relative timestamps preserved as raw - Wire both into pipeline.py auto-detection (before plaintext fallback) - Update export_journal.sh: checks for journalctl availability, falls back gracefully on non-systemd systems; adds dmesg -T export (falls back to plain dmesg on older kernels) - Add syslog entries (commented) + dmesg source to sources.yaml - 30 tests covering both parsers (detection + parse correctness)	2026-05-11 06:57:38 -07:00
pyr0ball	1b6482701c	feat: journald export + system failure patterns - Add scripts/export_journal.sh — dumps recent journal (priority 0-5, 20min window) to /opt/turnstone/data/journal-export.jsonl; idempotent via entry_id deduplication so overlap is safe - Add system-journal source to sources.yaml pointing at the export file - Add 9 system-level patterns to default.yaml: systemd_fail, oom_kill, disk_hw_error, fs_error, kernel_error, ssh_brute, container_crash, smart_error, nfs_error	2026-05-11 06:54:42 -07:00
pyr0ball	f9691277d8	feat: multi-source ingest via sources.yaml + servarr parser - Add servarr.py parser for all *arr services (sonarr/radarr/lidarr/ prowlarr/readarr/whisparr/bazarr) — pipe-delimited format with component prefix prepended for searchability - Add ingest_sources() to pipeline.py; reads sources.yaml, skips missing paths with a warning so cron keeps running if a service is down - Add --sources mode to ingest_corpus.py CLI; legacy positional args unchanged for backward compat - Add patterns/sources.yaml with all of Xander's discovered service log paths (qbit, 7 servarr services, nzbget, tautulli, jellyseerr) - Replace per-service volume mounts in podman-standalone.sh with /opt:/opt:ro + /var/log:/var/log:ro; adding a new source now requires only editing sources.yaml — no container restart	2026-05-11 06:26:32 -07:00
pyr0ball	f8a2f8007b	feat: plain-text and Plex log ingestors - app/ingest/plex.py: Plex Media Server log parser Regex-based line parser for 'Mon DD, YYYY HH:MM:SS.mmm [pid] LEVEL - msg' format. Handles multi-line entries (stack traces). Detects plex_eae_failure and all other patterns via shared pattern library. - app/ingest/plaintext.py: generic fallback parser for unrecognized formats Extracts timestamps (ISO 8601, syslog, common log) and severity via regex. - pipeline.py: detect plex format via is_plex_log(); fall back to plaintext instead of skipping; process .log files alongside .jsonl; add ingest_file() for single-file ingestion. - scripts/ingest_corpus.py: accept single file or directory as target - manage.sh: ingest-plex command SSHes to Cass (or HOST arg), pulls Plex Media Server.log, and ingests it directly	2026-05-08 17:50:01 -07:00
pyr0ball	7083a7c090	chore: standardize manage.sh, remove start_dev.sh - manage.sh: start/stop/restart/status/logs/open/dev/ingest/build-fts/test following avocet pattern (PID files, colored output, native processes) - start mode: builds Vue SPA, uvicorn on :8534, python http.server on :8535 - dev mode: uvicorn --reload + Vite HMR, trap cleanup on exit - scripts/start_dev.sh: removed (superseded by manage.sh dev)	2026-05-08 16:58:12 -07:00
pyr0ball	a45fa901dd	feat: Vue 3 frontend and FastAPI REST layer - app/rest.py: FastAPI app wrapping search/diagnose/sources with CORS - web/: Vue 3 + Vite + UnoCSS + Pinia frontend at port 8535 - LogSearchView: sidebar filters (source, severity, limit) + FTS search - DiagnoseView: layered symptom investigation matching MCP diagnose tool - SourcesView: corpus table with entry count, error count, time range - LogEntryRow: severity badge, pattern chips, repeat count, timestamp - StatusDot: live API health indicator in nav - scripts/start_dev.sh: launch FastAPI (:8534) + Vite dev server (:8535) - .gitignore: add web/node_modules/ and web/dist/ - Caddy: /turnstone* route added to menagerie.circuitforge.tech block (API → :8534 with /turnstone strip, SPA fallback → :8535)	2026-05-08 16:27:59 -07:00
pyr0ball	8db8810667	feat: plex EAE watchdog and plex_eae_failure pattern Add plex_eae_failure pattern to default.yaml targeting the EasyAudioEncoder crash signature (EAE timeout + I/O error pair, 5s cadence). Pattern fires when EAE's WAV handoff files stop appearing in the pms temp directory. Add watch_plex.py: tail-based watchdog that counts EAE timeout events and auto-restarts plexmediaserver after N consecutive hits (default 3, ~15s of failure). Includes cooldown, dry-run mode, and a systemd unit template.	2026-05-08 13:41:34 -07:00
pyr0ball	64c3996aa1	feat: initial Turnstone POC — ingest, FTS search, MCP server Ingest pipeline (journald / Caddy / Docker-wrapped formats) with per-source state tracking (repeat dedup, out-of-order detection), named pattern tagging at ingest time, and idempotent SHA1-keyed writes. FTS5 search layer with porter stemmer, severity/source/pattern/time filters, and BM25 ranking. MCP server (FastMCP stdio) with three tools: search_logs, diagnose, list_log_sources — compatible with both Claude Code and Copilot CLI. WAL mode enabled on all connections. FTS index auto-built after ingest. MCP configs included for Claude Code (.mcp.json) and Copilot CLI (.github/copilot/mcp.json).	2026-05-08 12:12:34 -07:00
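
Commit f8e86254bb describes stripping an optional RFC 3164 <PRI> prefix before the syslog regex runs. The sketch below shows one way that step could look. It is not the code in app/ingest/syslog.py; the function names strip_pri and decode_pri are hypothetical, chosen only for illustration.

```python
import re
from typing import Optional, Tuple

# Optional RFC 3164 priority prefix such as "<134>" at the very start of a line.
_PRI_RE = re.compile(r"^<(\d{1,3})>")


def strip_pri(line: str) -> Tuple[Optional[int], str]:
    """Return (pri, rest); pri is None when the line carries no <PRI> prefix.

    Hypothetical helper: the real parser may discard the value entirely.
    """
    match = _PRI_RE.match(line)
    if match is None:
        return None, line
    return int(match.group(1)), line[match.end():]


def decode_pri(pri: int) -> Tuple[int, int]:
    """Split a PRI value into (facility, severity), since PRI = facility * 8 + severity."""
    return divmod(pri, 8)
```

With this in place, a network-forwarded line and a locally written line reach the same downstream regex: `strip_pri("<134>May 13 04:58:51 router logd: link up")` yields the line without its prefix, and a line with no prefix passes through unchanged.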

23 commits
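
Commit e543ab70f7 mentions a q() dialect helper (? vs %s paramstyle) and a tenant filter of the form (tenant_id = ? OR tenant_id = ''). The following is a minimal sketch of how those two ideas might fit together, exercised against in-memory SQLite only. The Backend and q names come from the commit message, but their bodies here are assumptions, and the entries table, make_demo_db, and visible_messages are hypothetical.

```python
import sqlite3
from enum import Enum
from typing import List


class Backend(Enum):
    SQLITE = "sqlite"
    POSTGRES = "postgres"


def q(sql: str, backend: Backend) -> str:
    """Rewrite qmark placeholders to %s for Postgres-style drivers.

    Assumption: a naive replace, which would also rewrite a literal '?'
    inside a quoted string; a real helper may need to be more careful.
    """
    if backend is Backend.POSTGRES:
        return sql.replace("?", "%s")
    return sql


# Rows written before namespacing carry tenant_id = '' and stay visible to every tenant.
SELECT_VISIBLE = (
    "SELECT message FROM entries "
    "WHERE (tenant_id = ? OR tenant_id = '') ORDER BY id"
)


def make_demo_db() -> sqlite3.Connection:
    """Build a throwaway in-memory table (hypothetical schema, not the project's)."""
    conn = sqlite3.connect(":memory:", timeout=30.0)
    conn.execute(
        "CREATE TABLE entries ("
        "id INTEGER PRIMARY KEY, tenant_id TEXT NOT NULL DEFAULT '', message TEXT)"
    )
    conn.executemany(
        q("INSERT INTO entries (tenant_id, message) VALUES (?, ?)", Backend.SQLITE),
        [("", "legacy row"), ("acme", "acme row"), ("other", "other row")],
    )
    return conn


def visible_messages(conn: sqlite3.Connection, tenant: str) -> List[str]:
    """Return messages owned by the tenant plus pre-namespacing rows."""
    rows = conn.execute(q(SELECT_VISIBLE, Backend.SQLITE), (tenant,))
    return [row[0] for row in rows]
```

Calling `visible_messages(make_demo_db(), "acme")` returns the legacy row and the acme row but not the row owned by another tenant, which is the backward-compatibility behaviour the commit message describes.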