- Add app/db/ abstraction layer: Backend enum, DbConn wrapper,
dialect helper (q() for ? vs %s paramstyle), get_conn(), tenant_id()
- Auto-detect backend from DATABASE_URL; SQLite remains default when
unset — no config change for local deployments
- Add tenant_id column to all three logical DBs (main, context, incidents);
idempotent ALTER TABLE migration runs before schema scripts on existing DBs
- All INSERTs inject tenant_id; SELECTs use (tenant_id = ? OR tenant_id = '')
for backward compat with pre-namespacing rows
- Add docker-compose.yml with named volume turnstone_pgdata (survives rebuilds)
and optional external Postgres support via DATABASE_URL override
- Add scripts/migrate_sqlite_to_postgres.py — one-shot idempotent migration
for existing SQLite data; ON CONFLICT DO NOTHING for safe re-runs
- Fix SSH glean path in pipeline.py to use ensure_schema + get_conn
(was still using raw sqlite3.connect + old _SCHEMA without tenant_id)
- Fix FTS5 JOIN ambiguity: qualify repeat_count as f.repeat_count in search
- Update all tests to use ensure_*_schema fixtures; add row_factory where needed
- 394/394 tests passing
Closes: #42
Closes: #50
Watcher, REST endpoints, services (search, incidents, blocklist),
MCP server, context retriever, embedder, glean_scheduler, and
doc_upload all used the default 5-second SQLite busy timeout.
During collect glean write phases, watcher flush threads were hitting
'database is locked' errors when the glean held the write lock longer
than 5 seconds.
All connections now use timeout=30.0, matching the pipeline fix
from commit ee39ffb. No logic changes.
pipeline.py:
- Add timeout=30.0 to all sqlite3.connect() calls (5 total).
Previously only ensure_context_schema() had it. The main glean
writers would fail immediately under lock contention from the live
watcher or concurrent manual glean runs.
glean_corpus.py:
- Add --force flag (passed through to glean_sources/glean_file/glean_dir).
Without it, unchanged-fingerprint files were silently skipped even
after pattern updates. Use after editing patterns/default.yaml.
patterns/default.yaml:
- Add 9 new patterns for Muninn / cluster-wide coverage:
vpn_tunnel_fail WireGuard/tunnel service failures
vpn_handshake WireGuard peer handshake events
dns_degraded systemd-resolved DNS fallback/degradation
nvidia_api_mismatch NVIDIA kernel module vs userspace mismatch
nvidia_xid NVIDIA Xid GPU hardware faults
nvidia_gpu_reset NVIDIA GPU reset / NVLink faults
acpi_error ACPI firmware _DSM evaluation failures
thermal_throttle CPU/GPU thermal throttling / RAPL unavailable
undervoltage PSU undervoltage / brownout events
- Sync from /devl/turnstone-cluster/patterns/default.yaml (authoritative
live copy updated first; repo copy was stale)
- Add [muninn] to NODES map in collect_cluster_logs.sh
Muninn is accessible via WireGuard (ssh muninn).
One-time 7-day backfill already gleaned: 262,659 entries.
- Fix broken script reference: ingest_corpus.py was renamed to
glean_corpus.py — ongoing cluster glean was silently broken since the rename
- Diagnose: add source_filter param threaded through entries_in_window,
search, _diagnose, and DiagnoseRequest — clicking diagnose on a
dashboard source now scopes both keyword and window hits to that source
- QuickCapture: read route.query.source; show scope badge with clear ✕;
auto-run when source param is present without a query
- DashboardView: pass source= (not q=) when navigating to diagnose
- collect_cluster_logs.sh: auto-discover Docker containers on all nodes
(Heimdall non-watched, Navi, Strahl via SSH); collect Cass Plex logs
via SSH; write to per-node dirs for directory-mode ingest
- turnstone-cluster.service: add --reload for hot-reload during dev
scripts/syslog_receiver.py: asyncio UDP server listening on port 5140,
appends raw syslog lines to network-syslog.txt for the Turnstone live
watcher to tail. Requires no root — port 5140 is non-privileged.
scripts/turnstone-syslog-receiver.service: systemd unit for auto-start.
app/ingest/syslog.py: strip optional RFC 3164 <PRI> prefix before
parsing so network-forwarded syslog (OpenWRT logd, Arista EOS, etc.)
is handled correctly without the PRI value breaking the regex.
Local Heimdall sources (journal, Docker containers, network syslog) are now
tailed continuously by the built-in watcher via watch.yaml — no periodic
collection needed for those.
SSH collection of remote node journals is now handled by a systemd timer
(turnstone-cluster-collect.service/.timer) instead of cron.
collect_cluster_logs.sh simplified to only SSH-collect remote nodes and
trigger ingest directly.
docker-cluster.sh updated to mount:
- /var/run/docker.sock (so watcher can run docker logs -f)
- /run/systemd/journal (so watcher can run journalctl -f)
- /devl/turnstone-cluster/patterns/ (cluster-specific watch.yaml)
- scripts/collect_cluster_logs.sh: collects journals from Heimdall (local),
Navi, Sif, Cass, Strahl (SSH), Docker services, and a network syslog
placeholder; designed for 15-min cron before ingest
- patterns/sources-cluster.yaml: ingest sources config for the full
CircuitForge cluster stack; points at /devl/turnstone-cluster/data/
- scripts/docker-cluster.sh: Docker deployment for Heimdall cluster monitor;
seeds preferences.json with cf-orch coordinator URL (localhost:7701) so
LLM summarization works on first ingest without manual UI config
Watermark-based batch export script (scripts/export_corpus.py) pushes up to 500
ERROR/CRITICAL entries and labeled incidents per run to AVOCET_CORPUS_ENDPOINT.
Uses SQLite rowid watermark (entry log) and ISO timestamp watermark (incidents).
Skips silently when AVOCET_CORPUS_ENDPOINT is not set. 19 tests. Closes turnstone#6.
update.sh pulls a named branch (default: main), preserves the local
watch.yaml around the pull, rebuilds the image, restarts the service,
and polls health until ready.
Usage: sudo bash /opt/turnstone/scripts/update.sh [branch]
patterns/watch.yaml is site-specific config — gitignored so host
customizations survive git pulls. The template is preserved in git
history (feat/live-watch) for reference.
- Add servarr.py parser for all *arr services (sonarr/radarr/lidarr/
prowlarr/readarr/whisparr/bazarr) — pipe-delimited format with
component prefix prepended for searchability
- Add ingest_sources() to pipeline.py; reads sources.yaml, skips
missing paths with a warning so cron keeps running if a service
is down
- Add --sources mode to ingest_corpus.py CLI; legacy positional args
unchanged for backward compat
- Add patterns/sources.yaml with all of Xander's discovered service
log paths (qbit, 7 servarr services, nzbget, tautulli, jellyseerr)
- Replace per-service volume mounts in podman-standalone.sh with
/opt:/opt:ro + /var/log:/var/log:ro; adding a new source now
requires only editing sources.yaml — no container restart
- app/ingest/plex.py: Plex Media Server log parser
Regex-based line parser for 'Mon DD, YYYY HH:MM:SS.mmm [pid] LEVEL - msg'
format. Handles multi-line entries (stack traces). Detects plex_eae_failure
and all other patterns via shared pattern library.
- app/ingest/plaintext.py: generic fallback parser for unrecognized formats
Extracts timestamps (ISO 8601, syslog, common log) and severity via regex.
- pipeline.py: detect plex format via is_plex_log(); fall back to plaintext
instead of skipping; process *.log files alongside *.jsonl; add ingest_file()
for single-file ingestion.
- scripts/ingest_corpus.py: accept single file or directory as target
- manage.sh: ingest-plex command SSHes to Cass (or HOST arg), pulls
Plex Media Server.log, and ingests it directly
Add plex_eae_failure pattern to default.yaml targeting the EasyAudioEncoder
crash signature (EAE timeout + I/O error pair, 5s cadence). Pattern fires
when EAE's WAV handoff files stop appearing in the pms temp directory.
Add watch_plex.py: tail-based watchdog that counts EAE timeout events and
auto-restarts plexmediaserver after N consecutive hits (default 3, ~15s of
failure). Includes cooldown, dry-run mode, and a systemd unit template.
Ingest pipeline (journald / Caddy / Docker-wrapped formats) with
per-source state tracking (repeat dedup, out-of-order detection),
named pattern tagging at ingest time, and idempotent SHA1-keyed writes.
FTS5 search layer with porter stemmer, severity/source/pattern/time
filters, and BM25 ranking. MCP server (FastMCP stdio) with three tools:
search_logs, diagnose, list_log_sources — compatible with both
Claude Code and Copilot CLI.
WAL mode enabled on all connections. FTS index auto-built after ingest.
MCP configs included for Claude Code (.mcp.json) and Copilot CLI
(.github/copilot/mcp.json).