- Changed glob → rglob in glean_dir so corpus directories with format
subfolders (journald/, docker/, etc.) are fully ingested
- Fixed gen_corpus.py docker SOURCE to emit "docker:<service>" prefix
so the pipeline correctly detects format as 'docker' not 'plaintext'
- 17/17 gen_corpus tests passing
Closes: #46
- Add app/db/ abstraction layer: Backend enum, DbConn wrapper,
dialect helper (q() for ? vs %s paramstyle), get_conn(), tenant_id()
- Auto-detect backend from DATABASE_URL; SQLite remains default when
unset — no config change for local deployments
- Add tenant_id column to all three logical DBs (main, context, incidents);
idempotent ALTER TABLE migration runs before schema scripts on existing DBs
- All INSERTs inject tenant_id; SELECTs use (tenant_id = ? OR tenant_id = '')
for backward compat with pre-namespacing rows
- Add docker-compose.yml with named volume turnstone_pgdata (survives rebuilds)
and optional external Postgres support via DATABASE_URL override
- Add scripts/migrate_sqlite_to_postgres.py — one-shot idempotent migration
for existing SQLite data; ON CONFLICT DO NOTHING for safe re-runs
- Fix SSH glean path in pipeline.py to use ensure_schema + get_conn
(was still using raw sqlite3.connect + old _SCHEMA without tenant_id)
- Fix FTS5 JOIN ambiguity: qualify repeat_count as f.repeat_count in search
- Update all tests to use ensure_*_schema fixtures; add row_factory where needed
- 394/394 tests passing
Closes: #42
Closes: #50
FTS5 bulk-insert write locks starved the incident API and bundle endpoints
during log bursts (sonarr/radarr, high-volume docker sources). Fix mirrors
the context_facts split (context -> turnstone-context.db):
- Add INCIDENTS_DB_PATH / TURNSTONE_INCIDENTS_DB env var in rest.py
- Add _INCIDENTS_SCHEMA, ensure_incidents_schema(), and
migrate_incidents_to_dedicated_db() in glean/pipeline.py
- Stub out incidents/received_bundles/sent_bundles in _SCHEMA (no-op
CREATE IF NOT EXISTS) so legacy single-file deployments still open
- Thread incidents_db_path through diagnose_stream -> run_pipeline ->
FalsePositiveSuppressor.suppress -> _fetch_resolved_incidents
- One-shot migration on startup: copy existing rows from main DB to
incidents DB via INSERT OR IGNORE (idempotent, safe to re-run)
- Fix test_blocklist_endpoints fixtures to patch CONTEXT_DB_PATH and
INCIDENTS_DB_PATH alongside DB_PATH (worktree has no data/ dir)
372 tests passing.
Closes: #60
pipeline.py:
- Add timeout=30.0 to all sqlite3.connect() calls (5 total).
Previously only ensure_context_schema() had it. The main glean
writers would fail immediately under lock contention from the live
watcher or concurrent manual glean runs.
glean_corpus.py:
- Add --force flag (passed through to glean_sources/glean_file/glean_dir).
Without it, unchanged-fingerprint files were silently skipped even
after pattern updates. Use after editing patterns/default.yaml.
patterns/default.yaml:
- Add 9 new patterns for Muninn / cluster-wide coverage:
vpn_tunnel_fail WireGuard/tunnel service failures
vpn_handshake WireGuard peer handshake events
dns_degraded systemd-resolved DNS fallback/degradation
nvidia_api_mismatch NVIDIA kernel module vs userspace mismatch
nvidia_xid NVIDIA Xid GPU hardware faults
nvidia_gpu_reset NVIDIA GPU reset / NVLink faults
acpi_error ACPI firmware _DSM evaluation failures
thermal_throttle CPU/GPU thermal throttling / RAPL unavailable
undervoltage PSU undervoltage / brownout events
- Sync from /devl/turnstone-cluster/patterns/default.yaml (authoritative
live copy updated first; repo copy was stale)
context_facts, context_documents, and context_chunks now live in
turnstone-context.db (sibling of turnstone.db). The glean scheduler
held write locks on the main DB long enough to cause 5-second timeout
failures on context fact inserts; separate files have independent WAL
write locks so they never contend.
Changes:
- pipeline.py: extract _CONTEXT_SCHEMA + ensure_context_schema()
- rest.py: CONTEXT_DB_PATH (TURNSTONE_CONTEXT_DB env var, defaults to
sibling file); init via ensure_context_schema(); all context routes
pass CONTEXT_DB_PATH; diagnose_stream receives context_db_path kwarg
- diagnose/__init__.py: diagnose_stream() accepts context_db_path
(falls back to db_path for backward compat); retrieve_context uses it
- store.py: sqlite3.connect() timeout=30.0 — Python driver retry loop
is independent of PRAGMA busy_timeout; needed for any remaining
contention during test or single-file deployments
Closes: #42
Closes turnstone#22.
## Transport layer (app/glean/ssh.py)
- SSHTransport context manager: key-only auth, paramiko backend
- SSHConnectionError / SSHCommandError exception hierarchy
- exec_stream() generator: yields stdout lines, raises SSHCommandError on
non-zero exit (isinstance(int) guard for test-mock safety)
- Command builders: _build_journald_command, _build_syslog_command,
_build_plaintext_command, _build_docker_command
- 18 unit tests in tests/test_glean_ssh.py
## Pipeline integration (app/glean/pipeline.py)
- _stream_and_write(): per-item error isolation — SSHCommandError skips
one glean item without aborting the rest of the host connection
- _glean_ssh_source(): one SSHTransport per host, dispatches all glean
items (journald/syslog/plaintext/docker); SSHConnectionError aborts host
- glean_sources(): splits local vs SSH sources; local → _glean_files();
SSH → _glean_ssh_source(); shared compiled patterns and DB connection
- glean_ssh_source(): public wrapper for REST use — manages DB connection,
pattern compilation, FTS rebuild lifecycle
- 15 integration tests in tests/test_glean_pipeline_ssh.py
- All 285 tests passing
## REST layer (app/rest.py)
- GET /api/sources/configured: reads sources.yaml and enriches with DB
stats; SSH sources appear before first glean (entry_count=0); sub-source
IDs (rack01/journald, rack01/docker/myapp) aggregated per host entry
- POST /api/sources/{id}/glean: detects transport:ssh and dispatches to
glean_ssh_source() wrapper; local sources unchanged
- Import: glean_ssh_source as _glean_ssh_source
## Frontend (web/src/views/SourcesView.vue)
- Fetches /api/sources/configured (primary) + /api/sources (DB-only) in
parallel; merges into unified SourceRow list
- SSH sources show: ssh badge (with user@host tooltip), glean-type pills
(journald/syslog/docker/etc.), host subtitle
- SSH sub-source IDs (rack01/journald) suppressed from the DB-only list
since they are covered by the parent SSH row
- DB-only sources (uploads) appear below configured sources with 'uploaded'
badge; reglean button disabled (not in sources.yaml)
- Delete zeroes out configured-source stats in-place rather than removing
the row (so the source remains visible for re-gleaning)