turnstone

Author	SHA1	Message	Date
pyr0ball	92d7c21530	fix: group journal sources by prefix:host stem in source health source_ids with 3+ colon segments (e.g. muninn-journal:Muninn:ssh.service) are now aggregated by their prefix:host key at the SQL level in both list_sources() and stats_summary(). This collapses ~19K transient systemd unit rows (crash-loop scope entries from Muninn) into ~24 grouped rows. - list_sources: SQL CASE/INSTR group-by stem + unit_count field - stats_summary: same stem grouping for dashboard source health table - delete endpoint: LIKE-based cascade delete covers grouped stems - SourcesView: unit_count badge (e.g. "2686 units") on grouped rows; delete confirmation names the unit count when deleting a group - Bump version to v0.6.1	2026-06-02 04:35:26 -07:00
pyr0ball	3155bde4ce	feat: hybrid BM25 + vector re-ranking for diagnose search (#15) Adds late-fusion hybrid search to Turnstone's log retrieval layer: hybrid_score = 0.6 * bm25_normalized + 0.4 * cosine_similarity Implementation: - _bm25_search() extracts the existing FTS5 BM25 path as a named helper - _hybrid_search() fetches an oversized BM25 candidate pool (5x limit, min 100), embeds the query and each candidate text in-process via the existing embeddings service, normalizes BM25 rank to [0,1], combines with cosine similarity, and re-ranks - search() gets semantic=False param that dispatches to _hybrid_search() when True; pure BM25 remains the default for all existing call sites - diagnose_stream() enables semantic=True so symptom-based queries ("database connection failed") surface semantically equivalent entries ("ECONNREFUSED", "backend gone away", "max retries exceeded") - /api/search REST endpoint exposes ?semantic=true query param Graceful degradation: falls back silently to pure BM25 when the embedding backend is unavailable (EMBEDDING_AVAILABLE=False) or when embed_batch raises an exception. No new infra: in-process numpy cosine, no vector DB. 11 new tests: BM25 helper, hybrid re-ranking, fallback paths, dispatcher. 372 + 11 = 383 tests passing. Closes: #15	2026-06-01 18:13:09 -07:00
pyr0ball	9196465946	fix(db): add timeout=30s to all sqlite3.connect() calls across app Watcher, REST endpoints, services (search, incidents, blocklist), MCP server, context retriever, embedder, glean_scheduler, and doc_upload all used the default 5-second SQLite busy timeout. During collect glean write phases, watcher flush threads were hitting 'database is locked' errors when the glean held the write lock longer than 5 seconds. All connections now use timeout=30.0, matching the pipeline fix from commit `6882248`. No logic changes.	2026-05-26 23:12:48 -07:00
pyr0ball	12cd0a23d5	refactor: rename ingest → glean throughout codebase Renames the app/ingest/ package to app/glean/ and updates all references across Python modules, shell scripts, Vue components, tests, and documentation. Intentionally preserved: - SQLite column name ingest_time (avoids schema migration) - RetrievedEntry.ingest_time field (maps to the column above) - Any public-facing JSON keys that reference ingest_time Changes by category: - app/ingest/ → app/glean/ (full package move, all parsers) - app/tasks/ingest_scheduler.py → app/tasks/glean_scheduler.py - scripts/ingest_corpus.py → scripts/glean_corpus.py - tests/test_ingest_*.py → tests/test_glean_*.py - Docstrings, log messages, comments: ingest → glean - Env var: TURNSTONE_INGEST_INTERVAL → TURNSTONE_GLEAN_INTERVAL - Shell scripts: glean.log, glean_corpus.py references - README.md: multi-source ingest → multi-source glean - .env.example: updated env var name - patterns/: new diagnostic patterns from 2026-05-20 SSH incident (service_crash_loop, pkg_daemon_restart, ssh_forward_conflict) - SourcesView.vue: pipeline label updated - All test import paths updated to app.glean.* 285 tests passing.	2026-05-20 23:02:55 -07:00
pyr0ball	734e81c8ca	feat: SSE streaming diagnose, severity filter pills, per-source-cap search - diagnose_stream() async generator: status/summary/entries/reasoning/done events - POST /api/diagnose/stream SSE endpoint wired in rest.py - entries_in_window() gains per_source_cap to prevent high-volume sources crowding results - QuickCapture: severity filter pills, filtered entries view, pipeline status spinner - llm.py: remove overly broad HTTPStatusError re-raise	2026-05-13 15:45:35 -07:00
pyr0ball	b88c6d7ebf	feat: source-scoped diagnose; multi-node Docker log collection - Diagnose: add source_filter param threaded through entries_in_window, search, _diagnose, and DiagnoseRequest — clicking diagnose on a dashboard source now scopes both keyword and window hits to that source - QuickCapture: read route.query.source; show scope badge with clear ✕; auto-run when source param is present without a query - DashboardView: pass source= (not q=) when navigating to diagnose - collect_cluster_logs.sh: auto-discover Docker containers on all nodes (Heimdall non-watched, Navi, Strahl via SSH); collect Cass Plex logs via SSH; write to per-node dirs for directory-mode ingest - turnstone-cluster.service: add --reload for hot-reload during dev	2026-05-13 08:10:42 -07:00
pyr0ball	bd35a75137	feat: severity overrides + last_ingested timestamp on dashboard	2026-05-11 13:00:11 -07:00
pyr0ball	fa4d23dd20	feat: dashboard view, stats API, and composite index for query perf - Add GET /api/stats endpoint with 24h windowed aggregation (criticals, errors, per-source health, recent criticals list) - Fix timestamp format bug: strftime('%Y-%m-%dT%H:%M:%S', ...) to match stored ISO-8601 T-separated timestamps (datetime('now') uses space) - Add composite index idx_ts_repeat(timestamp_iso, repeat_count) — drops stats query from 3.5 s to <1 ms by resolving both WHERE conditions from the index without table row fetches - New DashboardView: 3 stat cards, source health table with health dots, diagnose-per-source button, recent criticals panel, zero-state card - Router default / → /dashboard; Dashboard first in nav - DiagnoseView: reads ?q= query param on mount and auto-runs; shows formatted LLM summary block - LogEntryRow: expand/collapse for long entries (>200 chars or multiline)	2026-05-11 03:41:55 -07:00
pyr0ball	90849a2c3a	fix: bypass FTS ranking for named-source error retrieval When diagnose() auto-detects a source name, FTS keyword scoring can bury real errors whose text doesn't match the symptom query. Add recent_source_errors() — a plain-SQL scan ordered by timestamp — so the most recent errors from a known service always surface regardless of keyword overlap.	2026-05-10 08:14:23 -07:00
pyr0ball	f9ab4e5bb0	feat: incident tagging — DB schema, CRUD service, REST API (#1) - Add `incidents` table to SQLite schema (id, label, started_at, ended_at, notes, created_at, severity) - Extract `ensure_schema()` from ingest pipeline so tables are always created at startup, not only during ingest - New `app/services/incidents.py`: create/list/get/delete + time-window entry association (FTS keyword search + raw window fallback) - New `entries_in_window()` in search.py: plain SQL scan for incident detail when keyword FTS returns nothing - REST endpoints: POST/GET /api/incidents, GET/DELETE /api/incidents/{id} - Incident detail returns up to 100 associated log entries sorted by timestamp, prioritising FTS keyword hits then ERROR/CRITICAL then all	2026-05-09 15:37:14 -07:00
pyr0ball	3e6eabb7ce	feat: initial Turnstone POC — ingest, FTS search, MCP server Ingest pipeline (journald / Caddy / Docker-wrapped formats) with per-source state tracking (repeat dedup, out-of-order detection), named pattern tagging at ingest time, and idempotent SHA1-keyed writes. FTS5 search layer with porter stemmer, severity/source/pattern/time filters, and BM25 ranking. MCP server (FastMCP stdio) with three tools: search_logs, diagnose, list_log_sources — compatible with both Claude Code and Copilot CLI. WAL mode enabled on all connections. FTS index auto-built after ingest. MCP configs included for Claude Code (.mcp.json) and Copilot CLI (.github/copilot/mcp.json).	2026-05-08 12:12:34 -07:00
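
Commit 92d7c21530 describes collapsing source_ids with three or more colon segments into a prefix:host stem with a SQL CASE/INSTR expression. The sketch below shows one way such a grouping could look. The `entries` table layout, the `entry_count` column, and the `grouped_sources()` helper are assumptions made for illustration, not the actual `list_sources()` query. The connection also passes `timeout=30.0`, the value commit 9196465946 says is used for all connections.

```python
import sqlite3

# Hypothetical stem expression: keep the first two colon-separated segments
# when a source_id has three or more; otherwise keep the source_id as-is.
STEM_SQL = """
CASE
  WHEN INSTR(source_id, ':') > 0
   AND INSTR(SUBSTR(source_id, INSTR(source_id, ':') + 1), ':') > 0
  THEN SUBSTR(source_id, 1,
              INSTR(source_id, ':')
              + INSTR(SUBSTR(source_id, INSTR(source_id, ':') + 1), ':') - 1)
  ELSE source_id
END
"""


def open_db(path=":memory:"):
    # timeout=30.0 raises the SQLite busy timeout above the 5-second default.
    conn = sqlite3.connect(path, timeout=30.0)
    # Assumed minimal schema; the real table has more columns.
    conn.execute("CREATE TABLE IF NOT EXISTS entries (source_id TEXT, text TEXT)")
    return conn


def grouped_sources(conn):
    # One row per stem, with the number of distinct units folded into it.
    query = f"""
        SELECT {STEM_SQL} AS stem,
               COUNT(DISTINCT source_id) AS unit_count,
               COUNT(*) AS entry_count
        FROM entries
        GROUP BY stem
        ORDER BY stem
    """
    return conn.execute(query).fetchall()
```

With this expression, `muninn-journal:Muninn:ssh.service` and `muninn-journal:Muninn:cron.service` both fall under the stem `muninn-journal:Muninn`, while two-segment and single-segment source_ids are left untouched.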

11 commits
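
Commit 3155bde4ce gives the fusion formula hybrid_score = 0.6 * bm25_normalized + 0.4 * cosine_similarity. The sketch below illustrates that re-ranking step under some assumptions: the commit does not say how BM25 rank is normalized to [0,1], so min-max scaling is assumed here, and `hybrid_rerank()`, `normalize_bm25()`, and the candidate tuple layout are hypothetical names. The commit mentions numpy for the cosine step; plain Python is used here only to keep the sketch dependency-free.

```python
import math

BM25_WEIGHT = 0.6    # weights taken from the commit message
COSINE_WEIGHT = 0.4


def cosine_similarity(a, b):
    # Standard cosine similarity; returns 0.0 for a zero-length vector.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def normalize_bm25(ranks):
    # SQLite FTS5 bm25() yields lower (more negative) values for better
    # matches, so the best rank maps to 1.0 and the worst to 0.0.
    # Min-max scaling is an assumption, not confirmed by the commit.
    best, worst = min(ranks), max(ranks)
    if best == worst:
        return [1.0] * len(ranks)
    return [(worst - r) / (worst - best) for r in ranks]


def hybrid_rerank(query_vec, candidates, limit):
    # candidates: list of (text, bm25_rank, embedding) tuples (assumed shape).
    if not candidates:
        return []
    normalized = normalize_bm25([rank for _, rank, _ in candidates])
    scored = [
        (BM25_WEIGHT * bm25 + COSINE_WEIGHT * cosine_similarity(query_vec, emb), text)
        for bm25, (text, _, emb) in zip(normalized, candidates)
    ]
    # Highest hybrid score first, truncated to the requested limit.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:limit]
```

In this form, an entry with a weak keyword match but an embedding close to the query can still outrank an entry with a moderate keyword match and an unrelated embedding, which is the behavior the commit describes for symptom-based queries.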