turnstone

Author	SHA1	Message	Date
pyr0ball	4a2fd0fb0d	feat(pipeline): add TURNSTONE_CLASSIFIER_MODEL env var for Stage 2 ML config Makes the HuggingFace classifier model for Stage 2 configurable via TURNSTONE_CLASSIFIER_MODEL. When unset (default), Stage 2 falls back to pattern_tags then regex — no download required on first run. Also documents TURNSTONE_MULTI_AGENT_DIAGNOSE, TURNSTONE_CLASSIFIER_MODEL, TURNSTONE_EMBED_BACKEND/MODEL/DEVICE in .env.example.	2026-05-25 19:11:32 -07:00
pyr0ball	94d796e103	refactor: pipeline cleanup — 6 follow-up fixes (#33-#38) - #33: Wrap ClassifiedTimeline.cluster_severities in MappingProxyType for true immutability (frozen=True only blocks field reassignment, not dict mutation). - #34: Remove dead suppression branch in synthesizer._build_hypothesis_block. active[] is already filtered to not rh.suppress, so the 'Yes — suppressed' branch was unreachable. Now shows novelty score only. - #35: Extract shared _llm_client.py with call_llm() + extract_content() + strip_json_fences(). Both RootCauseHypothesizer and SummarySynthesizer now import from one source. Also strips JSON fences from LLM output before parsing in hypothesizer._parse_response. - #36: Add per-stage try/except in pipeline.run_pipeline(). Unhandled stage exceptions now emit {type: 'error'} + {type: 'done'} SSE events instead of silently closing the stream. - #37: Move format_context_block() call inside the legacy LLM branch in diagnose/__init__.py — it was being computed unconditionally but only used in the non-pipeline path. - #38: Coerce supporting_cluster_ids items to str() in hypothesizer _parse_response to guard against LLMs returning integers instead of string cluster IDs.	2026-05-25 19:05:56 -07:00
pyr0ball	86361f6c79	fix: invert suppress_threshold semantics to similarity_threshold in FalsePositiveSuppressor Was suppressing when novelty_score < 0.85 (i.e. similarity > 0.15), which would suppress nearly every hypothesis once embeddings are active. Now suppresses when max_sim >= similarity_threshold (0.85), meaning only hypotheses that are 85%+ similar to a resolved incident are suppressed. Also renames suppress_threshold → similarity_threshold for clarity and adds a borderline boundary test (0.85 suppressed, 0.84 not suppressed). Closes: #29	2026-05-25 18:58:52 -07:00
pyr0ball	255c9111d4	fix: tighten suppression_reason display guard, document unused since/until params	2026-05-25 15:02:48 -07:00
pyr0ball	8cbd981ec7	feat: Stage 5 synthesizer + pipeline orchestrator + feature flag wiring (issue #29) - Add app/services/diagnose/synthesizer.py: SummarySynthesizer (Stage 5) - Builds structured LLM prompt from ranked hypotheses, timeline, RAG context - Excludes suppressed hypotheses from the narrative prompt - Deterministic fallback when no LLM configured or LLM call fails - Same cf-orch task endpoint + direct OpenAI-compat fallback pattern as other stages - Replace pipeline.py stub with full run_pipeline() async generator - Orchestrates all 5 stages via asyncio.to_thread for each synchronous stage - Yields typed SSE event dicts: status, pipeline_stage (1-4), hypotheses, reasoning, done - Suppressor counts (active vs suppressed) reported in stage 4 event message - Wire MULTI_AGENT_ENABLED feature flag into diagnose_stream() - TURNSTONE_MULTI_AGENT_DIAGNOSE=true routes through run_pipeline() - pipeline emits its own done event; legacy path unchanged when flag is false - Import of run_pipeline added to __init__.py - Add 21 new tests (350 -> 371 passing): - tests/test_diagnose_synthesizer.py: 8 tests (with/without LLM, suppressed, empty ranked, LLM failure fallback) - tests/test_diagnose_pipeline.py: 13 tests (flag off, flag on event sequence, empty entries, no LLM, stage 1 cluster count message) Closes: #29	2026-05-25 14:56:25 -07:00
pyr0ball	9bfae16b54	refactor: extract _score_hypothesis helper, fix exception types, pass device in suppressor	2026-05-25 14:41:33 -07:00
pyr0ball	174cb126e6	feat: Stage 4 — FalsePositiveSuppressor for multi-agent diagnose pipeline (issue #29) - Implements FalsePositiveSuppressor using embedding cosine similarity - Lazy corpus embedding via get_embedder() with module-level cache keyed by db_path - Cache invalidated automatically when the resolved incident corpus changes - Suppresses hypotheses with novelty_score below configurable threshold (default 0.85) - Full fallback path (novelty=1.0, no suppression) when model_id empty, embedding service unavailable, or no resolved incidents found in DB - Graceful handling of missing incidents table and DB query failures - Numpy bool_ leakage prevented by explicit float()/bool() coercion at assignment - Pure-Python cosine fallback for environments without numpy - 9 new tests (all mocked, no real model downloads): passthrough, suppress, no-suppress, empty list, ranking, empty corpus, DB failure, service unavailable, cache invalidation - 350 total tests passing (341 pre-existing + 9 new) Closes: #29	2026-05-25 14:28:31 -07:00
pyr0ball	e8c66972fa	fix: defensive coercion for LLM confidence and cluster fields in hypothesizer - Add _coerce_float() module-level helper: catches TypeError/ValueError from non-numeric LLM output (e.g. 'high', 'N/A') and returns a caller-supplied default instead of raising. - Replace float(item.get('confidence', 0.5)) with _coerce_float(item.get('confidence'), 0.5) in _parse_response. - Guard supporting_cluster_ids: tuple(item.get(...) or []) so a JSON null from the LLM does not cause TypeError('NoneType is not iterable'). - runbook_refs is hardcoded as () and not sourced from LLM output; no change needed there. - Add test_non_numeric_confidence_uses_default (Test 10) to cover the 'high' string case: asserts no exception and confidence == 0.5. - 341 tests passing (+1). Closes: #29	2026-05-25 14:00:30 -07:00
pyr0ball	eefd65f903	feat: Stage 3 — RootCauseHypothesizer for multi-agent diagnose pipeline (issue #29) - Add app/services/diagnose/hypothesizer.py with RootCauseHypothesizer class - Stage 3 of the multi-agent diagnose pipeline: accepts ClassifiedTimeline + RetrievedContext, builds a structured JSON prompt, calls the LLM via the same cf-orch task → OpenAI-compat fallback pattern used by llm.py - Parses JSON array response into list[Hypothesis] dataclasses with UUID ids, severity validation (WARNING→WARN, unknown→ERROR), confidence coercion - Gracefully returns [] when llm_url/llm_model absent or clusters empty - Add tests/test_diagnose_hypothesizer.py: 12 tests, all mocked, no LLM I/O covering: valid response, UUID generation, malformed JSON, non-list JSON, empty clusters, missing URL/model, max_hypotheses cap, severity mapping, confidence string coercion - 340 tests passing (328 prior + 12 new) Closes: #29	2026-05-25 13:49:18 -07:00
pyr0ball	912ba7ac16	feat: Stage 2 — SeverityClassifier for multi-agent diagnose pipeline (issue #29) Three-path classification: ML (transformers pipeline, lazy singleton) → pattern_tags (YAML pattern severity dict) → regex (detect_severity). - Path A: HF text-classification pipeline loaded lazily on first classify() call via module-level singleton; shim promotes ERROR+keyword hits to CRITICAL and demotes low-confidence INFO to DEBUG. - Path B: maps cluster.pattern_tags through the loaded pattern severity dict; picks the highest severity across matching tags. - Path C: falls back to detect_severity() regex scan on representative_text; defaults to INFO when no keyword matches. - Pattern file resolved from constructor arg or TURNSTONE_PATTERNS env var (mirrors app/rest.py convention). - No crash when transformers is not installed; ImportError on per-cluster ML inference triggers clean per-cluster fallback to pattern_tags/regex. - ClassifiedTimeline.classifier_used reflects the primary session path. Tests (10 new, 328 total, all passing): - ML ERROR, CRITICAL promotion, DEBUG demotion, WARNING→WARN - pattern_tags resolution from YAML fixture - regex ERROR detection and INFO default - ImportError clean fallback - empty timeline no-crash - ClassifiedTimeline FrozenInstanceError on mutation Closes: #29	2026-05-25 13:27:17 -07:00
pyr0ball	3b04c81a2b	refactor: split TimelineReconstructor.reconstruct into helpers, fix magic number + error handling - Add gap_significance_seconds constructor param (default 30) to replace hardcoded magic number in gap_count computation - _parse_iso now returns datetime | None with try/except on ValueError; all callers handle None return by treating malformed timestamps as absent - Extract reconstruct into four private helpers: _sort_entries, _group_into_raw_clusters, _build_cluster, _dominant_sources_tuple - Promote _sort_key to module-level function (was nested inside reconstruct) - Rename old module-level _build_cluster to _make_event_cluster to avoid name collision with new instance method - Add explanatory comment to type: ignore[arg-type] at _highest_severity call site - Black-formatted	2026-05-25 13:22:18 -07:00
pyr0ball	7cff98b1c3	feat: Stage 1 — TimelineReconstructor for multi-agent diagnose pipeline (issue #29) - Add app/services/diagnose/timeline.py: pure-Python TimelineReconstructor - Sorts entries by timestamp_iso (None entries appended at end) - Sliding-window clustering anchored to first entry in each cluster - Computes cluster_id (sha1[:12]), severity (highest wins), burst flag, gap_before_seconds, representative_text (highest rank, longest text tiebreak) - Builds TimelineResult with dominant_sources sorted by entry count descending - Update pipeline.py stub to import TimelineReconstructor (Task 6 wiring prep) - Add tests/test_diagnose_timeline.py: 15 tests covering all 13 required cases plus null-timestamp edge case variant; all 318 tests passing Closes: #29	2026-05-25 12:54:15 -07:00
pyr0ball	959a6cbf1c	fix: frozen dataclasses, clean __all__, improve exception logging in diagnose package	2026-05-25 12:31:07 -07:00
pyr0ball	664ab50433	refactor: convert diagnose module to package for multi-agent pipeline (issue #29) - Move app/services/diagnose.py verbatim to app/services/diagnose/legacy.py - Create app/services/diagnose/__init__.py with full implementation so that patch('app.services.diagnose._HAS_DATEPARSER') targets the correct namespace and all 303 existing tests continue to pass without modification - Add app/services/diagnose/models.py with 5 pipeline dataclasses: EventCluster, TimelineResult, ClassifiedTimeline, Hypothesis, RankedHypothesis - Add app/services/diagnose/pipeline.py with run_pipeline() stub (Task 6) - Add MULTI_AGENT_ENABLED feature flag (off by default via env var) - Zero behavior change; ruff clean Closes: #29	2026-05-25 11:12:39 -07:00
pyr0ball	5f32a6678d	refactor: extract embeddings service layer — decouple context embedder from Ollama - New app/services/embeddings.py: TURNSTONE_EMBED_* env vars, multi-backend support - embedder.py delegates to service layer; re-exports EMBEDDING_AVAILABLE for compat - retriever.py updated to use service layer - Test coverage updated in tests/context/test_embedder.py	2026-05-25 11:01:25 -07:00
pyr0ball	12cd0a23d5	refactor: rename ingest → glean throughout codebase Renames the app/ingest/ package to app/glean/ and updates all references across Python modules, shell scripts, Vue components, tests, and documentation. Intentionally preserved: - SQLite column name ingest_time (avoids schema migration) - RetrievedEntry.ingest_time field (maps to the column above) - Any public-facing JSON keys that reference ingest_time Changes by category: - app/ingest/ → app/glean/ (full package move, all parsers) - app/tasks/ingest_scheduler.py → app/tasks/glean_scheduler.py - scripts/ingest_corpus.py → scripts/glean_corpus.py - tests/test_ingest_.py → tests/test_glean_.py - Docstrings, log messages, comments: ingest → glean - Env var: TURNSTONE_INGEST_INTERVAL → TURNSTONE_GLEAN_INTERVAL - Shell scripts: glean.log, glean_corpus.py references - README.md: multi-source ingest → multi-source glean - .env.example: updated env var name - patterns/: new diagnostic patterns from 2026-05-20 SSH incident (service_crash_loop, pkg_daemon_restart, ssh_forward_conflict) - SourcesView.vue: pipeline label updated - All test import paths updated to app.glean.* 285 tests passing.	2026-05-20 23:02:55 -07:00
pyr0ball	7f63f155e2	fix(blocklist): get_candidate for O(1) push/unblock, 400 on malformed device_names JSON	2026-05-15 21:19:02 -07:00
pyr0ball	e44c6fd680	feat(blocklist): 6 REST endpoints + Pi-hole settings fields Add blocklist candidate listing, scan trigger, status update, push/unblock to Pi-hole, and connection test endpoints. Add pihole_url/version/api_key and router_source_ids/device_names fields to SettingsBody and prefs handling in patch_settings. Add PiholeClient.__post_init__ validation so 503 fires naturally when url/api_key are unconfigured (mock-safe: bypassed in tests).	2026-05-15 21:15:09 -07:00
pyr0ball	c813832cbe	feat(blocklist): extraction scan + candidate CRUD + full test suite	2026-05-15 21:05:49 -07:00
pyr0ball	0e887837d1	fix(blocklist): validate _v6_auth session JSON, add auth-failure test	2026-05-15 21:03:03 -07:00
pyr0ball	a683297d8b	feat(blocklist): Pi-hole v5/v6 API client + tests PiholeClient dataclass supporting both Pi-hole v5 (PHP /admin/api.php) and v6 (REST /api/) with public block/unblock/test_connection methods. 9 tests covering both API versions, auth flow, and error handling.	2026-05-15 21:00:01 -07:00
pyr0ball	1a3c753093	fix(blocklist): remove premature imports from blocklist.py (Task 2 scope)	2026-05-15 20:58:04 -07:00
pyr0ball	8832061de2	feat(blocklist): telemetry YAML list + loader + domain matcher Adds patterns/telemetry.yaml with 6 rule groups (samsung, belkin, roku, lg, amazon, advertising). Adds app/services/blocklist.py with TelemetryRule and BlocklistCandidate dataclasses, load_telemetry_rules(), and matches_telemetry() with exact and subdomain matching. 6 new TestTelemetry tests pass; 199 total passing.	2026-05-15 20:54:40 -07:00
pyr0ball	63af5aa14b	fix: time window regex misses fuzzy quantifiers like 'last few hours' The relative-time regex only matched digits between 'last/past' and the unit, so 'last few hours' fell through to dateparser which then found the bare word 'hours' and resolved it as midnight local time. Extended the regex to capture 'few', 'couple of', 'several', 'a few' as approximate quantifiers, mapped to 3 units each. Numeric expressions and bare 'last hour' still work as before.	2026-05-13 18:32:54 -07:00
pyr0ball	f19f896300	feat: inject environment context into diagnose pipeline and LLM prompt - Add context_block param to summarize() and thread it into _PROMPT_TEMPLATE - Wire retrieve_context/format_context_block into diagnose_stream() before log search; emit context SSE event (facts + chunks) to the client - 3 new tests covering prompt injection and SSE event emission (155 total, all pass)	2026-05-13 16:29:26 -07:00
pyr0ball	734e81c8ca	feat: SSE streaming diagnose, severity filter pills, per-source-cap search - diagnose_stream() async generator: status/summary/entries/reasoning/done events - POST /api/diagnose/stream SSE endpoint wired in rest.py - entries_in_window() gains per_source_cap to prevent high-volume sources crowding results - QuickCapture: severity filter pills, filtered entries view, pipeline status spinner - llm.py: remove overly broad HTTPStatusError re-raise	2026-05-13 15:45:35 -07:00
pyr0ball	909bb3f78b	feat: try cf-orch task endpoint first; fall back to direct model call POST /api/inference/task with product=turnstone task=log_analysis routes to the security reasoning model assigned in cf-orch. Falls back to the OpenAI- compat /v1/chat/completions path on 404 (no assignment) or if the task endpoint is absent (local instances, xanderland).	2026-05-13 08:20:29 -07:00
pyr0ball	b88c6d7ebf	feat: source-scoped diagnose; multi-node Docker log collection - Diagnose: add source_filter param threaded through entries_in_window, search, _diagnose, and DiagnoseRequest — clicking diagnose on a dashboard source now scopes both keyword and window hits to that source - QuickCapture: read route.query.source; show scope badge with clear ✕; auto-run when source param is present without a query - DashboardView: pass source= (not q=) when navigating to diagnose - collect_cluster_logs.sh: auto-discover Docker containers on all nodes (Heimdall non-watched, Navi, Strahl via SSH); collect Cass Plex logs via SSH; write to per-node dirs for directory-mode ingest - turnstone-cluster.service: add --reload for hot-reload during dev	2026-05-13 08:10:42 -07:00
pyr0ball	03b796eb6e	fix: correct cf-orch port to 7700; fix relative time parsing in diagnose; fix syslog PRI prefix	2026-05-13 05:33:41 -07:00
pyr0ball	dda0b453c2	fix: increase LLM summarize timeout to 120s for remote cf-orch routing 20s was too tight for first-request model swaps in Ollama (model cold load can take 30-60s). 120s matches coordinator inference timeout.	2026-05-12 18:27:52 -07:00
pyr0ball	765d2cb2df	feat: switch LLM backend to OpenAI-compat; add cf-orch remote inference support Turnstone now calls /v1/chat/completions instead of Ollama's /api/generate. This format works with both local Ollama (>=0.1.24) and a remote cf-orch coordinator, enabling GPU-less nodes like Xander's to route diagnoses through the cluster without any local model. - llm.py: OpenAI-compat messages format, optional Bearer auth header - diagnose.py: thread llm_api_key through the call chain - rest.py: llm_api_key pref (default empty), SettingsBody field, passed to diagnose - SettingsView.vue: API Key field, label updated from "Ollama URL" to "LLM Endpoint URL" - tests: updated mocks for new response shape; added bearer token assertion test	2026-05-12 12:58:38 -07:00
pyr0ball	bd35a75137	feat: severity overrides + last_ingested timestamp on dashboard	2026-05-11 13:00:11 -07:00
pyr0ball	b540060639	feat: LLM reasoning layer — Ollama summarization on diagnose results	2026-05-11 11:35:07 -07:00
pyr0ball	b786eee49c	fix: correct time_detected logic, immutable sort pattern, add diagnose() test	2026-05-11 09:08:24 -07:00
pyr0ball	21b988fd66	feat: add diagnose service with NL time extraction via dateparser Adds app/services/diagnose.py with parse_time_window() (dateparser-backed NL time phrase extraction with 60-min fallback) and diagnose() (layered FTS + window search returning severity/source summary). Includes 5 TDD tests.	2026-05-11 09:04:50 -07:00
pyr0ball	18bb93abc9	feat: incident labeling, bundle export, and push/receive flow Turnstone incidents now carry an issue_type tag (free-text with datalist suggestions) used to categorize patterns for signature building. Backend: - Incident model gains issue_type; additive ALTER TABLE migration keeps existing DBs working without a full schema rebuild - New received_bundles table stores incoming JSON bundles with indexes on bundled_at and issue_type - build_bundle() assembles incident + related log entries into a versioned bundle dict; store_bundle()/list_bundles()/get_bundle() for the receiver - POST /api/incidents/{id}/send — pushes bundle to TURNSTONE_BUNDLE_ENDPOINT - GET /api/incidents/{id}/bundle — export without sending - POST /api/bundles — receive and store an incoming bundle - GET /api/bundles — list all received bundles - TURNSTONE_SOURCE_HOST and TURNSTONE_BUNDLE_ENDPOINT env vars; auto-set source host from hostname in podman-standalone.sh Frontend: - Incidents form: issue_type field with datalist suggestions; Type column in the table; Send Bundle button + status feedback in the detail drawer - New BundlesView: collapsible bundle rows, inline JSON parse (no extra round-trip), Export JSON download button - Router and nav updated with /bundles route	2026-05-11 05:23:55 -07:00
pyr0ball	fa4d23dd20	feat: dashboard view, stats API, and composite index for query perf - Add GET /api/stats endpoint with 24h windowed aggregation (criticals, errors, per-source health, recent criticals list) - Fix timestamp format bug: strftime('%Y-%m-%dT%H:%M:%S', ...) to match stored ISO-8601 T-separated timestamps (datetime('now') uses space) - Add composite index idx_ts_repeat(timestamp_iso, repeat_count) — drops stats query from 3.5 s to <1 ms by resolving both WHERE conditions from the index without table row fetches - New DashboardView: 3 stat cards, source health table with health dots, diagnose-per-source button, recent criticals panel, zero-state card - Router default / → /dashboard; Dashboard first in nav - DiagnoseView: reads ?q= query param on mount and auto-runs; shows formatted LLM summary block - LogEntryRow: expand/collapse for long entries (>200 chars or multiline)	2026-05-11 03:41:55 -07:00
pyr0ball	90849a2c3a	fix: bypass FTS ranking for named-source error retrieval When diagnose() auto-detects a source name, FTS keyword scoring can bury real errors whose text doesn't match the symptom query. Add recent_source_errors() — a plain-SQL scan ordered by timestamp — so the most recent errors from a known service always surface regardless of keyword overlap.	2026-05-10 08:14:23 -07:00
pyr0ball	f9ab4e5bb0	feat: incident tagging — DB schema, CRUD service, REST API (#1) - Add `incidents` table to SQLite schema (id, label, started_at, ended_at, notes, created_at, severity) - Extract `ensure_schema()` from ingest pipeline so tables are always created at startup, not only during ingest - New `app/services/incidents.py`: create/list/get/delete + time-window entry association (FTS keyword search + raw window fallback) - New `entries_in_window()` in search.py: plain SQL scan for incident detail when keyword FTS returns nothing - REST endpoints: POST/GET /api/incidents, GET/DELETE /api/incidents/{id} - Incident detail returns up to 100 associated log entries sorted by timestamp, prioritising FTS keyword hits then ERROR/CRITICAL then all	2026-05-09 15:37:14 -07:00
pyr0ball	3e6eabb7ce	feat: initial Turnstone POC — ingest, FTS search, MCP server Ingest pipeline (journald / Caddy / Docker-wrapped formats) with per-source state tracking (repeat dedup, out-of-order detection), named pattern tagging at ingest time, and idempotent SHA1-keyed writes. FTS5 search layer with porter stemmer, severity/source/pattern/time filters, and BM25 ranking. MCP server (FastMCP stdio) with three tools: search_logs, diagnose, list_log_sources — compatible with both Claude Code and Copilot CLI. WAL mode enabled on all connections. FTS index auto-built after ingest. MCP configs included for Claude Code (.mcp.json) and Copilot CLI (.github/copilot/mcp.json).	2026-05-08 12:12:34 -07:00

40 commits
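
The threshold fix described in 86361f6c79 can be illustrated with a small sketch. It assumes novelty is computed as 1 minus the maximum cosine similarity against resolved incidents, which the commit message implies but does not show. The function names below (`cosine`, `max_similarity`, `suppress_old`, `suppress_new`) are hypothetical and are not taken from the repository.

```python
import math


def cosine(a, b):
    """Pure-Python cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def max_similarity(hypothesis_vec, resolved_vecs):
    """Highest similarity to any resolved incident; 0.0 when the corpus is empty."""
    return max((cosine(hypothesis_vec, v) for v in resolved_vecs), default=0.0)


def suppress_old(max_sim, suppress_threshold=0.85):
    """Pre-fix rule (assumed form): suppress when novelty < threshold."""
    novelty = 1.0 - max_sim
    return novelty < suppress_threshold


def suppress_new(max_sim, similarity_threshold=0.85):
    """Post-fix rule: suppress only when similarity reaches the threshold."""
    return max_sim >= similarity_threshold


if __name__ == "__main__":
    # A hypothesis only 20% similar to any resolved incident.
    print(suppress_old(0.2), suppress_new(0.2))
    # The borderline cases named in the commit message.
    print(suppress_new(0.85), suppress_new(0.84))
```

Under the old rule, a weakly related hypothesis (similarity 0.2, novelty 0.8) is still suppressed, which matches the commit's note that nearly every hypothesis would be suppressed once embeddings are active. With an empty corpus, `max_similarity` returns 0.0, which corresponds to the novelty=1.0 passthrough described in 174cb126e6.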