Commit graph

61 commits

Author SHA1 Message Date
7cff98b1c3 feat: Stage 1 — TimelineReconstructor for multi-agent diagnose pipeline (issue #29)
- Add app/services/diagnose/timeline.py: pure-Python TimelineReconstructor
  - Sorts entries by timestamp_iso (None entries appended at end)
  - Sliding-window clustering anchored to first entry in each cluster
  - Computes cluster_id (sha1[:12]), severity (highest wins), burst flag,
    gap_before_seconds, representative_text (highest rank, longest text tiebreak)
  - Builds TimelineResult with dominant_sources sorted by entry count descending
- Update pipeline.py stub to import TimelineReconstructor (Task 6 wiring prep)
- Add tests/test_diagnose_timeline.py: 15 tests covering all 13 required cases
  plus null-timestamp edge case variant; all 318 tests passing

Closes: #29
2026-05-25 12:54:15 -07:00
959a6cbf1c fix: frozen dataclasses, clean __all__, improve exception logging in diagnose package 2026-05-25 12:31:07 -07:00
664ab50433 refactor: convert diagnose module to package for multi-agent pipeline (issue #29)
- Move app/services/diagnose.py verbatim to app/services/diagnose/legacy.py
- Create app/services/diagnose/__init__.py with full implementation so that
  patch('app.services.diagnose._HAS_DATEPARSER') targets the correct namespace
  and all 303 existing tests continue to pass without modification
- Add app/services/diagnose/models.py with 5 pipeline dataclasses:
  EventCluster, TimelineResult, ClassifiedTimeline, Hypothesis, RankedHypothesis
- Add app/services/diagnose/pipeline.py with run_pipeline() stub (Task 6)
- Add MULTI_AGENT_ENABLED feature flag (off by default via env var)
- Zero behavior change; ruff clean

Closes: #29
2026-05-25 11:12:39 -07:00
5f32a6678d refactor: extract embeddings service layer — decouple context embedder from Ollama
- New app/services/embeddings.py: TURNSTONE_EMBED_* env vars, multi-backend support
- embedder.py delegates to service layer; re-exports EMBEDDING_AVAILABLE for compat
- retriever.py updated to use service layer
- Test coverage updated in tests/context/test_embedder.py
2026-05-25 11:01:25 -07:00
2fde3a1814 feat: fingerprint-based incremental glean — skip unchanged files (#30)
- Add glean_fingerprints table to schema (sha256 + mtime + size)
- _fingerprint(), _fp_unchanged(), _save_fingerprint() helpers in pipeline.py
- _glean_files() now checks fingerprint; skips file if hash unchanged
- force=True param threads through glean_dir → glean_file → glean_sources
- POST /api/tasks/glean and POST /api/sources/{id}/glean accept force=true
- 14 unit tests in tests/test_glean_fingerprint.py, all passing

Closes: #30
2026-05-25 11:01:18 -07:00
e746d55730 feat: SSH remote glean — transport layer, pipeline integration, REST + UI (#22)
Closes turnstone#22.

## Transport layer (app/glean/ssh.py)
- SSHTransport context manager: key-only auth, paramiko backend
- SSHConnectionError / SSHCommandError exception hierarchy
- exec_stream() generator: yields stdout lines, raises SSHCommandError on
  non-zero exit (isinstance(int) guard for test-mock safety)
- Command builders: _build_journald_command, _build_syslog_command,
  _build_plaintext_command, _build_docker_command
- 18 unit tests in tests/test_glean_ssh.py

## Pipeline integration (app/glean/pipeline.py)
- _stream_and_write(): per-item error isolation — SSHCommandError skips
  one glean item without aborting the rest of the host connection
- _glean_ssh_source(): one SSHTransport per host, dispatches all glean
  items (journald/syslog/plaintext/docker); SSHConnectionError aborts host
- glean_sources(): splits local vs SSH sources; local → _glean_files();
  SSH → _glean_ssh_source(); shared compiled patterns and DB connection
- glean_ssh_source(): public wrapper for REST use — manages DB connection,
  pattern compilation, FTS rebuild lifecycle
- 15 integration tests in tests/test_glean_pipeline_ssh.py
- All 285 tests passing

## REST layer (app/rest.py)
- GET /api/sources/configured: reads sources.yaml and enriches with DB
  stats; SSH sources appear before first glean (entry_count=0); sub-source
  IDs (rack01/journald, rack01/docker/myapp) aggregated per host entry
- POST /api/sources/{id}/glean: detects transport:ssh and dispatches to
  glean_ssh_source() wrapper; local sources unchanged
- Import: glean_ssh_source as _glean_ssh_source

## Frontend (web/src/views/SourcesView.vue)
- Fetches /api/sources/configured (primary) + /api/sources (DB-only) in
  parallel; merges into unified SourceRow list
- SSH sources show: ssh badge (with user@host tooltip), glean-type pills
  (journald/syslog/docker/etc.), host subtitle
- SSH sub-source IDs (rack01/journald) suppressed from the DB-only list
  since they are covered by the parent SSH row
- DB-only sources (uploads) appear below configured sources with 'uploaded'
  badge; reglean button disabled (not in sources.yaml)
- Delete zeroes out configured-source stats in-place rather than removing
  the row (so the source remains visible for re-gleaning)
2026-05-21 12:37:30 -07:00
81a9b0f49d feat: SSH remote host glean — transport layer and pipeline integration (closes #22, backend)
Adds SSH-based log collection from remote hosts via Paramiko.
One SSH connection per host, multiple log types per connection.

New files:
- app/glean/ssh.py: SSHTransport context manager + command builders
  for journald, syslog, plaintext, and docker log types
- tests/test_glean_ssh.py: 18 tests for transport layer (all mocked)
- tests/test_glean_pipeline_ssh.py: 15 tests for pipeline integration

Pipeline changes (app/glean/pipeline.py):
- glean_sources() now splits sources into local-file and SSH categories
- SSH sources use transport: ssh + glean: list schema in sources.yaml
- _glean_ssh_source(): one SSHTransport per host, N commands per connection
- _stream_and_write(): SSHCommandError caught per-item so one bad
  command does not abort the rest of the host's glean items
- SSHConnectionError skips the entire host with a warning log

SSH source schema (sources.yaml):
  - id: rack01
    transport: ssh
    host: 192.168.1.10
    user: admin
    key_path: ~/.ssh/id_ed25519
    glean:
      - type: journald
        args: [--since, 2 hours ago]
      - type: syslog
        path: /var/log/syslog
      - type: plaintext
        path: /var/log/app/error.log
      - type: docker
        containers: [myapp, nginx]

Key design decisions:
- Key-based auth only (no password prompts in daemon context)
- exit-status check fires after all stdout lines yielded; callers
  drain the iterator to trigger it
- Local file sources path unchanged; SSH sources co-exist in same yaml
- Docker multi-container: one exec_stream call per container,
  source_id scoped as host_id/type/container_name

Remaining for #22: REST endpoint, SourcesView UI, sources.yaml docs.
285 → 285 tests passing (33 new SSH tests).
2026-05-20 23:03:13 -07:00
12cd0a23d5 refactor: rename ingest → glean throughout codebase
Renames the app/ingest/ package to app/glean/ and updates all
references across Python modules, shell scripts, Vue components,
tests, and documentation.

Intentionally preserved:
- SQLite column name ingest_time (avoids schema migration)
- RetrievedEntry.ingest_time field (maps to the column above)
- Any public-facing JSON keys that reference ingest_time

Changes by category:
- app/ingest/ → app/glean/ (full package move, all parsers)
- app/tasks/ingest_scheduler.py → app/tasks/glean_scheduler.py
- scripts/ingest_corpus.py → scripts/glean_corpus.py
- tests/test_ingest_*.py → tests/test_glean_*.py
- Docstrings, log messages, comments: ingest → glean
- Env var: TURNSTONE_INGEST_INTERVAL → TURNSTONE_GLEAN_INTERVAL
- Shell scripts: glean.log, glean_corpus.py references
- README.md: multi-source ingest → multi-source glean
- .env.example: updated env var name
- patterns/: new diagnostic patterns from 2026-05-20 SSH incident
  (service_crash_loop, pkg_daemon_restart, ssh_forward_conflict)
- SourcesView.vue: pipeline label updated
- All test import paths updated to app.glean.*

285 tests passing.
2026-05-20 23:02:55 -07:00
82977f365b feat: periodic ingest scheduler + Orchard submission pipeline
Adds asyncio-native background scheduler (TURNSTONE_INGEST_INTERVAL,
default 900s) that runs batch ingest then pushes pattern-matched entries
to a remote CF harvest endpoint (TURNSTONE_SUBMIT_ENDPOINT).

- app/tasks/ingest_scheduler.py: IngestState, scheduler_loop, run_once,
  submit_matched, _query_matched_since — asyncio.Lock prevents concurrent runs
- app/rest.py: POST /api/ingest/batch (pre-parsed entry receiver),
  GET /api/tasks/ingest/status, POST /api/tasks/ingest (manual trigger),
  TURNSTONE_INGEST_INTERVAL + TURNSTONE_SUBMIT_ENDPOINT env wiring in lifespan
- docker-compose.submissions.yml: segregated daniel (8536) + xander (8537)
  receiving instances on Heimdall, isolated DBs under
  /devl/docker/turnstone-submissions/<node>/
- podman-standalone.sh: pass-through for TURNSTONE_SUBMIT_ENDPOINT +
  TURNSTONE_SOURCE_HOST
- app/ingest/mqtt_subscriber.py: MQTT log source adapter
- app/ingest/wazuh.py: Wazuh alert JSON adapter
- tests/test_ingest_wazuh.py: Wazuh adapter test suite
2026-05-20 08:57:25 -07:00
16fe5f70a5 feat: Alpha milestone — corpus management, upload ingest, harvester agent
Closes #1 (incident tagging — already implemented), #2, #3, #5.

- feat(api): DELETE /api/sources/{id} — purge entries + FTS rows for a source
- feat(api): POST /api/sources/{id}/ingest — re-ingest from sources.yaml
- feat(api): POST /api/ingest/upload — multipart log file upload with auto-detect
- feat(ui): SourcesView reingest + delete buttons and upload file input (#2)
- feat(harvester): harvester.py push + incident subcommands (#5)
- feat(harvester): Dockerfile, docker-compose.yml, harvester.sh (containerless)
- feat(config): GPU_SERVER_URL → CF_ORCH_URL resolution + write-back (#20)
- docs: .env.example, README Configuration table, version bump to 0.5.0
2026-05-19 07:45:58 -07:00
7f63f155e2 fix(blocklist): get_candidate for O(1) push/unblock, 400 on malformed device_names JSON 2026-05-15 21:19:02 -07:00
e44c6fd680 feat(blocklist): 6 REST endpoints + Pi-hole settings fields
Add blocklist candidate listing, scan trigger, status update,
push/unblock to Pi-hole, and connection test endpoints.
Add pihole_url/version/api_key and router_source_ids/device_names
fields to SettingsBody and prefs handling in patch_settings.
Add PiholeClient.__post_init__ validation so 503 fires naturally
when url/api_key are unconfigured (mock-safe: bypassed in tests).
2026-05-15 21:15:09 -07:00
c813832cbe feat(blocklist): extraction scan + candidate CRUD + full test suite 2026-05-15 21:05:49 -07:00
0e887837d1 fix(blocklist): validate _v6_auth session JSON, add auth-failure test 2026-05-15 21:03:03 -07:00
a683297d8b feat(blocklist): Pi-hole v5/v6 API client + tests
PiholeClient dataclass supporting both Pi-hole v5 (PHP /admin/api.php)
and v6 (REST /api/) with public block/unblock/test_connection methods.
9 tests covering both API versions, auth flow, and error handling.
2026-05-15 21:00:01 -07:00
1a3c753093 fix(blocklist): remove premature imports from blocklist.py (Task 2 scope) 2026-05-15 20:58:04 -07:00
8832061de2 feat(blocklist): telemetry YAML list + loader + domain matcher
Adds patterns/telemetry.yaml with 6 rule groups (samsung, belkin, roku, lg, amazon, advertising).
Adds app/services/blocklist.py with TelemetryRule and BlocklistCandidate dataclasses, load_telemetry_rules(), and matches_telemetry() with exact and subdomain matching.
6 new TestTelemetry tests pass; 199 total passing.
2026-05-15 20:54:40 -07:00
2967036503 feat(blocklist): blocklist_candidates schema + tests
Add blocklist_candidates table and indexes to _SCHEMA in pipeline.py.
Add TestSchema tests verifying table existence, column set, and status/hit_count defaults.
All 193 tests pass.
2026-05-15 20:51:00 -07:00
9e5c5da7e9 chore: remove stale load_patterns import from rest.py 2026-05-13 21:52:03 -07:00
950a854b58 fix: tautulli — hmac token compare, public pattern loader, startup cache, endpoint tests 2026-05-13 19:08:49 -07:00
72800332c9 fix: tautulli — entry_id collision on missing ts, token settings, test coverage 2026-05-13 19:04:07 -07:00
b61a85dc62 feat: Tautulli webhook ingest endpoint — plex events -> log_entries
POST /turnstone/api/ingest/tautulli accepts Tautulli notification agent
payloads and stores them as log_entries under source 'tautulli'. Severity
maps error->CRITICAL, buffer->WARN, all others->None. Optional bearer token
auth via X-Tautulli-Token header + tautulli_token pref. FTS index rebuilt
as a background task after each write. 28 new tests, all passing.
2026-05-13 18:41:03 -07:00
63af5aa14b fix: time window regex misses fuzzy quantifiers like 'last few hours'
The relative-time regex only matched digits between 'last/past' and
the unit, so 'last few hours' fell through to dateparser which then
found the bare word 'hours' and resolved it as midnight local time.

Extended the regex to capture 'few', 'couple of', 'several', 'a few'
as approximate quantifiers, mapped to 3 units each. Numeric expressions
and bare 'last hour' still work as before.
2026-05-13 18:32:54 -07:00
32f44700f9 fix: ingestors treat naive log timestamps as local time, not UTC
All five parsers (plex, syslog, servarr, qbittorrent, plaintext) were
using .replace(tzinfo=timezone.utc) on naive datetimes parsed from log
files, which slaps a UTC label on what is actually local-time data.
On a UTC-7 system a 2pm entry was stored as 14:00Z instead of 21:00Z,
causing time-window searches to return zero results.

Fix: use .astimezone(timezone.utc) instead, which treats the naive
datetime as local time and converts correctly.

Tests updated to round-trip back to local time for assertion so they
pass on any timezone, not just UTC.
2026-05-13 18:16:33 -07:00
e6075f80b3 fix: final review fixes — port guard, network error handling, wizard back nav, tablist arrow keys, dialog focus trap
- wizard.py: wrap syslog_port int() in try/except to default 514 on non-numeric input
- ContextView: add try/catch to doDelete, doDeleteFact, addFact for network errors
- ContextView: arrow-key navigation for tablist (ArrowLeft/ArrowRight)
- DiagnoseView: arrow-key navigation for tablist (ArrowLeft/ArrowRight)
- WizardOverlay: reset current_step to last schema step when clicking 'Go back and edit'
- WizardOverlay: focus trap on Tab/Shift+Tab within dialog element
2026-05-13 17:40:40 -07:00
82d3d37790 feat: optional sqlite-vec embedding pipeline for Paid-tier RAG 2026-05-13 16:32:57 -07:00
074240c061 feat: context REST API — docs, facts, wizard, and debug endpoints
Wires the context/RAG layer into FastAPI via a dedicated _ctx router
(/turnstone/api/context/*): document upload (POST/GET/DELETE /docs),
fact CRUD (POST/GET/DELETE /facts), wizard state machine
(/wizard/schema, /wizard/step, /wizard/apply), and a debug search
endpoint (/debug/search). All blocking DB calls are dispatched via
asyncio.to_thread to keep the event loop free.
2026-05-13 16:31:07 -07:00
f19f896300 feat: inject environment context into diagnose pipeline and LLM prompt
- Add context_block param to summarize() and thread it into _PROMPT_TEMPLATE
- Wire retrieve_context/format_context_block into diagnose_stream() before
  log search; emit context SSE event (facts + chunks) to the client
- 3 new tests covering prompt injection and SSE event emission (155 total, all pass)
2026-05-13 16:29:26 -07:00
abb61a6e90 feat: wizard state machine — structured Q&A writes context facts and source config 2026-05-13 16:25:52 -07:00
de662725ee feat: context retriever — keyword fact lookup and chunk search 2026-05-13 16:23:54 -07:00
d7b892bfcf feat: doc upload adapter — writes facts, document, and chunks to context store 2026-05-13 16:21:55 -07:00
c17bbf6e26 feat: context chunker — type detection, YAML extraction, text chunking
- Implement document type detection for yaml/json/markdown/text
- Extract service facts from docker-compose YAML (names, images, ports)
- Split text into overlapping word chunks (300-word default with 50-word overlap)
- Enforce 5 MB file size limit
- Comprehensive TDD test suite: 15 tests passing
2026-05-13 15:54:51 -07:00
36c9e607b7 feat: context store — fact and document CRUD 2026-05-13 15:53:03 -07:00
2a2f2e311a feat: add context_facts, context_documents, context_chunks tables to schema 2026-05-13 15:51:19 -07:00
734e81c8ca feat: SSE streaming diagnose, severity filter pills, per-source-cap search
- diagnose_stream() async generator: status/summary/entries/reasoning/done events
- POST /api/diagnose/stream SSE endpoint wired in rest.py
- entries_in_window() gains per_source_cap to prevent high-volume sources crowding results
- QuickCapture: severity filter pills, filtered entries view, pipeline status spinner
- llm.py: remove overly broad HTTPStatusError re-raise
2026-05-13 15:45:35 -07:00
909bb3f78b feat: try cf-orch task endpoint first; fall back to direct model call
POST /api/inference/task with product=turnstone task=log_analysis routes to
the security reasoning model assigned in cf-orch. Falls back to the OpenAI-
compat /v1/chat/completions path on 404 (no assignment) or if the task
endpoint is absent (local instances, xanderland).
2026-05-13 08:20:29 -07:00
b88c6d7ebf feat: source-scoped diagnose; multi-node Docker log collection
- Diagnose: add source_filter param threaded through entries_in_window,
  search, _diagnose, and DiagnoseRequest — clicking diagnose on a
  dashboard source now scopes both keyword and window hits to that source
- QuickCapture: read route.query.source; show scope badge with clear ✕;
  auto-run when source param is present without a query
- DashboardView: pass source= (not q=) when navigating to diagnose
- collect_cluster_logs.sh: auto-discover Docker containers on all nodes
  (Heimdall non-watched, Navi, Strahl via SSH); collect Cass Plex logs
  via SSH; write to per-node dirs for directory-mode ingest
- turnstone-cluster.service: add --reload for hot-reload during dev
2026-05-13 08:10:42 -07:00
03b796eb6e fix: correct cf-orch port to 7700; fix relative time parsing in diagnose; fix syslog PRI prefix 2026-05-13 05:33:41 -07:00
d80d4875db feat: add UDP syslog receiver for network device log collection
scripts/syslog_receiver.py: asyncio UDP server listening on port 5140,
appends raw syslog lines to network-syslog.txt for the Turnstone live
watcher to tail. Requires no root — port 5140 is non-privileged.

scripts/turnstone-syslog-receiver.service: systemd unit for auto-start.

app/ingest/syslog.py: strip optional RFC 3164 <PRI> prefix before
parsing so network-forwarded syslog (OpenWRT logd, Arista EOS, etc.)
is handled correctly without the PRI value breaking the regex.
2026-05-13 04:58:51 -07:00
dda0b453c2 fix: increase LLM summarize timeout to 120s for remote cf-orch routing
20s was too tight for first-request model swaps in Ollama (model cold load
can take 30-60s). 120s matches coordinator inference timeout.
2026-05-12 18:27:52 -07:00
765d2cb2df feat: switch LLM backend to OpenAI-compat; add cf-orch remote inference support
Turnstone now calls /v1/chat/completions instead of Ollama's /api/generate.
This format works with both local Ollama (>=0.1.24) and a remote cf-orch
coordinator, enabling GPU-less nodes like Xander's to route diagnoses through
the cluster without any local model.

- llm.py: OpenAI-compat messages format, optional Bearer auth header
- diagnose.py: thread llm_api_key through the call chain
- rest.py: llm_api_key pref (default empty), SettingsBody field, passed to diagnose
- SettingsView.vue: API Key field, label updated from "Ollama URL" to "LLM Endpoint URL"
- tests: updated mocks for new response shape; added bearer token assertion test
2026-05-12 12:58:38 -07:00
c6e22304d6 feat: add file tail source type; configure xanderland watchers
- type: file uses tail -F (handles rotation) with auto-format detection
- _parse_lines dispatches to journald/servarr/qbit/caddy/syslog/plaintext
  based on first-line format detection — same logic as batch ingest
- watch.yaml updated with file type docs and xanderland-specific example
- scripts/journal-bridge.sh + .service written directly to xanderland

Xander's watch.yaml covers: system-journal-live (via bridge file),
sonarr, radarr, lidarr, prowlarr, bazarr, qbittorrent, nzbget, tautulli
2026-05-11 15:44:10 -07:00
0497d0ad60 feat: live watch mode — tail journald/docker/podman sources continuously (#4)
Adds background watcher that tails active log sources and ingests entries
in near-real-time, keeping the DB fresh without manual ingest runs.

- app/watch/watcher.py: Watcher + WatchSource using subprocess + select
  loop; flushes every 10s or 100 lines; syncs FTS index every 3 flushes
- patterns/watch.yaml: declarative source config (journald/docker/podman)
- app/rest.py: lifespan context manager starts/stops watcher on app
  startup/shutdown; GET /api/watch/status + POST /api/watch/reload
- web/src/views/DashboardView.vue: live/manual indicator chip + stale
  banner copy adapts to whether live watching is active
- tests/test_watch_watcher.py: 16 tests covering config load, command
  building, docker timestamp stripping, orchestrator lifecycle

Closes #4
2026-05-11 15:34:13 -07:00
bd35a75137 feat: severity overrides + last_ingested timestamp on dashboard 2026-05-11 13:00:11 -07:00
b540060639 feat: LLM reasoning layer — Ollama summarization on diagnose results 2026-05-11 11:35:07 -07:00
5fe0fdd022 feat: add POST /api/diagnose and GET/PATCH /api/settings endpoints 2026-05-11 09:10:58 -07:00
b786eee49c fix: correct time_detected logic, immutable sort pattern, add diagnose() test 2026-05-11 09:08:24 -07:00
21b988fd66 feat: add diagnose service with NL time extraction via dateparser
Adds app/services/diagnose.py with parse_time_window() (dateparser-backed
NL time phrase extraction with 60-min fallback) and diagnose() (layered
FTS + window search returning severity/source summary). Includes 5 TDD tests.
2026-05-11 09:04:50 -07:00
1fa36631b3 feat: syslog and dmesg parsers with graceful journald fallback
- Add syslog.py — RFC 3164 parser for /var/log/syslog, /var/log/messages,
  auth.log, kern.log; ident prepended to message text for searchability
- Add dmesg_log.py — handles both relative [secs.usecs] and human-readable
  [Dow Mon DD HH:MM:SS YYYY] formats; relative timestamps preserved as raw
- Wire both into pipeline.py auto-detection (before plaintext fallback)
- Update export_journal.sh: checks for journalctl availability, falls back
  gracefully on non-systemd systems; adds dmesg -T export (falls back to
  plain dmesg on older kernels)
- Add syslog entries (commented) + dmesg source to sources.yaml
- 30 tests covering both parsers (detection + parse correctness)
2026-05-11 06:57:38 -07:00
ec931598ca feat: multi-source ingest via sources.yaml + servarr parser
- Add servarr.py parser for all *arr services (sonarr/radarr/lidarr/
  prowlarr/readarr/whisparr/bazarr) — pipe-delimited format with
  component prefix prepended for searchability
- Add ingest_sources() to pipeline.py; reads sources.yaml, skips
  missing paths with a warning so cron keeps running if a service
  is down
- Add --sources mode to ingest_corpus.py CLI; legacy positional args
  unchanged for backward compat
- Add patterns/sources.yaml with all of Xander's discovered service
  log paths (qbit, 7 servarr services, nzbget, tautulli, jellyseerr)
- Replace per-service volume mounts in podman-standalone.sh with
  /opt:/opt:ro + /var/log:/var/log:ro; adding a new source now
  requires only editing sources.yaml — no container restart
2026-05-11 06:26:32 -07:00