turnstone/patterns/sources-cluster.yaml
pyr0ball aa80f307fe refactor: rename ingest → glean throughout codebase
Renames the app/ingest/ package to app/glean/ and updates all
references across Python modules, shell scripts, Vue components,
tests, and documentation.

Intentionally preserved:
- SQLite column name ingest_time (avoids schema migration)
- RetrievedEntry.ingest_time field (maps to the column above)
- Any public-facing JSON keys that reference ingest_time

Changes by category:
- app/ingest/ → app/glean/ (full package move, all parsers)
- app/tasks/ingest_scheduler.py → app/tasks/glean_scheduler.py
- scripts/ingest_corpus.py → scripts/glean_corpus.py
- tests/test_ingest_*.py → tests/test_glean_*.py
- Docstrings, log messages, comments: ingest → glean
- Env var: TURNSTONE_INGEST_INTERVAL → TURNSTONE_GLEAN_INTERVAL
- Shell scripts: glean.log, glean_corpus.py references
- README.md: multi-source ingest → multi-source glean
- .env.example: updated env var name
- patterns/: new diagnostic patterns from 2026-05-20 SSH incident
  (service_crash_loop, pkg_daemon_restart, ssh_forward_conflict)
- SourcesView.vue: pipeline label updated
- All test import paths updated to app.glean.*

285 tests passing.
2026-05-20 23:02:55 -07:00
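
The "Intentionally preserved" items in the commit message above can be pictured with a small sketch. Only the column name `ingest_time` and the `RetrievedEntry.ingest_time` field come from the commit message; the `entries` table, the other columns, and the `fetch_entries` helper are hypothetical and stand in for whatever the real schema and query layer look like.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class RetrievedEntry:
    # Hypothetical shape: only ingest_time is named in the commit message.
    source_id: str
    message: str
    ingest_time: str  # keeps the pre-rename name so it matches the SQLite column


def fetch_entries(conn: sqlite3.Connection) -> list[RetrievedEntry]:
    """Read rows back using the unchanged column name (no migration needed)."""
    rows = conn.execute(
        "SELECT source_id, message, ingest_time FROM entries ORDER BY ingest_time"
    )
    return [RetrievedEntry(*row) for row in rows]


# In-memory database with an assumed table layout, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (source_id TEXT, message TEXT, ingest_time TEXT)")
conn.execute(
    "INSERT INTO entries VALUES (?, ?, ?)",
    ("navi-journal", "sshd restarted", "2026-05-20T23:02:55-07:00"),
)
entries = fetch_entries(conn)
```

Because the package rename touches only module paths and not the stored column, existing databases would stay readable without a schema change under this layout.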

55 lines, 2.4 KiB, YAML

# Turnstone log sources — Heimdall cluster glean.
# Covers: Heimdall (local), Navi, Sif, Cass, Strahl (SSH-collected),
# Docker services on Heimdall, and network device syslog.
#
# Collected by scripts/collect_cluster_logs.sh before each glean run.
# All paths are container-side (/data/ = bind-mount of /devl/turnstone-cluster/data/).
#
# Cron (collect + glean, every 15 min). Enter this as a single crontab line;
# cron does not support backslash continuation, so it is wrapped here only
# for readability:
#   */15 * * * * bash /Library/Development/CircuitForge/turnstone/scripts/collect_cluster_logs.sh && \
#     docker exec turnstone-cluster python scripts/glean_corpus.py \
#     --sources /patterns/sources-cluster.yaml --db /data/turnstone.db \
#     >> /var/log/turnstone-cluster-glean.log 2>&1
sources:
  # ── Heimdall (local) ─────────────────────────────────────────────────────────
  - id: heimdall-journal
    path: /data/heimdall-journal.jsonl
  - id: heimdall-dmesg
    path: /data/heimdall-dmesg.txt
  # ── Remote cluster nodes (SSH-collected journals) ────────────────────────────
  - id: navi-journal
    path: /data/navi-journal.jsonl
  - id: sif-journal
    path: /data/sif-journal.jsonl
  - id: cass-journal
    path: /data/cass-journal.jsonl
  - id: strahl-journal
    path: /data/strahl-journal.jsonl
  # ── Docker services on Heimdall ──────────────────────────────────────────────
  - id: docker-cf-orch-coordinator
    path: /data/docker-cf-orch-coordinator.jsonl
  - id: docker-cf-web
    path: /data/docker-cf-web.jsonl
  - id: docker-cf-directus
    path: /data/docker-cf-directus.jsonl
  - id: docker-caddy-proxy
    path: /data/docker-caddy-proxy.jsonl
  # ── Network syslog (router, switches, UniFi APs) ─────────────────────────────
  # Written by syslog-receiver.service (UDP 5140 → /devl/turnstone-cluster/data/network-syslog.txt).
  # Configure devices to send syslog to Heimdall:5140.
  # UniFi: Settings → System → Remote Logging → Syslog Host = <YOUR_HOST_IP>:5140
  # Ubiquiti EdgeRouter: set system syslog host <YOUR_HOST_IP> facility all level debug
  # Managed switches: varies by vendor — target <YOUR_HOST_IP> UDP 5140
  - id: network-syslog
    path: /data/network-syslog.txt
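
The source list above follows a simple convention: each entry has an `id` and a container-side `path` under `/data/`. A minimal sketch of a sanity check for that convention follows. `validate_sources` is a hypothetical helper, not part of `scripts/glean_corpus.py`, and it assumes the YAML has already been parsed into a list of dicts (for example with PyYAML's `yaml.safe_load`, reading the `sources` key).

```python
from pathlib import PurePosixPath


def validate_sources(sources: list[dict], data_root: str = "/data") -> list[str]:
    """Return human-readable problems; an empty list means nothing was flagged."""
    problems = []
    seen = set()
    root = PurePosixPath(data_root)
    for i, src in enumerate(sources):
        sid = src.get("id")
        path = src.get("path")
        if not sid or not path:
            problems.append(f"entry {i}: needs both 'id' and 'path'")
            continue
        if sid in seen:
            problems.append(f"{sid}: duplicate id")
        seen.add(sid)
        # Host-side paths (e.g. /devl/...) are not visible inside the container.
        if root not in PurePosixPath(path).parents:
            problems.append(f"{sid}: path {path} is outside {data_root}")
    return problems


# Two valid entries taken from the file, plus one deliberately wrong entry
# that reuses an id and points at the host-side directory.
sample = [
    {"id": "heimdall-journal", "path": "/data/heimdall-journal.jsonl"},
    {"id": "network-syslog", "path": "/data/network-syslog.txt"},
    {"id": "network-syslog", "path": "/devl/turnstone-cluster/data/network-syslog.txt"},
]
problems = validate_sources(sample)
```

The host-path check targets the easiest mistake to make with this file, since the syslog receiver comment names the host-side location while the `path` value must be the bind-mounted one.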