turnstone/app
pyr0ball 7ab92a5cf4 feat(corpus): synthetic log corpus generator for demos and testing
Adds scripts/gen_corpus.py that produces realistic-but-artificial log
files across all four supported formats (journald JSON, docker envelope,
qBittorrent hotio, AVCX plaintext). Output feeds directly into
glean_corpus.py for demo environments and parser regression tests with
no production data required.

- Seed-based RNG with independent per-source sub-streams (the same seed
  yields the same sequence for each file, even when the number of
  sources changes)
- Controllable time range, event density, and error injection rate
- Severity distribution mirrors real infrastructure (70% INFO, ~6% ERROR,
  ~2% CRITICAL) with adjustable boost via --error-rate
- 17 tests covering output structure, reproducibility, format correctness,
  parser round-trip, and CLI acceptance criteria
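
A minimal sketch of how independent per-source sub-streams could work. This
is not taken from scripts/gen_corpus.py: the helper names (source_rng,
sample_severities), the hash-based seed derivation, and the DEBUG/WARNING
split of the remaining 22% are assumptions made for illustration. Only the
70% INFO, ~6% ERROR, and ~2% CRITICAL figures come from the commit message.

```python
import hashlib
import random

# Hypothetical severity table. INFO/ERROR/CRITICAL weights follow the commit
# message; the DEBUG and WARNING shares are assumed to fill the remainder.
SEVERITIES = ["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"]
WEIGHTS = [0.12, 0.70, 0.10, 0.06, 0.02]


def source_rng(seed: int, source_name: str) -> random.Random:
    """Derive a dedicated RNG for one source from the master seed.

    Hashing the (seed, source name) pair means each source's stream depends
    only on its own name, not on how many other sources exist or on the
    order in which they are generated.
    """
    digest = hashlib.sha256(f"{seed}:{source_name}".encode("utf-8")).digest()
    return random.Random(int.from_bytes(digest[:8], "big"))


def sample_severities(seed: int, source_name: str, count: int) -> list[str]:
    """Draw `count` severities for one source using its own sub-stream."""
    rng = source_rng(seed, source_name)
    return rng.choices(SEVERITIES, weights=WEIGHTS, k=count)


# Adding a third source leaves the first two sources' sequences untouched.
run_a = {name: sample_severities(42, name, 5) for name in ["journald", "docker"]}
run_b = {
    name: sample_severities(42, name, 5)
    for name in ["journald", "docker", "qbittorrent"]
}
print(run_a["journald"] == run_b["journald"])
print(run_a["docker"] == run_b["docker"])
```

A single shared RNG would not have this property: inserting a source would
shift every draw that follows it, so previously generated files would change.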

Also fixes a latent bug in app/glean/plaintext.py: ISO 8601 timestamps
were silently failing to parse because the T separator was normalised to
space in the input string but the strptime format string still contained T.
Fix: apply the same normalisation to the format before calling strptime.
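
The mismatch described above can be reproduced in a few lines. This is a
sketch under assumptions, not the code in app/glean/plaintext.py: the
format string and the function name parse_iso_timestamp are hypothetical.

```python
from datetime import datetime

# Assumed format string; the real one in plaintext.py may differ.
ISO_FORMAT = "%Y-%m-%dT%H:%M:%S"


def parse_iso_timestamp(raw: str, fmt: str = ISO_FORMAT) -> datetime:
    """Parse a timestamp after normalising the T separator to a space.

    The same replacement is applied to the format string, so the input and
    the format always agree on the separator.
    """
    normalised_input = raw.replace("T", " ", 1)
    normalised_format = fmt.replace("T", " ", 1)
    return datetime.strptime(normalised_input, normalised_format)


# The buggy combination: normalised input, un-normalised format.
try:
    datetime.strptime("2026-06-11 10:57:20", ISO_FORMAT)
    buggy_parse_failed = False
except ValueError:
    buggy_parse_failed = True

parsed = parse_iso_timestamp("2026-06-11T10:57:20")
```

If a caller catches the ValueError and falls through to another path, the
failure stays silent, which matches the symptom described in the commit.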

Closes: #46
2026-06-11 10:57:20 -07:00
api            feat: initial Turnstone POC — ingest, FTS search, MCP server  2026-05-08 12:12:34 -07:00
context        feat: dual-backend SQLite/Postgres + multi-tenant source namespacing  2026-06-08 08:37:54 -07:00
db             feat(alerts): security alerts tab — full scorer integration  2026-06-10 14:32:43 -07:00
glean          feat(corpus): synthetic log corpus generator for demos and testing  2026-06-11 10:57:20 -07:00
services       fix(cybersec): clean up debug traceback logging  2026-06-10 13:20:56 -07:00
tasks          feat: cybersec zero-shot scoring pipeline (#9)  2026-06-10 01:03:25 -07:00
watch          fix(watcher): remove per-flush FTS sync to eliminate SQLite write lock contention  2026-06-10 12:42:24 -07:00
__init__.py    feat: initial Turnstone POC — ingest, FTS search, MCP server  2026-05-08 12:12:34 -07:00
mcp_server.py  feat: dual-backend SQLite/Postgres + multi-tenant source namespacing  2026-06-08 08:37:54 -07:00
rest.py        feat(incidents): incident timeline visualizer + fix entry lookup using wrong DB path  2026-06-10 16:02:24 -07:00