feat: push-based log corpus export — periodic ERROR/CRITICAL batch push to Avocet #6

Closed
opened 2026-05-09 15:31:23 -07:00 by pyr0ball · 0 comments
Owner

Push-based periodic export of ERROR/CRITICAL log entries from Turnstone nodes to Avocet for logreading fine-tune training.

Design spec: circuitforge-plans/turnstone/superpowers/specs/2026-05-11-log-corpus-pipeline-design.md

Turnstone changes

  • New script: scripts/export_corpus.py — watermark-based batch push (max 500 entries per run)
  • New env vars: AVOCET_CORPUS_ENDPOINT, AVOCET_CONSENT_TOKEN
  • Watermark files: /data/corpus_watermark.txt, /data/incident_watermark.txt
  • Update cron on all consented nodes to include export step after ingest
  • Preserve watermark files in update.sh across git pulls (alongside watch.yaml)
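
A minimal sketch of how the watermark-based batch push could work. This is not the actual contents of `scripts/export_corpus.py`. It assumes each log entry carries a monotonically increasing integer `id` (the real watermark may be a timestamp or offset), and the names `read_watermark`, `write_watermark`, `next_batch`, `export_once`, and the `push` callback are hypothetical. Only the 500-entry cap and the idea of a watermark file come from the list above.

```python
from pathlib import Path

MAX_BATCH = 500  # per-run cap stated in this issue


def read_watermark(path: Path) -> int:
    """Return the last exported entry id, or 0 if no usable watermark exists."""
    try:
        return int(path.read_text().strip())
    except (FileNotFoundError, ValueError):
        return 0


def write_watermark(path: Path, value: int) -> None:
    """Write via a temp file and rename so a crash cannot leave a partial file."""
    tmp = path.with_suffix(".tmp")
    tmp.write_text(str(value))
    tmp.replace(path)


def next_batch(entries, watermark: int, limit: int = MAX_BATCH):
    """Select up to `limit` entries above the watermark, oldest first."""
    pending = sorted(
        (e for e in entries if e["id"] > watermark),
        key=lambda e: e["id"],
    )
    return pending[:limit]


def export_once(entries, watermark_path: Path, push) -> int:
    """Push one batch; advance the watermark only when `push` reports success.

    `push` is a hypothetical callable that sends a batch to the receiver and
    returns True on success. Returns the number of entries exported.
    """
    batch = next_batch(entries, read_watermark(watermark_path))
    if batch and push(batch):
        write_watermark(watermark_path, batch[-1]["id"])
        return len(batch)
    return 0
```

Advancing the watermark only after a successful push means a failed run is retried from the same position on the next cron tick, at the cost of possible duplicate delivery if the receiver accepted the batch but the acknowledgement was lost.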

What is exported

  • Raw ERROR/CRITICAL entries — bulk corpus, unlabeled, will be labeled in Avocet
  • Labeled incident bundles — existing build_bundle() output, higher quality (human context attached)
  • INFO/DEBUG/WARN: excluded (too noisy for failure detection training)
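
The severity rule above can be sketched as a simple filter. This assumes each entry is a dict with a string `level` field, which is a guess at the entry shape and not something this issue specifies. The names `EXPORT_LEVELS`, `is_exportable`, and `select_for_corpus` are hypothetical.

```python
# Levels included in the raw corpus, per the list above.
EXPORT_LEVELS = frozenset({"ERROR", "CRITICAL"})


def is_exportable(entry: dict) -> bool:
    """True when the entry's level is ERROR or CRITICAL (case-insensitive)."""
    return str(entry.get("level", "")).upper() in EXPORT_LEVELS


def select_for_corpus(entries):
    """Drop INFO/DEBUG/WARN (and entries with no level) before export."""
    return [e for e in entries if is_exportable(e)]
```

Entries with a missing or unrecognized level are dropped here instead of exported, which errs on the side of sending less data off the node.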

Consented nodes (2026-05-11)

  • xanderland.tv — Xander (consent: Signal chat 2026-05-11)
  • orchard/heimdall, navi, sif — CF fleet (implicit consent)
  • Daniel's system — pending WireGuard + Huginn

See also: avocet#NEW (receiver + labeling UI)

pyr0ball added this to the (deleted) milestone 2026-05-09 15:31:23 -07:00
pyr0ball changed title from feat: labeled incident export — emit tagged incidents as Avocet-compatible training JSONL to feat: push-based log corpus export — periodic ERROR/CRITICAL batch push to Avocet 2026-05-11 16:25:43 -07:00
Reference: Circuit-Forge/turnstone#6