feat: Orchard branch auto-enrollment and submission anonymization #27

Closed
opened 2026-05-20 08:26:57 -07:00 by pyr0ball · 0 comments

Summary

As The Orchard grows beyond Xander and Daniel, manual provisioning (new container + Caddy route per submitter) does not scale. We need an enrollment flow that provisions new branches automatically and anonymizes submitted entries before they reach the Avocet training store.

Proposed design

Terminology

  • Each external Turnstone node that submits to harvest.circuitforge.tech is a branch
  • The act of adding a new branch is grafting
  • Each branch has its own segregated receiving store, which feeds Avocet for training

Enrollment flow (grafting)

  1. Branch operator calls POST /api/orchard/graft with {slug, contact_email, agreed_to_terms: true}
  2. Heimdall auto-provisions the branch:
    • Creates a data dir at /devl/docker/turnstone-submissions/<slug>/
    • Copies default patterns
    • Starts a new turnstone-submissions-<slug> container on the next available port (8536+)
    • Adds a handle_path /<slug>/* block to the harvest.circuitforge.tech site in the Caddyfile and reloads Caddy
  3. Returns {submit_endpoint: "https://harvest.circuitforge.tech/<slug>", api_key: "<token>"}
  4. Branch operator sets TURNSTONE_SUBMIT_ENDPOINT and TURNSTONE_SUBMIT_KEY in their .env
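
A minimal sketch of the request and response halves of the flow above, in Python. It covers only steps 1 and 3; the provisioning in step 2 is not shown. The function names, the slug rule (lowercase letters, digits, and hyphens, at most 32 characters), the email check, and the use of `secrets.token_urlsafe` for the key are all assumptions for illustration, not something this issue specifies. Validating the slug strictly seems worthwhile because it is interpolated into a filesystem path, a container name, and a Caddy route.

```python
import re
import secrets

# Hypothetical slug rule: 1-32 chars, lowercase alphanumerics and hyphens,
# no leading or trailing hyphen. The issue does not define the allowed format.
SLUG_RE = re.compile(r"^[a-z0-9](?:[a-z0-9-]{0,30}[a-z0-9])?$")
BASE_URL = "https://harvest.circuitforge.tech"


def validate_graft_request(payload: dict) -> str:
    """Return the slug if the graft payload is acceptable, else raise ValueError."""
    slug = payload.get("slug", "")
    if not isinstance(slug, str) or not SLUG_RE.fullmatch(slug):
        raise ValueError("slug must be lowercase letters, digits, and hyphens")
    # Deliberately loose email check; real validation is out of scope here.
    if "@" not in str(payload.get("contact_email", "")):
        raise ValueError("contact_email is required")
    if payload.get("agreed_to_terms") is not True:
        raise ValueError("agreed_to_terms must be true")
    return slug


def build_graft_response(slug: str) -> dict:
    """Build the step-3 response with a freshly generated API key."""
    return {
        "submit_endpoint": f"{BASE_URL}/{slug}",
        "api_key": secrets.token_urlsafe(32),
    }
```

With this shape, a payload such as `{"slug": "../etc", ...}` is rejected before any directory or route is created.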

Anonymization

Anonymization runs as a post-processing step (a separate worker, not in the ingest path) over each branch DB before Avocet reads it:

  • IPs → stable pseudonyms (HMAC-based with a secret per-branch salt, not reversible)
  • Hostnames → host-<short-hash>
  • Usernames in log text → user-<short-hash>
  • Preserve timestamps, severity, matched_patterns (training signal stays intact)
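
The mappings above could be sketched as follows, assuming HMAC-SHA256 keyed with the per-branch secret and truncated to 8 hex characters. The helper names, the `ip-` prefix, the hash length, and the IPv4-only regex are assumptions; the issue only fixes the `host-<short-hash>` and `user-<short-hash>` shapes.

```python
import hashlib
import hmac
import re

# Assumption: only dotted-quad IPv4 is handled in this sketch.
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def short_hash(value: str, branch_key: bytes, length: int = 8) -> str:
    """Keyed, truncated digest: stable within a branch, different across branches."""
    digest = hmac.new(branch_key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:length]


def pseudonymize_ips(text: str, branch_key: bytes) -> str:
    """Replace every IPv4 address in the text with a stable pseudonym."""
    return IPV4_RE.sub(lambda m: f"ip-{short_hash(m.group(0), branch_key)}", text)


def pseudonymize_host(hostname: str, branch_key: bytes) -> str:
    """Map a hostname to host-<short-hash>, case-insensitively."""
    return f"host-{short_hash(hostname.lower(), branch_key)}"


def pseudonymize_user(username: str, branch_key: bytes) -> str:
    """Map a username to user-<short-hash>."""
    return f"user-{short_hash(username, branch_key)}"
```

Because the same input always maps to the same pseudonym within a branch, repeated activity from one address still correlates in the training data, while a different branch key yields unrelated pseudonyms. The timestamp, severity, and matched_patterns fields are left untouched.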

API key auth

POST /api/ingest/batch should require Authorization: Bearer <api_key> when TURNSTONE_BRANCH_KEY is set on the receiving instance.
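
One way the check could look, as a framework-agnostic sketch. The function name and the plain-mapping header lookup are assumptions; a real handler would use whatever request object the receiver's web framework provides, and header-name case handling depends on that framework.

```python
import hmac
from typing import Mapping, Optional


def is_authorized(headers: Mapping[str, str], expected_key: Optional[str]) -> bool:
    """Check the Authorization header against the configured branch key.

    When no key is configured, auth is not enforced, matching the
    "when TURNSTONE_BRANCH_KEY is set" condition described above.
    """
    if not expected_key:
        return True
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    if scheme.lower() != "bearer" or not token:
        return False
    # Constant-time comparison avoids leaking key contents through timing.
    return hmac.compare_digest(token.strip().encode(), expected_key.encode())
```

A caller would pass `os.environ.get("TURNSTONE_BRANCH_KEY")` as `expected_key` and reject the batch (for example with a 401) when the function returns `False`.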

Acceptance criteria

  • POST /api/orchard/graft endpoint on harvest receiver (or management endpoint)
  • Container auto-provisioning (Docker SDK or template-based compose + subprocess)
  • Caddy route auto-injection and reload
  • Anonymization worker + anonymized flag per branch DB entry
  • API key auth on batch ingest endpoint
  • Deactivation endpoint (DELETE /api/orchard/branches/<slug>)

Notes

  • Start simple: graft endpoint is admin-only (bearer token from .env), not self-service
  • Port allocation: scan 8536+ for next free port
  • Caddyfile injection: write to a harvest-branches.caddy include file rather than editing main Caddyfile
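
The last two notes could be sketched as below. The function names, the upper bound of the scan, and the `reverse_proxy 127.0.0.1:<port>` upstream are assumptions; the issue only names the 8536+ starting port, the `handle_path /<slug>/*` block, and the `harvest-branches.caddy` include file.

```python
import socket


def next_free_port(start: int = 8536, end: int = 8636, host: str = "127.0.0.1") -> int:
    """Return the first port in [start, end) that can currently be bound."""
    for port in range(start, end):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except OSError:
                continue  # port is in use, try the next one
            return port
    raise RuntimeError(f"no free port in {start}-{end - 1}")


def render_caddy_block(slug: str, port: int) -> str:
    """Render one branch route for the harvest-branches.caddy include file."""
    return (
        f"handle_path /{slug}/* {{\n"
        f"\treverse_proxy 127.0.0.1:{port}\n"
        f"}}\n"
    )
```

A bind probe only sees ports in use at that moment, so it can race with another graft and will not notice ports assigned to stopped branch containers; keeping a record of allocated ports alongside the scan would likely be more reliable. The rendered blocks would be appended to the include file, which the harvest.circuitforge.tech site block pulls in with Caddy's `import` directive before a reload.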
pyr0ball added this to the beta milestone 2026-06-01 15:09:59 -07:00
Reference: Circuit-Forge/turnstone#27