Pipeline scripts: write structured logs to shared dir for Turnstone training #141
Reference: Circuit-Forge/kiwi#141
Summary
Pipeline scrape scripts (discover_wayback.py, scrape_recipes.py, and future scrapers) should emit structured log lines to a shared directory so that Avocet can ingest them as Turnstone log-reading training data.
Shared log directory
One JSONL file per run, named by script + timestamp:
Log line schema
Each line should be a JSON object matching Turnstone's expected format:
Implementation
Add a _setup_pipeline_log(script_name) helper in scripts/pipeline/utils.py (create if needed):
- Attaches a logging.FileHandler pointing at /Library/Assets/logs/pipeline/<script_name>_<ts>.jsonl
- Called from main() in each pipeline script alongside the existing logging.basicConfig
Notes
- logging.basicConfig to stderr stays unchanged (human-readable dev output)
- /Library/Assets/kiwi/pipeline/