avocet

pyr0ball 9bb88b168f feat(corpus): pipeline log ingest from shared dir (closes #67)	2026-05-17 11:28:33 -07:00

Pull-side companion to kiwi#141. Ingests structured JSONL pipeline logs from /Library/Assets/logs/pipeline/ into the log corpus for Turnstone logreading model training.

- app/data/log_corpus.py: add ingested_pipeline_files tracking table, _pipeline_ingest_dir() config helper, _ingest_one_file() parser, and POST /api/corpus/pipeline-ingest endpoint
- source_host = "pipeline_scrape"; source_id from logger field; extra dict stored as matched_patterns; batch_type = "pipeline_log"
- Idempotent by filename: skips files already in ingested_pipeline_files
- config/label_tool.yaml.example: add corpus section with pipeline_ingest_dir and push sources comment block
- tests/test_log_corpus.py: 8 new tests covering ingest, idempotency, non-JSONL filtering, malformed line resilience, incremental runs
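
The filename-based idempotency described in this commit can be illustrated with a minimal sketch. This is not the code in app/data/log_corpus.py: only the ingested_pipeline_files table name and the source_host, source_id, matched_patterns, and batch_type values come from the commit message. The ingest_dir function, the log_entries table, all column layouts, and the "message" field are assumptions made for illustration.

```python
import json
import sqlite3
from pathlib import Path


def ingest_dir(conn: sqlite3.Connection, log_dir: Path) -> dict:
    """Ingest every not-yet-seen *.jsonl file in log_dir; return counters.

    Hypothetical helper: the real parser and schema may differ.
    """
    # Tracking table named in the commit; its columns are assumed here.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ingested_pipeline_files "
        "(filename TEXT PRIMARY KEY)"
    )
    # Hypothetical corpus table, used only for this sketch.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS log_entries ("
        "source_host TEXT, source_id TEXT, batch_type TEXT, "
        "message TEXT, matched_patterns TEXT)"
    )
    stats = {"files": 0, "entries": 0, "skipped_lines": 0}

    # Only *.jsonl files are considered, so other files are filtered out.
    for path in sorted(log_dir.glob("*.jsonl")):
        seen = conn.execute(
            "SELECT 1 FROM ingested_pipeline_files WHERE filename = ?",
            (path.name,),
        ).fetchone()
        if seen:
            continue  # idempotent by filename: already ingested

        for line in path.read_text(encoding="utf-8").splitlines():
            if not line.strip():
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                stats["skipped_lines"] += 1  # tolerate malformed lines
                continue
            if not isinstance(record, dict):
                stats["skipped_lines"] += 1
                continue
            conn.execute(
                "INSERT INTO log_entries VALUES (?, ?, ?, ?, ?)",
                (
                    "pipeline_scrape",                # source_host
                    str(record.get("logger", "")),    # source_id
                    "pipeline_log",                   # batch_type
                    str(record.get("message", "")),   # assumed field name
                    json.dumps(record.get("extra", {}), ensure_ascii=False),
                ),
            )
            stats["entries"] += 1

        # Record the file so later runs skip it.
        conn.execute(
            "INSERT INTO ingested_pipeline_files VALUES (?)", (path.name,)
        )
        stats["files"] += 1

    conn.commit()
    return stats
```

Calling ingest_dir twice on the same directory ingests each file once; a file added between runs is picked up by the next call, which matches the incremental behavior the tests in the commit are said to cover.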
..
data	feat(corpus): pipeline log ingest from shared dir (closes #67)	2026-05-17 11:28:33 -07:00
eval	feat: add embed-bench rate and export endpoints	2026-05-11 08:07:17 -07:00
train	fix: align train job/results API envelope, config_json key, progress SSE, dashboard model_key	2026-05-02 21:22:18 -07:00
api.py	feat: log corpus receiver — accept Turnstone push batches and label for logreading fine-tune	2026-05-11 17:07:54 -07:00
cforch.py	fix(tests): resolve 5 pre-existing test failures on main (closes #56)	2026-05-17 11:21:58 -07:00
cloud_session.py	refactor: import detect_byok from cf-core, remove local copy	2026-04-25 16:45:47 -07:00
dashboard.py	feat: multi-bench dashboard, API path migration, benchmark reliability fixes	2026-05-11 09:05:12 -07:00
imap_fetch.py	feat: extract fetch routes and IMAP helpers into app/data/fetch.py	2026-05-01 21:57:31 -07:00
imitate.py	feat: move imitate API into app/data/imitate.py	2026-05-01 22:12:19 -07:00
models.py	fix(tests): resolve 5 pre-existing test failures on main (closes #56)	2026-05-17 11:21:58 -07:00
nodes.py	feat(fleet): profile editor, assignments tab, node management polish	2026-05-17 11:23:47 -07:00
plans_bench.py	chore(models): refresh model registries with current cluster catalog	2026-05-17 11:24:03 -07:00
sft.py	feat: move SFT corrections API into app/data/corrections.py	2026-05-01 22:02:22 -07:00
style.py	refactor(bench): extract benchmark tabs — classifier, compare, llm-eval, style, voice	2026-04-24 14:56:17 -07:00
utils.py	fix: restore ensure_ascii=False in utils jsonl helpers; remove dead _last_action from api.py	2026-05-01 20:59:44 -07:00
voice.py	refactor(bench): extract benchmark tabs — classifier, compare, llm-eval, style, voice	2026-04-24 14:56:17 -07:00