avocet

Author	SHA1	Message	Date
pyr0ball	30f19711ec	feat(avocet): add cancel endpoints for benchmark and finetune jobs Adds POST /api/benchmark/cancel and POST /api/finetune/cancel endpoints that terminate the running subprocess (kill on 3s timeout), and updates the run generators to emit a cancelled SSE event instead of error when the job was intentionally stopped.	2026-03-15 18:09:20 -07:00
pyr0ball	60fe1231ce	fix(avocet): _MODELS_DIR overridable in tests; sanitize score paths against path traversal	2026-03-15 16:07:27 -07:00
pyr0ball	ef8adfb035	feat(avocet): add /api/finetune/status and /api/finetune/run endpoints	2026-03-15 16:04:34 -07:00
pyr0ball	64fd19a7b6	fix(avocet): move TorchDataset import to top; split sample_count into total+train	2026-03-15 16:02:43 -07:00
pyr0ball	8ba34bb2d1	feat(avocet): run_finetune, CLI, multi-score-file merge with last-write-wins dedup - load_and_prepare_data() now accepts Path | list[Path]; single-Path callers unchanged - Dedup by MD5(subject + body[:100]); last file/row wins (lets later runs correct labels) - Prints summary line when duplicates are dropped - Added _EmailDataset (TorchDataset wrapper), run_finetune(), and argparse CLI - run_finetune() saves model + tokenizer + training_info.json with score_files provenance - Stratified split guard: val set size clamped to at least n_classes (handles tiny example data) - 3 new unit tests (merge, last-write-wins dedup, single-Path compat) + 1 integration test - All 16 tests pass (15 unit + 1 integration)	2026-03-15 15:52:41 -07:00
pyr0ball	f262b23cf5	fix(avocet): tighten body truncation test to exact 400-char assertion	2026-03-15 15:44:19 -07:00
pyr0ball	5eb593569d	feat(avocet): add finetune data pipeline, class weights, WeightedTrainer Implements load_and_prepare_data (JSONL ingestion with class filtering), compute_class_weights (inverse-frequency, div-by-zero safe), compute_metrics_for_trainer (macro F1 + accuracy), and WeightedTrainer.compute_loss (**kwargs-safe for Transformers 4.38+ num_items_in_batch). All 12 tests pass.	2026-03-15 15:38:45 -07:00
pyr0ball	2d795b9573	fix(avocet): guard discover_finetuned_models against malformed/incomplete training_info.json	2026-03-15 15:18:13 -07:00
pyr0ball	36117b35c4	feat(avocet): auto-discover fine-tuned models in benchmark harness	2026-03-15 11:59:13 -07:00
pyr0ball	da8478082e	fix(avocet): FineTunedAdapter GPU device routing + precise body truncation test	2026-03-15 10:56:47 -07:00
pyr0ball	7a4ca422ca	feat(avocet): add FineTunedAdapter for local checkpoint inference	2026-03-15 10:54:38 -07:00
pyr0ball	f38c73db97	feat: add GET /api/fetch/stream SSE endpoint for real-time IMAP progress	2026-03-04 12:05:23 -08:00
pyr0ball	965362f5e3	feat: add POST /api/accounts/test endpoint	2026-03-04 12:04:42 -08:00
pyr0ball	f64be8bbe0	feat: add GET /api/stats and GET /api/stats/download endpoints	2026-03-04 12:04:11 -08:00
pyr0ball	c5a74d3821	feat: add GET/POST /api/config endpoints for IMAP account management	2026-03-04 12:03:40 -08:00
pyr0ball	1d1f25641b	feat: extract IMAP logic to app/imap_fetch.py for reuse by API	2026-03-04 11:42:22 -08:00
pyr0ball	82eeb4defc	fix: prevent blank page on rebuild and queue drain on skip/discard Two bugs fixed: 1. Blank white page after vue SPA rebuild: browsers cached old index.html referencing old asset hashes. Assets are deleted on rebuild, causing 404s for JS/CSS -> blank page. Fix: serve index.html with Cache-Control: no-cache so browsers always fetch fresh HTML. Hashed assets (/assets/chunk-abc123.js) remain cacheable forever. 2. Queue draining to empty on skip/discard: handleSkip and handleDiscard never refilled the local queue buffer. After enough skips, store.current went null and the empty state showed (blank-looking). Fix: both handlers now call fetchBatch() when queue drops below 3, matching handleLabel. Also: sync classifier_adapters LABELS to match current 10-label schema (new_lead + hired, remove unrelated). 48 Python tests pass, 48 frontend tests pass.	2026-03-03 19:26:34 -08:00
pyr0ball	682a958c28	fix(avocet): strip HTML from email bodies — stdlib HTMLParser, no deps	2026-03-03 16:28:18 -08:00
pyr0ball	4a76f6ba41	fix(avocet): undo — commit-then-clear order, empty-records guard, skip dedup, stronger test	2026-03-03 15:41:58 -08:00
pyr0ball	80a8195899	feat(avocet): discard, undo, labels config, static serving — backend complete	2026-03-03 15:35:01 -08:00
pyr0ball	f0e9886ab2	feat(avocet): POST /api/skip endpoint	2026-03-03 15:21:32 -08:00
pyr0ball	ff27053aa9	feat(avocet): POST /api/label endpoint	2026-03-03 15:14:04 -08:00
pyr0ball	c1a6dd4fc5	fix(avocet): queue_with_items fixture uses api._DATA_DIR to avoid implicit tmp_path coupling	2026-03-03 15:03:57 -08:00
pyr0ball	9abae0478c	feat(avocet): GET /api/queue endpoint	2026-03-03 15:00:59 -08:00
pyr0ball	c8aea3c39f	fix(avocet): _write_jsonl empty-list writes empty file; add reset_last_action helper	2026-03-03 14:36:18 -08:00
pyr0ball	ffd1450f62	feat(avocet): FastAPI skeleton + JSONL helpers	2026-03-03 13:30:28 -08:00
pyr0ball	fd476e4199	feat: 9 labels (add event_rescheduled/unrelated/digest), wildcard Other label, InvalidCharacterError fix	2026-02-27 14:34:15 -08:00
pyr0ball	0e238a9e37	feat: initial avocet repo — email classifier training tool Scrape → Store → Process pipeline for building email classifier benchmark data across the CircuitForge menagerie. - app/label_tool.py — Streamlit card-stack UI, multi-account IMAP fetch, 6-bucket labeling, undo/skip, keyboard shortcuts (1-6/S/U) - scripts/classifier_adapters.py — ZeroShotAdapter (+ two_pass), GLiClassAdapter, RerankerAdapter; ABC with lazy model loading - scripts/benchmark_classifier.py — 13-model registry, --score, --compare, --list-models, --export-db; uses label_tool.yaml for IMAP - tests/ — 20 tests, all passing, zero model downloads required - config/label_tool.yaml.example — multi-account IMAP template - data/email_score.jsonl.example — sample labeled data for CI Labels: interview_scheduled, offer_received, rejected, positive_response, survey_received, neutral	2026-02-27 14:07:38 -08:00
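
Commit 8ba34bb2d1 above describes merging several score files with "last-write-wins" dedup keyed on MD5(subject + body[:100]). A minimal sketch of that idea follows. It is not the repository's code: the function names `dedup_key` and `merge_last_write_wins`, and the `subject`/`body`/`label` row fields, are assumptions made for illustration.

```python
import hashlib


def dedup_key(row: dict) -> str:
    """MD5 over the subject plus the first 100 characters of the body."""
    # Field names "subject" and "body" are assumed, not taken from the repo.
    text = row.get("subject", "") + row.get("body", "")[:100]
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def merge_last_write_wins(batches: list[list[dict]]) -> tuple[list[dict], int]:
    """Merge row batches in order; a later row replaces an earlier duplicate.

    Returns the merged rows and the number of duplicates dropped.
    """
    merged: dict[str, dict] = {}
    total = 0
    for rows in batches:
        for row in rows:
            total += 1
            # Overwriting the key keeps the newest row for that email.
            merged[dedup_key(row)] = row
    return list(merged.values()), total - len(merged)


if __name__ == "__main__":
    first = [{"subject": "Hi", "body": "x", "label": "neutral"}]
    second = [{"subject": "Hi", "body": "x", "label": "rejected"}]
    rows, dropped = merge_last_write_wins([first, second])
    print(f"kept {len(rows)} rows, dropped {dropped} duplicate(s)")
```

Because later batches overwrite earlier ones, re-labeling an email in a newer score file corrects the older label, which matches the intent stated in the commit message.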

28 commits
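
Commit 30f19711ec describes cancel endpoints that terminate the running subprocess and fall back to a kill after a 3-second timeout. The sketch below shows that terminate-then-kill pattern with the standard library only. The helper name `cancel_process` and its return strings are hypothetical, and the endpoint wiring and SSE `cancelled` event are left out.

```python
import subprocess
import sys


def cancel_process(proc: subprocess.Popen, timeout: float = 3.0) -> str:
    """Ask a child process to stop; force-kill it if it ignores the request."""
    if proc.poll() is not None:
        return "already_exited"
    proc.terminate()  # polite request (SIGTERM on POSIX)
    try:
        proc.wait(timeout=timeout)
        return "terminated"
    except subprocess.TimeoutExpired:
        proc.kill()  # forceful stop (SIGKILL on POSIX)
        proc.wait()  # reap the process so it does not linger as a zombie
        return "killed"


if __name__ == "__main__":
    # Stand-in for a long-running benchmark or finetune job.
    job = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    print(cancel_process(job))
```

A caller that records the cancellation before invoking such a helper can then report the non-zero exit as an intentional stop rather than an error, which is the distinction the commit message draws.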