Commit graph

32 commits

SHA1 Message Date
03dac57fd9 feat: sft_import.py — run discovery and JSONL deduplication 2026-04-08 07:13:37 -07:00
25880e377d refactor: consolidate HTML extraction into app/utils.py
Rename _strip_html/_extract_body to strip_html/extract_body (public API).
Remove duplicate _TextExtractor, strip_html, and _extract_body from
imap_fetch.py; import from app.utils instead. Update test_label_tool.py
to use the new public names.
2026-04-08 06:52:15 -07:00
ae0ac19505 chore: retire Streamlit app, scaffold sft branch
- Delete app/label_tool.py (Streamlit UI retired; Vue SPA is sole UI)
- Extract _strip_html and _extract_body into app/utils.py (stdlib-only, reusable)
- Update tests/test_label_tool.py import to app.utils
- Rename start-api/stop-api/restart-api/open-api → start/stop/restart/open in manage.sh
- Remove STREAMLIT variable and all Streamlit-specific case blocks from manage.sh
- Update manage.sh usage section to reflect Vue+FastAPI-only commands
- Add data/sft_candidates.jsonl and data/sft_approved.jsonl to .gitignore
- Add sft.bench_results_dir key to config/label_tool.yaml.example
2026-04-08 06:18:12 -07:00
e38a28dcc3 fix(avocet): narrow cancel except clause, clear stale cancel flags on new run
- except clause in cancel_benchmark/cancel_finetune narrowed from Exception
  to _subprocess.TimeoutExpired (C1)
- _cancelled_jobs.discard() called after registering new proc to prevent
  a stale flag from a prior run masking errors (I2)
- local `import subprocess` removed from run_benchmark and
  run_finetune_endpoint; all Popen calls updated to _subprocess.Popen (I1)
- test patch targets updated from subprocess.Popen to app.api._subprocess.Popen;
  cancelled-event tests updated to set flag in proc.wait() side-effect so
  the discard-on-new-run logic is exercised correctly
2026-03-15 18:13:01 -07:00
0ab49609c0 feat(avocet): add cancel endpoints for benchmark and finetune jobs
Adds POST /api/benchmark/cancel and POST /api/finetune/cancel endpoints
that terminate the running subprocess (kill on 3s timeout), and updates
the run generators to emit a cancelled SSE event instead of error when
the job was intentionally stopped.
2026-03-15 18:09:20 -07:00
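The terminate-then-kill behaviour described in this commit, with the narrowed `TimeoutExpired` handler from e38a28dcc3, can be sketched roughly as follows. This is a minimal illustration, not the code in app/api.py: `stop_process` and its return strings are hypothetical names, and only the 3-second timeout comes from the commit message.

```python
import subprocess
import sys


def stop_process(proc: subprocess.Popen, timeout: float = 3.0) -> str:
    """Ask proc to exit; escalate to kill if it is still alive after timeout.

    Hypothetical helper: the name and return values are assumptions.
    """
    proc.terminate()  # polite request (SIGTERM on POSIX)
    try:
        proc.wait(timeout=timeout)
        return "terminated"
    except subprocess.TimeoutExpired:
        # Catch only the timeout, so unrelated errors still surface.
        proc.kill()
        proc.wait()  # reap the process so it does not linger as a zombie
        return "killed"


if __name__ == "__main__":
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    print(stop_process(child))
```

Catching only `subprocess.TimeoutExpired` keeps the escalation path explicit; a broad `except Exception` would also swallow programming errors raised while waiting.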
dd352f07cd fix(avocet): _MODELS_DIR overridable in tests; sanitize score paths against path traversal 2026-03-15 16:07:27 -07:00
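One common way to guard a user-supplied path against traversal is to resolve it and check that it still lies under the allowed directory. The sketch below assumes that approach; the commit does not say how the sanitizing is actually done, and `resolve_inside` is a hypothetical helper name.

```python
from pathlib import Path


def resolve_inside(base_dir: Path, user_path: str) -> Path:
    """Return user_path resolved under base_dir, or raise ValueError.

    Hypothetical helper; requires Python 3.9+ for Path.is_relative_to.
    """
    base = base_dir.resolve()
    # Joining an absolute user_path replaces base entirely, and ".." segments
    # are collapsed by resolve(), so both cases are caught by the check below.
    candidate = (base / user_path).resolve()
    if not candidate.is_relative_to(base):
        raise ValueError(f"path escapes {base}: {user_path!r}")
    return candidate
```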
903624a4b8 feat(avocet): add /api/finetune/status and /api/finetune/run endpoints 2026-03-15 16:04:34 -07:00
48e02f2ed6 fix(avocet): move TorchDataset import to top; split sample_count into total+train 2026-03-15 16:02:43 -07:00
939ce06f45 feat(avocet): run_finetune, CLI, multi-score-file merge with last-write-wins dedup
- load_and_prepare_data() now accepts Path | list[Path]; single-Path callers unchanged
- Dedup by MD5(subject + body[:100]); last file/row wins (lets later runs correct labels)
- Prints summary line when duplicates are dropped
- Added _EmailDataset (TorchDataset wrapper), run_finetune(), and argparse CLI
- run_finetune() saves model + tokenizer + training_info.json with score_files provenance
- Stratified split guard: val set size clamped to at least n_classes (handles tiny example data)
- 3 new unit tests (merge, last-write-wins dedup, single-Path compat) + 1 integration test
- All 16 tests pass (15 unit + 1 integration)
2026-03-15 15:52:41 -07:00
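The last-write-wins dedup described in this commit can be illustrated with a short sketch. The key follows the commit message, MD5(subject + body[:100]); the field names `subject`, `body`, and `label` and both function names are assumptions for illustration, not the signatures used in the repository.

```python
import hashlib


def dedup_key(row: dict) -> str:
    """MD5 of the subject plus the first 100 characters of the body."""
    subject = row.get("subject", "")
    body = row.get("body", "")
    return hashlib.md5((subject + body[:100]).encode("utf-8")).hexdigest()


def merge_last_write_wins(batches: list[list[dict]]) -> list[dict]:
    """Merge row batches in order; a later duplicate replaces an earlier one."""
    merged: dict[str, dict] = {}
    for rows in batches:
        for row in rows:
            # Reassigning an existing key overwrites the value but keeps the
            # key's original position in the dict.
            merged[dedup_key(row)] = row
    return list(merged.values())


if __name__ == "__main__":
    first = [{"subject": "Interview", "body": "Tuesday at 10", "label": "neutral"}]
    second = [{"subject": "Interview", "body": "Tuesday at 10", "label": "interview_scheduled"}]
    print(merge_last_write_wins([first, second]))
```

Because later files win, a corrected label in a newer score file overrides the earlier one without any manual cleanup of the older file.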
4e70e79b26 fix(avocet): tighten body truncation test to exact 400-char assertion 2026-03-15 15:44:19 -07:00
de5794611b feat(avocet): add finetune data pipeline, class weights, WeightedTrainer
Implements load_and_prepare_data (JSONL ingestion with class filtering),
compute_class_weights (inverse-frequency, div-by-zero safe), compute_metrics_for_trainer
(macro F1 + accuracy), and WeightedTrainer.compute_loss (**kwargs-safe for
Transformers 4.38+ num_items_in_batch). All 12 tests pass.
2026-03-15 15:38:45 -07:00
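As a rough illustration of inverse-frequency class weights with a division-by-zero guard, the sketch below uses the familiar "balanced" formula, n_samples / (n_classes * count). Both the formula and the choice to give an absent class a weight of 0.0 are assumptions; the commit only states that the weights are inverse-frequency and div-by-zero safe. `inverse_frequency_weights` is a hypothetical name.

```python
from collections import Counter


def inverse_frequency_weights(labels: list[str], classes: list[str]) -> dict[str, float]:
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    total = len(labels)
    n_classes = len(classes)
    weights: dict[str, float] = {}
    for name in classes:
        count = counts.get(name, 0)
        # Assumption: a class with no samples gets 0.0 instead of dividing by zero.
        weights[name] = total / (n_classes * count) if count else 0.0
    return weights
```

Rare classes receive larger weights, so a weighted loss penalises mistakes on them more heavily than mistakes on the dominant class.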
d1a36bfd63 fix(avocet): guard discover_finetuned_models against malformed/incomplete training_info.json 2026-03-15 15:18:13 -07:00
df37a8e16d feat(avocet): auto-discover fine-tuned models in benchmark harness 2026-03-15 11:59:13 -07:00
179cb67e1c fix(avocet): FineTunedAdapter GPU device routing + precise body truncation test 2026-03-15 10:56:47 -07:00
dc321de59f feat(avocet): add FineTunedAdapter for local checkpoint inference 2026-03-15 10:54:38 -07:00
07407117a5 feat: add GET /api/fetch/stream SSE endpoint for real-time IMAP progress 2026-03-04 12:05:23 -08:00
e5e66b09cc feat: add POST /api/accounts/test endpoint 2026-03-04 12:04:42 -08:00
47a2178ee4 feat: add GET /api/stats and GET /api/stats/download endpoints 2026-03-04 12:04:11 -08:00
3f0cd7e837 feat: add GET/POST /api/config endpoints for IMAP account management 2026-03-04 12:03:40 -08:00
8a0545a6e7 feat: extract IMAP logic to app/imap_fetch.py for reuse by API 2026-03-04 11:42:22 -08:00
3788254abd fix: prevent blank page on rebuild and queue drain on skip/discard
Two bugs fixed:

1. Blank white page after Vue SPA rebuild: browsers cached old index.html
   referencing old asset hashes. Assets are deleted on rebuild, causing
   404s for JS/CSS -> blank page. Fix: serve index.html with
   Cache-Control: no-cache so browsers always fetch fresh HTML.
   Hashed assets (/assets/chunk-abc123.js) remain cacheable forever.

2. Queue draining to empty on skip/discard: handleSkip and handleDiscard
   never refilled the local queue buffer. After enough skips, store.current
   went null and the empty state showed (blank-looking). Fix: both handlers
   now call fetchBatch() when queue drops below 3, matching handleLabel.

Also: sync classifier_adapters LABELS to match current 10-label schema
(add new_lead and hired, remove unrelated).

48 Python tests pass, 48 frontend tests pass.
2026-03-03 19:26:34 -08:00
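The caching rule from the first fix in this commit can be sketched as a small header-selection function. This is a framework-agnostic illustration, not the repository's serving code: `cache_headers` is a hypothetical name, and the one-year `max-age` with `immutable` is an assumed way of expressing "cacheable forever".

```python
def cache_headers(path: str) -> dict[str, str]:
    """Pick Cache-Control for a static file based on its URL path."""
    if path.startswith("/assets/"):
        # Hashed filenames change on every rebuild, so the content at a given
        # URL never changes and can be cached long-term (assumed values).
        return {"Cache-Control": "public, max-age=31536000, immutable"}
    # index.html and other entry points: the browser may store the file but
    # must revalidate it before each use, so new asset hashes are picked up.
    return {"Cache-Control": "no-cache"}
```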
2fdafc1d10 fix(avocet): strip HTML from email bodies — stdlib HTMLParser, no deps 2026-03-03 16:28:18 -08:00
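A stdlib-only HTML stripper along the lines this commit describes might look like the sketch below. It is an illustration built on `html.parser.HTMLParser`, not the repository's implementation: the class name, the skipping of script and style content, and the whitespace collapsing are all assumptions.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, ignoring the contents of <script> and <style>."""

    def __init__(self) -> None:
        super().__init__()  # convert_charrefs=True decodes entities such as &amp;
        self._chunks: list[str] = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self._chunks.append(data)

    def text(self) -> str:
        # Join chunks with spaces and collapse runs of whitespace. This is a
        # simplification: inline tags inside a word would introduce a space.
        return " ".join(" ".join(self._chunks).split())


def strip_html(html: str) -> str:
    """Return the visible text of an HTML fragment."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return parser.text()
```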
01cc908eab fix(avocet): undo — commit-then-clear order, empty-records guard, skip dedup, stronger test 2026-03-03 15:41:58 -08:00
f4facc6484 feat(avocet): discard, undo, labels config, static serving — backend complete 2026-03-03 15:35:01 -08:00
5912b73705 feat(avocet): POST /api/skip endpoint 2026-03-03 15:21:32 -08:00
ce202d97ea feat(avocet): POST /api/label endpoint 2026-03-03 15:14:04 -08:00
6556e3fef0 fix(avocet): queue_with_items fixture uses api._DATA_DIR to avoid implicit tmp_path coupling 2026-03-03 15:03:57 -08:00
8898258055 feat(avocet): GET /api/queue endpoint 2026-03-03 15:00:59 -08:00
d36d0be166 fix(avocet): _write_jsonl empty-list writes empty file; add reset_last_action helper 2026-03-03 14:36:18 -08:00
f06114e648 feat(avocet): FastAPI skeleton + JSONL helpers 2026-03-03 13:30:28 -08:00
4c346aa328 feat: 9 labels (add event_rescheduled/unrelated/digest), wildcard Other label, InvalidCharacterError fix 2026-02-27 14:34:15 -08:00
d68754d432 feat: initial avocet repo — email classifier training tool
Scrape → Store → Process pipeline for building email classifier
benchmark data across the CircuitForge menagerie.

- app/label_tool.py — Streamlit card-stack UI, multi-account IMAP fetch,
  6-bucket labeling, undo/skip, keyboard shortcuts (1-6/S/U)
- scripts/classifier_adapters.py — ZeroShotAdapter (+ two_pass),
  GLiClassAdapter, RerankerAdapter; ABC with lazy model loading
- scripts/benchmark_classifier.py — 13-model registry, --score,
  --compare, --list-models, --export-db; uses label_tool.yaml for IMAP
- tests/ — 20 tests, all passing, zero model downloads required
- config/label_tool.yaml.example — multi-account IMAP template
- data/email_score.jsonl.example — sample labeled data for CI

Labels: interview_scheduled, offer_received, rejected,
        positive_response, survey_received, neutral
2026-02-27 14:07:38 -08:00