feat: Corrections tab — SFT candidate import, review, and JSONL export #15

Merged
pyr0ball merged 99 commits from feat/sft-corrections into main 2026-04-08 22:19:01 -07:00

99 commits

Author SHA1 Message Date
f17aae3bd2 feat: add dev command for hot-reload (uvicorn --reload + Vite HMR)
- manage.sh: dev command starts uvicorn --reload on :8503 and Vite dev
  server (auto-port from 5173); kills API on EXIT/INT/TERM trap
- manage.sh: ENV_UI defaults to 'cf' env (overridable via AVOCET_ENV)
- vite.config.ts: add server.proxy to forward /api to :8503 so Vite
  dev server can reach the backend without CORS issues
2026-04-08 19:43:40 -07:00
09e334359f fix: pessimistic submit/undo, config null-safe, load config on mount
- sft.py GET /config: use `or {}` guard so `sft: ~` (null YAML) doesn't
  return None instead of the default empty config
- CorrectionsView: convert handleCorrect/Discard/Flag and handleUndo from
  optimistic to pessimistic — queue mutation only happens after server
  confirms; failures leave item in queue so user can retry cleanly
- SettingsView: call loadSftConfig() on mount so saved bench_results_dir
  is populated instead of always starting empty
2026-04-08 18:49:38 -07:00
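The `or {}` guard from the first bullet can be sketched as follows. This is a minimal illustration, not the actual code in `app/sft.py`: `get_sft_config` is a hypothetical helper, and the `raw` dict stands in for what a YAML loader returns for a file containing `sft: ~`.

```python
from __future__ import annotations

from typing import Any


def get_sft_config(raw: dict[str, Any] | None) -> dict[str, Any]:
    """Return the `sft` section, falling back to an empty dict.

    `raw.get("sft", {})` alone is not enough: when the key is present but
    null (`sft: ~` in YAML), the default is ignored and None comes back.
    """
    return (raw or {}).get("sft") or {}


# A YAML loader maps `sft: ~` to {"sft": None}.
null_section = {"sft": None}
print(null_section.get("sft", {}))   # None: the default is not applied
print(get_sft_config(null_section))  # {}
```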
353d0a47a0 feat: Corrections tab — router, sidebar, settings, SFT config endpoints
- Add /corrections route to Vue router (lazy-loaded CorrectionsView)
- Add Corrections nav item (✍️) to AppSidebar after Benchmark
- Add cf-orch Integration section to SettingsView with bench_results_dir
  field, run scanner, and per-run import table
- Add GET /api/sft/config and POST /api/sft/config endpoints to app/sft.py
2026-04-08 18:29:22 -07:00
e63d77127b feat: CorrectionsView and useSftKeyboard composable 2026-04-08 15:26:13 -07:00
03e5f9f9b4 fix: guard null failure_reason render, fix mid-quality test description
- Add v-if guard on failure-reason <p> so null renders no element (not literal "null")
- Clarify mid-quality test description: score is 0.4 to <0.7 (exclusive upper bound)
- Add test: renders nothing for failure_reason when null (+1 → 14 SftCard tests)
2026-04-08 15:23:19 -07:00
e16ea95dcc fix: guard aria-describedby from rendering undefined string 2026-04-08 15:22:12 -07:00
8873920b83 feat: SftCard — quality chip, prompt collapsible, action buttons, correction area slot 2026-04-08 15:19:37 -07:00
2d939b77f9 feat: SftCorrectionArea — inline correction text area component 2026-04-08 15:16:45 -07:00
137a9dbb8e fix: nullable failure_reason, factory fixture for sft store tests 2026-04-08 15:14:29 -07:00
9c11916d81 feat: useSftStore — SftQueueItem type and Pinia store 2026-04-08 15:11:17 -07:00
b6d45c746c fix: shared _is_exportable predicate, return type annotations on export/stats 2026-04-08 15:07:24 -07:00
07807f0d05 feat: sft router — /export and /stats endpoints 2026-04-08 14:46:08 -07:00
4ad2907ae8 fix: use Literal type for SubmitRequest.action field 2026-04-08 14:33:38 -07:00
f19cab60f7 feat: sft router — /queue, /submit, /undo endpoints 2026-04-08 14:22:06 -07:00
b330e84111 fix: sft router — yaml error handling, none filter, shared jsonl utils, fixture restore 2026-04-08 14:07:09 -07:00
597ffc7324 feat: sft router skeleton — /api/sft/runs and /api/sft/import 2026-04-08 13:54:58 -07:00
cfde474454 fix: log on malformed json in _read_jsonl, use streaming id dedup 2026-04-08 07:37:22 -07:00
bbfae1a622 fix: log warning when sft record is missing id field 2026-04-08 07:30:46 -07:00
03dac57fd9 feat: sft_import.py — run discovery and JSONL deduplication 2026-04-08 07:13:37 -07:00
25880e377d refactor: consolidate HTML extraction into app/utils.py
Rename _strip_html/_extract_body to strip_html/extract_body (public API).
Remove duplicate _TextExtractor, strip_html, and _extract_body from
imap_fetch.py; import from app.utils instead. Update test_label_tool.py
to use the new public names.
2026-04-08 06:52:15 -07:00
ae0ac19505 chore: retire Streamlit app, scaffold sft branch
- Delete app/label_tool.py (Streamlit UI retired; Vue SPA is sole UI)
- Extract _strip_html and _extract_body into app/utils.py (stdlib-only, reusable)
- Update tests/test_label_tool.py import to app.utils
- Rename start-api/stop-api/restart-api/open-api → start/stop/restart/open in manage.sh
- Remove STREAMLIT variable and all Streamlit-specific case blocks from manage.sh
- Update manage.sh usage section to reflect Vue+FastAPI-only commands
- Add data/sft_candidates.jsonl and data/sft_approved.jsonl to .gitignore
- Add sft.bench_results_dir key to config/label_tool.yaml.example
2026-04-08 06:18:12 -07:00
de2a2935b9 chore: gitignore CLAUDE.md and docs/superpowers (BSL 1.1 compliance) 2026-03-27 01:00:30 -07:00
0d252da2a0 feat(avocet): add cancel buttons for benchmark and fine-tune runs 2026-03-15 18:15:35 -07:00
e38a28dcc3 fix(avocet): narrow cancel except clause, clear stale cancel flags on new run
- except clause in cancel_benchmark/cancel_finetune narrowed from Exception
  to _subprocess.TimeoutExpired (C1)
- _cancelled_jobs.discard() called after registering new proc to prevent
  a stale flag from a prior run masking errors (I2)
- local `import subprocess` removed from run_benchmark and
  run_finetune_endpoint; all Popen calls updated to _subprocess.Popen (I1)
- test patch targets updated from subprocess.Popen to app.api._subprocess.Popen;
  cancelled-event tests updated to set flag in proc.wait() side-effect so
  the discard-on-new-run logic is exercised correctly
2026-03-15 18:13:01 -07:00
0ab49609c0 feat(avocet): add cancel endpoints for benchmark and finetune jobs
Adds POST /api/benchmark/cancel and POST /api/finetune/cancel endpoints
that terminate the running subprocess (kill on 3s timeout), and updates
the run generators to emit a cancelled SSE event instead of error when
the job was intentionally stopped.
2026-03-15 18:09:20 -07:00
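The terminate-then-kill behaviour described above might look roughly like this. It is a sketch under assumptions: `cancel_process` is a hypothetical name, and the real endpoints also manage job state and SSE events that are omitted here.

```python
import subprocess
import sys


def cancel_process(proc: subprocess.Popen, timeout: float = 3.0) -> int:
    """Ask the process to stop; force-kill it if it has not exited in time."""
    proc.terminate()
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Only the timeout is caught, so unrelated errors still surface.
        proc.kill()
        return proc.wait()


if __name__ == "__main__":
    # Stand-in for a long-running benchmark or fine-tune subprocess.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    print("exited with", cancel_process(child))
```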
db44c9323e fix(avocet): use_reentrant=False for gradient checkpointing
Reentrant gradient checkpointing (the default) conflicts with Accelerate's
gradient accumulation context manager -- causes 'backward through graph a
second time' on the first training step. use_reentrant=False uses the
non-reentrant autograd hook path which is compatible with Accelerate >= 0.27.
2026-03-15 17:23:40 -07:00
cbc382cc88 fix(avocet): reduce deberta-small VRAM + auto-select freest GPU for training
- deberta-small: batch_size 16→8 + grad_accum 1→2 (same effective batch),
  gradient_checkpointing=True (fp16 stays off: DeBERTa v3 disentangled
  attention overflows fp16 at the gather step)
- api: _best_cuda_device() picks highest free-VRAM GPU via nvidia-smi;
  sets CUDA_VISIBLE_DEVICES in subprocess env to prevent DataParallel
  replication across both GPUs; adds PYTORCH_ALLOC_CONF=expandable_segments
- SSE log now reports which GPU was selected
2026-03-15 17:09:06 -07:00
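One way to pick the GPU with the most free VRAM is to parse `nvidia-smi` CSV output, as sketched below. The function names and the exact query are assumptions for illustration; the commit only states that `_best_cuda_device()` uses `nvidia-smi`.

```python
from __future__ import annotations

import subprocess


def parse_free_memory(csv_text: str) -> dict[int, int]:
    """Parse 'index, memory.free' CSV rows into {gpu_index: free_MiB}."""
    free: dict[int, int] = {}
    for line in csv_text.strip().splitlines():
        index, mem = (field.strip() for field in line.split(","))
        free[int(index)] = int(mem)
    return free


def best_cuda_device(csv_text: str | None = None) -> int | None:
    """Return the index of the GPU with the most free memory, or None."""
    if csv_text is None:
        try:
            csv_text = subprocess.run(
                ["nvidia-smi", "--query-gpu=index,memory.free",
                 "--format=csv,noheader,nounits"],
                capture_output=True, text=True, check=True,
            ).stdout
        except (OSError, subprocess.CalledProcessError):
            return None  # nvidia-smi missing or failed
    free = parse_free_memory(csv_text)
    return max(free, key=free.get) if free else None


# Sample output for two GPUs (made-up numbers).
sample = "0, 1024\n1, 20480\n"
print(best_cuda_device(sample))  # 1
```

The returned index can then be exported as `CUDA_VISIBLE_DEVICES` in the subprocess environment so that only that GPU is visible to the training job.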
ed818dc341 feat(avocet): add restart-api command to manage.sh 2026-03-15 17:04:00 -07:00
5d68b0706f fix(avocet): use startsWith for error class in ft-log (consistent with benchmark log) 2026-03-15 16:14:47 -07:00
65548f4ddb feat(avocet): add fine-tune section and trained models badge row to BenchmarkView 2026-03-15 16:09:51 -07:00
dd352f07cd fix(avocet): _MODELS_DIR overridable in tests; sanitize score paths against path traversal 2026-03-15 16:07:27 -07:00
903624a4b8 feat(avocet): add /api/finetune/status and /api/finetune/run endpoints 2026-03-15 16:04:34 -07:00
48e02f2ed6 fix(avocet): move TorchDataset import to top; split sample_count into total+train 2026-03-15 16:02:43 -07:00
939ce06f45 feat(avocet): run_finetune, CLI, multi-score-file merge with last-write-wins dedup
- load_and_prepare_data() now accepts Path | list[Path]; single-Path callers unchanged
- Dedup by MD5(subject + body[:100]); last file/row wins (lets later runs correct labels)
- Prints summary line when duplicates are dropped
- Added _EmailDataset (TorchDataset wrapper), run_finetune(), and argparse CLI
- run_finetune() saves model + tokenizer + training_info.json with score_files provenance
- Stratified split guard: val set size clamped to at least n_classes (handles tiny example data)
- 3 new unit tests (merge, last-write-wins dedup, single-Path compat) + 1 integration test
- All 16 tests pass (15 unit + 1 integration)
2026-03-15 15:52:41 -07:00
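The last-write-wins dedup from the second bullet can be illustrated with a short sketch. The helper names are hypothetical and the rows are made-up; only the key, MD5 of subject plus the first 100 characters of the body, comes from the commit message.

```python
import hashlib


def dedup_key(row: dict) -> str:
    """MD5 of the subject plus the first 100 characters of the body."""
    text = row.get("subject", "") + row.get("body", "")[:100]
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def merge_last_write_wins(*batches: list) -> list:
    """Merge batches in order; a later row replaces an earlier duplicate."""
    merged = {}
    for batch in batches:
        for row in batch:
            merged[dedup_key(row)] = row
    return list(merged.values())


first_run = [{"subject": "Offer", "body": "We are pleased...", "label": "new_lead"}]
second_run = [{"subject": "Offer", "body": "We are pleased...", "label": "hired"}]
rows = merge_last_write_wins(first_run, second_run)
print(len(rows), rows[0]["label"])  # 1 hired
```

Because a dict keeps the position of the first insertion for a key, the corrected row stays where the original appeared while carrying the later label.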
4e70e79b26 fix(avocet): tighten body truncation test to exact 400-char assertion 2026-03-15 15:44:19 -07:00
de5794611b feat(avocet): add finetune data pipeline, class weights, WeightedTrainer
Implements load_and_prepare_data (JSONL ingestion with class filtering),
compute_class_weights (inverse-frequency, div-by-zero safe), compute_metrics_for_trainer
(macro F1 + accuracy), and WeightedTrainer.compute_loss (**kwargs-safe for
Transformers 4.38+ num_items_in_batch). All 12 tests pass.
2026-03-15 15:38:45 -07:00
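As a rough illustration of inverse-frequency class weights with a division-by-zero guard, the sketch below uses the common "balanced" formula `total / (n_classes * count)`. The formula and the function name are assumptions; the commit does not spell out how `compute_class_weights` normalises its output.

```python
from collections import Counter


def inverse_frequency_weights(labels: list, n_classes: int) -> list:
    """Weight each class by total / (n_classes * count).

    A class with no samples uses 1 in place of its count, so the
    division never fails.
    """
    counts = Counter(labels)
    total = len(labels)
    return [total / (n_classes * max(counts.get(c, 0), 1)) for c in range(n_classes)]


# Class 0 is common, class 1 is rare, class 2 has no samples at all.
weights = inverse_frequency_weights([0, 0, 0, 1], n_classes=3)
print(weights)
```

The rare class receives a larger weight than the common one, which is the effect a weighted loss relies on.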
d1a36bfd63 fix(avocet): guard discover_finetuned_models against malformed/incomplete training_info.json 2026-03-15 15:18:13 -07:00
df37a8e16d feat(avocet): auto-discover fine-tuned models in benchmark harness 2026-03-15 11:59:13 -07:00
179cb67e1c fix(avocet): FineTunedAdapter GPU device routing + precise body truncation test 2026-03-15 10:56:47 -07:00
dc321de59f feat(avocet): add FineTunedAdapter for local checkpoint inference 2026-03-15 10:54:38 -07:00
f4a654933d chore(avocet): add scikit-learn to classifier env 2026-03-15 09:44:04 -07:00
a53f3a7341 feat(avocet): benchmark UI, label fixes, BenchmarkView with charts and SSE run 2026-03-15 09:39:37 -07:00
ce1b8c2215 fix(avocet): reset card element state when new item loads to clear previous animation inline styles 2026-03-08 07:44:02 -07:00
f1933ab51c feat(avocet): badge pop via Anime.js spring transition hook 2026-03-08 07:35:49 -07:00
6a898bbdee fix(avocet): constrain grid-active to 640px on wide viewports using left/right offsets 2026-03-08 07:26:46 -07:00
efc2d33de2 feat(avocet): animate bucket grid rise with Anime.js spring 2026-03-08 07:17:56 -07:00
5c6aa02998 fix(avocet): restore drag aura color feedback via updateAura in useCardAnimation 2026-03-08 07:14:24 -07:00
9302644259 feat(avocet): wire Anime.js card animation into EmailCardStack
Replace CSS keyframe dismiss classes and inline cardStyle/deltaX/deltaY
with useCardAnimation composable — pickup/setDragPosition/snapBack/animateDismiss
are now called from pointer event handlers and a dismissType watcher.
2026-03-08 07:07:58 -07:00
b68c176278 feat(avocet): add useCardAnimation composable with Anime.js
TDD: 8 tests written first (red), then composable implemented (green).
Adapts to Anime.js v4 API: 2-arg animate(), object-param spring(),
utils.set() for instant drag-position updates without cache desync.
2026-03-08 06:52:27 -07:00
d02c937ff1 feat(avocet): add animejs v4 dependency 2026-03-08 06:47:50 -07:00
611f510547 docs: add privacy policy reference 2026-03-05 20:59:37 -08:00
1a95d4d580 fix(avocet): ball escapes overflow clip, floats above header/footer with z-index + transparency 2026-03-05 15:14:24 -08:00
351703d9db fix(avocet): grid pinned to viewport with height 100dvh + card ball floats above finger at scale 0.55 2026-03-05 15:07:58 -08:00
d7cd01a8da feat(avocet): add velocity-based fling detection to toss gesture (option B: speed + alignment) 2026-03-05 14:55:10 -08:00
8947dc5d05 feat(avocet): add toss-zone overlays and grid-rise animation to LabelView 2026-03-05 13:41:52 -08:00
fc8cb9a8bd feat(avocet): replace swipe+HTML5-drag with unified pointer-events toss gesture 2026-03-05 10:38:52 -08:00
cac02b2c5f feat(avocet): replace HTML5 drag events on LabelBucketGrid with hoveredBucket prop 2026-03-05 10:10:48 -08:00
8a2df0e2f8 feat: card crumples to small ball on drag pickup so buckets expand fully 2026-03-04 12:38:46 -08:00
33f5e0d8a1 fix: keyboard shortcuts now work after labels load (lazy keymap evaluation)
useLabelKeyboard now accepts labels as Label[] | (() => Label[]).
The keymap is rebuilt on every keypress from the getter result instead of
being captured once at construction time — so keys 1–9 now fire correctly
after the async /api/config/labels fetch completes.

LabelView passes () => labels.value so the reactive ref is read lazily.

New test: 'evaluates labels getter on each keypress' covers the async-load
scenario (empty list → no match; push a label → key fires).
2026-03-04 12:32:25 -08:00
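The actual composable is TypeScript, but the "value or getter" idea is language-neutral and can be sketched in Python. Everything below is hypothetical (class name, label shape, key assignment); it only shows why rebuilding the keymap on each keypress picks up labels that arrive after construction.

```python
from __future__ import annotations

from typing import Callable, Union

LabelsSource = Union[list, Callable[[], list]]


def resolve_labels(source: LabelsSource) -> list:
    """Accept either a plain list or a zero-argument getter."""
    return source() if callable(source) else source


class LabelKeyboard:
    def __init__(self, labels: LabelsSource) -> None:
        self._labels = labels  # keep the source, not a snapshot of the keymap

    def handle(self, key: str) -> str | None:
        # Rebuild the keymap on every keypress so late-loaded labels are seen.
        current = resolve_labels(self._labels)[:9]
        keymap = {str(i): label["name"] for i, label in enumerate(current, start=1)}
        return keymap.get(key)


labels: list[dict] = []
keyboard = LabelKeyboard(lambda: labels)
print(keyboard.handle("1"))  # None: nothing loaded yet
labels.append({"name": "hired"})
print(keyboard.handle("1"))  # hired
```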
43ef2ff8d2 fix: pin bucket grid to bottom of viewport with sticky footer; prevents mis-click from layout shift 2026-03-04 12:26:04 -08:00
c4498a8190 feat: implement FetchView — SSE progress bars, account selection, targeted fetch 2026-03-04 12:23:58 -08:00
6b6205e4ed feat: implement StatsView — label distribution bars, file info, download 2026-03-04 12:21:21 -08:00
9ef2c1251d feat: implement SettingsView — IMAP account management, test connection, display toggles 2026-03-04 12:20:30 -08:00
c94d271f4c feat: add useApiSSE helper for Server-Sent Events connections 2026-03-04 12:17:46 -08:00
2a48ab0f03 feat: add Vue Router + stow-able AppSidebar; stub Fetch/Stats/Settings views 2026-03-04 12:12:26 -08:00
07407117a5 feat: add GET /api/fetch/stream SSE endpoint for real-time IMAP progress 2026-03-04 12:05:23 -08:00
e5e66b09cc feat: add POST /api/accounts/test endpoint 2026-03-04 12:04:42 -08:00
47a2178ee4 feat: add GET /api/stats and GET /api/stats/download endpoints 2026-03-04 12:04:11 -08:00
3f0cd7e837 feat: add GET/POST /api/config endpoints for IMAP account management 2026-03-04 12:03:40 -08:00
8a0545a6e7 feat: extract IMAP logic to app/imap_fetch.py for reuse by API 2026-03-04 11:42:22 -08:00
dc92ecff5f fix: bucket grid now renders 3x3+1 numpad layout on all screen sizes 2026-03-04 11:31:36 -08:00
8d2fdf6299 fix: UndoToast now emits expire after 5s so toast self-dismisses 2026-03-04 11:29:03 -08:00
3788254abd fix: prevent blank page on rebuild and queue drain on skip/discard
Two bugs fixed:

1. Blank white page after Vue SPA rebuild: browsers cached old index.html
   referencing old asset hashes. Assets are deleted on rebuild, causing
   404s for JS/CSS -> blank page. Fix: serve index.html with
   Cache-Control: no-cache so browsers always fetch fresh HTML.
   Hashed assets (/assets/chunk-abc123.js) remain cacheable forever.

2. Queue draining to empty on skip/discard: handleSkip and handleDiscard
   never refilled the local queue buffer. After enough skips, store.current
   went null and the empty state showed (blank-looking). Fix: both handlers
   now call fetchBatch() when queue drops below 3, matching handleLabel.

Also: sync classifier_adapters LABELS to match current 10-label schema
(new_lead + hired, remove unrelated).

48 Python tests pass, 48 frontend tests pass.
2026-03-03 19:26:34 -08:00
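The caching split from fix 1 can be expressed as a small path-based rule. This is a sketch, not the project's serving code: `cache_headers` is a hypothetical helper, and the one-year `max-age` with `immutable` is an assumed way of expressing "cacheable forever".

```python
def cache_headers(path: str) -> dict:
    """Choose Cache-Control by path: hashed assets are long-lived, HTML is revalidated."""
    if path.startswith("/assets/"):
        # File names contain a content hash, so a changed file gets a new URL.
        return {"Cache-Control": "public, max-age=31536000, immutable"}
    # index.html must be revalidated so it never points at deleted assets.
    return {"Cache-Control": "no-cache"}


print(cache_headers("/index.html"))
print(cache_headers("/assets/chunk-abc123.js"))
```

Note that `no-cache` still allows the browser to store the file; it only forces revalidation with the server before each use.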
cb9ebb805c fix(avocet): normalize queue schema + bind to 0.0.0.0 for LAN access
- Add _item_id() (content hash) + _normalize() to map legacy JSONL fields
  (from_addr/account/no-id) to Vue schema (from/source/id)
- All mutating endpoints now look up by _normalize(x)["id"] — handles both
  stored-id (test fixtures) and content-hash (real data) transparently
- Change uvicorn bind from 127.0.0.1 to 0.0.0.0 so LAN clients can connect
2026-03-03 18:43:00 -08:00
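A minimal sketch of the content-hash id and field normalisation follows. The field mapping (from_addr/account to from/source) is taken from the commit message; the fields that feed the hash and the function bodies are assumptions.

```python
import hashlib
import json


def item_id(item: dict) -> str:
    """Stable id derived from content (the choice of fields is hypothetical)."""
    basis = json.dumps([item.get("subject", ""), item.get("body", "")])
    return hashlib.md5(basis.encode("utf-8")).hexdigest()


def normalize(item: dict) -> dict:
    """Map legacy keys (from_addr, account, no id) to from/source/id."""
    out = dict(item)  # never mutate the stored record
    if "from_addr" in out:
        out.setdefault("from", out.pop("from_addr"))
    if "account" in out:
        out.setdefault("source", out.pop("account"))
    out.setdefault("id", item_id(out))  # a stored id always wins
    return out


legacy = {"from_addr": "a@example.com", "account": "work", "subject": "Hi", "body": "text"}
queue = [legacy, {"id": "fixture-1", "from": "b@example.com", "source": "test"}]

# Lookup works for both stored ids and content hashes.
wanted = normalize(legacy)["id"]
match = next((x for x in queue if normalize(x)["id"] == wanted), None)
print(match is legacy)  # True
```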
5ac64b3429 fix(avocet): start-api polls port instead of sleeping 1s — avoids false-success on slow start 2026-03-03 18:11:53 -08:00
2fdafc1d10 fix(avocet): strip HTML from email bodies — stdlib HTMLParser, no deps 2026-03-03 16:28:18 -08:00
65d9f6089e feat(avocet): easter eggs — hired confetti, century mark, clean sweep, midnight labeler, cursor trail 2026-03-03 16:24:47 -08:00
5cf5fe9c43 feat(avocet): manage.sh start-api / stop-api / open-api commands 2026-03-03 16:23:56 -08:00
8e3d263847 feat(avocet): LabelView — wires store, API, card stack, keyboard, easter eggs
Implements Task 13: LabelView.vue wires together the label store, API
fetch, card stack, bucket grid, keyboard shortcuts, haptics, motion
preference, and three easter egg badges (on-a-roll, speed round, fifty
deep). App.vue updated to mount LabelView and restore hacker-mode theme
on load. 3 new LabelView tests; all 48 tests pass, build clean.
2026-03-03 16:21:07 -08:00
05d12a1417 feat(avocet): LabelBucketGrid bucket-mode CSS — spring expansion, glow on drop 2026-03-03 16:19:29 -08:00
97437f39c9 feat(avocet): EmailCardStack — swipe gestures, depth shadows, dismissal classes 2026-03-03 16:16:09 -08:00
5f03db20dc feat(avocet): UndoToast — 5-second countdown, undo button, accessible 2026-03-03 16:13:02 -08:00
2fd101f382 feat(avocet): useLabelKeyboard — 1-9, h, S, D, U, ? shortcuts 2026-03-03 16:12:58 -08:00
452730bb98 feat(avocet): LabelBucketGrid — numpad layout, bucket-mode expansion, drag drop 2026-03-03 16:04:31 -08:00
f5aca77ff6 feat(avocet): EmailCard component — subject, from/date, body preview, expand/collapse 2026-03-03 16:03:01 -08:00
7edcb089a9 feat(avocet): useApi, useMotion, useHaptics, useEasterEgg (Konami/hacker mode)
- useApiFetch: typed fetch wrapper with network/http error discrimination
- useMotion: reactive localStorage override for rich-animation toggle, respects OS prefers-reduced-motion
- useHaptics: label/discard/skip/undo vibration patterns, gated on rich mode
- useKonamiCode + useHackerMode: 10-key Konami sequence → hacker theme, persisted in localStorage
- test-setup.ts: jsdom matchMedia stub so useMotion imports cleanly in Vitest
- smoke.test.ts: import smoke tests for all 4 composables (12 tests, all passing)
2026-03-03 15:58:43 -08:00
50dd4c2f45 feat(avocet): Pinia label store with queue, lastAction, easter egg counter 2026-03-03 15:54:44 -08:00
b10c337214 fix(avocet): align theme with Peregrine design system — full token set, dark mode, self-hosted fonts 2026-03-03 15:52:38 -08:00
a9e5c0d8f3 feat(avocet): CircuitForge base theme + Avocet Slate Teal/Russet colors 2026-03-03 15:49:07 -08:00
02efd5fb1d feat(avocet): Vite + Vue 3 + UnoCSS + Vitest scaffold 2026-03-03 15:46:58 -08:00
01cc908eab fix(avocet): undo — commit-then-clear order, empty-records guard, skip dedup, stronger test 2026-03-03 15:41:58 -08:00
f4facc6484 feat(avocet): discard, undo, labels config, static serving — backend complete 2026-03-03 15:35:01 -08:00
5912b73705 feat(avocet): POST /api/skip endpoint 2026-03-03 15:21:32 -08:00
efd9d69692 fix(avocet): store original item in _last_action; add requirements.txt 2026-03-03 15:16:54 -08:00
ce202d97ea feat(avocet): POST /api/label endpoint 2026-03-03 15:14:04 -08:00
6556e3fef0 fix(avocet): queue_with_items fixture uses api._DATA_DIR to avoid implicit tmp_path coupling 2026-03-03 15:03:57 -08:00
8898258055 feat(avocet): GET /api/queue endpoint 2026-03-03 15:00:59 -08:00
d36d0be166 fix(avocet): _write_jsonl empty-list writes empty file; add reset_last_action helper 2026-03-03 14:36:18 -08:00
f06114e648 feat(avocet): FastAPI skeleton + JSONL helpers 2026-03-03 13:30:28 -08:00