avocet

Author	SHA1	Message	Date
pyr0ball	b6d45c746c	fix: shared _is_exportable predicate, return type annotations on export/stats	2026-04-08 15:07:24 -07:00
pyr0ball	07807f0d05	feat: sft router — /export and /stats endpoints	2026-04-08 14:46:08 -07:00
pyr0ball	4ad2907ae8	fix: use Literal type for SubmitRequest.action field	2026-04-08 14:33:38 -07:00
pyr0ball	f19cab60f7	feat: sft router — /queue, /submit, /undo endpoints	2026-04-08 14:22:06 -07:00
pyr0ball	b330e84111	fix: sft router — yaml error handling, none filter, shared jsonl utils, fixture restore	2026-04-08 14:07:09 -07:00
pyr0ball	597ffc7324	feat: sft router skeleton — /api/sft/runs and /api/sft/import	2026-04-08 13:54:58 -07:00
pyr0ball	25880e377d	refactor: consolidate HTML extraction into app/utils.py. Rename _strip_html/_extract_body to strip_html/extract_body (public API). Remove duplicate _TextExtractor, strip_html, and _extract_body from imap_fetch.py; import from app.utils instead. Update test_label_tool.py to use the new public names.	2026-04-08 06:52:15 -07:00
pyr0ball	ae0ac19505	chore: retire Streamlit app, scaffold sft branch - Delete app/label_tool.py (Streamlit UI retired; Vue SPA is sole UI) - Extract _strip_html and _extract_body into app/utils.py (stdlib-only, reusable) - Update tests/test_label_tool.py import to app.utils - Rename start-api/stop-api/restart-api/open-api → start/stop/restart/open in manage.sh - Remove STREAMLIT variable and all Streamlit-specific case blocks from manage.sh - Update manage.sh usage section to reflect Vue+FastAPI-only commands - Add data/sft_candidates.jsonl and data/sft_approved.jsonl to .gitignore - Add sft.bench_results_dir key to config/label_tool.yaml.example	2026-04-08 06:18:12 -07:00
pyr0ball	e38a28dcc3	fix(avocet): narrow cancel except clause, clear stale cancel flags on new run - except clause in cancel_benchmark/cancel_finetune narrowed from Exception to _subprocess.TimeoutExpired (C1) - _cancelled_jobs.discard() called after registering new proc to prevent a stale flag from a prior run masking errors (I2) - local `import subprocess` removed from run_benchmark and run_finetune_endpoint; all Popen calls updated to _subprocess.Popen (I1) - test patch targets updated from subprocess.Popen to app.api._subprocess.Popen; cancelled-event tests updated to set flag in proc.wait() side-effect so the discard-on-new-run logic is exercised correctly	2026-03-15 18:13:01 -07:00
pyr0ball	0ab49609c0	feat(avocet): add cancel endpoints for benchmark and finetune jobs. Adds POST /api/benchmark/cancel and POST /api/finetune/cancel endpoints that terminate the running subprocess (kill on 3s timeout), and updates the run generators to emit a cancelled SSE event instead of error when the job was intentionally stopped.	2026-03-15 18:09:20 -07:00
pyr0ball	cbc382cc88	fix(avocet): reduce deberta-small VRAM + auto-select freest GPU for training - deberta-small: batch_size 16→8 + grad_accum 1→2 (same effective batch), gradient_checkpointing=True (fp16 stays off: DeBERTa v3 disentangled attention overflows fp16 at the gather step) - api: _best_cuda_device() picks highest free-VRAM GPU via nvidia-smi; sets CUDA_VISIBLE_DEVICES in subprocess env to prevent DataParallel replication across both GPUs; adds PYTORCH_ALLOC_CONF=expandable_segments - SSE log now reports which GPU was selected	2026-03-15 17:09:06 -07:00
pyr0ball	dd352f07cd	fix(avocet): _MODELS_DIR overridable in tests; sanitize score paths against path traversal	2026-03-15 16:07:27 -07:00
pyr0ball	903624a4b8	feat(avocet): add /api/finetune/status and /api/finetune/run endpoints	2026-03-15 16:04:34 -07:00
pyr0ball	a53f3a7341	feat(avocet): benchmark UI, label fixes, BenchmarkView with charts and SSE run	2026-03-15 09:39:37 -07:00
pyr0ball	07407117a5	feat: add GET /api/fetch/stream SSE endpoint for real-time IMAP progress	2026-03-04 12:05:23 -08:00
pyr0ball	e5e66b09cc	feat: add POST /api/accounts/test endpoint	2026-03-04 12:04:42 -08:00
pyr0ball	47a2178ee4	feat: add GET /api/stats and GET /api/stats/download endpoints	2026-03-04 12:04:11 -08:00
pyr0ball	3f0cd7e837	feat: add GET/POST /api/config endpoints for IMAP account management	2026-03-04 12:03:40 -08:00
pyr0ball	8a0545a6e7	feat: extract IMAP logic to app/imap_fetch.py for reuse by API	2026-03-04 11:42:22 -08:00
pyr0ball	3788254abd	fix: prevent blank page on rebuild and queue drain on skip/discard. Two bugs fixed: 1. Blank white page after Vue SPA rebuild: browsers cached old index.html referencing old asset hashes. Assets are deleted on rebuild, causing 404s for JS/CSS -> blank page. Fix: serve index.html with Cache-Control: no-cache so browsers always fetch fresh HTML. Hashed assets (/assets/chunk-abc123.js) remain cacheable forever. 2. Queue draining to empty on skip/discard: handleSkip and handleDiscard never refilled the local queue buffer. After enough skips, store.current went null and the empty state showed (blank-looking). Fix: both handlers now call fetchBatch() when queue drops below 3, matching handleLabel. Also: sync classifier_adapters LABELS to match current 10-label schema (new_lead + hired, remove unrelated). 48 Python tests pass, 48 frontend tests pass.	2026-03-03 19:26:34 -08:00
pyr0ball	cb9ebb805c	fix(avocet): normalize queue schema + bind to 0.0.0.0 for LAN access - Add _item_id() (content hash) + _normalize() to map legacy JSONL fields (from_addr/account/no-id) to Vue schema (from/source/id) - All mutating endpoints now look up by _normalize(x)[id] — handles both stored-id (test fixtures) and content-hash (real data) transparently - Change uvicorn bind from 127.0.0.1 to 0.0.0.0 so LAN clients can connect	2026-03-03 18:43:00 -08:00
pyr0ball	2fdafc1d10	fix(avocet): strip HTML from email bodies — stdlib HTMLParser, no deps	2026-03-03 16:28:18 -08:00
pyr0ball	01cc908eab	fix(avocet): undo — commit-then-clear order, empty-records guard, skip dedup, stronger test	2026-03-03 15:41:58 -08:00
pyr0ball	f4facc6484	feat(avocet): discard, undo, labels config, static serving — backend complete	2026-03-03 15:35:01 -08:00
pyr0ball	5912b73705	feat(avocet): POST /api/skip endpoint	2026-03-03 15:21:32 -08:00
pyr0ball	efd9d69692	fix(avocet): store original item in _last_action; add requirements.txt	2026-03-03 15:16:54 -08:00
pyr0ball	ce202d97ea	feat(avocet): POST /api/label endpoint	2026-03-03 15:14:04 -08:00
pyr0ball	8898258055	feat(avocet): GET /api/queue endpoint	2026-03-03 15:00:59 -08:00
pyr0ball	d36d0be166	fix(avocet): _write_jsonl empty-list writes empty file; add reset_last_action helper	2026-03-03 14:36:18 -08:00
pyr0ball	f06114e648	feat(avocet): FastAPI skeleton + JSONL helpers	2026-03-03 13:30:28 -08:00
pyr0ball	f48590f859	feat: discard button — removes email from queue without writing to score file	2026-02-27 15:48:40 -08:00
pyr0ball	ab764cb8f6	feat: targeted fetch — date range + sender/subject filter for historical email pulls	2026-02-27 15:15:49 -08:00
pyr0ball	4c346aa328	feat: 9 labels (add event_rescheduled/unrelated/digest), wildcard Other label, InvalidCharacterError fix	2026-02-27 14:34:15 -08:00
pyr0ball	4c659033c9	fix: fetch log — overwrite per-email progress instead of appending. status.write() per email grows the log unboundedly on big pulls. Now uses status.empty() to create one updatable slot; per-email progress overwrites it, cleared after each account completes. Per-account summaries still use status.write() (one line each).	2026-02-27 14:20:57 -08:00
pyr0ball	260c7c0f96	feat: add Settings tab with IMAP account GUI + connection test - ⚙️ Settings tab: add/edit/remove accounts without touching YAML - Per-account: name, host, port, SSL, username, password (masked), folder, days back - Test connection button: connect → login → select folder → report message count - Save writes config/label_tool.yaml; Reload discards unsaved changes - _sync_settings_to_state() prevents index-key drift on add/remove - _test_imap_connection() helper shared with fetch tab indirectly - CLAUDE.md: document new tab, Settings UI design notes	2026-02-27 14:18:51 -08:00
pyr0ball	d68754d432	feat: initial avocet repo — email classifier training tool. Scrape → Store → Process pipeline for building email classifier benchmark data across the CircuitForge menagerie. - app/label_tool.py — Streamlit card-stack UI, multi-account IMAP fetch, 6-bucket labeling, undo/skip, keyboard shortcuts (1-6/S/U) - scripts/classifier_adapters.py — ZeroShotAdapter (+ two_pass), GLiClassAdapter, RerankerAdapter; ABC with lazy model loading - scripts/benchmark_classifier.py — 13-model registry, --score, --compare, --list-models, --export-db; uses label_tool.yaml for IMAP - tests/ — 20 tests, all passing, zero model downloads required - config/label_tool.yaml.example — multi-account IMAP template - data/email_score.jsonl.example — sample labeled data for CI. Labels: interview_scheduled, offer_received, rejected, positive_response, survey_received, neutral	2026-02-27 14:07:38 -08:00

36 commits
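
Commits 2fdafc1d10 and 25880e377d describe stripping HTML from email bodies with the stdlib HTMLParser and no extra dependencies. A minimal sketch of that technique follows. The names `_TextExtractor` and `strip_html` are taken from the commit messages, but the method bodies, the set of skipped tags, and the whitespace handling are assumptions for illustration, not the repository's actual code.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect text nodes, ignoring the contents of <script> and <style>."""

    # Assumption: these are the only tags whose contents should be dropped.
    _SKIP = {"script", "style"}
    # Assumption: these tags are treated as line breaks in the output.
    _BREAK = {"br", "p", "div"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self._SKIP:
            self._skip_depth += 1
        elif tag in self._BREAK:
            self._parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self._SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if self._skip_depth == 0:
            self._parts.append(data)

    def text(self):
        # Collapse runs of whitespace per line and drop empty lines.
        raw_lines = "".join(self._parts).splitlines()
        lines = (" ".join(line.split()) for line in raw_lines)
        return "\n".join(line for line in lines if line)


def strip_html(html):
    """Return the visible text of an HTML fragment (illustrative sketch)."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()  # flush any buffered text
    return parser.text()
```

For example, `strip_html("<p>Hello <b>world</b></p>")` yields the plain text `Hello world`, and entity references such as `&amp;` are decoded because `convert_charrefs=True`.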