feat: Corrections tab — SFT candidate import, review, and JSONL export #15
Summary
Implements avocet#14 "Import benchmark SFT candidates for labeling". Adds a full Corrections workflow to the Avocet SPA that pulls SFT candidates from cf-orch benchmark runs, surfaces them as reviewable cards, collects human corrections, and exports approved records as SFT-ready JSONL. Also retires the old Streamlit app.
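The import half of this workflow deduplicates on the `id` field while reading the existing file as a stream. A minimal sketch of that idea follows; it is not the code in `scripts/sft_import.py`. The function names `read_existing_ids` and `import_candidates`, the destination path, and the record shape beyond `id` are assumptions made for illustration.

```python
import json
from pathlib import Path
from typing import Iterable, Iterator


def read_existing_ids(path: Path) -> Iterator[str]:
    """Yield the id of each record already stored, reading one line at a time."""
    if not path.exists():
        return
    with path.open("r", encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines in the JSONL file
            record = json.loads(line)
            if "id" in record:
                yield record["id"]


def import_candidates(candidates: Iterable[dict], dest: Path) -> int:
    """Append candidates whose id is not yet in dest; return the number written."""
    # Only the ids are held in memory, never the full existing records.
    seen = set(read_existing_ids(dest))
    written = 0
    with dest.open("a", encoding="utf-8") as fh:
        for cand in candidates:
            cand_id = cand["id"]
            if cand_id in seen:
                continue  # duplicate of a stored or earlier-in-batch record
            seen.add(cand_id)
            fh.write(json.dumps(cand) + "\n")
            written += 1
    return written
```

Re-importing the same run with this approach writes nothing new, which is the property the `id`-based deduplication is meant to give.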
Backend (`app/sft.py` + `scripts/sft_import.py`)
- `GET /api/sft/runs`: discover importable benchmark result runs from cf-orch
- `POST /api/sft/import`: import a run with JSONL deduplication on the `id` field (streaming `_read_existing_ids` for memory efficiency)
- `GET /api/sft/queue`: paginated queue of `needs_review` candidates
- `POST /api/sft/submit`: approve (correct), discard, or flag a candidate; validated with `Literal["correct","discard","flag"]`
- `POST /api/sft/undo`: restore the last submitted item to `needs_review`
- `GET /api/sft/export`: NDJSON streaming export of approved records
- `GET /api/sft/stats`: counts by status using the shared `_is_exportable()` predicate
- `GET /api/sft/config` + `POST /api/sft/config`: read/write `bench_results_dir` with atomic file write (tmp + rename); null-safe YAML section handling
- `set_sft_data_dir`/`set_sft_config_dir`)

Frontend
- `stores/sft.ts`: Pinia store with queue, current (computed), lastAction, removeCurrentFromQueue, restoreItem
- `SftCorrectionArea.vue`: inline correction textarea with expose/reset, accessible aria-describedby guard
- `SftCard.vue`: quality chip (low/mid/ok), collapsible prompt, action buttons, `failure_reason` null guard
- `useSftKeyboard.ts`: keyboard shortcuts (c/d/f/Escape) with input/textarea focus guards
- `CorrectionsView.vue`: main review page with pessimistic submit/undo (rollback-safe), undo toast, stats sidebar
- `SettingsView.vue`: SFT Integration section with `bench_results_dir` input, run picker table, import UI; loads saved config on mount

Retired
- `streamlit_app.py`, `streamlit_requirements.txt`, `run_streamlit.sh`) removed

Tests
- `test_sft.py` + 7 new `test_sft_import.py`)

Test plan
- Add /corrections route to Vue router (lazy-loaded CorrectionsView)
- Add Corrections nav item (✍️) to AppSidebar after Benchmark
- Add cf-orch Integration section to SettingsView with bench_results_dir field, run scanner, and per-run import table
- Add GET /api/sft/config and POST /api/sft/config endpoints to app/sft.py
- sft.py GET /config: use `or {}` guard so `sft: ~` (null YAML) doesn't return None instead of the default empty config
- CorrectionsView: convert handleCorrect/Discard/Flag and handleUndo from optimistic to pessimistic; queue mutation only happens after the server confirms, and failures leave the item in the queue so the user can retry cleanly
- SettingsView: call loadSftConfig() on mount so saved bench_results_dir is populated instead of always starting empty
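
For reviewers, the `or {}` guard and the tmp + rename write can be sketched roughly as below. This is an illustration under stated assumptions, not the code in `app/sft.py`. The helper names `sft_section` and `atomic_write_text` are hypothetical, and the example takes an already-parsed config dict rather than calling a YAML library.

```python
import os
import tempfile
from pathlib import Path
from typing import Optional


def sft_section(parsed_config: Optional[dict]) -> dict:
    """Return the `sft` section, treating a missing or null section as empty.

    A YAML file containing `sft: ~` parses to {"sft": None}, so a plain
    .get("sft", {}) would still return None; `or {}` covers that case.
    """
    return (parsed_config or {}).get("sft") or {}


def atomic_write_text(path: Path, text: str) -> None:
    """Write text to a temp file next to path, then rename it over path."""
    # The temp file lives in the same directory so the rename stays on one
    # filesystem, which is what lets os.replace swap the file in one step.
    fd, tmp_name = tempfile.mkstemp(
        dir=path.parent, prefix=path.name, suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            fh.write(text)
        os.replace(tmp_name, path)
    except BaseException:
        # Clean up the temp file if the write or rename failed.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

With this shape, a reader of the config file sees either the old contents or the new contents, never a half-written file.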