Commit graph

28 commits

Author SHA1 Message Date
6ef6f06023 feat: restructure AppSidebar into two-domain nav with section headers and flywheel signal badges 2026-05-02 13:52:45 -07:00
3299c0e23a feat: Imitate tab — pull CF product samples, compare LLM responses
Backend (app/imitate.py):
- GET /api/imitate/products — reads imitate: config, checks online status
- GET /api/imitate/products/{id}/sample — fetches real item from product API
- GET /api/imitate/run (SSE) — streams ollama responses for selected models
- POST /api/imitate/push-corrections — queues results in SFT corrections JSONL

Frontend (ImitateView.vue):
- Step 1: product picker grid (online/offline status, icon from config)
- Step 2: raw sample preview + editable prompt textarea
- Step 3: ollama model multi-select, temperature slider, SSE run with live log
- Step 4: response cards side by side, push to Corrections button

Wiring:
- app/api.py: include imitate_router at /api/imitate
- web/src/router: /imitate route + lazy import
- AppSidebar: Imitate nav entry (mirror icon)
- config/label_tool.yaml.example: imitate: section with peregrine example
- 16 unit tests (100% passing)

Also: BenchmarkView.vue Compare panel — side-by-side run diff for bench results
2026-04-09 20:12:57 -07:00
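Note on the SSE run endpoint above: a minimal sketch of how per-model response chunks could be framed as Server-Sent Events. The payload fields (`model`, `text`) and the closing `done` event are assumptions for illustration, not the actual wire format of app/imitate.py.

```python
import json
from typing import Iterable, Iterator, Tuple


def sse_frames(chunks: Iterable[Tuple[str, str]]) -> Iterator[str]:
    """Yield SSE-formatted frames for (model, text) chunks.

    Each frame is a `data:` line followed by a blank line, which is
    how the SSE format delimits events. Field names are hypothetical.
    """
    for model, text in chunks:
        payload = json.dumps({"model": model, "text": text})
        yield f"data: {payload}\n\n"
    # Hypothetical terminal event so the client knows the run finished.
    yield "event: done\ndata: {}\n\n"


if __name__ == "__main__":
    for frame in sse_frames([("model-a", "Hello"), ("model-b", "Hi")]):
        print(frame, end="")
```

Sending one JSON object per `data:` line keeps client-side parsing to a single `JSON.parse` per event.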
b6b3d2c390 feat: HuggingFace model management tab
- New /api/models router: HF lookup, approval queue (JSONL persistence),
  SSE download progress via snapshot_download(), installed model listing,
  path-traversal-safe DELETE
- pipeline_tag → adapter type mapping (zero-shot-classification,
  sentence-similarity, text-generation)
- 27 tests covering all endpoints, duplicate detection, path traversal
- ModelsView.vue: HF lookup + add, approval queue, live download progress
  bars via SSE, installed model table with delete
- Sidebar entry (🤗 Models) between Benchmark and Corrections
2026-04-08 22:32:35 -07:00
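Note on the path-traversal-safe DELETE above: a minimal sketch of one common way to make such a check, assuming installed models live in subdirectories of a single root. The function name `resolve_model_dir` and the directory layout are hypothetical; the actual router code may differ.

```python
from pathlib import Path


def resolve_model_dir(models_root: Path, name: str) -> Path:
    """Return the resolved model directory, or raise ValueError if
    `name` would escape `models_root` (e.g. via `..` or an absolute path)."""
    root = models_root.resolve()
    candidate = (root / name).resolve()
    # The candidate must be strictly inside the root, never the root itself.
    if root not in candidate.parents:
        raise ValueError(f"refusing path outside models root: {name!r}")
    return candidate


if __name__ == "__main__":
    root = Path("/srv/models")  # hypothetical location
    print(resolve_model_dir(root, "org--model"))
    try:
        resolve_model_dir(root, "../etc")
    except ValueError as exc:
        print("rejected:", exc)
```

Resolving both paths before comparing them means `..` segments and absolute names are normalized away before the containment check runs.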
9633d9a535 feat: add failure_category field to SFT corrections (#16)
Adds optional failure_category to SubmitRequest and candidate records so
reviewers can classify why a model response was wrong, not just what to do
with it. Enables the fine-tune harness to filter training data by failure
type (e.g. exclude scoring artifacts, train only on genuine wrong answers).

Taxonomy: scoring_artifact | style_violation | partial_answer |
          wrong_answer | format_error | hallucination

- app/sft.py: FailureCategory Literal type; SubmitRequest.failure_category;
  stored on candidate record in POST /submit correct branch
- tests/test_sft.py: 3 new tests (stores value, null round-trip, 422 on invalid)
- stores/sft.ts: SftFailureCategory type exported; SftQueueItem + SftLastAction
  updated; setLastAction accepts optional category param
- SftCard.vue: chip-group selector shown during correct/discard/flag flow;
  two-step confirm for discard/flag reveals chips before emitting; category
  forwarded in all emit payloads
- CorrectionsView.vue: handleCorrect/Discard/Flag accept and forward category
  to POST /api/sft/submit body and store.setLastAction
- SftCard.test.ts: 11 new tests covering chip visibility, selection,
  single-active enforcement, pending-action flow, emit payloads, cancel
2026-04-08 22:10:26 -07:00
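Note on the failure_category taxonomy above: a minimal standard-library sketch of validating the optional field and filtering training data by failure type. The helper names `validate_failure_category` and `filter_for_training` are hypothetical, and the real app/sft.py presumably relies on its request model for validation (the commit mentions a 422 on invalid input).

```python
from typing import Dict, Iterable, List, Literal, Optional, get_args

# Taxonomy values as listed in the commit message.
FailureCategory = Literal[
    "scoring_artifact",
    "style_violation",
    "partial_answer",
    "wrong_answer",
    "format_error",
    "hallucination",
]


def validate_failure_category(value: Optional[str]) -> Optional[str]:
    """Accept None (the field is optional) or one of the taxonomy values."""
    if value is None:
        return None
    if value not in get_args(FailureCategory):
        raise ValueError(f"invalid failure_category: {value!r}")
    return value


def filter_for_training(
    records: Iterable[Dict],
    exclude: Iterable[str] = ("scoring_artifact",),
) -> List[Dict]:
    """Drop records whose failure_category is in `exclude` (hypothetical helper)."""
    excluded = set(exclude)
    return [r for r in records if r.get("failure_category") not in excluded]


if __name__ == "__main__":
    rows = [
        {"id": 1, "failure_category": "wrong_answer"},
        {"id": 2, "failure_category": "scoring_artifact"},
        {"id": 3, "failure_category": None},
    ]
    print(filter_for_training(rows))
```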
353d0a47a0 feat: Corrections tab — router, sidebar, settings, SFT config endpoints
- Add /corrections route to Vue router (lazy-loaded CorrectionsView)
- Add Corrections nav item (✍️) to AppSidebar after Benchmark
- Add cf-orch Integration section to SettingsView with bench_results_dir
  field, run scanner, and per-run import table
- Add GET /api/sft/config and POST /api/sft/config endpoints to app/sft.py
2026-04-08 18:29:22 -07:00
03e5f9f9b4 fix: guard null failure_reason render, fix mid-quality test description
- Add v-if guard on failure-reason <p> so null renders no element (not literal "null")
- Clarify mid-quality test description: score is 0.4 to <0.7 (exclusive upper bound)
- Add test: renders nothing for failure_reason when null (+1 → 14 SftCard tests)
2026-04-08 15:23:19 -07:00
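Note on the mid-quality range above: a minimal sketch of a bucketing function with the stated bounds (0.4 inclusive, 0.7 exclusive). The function name and the "low"/"high" labels are assumptions; only the mid range comes from the commit message.

```python
def quality_bucket(score: float) -> str:
    """Classify a quality score; mid covers 0.4 <= score < 0.7."""
    if score < 0.4:
        return "low"   # assumed label for scores below the mid range
    if score < 0.7:
        return "mid"
    return "high"      # assumed label for scores at or above 0.7


if __name__ == "__main__":
    for s in (0.39, 0.4, 0.69, 0.7):
        print(s, quality_bucket(s))
```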
e16ea95dcc fix: guard aria-describedby from rendering undefined string 2026-04-08 15:22:12 -07:00
8873920b83 feat: SftCard — quality chip, prompt collapsible, action buttons, correction area slot 2026-04-08 15:19:37 -07:00
2d939b77f9 feat: SftCorrectionArea — inline correction text area component 2026-04-08 15:16:45 -07:00
a53f3a7341 feat(avocet): benchmark UI, label fixes, BenchmarkView with charts and SSE run 2026-03-15 09:39:37 -07:00
ce1b8c2215 fix(avocet): reset card element state when new item loads to clear previous animation inline styles 2026-03-08 07:44:02 -07:00
5c6aa02998 fix(avocet): restore drag aura color feedback via updateAura in useCardAnimation 2026-03-08 07:14:24 -07:00
9302644259 feat(avocet): wire Anime.js card animation into EmailCardStack
Replace CSS keyframe dismiss classes and inline cardStyle/deltaX/deltaY
with useCardAnimation composable — pickup/setDragPosition/snapBack/animateDismiss
are now called from pointer event handlers and a dismissType watcher.
2026-03-08 07:07:58 -07:00
1a95d4d580 fix(avocet): ball escapes overflow clip, floats above header/footer with z-index + transparency 2026-03-05 15:14:24 -08:00
351703d9db fix(avocet): grid pinned to viewport with height 100dvh + card ball floats above finger at scale 0.55 2026-03-05 15:07:58 -08:00
d7cd01a8da feat(avocet): add velocity-based fling detection to toss gesture (option B: speed + alignment) 2026-03-05 14:55:10 -08:00
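Note on the speed + alignment fling detection above: a language-neutral sketch in Python of one way such a check could work. The thresholds, the sample format, and the function name `is_fling` are assumptions; the actual composable is written for the Vue frontend and may differ.

```python
import math
from typing import Sequence, Tuple

Sample = Tuple[float, float, float]  # (x, y, t_ms), oldest first


def is_fling(
    samples: Sequence[Sample],
    target: Tuple[float, float],
    min_speed: float = 0.6,       # px/ms, hypothetical threshold
    min_alignment: float = 0.7,   # cosine similarity, hypothetical threshold
) -> bool:
    """Return True when the pointer moves fast enough AND roughly toward `target`."""
    if len(samples) < 2:
        return False
    (x0, y0, t0), (x1, y1, t1) = samples[0], samples[-1]
    dt = t1 - t0
    if dt <= 0:
        return False
    vx, vy = (x1 - x0) / dt, (y1 - y0) / dt
    speed = math.hypot(vx, vy)
    if speed < min_speed:
        return False
    # Direction from the release point to the target (e.g. a bucket centre).
    tx, ty = target[0] - x1, target[1] - y1
    dist = math.hypot(tx, ty)
    if dist == 0:
        return True  # already on the target
    alignment = (vx * tx + vy * ty) / (speed * dist)
    return alignment >= min_alignment


if __name__ == "__main__":
    swipe = [(0.0, 0.0, 0.0), (100.0, 0.0, 100.0)]
    print(is_fling(swipe, target=(300.0, 0.0)))
    print(is_fling(swipe, target=(100.0, 300.0)))
```

Requiring both conditions avoids treating a fast but misdirected swipe, or a slow drag toward a bucket, as a toss.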
fc8cb9a8bd feat(avocet): replace swipe+HTML5-drag with unified pointer-events toss gesture 2026-03-05 10:38:52 -08:00
cac02b2c5f feat(avocet): replace HTML5 drag events on LabelBucketGrid with hoveredBucket prop 2026-03-05 10:10:48 -08:00
8a2df0e2f8 feat: card crumples to small ball on drag pickup so buckets expand fully 2026-03-04 12:38:46 -08:00
2a48ab0f03 feat: add Vue Router + stow-able AppSidebar; stub Fetch/Stats/Settings views 2026-03-04 12:12:26 -08:00
dc92ecff5f fix: bucket grid now renders 3x3+1 numpad layout on all screen sizes 2026-03-04 11:31:36 -08:00
8d2fdf6299 fix: UndoToast now emits expire after 5s so toast self-dismisses 2026-03-04 11:29:03 -08:00
05d12a1417 feat(avocet): LabelBucketGrid bucket-mode CSS — spring expansion, glow on drop 2026-03-03 16:19:29 -08:00
97437f39c9 feat(avocet): EmailCardStack — swipe gestures, depth shadows, dismissal classes 2026-03-03 16:16:09 -08:00
5f03db20dc feat(avocet): UndoToast — 5-second countdown, undo button, accessible 2026-03-03 16:13:02 -08:00
452730bb98 feat(avocet): LabelBucketGrid — numpad layout, bucket-mode expansion, drag drop 2026-03-03 16:04:31 -08:00
f5aca77ff6 feat(avocet): EmailCard component — subject, from/date, body preview, expand/collapse 2026-03-03 16:03:01 -08:00
02efd5fb1d feat(avocet): Vite + Vue 3 + UnoCSS + Vitest scaffold 2026-03-03 15:46:58 -08:00