feat: Imitate tab — pull CF product samples, compare LLM responses #23

Merged

pyr0ball merged 2 commits from feat/imitate into main

2026-04-09 20:13:21 -07:00

pyr0ball commented

2026-04-09 20:05:02 -07:00

Owner

## Summary

- **Backend** (`app/imitate.py`): 4 endpoints (product list with online check, real sample fetch from CF product APIs, SSE-streamed ollama run, push-to-corrections)
- **Frontend** (`ImitateView.vue`): 4-step wizard (product picker → sample/prompt editor → model multi-select + SSE run log → side-by-side result cards with corrections push)
- **Wiring**: `app/api.py`, router, sidebar nav, `label_tool.yaml.example` imitate section
- **BenchmarkView**: Compare panel for side-by-side bench run diffs
- **Tests**: 16 unit tests (100% passing); also fixes pre-existing `test_tasks_parses_yaml` schema mismatch

## Test plan

- [ ] `conda run -n cf python -m pytest tests/test_imitate.py -v` (expect 16 passed)
- [ ] Start avocet, navigate to /imitate, confirm product grid loads from config
- [ ] Select a product (requires `imitate:` block in `config/label_tool.yaml`)
- [ ] Fetch sample, edit prompt, run against one ollama model
- [ ] Push results to Corrections, verify JSONL appended
- [ ] Offline product shows greyed-out card
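
For the "verify JSONL appended" step, a minimal sketch of what such a check could look like. The file name, record fields, and both helper functions are hypothetical and are not taken from `app/imitate.py`; the only assumption carried over from the PR is that corrections are stored one JSON object per line.

```python
import json
import tempfile
from pathlib import Path


def append_correction(path: Path, record: dict) -> None:
    """Append one record as a single JSON line (hypothetical helper)."""
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_jsonl(path: Path) -> list:
    """Parse every non-empty line of a JSONL file into a dict."""
    with path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file name; the real corrections path is not given in the PR.
    corrections = Path(tmp) / "corrections.jsonl"
    append_correction(corrections, {"model": "model-a", "response": "first"})
    before = len(read_jsonl(corrections))

    # Simulate one push from the Imitate tab, then re-read the file.
    append_correction(corrections, {"model": "model-b", "response": "second"})
    rows = read_jsonl(corrections)

    grew_by_one = len(rows) == before + 1
    last_model = rows[-1]["model"]
```

Counting lines before and after the push, rather than only checking that the file exists, also catches a push that overwrites the file instead of appending to it.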


pyr0ball added 2 commits 2026-04-09 20:05:03 -07:00

test: fix test_tasks_parses_yaml for TaskEntry schema dc2dc70ef9

TaskEntry now includes prompt/system fields (default ""). Switch from
exact dict comparison to field-by-field assertions so the test is
forward-compatible with optional schema additions.
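
A rough illustration of the difference between the two assertion styles. The `TaskEntry` dataclass and `parse_task` helper below are hypothetical stand-ins; only the `prompt`/`system` fields with a default of `""` come from the commit message.

```python
from dataclasses import asdict, dataclass


@dataclass
class TaskEntry:
    # Hypothetical stand-in for the real schema; `id` and `name` are
    # illustrative, `prompt`/`system` mirror the commit message.
    id: str
    name: str
    prompt: str = ""
    system: str = ""


def parse_task(raw: dict) -> TaskEntry:
    """Build a TaskEntry from a parsed YAML mapping (hypothetical parser)."""
    return TaskEntry(**raw)


entry = parse_task({"id": "summarize", "name": "Summarize"})

# Brittle: an exact dict comparison breaks as soon as the schema gains
# optional fields, because asdict() now also returns prompt and system.
exact_match = asdict(entry) == {"id": "summarize", "name": "Summarize"}

# Forward-compatible: assert only the fields this test cares about.
assert entry.id == "summarize"
assert entry.name == "Summarize"
assert entry.prompt == ""
```

Here `exact_match` evaluates to `False` even though parsing is correct, which is the kind of failure the field-by-field form avoids.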

feat: Imitate tab — pull CF product samples, compare LLM responses 118ae2660a

Backend (app/imitate.py):
- GET /api/imitate/products — reads imitate: config, checks online status
- GET /api/imitate/products/{id}/sample — fetches real item from product API
- GET /api/imitate/run (SSE) — streams ollama responses for selected models
- POST /api/imitate/push-corrections — queues results in SFT corrections JSONL
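
To make the SSE run endpoint more concrete, here is a minimal sketch of how per-model results could be framed as Server-Sent Events. The event shapes (`start`, `result`, `error`, `done`), the `stream_run` function, and the `generate` callable are assumptions for illustration; the actual wire format of `/api/imitate/run` is not described in this PR.

```python
import json
from typing import Callable, Iterable, Iterator


def sse_frame(payload: dict) -> str:
    """Serialize one payload as a Server-Sent Events message."""
    # An SSE message is one or more "data:" lines terminated by a blank line.
    return f"data: {json.dumps(payload)}\n\n"


def stream_run(models: Iterable[str], generate: Callable[[str], str]) -> Iterator[str]:
    """Yield SSE frames for each selected model in turn.

    `generate` is a hypothetical callable (model name -> response text)
    standing in for the real ollama call.
    """
    for model in models:
        yield sse_frame({"type": "start", "model": model})
        try:
            text = generate(model)
            yield sse_frame({"type": "result", "model": model, "response": text})
        except Exception as exc:
            # Report the failure and keep streaming the remaining models.
            yield sse_frame({"type": "error", "model": model, "error": str(exc)})
    yield sse_frame({"type": "done"})


frames = list(stream_run(["model-a", "model-b"], lambda m: f"echo from {m}"))
```

If the backend is FastAPI (an assumption here), a generator like this would typically be returned through `StreamingResponse` with `media_type="text/event-stream"`. Emitting an explicit final `done` frame lets the frontend close its `EventSource` instead of treating the end of the stream as a dropped connection.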

Frontend (ImitateView.vue):
- Step 1: product picker grid (online/offline status, icon from config)
- Step 2: raw sample preview + editable prompt textarea
- Step 3: ollama model multi-select, temperature slider, SSE run with live log
- Step 4: response cards side by side, push to Corrections button

Wiring:
- app/api.py: include imitate_router at /api/imitate
- web/src/router: /imitate route + lazy import
- AppSidebar: Imitate nav entry (mirror icon)
- config/label_tool.yaml.example: imitate: section with peregrine example

Tests:
- 16 unit tests in tests/test_imitate.py (100% passing)

Also: BenchmarkView.vue Compare panel — side-by-side run diff for bench results

pyr0ball force-pushed feat/imitate from 118ae2660a to 3299c0e23a

2026-04-09 20:13:10 -07:00
