app/cloud_session.py:
- Thin wrapper around cf_core.cloud_session.CloudSessionFactory
- BYOK detection reads ~/.config/circuitforge/llm.yaml (same path as other products)
- get_session: FastAPI dependency, returns CloudUser (user_id, tier, has_byok)
- require_tier: dependency factory for tier-gated routes
app/imitate.py:
- _run_cftext gains user_id: str | None param; non-None values included in
the cf-orch ServiceAllocateRequest so premium users get their custom models
- run_imitate injects session via Depends(_get_imitate_session); extracts user_id,
filters out local/anon sessions (they get the shared catalog), passes real
cloud user_id to the ThreadPoolExecutor fanout
- _get_imitate_session wraps get_session with a try/except so imitate keeps
working in envs where cloud_session deps aren't installed
Backend:
- Run all cf-text model allocations concurrently via ThreadPoolExecutor + as_completed
- Announce model_start events upfront so the UI can show loading states immediately
- Replace timer-based startup polling with coordinator state signals: waits for
state=="running" (success) or state=="stopped" (fail-fast) on the matching
node/gpu instance; falls back to health poll after 6 consecutive probe misses
- Add /api/cforch/catalog endpoint: fetches live cf-text model list from cf-orch,
filtering out proxy entries (ollama://, vllm://, http://) so only loadable models
are returned
Frontend (ImitateView.vue):
- Show per-model loading spinners as results arrive via SSE stream
- Display cold-start badge when coordinator signals the model was freshly loaded
Backend (app/imitate.py):
- GET /api/imitate/products — reads imitate: config, checks online status
- GET /api/imitate/products/{id}/sample — fetches real item from product API
- GET /api/imitate/run (SSE) — streams ollama responses for selected models
- POST /api/imitate/push-corrections — queues results in SFT corrections JSONL
Frontend (ImitateView.vue):
- Step 1: product picker grid (online/offline status, icon from config)
- Step 2: raw sample preview + editable prompt textarea
- Step 3: ollama model multi-select, temperature slider, SSE run with live log
- Step 4: response cards side by side, push to Corrections button
Wiring:
- app/api.py: include imitate_router at /api/imitate
- web/src/router: /imitate route + lazy import
- AppSidebar: Imitate nav entry (mirror icon)
- config/label_tool.yaml.example: imitate: section with peregrine example
- 16 unit tests (100% passing)
Also: BenchmarkView.vue Compare panel — side-by-side run diff for bench results