Commit graph

4 commits

Author SHA1 Message Date
c72b4415db feat(recipe_scan): use Qwen2-VL GGUF via cf-text OpenAI-compat API
Replace two-step docuvision OCR + LLM structuring pipeline with a
single multimodal VLM call. The bartowski Qwen2-VL-7B-Instruct Q5_K_M
GGUF is served by cf-text (llama.cpp) and accepts image_url content
blocks identical to the OpenAI vision API format.

Removes docuvision dependency for recipe scanning; the addict-missing /
DeepseekVLV2Processor-missing cf-docuvision error no longer blocks scans.
Receipt OCR (kiwi.ocr task) still routes to cf-docuvision separately.
2026-05-16 18:38:21 -07:00
2df17ec719 feat(recipe-scan): add SSE streaming endpoint for cold-start progress feedback
Some checks failed
CI / Frontend (Vue) (push) Waiting to run
CI / Backend (Python) (push) Waiting to run
Mirror / mirror (push) Has been cancelled
Release / release (push) Has been cancelled
POST /recipes/scan/stream emits live status events while cf-docuvision
allocates and processes, replacing the static spinner with phase-aware labels:
  allocating -> scanning -> structuring -> done|error

Uses asyncio.Queue bridge to route progress callbacks from the sync scanner
thread to the async SSE generator. Frontend updated to consume the stream via
fetch + ReadableStream (EventSource does not support POST multipart).

Closes kiwi#136 (companion to the docuvision routing fix).
2026-05-16 16:24:32 -07:00
4ac24e7920 fix(recipe-scan): wire cf-docuvision OCR + LLMRouter for cloud recipe scanning (kiwi#136)
Some checks are pending
CI / Backend (Python) (push) Waiting to run
CI / Frontend (Vue) (push) Waiting to run
Mirror / mirror (push) Waiting to run
Two-step pipeline: task_allocate("kiwi", "recipe_scan", service_hint="cf-docuvision")
acquires a docuvision allocation, calls /extract per image to get OCR text, then
LLMRouter structures the combined OCR output into recipe JSON via the text
extraction prompt.

Also fixes DocuvisionClient bugs:
- POST field was "image" (ignored by Pydantic) — should be "image_b64"
- Response read "text" key — docuvision returns "raw_text"
- Add hint parameter (use "text" for recipe cards, dense prose)
- Configurable timeout (default 120s; docuvision lazy-loads model on first request)
2026-05-16 14:21:15 -07:00
896b4e048c feat: recipe scanner — photo to structured recipe (kiwi#9)
New feature: photograph a recipe card, cookbook page, or handwritten
note and have it extracted into a structured, editable recipe.

Backend:
- POST /recipes/scan: accept 1-4 photos, run VLM extraction, return
  structured JSON for review (not auto-saved)
- POST /recipes/scan/save: persist a reviewed/edited recipe
- GET/DELETE /recipes/user: user-created recipe CRUD
- Vision backend priority: cf-orch -> local Qwen2.5-VL -> Anthropic BYOK
- 503 with clear config hint when no vision backend available
- Multi-photo support: facing pages (ingredients/directions) sent together
- Pantry cross-reference: marks which ingredients are already on hand
- migration 041: user_recipes table (title, servings, cook_time, steps,
  ingredients JSON, source, pantry_match_pct)
- Tier gate: recipe_scan -> paid, BYOK-unlockable

Frontend:
- "Scan" button in the Recipes tab bar (camera icon)
- RecipeScanModal: upload step (drag-drop + file picker, up to 4 photos,
  live previews), processing step (spinner), review/edit step (all
  fields inline-editable before save), pantry match badge, warning banner
  for low-confidence or incomplete scans

Tests: 35 new tests (23 unit + 12 API), 404 total passing
2026-04-27 08:23:01 -07:00