kiwi/app/services/label_capture.py
pyr0ball 17e62c451f feat: visual label capture for unenriched barcodes (kiwi#79)
When a barcode scan finds no product in FDC/OFF, paid-tier users now see a
"Capture label" offer instead of a dead-end "add manually" prompt.

Backend:
- Migration 036: captured_products local cache table (keyed by barcode,
  UPSERT on conflict so re-capture refreshes rather than errors)
- store.get_captured_product / save_captured_product (with JSON decode for
  ingredient_names and allergens)
- app/services/label_capture.py: wraps cf-core VisionRouter (caption API);
  graceful fallback to zero-confidence mock when stub/error; JSON fence
  stripping; confidence clamped to [0,1]; KIWI_LABEL_CAPTURE_MOCK=1 for tests
- New schemas: LabelCaptureResponse, LabelConfirmRequest, LabelConfirmResponse
- POST /inventory/scan/label-capture — image to extraction (paid+ gate, 403)
- POST /inventory/scan/label-confirm — save confirmed product + optional
  inventory add
- Both scan endpoints now: check captured_products cache before FDC/OFF;
  set needs_visual_capture=True for gap products on paid tier; BarcodeScanResult
  gains needs_visual_capture field
- visual_label_capture feature gate added to tiers.py (paid)

Tests: 42 new tests (service, store/migration, API endpoints) — 367 total passing

Frontend:
- InventoryList.vue: capturePhase state machine (offer => uploading => reviewing)
- Offer card appears after scan gap (calm UX: no urgency, Discard always visible)
- Review form: pre-populated from extraction; amber label highlights for
  unread fields (confidence < 0.7); comma-separated ingredients/allergens
- api.ts: LabelCaptureResult + LabelConfirmRequest types; captureLabelPhoto()
  and confirmLabelCapture() API methods
2026-04-24 17:57:25 -07:00

140 lines
4.7 KiB
Python
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

"""Visual label capture service for unenriched products (kiwi#79).
Wraps the cf-core VisionRouter to extract structured nutrition data from a
photographed nutrition facts panel. When the VisionRouter is not yet wired
(NotImplementedError) the service falls back to a mock extraction so the
barcode scan flow can be exercised end-to-end in development.
JSON contract returned by the vision model (and mock):
{
"product_name": str | null,
"brand": str | null,
"serving_size_g": number | null,
"calories": number | null,
"fat_g": number | null,
"saturated_fat_g": number | null,
"carbs_g": number | null,
"sugar_g": number | null,
"fiber_g": number | null,
"protein_g": number | null,
"sodium_mg": number | null,
"ingredient_names": [str],
"allergens": [str],
"confidence": number (0.01.0)
}
"""
from __future__ import annotations
import json
import logging
import os
from typing import Any
log = logging.getLogger(__name__)
# Confidence below this threshold surfaces amber highlights in the UI.
REVIEW_THRESHOLD = 0.7
_MOCK_EXTRACTION: dict[str, Any] = {
"product_name": "Unknown Product",
"brand": None,
"serving_size_g": None,
"calories": None,
"fat_g": None,
"saturated_fat_g": None,
"carbs_g": None,
"sugar_g": None,
"fiber_g": None,
"protein_g": None,
"sodium_mg": None,
"ingredient_names": [],
"allergens": [],
"confidence": 0.0,
}
_EXTRACTION_PROMPT = """You are reading a nutrition facts label photograph.
Extract the following fields as a JSON object with no extra text:
{
"product_name": <product name or null>,
"brand": <brand name or null>,
"serving_size_g": <serving size in grams as a number or null>,
"calories": <calories per serving as a number or null>,
"fat_g": <total fat grams or null>,
"saturated_fat_g": <saturated fat grams or null>,
"carbs_g": <total carbohydrates grams or null>,
"sugar_g": <sugars grams or null>,
"fiber_g": <dietary fiber grams or null>,
"protein_g": <protein grams or null>,
"sodium_mg": <sodium milligrams or null>,
"ingredient_names": [list of individual ingredients as strings],
"allergens": [list of allergens explicitly stated on label],
"confidence": <your confidence this extraction is correct, 0.0 to 1.0>
}
Use null for any field you cannot read clearly. Do not guess values.
Respond with JSON only."""
def extract_label(image_bytes: bytes) -> dict[str, Any]:
"""Run vision model extraction on raw label image bytes.
Returns a dict matching the nutrition JSON contract above.
Falls back to a zero-confidence mock if the VisionRouter is not yet
implemented (stub) or if the model returns unparseable output.
"""
# Allow unit tests to bypass the vision model entirely.
if os.environ.get("KIWI_LABEL_CAPTURE_MOCK") == "1":
log.debug("label_capture: mock mode active")
return dict(_MOCK_EXTRACTION)
try:
from circuitforge_core.vision import caption as vision_caption
result = vision_caption(image_bytes, prompt=_EXTRACTION_PROMPT)
raw = result.caption or ""
return _parse_extraction(raw)
except Exception as exc:
log.warning("label_capture: extraction failed (%s) — returning mock extraction", exc)
return dict(_MOCK_EXTRACTION)
def _parse_extraction(raw: str) -> dict[str, Any]:
"""Parse the JSON string returned by the vision model.
Strips markdown code fences if present. Validates required shape.
Returns the mock on any parse error.
"""
text = raw.strip()
if text.startswith("```"):
# Strip ```json ... ``` fences
lines = text.splitlines()
text = "\n".join(lines[1:-1] if lines[-1].strip() == "```" else lines[1:])
try:
data = json.loads(text)
except json.JSONDecodeError as exc:
log.warning("label_capture: could not parse vision response: %s", exc)
return dict(_MOCK_EXTRACTION)
if not isinstance(data, dict):
log.warning("label_capture: vision response is not a dict")
return dict(_MOCK_EXTRACTION)
# Normalise list fields — model may return None instead of []
for list_key in ("ingredient_names", "allergens"):
if not isinstance(data.get(list_key), list):
data[list_key] = []
# Clamp confidence to [0, 1]
confidence = data.get("confidence")
if not isinstance(confidence, (int, float)):
confidence = 0.0
data["confidence"] = max(0.0, min(1.0, float(confidence)))
return data
def needs_review(extraction: dict[str, Any]) -> bool:
"""Return True when the extraction confidence is below REVIEW_THRESHOLD."""
return float(extraction.get("confidence", 0.0)) < REVIEW_THRESHOLD