Switches recipe generation service type from 'cf-text' to 'vllm' so the
coordinator can route to quantized small models (Qwen2.5-3B, Phi-4-mini)
rather than the full text backend. Passes CF_APP_NAME for per-product
VRAM/request analytics in the coordinator dashboard.
- llm_recipe.py: _SERVICE_TYPE = 'vllm'; _MODEL_CANDIDATES list; passes
model_candidates and pipeline= to CFOrchClient.allocate()
- compose.cloud.yml: CF_APP_NAME=kiwi env var for coordinator attribution
Users often have ingredients they want to avoid today (out of stock, not feeling it)
that aren't true allergies. The new 'Not today' filter lets them exclude specific
ingredients per session without permanently modifying their allergy list.
- recipe.py schema: exclude_ingredients field (list[str], default [])
- recipe_engine.py: drops corpus results that contain any ingredient in exclude_set
- llm_recipe.py: injects exclusions into both prompt templates so LLM-generated
recipes respect the constraint at generation time
- RecipesView.vue: tag-chip UI with Enter/comma input, removes on × click
- stores/recipes.ts: excludeIngredients reactive list (not persisted to localStorage)
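The corpus-side filter described above could look roughly like the sketch below. The Recipe dataclass and the filter_excluded helper are hypothetical stand-ins, not the actual recipe_engine.py code, and the case-insensitive exact-name matching is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Recipe:
    # Hypothetical stand-in for a corpus result; the real model is not shown here.
    title: str
    ingredients: list[str] = field(default_factory=list)


def filter_excluded(recipes: list[Recipe], exclude_ingredients: list[str]) -> list[Recipe]:
    """Return only the recipes that use none of the excluded ingredients."""
    # Normalise once so matching ignores case and surrounding whitespace (assumption).
    exclude_set = {name.strip().lower() for name in exclude_ingredients if name.strip()}
    if not exclude_set:
        # Default [] means no session exclusions: keep everything.
        return list(recipes)
    return [
        recipe
        for recipe in recipes
        if not any(ing.strip().lower() in exclude_set for ing in recipe.ingredients)
    ]
```

Because the exclusion list is passed per request and defaults to empty, the stored allergy list is never touched by this path.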
When the coordinator returns 429 (all nodes at max_concurrent limit), the previous
code fell back to LLMRouter, which is also overloaded at high concurrency. As a
result, the request hung for ~60s before nginx returned a 504.
Now: detect 429/max_concurrent in the RuntimeError message and return "" immediately
so the caller gets an empty RecipeResult (graceful degradation) rather than a timeout.
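A minimal sketch of the short-circuit, assuming the coordinator error surfaces as a RuntimeError whose message contains "429" or "max_concurrent". The names is_overloaded, generate_text, and the two callables are hypothetical and only illustrate the control flow.

```python
from typing import Callable


def is_overloaded(exc: RuntimeError) -> bool:
    """True when the error message indicates the coordinator is at capacity."""
    message = str(exc).lower()
    return "429" in message or "max_concurrent" in message


def generate_text(
    orch_generate: Callable[[], str],
    router_fallback: Callable[[], str],
) -> str:
    """Try the coordinator first; skip the fallback when everything is saturated."""
    try:
        return orch_generate()
    except RuntimeError as exc:
        if is_overloaded(exc):
            # Falling back would only queue behind the same overload,
            # so return an empty string and let the caller degrade gracefully.
            return ""
        # Any other failure still goes through the fallback path.
        return router_fallback()
```

Matching on the message text is brittle if the coordinator changes its wording; a dedicated exception type would be sturdier if the client ever exposes one.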
Aligns llm_recipe.py with the pattern already used by the meal plan
service. cf-text routes through a lighter GGUF/llama.cpp path and
shares VRAM budget with other products via cf-orch, rather than
requiring a dedicated vLLM process. Also drops model_candidates
(not applicable to cf-text allocation).
Closes #70
- Settings: add unit_system key (metric | imperial, default metric)
- Recipe LLM prompts: inject unit instruction into L3 and L4 prompts
so generated recipes use the user's preferred units throughout
- Frontend: new utils/units.ts converter (mirrors Python units.py)
- Inventory list: display quantities converted to preferred units
- Settings view: metric/imperial toggle with save button
- Settings store: load/save unit_system alongside cooking_equipment
Closes #81
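For illustration, the unit handling might be sketched as below. The conversion table, to_preferred, and unit_instruction are hypothetical names and do not reflect the actual units.py or prompt wording; only the unit_system values (metric, imperial, default metric) come from the change itself.

```python
# Hypothetical metric -> imperial table: source unit -> (target unit, multiplier).
_TO_IMPERIAL = {
    "g": ("oz", 1 / 28.349523125),
    "kg": ("lb", 1 / 0.45359237),
    "ml": ("fl oz", 1 / 29.5735295625),  # US fluid ounce
}


def to_preferred(quantity: float, unit: str, unit_system: str = "metric") -> tuple[float, str]:
    """Convert a stored metric quantity into the user's preferred unit system."""
    if unit_system == "imperial" and unit in _TO_IMPERIAL:
        target_unit, factor = _TO_IMPERIAL[unit]
        return quantity * factor, target_unit
    # Metric preference, or a unit with no known mapping: leave unchanged.
    return quantity, unit


def unit_instruction(unit_system: str) -> str:
    """Build the sentence injected into the recipe prompts (wording is assumed)."""
    if unit_system == "imperial":
        return "Use imperial units (oz, lb, cups, °F) for all quantities."
    return "Use metric units (g, kg, ml, °C) for all quantities."
```

Keeping the table in one place makes it easier for the frontend converter to mirror the same factors.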
Introduces a thin HTTP client for the cf-docuvision service and wires it
as a fast path in VisionLanguageOCR.extract_receipt_data(). When CF_ORCH_URL
is set, the pipeline attempts docuvision allocation via CFOrchClient before
loading the heavy local VLM; falls back gracefully if unavailable.
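The fast-path-then-fallback flow could be sketched as follows. This is not the actual VisionLanguageOCR method: the function signature and the docuvision_extract and local_vlm_extract callables are hypothetical, and only the CF_ORCH_URL check is taken from the description above.

```python
import os
from typing import Any, Callable, Mapping, Optional


def extract_receipt_data(
    image_path: str,
    docuvision_extract: Callable[[str], Optional[dict[str, Any]]],
    local_vlm_extract: Callable[[str], dict[str, Any]],
    env: Optional[Mapping[str, str]] = None,
) -> dict[str, Any]:
    """Try the remote docuvision service first, then the local VLM."""
    env = os.environ if env is None else env
    if env.get("CF_ORCH_URL"):
        try:
            result = docuvision_extract(image_path)
            if result is not None:
                return result
        except Exception:
            # Allocation or HTTP failure: fall through to the local model.
            pass
    # The heavy local VLM is only reached when the fast path is
    # disabled, unavailable, or returned nothing.
    return local_vlm_extract(image_path)
```

Passing the two extractors as callables keeps the sketch testable without a running coordinator.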