_parse_json_from_text always returns a dict (never None), so the
previous `if parsed is not None` guard was always true: garbled
docuvision output returned an empty skeleton instead of falling
through to the local VLM. Replace the check with a meaningful-content
test (items or merchant present). Add two tests: one that asserts the
fallthrough behavior on an empty parse, one that confirms the fast path
is taken when parsing succeeds.
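
A minimal sketch of what such a meaningful-content test could look like. The
helper name `has_meaningful_content` and the exact key names ("items",
"merchant") are assumptions for illustration, not the code in this change:

```python
from typing import Any


def has_meaningful_content(parsed: dict[str, Any]) -> bool:
    """Return True only if the parse produced items or a merchant.

    Hypothetical helper: an empty skeleton such as
    {"items": [], "merchant": None} is still a dict, so an
    `is not None` check cannot tell it apart from a real parse.
    """
    return bool(parsed.get("items")) or bool(parsed.get("merchant"))
```

With a check of this shape, an empty skeleton evaluates to False and the
caller can fall through to the local VLM.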
Introduce a thin HTTP client for the cf-docuvision service and wire it in
as a fast path in VisionLanguageOCR.extract_receipt_data(). When CF_ORCH_URL
is set, the pipeline attempts docuvision allocation via CFOrchClient before
loading the heavy local VLM, and falls back to the local VLM if docuvision
is unavailable.
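
The control flow described above could be sketched roughly as follows. The
`fast_path` and `local_vlm` callables are hypothetical stand-ins for the
docuvision client and the local model; only the CF_ORCH_URL variable name
comes from this change:

```python
import os
from typing import Any, Callable, Optional

Receipt = dict[str, Any]


def extract_receipt_data(
    image_path: str,
    fast_path: Callable[[str, str], Optional[Receipt]],
    local_vlm: Callable[[str], Receipt],
) -> Receipt:
    """Try the remote fast path first, then fall back to the local VLM."""
    orch_url = os.environ.get("CF_ORCH_URL")
    if orch_url:
        try:
            result = fast_path(orch_url, image_path)
        except OSError:
            # Connection errors and timeouts are OSError subclasses;
            # treat them as "service unavailable" and fall through.
            result = None
        if result:
            return result
    # The heavy local model is only invoked when the fast path did not
    # produce a usable result.
    return local_vlm(image_path)
```

Passing the two backends in as callables keeps the sketch testable without
a running service; the real method presumably holds them as attributes.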
- build_recipe_index.py: add _parse_r_vector() for the food.com R format and
_parse_allrecipes_text() for the corbt/all-recipes text format; the
_row_to_fields() dispatcher handles both the columnar (food.com) and
single-text (all-recipes) formats
- build_flavorgraph_index.py: switch from graph.json to nodes/edges CSVs
matching actual FlavorGraph repo structure
- download_datasets.py: switch recipe source to corbt/all-recipes (2.1M
recipes, 807 MB), replacing the near-empty AkashPS11/recipes_data_food.com
- 007_recipe_corpus.sql: add UNIQUE constraint on external_id to prevent
duplicate inserts on pipeline reruns
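
To illustrate why the UNIQUE constraint makes reruns safe, here is a small
sketch. It assumes SQLite and a simplified, hypothetical `recipes` table; the
real schema in 007_recipe_corpus.sql and the loader's actual insert statement
may differ:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE recipes ("
    " id INTEGER PRIMARY KEY,"
    " external_id TEXT NOT NULL UNIQUE,"  # the constraint under discussion
    " title TEXT NOT NULL)"
)


def load(rows: list[tuple[str, str]]) -> None:
    """Insert rows, skipping any whose external_id is already present."""
    # INSERT OR IGNORE is SQLite syntax; it relies on the UNIQUE
    # constraint to detect the conflict.
    conn.executemany(
        "INSERT OR IGNORE INTO recipes (external_id, title) VALUES (?, ?)",
        rows,
    )
    conn.commit()


def row_count() -> int:
    return conn.execute("SELECT COUNT(*) FROM recipes").fetchone()[0]


rows = [("ar-1", "Pancakes"), ("ar-2", "Chili")]
load(rows)
load(rows)  # simulated pipeline rerun: no new rows are added
```

Without the constraint, the second `load` call would double the row count.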
Add GroceryLink schema model and grocery_links field to RecipeResult.
Introduce GroceryLinkBuilder service (Amazon Fresh, Walmart, Instacart)
using env-var affiliate tags; no links emitted when tags are absent.
Wire link builder into RecipeEngine.suggest() for levels 1-2.
Add test_grocery_links_free_tier to verify structure contract.
35 tests passing.
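
A rough sketch of the "no tag, no link" behavior. The environment variable
names, the `build_links` function, and the example.com URL templates are all
placeholders chosen for illustration; the actual GroceryLinkBuilder and
retailer URL formats are not shown in this message:

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional
from urllib.parse import quote_plus


@dataclass(frozen=True)
class GroceryLink:
    retailer: str
    ingredient: str
    url: str


# Hypothetical env var names and placeholder URL templates.
_RETAILERS = {
    "amazon_fresh": ("AMAZON_AFFILIATE_TAG",
                     "https://example.com/amazon-fresh/search?q={q}&tag={tag}"),
    "walmart": ("WALMART_AFFILIATE_TAG",
                "https://example.com/walmart/search?q={q}&tag={tag}"),
    "instacart": ("INSTACART_AFFILIATE_TAG",
                  "https://example.com/instacart/search?q={q}&tag={tag}"),
}


def build_links(
    ingredient: str, env: Optional[Mapping[str, str]] = None
) -> list[GroceryLink]:
    """Build one link per retailer that has an affiliate tag configured."""
    env = os.environ if env is None else env
    links = []
    for retailer, (var, template) in _RETAILERS.items():
        tag = env.get(var, "").strip()
        if not tag:
            continue  # tag absent: emit nothing for this retailer
        url = template.format(q=quote_plus(ingredient), tag=quote_plus(tag))
        links.append(GroceryLink(retailer, ingredient, url))
    return links
```

Accepting an optional `env` mapping is a testing convenience in this sketch,
so the structure contract can be checked without touching the process
environment.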
Uses circuitforge_core.tasks.scheduler. VRAM detection via cf-orch when
available, falling back to unlimited. Adds expiry_llm_fallback task type
to background-predict expiry dates for items the LUT doesn't cover.
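
The two behaviors above might be sketched like this. Everything except the
`expiry_llm_fallback` task type name is hypothetical: the LUT contents, the
function names, and the reading of "unlimited" as an infinite VRAM budget
are assumptions, and the circuitforge_core scheduler API is not shown here:

```python
import math
from typing import Callable, Iterable, Optional

# Hypothetical lookup table: item name -> shelf life in days.
EXPIRY_LUT = {"milk": 7, "eggs": 21}


def detect_vram_budget_mb(query_orch: Optional[Callable[[], float]]) -> float:
    """Ask the orchestrator for a VRAM budget; treat failure as unlimited."""
    if query_orch is None:
        return math.inf  # no orchestrator configured
    try:
        return float(query_orch())
    except OSError:
        return math.inf  # orchestrator unreachable


def plan_expiry_tasks(item_names: Iterable[str]) -> list[dict[str, str]]:
    """Queue an LLM fallback task for each item the LUT does not cover."""
    tasks = []
    for name in item_names:
        if name.lower() not in EXPIRY_LUT:
            tasks.append({"task_type": "expiry_llm_fallback", "item": name})
    return tasks
```

Items found in the LUT need no background work, so only the uncovered ones
produce tasks.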