33a5cdec37
feat: cloud auth bypass, VRAM leasing, barcode EXIF fix, pipeline improvements
- cloud_session.py: CLOUD_AUTH_BYPASS_IPS with CIDR support; resolve the
client IP from X-Real-IP so it survives Docker bridge NAT; local-dev DB
path under CLOUD_DATA_ROOT for bypass sessions
- compose.cloud.yml: thread CLOUD_AUTH_BYPASS_IPS from shell env; document
Docker bridge CIDR requirement in .env.example
- nginx.cloud.conf + nginx.conf: client_max_body_size 20m for barcode uploads
- barcode_scanner.py: EXIF orientation correction (PIL ImageOps.exif_transpose)
before cv2 decode; rotation coverage extended to [90, 180, 270, 45, 135]
to catch sideways barcodes the 270° case was missing
- llm_recipe.py: acquire/release a CF-core VRAM lease around LLMRouter calls
- tasks/runner.py + config.py: COORDINATOR_URL + recipe_llm VRAM budget (4 GB)
- recipes.py: per-request Store creation inside asyncio.to_thread worker to
avoid SQLite check_same_thread violations
- download_datasets.py: HF_PARQUET_FILES strategy for repos without dataset
builders (lishuyang/recipepairs direct parquet download)
- derive_substitutions.py: use recipepairs_recipes.parquet for ingredient
lookup; numpy array detection; JSON category parsing
- test_build_flavorgraph_index.py: rewritten for CSV-based index format
- pyproject.toml: add Pillow>=10.0 for EXIF rotation support
2026-04-01 16:06:23 -07:00
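
A minimal sketch of the CIDR-aware bypass check described for cloud_session.py
above, using only the standard library. The function names
(parse_bypass_networks, resolve_client_ip, is_bypass_ip) and the plain-dict
header lookup are hypothetical and are not taken from the repository; only
CLOUD_AUTH_BYPASS_IPS and X-Real-IP come from the commit message.

```python
import ipaddress


def parse_bypass_networks(raw: str):
    """Parse a comma-separated CLOUD_AUTH_BYPASS_IPS-style value.

    Accepts single addresses ("127.0.0.1") and CIDR ranges ("172.17.0.0/16").
    """
    networks = []
    for entry in raw.split(","):
        entry = entry.strip()
        if not entry:
            continue
        # strict=False tolerates host bits, e.g. "172.17.0.1/16".
        networks.append(ipaddress.ip_network(entry, strict=False))
    return networks


def resolve_client_ip(headers: dict, peer_ip: str) -> str:
    """Prefer X-Real-IP (set by the reverse proxy) over the socket peer.

    Behind a Docker bridge the socket peer is usually the bridge gateway,
    so the proxy-supplied header is the only place the real client IP shows up.
    Assumption: header keys are passed in exactly as "X-Real-IP".
    """
    real_ip = headers.get("X-Real-IP", "").strip()
    return real_ip or peer_ip


def is_bypass_ip(client_ip: str, networks) -> bool:
    """Return True if client_ip falls inside any configured network."""
    try:
        addr = ipaddress.ip_address(client_ip)
    except ValueError:
        return False  # malformed header value: never bypass
    return any(addr in net for net in networks)
```

Trusting X-Real-IP is only reasonable when the proxy in front of the app
always overwrites it; otherwise a client could set the header itself.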
77627cec23
fix: data pipeline -- R-vector parser, allrecipes dataset, unique recipe index
- build_recipe_index.py: add _parse_r_vector() for food.com R format, add
_parse_allrecipes_text() for corbt/all-recipes text format; the
_row_to_fields() dispatcher handles both columnar (food.com) and
single-text (all-recipes) rows
- build_flavorgraph_index.py: switch from graph.json to nodes/edges CSVs
matching actual FlavorGraph repo structure
- download_datasets.py: switch recipe source to corbt/all-recipes (2.1M
recipes, 807 MB), replacing the near-empty AkashPS11/recipes_data_food.com
- 007_recipe_corpus.sql: add UNIQUE constraint on external_id to prevent
duplicate inserts on pipeline reruns
2026-03-31 21:36:13 -07:00
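
A rough illustration of what an R-vector parser like the one mentioned above
might look like. It assumes the food.com columns store lists as R character
vectors such as c("flour", "sugar"); that input shape, the handling of NA and
character(0), and the function body are assumptions and do not reproduce the
repository's _parse_r_vector().

```python
import re

# Matches one double-quoted R string, allowing backslash-escaped characters.
_R_STRING = re.compile(r'"((?:[^"\\]|\\.)*)"')


def parse_r_vector(text: str) -> list[str]:
    """Turn an R character vector literal into a list of Python strings."""
    text = text.strip()
    if text.startswith("c(") and text.endswith(")"):
        inner = text[2:-1]
        # Pull out each quoted element; commas inside quotes stay intact.
        return [item.replace('\\"', '"') for item in _R_STRING.findall(inner)]
    if text in ("", "NA", "character(0)"):
        return []  # treated as "no values" in this sketch
    # A bare scalar such as "salt" becomes a one-element list.
    return [text.strip('"')]
```

Extracting quoted elements with a regex, instead of splitting on commas,
keeps entries like "sugar, white" as a single ingredient.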
e44d36e32f
fix: pipeline scripts -- connection safety, remove unused recipes_path arg, fix inserted counter, pre-load profile index
2026-03-30 23:10:52 -07:00
bad6dd175c
feat: data pipeline -- recipe corpus + substitution pair derivation
2026-03-30 22:55:41 -07:00
59b6a8265f
feat: data pipeline -- FlavorGraph molecule index builder
2026-03-30 22:46:53 -07:00
97203313c1
feat: data pipeline -- USDA FDC ingredient index builder
2026-03-30 22:44:25 -07:00