discover_wayback.py — enumerates recipe slugs from archived menu API
(/api/v2/menus/<id>) and product API (/api/v1/products/*) plus
recipe-category HTML pages. Writes incremental JSONL manifest to
/Library/Assets/kiwi/pipeline/pc_slugs.jsonl.
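
A minimal sketch of what an incremental JSONL manifest write could look like, assuming one JSON object per line keyed by a "slug" field. The function name append_new_slugs and the "slug" / "source" field names are hypothetical and are not taken from discover_wayback.py.

```python
import json
from pathlib import Path


def append_new_slugs(manifest_path, records):
    """Append records whose 'slug' is not yet in the JSONL manifest.

    Returns the number of records actually written. The 'slug' key is an
    assumed field name, not confirmed by the real manifest schema.
    """
    path = Path(manifest_path)
    seen = set()
    if path.exists():
        # Load slugs already recorded so a re-run only adds new ones.
        with path.open(encoding="utf-8") as fh:
            for line in fh:
                line = line.strip()
                if line:
                    seen.add(json.loads(line)["slug"])

    added = 0
    # Append mode keeps earlier lines intact if the run is interrupted.
    with path.open("a", encoding="utf-8") as fh:
        for rec in records:
            if rec["slug"] in seen:
                continue
            fh.write(json.dumps(rec, ensure_ascii=False) + "\n")
            seen.add(rec["slug"])
            added += 1
    return added
```

Appending rather than rewriting means a second discovery pass can be run against the same file without losing or duplicating earlier entries.
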
scrape_recipes.py — fetches full recipe data per slug using three-tier
fallback: product API JSON (oldest captures first), HTML inline state
(__NEXT_DATA__ / __INITIAL_STATE__), and JSON-LD structured data.
Outputs recipes_purplecarrot.parquet in food.com columnar format so
build_recipe_index.py imports it unchanged. Includes SourceURL column
for recipe attribution UI (kiwi#139). Checkpoints every 50 recipes.
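
The three-tier fallback described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in scrape_recipes.py: the function names, the "title" key in the API payload, the props.pageProps.recipe path inside __NEXT_DATA__, and the output keys are all hypothetical, and the __INITIAL_STATE__ variant is left out for brevity.

```python
import json
import re

# Regex-based extraction keeps the sketch stdlib-only; a real scraper
# would more likely use an HTML parser.
NEXT_DATA_RE = re.compile(
    r'<script[^>]*\bid="__NEXT_DATA__"[^>]*>(.*?)</script>', re.S)
JSON_LD_RE = re.compile(
    r'<script[^>]*type="application/ld\+json"[^>]*>(.*?)</script>', re.S)


def from_api(api_json):
    # Tier 1: archived product API JSON (assumed to carry a "title" key).
    if isinstance(api_json, dict) and api_json.get("title"):
        return {"Name": api_json["title"], "tier": "api"}
    return None


def from_inline_state(html):
    # Tier 2: inline page state; the nested key path is an assumption.
    m = NEXT_DATA_RE.search(html or "")
    if not m:
        return None
    try:
        node = json.loads(m.group(1))
    except json.JSONDecodeError:
        return None
    for key in ("props", "pageProps", "recipe"):
        node = node.get(key) if isinstance(node, dict) else None
    if isinstance(node, dict) and node.get("title"):
        return {"Name": node["title"], "tier": "inline_state"}
    return None


def from_json_ld(html):
    # Tier 3: schema.org Recipe object in a JSON-LD script tag.
    for m in JSON_LD_RE.finditer(html or ""):
        try:
            data = json.loads(m.group(1))
        except json.JSONDecodeError:
            continue
        for item in data if isinstance(data, list) else [data]:
            if (isinstance(item, dict) and item.get("@type") == "Recipe"
                    and item.get("name")):
                return {"Name": item["name"], "tier": "json_ld"}
    return None


def extract_recipe(api_json, html):
    """Try each tier in order and return the first non-empty result."""
    tiers = (
        lambda: from_api(api_json),
        lambda: from_inline_state(html),
        lambda: from_json_ld(html),
    )
    for tier in tiers:
        result = tier()
        if result is not None:
            return result
    return None
```

Recording which tier produced each row (the "tier" key here) is one way to audit how often the later fallbacks are actually needed.
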
Initial discovery: 158 slugs from menu 1536 plus the product_api pass.
Re-run discover_wayback.py after archive.org stabilizes to pick up
older slugs from recipe-category pages.
Backlog: live Playwright scraper for post-Wayback recipes (kiwi#137).