Recipe browser: subcategory coverage sparse — category-level fully populated #108
Reference: Circuit-Forge/kiwi#108
Current State (updated 2026-04-21)
What is now working
recipe_browser_fts is fully populated: 3,195,798 entries covering the entire corpus. The backfill resolved the original data gap.
Category-level browse results are healthy.
The browse API matches recipe titles and ingredient names against keyword lists in browser_domains.py, with no dependency on recipes.category column data (still only 1,228 rows populated, but irrelevant for browse).
Remaining gap: subcategory counts all 0
Drill-down subcategories (Sicilian, Neapolitan, Tuscan, etc.) return 0 results. The subcategory keywords are specific dish names and regional terms that rarely appear in recipe titles in the food.com corpus. The FTS match works — the terms just aren't in the data.
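To make the gap concrete, here is a minimal sketch of keyword matching over titles and ingredient names. It is not the code in browser_domains.py: the keyword lists, the sample recipes, and the helper names (matches, count_matches) are all hypothetical, and plain regex matching stands in for the FTS query. It only illustrates how category keywords can hit while subcategory keywords return nothing.

```python
import re

# Hypothetical keyword lists; the real ones live in browser_domains.py.
CATEGORY_KEYWORDS = {"Italian": ["italian", "pasta", "risotto"]}
SUBCATEGORY_KEYWORDS = {
    "Sicilian": ["sicilian", "caponata", "arancini"],
    "Neapolitan": ["neapolitan", "margherita"],
    "Tuscan": ["tuscan", "ribollita"],
}

# Hypothetical sample rows shaped like generic corpus recipes.
RECIPES = [
    {"title": "Easy Italian Meatballs", "ingredients": ["ground beef", "parmesan"]},
    {"title": "Weeknight Pasta Bake", "ingredients": ["penne", "mozzarella"]},
    {"title": "Chicken Risotto", "ingredients": ["arborio rice", "chicken"]},
]


def matches(recipe, keywords):
    """Return True if any keyword appears as a whole word in title or ingredients."""
    text = " ".join([recipe["title"], *recipe["ingredients"]]).lower()
    return any(re.search(rf"\b{re.escape(k)}\b", text) for k in keywords)


def count_matches(recipes, keywords):
    """Count the recipes matched by at least one keyword."""
    return sum(1 for r in recipes if matches(r, keywords))


category_counts = {
    name: count_matches(RECIPES, kws) for name, kws in CATEGORY_KEYWORDS.items()
}
subcategory_counts = {
    name: count_matches(RECIPES, kws) for name, kws in SUBCATEGORY_KEYWORDS.items()
}

print(category_counts)     # {'Italian': 3}
print(subcategory_counts)  # {'Sicilian': 0, 'Neapolitan': 0, 'Tuscan': 0}
```

The matching logic is identical for both levels; only the vocabulary differs, which is why the subcategory counts stay at 0 even though the index itself is complete.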
Root cause: The food.com corpus tags recipes at a high level ("Italian") but not regionally. Subcategory classification requires either:
- running infer_recipe_tags.py against the full corpus to derive regional subcategory membership from ingredient + title signals
- recipe_count > 0 in the UI
New categories: pending cloud restart
browser_domains.py was updated 2026-04-21 to add new categories. These will not appear in the browse UI until kiwi-cloud-api-1 is restarted with the updated code.
Next steps
- docker restart kiwi-cloud-api-1 to expose new categories
- Run scripts/pipeline/infer_recipe_tags.py against full 3.2M corpus to populate subcategory coverage
Title changed from "recipe_browser_fts: only 1.2K of 3.2M corpus recipes have category/keywords — browser returns sparse results" to "Recipe browser: subcategory coverage sparse — category-level fully populated"
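For discussion, a rough sketch of what deriving regional subcategory membership from ingredient + title signals might look like, plus a recipe_count > 0 check before display. This does not describe what scripts/pipeline/infer_recipe_tags.py actually does: the signal table, the min_hits threshold, and the function names are hypothetical, and reading the recipe_count > 0 option as "hide empty subcategories" is an assumption.

```python
# Hypothetical signal table: terms loosely associated with each region.
# Illustrative only; not taken from infer_recipe_tags.py.
REGIONAL_SIGNALS = {
    "Sicilian": {"caponata", "pistachio", "swordfish", "marsala"},
    "Neapolitan": {"san marzano", "buffalo mozzarella"},
    "Tuscan": {"cannellini", "lacinato kale", "pecorino toscano"},
}


def infer_subcategories(title, ingredients, min_hits=2):
    """Tag a recipe with each region whose signals appear at least min_hits times."""
    text = " ".join([title, *ingredients]).lower()
    tags = []
    for region, signals in REGIONAL_SIGNALS.items():
        hits = sum(1 for signal in signals if signal in text)
        if hits >= min_hits:
            tags.append(region)
    return tags


def visible_subcategories(counts):
    """Keep only subcategories with recipe_count > 0 (assumed UI behavior)."""
    return [name for name, recipe_count in counts.items() if recipe_count > 0]


# Hypothetical sample recipes as (title, ingredients) pairs.
recipes = [
    ("Grilled Swordfish", ["swordfish steaks", "pistachio", "lemon"]),
    ("White Bean Soup", ["cannellini beans", "lacinato kale", "garlic"]),
    ("Spaghetti", ["tomatoes", "garlic"]),
]

# Start every subcategory at 0 so empty ones are still present in the counts.
counts = {region: 0 for region in REGIONAL_SIGNALS}
for title, ingredients in recipes:
    for region in infer_subcategories(title, ingredients):
        counts[region] += 1

print(counts)                         # {'Sicilian': 1, 'Neapolitan': 0, 'Tuscan': 1}
print(visible_subcategories(counts))  # ['Sicilian', 'Tuscan']
```

Requiring two or more signals per region is one way to limit false positives from a single shared ingredient; the right threshold would need checking against the real corpus.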