Browse: meal_type categories near-empty (Lunch, Dinner, Snack, Beverage, Side Dish) #122

Closed
opened 2026-04-26 19:02:38 -07:00 by pyr0ball · 0 comments

Problem

The recipe browser meal_type domain has very uneven coverage. Breakfast has substantial results, but Lunch, Dinner, Snack, Beverage, and Side Dish return very few or no recipes.

Root cause: the keyword lists in browser_domains.py do not align with the actual category and keywords values in the food.com corpus.

Investigation needed

Run the following against the corpus DB to understand the actual distribution:

SELECT category, count(*) FROM recipes
GROUP BY category ORDER BY count(*) DESC LIMIT 100;

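Also check which keywords actually appear in the keywords JSON column. Note that grouping on the whole column counts identical keyword lists rather than individual keywords, so treat the result as a rough sample of common combinations: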

SELECT keywords, count(*) FROM recipes
WHERE keywords IS NOT NULL
GROUP BY keywords ORDER BY count(*) DESC LIMIT 50;
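To count individual keywords rather than whole keyword lists, the JSON values can be unpacked in application code. The following is a minimal sketch, assuming the corpus DB is SQLite and that keywords holds a JSON array of strings; neither is confirmed by this issue. The helper name keyword_counts and the sample rows are hypothetical and exist only to make the example runnable.

```python
import json
import sqlite3
from collections import Counter


def keyword_counts(conn: sqlite3.Connection) -> Counter:
    """Count individual keywords across all recipes (case-insensitive)."""
    counts: Counter = Counter()
    rows = conn.execute("SELECT keywords FROM recipes WHERE keywords IS NOT NULL")
    for (raw,) in rows:
        try:
            values = json.loads(raw)
        except (TypeError, ValueError):
            continue  # skip rows whose keywords value is not valid JSON
        if isinstance(values, list):
            counts.update(str(v).strip().lower() for v in values if v)
    return counts


# Tiny in-memory stand-in for the corpus DB. The rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE recipes (category TEXT, keywords TEXT)")
conn.executemany(
    "INSERT INTO recipes VALUES (?, ?)",
    [
        ("Chicken", '["Poultry", "Meat", "Easy"]'),
        ("Pork", '["Meat", "Easy"]'),
        ("Beverages", None),
    ],
)

# Top 50 individual keywords, mirroring the LIMIT 50 in the query above.
top_keywords = keyword_counts(conn).most_common(50)
```

Lowercasing before counting keeps values such as "Meat" and "meat" in one bucket, which matters if the keyword lists in browser_domains.py are compared case-insensitively.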

Fix

Update app/services/recipe/browser_domains.py meal_type keyword lists to include the corpus-specific category values that map to each meal type. For example:

  • Dinner may need to include corpus categories like "chicken dishes", "beef dishes", "pork", "seafood", "vegetables" etc.
  • Lunch may need to include "salads", "sandwiches", "soups" as corpus category values
  • Snack may need to include "appetizers", "snacks", "dips" etc.
  • Beverage and Side Dish likely need the same treatment, using whichever corpus category values the investigation turns up
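The mapping described above could look roughly like the sketch below. The name MEAL_TYPE_KEYWORDS, the function meal_types_for, and the dict-of-lists shape are all hypothetical; the real structure in browser_domains.py may differ. The keyword values are the candidates listed in this issue and still need to be checked against the corpus.

```python
from __future__ import annotations

# Hypothetical shape: meal type -> corpus category/keyword values (lowercase).
MEAL_TYPE_KEYWORDS: dict[str, list[str]] = {
    "Dinner": ["chicken dishes", "beef dishes", "pork", "seafood", "vegetables"],
    "Lunch": ["salads", "sandwiches", "soups"],
    "Snack": ["appetizers", "snacks", "dips"],
}


def meal_types_for(category: str | None, keywords: list[str]) -> list[str]:
    """Return every meal type whose keyword list matches the recipe."""
    # Normalise the recipe's category and keywords into one set of tags.
    tags = {k.strip().lower() for k in keywords}
    if category:
        tags.add(category.strip().lower())
    # A meal type matches when it shares at least one tag with the recipe.
    return [meal for meal, words in MEAL_TYPE_KEYWORDS.items() if tags & set(words)]
```

A recipe can land in more than one meal type under this scheme, for example a seafood dip would match both Dinner and Snack. Whether that is desirable is a product decision this sketch does not settle.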

Acceptance criteria

  • Each meal_type category returns at least 100 recipes from the corpus
  • Category counts are visible in the browser domain list endpoint (/recipes/browse/meal_type)
  • Verified with SELECT category, count(*) FROM recipes GROUP BY category cross-referenced against updated keyword lists
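The first criterion can be checked mechanically once per-category counts are available. This is a sketch only: under_covered is a hypothetical helper, and it takes a plain dict because the response shape of /recipes/browse/meal_type is not specified in this issue. The sample counts are invented.

```python
from __future__ import annotations

MIN_RECIPES = 100  # threshold from the acceptance criteria

# Meal types named in this issue.
MEAL_TYPES = ["Breakfast", "Lunch", "Dinner", "Snack", "Beverage", "Side Dish"]


def under_covered(counts: dict[str, int], minimum: int = MIN_RECIPES) -> list[str]:
    """Return the meal types with fewer than `minimum` recipes."""
    # A meal type missing from `counts` is treated as having zero recipes.
    return [meal for meal in MEAL_TYPES if counts.get(meal, 0) < minimum]


# Invented counts, standing in for whatever the endpoint reports.
sample_counts = {"Breakfast": 500, "Lunch": 3, "Dinner": 120}
failing = under_covered(sample_counts)
```

An empty list means every meal type meets the threshold; anything else names the categories whose keyword lists still need work.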

Related

See also: domain keyword enrichment audit ticket (sister issue).

Reference: Circuit-Forge/kiwi#122