kiwi/docs/getting-started/llm-setup.md
pyr0ball 01aae2eec8
Some checks failed
CI / Backend (Python) (push) Has been cancelled
CI / Frontend (Vue) (push) Has been cancelled
CI / Backend (Python) (pull_request) Has been cancelled
CI / Frontend (Vue) (pull_request) Has been cancelled
fix: recipe enrichment backfill, main_ingredient browser domain, bug batch
Recipe corpus (#108):
- Add _MAIN_INGREDIENT_SIGNALS to tag_inferrer.py (Chicken/Beef/Pork/Fish/Pasta/
  Vegetables/Eggs/Legumes/Grains/Cheese) — infers main:* tags from ingredient names
- Update browser_domains.py main_ingredient categories to use main:* tag queries
  instead of raw food terms; recipe_browser_fts now has full 3.19M row coverage
  (was ~1.2K before backfill)

Bug fixes:
- Fix community posts response shape (#96): add total/page/page_size fields
- Fix export endpoint arg types (#92)
- Fix household invite store leak (#93)
- Fix receipts endpoint issues
- Fix saved_recipes endpoint
- Add session endpoint (app/api/endpoints/session.py)

Shopping list:
- Add migration 033_shopping_list.sql
- Add shopping schemas (app/models/schemas/shopping.py)
- Add ShoppingView.vue, ShoppingItemRow.vue, shopping.ts store

Frontend:
- InventoryList, RecipesView, RecipeDetailPanel polish
- App.vue routing updates for shopping view

Docs:
- Add user-facing docs under docs/ (getting-started, user-guide, reference)
- Add screenshots
2026-04-18 15:38:56 -07:00

74 lines
2.4 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# LLM Backend Setup (Optional)
An LLM backend unlocks **receipt OCR**, **recipe suggestions (L3L4)**, and **style auto-classification**. Everything else works without one.
You can use any OpenAI-compatible inference server: Ollama, vLLM, LM Studio, a local llama.cpp server, or a commercial API.
## BYOK — Bring Your Own Key
BYOK means you provide your own LLM backend. Paid AI features are unlocked at **any tier** when a valid backend is configured. You pay for your own inference; Kiwi just uses it.
## Choosing a backend
| Backend | Best for | Notes |
|---------|----------|-------|
| **Ollama** | Local, easy setup | Recommended for getting started |
| **vLLM** | Local, high throughput | Better for faster hardware |
| **OpenAI API** | No local GPU | Requires paid API key |
| **Anthropic API** | No local GPU | Requires paid API key |
## Ollama setup (recommended)
```bash
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh
# Pull a model — llama3.1 8B works well for recipe tasks
ollama pull llama3.1
# Verify it's running
ollama list
```
In your Kiwi `.env`:
```bash
LLM_BACKEND=ollama
LLM_BASE_URL=http://host.docker.internal:11434
LLM_MODEL=llama3.1
```
!!! note "Docker networking"
Use `host.docker.internal` instead of `localhost` when Ollama is running on your host and Kiwi is in Docker.
## OpenAI-compatible API
```bash
LLM_BACKEND=openai
LLM_BASE_URL=https://api.openai.com/v1
LLM_API_KEY=sk-your-key-here
LLM_MODEL=gpt-4o-mini
```
## Verify the connection
In the Kiwi **Settings** page, the LLM status indicator shows whether the backend is reachable. A green checkmark means OCR and L3L4 recipe suggestions are active.
## What LLM is used for
| Feature | LLM required |
|---------|-------------|
| Receipt OCR (line-item extraction) | Yes |
| Recipe suggestions L1 (pantry match) | No |
| Recipe suggestions L2 (substitution) | No |
| Recipe suggestions L3 (style templates) | Yes |
| Recipe suggestions L4 (full generation) | Yes |
| Style auto-classifier | Yes |
L1 and L2 suggestions use deterministic matching — they work without any LLM configured. See [Recipe Engine](../reference/recipe-engine.md) for the full algorithm breakdown.
## Model recommendations
- **Receipt OCR**: any model with vision capability (LLaVA, GPT-4o, etc.)
- **Recipe suggestions**: 7B13B instruction-tuned models work well; larger models produce more creative L4 output
- **Style classification**: small models handle this fine (3B+)