pagepiper/docs/getting-started/ollama-setup.md

# Ollama Setup

Hybrid vector search and RAG chat are gated behind a local Ollama instance. This is the BYOK (bring your own key) unlock for the Free tier — no paid subscription required.

## Install Ollama

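The install script below targets Linux; on macOS or Windows, use the installer from https://ollama.com/download instead.

```bash
curl -fsSL https://ollama.com/install.sh | sh
```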

## Pull the required models

```bash
# Embedding model — converts pages into vectors
ollama pull nomic-embed-text

# Chat model — answers questions using retrieved page excerpts
ollama pull mistral:7b
```

`nomic-embed-text` produces 768-dimensional vectors and runs comfortably on 8 GB of VRAM.
`mistral:7b` requires roughly 5 GB of VRAM. Any other embedding or chat model served by Ollama can be substituted.
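
Before moving on, it can help to confirm that both pulls succeeded. The sketch below is one way to do that; `has_model` is a hypothetical helper (not part of Ollama or Pagepiper) that reads the table printed by `ollama list` on stdin. It assumes the model name is the first column and that a name without a tag is stored as `:latest`.

```bash
# has_model NAME: read `ollama list` output on stdin and succeed if NAME is present.
# Hypothetical helper for illustration; a bare name is treated as NAME:latest.
has_model() {
  awk -v want="$1" '
    BEGIN { if (want !~ /:/) want = want ":latest" }
    NR > 1 && $1 == want { found = 1 }   # NR > 1 skips the header row
    END { exit !found }
  '
}

# Example (requires a running Ollama install):
#   ollama list | has_model nomic-embed-text && echo "embedding model ready"
#   ollama list | has_model mistral:7b       && echo "chat model ready"
```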

## Configure Pagepiper

In your `.env`:

```bash
PAGEPIPER_OLLAMA_URL=http://localhost:11434
PAGEPIPER_EMBED_MODEL=nomic-embed-text
PAGEPIPER_CHAT_MODEL=mistral:7b
```
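
A missing or empty variable is easy to overlook, so a quick sanity check before restarting may save a round trip. The following is a minimal sketch, not a Pagepiper command: `check_env` is a hypothetical helper that only verifies each of the three keys above has a non-empty value in the given file.

```bash
# check_env [FILE]: succeed only if every Ollama-related key has a non-empty value.
# Hypothetical helper for illustration; FILE defaults to .env in the current directory.
check_env() {
  env_file="${1:-.env}"
  missing=0
  for key in PAGEPIPER_OLLAMA_URL PAGEPIPER_EMBED_MODEL PAGEPIPER_CHAT_MODEL; do
    # -s silences "file not found"; an unreadable file counts as missing keys.
    if ! grep -Eqs "^${key}=.+" "$env_file"; then
      echo "missing or empty: ${key}" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example:
#   check_env .env && echo "Ollama settings look complete"
```

To confirm that the URL itself is reachable, `curl -fsS http://localhost:11434/api/tags` should return a JSON list of the models Ollama has pulled.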

Restart Pagepiper:

```bash
./manage.sh restart
```

## Verify

Upload or re-index a document. The document card should show **Embedding N / M pages** during ingest. Once complete, the Chat tab becomes active.

## Changing embedding models

If you switch `PAGEPIPER_EMBED_MODEL`, Pagepiper detects the dimension mismatch at startup, deletes the old vector database, and automatically re-embeds all indexed documents in the background. BM25 search remains available throughout.

!!! note
    Re-embedding a large library can take 30-60 minutes depending on hardware.
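
Because a change in vector dimension is what triggers the rebuild, it can be useful to check what a candidate model produces before switching. The sketch below is an illustration only: `embedding_dim` is a hypothetical helper that counts the numbers in the first vector of a response from Ollama's `/api/embed` endpoint (or the older `/api/embeddings`), and it assumes the response arrives as single-line JSON. A JSON-aware tool would be more robust.

```bash
# embedding_dim: read an Ollama embedding response on stdin, print the vector length.
# Hypothetical helper for illustration; assumes single-line JSON containing an
# "embedding" or "embeddings" array and counts the values in the first vector.
embedding_dim() {
  sed -n 's/.*"embeddings*":[[:space:]]*\[*\([^]]*\)\].*/\1/p' \
    | tr ',' '\n' \
    | grep -c .
}

# Example (requires a running Ollama instance with the model pulled):
#   curl -s http://localhost:11434/api/embed \
#     -d '{"model": "nomic-embed-text", "input": "hello"}' | embedding_dim
```

If the number printed for the new model differs from the current one, expect the rebuild described above on the next restart.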