Ollama Setup

Hybrid vector search and RAG chat require a local Ollama instance. This is the BYOK (bring your own key) unlock for the Free tier: no paid subscription is required.

Install Ollama

curl -fsSL https://ollama.ai/install.sh | sh
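
After installing, it can help to confirm that the CLI is on your PATH and that the server answers before pulling models. The following is a minimal sketch, not part of Pagepiper: `check_ollama` is a hypothetical helper name, and it assumes Ollama is listening on its default port (11434) and that `curl` is available.

```shell
# Hypothetical helper: report whether the Ollama CLI is installed and
# whether the server responds. The URL argument defaults to the standard
# local port.
check_ollama() {
  url="${1:-http://localhost:11434}"

  # command -v succeeds only if an `ollama` executable is on PATH.
  if ! command -v ollama >/dev/null 2>&1; then
    echo "ollama CLI not found on PATH"
    return 1
  fi

  # GET /api/version returns a small JSON object when the server is up.
  if ! curl -fsS --max-time 5 "$url/api/version" >/dev/null 2>&1; then
    echo "ollama server not reachable at $url"
    return 2
  fi

  echo "ollama is installed and reachable at $url"
}
```

Run `check_ollama` with no arguments for a local install, or pass a different base URL if Ollama runs on another host.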

Pull the required models

# Embedding model — converts pages into vectors
ollama pull nomic-embed-text

# Chat model — answers questions using retrieved page excerpts
ollama pull mistral:7b

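nomic-embed-text produces 768-dimensional vectors and runs comfortably on 8 GB of VRAM. mistral:7b requires roughly 5 GB of VRAM. You can substitute any other embedding or chat model that Ollama serves; set PAGEPIPER_EMBED_MODEL and PAGEPIPER_CHAT_MODEL to match.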
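
If you substitute a different embedding model, you may want to check the vector size it actually produces. The sketch below asks Ollama for one embedding and counts its elements. The helper names `count_dims` and `embed_dims` are hypothetical, and it assumes the `/api/embeddings` endpoint (the older single-prompt embedding route) is available on your Ollama version and returns compact JSON with a single `embedding` array.

```shell
# Hypothetical helper: count the numbers in the "embedding" array of an
# /api/embeddings response read from stdin.
count_dims() {
  # Strip whitespace, keep only the array contents, then emit one number
  # per line and count the lines that contain a digit.
  tr -d ' \n\t' \
    | sed -e 's/.*"embedding":\[//' -e 's/\].*//' \
    | tr ',' '\n' \
    | grep -c '[0-9]'
}

# Hypothetical helper: request one embedding from a running Ollama server
# and print its dimension. Model and URL default to the values used in
# this guide.
embed_dims() {
  model="${1:-nomic-embed-text}"
  url="${2:-http://localhost:11434}"
  curl -fsS "$url/api/embeddings" \
    -d "{\"model\": \"$model\", \"prompt\": \"dimension check\"}" \
    | count_dims
}
```

With the server running, `embed_dims nomic-embed-text` prints the dimension for that model. The comma-counting approach is deliberately simple; a JSON-aware tool would be more robust if one is installed.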

Configure Pagepiper

In your .env:

PAGEPIPER_OLLAMA_URL=http://localhost:11434
PAGEPIPER_EMBED_MODEL=nomic-embed-text
PAGEPIPER_CHAT_MODEL=mistral:7b
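
Before restarting, you can cross-check that the models named in .env have actually been pulled. This is a rough sketch under a few assumptions: the .env file uses plain KEY=value lines without quotes, `ollama list` prints a header row followed by one model per line with the name in the first column, and the helper names `env_value` and `model_listed` are hypothetical.

```shell
# Hypothetical helper: print the value of KEY from a KEY=value file.
# If the key appears more than once, the last assignment wins.
env_value() {
  # $1 = key, $2 = path to the .env file
  grep -E "^$1=" "$2" | tail -n 1 | cut -d= -f2-
}

# Hypothetical helper: succeed if MODEL appears in the NAME column of
# `ollama list` output supplied on stdin. An untagged name such as
# nomic-embed-text is treated as nomic-embed-text:latest.
model_listed() {
  case "$1" in
    *:*) want="$1" ;;
    *)   want="$1:latest" ;;
  esac
  # Skip the header row, keep the first column, look for an exact match.
  awk 'NR > 1 { print $1 }' | grep -Fxq "$want"
}
```

For example, `ollama list | model_listed "$(env_value PAGEPIPER_EMBED_MODEL .env)"` exits with status 0 when the configured embedding model is present locally; repeat with PAGEPIPER_CHAT_MODEL for the chat model.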

Restart Pagepiper:

./manage.sh restart

Verify

Upload or re-index a document. The document card should show Embedding N / M pages during ingest. Once complete, the Chat tab becomes active.

Changing embedding models

If you switch PAGEPIPER_EMBED_MODEL, Pagepiper detects the dimension mismatch at startup, deletes the old vector database, and automatically re-embeds all indexed documents in the background. BM25 search remains available throughout.

!!! note
    Re-embedding a large library can take 30-60 minutes depending on hardware.