# Ollama Setup
Hybrid vector search and RAG chat require a local Ollama instance. Running one is the BYOK (bring your own key) unlock for the Free tier; no paid subscription is required.
## Install Ollama
```bash
curl -fsSL https://ollama.ai/install.sh | sh
```
## Pull the required models
```bash
# Embedding model — converts pages into vectors
ollama pull nomic-embed-text
# Chat model — answers questions using retrieved page excerpts
ollama pull mistral:7b
```
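`nomic-embed-text` produces 768-dimensional vectors and runs comfortably on 8 GB of VRAM.
`mistral:7b` requires roughly 5 GB of VRAM. Any other chat model that Ollama can serve can be substituted: pull it with `ollama pull` and set `PAGEPIPER_CHAT_MODEL` to its name.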
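Before moving on, it can help to confirm that both pulls succeeded. The sketch below filters the output of `ollama list` for the model names used above. The function name `missing_models` is hypothetical, and the code assumes the listing has a header row followed by one model per line with the name in the first column.

```bash
# Print each required model that does not appear in `ollama list` output (read on stdin).
missing_models() {
  # Keep only the first column (model name), skipping the header row.
  mm_listing=$(awk 'NR > 1 { print $1 }')
  for mm_name in "$@"; do
    # A model pulled without a tag is listed as "<name>:latest".
    case "$mm_name" in
      *:*) ;;
      *) mm_name="$mm_name:latest" ;;
    esac
    printf '%s\n' "$mm_listing" | grep -qxF -- "$mm_name" || echo "$mm_name"
  done
}

# Usage (needs the ollama CLI on PATH):
#   ollama list | missing_models nomic-embed-text mistral:7b
```

No output means every listed model is present; any name printed still needs an `ollama pull`.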
## Configure Pagepiper
In your `.env`:
```bash
PAGEPIPER_OLLAMA_URL=http://localhost:11434
PAGEPIPER_EMBED_MODEL=nomic-embed-text
PAGEPIPER_CHAT_MODEL=mistral:7b
```
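A quick way to catch a missing or empty setting before restarting is to scan the file for the three keys above. This is a minimal sketch, not part of Pagepiper: the function name `missing_settings` is hypothetical, and it assumes plain `KEY=value` lines with no `export` prefix or leading whitespace.

```bash
# Print each expected setting that is missing or empty in the given env file.
missing_settings() {
  ms_file="$1"
  for ms_key in PAGEPIPER_OLLAMA_URL PAGEPIPER_EMBED_MODEL PAGEPIPER_CHAT_MODEL; do
    # A setting counts as present only if at least one character follows "=".
    grep -Eq "^${ms_key}=.+" "$ms_file" || echo "$ms_key"
  done
}

# Usage:
#   missing_settings .env
```

An empty result means all three settings are present.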
Restart Pagepiper:
```bash
./manage.sh restart
```
## Verify
Upload or re-index a document. The document card should show **Embedding N / M pages** during ingest. Once complete, the Chat tab becomes active.
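If the embedding progress never appears, one thing worth ruling out is whether Ollama answers at the configured URL. The sketch below probes Ollama's `/api/tags` endpoint, which lists locally pulled models. The helper name `ollama_reachable` and the five-second timeout are illustrative choices, not Pagepiper behavior.

```bash
# Return 0 if an Ollama server answers at the given base URL, non-zero otherwise.
ollama_reachable() {
  # -f makes HTTP error statuses count as failures; a trailing slash on the URL is dropped.
  curl -fsS --max-time 5 "${1%/}/api/tags" >/dev/null 2>&1
}

# Usage:
#   ollama_reachable "http://localhost:11434" && echo "Ollama is up"
```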
## Changing embedding models
If you switch `PAGEPIPER_EMBED_MODEL`, Pagepiper detects the dimension mismatch at startup, deletes the old vector database, and automatically re-embeds all indexed documents in the background. BM25 search remains available throughout.
!!! note
    Re-embedding a large library can take 30-60 minutes depending on hardware.
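To see in advance whether a replacement embedding model has a different vector size, you can request a single embedding from Ollama and count its components. This is a rough manual check, not how Pagepiper itself detects the mismatch. The helper name `embedding_dim` is hypothetical. It assumes the `/api/embeddings` response body is a single `{"embedding":[...]}` object with a non-empty array, so that every comma separates two components.

```bash
# Count the components of the "embedding" array in an /api/embeddings response (stdin).
embedding_dim() {
  # Number of components = number of commas + 1, under the assumption stated above.
  echo $(( $(tr -cd ',' | wc -c) + 1 ))
}

# Usage (needs a running Ollama with the model already pulled):
#   curl -s http://localhost:11434/api/embeddings \
#     -d '{"model": "nomic-embed-text", "prompt": "dimension probe"}' | embedding_dim
```

Running it once for the current model and once for the candidate shows whether a switch would trigger a full re-embed.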