Library
The library is the home screen. It shows all indexed documents and lets you add new ones.
Adding documents
Upload: click Upload PDF / EPUB and select a file. Files up to 200 MB are accepted. The document is saved to the uploads/ folder of the data directory (data/uploads/ by default) and queued for indexing immediately.
Scan: set PAGEPIPER_WATCH_DIR in your .env to the directory you want scanned, then click Scan for PDFs. Any PDF or EPUB in that directory that is not already in the library is queued. Re-scanning is safe; already-indexed documents are skipped.
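The scan behaviour described above can be sketched roughly as follows. This is only an illustration, not Pagepiper's actual code: the function name `find_new_documents`, the `known_paths` argument, the recursive walk, and matching documents by resolved file path are all assumptions.

```python
from pathlib import Path

# File types the scan picks up, per the description above.
SUPPORTED_SUFFIXES = {".pdf", ".epub"}


def find_new_documents(watch_dir, known_paths):
    """Return supported files under watch_dir that are not in the library yet.

    known_paths is a hypothetical collection of paths already indexed.
    """
    root = Path(watch_dir)
    known = {Path(p).resolve() for p in known_paths}
    new_files = []
    for path in sorted(root.rglob("*")):  # assumption: subfolders are included
        if not path.is_file():
            continue
        if path.suffix.lower() not in SUPPORTED_SUFFIXES:
            continue
        if path.resolve() in known:
            continue  # already in the library, so a re-scan skips it
        new_files.append(path)
    return new_files
```

Because already-known files are filtered out before anything is queued, calling the function repeatedly on the same directory yields nothing new, which is what makes re-scanning safe.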
Document states
| Badge | Meaning |
|---|---|
| PROCESSING | Text extraction or embedding in progress |
| READY | Fully indexed and searchable |
| ERROR | Indexing failed — see the error message on the card |
Ingestion progress
While a document is processing, its card shows a live progress bar:
- Animated sliding bar while text is being extracted (before page count is known)
- "Embedding N / M pages (X%)" once vectors are being written
The card refreshes automatically, and the library reloads once indexing completes.
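As a rough illustration of the two progress modes, the label could be derived like this. The function name `progress_label` and the choice to round the percentage down are assumptions, not taken from Pagepiper's source.

```python
def progress_label(pages_done, pages_total):
    """Return the embedding label, or None while the page count is unknown."""
    if pages_total is None or pages_total <= 0:
        # Text extraction is still running: show the animated bar instead.
        return None
    done = max(0, min(pages_done, pages_total))
    # Assumption: floor the percentage so 100% only appears when all pages are done.
    percent = 100 * done // pages_total
    return f"Embedding {done} / {pages_total} pages ({percent}%)"
```

For example, `progress_label(5, 20)` gives `Embedding 5 / 20 pages (25%)`, while `progress_label(0, None)` returns `None` to signal the indeterminate bar.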
Re-indexing
Click Re-index on any document card to re-run the full ingest pipeline. This is useful after:
- Changing the PAGEPIPER_EMBED_MODEL (a dimension mismatch is auto-detected at startup, but you can also trigger a re-index manually)
- A failed ingest you want to retry
- Updating to a new version of Pagepiper with an improved extractor
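The dimension check mentioned in the first bullet might look something like the sketch below. It is a guess at the general idea only; `needs_reindex`, `stored_dim`, and `model_dim` are hypothetical names, and how Pagepiper actually reads either value is not covered here.

```python
def needs_reindex(stored_dim, model_dim):
    """Return True when stored vectors cannot be compared with the current model.

    stored_dim: dimension of the vectors already in the store, or None if empty.
    model_dim:  output dimension of the configured embedding model.
    """
    if stored_dim is None:
        return False  # nothing indexed yet, so there is nothing to rebuild
    return stored_dim != model_dim
```

Vectors of different lengths cannot be compared, so switching to a model with a different output size requires every document to be embedded again.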
Removing a document
Click Remove to delete the document's metadata, page chunks, and vectors. The source file on disk is not deleted.
Storage
All data lives in the directory set by PAGEPIPER_DATA_DIR (default: data/):
| File | Contents |
|---|---|
| pagepiper.db | Document metadata, page chunks, chat feedback |
| pagepiper_vecs.db | sqlite-vec vector store |
| uploads/ | Files added via browser upload |
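To make the layout concrete, here is a small sketch that resolves the three locations from PAGEPIPER_DATA_DIR. The helper name `storage_paths` and its dictionary keys are illustrative assumptions; only the environment variable, its default, and the file names come from the table above.

```python
import os
from pathlib import Path


def storage_paths(env=None):
    """Resolve the storage locations listed in the table above."""
    env = os.environ if env is None else env
    # Falls back to data/ when PAGEPIPER_DATA_DIR is not set.
    data_dir = Path(env.get("PAGEPIPER_DATA_DIR", "data"))
    return {
        "metadata_db": data_dir / "pagepiper.db",     # metadata, chunks, feedback
        "vector_db": data_dir / "pagepiper_vecs.db",  # sqlite-vec vector store
        "uploads_dir": data_dir / "uploads",          # browser uploads
    }
```

Passing a plain dictionary instead of the real environment, as in `storage_paths({"PAGEPIPER_DATA_DIR": "/srv/pagepiper"})`, makes it easy to check where files would land before changing your .env.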