Library

The library is the home screen. It shows all indexed documents and lets you add new ones.

Adding documents

Upload: click Upload PDF / EPUB and select a file. Files up to 200 MB are accepted. The document is saved to uploads/ inside the data directory (data/uploads/ by default) and queued for indexing immediately.

Scan: in your .env, set PAGEPIPER_WATCH_DIR to the directory you want scanned, then click Scan for PDFs. Despite the button label, any PDF or EPUB not already in the library is queued. Re-scanning is safe; already-indexed documents are skipped.
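The skip-what-is-known behaviour of a scan can be pictured with a short sketch. This is not Pagepiper's actual code: the function name `find_new_documents`, the `known_paths` argument, and the choice to search subdirectories recursively are all assumptions made for illustration.

```python
from pathlib import Path

# Extensions the scan is described as picking up.
SUPPORTED_SUFFIXES = {".pdf", ".epub"}


def find_new_documents(watch_dir, known_paths):
    """Return supported files under watch_dir that are not in known_paths.

    Hypothetical helper: recursion into subdirectories is an assumption.
    """
    known = {Path(p).resolve() for p in known_paths}
    candidates = (
        p
        for p in Path(watch_dir).rglob("*")
        if p.is_file() and p.suffix.lower() in SUPPORTED_SUFFIXES
    )
    # Comparing resolved paths makes a second scan return nothing new.
    return sorted(p for p in candidates if p.resolve() not in known)
```

Because the comparison is against files already known to the library, running the function twice over an unchanged directory yields an empty list the second time, which is what makes re-scanning safe.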

Document states

Badge	Meaning
PROCESSING	Text extraction or embedding in progress
READY	Fully indexed and searchable
ERROR	Indexing failed — see the error message on the card
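
For readers scripting against the library, the three badges map naturally onto an enumeration. The sketch below is illustrative only; the class name `DocumentState` and the helper `is_searchable` are hypothetical and are not taken from Pagepiper's code.

```python
from enum import Enum


class DocumentState(Enum):
    """Hypothetical mirror of the badges shown on document cards."""

    PROCESSING = "PROCESSING"  # text extraction or embedding in progress
    READY = "READY"            # fully indexed and searchable
    ERROR = "ERROR"            # indexing failed


def is_searchable(state):
    """Only READY documents are described as searchable."""
    return state is DocumentState.READY
```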

Ingestion progress

While a document is processing, its card shows a live progress bar:

An animated sliding bar while text is being extracted (before the page count is known)
"Embedding N / M pages (X%)" once vectors are being written

The card refreshes automatically and triggers a library reload when indexing completes.
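
The two progress phases can be sketched as a small formatting function. This is an assumption-laden illustration rather than the actual UI code: `progress_label` is a hypothetical name, and rounding the percentage down is a choice made here so that 100% appears only when every page is done.

```python
def progress_label(pages_done, total_pages):
    """Return the embedding label, or None while the page count is unknown.

    None stands for the extraction phase, where an indeterminate
    (sliding) bar would be shown instead of a percentage.
    """
    if total_pages is None or total_pages <= 0:
        return None
    done = max(0, min(pages_done, total_pages))  # clamp to a valid range
    percent = done * 100 // total_pages          # integer percent, rounded down
    return f"Embedding {done} / {total_pages} pages ({percent}%)"
```

For example, `progress_label(3, 8)` gives `Embedding 3 / 8 pages (37%)`.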

Re-indexing

Click Re-index on any document card to re-run the full ingest pipeline. This is useful after:

Changing PAGEPIPER_EMBED_MODEL (a dimension mismatch is detected automatically at startup, but you can also trigger a re-index manually)
A failed ingest you want to retry
Updating to a new version of Pagepiper with an improved extractor
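
The embedding-model case is worth a closer look: vectors produced by models with different output dimensions cannot be compared, so documents indexed under the old model must be re-embedded. The sketch below shows the idea only; the function name, the `stored_dims` mapping, and the example dimensions 384 and 768 are hypothetical, and Pagepiper's own startup check may work differently.

```python
def documents_needing_reindex(stored_dims, model_dim):
    """Return ids of documents whose stored vector size differs from model_dim.

    stored_dims maps a document id to the dimension of its stored vectors
    (a hypothetical structure, used here for illustration only).
    """
    return sorted(
        doc_id for doc_id, dim in stored_dims.items() if dim != model_dim
    )


# Illustrative values: one document embedded at 384 dimensions, one at 768.
stale = documents_needing_reindex({"manual": 384, "guide": 768}, model_dim=768)
```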

Removing a document

Click Remove to delete the document's metadata, page chunks, and vectors. The source file on disk is not deleted.

Storage

All data lives in the directory set by PAGEPIPER_DATA_DIR (default: data/):

Path	Contents
`pagepiper.db`	Document metadata, page chunks, chat feedback
`pagepiper_vecs.db`	sqlite-vec vector store
`uploads/`	Files added via browser upload
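
If you need to locate these paths from a script (for backups, say), the layout above can be resolved as follows. This is a minimal sketch: `storage_paths` and its dictionary keys are hypothetical names, and treating an empty PAGEPIPER_DATA_DIR as unset is an assumption.

```python
import os
from pathlib import Path


def storage_paths(env=None):
    """Resolve the storage layout from PAGEPIPER_DATA_DIR (default: data/)."""
    env = os.environ if env is None else env
    # Assumption: an empty value falls back to the default directory.
    data_dir = Path(env.get("PAGEPIPER_DATA_DIR") or "data")
    return {
        "metadata_db": data_dir / "pagepiper.db",     # metadata, chunks, feedback
        "vector_db": data_dir / "pagepiper_vecs.db",  # sqlite-vec vector store
        "uploads": data_dir / "uploads",              # browser uploads
    }
```

Passing a plain dictionary instead of the real environment, as in `storage_paths({"PAGEPIPER_DATA_DIR": "/srv/pagepiper"})`, makes the function easy to try out without changing your shell settings.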
