# Library

The library is the home screen. It shows all indexed documents and lets you add new ones.

## Adding documents

**Upload**: click **Upload PDF / EPUB** and select a file. Files up to 200 MB are accepted. The document is saved to `uploads/` inside the data directory (`data/uploads/` by default) and queued for indexing immediately.

**Scan**: in your `.env`, set `PAGEPIPER_WATCH_DIR` to a directory, then click **Scan for PDFs**. Any PDF or EPUB not already in the library is queued. Re-scanning is safe; already-indexed documents are skipped.
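
The skip-on-rescan behaviour described above can be sketched roughly as follows. This is an illustration only, not Pagepiper's actual code: the function name `find_new_documents`, the non-recursive directory walk, and matching documents by file name are all assumptions made for the example.

```python
from pathlib import Path

# Extensions the scan accepts, per the text above.
SUPPORTED_SUFFIXES = {".pdf", ".epub"}


def find_new_documents(watch_dir, indexed_names):
    """Return supported files in watch_dir that are not already indexed.

    Hypothetical helper: `indexed_names` is a set of file names already in
    the library. The real application may identify documents differently.
    """
    new_files = []
    for path in sorted(Path(watch_dir).iterdir()):
        if not path.is_file():
            continue  # ignore subdirectories (assumption: scan is not recursive)
        if path.suffix.lower() not in SUPPORTED_SUFFIXES:
            continue  # only PDF and EPUB are queued
        if path.name in indexed_names:
            continue  # already in the library, so re-scanning skips it
        new_files.append(path)
    return new_files
```

Calling the function twice with an updated `indexed_names` set returns an empty list the second time, which is why re-scanning is safe.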

## Document states

| Badge | Meaning |
|-------|---------|
| PROCESSING | Text extraction or embedding in progress |
| READY | Fully indexed and searchable |
| ERROR | Indexing failed; see the error message on the card |

## Ingestion progress

While a document is processing, its card shows a live progress bar:

- An animated sliding bar while text is being extracted (before the page count is known)
- "Embedding N / M pages (X%)" once vectors are being written

The card refreshes automatically and triggers a library reload when indexing completes.
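
As a rough illustration of the two progress phases, the label could be derived like this. The function `progress_label` is hypothetical, and rounding the percentage down is an assumption; the actual UI may compute it differently.

```python
def progress_label(pages_done, total_pages):
    """Build the embedding progress text, or None while the page count is unknown.

    Hypothetical helper: returning None stands in for the indeterminate
    (animated sliding bar) phase during text extraction.
    """
    if total_pages is None or total_pages <= 0:
        return None  # page count not known yet, so no N / M text can be shown
    done = min(max(pages_done, 0), total_pages)  # clamp to the valid range
    percent = (100 * done) // total_pages  # assumption: percentage rounds down
    return f"Embedding {done} / {total_pages} pages ({percent}%)"
```

For example, `progress_label(30, 120)` gives `"Embedding 30 / 120 pages (25%)"`.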

## Re-indexing

Click **Re-index** on any document card to re-run the full ingest pipeline. This is useful after:

- Changing `PAGEPIPER_EMBED_MODEL` (a dimension mismatch is detected automatically at startup, but you can also trigger re-indexing manually)
- A failed ingest that you want to retry
- Updating to a new version of Pagepiper with an improved extractor
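
To see why a changed embedding model calls for re-indexing, consider the sketch below. It is not Pagepiper's startup check: `IndexedDocument`, its `vector_dim` field, and `documents_needing_reindex` are hypothetical names, and the example only assumes that vectors written with one dimension cannot be searched with a model that outputs another.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class IndexedDocument:
    """Hypothetical record: a document and the dimension of its stored vectors."""
    title: str
    vector_dim: Optional[int]  # None means no vectors have been written yet


def documents_needing_reindex(
    documents: List[IndexedDocument], model_dim: int
) -> List[IndexedDocument]:
    """Return documents whose stored vectors do not match the current model."""
    return [
        doc
        for doc in documents
        if doc.vector_dim is not None and doc.vector_dim != model_dim
    ]
```

Documents returned by such a check are the ones you would re-index after switching models.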

## Removing a document

Click **Remove** to delete the document's metadata, page chunks, and vectors. The source file on disk is not deleted.

## Storage

All data lives in the directory set by `PAGEPIPER_DATA_DIR` (default: `data/`):

| File | Contents |
|------|---------|
| `pagepiper.db` | Document metadata, page chunks, chat feedback |
| `pagepiper_vecs.db` | sqlite-vec vector store |
| `uploads/` | Files added via browser upload |
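
If you need to locate these files from a script (for backups, say), a minimal sketch might look like the following. The helper `storage_paths` and its dictionary keys are made up for this example; only the environment variable, its default, and the three file names come from the table above.

```python
import os
from pathlib import Path


def storage_paths(env=None):
    """Resolve the storage locations listed above.

    Hypothetical helper: `env` defaults to the process environment and can be
    replaced with a plain dict for testing.
    """
    env = os.environ if env is None else env
    data_dir = Path(env.get("PAGEPIPER_DATA_DIR", "data"))  # documented default
    return {
        "metadata_db": data_dir / "pagepiper.db",
        "vector_db": data_dir / "pagepiper_vecs.db",
        "uploads": data_dir / "uploads",
    }
```

With `PAGEPIPER_DATA_DIR` unset, `storage_paths()["metadata_db"]` resolves to `data/pagepiper.db` relative to the working directory.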