docs: add README and MIT LICENSE

Covers hardware requirements, Docker Compose quickstart, /extract API
reference, CF_DOCUVISION_URL wiring, and kiwi#150 callout for the
self-hosted CF_ORCH_URL code gap.
pyr0ball 2026-06-05 11:59:25 -07:00
parent 47d4dfc786
commit cf0e2fa649
2 changed files with 207 additions and 0 deletions

21
LICENSE Normal file

@@ -0,0 +1,21 @@
MIT License
Copyright (c) 2024 CircuitForge LLC
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

186
README.md Normal file

@@ -0,0 +1,186 @@
# cf-docuvision
Document parsing service for CircuitForge products. Parses scanned documents, PDFs, forms, and receipts into structured elements (headings, paragraphs, tables, figures) using [Dolphin-v2](https://huggingface.co/ByteDance/Dolphin-v2) (ByteDance, Apache 2.0).
**Status:** v0.1.0 — production-ready for single-page documents.
---
## Prerequisites
### Hardware
| GPU VRAM | Result |
|----------|--------|
| 16 GB+ | Recommended: fast single-page parsing (1 to 3 seconds) |
| 8 GB | Minimum: works for most documents |
| Under 8 GB | Likely CUDA out-of-memory on model load |
| CPU only | Works: expect 60 to 120 seconds per page |
If you are on CPU or have limited VRAM, set `CF_DOCUVISION_DEVICE=cpu` before starting. The service logs a warning and continues — CPU fallback is slow but functional.
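The table above can be read as a simple decision rule. The following is a minimal sketch of that rule, not code from the service: `recommended_device` and `MIN_VRAM_GB` are hypothetical names, and how you measure available VRAM is left to the caller.

```python
from typing import Optional

# Threshold taken from the "Minimum" row of the hardware table.
MIN_VRAM_GB = 8.0


def recommended_device(vram_gb: Optional[float]) -> str:
    """Suggest a value for CF_DOCUVISION_DEVICE.

    Pass None when the machine has no GPU at all.
    """
    if vram_gb is None or vram_gb < MIN_VRAM_GB:
        # Below 8 GB the model load is likely to hit CUDA out-of-memory.
        return "cpu"
    return "cuda"
```

Leaving `CF_DOCUVISION_DEVICE` at its default of `auto` remains an option; this helper is only useful when you want to force CPU mode ahead of time on a small GPU.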
### Model download
First startup downloads approximately **5 to 8 GB** from HuggingFace. Subsequent runs use the local cache. No HuggingFace account is required (the model is Apache 2.0, not gated).
To speed up large downloads:
```bash
pip install hf-transfer
export HF_HUB_ENABLE_HF_TRANSFER=1
```
---
## Quick start (Docker Compose)
```bash
git clone https://git.opensourcesolarpunk.com/Circuit-Forge/cf-docuvision.git
cd cf-docuvision
cp .env.example .env # edit if needed
docker compose up -d
```
Watch model load progress:
```bash
docker compose logs -f cf-docuvision
```
The service is ready when logs show `cf-docuvision: ready`. Confirm:
```bash
curl http://localhost:8003/health
# {"status": "ok", "model": "ByteDance/Dolphin-v2"}
```
---
## Direct Python run
```bash
pip install -r requirements.txt
CF_DOCUVISION_DEVICE=cuda uvicorn app.main:app --host 0.0.0.0 --port 8003
```
CPU fallback:
```bash
CF_DOCUVISION_DEVICE=cpu uvicorn app.main:app --host 0.0.0.0 --port 8003
```
---
## Configuration
| Variable | Default | Description |
|---|---|---|
| `CF_DOCUVISION_MODEL` | `ByteDance/Dolphin-v2` | HuggingFace model ID or local path |
| `CF_DOCUVISION_DEVICE` | `auto` | `cuda`, `cpu`, or `auto` (GPU if available) |
| `CF_DOCUVISION_PORT` | `8003` | Service port (Docker Compose only) |
To skip HuggingFace download, set `CF_DOCUVISION_MODEL` to a local directory:
```bash
# Optional: uncomment the volume mount in compose.yml
# - /Library/Assets/LLM/dolphin-v2:/models/dolphin-v2:ro
CF_DOCUVISION_MODEL=/models/dolphin-v2
```
---
## Connecting from a product
Set `CF_DOCUVISION_URL` in the product's `.env`:
```bash
CF_DOCUVISION_URL=http://localhost:8003
```
Products using cf-core's `DocuvisionClient` pick this up automatically.
> **Kiwi note:** Kiwi v0.10.x gates the docuvision call on `CF_ORCH_URL`; `CF_DOCUVISION_URL` is not yet read directly. The fix is tracked at [kiwi#150](https://git.opensourcesolarpunk.com/Circuit-Forge/kiwi/issues/150). Once that ships, set `CF_DOCUVISION_URL` in Kiwi's `.env` and leave `CF_ORCH_URL` unset.
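For code that does not go through cf-core, reading the variable by hand might look like the sketch below. `resolve_docuvision_url` is a hypothetical helper, not part of cf-core or `DocuvisionClient`, and returning `None` when the variable is unset is an assumption about how a caller would want to handle a missing service.

```python
import os
from typing import Mapping, Optional


def resolve_docuvision_url(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the configured base URL without a trailing slash, or None if unset."""
    raw = env.get("CF_DOCUVISION_URL", "").strip()
    if not raw:
        # Variable missing or empty: the caller can skip document parsing.
        return None
    return raw.rstrip("/")
```

Stripping the trailing slash keeps later path joins such as `f"{base_url}/extract"` from producing a double slash.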
---
## API reference
### `GET /health`
Returns 200 when the model is loaded and ready.
```json
{"status": "ok", "model": "ByteDance/Dolphin-v2"}
```
Returns 503 while the model is still loading at startup.
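A client can poll this endpoint before sending work. The sketch below uses only the Python standard library; `fetch_health_status` and `wait_until_ready` are hypothetical helper names, and the 120 second default simply mirrors the Docker healthcheck window mentioned in Troubleshooting.

```python
import time
import urllib.error
import urllib.request


def fetch_health_status(base_url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status of GET /health, or 0 if the service is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # for example 503 while the model is still loading
    except OSError:
        return 0  # connection refused, DNS failure, timeout


def wait_until_ready(base_url, max_wait_s=120.0, poll_s=2.0,
                     fetch=fetch_health_status, sleep=time.sleep) -> bool:
    """Poll /health until it returns 200 or max_wait_s has elapsed."""
    deadline = time.monotonic() + max_wait_s
    while True:
        if fetch(base_url) == 200:
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(poll_s)
```

The `fetch` and `sleep` parameters exist so the loop can be exercised without a running service; in normal use, `wait_until_ready("http://localhost:8003")` is enough.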
### `POST /extract`
Parse a document image into structured elements.
**Request:**
```json
{
"image_b64": "<base64-encoded image bytes (JPEG, PNG, TIFF)>",
"hint": "auto"
}
```
`hint` controls extraction focus:
| Value | Behaviour |
|---|---|
| `auto` | General parsing — balanced detection of all element types (default) |
| `table` | Prioritise HTML table rendering |
| `text` | Prioritise text content and heading hierarchy |
| `form` | Prioritise form fields and key-value pairs |
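As an illustration of the request shape, the sketch below builds the JSON body and posts it with the standard library. `build_extract_payload` and `post_extract` are hypothetical names, the client-side hint check is an assumption (the service may validate differently), and the 180 second timeout is chosen only to leave room for CPU mode.

```python
import base64
import json
import urllib.request

# The four hint values listed in the table above.
VALID_HINTS = {"auto", "table", "text", "form"}


def build_extract_payload(image_bytes: bytes, hint: str = "auto") -> dict:
    """Build the JSON body for POST /extract from raw image bytes."""
    if hint not in VALID_HINTS:
        raise ValueError(f"unknown hint {hint!r}; expected one of {sorted(VALID_HINTS)}")
    return {
        "image_b64": base64.b64encode(image_bytes).decode("ascii"),
        "hint": hint,
    }


def post_extract(base_url: str, payload: dict, timeout: float = 180.0) -> dict:
    """Send the payload to /extract and return the decoded JSON response."""
    req = urllib.request.Request(
        f"{base_url}/extract",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)
```

A typical call reads a file with `open(path, "rb").read()`, passes the bytes to `build_extract_payload`, and hands the result to `post_extract("http://localhost:8003", payload)`.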
**Response:**
```json
{
"elements": [
{"type": "heading", "text": "Invoice", "bbox": [0.05, 0.02, 0.9, 0.08]},
{"type": "paragraph", "text": "Due date: 2026-07-01", "bbox": [0.05, 0.10, 0.6, 0.14]}
],
"tables": [
{"html": "<table>...</table>", "bbox": [0.05, 0.20, 0.95, 0.60]}
],
"raw_text": "Invoice\nDue date: 2026-07-01\n...",
"metadata": {
"source": "cf-docuvision",
"model": "ByteDance/Dolphin-v2",
"hint": "auto",
"elapsed_ms": 1240
}
}
```
Element types: `heading`, `paragraph`, `list`, `table`, `figure`, `formula`, `code`.
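To pull out one kind of element from a response, a small filter is usually enough. The sketch below assumes only the response shape shown above; `texts_of_type` is a hypothetical helper and `sample_response` is a trimmed copy of the example.

```python
from typing import Any, Dict, List


def texts_of_type(response: Dict[str, Any], element_type: str) -> List[str]:
    """Return the text of every element whose type matches element_type."""
    return [
        el.get("text", "")
        for el in response.get("elements", [])
        if el.get("type") == element_type
    ]


# Trimmed copy of the example response from this section.
sample_response = {
    "elements": [
        {"type": "heading", "text": "Invoice", "bbox": [0.05, 0.02, 0.9, 0.08]},
        {"type": "paragraph", "text": "Due date: 2026-07-01", "bbox": [0.05, 0.10, 0.6, 0.14]},
    ],
}

print(texts_of_type(sample_response, "heading"))  # ['Invoice']
```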
`bbox` values are normalised to [0, 1] relative to the image dimensions.
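Converting a normalised box back to pixels is a multiplication by the image size. The sketch below assumes the four values are ordered `[x_min, y_min, x_max, y_max]`; the README does not state the order, so check this against real output before relying on it. `bbox_to_pixels` is a hypothetical helper.

```python
from typing import Sequence, Tuple


def bbox_to_pixels(bbox: Sequence[float], width: int, height: int) -> Tuple[int, int, int, int]:
    """Scale a normalised bbox to integer pixel coordinates.

    Assumes bbox is [x_min, y_min, x_max, y_max] (an assumption, see above).
    """
    if len(bbox) != 4:
        raise ValueError("bbox must have exactly four values")
    # Clamp to [0, 1] so slightly out-of-range values stay inside the image.
    x0, y0, x1, y1 = (min(max(v, 0.0), 1.0) for v in bbox)
    return (round(x0 * width), round(y0 * height), round(x1 * width), round(y1 * height))
```

For an 800 by 600 image, `bbox_to_pixels([0.25, 0.5, 0.75, 1.0], 800, 600)` gives `(200, 300, 600, 600)`.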
---
## Troubleshooting
**`CUDA out of memory` at startup**
Dolphin-v2 requires ~8GB VRAM. Set `CF_DOCUVISION_DEVICE=cpu` to use CPU mode instead.
**`503 Model not loaded` on first request**
The model is still loading. Watch logs for `cf-docuvision: ready` before sending requests. The Docker healthcheck waits up to 120 seconds.
**Very slow processing**
CPU mode is expected to take 60 to 120 seconds per page. This is normal. If you need speed, a GPU is required.
**`trust_remote_code=True` warning**
Dolphin-v2 requires `trust_remote_code=True` for its custom architecture. The model is Apache 2.0 and auditable at [huggingface.co/ByteDance/Dolphin-v2](https://huggingface.co/ByteDance/Dolphin-v2).
---
## License
- cf-docuvision service: [MIT](LICENSE) — CircuitForge LLC
- Dolphin-v2 model: [Apache 2.0](https://huggingface.co/ByteDance/Dolphin-v2) — ByteDance