Add CF_DOCUVISION_URL direct fallback in _try_docuvision() #150

Open
opened 2026-06-05 11:20:27 -07:00 by pyr0ball · 0 comments
Owner

Problem

_try_docuvision() in app/services/ocr/vl_model.py currently gates on CF_ORCH_URL:

if not CF_ORCH_URL:
    return None

If CF_ORCH_URL is not set, the function returns None and never attempts the docuvision call — even if the user has set CF_DOCUVISION_URL pointing to a standalone cf-docuvision container.

CF_DOCUVISION_URL is not read anywhere in Kiwi, so self-hosters running their own cf-docuvision instance cannot connect Kiwi to it without this fix.

Fix

Read CF_DOCUVISION_URL from the environment and use it in _try_docuvision() as a direct fallback:

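import os

# Base URL of a standalone cf-docuvision instance; empty string when unset.
CF_DOCUVISION_URL = os.environ.get("CF_DOCUVISION_URL", "").rstrip("/")

async def _try_docuvision(image_bytes: bytes) -> str | None:
    # CF_ORCH_URL is the existing setting this function already checks.
    # It keeps precedence; CF_DOCUVISION_URL is used only when it is empty.
    url = CF_ORCH_URL or CF_DOCUVISION_URL
    if not url:
        return None
    # ... rest of function, call /parse or equivalent endpoint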

When CF_ORCH_URL is set, cf-orch still takes precedence (existing behaviour). When only CF_DOCUVISION_URL is set, Kiwi connects directly to the standalone service.
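
The precedence rule can be sketched as a small pure function, which makes it easy to check in isolation. This is only an illustration: `resolve_docuvision_url` is a hypothetical helper name, not something that exists in Kiwi, and the hostnames in the usage lines are made up.

```python
import os


def resolve_docuvision_url(env=None):
    """Hypothetical helper: pick the base URL to call, or None if neither is set."""
    env = os.environ if env is None else env
    # Normalise both values the same way the issue's snippet does (strip trailing "/").
    orch = env.get("CF_ORCH_URL", "").rstrip("/")
    direct = env.get("CF_DOCUVISION_URL", "").rstrip("/")
    # cf-orch wins when both are set; the direct URL is only a fallback.
    return orch or direct or None


# Usage with made-up hostnames:
both = {"CF_ORCH_URL": "http://orch.local", "CF_DOCUVISION_URL": "http://docuvision.local/"}
only_direct = {"CF_DOCUVISION_URL": "http://docuvision.local/"}
print(resolve_docuvision_url(both))
print(resolve_docuvision_url(only_direct))
print(resolve_docuvision_url({}))
```

Passing the environment in as a mapping keeps the rule testable without touching `os.environ`. The real change may well stay inline in `_try_docuvision()` as shown above.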

Acceptance criteria

  • CF_DOCUVISION_URL is read from env
  • _try_docuvision() uses it when CF_ORCH_URL is absent
  • .env.example documents CF_DOCUVISION_URL with a comment explaining it is for self-hosters without cf-orch
  • Unit test covering the CF_DOCUVISION_URL-only path (mock HTTP call)
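
One possible shape for the unit test in the last criterion is sketched below. It does not import Kiwi's real module. `try_docuvision` is a stand-in with the same fallback logic, `post` is a hypothetical injected HTTP callable, and `/parse` is the placeholder endpoint from the Fix section rather than a confirmed route.

```python
import asyncio
from unittest.mock import AsyncMock


async def try_docuvision(image_bytes, *, orch_url, docuvision_url, post):
    """Stand-in for _try_docuvision(); `post` is an assumed injectable HTTP call."""
    url = orch_url or docuvision_url
    if not url:
        return None
    # "/parse" is the placeholder endpoint named in the issue, not a confirmed route.
    return await post(f"{url}/parse", image_bytes)


def test_docuvision_url_only_path():
    # Only the direct URL is set, so the call must go to the standalone service.
    post = AsyncMock(return_value="parsed text")
    result = asyncio.run(
        try_docuvision(b"img", orch_url="", docuvision_url="http://docuvision.local", post=post)
    )
    assert result == "parsed text"
    post.assert_awaited_once_with("http://docuvision.local/parse", b"img")


def test_no_urls_skips_call():
    # With neither URL set, the function returns None without any HTTP call.
    post = AsyncMock()
    result = asyncio.run(try_docuvision(b"img", orch_url="", docuvision_url="", post=post))
    assert result is None
    post.assert_not_awaited()
```

In the actual test suite the same assertions would more likely patch the module-level `CF_ORCH_URL` and `CF_DOCUVISION_URL` values and whatever HTTP client `_try_docuvision()` uses. Those details are not shown in this issue.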

Labels

enhancement, self-hosting

Reference: Circuit-Forge/kiwi#150