Without CF_ORCH_URL set, _call_vision_backend() skips cf-orch entirely
and falls through to the local VLM, which then fails because the
container has no GPU.
.env gets CF_ORCH_URL=http://10.1.10.71:7700 for the local rack.
.env.example updated with documentation for self-hosters.
Local scan confirmed: cf-docuvision (Sif, GGUF) → ollama llama3.1:8b → 200 OK.
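
A minimal sketch of the fallthrough described above, assuming cf-orch is
tried first whenever CF_ORCH_URL is set. vision_backend_order() and the
backend labels are hypothetical and are not taken from
_call_vision_backend() itself.

```python
from typing import List, Mapping


def vision_backend_order(env: Mapping[str, str]) -> List[str]:
    """Return the backends to try, in order (hypothetical labels)."""
    order: List[str] = []
    # Assumption: an unset or empty CF_ORCH_URL means cf-orch is skipped.
    if env.get("CF_ORCH_URL"):
        order.append("cf-orch")
    # The local VLM is always the last resort; in a container without a
    # GPU this step is the one that fails.
    order.append("local-vlm")
    return order
```

With an empty environment only the local VLM is attempted, which matches
the failure mode this change addresses.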
Paid+ local users with circuitforge_orch installed now get the
coordinator-aware scheduler automatically; no env var is needed. The
coordinator's allocation queue already prefers the local GPU first, so
latency stays low.
Priority: USE_ORCH_SCHEDULER env override > CLOUD_MODE > cf-orch importable.
Free-tier local users without cf-orch installed get LocalScheduler as before.
USE_ORCH_SCHEDULER=false can force LocalScheduler even when cf-orch is present.
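
The priority chain above could be sketched roughly as follows.
pick_scheduler() and parse_flag() are hypothetical helpers, and treating
only a case-insensitive "true" as truthy is an assumption. The sketch
returns class names only and does not model what happens when orch is
requested but circuitforge_orch cannot be imported.

```python
from typing import Mapping, Optional


def parse_flag(value: Optional[str]) -> Optional[bool]:
    """Map an env string to True/False, or None when unset or empty."""
    if value is None or value.strip() == "":
        return None
    # Assumption: anything other than "true" (case-insensitive) is False.
    return value.strip().lower() == "true"


def pick_scheduler(env: Mapping[str, str], orch_importable: bool) -> str:
    """Resolve the scheduler name: override > CLOUD_MODE > importability."""
    override = parse_flag(env.get("USE_ORCH_SCHEDULER"))
    if override is not None:
        # 1. An explicit override always wins.
        return "OrchestratedScheduler" if override else "LocalScheduler"
    if parse_flag(env.get("CLOUD_MODE")):
        # 2. Cloud mode selects the orchestrated scheduler.
        return "OrchestratedScheduler"
    # 3. Otherwise follow whether circuitforge_orch is importable.
    return "OrchestratedScheduler" if orch_importable else "LocalScheduler"
```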
Switches to OrchestratedScheduler in cloud mode so concurrent recipe_llm
jobs fan out across all registered cf-orch GPU nodes instead of serializing
on one. Under load this eliminates poll timeouts from queue backup.
USE_ORCH_SCHEDULER env var gives explicit control independent of CLOUD_MODE:
  unset   follow CLOUD_MODE (cloud=orch, local=local)
  true    OrchestratedScheduler always (e.g. multi-GPU local rig)
  false   LocalScheduler always (e.g. cloud single-GPU dev instance)
ImportError fallback: if circuitforge_orch is not installed and orch is
requested, logs a warning and falls back to LocalScheduler gracefully.
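
A sketch of the ImportError fallback, under the assumption that the
import is attempted lazily at scheduler-selection time.
load_scheduler_name() and the logger name are hypothetical; module_name
is a parameter only so the fallback path can be exercised without
circuitforge_orch installed.

```python
import importlib
import logging

logger = logging.getLogger("scheduler")  # hypothetical logger name


def load_scheduler_name(want_orch: bool,
                        module_name: str = "circuitforge_orch") -> str:
    """Return the scheduler to use, falling back if the import fails."""
    if not want_orch:
        return "LocalScheduler"
    try:
        # ModuleNotFoundError is a subclass of ImportError, so a missing
        # package lands in the except branch below.
        importlib.import_module(module_name)
    except ImportError:
        logger.warning(
            "%s is not installed; falling back to LocalScheduler",
            module_name,
        )
        return "LocalScheduler"
    return "OrchestratedScheduler"
```

Logging a warning instead of raising keeps startup working on hosts
where orch was requested by mistake.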
- .env.example: document ANTHROPIC_API_KEY, OPENAI_API_KEY, OLLAMA_HOST,
  OLLAMA_MODEL, CF_ORCH_URL, CF_LICENSE_KEY with usage comments
- config.py: expose CF_LICENSE_KEY in Settings for startup visibility
- pyproject.toml: pin circuitforge-core >= 0.6.0 (env-var auto-config +
  CFOrchClient bearer auth land in 0.6.0)
Bare-metal self-hosters can now run Kiwi with only OLLAMA_HOST set and
zero yaml config. Paid+ users set CF_ORCH_URL + CF_LICENSE_KEY for
managed cloud GPU inference.