Switches to OrchestratedScheduler in cloud mode so concurrent recipe_llm
jobs fan out across all registered cf-orch GPU nodes instead of serializing
on one. Under load, this eliminates the poll timeouts caused by jobs
backing up in a single node's queue.
USE_ORCH_SCHEDULER env var gives explicit control independent of CLOUD_MODE:
  unset   follow CLOUD_MODE (cloud=orch, local=local)
  true    OrchestratedScheduler always (e.g. multi-GPU local rig)
  false   LocalScheduler always (e.g. cloud single-GPU dev instance)
ImportError fallback: if circuitforge_orch is not installed and orch is
requested, logs a warning and falls back to LocalScheduler gracefully.
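
A minimal sketch of the selection rule described above, not the actual
implementation. The function names (want_orch, pick_scheduler_name), the
accepted truthy/falsy spellings, the handling of unrecognized values, and
the injectable importer are all assumptions made for illustration. Only
USE_ORCH_SCHEDULER, CLOUD_MODE, circuitforge_orch and the two scheduler
names come from this change.

```python
import importlib
import logging

logger = logging.getLogger(__name__)

# Assumed spellings; the real parser may accept a different set.
_TRUE = {"1", "true", "yes", "on"}
_FALSE = {"0", "false", "no", "off"}


def want_orch(env, cloud_mode):
    """Return True when the orchestrated scheduler is requested."""
    raw = env.get("USE_ORCH_SCHEDULER")
    if raw is None or not raw.strip():
        return cloud_mode  # unset: follow CLOUD_MODE
    value = raw.strip().lower()
    if value in _TRUE:
        return True
    if value in _FALSE:
        return False
    return cloud_mode  # assumption: unrecognized values behave like unset


def pick_scheduler_name(env, cloud_mode, importer=importlib.import_module):
    """Return the scheduler class name, falling back if orch is missing."""
    if not want_orch(env, cloud_mode):
        return "LocalScheduler"
    try:
        importer("circuitforge_orch")
    except ImportError:
        logger.warning(
            "circuitforge_orch not installed; falling back to LocalScheduler"
        )
        return "LocalScheduler"
    return "OrchestratedScheduler"
```

Passing the environment and importer as arguments keeps the sketch easy
to test; real code would more likely read os.environ directly.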
Add E2E_TEST_USER_ID setting (opt-in via env); session bootstrap logs
at DEBUG instead of INFO for the known test user so test runs don't
inflate session counts. The entries remain visible with DEBUG=true.
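
The log-level switch can be sketched as follows. This is an
illustration under assumptions: bootstrap_log_level and its parameters
are hypothetical names, and the real session bootstrap may be structured
differently. Only the E2E_TEST_USER_ID setting and the DEBUG/INFO split
come from this change.

```python
import logging

logger = logging.getLogger(__name__)


def bootstrap_log_level(user_id, e2e_test_user_id):
    """Pick the level for the session-bootstrap log line.

    e2e_test_user_id is None or empty when the opt-in setting is unset,
    in which case every session logs at INFO as before.
    """
    if e2e_test_user_id and user_id == e2e_test_user_id:
        return logging.DEBUG
    return logging.INFO


def log_session_bootstrap(user_id, e2e_test_user_id=None):
    # Hypothetical call site: one log line per bootstrapped session.
    level = bootstrap_log_level(user_id, e2e_test_user_id)
    logger.log(level, "session bootstrap for user %s", user_id)
```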
- .env.example: document ANTHROPIC_API_KEY, OPENAI_API_KEY, OLLAMA_HOST,
  OLLAMA_MODEL, CF_ORCH_URL, CF_LICENSE_KEY with usage comments
- config.py: expose CF_LICENSE_KEY in Settings for startup visibility
- pyproject.toml: pin circuitforge-core >= 0.6.0 (env-var auto-config +
  CFOrchClient bearer auth land in 0.6.0)
Bare-metal self-hosters can now run Kiwi with only OLLAMA_HOST set and
zero yaml config. Paid+ users set CF_ORCH_URL + CF_LICENSE_KEY for
managed cloud GPU inference.
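
To make the two setups above concrete, here is a rough sketch of how
they might be told apart from the environment alone. It is not the
circuitforge-core auto-config logic: the function name, the returned
labels, and the precedence (cf-orch before Ollama) are assumptions, and
ANTHROPIC_API_KEY / OPENAI_API_KEY are deliberately left out.

```python
def inference_mode(env):
    """Classify an environment mapping into one of the documented setups."""
    # Managed cloud GPU inference needs both the URL and the license key.
    if env.get("CF_ORCH_URL") and env.get("CF_LICENSE_KEY"):
        return "cf-orch"
    # Bare-metal self-hosting: OLLAMA_HOST alone is enough.
    if env.get("OLLAMA_HOST"):
        return "ollama"
    return "unconfigured"
```

For example, inference_mode({"OLLAMA_HOST": "http://localhost:11434"})
yields "ollama" with no other settings present.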