# llm
LLM router with a configurable fallback chain. Abstracts over Ollama, vLLM, Anthropic, and any OpenAI-compatible backend. Products never talk to a specific LLM backend directly.
```python
from circuitforge_core.llm import LLMRouter
```
## Design principle
The router implements "local inference first." Cloud backends sit at the end of the fallback chain. A product configured with only Ollama will never silently fall through to a paid API.
## Configuration

The router reads `config/llm.yaml` from the product's working directory (or the path passed to the constructor). Each product maintains its own `llm.yaml`; cf-core provides the router, not the config.
```yaml
# config/llm.yaml example
fallback_order:
  - ollama
  - vllm
  - anthropic

ollama:
  enabled: true
  base_url: http://localhost:11434
  model: llama3.2:3b

vllm:
  enabled: false
  base_url: http://localhost:8000

anthropic:
  enabled: false
  api_key_env: ANTHROPIC_API_KEY
```
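As a sketch of how the router might interpret this file (illustrative only, not the cf-core source), the effective chain is the configured order filtered down to enabled backends:

```python
# Illustrative sketch: derive the effective fallback chain from a parsed
# config dict. The real router's internals may differ.
config = {
    "fallback_order": ["ollama", "vllm", "anthropic"],
    "ollama": {"enabled": True},
    "vllm": {"enabled": False},
    "anthropic": {"enabled": False},
}

def active_chain(cfg):
    """Backends in configured order, keeping only those marked enabled."""
    return [name for name in cfg["fallback_order"]
            if cfg.get(name, {}).get("enabled")]

print(active_chain(config))  # ['ollama']
```

With the example config above, only `ollama` is enabled, so the chain never reaches a paid API — the "local inference first" guarantee.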
## API

### `LLMRouter(config_path=None, cloud_mode=False)`
Instantiate the router. In most products, instantiation happens inside a shim that injects product-specific config resolution.
### `router.complete(prompt, system=None, images=None, fallback_order=None) -> str`
Send a completion request. Tries backends in order; falls through on error or unavailability.
```python
router = LLMRouter()
response = router.complete(
    prompt="Summarize this recipe in one sentence.",
    system="You are a cooking assistant.",
)
```
Pass `images: list[str]` (base64-encoded) for vision requests; non-vision backends are automatically skipped when images are present.

Pass `fallback_order=["vllm", "anthropic"]` to override the config chain for a specific call (useful for task-specific routing).
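The fall-through behavior can be sketched as follows (assumed behavior for illustration; the actual logic lives in cf-core):

```python
def complete_with_fallback(backends, prompt):
    """Try each (name, callable) backend in order; return the first success.

    Sketch of the router's fall-through strategy, not the cf-core source.
    """
    errors = []
    for name, backend in backends:
        try:
            return backend(prompt)
        except Exception as exc:  # backend down or request failed: try the next
            errors.append((name, exc))
    raise RuntimeError(f"all backends failed: {errors}")
```

A backend that raises (unreachable host, rate limit, bad response) simply hands off to the next entry; only when the whole chain is exhausted does the caller see an error.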
### `router.stream(prompt, system=None) -> Iterator[str]`
Streaming variant. Yields token chunks as they arrive. Not all backends support streaming; the router logs a warning and falls back to a non-streaming backend if needed.
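A typical consumption pattern, assuming `stream()` yields plain string chunks (simulated here with a generator so the example is self-contained):

```python
def fake_stream():
    # Stand-in for router.stream(...); yields token chunks as they arrive.
    yield from ["Hel", "lo, ", "world."]

parts = []
for chunk in fake_stream():
    parts.append(chunk)   # in a real UI, render each chunk incrementally
print("".join(parts))     # Hello, world.
```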
## Shim requirement
!!! warning "Always use the product shim"
    Scripts and endpoints must import `LLMRouter` from the product shim (`scripts/llm_router.py` or `app/llm_router.py`), never directly from `circuitforge_core.llm.router`. The shim handles tri-level config resolution (env vars override the config file, which overrides defaults) and cloud mode wiring. Bypassing it breaks cloud deployments silently.
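Tri-level resolution can be sketched like this (the function and default values are hypothetical illustrations, not the real shim's API):

```python
import os

# Hypothetical defaults for illustration; the real shim's values may differ.
DEFAULTS = {"model": "llama3.2:3b", "base_url": "http://localhost:11434"}

def resolve(key, env_var, file_config):
    """Env var overrides the config file, which overrides the built-in default."""
    env_value = os.environ.get(env_var)
    if env_value is not None:
        return env_value
    if key in file_config:
        return file_config[key]
    return DEFAULTS[key]
```

The ordering matters for cloud deployments: an operator can redirect a product to a different backend with a single environment variable, without touching the checked-in `llm.yaml`.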
## Backends
| Backend | Type | Notes |
|---|---|---|
| `ollama` | Local | Preferred default; model names from `config/llm.yaml` |
| `vllm` | Local GPU | For high-throughput or large models |
| `anthropic` | Cloud | Requires `ANTHROPIC_API_KEY` env var |
| `openai` | Cloud | Any OpenAI-compatible endpoint |
| `claude_code` | Local wrapper | claude-bridge OpenAI-compatible wrapper on :3009 |
## Vision routing

When images are included in a `complete()` call, the router checks each backend's vision capability before trying it. Configure vision priority separately:
```yaml
vision_fallback_order:
  - vision_service   # local moondream2 via FastAPI on :8002
  - claude_code
  - anthropic
```
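Since `images` takes base64 strings, a minimal encoding helper looks like this (stdlib only; the commented call at the end assumes a router instance and is hypothetical):

```python
import base64

def encode_image(path):
    """Read an image file and return it as a base64 string for images=[...]."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("ascii")

# Hypothetical usage:
# router.complete(prompt="Describe this board.", images=[encode_image("board.jpg")])
```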