
# llm

LLM router with a configurable fallback chain. Abstracts over Ollama, vLLM, Anthropic, and any OpenAI-compatible backend. Products never talk to a specific LLM backend directly.

```python
from circuitforge_core.llm import LLMRouter
```

## Design principle

The router implements "local inference first." Cloud backends sit at the end of the fallback chain. A product configured with only Ollama will never silently fall through to a paid API.
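The fallback behavior described above can be sketched as a simple loop (an illustrative sketch, not the actual cf-core source; backend names and the `enabled` flag mirror `config/llm.yaml`):

```python
# Sketch of the fallback chain: try enabled backends in order,
# skip disabled ones, raise only if every backend fails.

class BackendUnavailable(Exception):
    pass

def complete(prompt, backends, fallback_order):
    errors = {}
    for name in fallback_order:
        backend = backends.get(name)
        if backend is None or not backend.get("enabled"):
            continue  # disabled backends are skipped, never silently enabled
        try:
            return backend["call"](prompt)
        except BackendUnavailable as exc:
            errors[name] = exc  # fall through to the next backend
    raise RuntimeError(f"all backends failed: {list(errors)}")

# A product configured with only Ollama enabled never reaches a cloud backend:
backends = {
    "ollama": {"enabled": True, "call": lambda p: f"ollama: {p}"},
    "anthropic": {"enabled": False, "call": lambda p: f"anthropic: {p}"},
}
complete("hi", backends, ["ollama", "anthropic"])  # -> "ollama: hi"
```

Because a disabled backend is skipped rather than errored on, the cloud entries in the chain are inert until a product explicitly enables them.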

## Configuration

The router reads config/llm.yaml from the product's working directory (or the path passed to the constructor). Each product maintains its own llm.yaml; cf-core provides the router, not the config.

```yaml
# config/llm.yaml example
fallback_order:
  - ollama
  - vllm
  - anthropic

ollama:
  enabled: true
  base_url: http://localhost:11434
  model: llama3.2:3b

vllm:
  enabled: false
  base_url: http://localhost:8000

anthropic:
  enabled: false
  api_key_env: ANTHROPIC_API_KEY
```

## API

### `LLMRouter(config_path=None, cloud_mode=False)`

Instantiate the router. In most products, instantiation happens inside a shim that injects product-specific config resolution.

### `router.complete(prompt, system=None, images=None, fallback_order=None) -> str`

Send a completion request. Tries backends in order; falls through on error or unavailability.

```python
router = LLMRouter()
response = router.complete(
    prompt="Summarize this recipe in one sentence.",
    system="You are a cooking assistant.",
)
```

Pass `images: list[str]` (base64-encoded) for vision requests; non-vision backends are automatically skipped when images are present.

Pass `fallback_order=["vllm", "anthropic"]` to override the config chain for a specific call (useful for task-specific routing).

### `router.stream(prompt, system=None) -> Iterator[str]`

Streaming variant. Yields token chunks as they arrive. Not all backends support streaming; the router logs a warning and falls back to a non-streaming backend if needed.
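The streaming fallback can be sketched as follows (a hypothetical sketch with stub backends, not the router's actual internals):

```python
# Sketch: prefer a streaming-capable backend; if none exists in the
# chain, fall back to a non-streaming backend and yield one chunk.

def stream(prompt, backends):
    for backend in backends:
        if "stream" in backend:
            yield from backend["stream"](prompt)  # token chunks as they arrive
            return
    for backend in backends:
        if "complete" in backend:
            # the real router logs a warning before this fallback
            yield backend["complete"](prompt)  # whole response as one chunk
            return

backends = [
    {"complete": lambda p: p.upper()},      # no streaming support
    {"stream": lambda p: iter(p.split())},  # streams word-sized chunks
]
chunks = list(stream("hello there", backends))  # ["hello", "there"]
```

Callers can treat both cases uniformly: iterate over the generator and concatenate chunks, regardless of which backend actually served the request.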

## Shim requirement

!!! warning "Always use the product shim"
    Scripts and endpoints must import `LLMRouter` from the product shim (`scripts/llm_router.py` or `app/llm_router.py`), never directly from `circuitforge_core.llm.router`. The shim handles tri-level config resolution (env vars override the config file, which overrides defaults) and cloud-mode wiring. Bypassing it breaks cloud deployments silently.
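The tri-level resolution the shim performs can be sketched like this (key names, env-var names, and the `resolve` helper are all hypothetical, shown only to illustrate the precedence order):

```python
import os

# Hypothetical sketch of tri-level config resolution.
# Precedence: env var > config/llm.yaml > built-in default.

DEFAULTS = {"ollama.base_url": "http://localhost:11434"}

def resolve(key, config, env_var):
    if env_var in os.environ:
        return os.environ[env_var]  # 1. environment wins
    if key in config:
        return config[key]          # 2. then the product's config file
    return DEFAULTS[key]            # 3. then the library default

config = {"ollama.base_url": "http://gpu-box:11434"}
resolve("ollama.base_url", config, "OLLAMA_BASE_URL")
# -> config value, unless OLLAMA_BASE_URL is set in the environment
```

This is why bypassing the shim fails silently in cloud deployments: code that reads the config file directly never sees the env-var overrides that cloud mode relies on.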

## Backends

| Backend | Type | Notes |
| --- | --- | --- |
| `ollama` | Local | Preferred default; model names from `config/llm.yaml` |
| `vllm` | Local (GPU) | For high-throughput or large models |
| `anthropic` | Cloud | Requires `ANTHROPIC_API_KEY` env var |
| `openai` | Cloud | Any OpenAI-compatible endpoint |
| `claude_code` | Local wrapper | `claude-bridge` OpenAI-compatible wrapper on `:3009` |

## Vision routing

When images are included in a complete() call, the router checks each backend's vision capability before trying it. Configure vision priority separately:

```yaml
vision_fallback_order:
  - vision_service   # local moondream2 via FastAPI on :8002
  - claude_code
  - anthropic
```
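The routing decision can be sketched as chain selection followed by a capability filter (a sketch under stated assumptions: the capability flags and `route` helper are illustrative, not cf-core's actual data model):

```python
# Sketch: pick the vision chain when images are present,
# then drop any backend that cannot handle images.

CAPABILITIES = {
    "ollama": {"vision": False},
    "vision_service": {"vision": True},
    "claude_code": {"vision": True},
    "anthropic": {"vision": True},
}

def route(fallback_order, vision_fallback_order, images=None):
    chain = vision_fallback_order if images else fallback_order
    if not images:
        return chain
    return [b for b in chain if CAPABILITIES.get(b, {}).get("vision")]

route(["ollama", "anthropic"], ["vision_service", "anthropic"])
# text-only call -> ["ollama", "anthropic"]
```

Note the filter runs even on the vision chain, so a misconfigured entry (for example, a text-only backend listed under `vision_fallback_order`) is skipped rather than sent an image payload it cannot handle.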