circuitforge-core/docs/modules/hardware.md
pyr0ball 383897f990
2026-04-24 15:23:16 -07:00


hardware

GPU enumeration and VRAM-tier profile generation. Used by manage.sh at startup to recommend a Docker Compose profile and by the cf-orch coordinator for resource allocation.

from circuitforge_core.hardware import get_gpus, recommend_profile, HardwareProfile

GPU detection

get_gpus() returns a list of detected GPUs with their VRAM capacity. Detection strategy:

  1. Try nvidia-smi (Linux/Windows NVIDIA)
  2. On Darwin, fall back to system_profiler SPDisplaysDataType when sysctl hw.optional.arm64 reports 1 (Apple Silicon)
  3. Return CPU-only profile if neither succeeds

gpus = get_gpus()
# NVIDIA host:        [{"name": "RTX 4090", "vram_gb": 24.0, "type": "nvidia"}]
# Apple Silicon host: [{"name": "Apple M2 Max", "vram_gb": 32.0, "type": "apple_silicon"}]
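The first detection step can be sketched roughly as follows. This is an illustration, not the module's actual implementation: the helper names `parse_nvidia_smi_csv` and `query_nvidia_gpus` are hypothetical, and the sketch assumes `nvidia-smi` is queried with `--query-gpu=name,memory.total --format=csv,noheader,nounits`, which reports memory in MiB.

```python
import subprocess


def parse_nvidia_smi_csv(text: str) -> list[dict]:
    """Parse `name, memory.total` CSV rows (memory in MiB) into GPU dicts."""
    gpus = []
    for line in text.splitlines():
        if "," not in line:
            continue  # skip blank or malformed rows
        # Split from the right so a comma inside the name cannot break parsing.
        name, mem_mib = line.rsplit(",", 1)
        try:
            vram_gb = round(float(mem_mib) / 1024, 1)
        except ValueError:
            continue  # skip rows whose memory field is not numeric
        gpus.append({"name": name.strip(), "vram_gb": vram_gb, "type": "nvidia"})
    return gpus


def query_nvidia_gpus() -> list[dict]:
    """Return NVIDIA GPUs, or an empty list if nvidia-smi is missing or fails."""
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name,memory.total",
             "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=True, timeout=10,
        )
    except (OSError, subprocess.SubprocessError):
        # Binary not found, non-zero exit, or timeout: treat as "no NVIDIA GPU".
        return []
    return parse_nvidia_smi_csv(result.stdout)
```

Keeping the parser separate from the subprocess call makes it possible to test the parsing logic on machines without an NVIDIA driver.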

Compose profile recommendation

profile = recommend_profile(gpus)
# "single-gpu" | "dual-gpu" | "cpu" | "remote"

Profile selection rules:

  • single-gpu: exactly one NVIDIA GPU with >= 8 GB VRAM
  • dual-gpu: two or more NVIDIA GPUs
  • cpu: no NVIDIA GPU (Apple Silicon also maps to cpu, because Docker on macOS has no Metal passthrough)
  • remote: explicitly requested, or selected when local inference would exceed available VRAM
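The rules above might be expressed as a small decision function. This is a sketch under stated assumptions, not the real `recommend_profile`: the name `sketch_recommend_profile` and the `force_remote` flag are hypothetical, the VRAM-based `remote` trigger is omitted because it depends on a model size this page does not define, and a single NVIDIA GPU below 8 GB is assumed to fall through to `cpu`, a case the rules leave unspecified.

```python
def sketch_recommend_profile(gpus: list[dict], force_remote: bool = False) -> str:
    """Map a get_gpus()-style list to a Compose profile name."""
    if force_remote:
        # "remote: explicitly requested"
        return "remote"

    # Only NVIDIA GPUs count; Apple Silicon entries are ignored on purpose,
    # since Docker on macOS cannot pass Metal through to containers.
    nvidia = [g for g in gpus if g.get("type") == "nvidia"]

    if len(nvidia) >= 2:
        return "dual-gpu"
    if len(nvidia) == 1 and nvidia[0]["vram_gb"] >= 8:
        return "single-gpu"

    # Assumption: a lone NVIDIA GPU under 8 GB is treated like "no usable GPU".
    return "cpu"
```

Checking `dual-gpu` before `single-gpu` keeps the rules mutually exclusive without extra conditions.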

!!! note "Apple Silicon"
    Apple Silicon Macs should run Ollama natively (outside Docker) for Metal-accelerated inference. Docker on macOS runs in a Linux VM with no Metal passthrough. Each product's preflight.py detects a native Ollama instance on :11434 and adopts it automatically.
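One simple way to detect a native service on port 11434 is a TCP connect probe. The sketch below is an assumption about how such a check could look, not a description of what preflight.py does; `is_port_open` is a hypothetical helper, and a successful connection only shows that something is listening, not that it is Ollama.

```python
import socket


def is_port_open(host: str = "127.0.0.1", port: int = 11434,
                 timeout: float = 0.5) -> bool:
    """Return True if a TCP listener accepts connections on host:port."""
    try:
        # create_connection raises OSError on refusal, timeout, or DNS failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A short timeout keeps startup fast when nothing is listening.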

VRAM tiers

| VRAM | Models that fit |
|------|-----------------|
| < 4 GB | Quantized 1B–3B models (Phi-3 mini, Llama 3.2 3B Q4) |
| 4–8 GB | 7B–8B models Q4 (Llama 3.1 8B, Mistral 7B) |
| 8–16 GB | 13B–14B models Q4, 7B models in full precision |
| 16–24 GB | 30B models Q4, 13B full precision |
| 24 GB+ | 70B models Q4 |
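The tier boundaries can be turned into a lookup with `bisect`. This is an illustrative sketch: `vram_tier` is a hypothetical helper, not part of the module, and it assumes each tier includes its lower bound (so exactly 24 GB lands in the top tier), which the table does not state explicitly.

```python
from bisect import bisect_right

# Upper edges of the first four tiers, in GB.
_TIER_BOUNDS_GB = [4, 8, 16, 24]
_TIER_LABELS = ["< 4 GB", "4–8 GB", "8–16 GB", "16–24 GB", "24 GB+"]


def vram_tier(vram_gb: float) -> str:
    """Return the tier label for a given VRAM capacity in GB."""
    # bisect_right counts how many bounds are <= vram_gb, which is exactly
    # the index of the tier when lower bounds are inclusive.
    return _TIER_LABELS[bisect_right(_TIER_BOUNDS_GB, vram_gb)]
```

With this, the `vram_gb` values returned by `get_gpus()` map directly onto a table row.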

HardwareProfile

At product startup, preflight.py writes the HardwareProfile dataclass to compose.override.yml, which makes GPU capabilities available to Docker Compose without hardcoding them.
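To make the idea concrete, here is a rough sketch of rendering a profile into a Compose override. Everything specific in it is hypothetical: this page does not list the fields of HardwareProfile, so `SketchHardwareProfile`, its fields, the `app` service name, and the `CF_*` variable names are placeholders chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class SketchHardwareProfile:
    # Hypothetical fields; the real HardwareProfile may differ.
    compose_profile: str
    gpu_count: int
    total_vram_gb: float


def render_override(profile: SketchHardwareProfile, service: str = "app") -> str:
    """Render a minimal Compose override that exposes the profile as env vars."""
    lines = [
        "services:",
        f"  {service}:",
        "    environment:",
        # Values are quoted so YAML keeps them as strings.
        f'      CF_COMPOSE_PROFILE: "{profile.compose_profile}"',
        f'      CF_GPU_COUNT: "{profile.gpu_count}"',
        f'      CF_TOTAL_VRAM_GB: "{profile.total_vram_gb}"',
    ]
    return "\n".join(lines) + "\n"
```

Writing the returned string to compose.override.yml would let Compose pick the values up at startup without editing the base file.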