- platforms/: eBay platform adapter (snipe integration layer)
- docs/: developer guide, module reference, getting-started docs
- scripts/: utility scripts for development and deployment
hardware
GPU enumeration and VRAM-tier profile generation. Used by manage.sh at startup to recommend a Docker Compose profile and by the cf-orch coordinator for resource allocation.
from circuitforge_core.hardware import get_gpus, recommend_profile, HardwareProfile
GPU detection
get_gpus() returns a list of detected GPUs with their VRAM capacity. Detection strategy:
- Try nvidia-smi (Linux/Windows NVIDIA)
- Fall back to system_profiler SPDisplaysDataType on Darwin when hw.optional.arm64=1 (Apple Silicon)
- Return a CPU-only profile if neither succeeds
gpus = get_gpus()
# [{"name": "RTX 4090", "vram_gb": 24.0, "type": "nvidia"},
# {"name": "Apple M2 Max", "vram_gb": 32.0, "type": "apple_silicon"}]
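The fallback order described above can be sketched roughly as follows. This is not the actual implementation of get_gpus(): the names detect_gpus_sketch and parse_nvidia_smi_csv are hypothetical, returning an empty list for the CPU-only case is an assumption, and so is reporting unified memory (hw.memsize) as vram_gb on Apple Silicon.

```python
import platform
import shutil
import subprocess


def _run(cmd: list[str], timeout: float) -> str:
    """Run a command and return its stdout; raises on failure or timeout."""
    return subprocess.run(
        cmd, capture_output=True, text=True, check=True, timeout=timeout
    ).stdout


def parse_nvidia_smi_csv(text: str) -> list[dict]:
    """Parse 'name, memory.total' rows (MiB, no header, no units)."""
    gpus = []
    for line in text.strip().splitlines():
        if not line.strip():
            continue
        # Split on the last comma so GPU names containing commas survive.
        name, mem_mib = (part.strip() for part in line.rsplit(",", 1))
        gpus.append({
            "name": name,
            "vram_gb": round(float(mem_mib) / 1024, 1),
            "type": "nvidia",
        })
    return gpus


def detect_gpus_sketch() -> list[dict]:
    """Hypothetical stand-in for get_gpus(), following the documented order."""
    errors = (subprocess.SubprocessError, OSError, ValueError)

    # Step 1: NVIDIA, if nvidia-smi is on PATH.
    if shutil.which("nvidia-smi"):
        try:
            gpus = parse_nvidia_smi_csv(_run(
                ["nvidia-smi", "--query-gpu=name,memory.total",
                 "--format=csv,noheader,nounits"],
                timeout=10,
            ))
            if gpus:
                return gpus
        except errors:
            pass  # fall through to the next strategy

    # Step 2: Apple Silicon (Darwin with hw.optional.arm64=1).
    if platform.system() == "Darwin":
        try:
            if _run(["sysctl", "-n", "hw.optional.arm64"], timeout=5).strip() == "1":
                profiler = _run(["system_profiler", "SPDisplaysDataType"], timeout=30)
                name = next(
                    (line.split(":", 1)[1].strip()
                     for line in profiler.splitlines()
                     if "Chipset Model:" in line),
                    "Apple Silicon",
                )
                # Assumption: unified memory size stands in for VRAM.
                mem_bytes = int(_run(["sysctl", "-n", "hw.memsize"], timeout=5))
                return [{
                    "name": name,
                    "vram_gb": round(mem_bytes / 1024**3, 1),
                    "type": "apple_silicon",
                }]
        except errors:
            pass

    # Step 3: neither strategy succeeded, so report no GPUs (CPU-only).
    return []
```

Keeping the CSV parsing in its own function means it can be exercised without a GPU present, for example by feeding it a captured nvidia-smi line.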
Compose profile recommendation
profile = recommend_profile(gpus)
# "single-gpu" | "dual-gpu" | "cpu" | "remote"
Profile selection rules:
- single-gpu: one NVIDIA GPU with >= 8 GB VRAM
- dual-gpu: two or more NVIDIA GPUs
- cpu: no NVIDIA GPU (Apple Silicon uses cpu, since Docker on Mac has no Metal passthrough)
- remote: explicitly requested, or when local inference would exceed available VRAM
!!! note "Apple Silicon"
Apple Silicon Macs should run Ollama natively (outside Docker) for Metal-accelerated inference. Docker on macOS runs in a Linux VM with no Metal passthrough. preflight.py in each product detects native Ollama on :11434 and adopts it automatically.
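One plausible way to detect a native Ollama instance is a plain TCP probe of the port. The sketch below assumes that approach; how preflight.py actually performs the check is not documented here, and the function name native_ollama_listening is hypothetical.

```python
import socket


def native_ollama_listening(
    host: str = "127.0.0.1", port: int = 11434, timeout: float = 0.5
) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


if __name__ == "__main__":
    print("native Ollama detected:", native_ollama_listening())
```

An open port only shows that some process is listening; a stricter check would also issue an HTTP request and inspect the response.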
VRAM tiers
| VRAM | Models that fit |
|---|---|
| < 4 GB | Quantized 1B–3B models (Phi-3 mini, Llama 3.2 3B Q4) |
| 4–8 GB | 7B–8B models Q4 (Llama 3.1 8B, Mistral 7B) |
| 8–16 GB | 13B–14B models Q4, 7B models in full precision |
| 16–24 GB | 30B models Q4, 13B full precision |
| 24 GB+ | 70B models Q4 |
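The table can be turned into a small lookup, for example to print a hint next to each detected GPU. This is a sketch rather than part of circuitforge_core: models_for_vram is a hypothetical name, and treating each tier's lower bound as inclusive (so exactly 8 GB falls in the 8–16 GB tier) is an assumption, since the table leaves the boundaries ambiguous.

```python
from bisect import bisect_right

# Upper bounds (exclusive) of the first four tiers, in GB.
_TIER_BOUNDS_GB = [4, 8, 16, 24]

# One entry per table row, in the same order.
_TIER_MODELS = [
    "Quantized 1B–3B models (Phi-3 mini, Llama 3.2 3B Q4)",
    "7B–8B models Q4 (Llama 3.1 8B, Mistral 7B)",
    "13B–14B models Q4, 7B models in full precision",
    "30B models Q4, 13B full precision",
    "70B models Q4",
]


def models_for_vram(vram_gb: float) -> str:
    """Map a VRAM size to the matching 'Models that fit' row."""
    # bisect_right counts how many bounds are <= vram_gb, which is the tier index.
    return _TIER_MODELS[bisect_right(_TIER_BOUNDS_GB, vram_gb)]


if __name__ == "__main__":
    for size in (2, 6, 8, 24):
        print(size, "GB:", models_for_vram(size))
```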
HardwareProfile
The HardwareProfile dataclass is written to compose.override.yml by preflight.py at product startup, making GPU capabilities available to Docker Compose without hardcoding.
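To make the idea concrete, here is a minimal sketch of how a profile could be turned into a Compose override. Everything specific in it is hypothetical: the fields of HardwareProfileSketch, the service name ollama, and the CF_* environment variable names are illustrations, not the real HardwareProfile layout or the file preflight.py writes. The deploy.resources.reservations.devices structure is the standard Compose syntax for reserving NVIDIA GPUs. The output is serialized as JSON, which is also valid YAML, to avoid a PyYAML dependency.

```python
import json
from dataclasses import dataclass


@dataclass
class HardwareProfileSketch:
    """Hypothetical fields; the real HardwareProfile may differ."""
    profile: str          # "single-gpu" | "dual-gpu" | "cpu" | "remote"
    gpu_count: int        # number of NVIDIA GPUs to expose to containers
    total_vram_gb: float


def to_compose_override(hw: HardwareProfileSketch, service: str = "ollama") -> dict:
    """Build a Compose override mapping for one (hypothetical) service."""
    svc: dict = {
        "environment": {
            "CF_PROFILE": hw.profile,                    # hypothetical variable
            "CF_TOTAL_VRAM_GB": str(hw.total_vram_gb),   # hypothetical variable
        }
    }
    if hw.gpu_count > 0:
        # Standard Compose GPU reservation block.
        svc["deploy"] = {"resources": {"reservations": {"devices": [
            {"driver": "nvidia", "count": hw.gpu_count, "capabilities": ["gpu"]}
        ]}}}
    return {"services": {service: svc}}


if __name__ == "__main__":
    hw = HardwareProfileSketch(profile="single-gpu", gpu_count=1, total_vram_gb=24.0)
    # JSON output can be saved as compose.override.yml because JSON is valid YAML.
    print(json.dumps(to_compose_override(hw), indent=2))
```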