circuitforge-core

Author	SHA1	Message	Date
pyr0ball	67493048e2	feat(stt): add cf-stt module — FasterWhisperBackend + managed FastAPI app - STTBackend Protocol + STTResult/STTSegment frozen dataclasses (base.py) - MockSTTBackend for CI/tests (no GPU needed, CF_STT_MOCK=1) - FasterWhisperBackend: loads model once, thread-safe, VRAM estimate by model size - app.py: FastAPI service runnable as managed process by cf-orch POST /transcribe (multipart audio) → STTTranscribeResponse-compatible JSON GET /health → {status, model, vram_mb} - __init__.py: process-level singleton + transcribe() convenience fn - pyproject.toml: stt-faster-whisper + stt-service optional dep groups	2026-04-08 22:14:46 -07:00
pyr0ball	5766fa82ab	refactor: replace vision stub with cf-vision shim (cf-core#36) circuitforge_core.vision.router now re-exports VisionRouter from the standalone cf-vision repo. Existing imports unchanged; falls back to a helpful ImportError stub if cf-vision is not installed. Closes cf-core#36	2026-04-06 17:59:05 -07:00
pyr0ball	48d33a78ef	fix: migration runner resilient to partial-failure via retry-with-removal Instead of splitting SQL on semicolons (fragile — semicolons appear inside comments and string literals), use executescript() for correct tokenization. On 'duplicate column name' error (caused by a prior partial run that auto-committed some ALTER TABLE statements before crashing), strip the already-applied ADD COLUMN statement from the script and retry. Limit to 20 attempts to prevent infinite loops on genuinely broken SQL. This replaces the earlier per-statement split approach which broke on migration 004 comment text containing a semicolon inside a -- comment, causing the remainder ('one row per...') to be treated as raw SQL.	2026-04-05 22:39:12 -07:00
pyr0ball	c9c4828387	fix: make migration runner resilient to partial-failure recovery SQLite's executescript() auto-commits each DDL statement individually. If a migration crashes mid-run, prior ALTER TABLE statements are already committed but the migration is never recorded as applied. On restart, the runner re-runs the same file and hits 'duplicate column name' on already-applied statements, breaking subsequent startups permanently. Replace executescript() with per-statement execute() calls. 'Duplicate column name' OperationalErrors are caught and logged as warnings so the migration can complete and be marked as done. All other errors still propagate normally.	2026-04-05 22:23:29 -07:00
pyr0ball	19a26e02a0	Merge pull request 'feat: re-export make_feedback_router from circuitforge_core.api (closes #30)' (#32) from feature/api-exports into main	2026-04-05 21:37:33 -07:00
pyr0ball	3c9c765668	feat: re-export make_feedback_router from circuitforge_core.api (closes #30)	2026-04-05 21:21:44 -07:00
pyr0ball	bb2ed3e992	fix: parameterize bare dict type annotations in license module	2026-04-05 21:19:10 -07:00
pyr0ball	f3bc4ac605	feat: add CF_LICENSE_KEY validation via Heimdall (closes #26) Introduces circuitforge_core.config.license with validate_license() and get_license_tier(). Both functions are safe to call when CF_LICENSE_KEY is absent, returning free tier gracefully. Results are cached 30 min per (key, product) pair. CF_LICENSE_URL env var overrides the default Heimdall endpoint. Re-exports added to config.__init__. Existing test_config.py moved into tests/test_config/ package to co-locate with new test_license.py (10 tests; 204 total passing).	2026-04-05 21:16:57 -07:00
pyr0ball	f0a9ec5c37	fix: raise 502 on label creation failure; narrow subprocess exception scope	2026-04-05 17:36:52 -07:00
pyr0ball	0a15ad9522	feat: add circuitforge_core.api.feedback — shared feedback router factory (closes #23) Adds make_feedback_router(repo, product, demo_mode_fn) which returns a FastAPI APIRouter with GET /status and POST / endpoints. Handles Forgejo label creation/reuse, issue body assembly (including repro steps for bugs), demo mode gating, and FORGEJO_API_TOKEN presence checks. 12 tests covering all status/submit paths, mock Forgejo interaction, and body content assertions. Also adds fastapi>=0.110 and httpx>=0.27 to [dev] optional deps.	2026-04-05 17:31:02 -07:00
pyr0ball	c244260d1c	feat!: strip resources/ from MIT core — moves to circuitforge-orch (v0.8.0) BREAKING CHANGE: circuitforge_core.resources is no longer available. Import CFOrchClient from circuitforge_orch.client instead. cf-orch CLI entry point is now in the circuitforge-orch package.	2026-04-04 22:34:27 -07:00
pyr0ball	2259382d0b	refactor: replace coordinator-aware TaskScheduler with Protocol + LocalScheduler (MIT); update LLMRouter import path	2026-04-04 22:26:06 -07:00
pyr0ball	090a86ce1b	refactor: update LLMRouter lazy import — circuitforge_core.resources.client → circuitforge_orch.client	2026-04-04 22:16:17 -07:00
pyr0ball	d16bc569cf	chore: bump version to 0.7.0 — affiliates + preferences modules	2026-04-04 18:28:52 -07:00
pyr0ball	fe19de3d9a	feat: affiliates public API surface (__init__.py)	2026-04-04 18:27:45 -07:00
pyr0ball	7837fbcad2	feat: affiliates router — wrap_url() with opt-out, BYOK, and CF env-var resolution	2026-04-04 18:20:21 -07:00
pyr0ball	73cec07bd2	feat: affiliates disclosure — per-retailer tooltip copy + first-encounter banner constants	2026-04-04 18:14:58 -07:00
pyr0ball	4c3f3a95a5	feat: affiliates programs — AffiliateProgram, registry, eBay EPN + Amazon Associates builders	2026-04-04 18:12:45 -07:00
pyr0ball	d719ea2309	feat: preferences public helpers — get_user_preference / set_user_preference (closes #22 self-hosted)	2026-04-04 18:10:24 -07:00
pyr0ball	0d9d030320	feat: preferences LocalFileStore — YAML-backed single-user preference store	2026-04-04 18:07:35 -07:00
pyr0ball	9ee31a09c1	feat: preferences dot-path utilities (get_path, set_path)	2026-04-04 18:04:44 -07:00
pyr0ball	e6cd3a2e96	chore: sync __version__ to 0.6.0 (matches pyproject.toml)	2026-04-03 16:48:11 -07:00
pyr0ball	3deae056de	feat: local-first LLM config + hosted coordinator auth LLMRouter env-var auto-config: - No llm.yaml required — auto-configures from ANTHROPIC_API_KEY, OPENAI_API_KEY, or OLLAMA_HOST on first use - Bare-metal self-hosters can run any CF product with just env vars - Falls back to FileNotFoundError with actionable message only when no env vars are set either CFOrchClient auth: - Reads CF_LICENSE_KEY env var (or explicit api_key param) - Sends Authorization: Bearer <key> on all allocation/release requests - Required for the hosted public coordinator; no-op for local deployments HeimdallAuthMiddleware (new): - FastAPI middleware for cf-orch coordinator - Enabled by HEIMDALL_URL env var; self-hosted deployments skip it - 5-min TTL cache (matching Kiwi cloud session) keeps Heimdall off the per-allocation hot path - /api/health exempt; free-tier keys rejected with 403 + reason - 13 tests covering cache TTL, tier ranking, and middleware gating	2026-04-03 08:32:15 -07:00
pyr0ball	8d87ed4c9f	feat: manage.py cross-platform product manager (closes #6) - circuitforge_core.manage module — replaces bash-only manage.sh - config.py: ManageConfig from manage.toml (TOML via tomllib/tomli) app name, default_url, docker compose_file/project, native services Falls back to directory name when no manage.toml present - docker_mode.py: DockerManager wrapping 'docker compose' (v2 plugin) or 'docker-compose' (v1 fallback); docker_available() probe Commands: start, stop, restart, status, logs, build - native_mode.py: NativeManager with PID file process management platformdirs for platform-appropriate PID/log paths Windows-compatible log tailing (polling, no tail -f) Cross-platform kill: SIGTERM→SIGKILL on Unix, taskkill /F on Windows - cli.py: typer CLI — start/stop/restart/status/logs/build/open/install-shims Mode auto-detection: Docker available + compose file → docker; else native --mode docker|native|auto override - templates/manage.sh: bash shim (conda, venv, python3 detection) - templates/manage.ps1: PowerShell shim (same detection, Windows) - templates/manage.toml.example: annotated config template - __main__.py: python -m circuitforge_core.manage entry point - pyproject.toml: manage extras group (platformdirs, typer) cf-manage console script; version bumped to 0.5.0 - 36 tests: config (6), docker_mode (9), native_mode (21)	2026-04-02 23:04:35 -07:00
pyr0ball	7bb6b76bd5	feat: ollama adopt-if-running + health_path in ProcessSpec (#16) - ProcessSpec: adopt (bool) and health_path (str, default /health) fields - ServiceManager: adopt=True probes health_path before spawning; is_running() uses health probe for adopt services rather than proc table + socket check - _probe_health() helper: urllib GET on localhost:port+path, returns bool - Agent /services/{service}/start: returns adopted=True when service was already running; coordinator sets state=running immediately (no probe wait) - ServiceInstance: health_path field (default /health) - service_registry.upsert_instance(): health_path kwarg - Probe loop uses inst.health_path instead of hardcoded /health - coordinator allocate_service: looks up health_path from profile spec via _get_health_path() and stores on ServiceInstance - All GPU profiles (2/4/6/8/16/24 GB + cpu-16/32): ollama managed block with adopt=true, health_path=/api/tags, port 11434 - 11 new tests	2026-04-02 22:09:42 -07:00
pyr0ball	a54a530493	feat: agent watchdog — persist known nodes + auto-reconnect after coordinator restart closes #15 - NodeStore: SQLite persistence for known agent nodes (~/.local/share/circuitforge/cf-orch-nodes.db) - upsert on every register(); prune_stale() for 30-day cleanup - survives coordinator restarts — data readable by next process - AgentSupervisor.restore_from_store(): reload known nodes on startup, mark all offline; heartbeat loop brings back any that respond - AgentSupervisor.register(): persists to NodeStore on every call - cli.py coordinator: NodeStore wired in; restore_from_store() called before uvicorn starts - cli.py agent: one-shot registration replaced with persistent reconnect loop (daemon thread, 30 s interval) — coordinator restart → nodes reappear within one cycle with no manual intervention on agent hosts - 16 new tests: NodeStore (8) + AgentSupervisor watchdog (8)	2026-04-02 22:01:55 -07:00
pyr0ball	cd9864b5e8	feat: hardware detection, cf-docuvision service, documents ingestion pipeline Closes #5, #7, #8, #13 ## hardware module (closes #5) - HardwareSpec, LLMBackendConfig, LLMConfig dataclasses - VramTier ladder (CPU / 2 / 4 / 6 / 8 / 16 / 24 GB) with select_tier() - generate_profile() maps HardwareSpec → LLMConfig for llm.yaml generation - detect_hardware() with nvidia-smi / rocm-smi / system_profiler / cpu fallback - 31 tests across tiers, generator, and detect ## cf-docuvision service (closes #8) - FastAPI service wrapping ByteDance/Dolphin-v2 (Qwen2.5-VL backbone) - POST /extract: image_b64 or image_path + hint → ExtractResponse - Lazy model loading; JSON-structured output with plain-text fallback - ProcessSpec managed blocks added to all four GPU profiles (6/8/16/24 GB) - 14 tests ## documents module (closes #7) - StructuredDocument, Element, ParsedTable dataclasses (frozen, composable) - DocuvisionClient: thin HTTP client for cf-docuvision POST /extract - ingest(): primary cf-docuvision path → LLMRouter vision fallback → empty doc - CF_DOCUVISION_URL env var for URL override - 22 tests ## coordinator probe loop (closes #13) - _run_instance_probe_loop: starting → running on 200; starting → stopped on timeout - 4 async tests with CancelledError-based tick control	2026-04-02 18:53:25 -07:00
pyr0ball	a7290c1240	feat(orch): background health probe loop — starting → running transition Coordinator now polls all 'starting' instances every 5 s via GET /health. On 200: state → running. After 300 s without a healthy response: state → stopped. Closes #10.	2026-04-02 17:18:16 -07:00
pyr0ball	bd132851ec	fix(orch): tighten VRAM pre-flight to require full max_mb free (not half) max_mb // 2 was too loose — Qwen2.5-3B needs ~5.9 GB on an 8 GB card but the threshold only required 3.25 GB free, allowing Ollama to hold 4.5 GB while a load attempt was still dispatched (causing OOM crash). - node_selector: can_fit = free_mb >= service_max_mb (was // 2) - coordinator /start: same threshold fix + updated error message - tests: two new node_selector tests pin the full-ceiling semantics; updated stale docstring in coordinator app test	2026-04-02 16:44:36 -07:00
pyr0ball	2d095f0090	fix(llm-server): handle transformers 5.x BatchEncoding; use dtype kwarg - apply_chat_template() returns BatchEncoding in transformers 5.x (not bare tensor); extract .input_ids explicitly with fallback for 4.x compat - Switch from deprecated torch_dtype= to dtype= in from_pretrained()	2026-04-02 16:36:07 -07:00
pyr0ball	c78341fc6f	feat(orch): replace Ouro/vllm-Docker with generic HF inference server; add ProcessSpec - Add circuitforge_core/resources/inference/llm_server.py: generic OpenAI-compatible FastAPI server for any HuggingFace causal LM (Phi-4-mini-instruct, Qwen2.5-3B-Instruct) - Add service_manager.py + service_probe.py: ProcessSpec start/stop/is_running support (Popen-based; socket probe confirms readiness before marking running) - Update all 4 public GPU profiles to use ProcessSpec→llm_server instead of Docker vllm: 6gb (max_mb 5500), 8gb (max_mb 6500), 16gb/24gb (max_mb 9000) - Model candidates: Phi-4-mini-instruct first (7.2GB), Qwen2.5-3B-Instruct fallback (5.8GB) - Remove ouro_server.py (Ouro incompatible with transformers 5.x; vllm Docker also incompatible) - Add 17 tests for ServiceManager ProcessSpec (start/stop/is_running/list/get_url)	2026-04-02 15:33:08 -07:00
pyr0ball	27999925cf	fix(orch): seed ServiceInstance on first allocate start	2026-04-02 14:22:55 -07:00
pyr0ball	e58c3aea23	fix: TTL sweep, immutability, service-scoped release, logger in orch alloc - ServiceRegistry: add sweep_expired_allocations() to remove stale TTL allocations and transition instances to idle; add get_allocation() helper - AgentSupervisor._run_idle_sweep: call sweep_expired_allocations() before idle-timeout check so crashed-caller leaks are cleaned up each sweep tick - schema._parse_managed: copy raw dict before extracting 'type' key instead of mutating caller's dict with pop() - app.release_allocation: validate allocation belongs to the given service path param before releasing; return 404 if mismatch - router._try_cf_orch_alloc: replace print() with logger.warning(); add module-level logger = logging.getLogger(__name__) - tests: add test_sweep_expired_allocations covering TTL expiry and idle state transition	2026-04-02 12:55:38 -07:00
pyr0ball	02806359af	feat: add Services table to coordinator dashboard	2026-04-02 12:47:27 -07:00
pyr0ball	a4ccaaf3e2	fix: address coordinator/idle-sweep quality issues from review - CRITICAL: idle sweep now calls mark_stopped() after successful HTTP stop, preventing repeated stop POSTs on every 3rd tick for the same instance - CRITICAL: active_allocations() now filters by gpu_id to avoid marking wrong instance idle on multi-GPU nodes when an allocation is released - CRITICAL: VRAM pre-flight guard in ensure_service was dead code — added the actual HTTPException(503) before the candidate loop - IMPORTANT: register() now updates agent_url on re-registration if it changed, so relocated agents are tracked correctly - IMPORTANT: updated test_service_registry.py callers of active_allocations() to pass the now-required gpu_id argument	2026-04-02 12:45:31 -07:00
pyr0ball	49ab9e4e88	feat: wire ServiceRegistry into coordinator allocate endpoints	2026-04-02 12:30:58 -07:00
pyr0ball	c299482e0d	feat: add idle sweep to AgentSupervisor	2026-04-02 12:30:28 -07:00
pyr0ball	1e168ac636	feat(profiles): add idle_stop_after_s field; set 600s for vllm slot Add idle_stop_after_s to ServiceProfile (default 0 = never stop). Set 600s (10 min) timeout on vllm slot in all single-GPU profiles. Backward compatible; non-vllm services inherit default 0 (no auto-stop).	2026-04-02 12:24:19 -07:00
pyr0ball	9754f522d9	feat(orch): add ServiceRegistry — allocation tracking + idle state machine	2026-04-02 12:22:46 -07:00
pyr0ball	17a24173f7	feat(llm): add cf_orch allocation support to LLMRouter backends	2026-04-02 12:19:17 -07:00
pyr0ball	f741e6a80b	fix(orch): hoist service-known check; capture resident_keys once in allocate	2026-04-02 11:45:48 -07:00
pyr0ball	defaf39883	feat(core): add CFOrchClient sync+async context manager Implements CFOrchClient with allocate() (sync contextmanager) and allocate_async() (async contextmanager) for cf-orch GPU resource allocation. Releases allocation on exit; ignores 404 on release; raises RuntimeError on non-2xx allocation response. Exports CFOrchClient and Allocation from circuitforge_core.resources. Note: async test uses unittest.mock rather than httpretty — httpretty only patches stdlib sockets and does not intercept httpx async (anyio) transport.	2026-04-02 11:44:35 -07:00
pyr0ball	8201f6b3e9	feat(orch): add /api/services/{service}/allocate with auto node selection	2026-04-02 11:25:38 -07:00
pyr0ball	52d2c5cf38	feat(orch): expose online_agents() and resident_keys() helpers	2026-04-02 11:22:29 -07:00
pyr0ball	d600fb6651	refactor(orch): hoist service_max_mb lookup; clarify warm-fallback comments	2026-04-02 11:21:20 -07:00
pyr0ball	13eb0c85f1	feat(orch): add NodeSelector — warm-first GPU scoring	2026-04-02 11:18:44 -07:00
pyr0ball	aa51794f45	fix(scheduler): join batch worker threads in shutdown() Previously shutdown() only joined the scheduler loop thread. Batch worker threads (which decrement _reserved_vram in their finally block) could still be running when shutdown returned, leaving stale VRAM accounting. Now snapshots active workers under lock and joins them all. Snapshot-then-join pattern avoids holding the lock across blocking join calls (which would deadlock since workers acquire the same lock on exit).	2026-04-01 11:21:30 -07:00
pyr0ball	6b8e421eb2	feat(scheduler): acquire/release cf-orch VRAM lease per batch worker Before running a batch of tasks, the scheduler now requests a VRAM lease from the cf-orch coordinator (POST /api/leases). The lease is held for the full batch and released in the finally block so it's always cleaned up even on error. Falls back gracefully if the coordinator is unreachable. Adds coordinator_url and service_name params to TaskScheduler.__init__ and get_scheduler() so callers can override the default localhost:7700.	2026-04-01 11:06:16 -07:00
pyr0ball	67701f0d29	feat(orch): agent self-registration and coordinator heartbeat loop coordinator/app.py: - Add POST /api/nodes — agents POST {node_id, agent_url} to self-register; coordinator immediately polls the new agent for GPU info - Add lifespan context manager that starts/stops AgentSupervisor heartbeat loop (previously the loop was never started) cli.py start: - Add --node-id flag (default 'local') - Pre-register the local agent URL (http://127.0.0.1:{agent_port}) so the heartbeat loop can poll it immediately on startup - Drop redundant lease_manager.register_gpu() call — supervisor.poll_agent() now does this via the heartbeat after the agent responds cli.py agent: - Add --advertise-host flag for NATted/multi-homed nodes - Fire registration POST to coordinator in a daemon thread (2s delay) so uvicorn.run() can start binding immediately; no double uvicorn.run()	2026-03-31 19:20:35 -07:00
pyr0ball	7aa0ad7a51	feat(dashboard): add self-hosted coordinator dashboard at GET / - dashboard.html: node-centric layout — GPU cards with VRAM bars and sparklines, active leases table with TTL progress bars, service health pill, auto-refreshes every 5s via fetch() against the local JSON API - All dynamic content set via DOM textContent / createElementNS — no innerHTML with user-sourced strings - coordinator/app.py: serves dashboard.html at GET / (HTMLResponse, excluded from OpenAPI schema); HTML read at import time from package dir - test_dashboard_serves_html: verifies 200, content-type text/html, and key route markers present	2026-03-31 18:57:25 -07:00
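
The retry-with-removal approach described in commit 48d33a78ef can be sketched roughly as follows. This is not the repository's code: the names run_migration and _strip_add_column are hypothetical, and the regex used to strip a statement is a simplification that assumes the ADD COLUMN statement itself contains no semicolon. Only the 20-attempt cap, the use of executescript(), and the 'duplicate column name' trigger come from the commit message.

```python
import re
import sqlite3

MAX_ATTEMPTS = 20  # cap mentioned in the commit message


def _strip_add_column(script: str, column: str) -> str:
    """Drop the first 'ALTER TABLE ... ADD COLUMN <column> ...;' statement.

    Hypothetical helper: assumes the statement contains no ';' of its own.
    """
    pattern = re.compile(
        rf"ALTER\s+TABLE\s+\S+\s+ADD\s+COLUMN\s+{re.escape(column)}\b[^;]*;",
        re.IGNORECASE,
    )
    return pattern.sub("", script, count=1)


def run_migration(conn: sqlite3.Connection, script: str) -> int:
    """Run the script, stripping already-applied ADD COLUMN statements.

    Returns how many statements had to be stripped before it succeeded.
    """
    stripped = 0
    for _ in range(MAX_ATTEMPTS):
        try:
            # executescript() lets SQLite tokenize the script, so a ';'
            # inside a comment or string literal is not a statement boundary.
            conn.executescript(script)
            return stripped
        except sqlite3.OperationalError as exc:
            match = re.search(r"duplicate column name: (\w+)", str(exc))
            if match is None:
                raise  # any other error propagates unchanged
            shorter = _strip_add_column(script, match.group(1))
            if shorter == script:
                raise  # nothing could be removed; avoid a pointless retry
            script = shorter
            stripped += 1
    raise RuntimeError(f"migration still failing after {MAX_ATTEMPTS} attempts")
```

Because each retry re-runs the script from the top, the sketch only suits scripts whose remaining statements are safe to repeat (for example CREATE TABLE IF NOT EXISTS); a statement that succeeded on an earlier attempt and is not idempotent would need its own handling.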

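The VRAM pre-flight change in commit bd132851ec comes down to one comparison. The sketch below contrasts the old and new rules using the figures from the log: max_mb 6500 for the 8 GB profile, and Ollama holding 4.5 GB. The function names are hypothetical, and treating an 8 GB card as exactly 8192 MB is an assumption made only for the arithmetic.

```python
def can_fit_half(free_mb: int, service_max_mb: int) -> bool:
    """Old rule: only half of the service's VRAM ceiling had to be free."""
    return free_mb >= service_max_mb // 2


def can_fit_full(free_mb: int, service_max_mb: int) -> bool:
    """New rule: the full ceiling must be free before a load is dispatched."""
    return free_mb >= service_max_mb


# Assumed figures: 8192 MB card, 4608 MB (4.5 GB) already held by Ollama.
FREE_MB = 8192 - 4608      # 3584 MB free
SERVICE_MAX_MB = 6500      # 8 GB profile ceiling from the log

old_decision = can_fit_half(FREE_MB, SERVICE_MAX_MB)  # True: 3584 >= 3250
new_decision = can_fit_full(FREE_MB, SERVICE_MAX_MB)  # False: 3584 < 6500
```

Under the old rule the node is accepted even though the model cannot actually fit, which matches the OOM scenario the commit describes; under the new rule the same node is rejected.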

77 commits
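
The snapshot-then-join pattern from commit aa51794f45 can be illustrated with a small stand-in class. BatchScheduler and its attributes are hypothetical and far simpler than the real TaskScheduler; the sketch only shows why the worker list is copied under the lock and joined outside it.

```python
import threading


class BatchScheduler:
    """Hypothetical stand-in that tracks workers and reserved VRAM."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._workers = set()
        self._reserved_vram = 0

    def submit(self, vram_mb: int, work) -> None:
        def _run() -> None:
            try:
                work()
            finally:
                # Workers take the same lock on exit to release their VRAM.
                with self._lock:
                    self._reserved_vram -= vram_mb
                    self._workers.discard(threading.current_thread())

        worker = threading.Thread(target=_run, daemon=True)
        with self._lock:
            self._reserved_vram += vram_mb
            self._workers.add(worker)
        worker.start()

    def shutdown(self) -> None:
        # Snapshot under the lock, then release it before joining.
        # Joining while holding the lock would deadlock, because each
        # worker needs the lock in its finally block to finish.
        with self._lock:
            workers = list(self._workers)
        for worker in workers:
            worker.join()
```

After shutdown() returns, every worker that was active at the time of the snapshot has run its finally block, so the reserved-VRAM counter no longer includes their reservations.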