Adds make_feedback_router(repo, product, demo_mode_fn) which returns a
FastAPI APIRouter with GET /status and POST / endpoints. Handles Forgejo
label creation/reuse, issue body assembly (including repro steps for bugs),
demo mode gating, and FORGEJO_API_TOKEN presence checks. 12 tests covering
all status/submit paths, mock Forgejo interaction, and body content assertions.
Also adds fastapi>=0.110 and httpx>=0.27 to [dev] optional deps.
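A minimal sketch of the factory shape described above, assuming FastAPI; the Forgejo label handling and issue-body assembly are elided, and the handler internals (including the specific error status codes) are illustrative.

    import os
    from typing import Callable

    from fastapi import APIRouter, HTTPException


    def make_feedback_router(repo: str, product: str,
                             demo_mode_fn: Callable[[], bool]) -> APIRouter:
        """Build a feedback router bound to one Forgejo repo and product."""
        router = APIRouter()

        @router.get("/status")
        def status() -> dict:
            # Submission is possible only outside demo mode and with a token set.
            return {
                "demo_mode": demo_mode_fn(),
                "token_configured": bool(os.environ.get("FORGEJO_API_TOKEN")),
            }

        @router.post("/")
        def submit(payload: dict) -> dict:
            if demo_mode_fn():
                raise HTTPException(status_code=403, detail="disabled in demo mode")
            if not os.environ.get("FORGEJO_API_TOKEN"):
                raise HTTPException(status_code=503, detail="FORGEJO_API_TOKEN not set")
            # Real implementation: create or reuse labels, assemble the issue
            # body (with repro steps for bugs), and POST it to Forgejo.
            return {"ok": True, "repo": repo, "product": product}

        return router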
BREAKING CHANGE: circuitforge_core.resources is no longer available.
Import CFOrchClient from circuitforge_orch.client instead.
cf-orch CLI entry point is now in the circuitforge-orch package.
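The import-side migration is a one-line change:

    # Before (circuitforge_core.resources is gone):
    # from circuitforge_core.resources import CFOrchClient

    # After:
    from circuitforge_orch.client import CFOrchClient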
LLMRouter env-var auto-config:
- No llm.yaml required — auto-configures from ANTHROPIC_API_KEY,
OPENAI_API_KEY, or OLLAMA_HOST on first use
- Bare-metal self-hosters can run any CF product with just env vars
- Falls back to FileNotFoundError with actionable message only when
no env vars are set either
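A sketch of the fallback order; only the environment variable names and the FileNotFoundError come from the entry above, the returned config shape is illustrative.

    import os


    def _auto_config() -> dict:
        """Derive a minimal LLM router config from environment variables."""
        # Checked in order; the first provider with credentials wins.
        if os.environ.get("ANTHROPIC_API_KEY"):
            return {"provider": "anthropic", "api_key": os.environ["ANTHROPIC_API_KEY"]}
        if os.environ.get("OPENAI_API_KEY"):
            return {"provider": "openai", "api_key": os.environ["OPENAI_API_KEY"]}
        if os.environ.get("OLLAMA_HOST"):
            return {"provider": "ollama", "host": os.environ["OLLAMA_HOST"]}
        raise FileNotFoundError(
            "No llm.yaml found and none of ANTHROPIC_API_KEY, OPENAI_API_KEY, "
            "or OLLAMA_HOST are set; create llm.yaml or export one of them."
        )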
CFOrchClient auth:
- Reads CF_LICENSE_KEY env var (or explicit api_key param)
- Sends Authorization: Bearer <key> on all allocation/release requests
- Required for the hosted public coordinator; no-op for local deployments
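A sketch of how the bearer header might be attached; the constructor arguments and the allocate endpoint path are illustrative, only CF_LICENSE_KEY, the api_key override, and the Authorization header come from the entry.

    import os

    import httpx


    class CFOrchClient:
        def __init__(self, base_url: str, api_key: str | None = None):
            # Explicit api_key wins; otherwise fall back to CF_LICENSE_KEY.
            key = api_key or os.environ.get("CF_LICENSE_KEY")
            headers = {"Authorization": f"Bearer {key}"} if key else {}
            # No key means no header, which is fine for local deployments;
            # the hosted coordinator rejects unauthenticated requests.
            self._http = httpx.Client(base_url=base_url, headers=headers)

        def allocate(self, service: str) -> dict:
            # Endpoint path is illustrative.
            resp = self._http.post(f"/api/services/{service}/allocate")
            resp.raise_for_status()
            return resp.json()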
HeimdallAuthMiddleware (new):
- FastAPI middleware for cf-orch coordinator
- Enabled by HEIMDALL_URL env var; self-hosted deployments skip it
- 5-min TTL cache (matching Kiwi cloud session) keeps Heimdall off the
per-allocation hot path
- /api/health exempt; free-tier keys rejected with 403 + reason
- 13 tests covering cache TTL, tier ranking, and middleware gating
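A rough sketch of the gating and cache shape, assuming a Starlette-style HTTP middleware; the Heimdall verify endpoint, its response fields, the tier value, and the rejection message are placeholders.

    import time

    import httpx
    from starlette.middleware.base import BaseHTTPMiddleware
    from starlette.requests import Request
    from starlette.responses import JSONResponse

    CACHE_TTL_S = 300  # 5 minutes, matching the Kiwi cloud session


    class HeimdallAuthMiddleware(BaseHTTPMiddleware):
        def __init__(self, app, heimdall_url: str):
            super().__init__(app)
            self.heimdall_url = heimdall_url
            self._cache: dict[str, tuple[float, dict]] = {}  # key -> (expiry, verdict)

        async def dispatch(self, request: Request, call_next):
            if request.url.path == "/api/health":
                return await call_next(request)  # health endpoint is exempt

            key = request.headers.get("Authorization", "")
            now = time.monotonic()
            cached = self._cache.get(key)
            if cached and cached[0] > now:
                verdict = cached[1]
            else:
                # Ask Heimdall once, then reuse the verdict for CACHE_TTL_S so
                # the per-allocation hot path does not hit Heimdall every time.
                async with httpx.AsyncClient() as client:
                    resp = await client.get(f"{self.heimdall_url}/verify",
                                            headers={"Authorization": key})
                verdict = resp.json()
                self._cache[key] = (now + CACHE_TTL_S, verdict)

            if verdict.get("tier") == "free":
                return JSONResponse(
                    {"detail": "free-tier keys cannot use the hosted coordinator"},
                    status_code=403,
                )
            return await call_next(request)

The middleware would only be installed when HEIMDALL_URL is set, so self-hosted deployments never pay for the check.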
- ProcessSpec: adopt (bool) and health_path (str, default /health) fields
- ServiceManager: adopt=True probes health_path before spawning; is_running()
uses the health probe for adopted services rather than the proc table + socket check
- _probe_health() helper: urllib GET on localhost:port+path, returns bool
(see the sketch after this entry)
- Agent /services/{service}/start: returns adopted=True when service was
already running; coordinator sets state=running immediately (no probe wait)
- ServiceInstance: health_path field (default /health)
- service_registry.upsert_instance(): health_path kwarg
- Probe loop uses inst.health_path instead of hardcoded /health
- coordinator allocate_service: looks up health_path from profile spec via
_get_health_path() and stores on ServiceInstance
- All GPU profiles (2/4/6/8/16/24 GB + cpu-16/32): ollama managed block
with adopt=true, health_path=/api/tags, port 11434
- 11 new tests
Closes #15.
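A minimal sketch of the probe helper, assuming a plain urllib GET with a short timeout:

    import urllib.error
    import urllib.request


    def _probe_health(port: int, path: str = "/health", timeout_s: float = 2.0) -> bool:
        """Return True if localhost:<port><path> answers with a 2xx status."""
        url = f"http://127.0.0.1:{port}{path}"
        try:
            with urllib.request.urlopen(url, timeout=timeout_s) as resp:
                return 200 <= resp.status < 300
        except (urllib.error.URLError, OSError):
            return False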
- NodeStore: SQLite persistence for known agent nodes
(~/.local/share/circuitforge/cf-orch-nodes.db)
- upsert on every register(); prune_stale() for 30-day cleanup
- survives coordinator restarts — data readable by next process
- AgentSupervisor.restore_from_store(): reload known nodes on startup,
mark all offline; heartbeat loop brings back any that respond
- AgentSupervisor.register(): persists to NodeStore on every call
- cli.py coordinator: NodeStore wired in; restore_from_store() called
before uvicorn starts
- cli.py agent: one-shot registration replaced with a persistent reconnect
loop (daemon thread, 30 s interval), so after a coordinator restart nodes
reappear within one cycle with no manual intervention on agent hosts
- 16 new tests: NodeStore (8) + AgentSupervisor watchdog (8)
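A sketch of the persistence shape; the table schema and the helper methods beyond upsert/prune_stale are illustrative, while the DB path and the 30-day prune come from the entry.

    import sqlite3
    import time
    from pathlib import Path

    DB_PATH = Path.home() / ".local/share/circuitforge/cf-orch-nodes.db"


    class NodeStore:
        """SQLite-backed store of known agent nodes; survives coordinator restarts."""

        def __init__(self, path: Path = DB_PATH):
            path.parent.mkdir(parents=True, exist_ok=True)
            self._db = sqlite3.connect(path)
            self._db.execute(
                "CREATE TABLE IF NOT EXISTS nodes ("
                " node_id TEXT PRIMARY KEY, agent_url TEXT, last_seen REAL)"
            )

        def upsert(self, node_id: str, agent_url: str) -> None:
            # Called from AgentSupervisor.register() on every registration.
            self._db.execute(
                "INSERT INTO nodes (node_id, agent_url, last_seen) VALUES (?, ?, ?)"
                " ON CONFLICT(node_id) DO UPDATE SET agent_url=excluded.agent_url,"
                " last_seen=excluded.last_seen",
                (node_id, agent_url, time.time()),
            )
            self._db.commit()

        def prune_stale(self, max_age_s: float = 30 * 24 * 3600) -> int:
            """Drop nodes not seen for 30 days; returns the number removed."""
            cur = self._db.execute(
                "DELETE FROM nodes WHERE last_seen < ?", (time.time() - max_age_s,)
            )
            self._db.commit()
            return cur.rowcount

        def all_nodes(self) -> list[tuple[str, str]]:
            return list(self._db.execute("SELECT node_id, agent_url FROM nodes"))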
Coordinator now polls all 'starting' instances every 5 s via GET /health.
On 200: state → running. After 300 s without a healthy response: state →
stopped. Closes #10.
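A sketch of the poll loop; the registry and instance attribute names are illustrative, only the 5 s interval, the 300 s deadline, the /health path, and the state names come from the entry.

    import asyncio
    import time

    import httpx

    POLL_INTERVAL_S = 5
    STARTUP_DEADLINE_S = 300


    async def poll_starting_instances(registry) -> None:
        """Promote 'starting' instances to 'running' once /health answers 200,
        or mark them 'stopped' after 300 s without a healthy response."""
        while True:
            for inst in registry.instances(state="starting"):
                try:
                    async with httpx.AsyncClient(timeout=3.0) as client:
                        resp = await client.get(f"{inst.base_url}/health")
                    if resp.status_code == 200:
                        registry.set_state(inst.instance_id, "running")
                        continue
                except httpx.HTTPError:
                    pass
                if time.time() - inst.started_at > STARTUP_DEADLINE_S:
                    registry.set_state(inst.instance_id, "stopped")
            await asyncio.sleep(POLL_INTERVAL_S)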
The max_mb // 2 threshold was too loose: Qwen2.5-3B needs ~5.9 GB on an 8 GB
card, but the check only required 3.25 GB free, so a load attempt could still
be dispatched while Ollama held 4.5 GB, causing an OOM crash.
- node_selector: can_fit = free_mb >= service_max_mb (was // 2)
- coordinator /start: same threshold fix + updated error message
- tests: two new node_selector tests pin the full-ceiling semantics;
updated stale docstring in coordinator app test
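The predicate change in isolation, with the old behaviour noted for contrast (variable names as in the entry):

    def can_fit(free_mb: int, service_max_mb: int) -> bool:
        # Old check: free_mb >= service_max_mb // 2. With a ~6.5 GB declared
        # ceiling that required only 3.25 GB free, so a load could be
        # dispatched while Ollama still held 4.5 GB.
        # New check: the full declared ceiling must be free.
        return free_mb >= service_max_mb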
- apply_chat_template() returns BatchEncoding in transformers 5.x (not bare tensor);
extract .input_ids explicitly with fallback for 4.x compat
- Switch from deprecated torch_dtype= to dtype= in from_pretrained()
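A sketch of the compatibility shim; the model name is a placeholder and the surrounding inference code is illustrative.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    MODEL = "Qwen/Qwen2.5-3B-Instruct"  # placeholder

    tokenizer = AutoTokenizer.from_pretrained(MODEL)
    # dtype= replaces the deprecated torch_dtype= keyword.
    model = AutoModelForCausalLM.from_pretrained(MODEL, dtype=torch.float16)

    messages = [{"role": "user", "content": "Hello"}]
    encoded = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    )
    # transformers 5.x returns a BatchEncoding here; 4.x returns a bare tensor.
    input_ids = encoded.input_ids if hasattr(encoded, "input_ids") else encoded
    print(input_ids.shape)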
- ServiceRegistry: add sweep_expired_allocations() to remove stale TTL
allocations and transition instances to idle (sketched after this list);
add get_allocation() helper
- AgentSupervisor._run_idle_sweep: call sweep_expired_allocations() before
idle-timeout check so crashed-caller leaks are cleaned up each sweep tick
- schema._parse_managed: copy raw dict before extracting 'type' key instead
of mutating caller's dict with pop()
- app.release_allocation: validate allocation belongs to the given service
path param before releasing; return 404 if mismatch
- router._try_cf_orch_alloc: replace print() with logger.warning(); add
module-level logger = logging.getLogger(__name__)
- tests: add test_sweep_expired_allocations covering TTL expiry and idle
state transition
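A sketch of the sweep as a ServiceRegistry method, assuming allocations carry an expiry timestamp and instances hold a set of live allocation ids; the field names are illustrative.

    import time


    def sweep_expired_allocations(self) -> list[str]:
        """Remove allocations whose TTL has passed; instances left with no live
        allocations transition back to 'idle'. Returns removed allocation ids."""
        now = time.time()
        removed = []
        for alloc_id, alloc in list(self._allocations.items()):
            if alloc.expires_at <= now:
                del self._allocations[alloc_id]
                removed.append(alloc_id)
                inst = self._instances.get(alloc.instance_id)
                if inst is not None:
                    inst.allocation_ids.discard(alloc_id)
                    if not inst.allocation_ids:
                        inst.state = "idle"
        return removed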
- CRITICAL: idle sweep now calls mark_stopped() after successful HTTP stop,
preventing repeated stop POSTs on every 3rd tick for the same instance
- CRITICAL: active_allocations() now filters by gpu_id to avoid marking the
wrong instance idle on multi-GPU nodes when an allocation is released (see
the sketch after this list)
- CRITICAL: VRAM pre-flight guard in ensure_service was dead code — added the
actual HTTPException(503) before the candidate loop
- IMPORTANT: register() now updates agent_url on re-registration if it changed,
so relocated agents are tracked correctly
- IMPORTANT: updated test_service_registry.py callers of active_allocations()
to pass the now-required gpu_id argument
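A sketch of the filter as a ServiceRegistry method; the parameter set and field names are illustrative, only the now-required gpu_id argument comes from the entry.

    def active_allocations(self, node_id: str, gpu_id: int) -> list:
        """Allocations still live on one specific GPU of one node.

        Filtering on gpu_id keeps a release on one GPU from making another
        GPU's instance look idle on multi-GPU nodes.
        """
        return [
            alloc for alloc in self._allocations.values()
            if alloc.node_id == node_id and alloc.gpu_id == gpu_id
        ]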
Add idle_stop_after_s to ServiceProfile (default 0 = never stop).
Set a 600 s (10 min) timeout on the vllm slot in all single-GPU profiles.
Backward compatible; non-vllm services inherit default 0 (no auto-stop).
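A sketch of the new field, assuming ServiceProfile is a dataclass and omitting its other fields:

    from dataclasses import dataclass


    @dataclass
    class ServiceProfile:
        name: str
        # Seconds of idleness before the coordinator stops the service.
        # 0 (the default) means never auto-stop, so existing profiles and
        # non-vllm services are unchanged.
        idle_stop_after_s: int = 0


    # Single-GPU profiles give the vllm slot a 10-minute idle timeout:
    vllm_slot = ServiceProfile(name="vllm", idle_stop_after_s=600)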