Commit graph

62 commits

Author SHA1 Message Date
4ac99403bd feat(community): add CommunityDB connection pool and migration runner 2026-04-12 17:21:39 -07:00
4c27cf4bd0 feat(community): add CommunityPost frozen dataclass with element snapshot schema 2026-04-12 17:19:24 -07:00
69a338bd98 feat(text): add OpenAI-compat /v1/chat/completions endpoint
Adds POST /v1/chat/completions to the cf-text FastAPI service so it can
be used as an openai_compat backend in LLMRouter without any router changes.
The endpoint accepts the standard OpenAI chat request format and returns
a standard chat.completion response.

4 tests added; all 36 text tests pass.
2026-04-12 17:04:58 -07:00
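The commit doesn't show the request shape, but "standard OpenAI chat request format" implies a body like the one below. A minimal sketch; the model name is an illustrative assumption, not taken from the commit.

```python
import json

def build_chat_request(model: str, user_message: str, stream: bool = False) -> dict:
    """Assemble a /v1/chat/completions request body in the standard OpenAI format."""
    return {
        "model": model,  # illustrative; cf-text's accepted model ids are not in the commit
        "messages": [{"role": "user", "content": user_message}],
        "stream": stream,
    }

payload = build_chat_request("local-gguf", "Hello")
body = json.dumps(payload)  # what an openai_compat client would POST
```

An openai_compat client would POST this body to the cf-text service's `/v1/chat/completions` route and receive a `chat.completion` object back.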
8c1daf3b6c feat: cf-vision managed service (#43)
SigLIP so400m-patch14-384 as default backend (classify + embed, ~1.4 GB VRAM).
VLM backend (moondream2, LLaVA, Qwen-VL, etc.) as callable alternative for
caption generation and VQA. Follows the same factory/Protocol/mock pattern
as cf-stt and cf-tts.

New module: circuitforge_core.vision
- backends/base.py  — VisionBackend Protocol, VisionResult, make_vision_backend()
- backends/mock.py  — MockVisionBackend (no GPU, deterministic)
- backends/siglip.py — SigLIPBackend: sigmoid zero-shot classify + L2 embed
- backends/vlm.py   — VLMBackend: AutoModelForVision2Seq caption + prompt classify
- __init__.py       — process singleton; classify(), embed(), caption(), make_backend()
- app.py            — FastAPI service (port 8006): /health /classify /embed /caption

Backend selection: CF_VISION_BACKEND=siglip|vlm, auto-detected from model path.
VLM backend: supports_embed=False, caption()/classify() only.
SigLIP backend: supports_caption=False, classify()/embed() only.

52 new tests, 385 total passing. Closes #43.
2026-04-09 06:53:43 -07:00
80b0d5fd34 feat: v0.9.0 — cf-text, pipeline crystallization engine, multimodal pipeline, a11y preferences
Closes #33, #37, #38, #41, #42.

## cf-text (closes #41)
- New module: `circuitforge_core.text` — direct local inference bypassing ollama/vllm
- Backends: llama.cpp (GGUF), transformers (HF), mock
- Auto-detects backend from file extension; CF_TEXT_BACKEND env override
- Optional 4-bit/8-bit quantisation via bitsandbytes (CF_TEXT_4BIT / CF_TEXT_8BIT)
- process-level singleton + per-request `make_backend()` path

## Pipeline crystallization engine (closes #33, #37)
- FPGA→ASIC model: LLM-discovered paths → deterministic workflows after N approvals
- `models.py`: PipelineRun (incl. review_duration_ms + output_modified per #37),
  CrystallizedWorkflow, Step, hash_input()
- `recorder.py`: append-only JSON run log under ~/.config/circuitforge/pipeline/
- `crystallizer.py`: threshold check, majority/most-recent step strategy,
  rubber-stamp warning (review_duration_ms < 5s triggers warnings.warn)
- `registry.py`: exact + fuzzy match, deactivate-without-delete, colon-safe filenames
- `executor.py`: deterministic steps with transparent LLM fallback

## Multimodal chunked pipeline (closes #42)
- `pipeline/multimodal.py`: cf-docuvision pages → cf-text streaming
- `run()` yields PageResult per page (progressive, no full-doc buffer)
- `stream()` yields (page_idx, token) tuples for token-level UI rendering
- `vram_serialise` flag + `swap_fn` hook for 8GB GPU VRAM management
- `prompt_fn` callback for product-specific prompt construction

## Accessibility preferences (closes #38)
- `preferences/accessibility.py`: PREF_REDUCED_MOTION, PREF_HIGH_CONTRAST,
  PREF_FONT_SIZE, PREF_SCREEN_READER with get/set helpers
- Exported from preferences package __init__

## LLM router fix
- cf-orch backends: skip reachability pre-check; allocation starts the service
- Static backends: reachability check remains in place
2026-04-08 23:17:18 -07:00
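The crystallizer's rubber-stamp check can be sketched like this: only the 5 s threshold, the `review_duration_ms` field, and the use of `warnings.warn` come from the commit; the function name and constant are illustrative.

```python
import warnings

RUBBER_STAMP_THRESHOLD_MS = 5_000  # reviews under 5 s look rubber-stamped

def check_review(review_duration_ms: int) -> None:
    """Warn when a PipelineRun approval looks too fast to be a real review."""
    if review_duration_ms < RUBBER_STAMP_THRESHOLD_MS:
        warnings.warn(
            f"review took {review_duration_ms} ms (<5 s): "
            "possible rubber-stamp approval",
            stacklevel=2,
        )

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    check_review(1_200)   # fast approval -> warning
    check_review(30_000)  # deliberate review -> silent

messages = [str(w.message) for w in caught]
```

Warning rather than raising matters here: a fast review should not block crystallization, only flag that the N-approval threshold may be less meaningful than it looks.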
f3bc4ac605 feat: add CF_LICENSE_KEY validation via Heimdall (closes #26)
Introduces circuitforge_core.config.license with validate_license() and
get_license_tier(). Both functions are safe to call when CF_LICENSE_KEY
is absent, returning free tier gracefully. Results are cached 30 min per
(key, product) pair. CF_LICENSE_URL env var overrides the default
Heimdall endpoint. Re-exports added to config.__init__. Existing
test_config.py moved into tests/test_config/ package to co-locate with
new test_license.py (10 tests; 204 total passing).
2026-04-05 21:16:57 -07:00
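The caching behaviour described above can be sketched as follows. The 30-minute TTL, the `(key, product)` cache key, and the graceful free-tier fallback come from the commit; the validator is injected here to keep the sketch self-contained, whereas the real code calls the Heimdall endpoint.

```python
import time

CACHE_TTL_S = 30 * 60
_cache: dict[tuple[str, str], tuple[float, str]] = {}  # (key, product) -> (timestamp, tier)

def get_license_tier(key, product: str, validate) -> str:
    """Return the license tier, caching Heimdall results 30 min per (key, product)."""
    if not key:                      # CF_LICENSE_KEY absent: free tier, no network call
        return "free"
    entry = _cache.get((key, product))
    now = time.monotonic()
    if entry and now - entry[0] < CACHE_TTL_S:
        return entry[1]              # fresh cache hit
    tier = validate(key, product)    # real code hits the Heimdall endpoint here
    _cache[(key, product)] = (now, tier)
    return tier

calls = []
def fake_validate(key, product):
    calls.append((key, product))
    return "pro"

tier1 = get_license_tier("abc", "cf-text", fake_validate)
tier2 = get_license_tier("abc", "cf-text", fake_validate)  # served from cache
free = get_license_tier(None, "cf-text", fake_validate)    # no key, no call
```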
f0a9ec5c37 fix: raise 502 on label creation failure; narrow subprocess exception scope 2026-04-05 17:36:52 -07:00
0a15ad9522 feat: add circuitforge_core.api.feedback — shared feedback router factory (closes #23)
Adds make_feedback_router(repo, product, demo_mode_fn) which returns a
FastAPI APIRouter with GET /status and POST / endpoints. Handles Forgejo
label creation/reuse, issue body assembly (including repro steps for bugs),
demo mode gating, and FORGEJO_API_TOKEN presence checks. 12 tests covering
all status/submit paths, mock Forgejo interaction, and body content assertions.
Also adds fastapi>=0.110 and httpx>=0.27 to [dev] optional deps.
2026-04-05 17:31:02 -07:00
c244260d1c feat!: strip resources/ from MIT core — moves to circuitforge-orch (v0.8.0)
BREAKING CHANGE: circuitforge_core.resources is no longer available.
Import CFOrchClient from circuitforge_orch.client instead.
cf-orch CLI entry point is now in the circuitforge-orch package.
2026-04-04 22:34:27 -07:00
2259382d0b refactor: replace coordinator-aware TaskScheduler with Protocol + LocalScheduler (MIT); update LLMRouter import path 2026-04-04 22:26:06 -07:00
090a86ce1b refactor: update LLMRouter lazy import — circuitforge_core.resources.client → circuitforge_orch.client 2026-04-04 22:16:17 -07:00
ccd2a35deb test: affiliates integration tests — full wrap_url round-trip 2026-04-04 18:28:27 -07:00
7837fbcad2 feat: affiliates router — wrap_url() with opt-out, BYOK, and CF env-var resolution 2026-04-04 18:20:21 -07:00
73cec07bd2 feat: affiliates disclosure — per-retailer tooltip copy + first-encounter banner constants 2026-04-04 18:14:58 -07:00
4c3f3a95a5 feat: affiliates programs — AffiliateProgram, registry, eBay EPN + Amazon Associates builders 2026-04-04 18:12:45 -07:00
d719ea2309 feat: preferences public helpers — get_user_preference / set_user_preference (closes #22 self-hosted) 2026-04-04 18:10:24 -07:00
0d9d030320 feat: preferences LocalFileStore — YAML-backed single-user preference store 2026-04-04 18:07:35 -07:00
9ee31a09c1 feat: preferences dot-path utilities (get_path, set_path) 2026-04-04 18:04:44 -07:00
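A plausible shape for the `get_path` / `set_path` utilities named above; the exact signatures (default handling, error behaviour) are assumptions.

```python
def get_path(data: dict, path: str, default=None):
    """Resolve 'a.b.c' against nested dicts, returning default on any miss."""
    node = data
    for part in path.split("."):
        if not isinstance(node, dict) or part not in node:
            return default
        node = node[part]
    return node

def set_path(data: dict, path: str, value) -> None:
    """Set 'a.b.c' in nested dicts, creating intermediate dicts as needed."""
    *parents, leaf = path.split(".")
    node = data
    for part in parents:
        node = node.setdefault(part, {})
    node[leaf] = value

prefs: dict = {}
set_path(prefs, "ui.font_size", 14)
size = get_path(prefs, "ui.font_size")
missing = get_path(prefs, "ui.theme", "dark")
```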
3deae056de feat: local-first LLM config + hosted coordinator auth
LLMRouter env-var auto-config:
- No llm.yaml required — auto-configures from ANTHROPIC_API_KEY,
  OPENAI_API_KEY, or OLLAMA_HOST on first use
- Bare-metal self-hosters can run any CF product with just env vars
- Falls back to FileNotFoundError with actionable message only when
  no env vars are set either

CFOrchClient auth:
- Reads CF_LICENSE_KEY env var (or explicit api_key param)
- Sends Authorization: Bearer <key> on all allocation/release requests
- Required for the hosted public coordinator; no-op for local deployments

HeimdallAuthMiddleware (new):
- FastAPI middleware for cf-orch coordinator
- Enabled by HEIMDALL_URL env var; self-hosted deployments skip it
- 5-min TTL cache (matching Kiwi cloud session) keeps Heimdall off the
  per-allocation hot path
- /api/health exempt; free-tier keys rejected with 403 + reason
- 13 tests covering cache TTL, tier ranking, and middleware gating
2026-04-03 08:32:15 -07:00
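The auto-config fallback order above can be sketched as a simple resolution function. The env-var names, precedence, and `FileNotFoundError` fallback come from the commit message; the returned backend names are illustrative.

```python
def autoconfig_backend(env: dict) -> str:
    """Pick an LLM backend from env vars when no llm.yaml is present."""
    if env.get("ANTHROPIC_API_KEY"):
        return "anthropic"
    if env.get("OPENAI_API_KEY"):
        return "openai"
    if env.get("OLLAMA_HOST"):
        return "ollama"
    raise FileNotFoundError(
        "no llm.yaml found and no ANTHROPIC_API_KEY / OPENAI_API_KEY / "
        "OLLAMA_HOST set — create llm.yaml or export one of these env vars"
    )

backend = autoconfig_backend({"OLLAMA_HOST": "http://localhost:11434"})
try:
    autoconfig_backend({})
    error = None
except FileNotFoundError as exc:
    error = str(exc)  # actionable message only when nothing is configured
```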
8d87ed4c9f feat: manage.py cross-platform product manager (closes #6)
- circuitforge_core.manage module — replaces bash-only manage.sh
  - config.py: ManageConfig from manage.toml (TOML via tomllib/tomli)
    app name, default_url, docker compose_file/project, native services
    Falls back to directory name when no manage.toml present
  - docker_mode.py: DockerManager wrapping 'docker compose' (v2 plugin)
    or 'docker-compose' (v1 fallback); docker_available() probe
    Commands: start, stop, restart, status, logs, build
  - native_mode.py: NativeManager with PID file process management
    platformdirs for platform-appropriate PID/log paths
    Windows-compatible log tailing (polling, no tail -f)
    Cross-platform kill: SIGTERM→SIGKILL on Unix, taskkill /F on Windows
  - cli.py: typer CLI — start/stop/restart/status/logs/build/open/install-shims
    Mode auto-detection: Docker available + compose file → docker; else native
    --mode docker|native|auto override
  - templates/manage.sh: bash shim (conda, venv, python3 detection)
  - templates/manage.ps1: PowerShell shim (same detection, Windows)
  - templates/manage.toml.example: annotated config template
  - __main__.py: python -m circuitforge_core.manage entry point

- pyproject.toml: manage extras group (platformdirs, typer)
  cf-manage console script; version bumped to 0.5.0

- 36 tests: config (6), docker_mode (9), native_mode (21)
2026-04-02 23:04:35 -07:00
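The cli.py mode auto-detection rule above reduces to a small decision function. The rule (Docker available plus a compose file means docker mode, else native, with a `--mode` override) comes from the commit; the probe is injected here, whereas the real `docker_available()` shells out to docker.

```python
from pathlib import Path

def detect_mode(compose_file, docker_available: bool, override: str = "auto") -> str:
    """Pick docker or native mode, mirroring the commit's auto-detection rule."""
    if override in ("docker", "native"):
        return override            # explicit --mode docker|native wins
    if docker_available and compose_file and Path(compose_file).exists():
        return "docker"
    return "native"

mode_no_compose = detect_mode(None, docker_available=True)           # no compose file
mode_forced = detect_mode(None, docker_available=False, override="docker")
```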
7bb6b76bd5 feat: ollama adopt-if-running + health_path in ProcessSpec (#16)
- ProcessSpec: adopt (bool) and health_path (str, default /health) fields
- ServiceManager: adopt=True probes health_path before spawning; is_running()
  uses health probe for adopt services rather than proc table + socket check
- _probe_health() helper: urllib GET on localhost:port+path, returns bool
- Agent /services/{service}/start: returns adopted=True when service was
  already running; coordinator sets state=running immediately (no probe wait)
- ServiceInstance: health_path field (default /health)
- service_registry.upsert_instance(): health_path kwarg
- Probe loop uses inst.health_path instead of hardcoded /health
- coordinator allocate_service: looks up health_path from profile spec via
  _get_health_path() and stores on ServiceInstance
- All GPU profiles (2/4/6/8/16/24 GB + cpu-16/32): ollama managed block
  with adopt=true, health_path=/api/tags, port 11434
- 11 new tests
2026-04-02 22:09:42 -07:00
a54a530493 feat: agent watchdog — persist known nodes + auto-reconnect after coordinator restart
closes #15

- NodeStore: SQLite persistence for known agent nodes
  (~/.local/share/circuitforge/cf-orch-nodes.db)
  - upsert on every register(); prune_stale() for 30-day cleanup
  - survives coordinator restarts — data readable by next process

- AgentSupervisor.restore_from_store(): reload known nodes on startup,
  mark all offline; heartbeat loop brings back any that respond

- AgentSupervisor.register(): persists to NodeStore on every call

- cli.py coordinator: NodeStore wired in; restore_from_store() called
  before uvicorn starts

- cli.py agent: one-shot registration replaced with persistent reconnect
  loop (daemon thread, 30 s interval) — coordinator restart → nodes
  reappear within one cycle with no manual intervention on agent hosts

- 16 new tests: NodeStore (8) + AgentSupervisor watchdog (8)
2026-04-02 22:01:55 -07:00
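A minimal sketch of the NodeStore behaviour described above: upsert on every register, 30-day prune. The schema and column names are assumptions; only the SQLite persistence, the upsert-on-register rule, and the 30-day `prune_stale()` come from the commit (an in-memory DB stands in for the fixed file path).

```python
import sqlite3
import time

STALE_AFTER_S = 30 * 24 * 3600  # 30-day cleanup window

class NodeStore:
    def __init__(self, path: str = ":memory:") -> None:
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS nodes ("
            "node_id TEXT PRIMARY KEY, agent_url TEXT, last_seen REAL)"
        )

    def upsert(self, node_id: str, agent_url: str) -> None:
        # Called from register(); re-registration updates the URL and timestamp.
        self.db.execute(
            "INSERT INTO nodes VALUES (?, ?, ?) "
            "ON CONFLICT(node_id) DO UPDATE SET agent_url=excluded.agent_url, "
            "last_seen=excluded.last_seen",
            (node_id, agent_url, time.time()),
        )
        self.db.commit()

    def prune_stale(self, now=None) -> int:
        cutoff = (now or time.time()) - STALE_AFTER_S
        cur = self.db.execute("DELETE FROM nodes WHERE last_seen < ?", (cutoff,))
        self.db.commit()
        return cur.rowcount

store = NodeStore()
store.upsert("gpu-box", "http://10.0.0.5:8001")
store.upsert("gpu-box", "http://10.0.0.9:8001")  # re-register: URL updated in place
rows = store.db.execute("SELECT agent_url FROM nodes").fetchall()
pruned = store.prune_stale(now=time.time() + 31 * 24 * 3600)  # simulate 31 days later
```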
cd9864b5e8 feat: hardware detection, cf-docuvision service, documents ingestion pipeline
Closes #5, #7, #8, #13

## hardware module (closes #5)
- HardwareSpec, LLMBackendConfig, LLMConfig dataclasses
- VramTier ladder (CPU / 2 / 4 / 6 / 8 / 16 / 24 GB) with select_tier()
- generate_profile() maps HardwareSpec → LLMConfig for llm.yaml generation
- detect_hardware() with nvidia-smi / rocm-smi / system_profiler / cpu fallback
- 31 tests across tiers, generator, and detect

## cf-docuvision service (closes #8)
- FastAPI service wrapping ByteDance/Dolphin-v2 (Qwen2.5-VL backbone)
- POST /extract: image_b64 or image_path + hint → ExtractResponse
- Lazy model loading; JSON-structured output with plain-text fallback
- ProcessSpec managed blocks added to all four GPU profiles (6/8/16/24 GB)
- 14 tests

## documents module (closes #7)
- StructuredDocument, Element, ParsedTable dataclasses (frozen, composable)
- DocuvisionClient: thin HTTP client for cf-docuvision POST /extract
- ingest(): primary cf-docuvision path → LLMRouter vision fallback → empty doc
- CF_DOCUVISION_URL env var for URL override
- 22 tests

## coordinator probe loop (closes #13)
- _run_instance_probe_loop: starting → running on 200; starting → stopped on timeout
- 4 async tests with CancelledError-based tick control
2026-04-02 18:53:25 -07:00
bd132851ec fix(orch): tighten VRAM pre-flight to require full max_mb free (not half)
max_mb // 2 was too loose — Qwen2.5-3B needs ~5.9 GB on an 8 GB card
but the threshold only required 3.25 GB free, allowing Ollama to hold
4.5 GB while a load attempt was still dispatched (causing OOM crash).

- node_selector: can_fit = free_mb >= service_max_mb (was // 2)
- coordinator /start: same threshold fix + updated error message
- tests: two new node_selector tests pin the full-ceiling semantics;
  updated stale docstring in coordinator app test
2026-04-02 16:44:36 -07:00
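The failure mode above works out numerically like this. The 8 GB card, Ollama's 4.5 GB, and the one-line fix come from this commit; max_mb 6500 is taken from the 8gb profile in commit c78341fc6f below.

```python
def can_fit(free_mb: int, service_max_mb: int) -> bool:
    return free_mb >= service_max_mb        # fixed: require the full ceiling free

def can_fit_old(free_mb: int, service_max_mb: int) -> bool:
    return free_mb >= service_max_mb // 2   # buggy: half-ceiling threshold

free_mb = 8192 - 4608      # 8 GB card with Ollama holding 4.5 GB -> 3584 MB free
service_max_mb = 6500      # 8gb profile ceiling

old_decision = can_fit_old(free_mb, service_max_mb)  # 3584 >= 3250: dispatched, then OOM
new_decision = can_fit(free_mb, service_max_mb)      # 3584 >= 6500 fails: no dispatch
```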
c78341fc6f feat(orch): replace Ouro/vllm-Docker with generic HF inference server; add ProcessSpec
- Add circuitforge_core/resources/inference/llm_server.py: generic OpenAI-compatible
  FastAPI server for any HuggingFace causal LM (Phi-4-mini-instruct, Qwen2.5-3B-Instruct)
- Add service_manager.py + service_probe.py: ProcessSpec start/stop/is_running support
  (Popen-based; socket probe confirms readiness before marking running)
- Update all 4 public GPU profiles to use ProcessSpec→llm_server instead of Docker vllm:
  6gb (max_mb 5500), 8gb (max_mb 6500), 16gb/24gb (max_mb 9000)
- Model candidates: Phi-4-mini-instruct first (7.2GB), Qwen2.5-3B-Instruct fallback (5.8GB)
- Remove ouro_server.py (Ouro incompatible with transformers 5.x; vllm Docker also incompatible)
- Add 17 tests for ServiceManager ProcessSpec (start/stop/is_running/list/get_url)
2026-04-02 15:33:08 -07:00
e58c3aea23 fix: TTL sweep, immutability, service-scoped release, logger in orch alloc
- ServiceRegistry: add sweep_expired_allocations() to remove stale TTL
  allocations and transition instances to idle; add get_allocation() helper
- AgentSupervisor._run_idle_sweep: call sweep_expired_allocations() before
  idle-timeout check so crashed-caller leaks are cleaned up each sweep tick
- schema._parse_managed: copy raw dict before extracting 'type' key instead
  of mutating caller's dict with pop()
- app.release_allocation: validate allocation belongs to the given service
  path param before releasing; return 404 if mismatch
- router._try_cf_orch_alloc: replace print() with logger.warning(); add
  module-level logger = logging.getLogger(__name__)
- tests: add test_sweep_expired_allocations covering TTL expiry and idle
  state transition
2026-04-02 12:55:38 -07:00
1a20b80a50 test: add VRAM pre-flight 503 test for ensure_service 2026-04-02 12:49:50 -07:00
a4ccaaf3e2 fix: address coordinator/idle-sweep quality issues from review
- CRITICAL: idle sweep now calls mark_stopped() after successful HTTP stop,
  preventing repeated stop POSTs on every 3rd tick for the same instance
- CRITICAL: active_allocations() now filters by gpu_id to avoid marking wrong
  instance idle on multi-GPU nodes when an allocation is released
- CRITICAL: VRAM pre-flight guard in ensure_service was dead code — added the
  actual HTTPException(503) before the candidate loop
- IMPORTANT: register() now updates agent_url on re-registration if it changed,
  so relocated agents are tracked correctly
- IMPORTANT: updated test_service_registry.py callers of active_allocations()
  to pass the now-required gpu_id argument
2026-04-02 12:45:31 -07:00
49ab9e4e88 feat: wire ServiceRegistry into coordinator allocate endpoints 2026-04-02 12:30:58 -07:00
c299482e0d feat: add idle sweep to AgentSupervisor 2026-04-02 12:30:28 -07:00
1e168ac636 feat(profiles): add idle_stop_after_s field; set 600s for vllm slot
Add idle_stop_after_s to ServiceProfile (default 0 = never stop).
Set 600s (10 min) timeout on vllm slot in all single-GPU profiles.
Backward compatible; non-vllm services inherit default 0 (no auto-stop).
2026-04-02 12:24:19 -07:00
9754f522d9 feat(orch): add ServiceRegistry — allocation tracking + idle state machine 2026-04-02 12:22:46 -07:00
defaf39883 feat(core): add CFOrchClient sync+async context manager
Implements CFOrchClient with allocate() (sync contextmanager) and
allocate_async() (async contextmanager) for cf-orch GPU resource
allocation. Releases allocation on exit; ignores 404 on release;
raises RuntimeError on non-2xx allocation response. Exports
CFOrchClient and Allocation from circuitforge_core.resources.

Note: async test uses unittest.mock rather than httpretty — httpretty
only patches stdlib sockets and does not intercept httpx async (anyio)
transport.
2026-04-02 11:44:35 -07:00
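The allocate() contract described above (release on exit, 404 ignored on release, RuntimeError on non-2xx allocation) can be sketched like this. HTTP is replaced by injected callables so the sketch stays self-contained; the route paths and response fields are assumptions.

```python
from contextlib import contextmanager

class Allocation:
    def __init__(self, alloc_id: str, url: str) -> None:
        self.alloc_id, self.url = alloc_id, url

@contextmanager
def allocate(post, delete, service: str):
    """Sync context manager mirroring CFOrchClient.allocate()'s contract."""
    status, body = post(f"/api/services/{service}/allocate")
    if not 200 <= status < 300:
        raise RuntimeError(f"allocation failed: HTTP {status}")
    alloc = Allocation(body["allocation_id"], body["url"])
    try:
        yield alloc
    finally:
        status = delete(f"/api/allocations/{alloc.alloc_id}")
        if status != 404 and not 200 <= status < 300:  # 404 ignored by design
            raise RuntimeError(f"release failed: HTTP {status}")

released = []
def fake_post(path):
    return 200, {"allocation_id": "a1", "url": "http://gpu:8000"}
def fake_delete(path):
    released.append(path)
    return 404  # allocation already gone — must not raise

with allocate(fake_post, fake_delete, "vllm") as alloc:
    url = alloc.url  # caller uses the leased service here
```

Ignoring 404 on release keeps the exit path idempotent: if the coordinator already swept the allocation (e.g. TTL expiry), the caller's context manager still exits cleanly.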
8201f6b3e9 feat(orch): add /api/services/{service}/allocate with auto node selection 2026-04-02 11:25:38 -07:00
52d2c5cf38 feat(orch): expose online_agents() and resident_keys() helpers 2026-04-02 11:22:29 -07:00
13eb0c85f1 feat(orch): add NodeSelector — warm-first GPU scoring 2026-04-02 11:18:44 -07:00
7aa0ad7a51 feat(dashboard): add self-hosted coordinator dashboard at GET /
- dashboard.html: node-centric layout — GPU cards with VRAM bars and
  sparklines, active leases table with TTL progress bars, service health
  pill, auto-refreshes every 5s via fetch() against the local JSON API
- All dynamic content set via DOM textContent / createElementNS — no
  innerHTML with user-sourced strings
- coordinator/app.py: serves dashboard.html at GET / (HTMLResponse,
  excluded from OpenAPI schema); HTML read at import time from package dir
- test_dashboard_serves_html: verifies 200, content-type text/html,
  and key route markers present
2026-03-31 18:57:25 -07:00
22bad8590a fix(tasks): fix VRAM accounting race, lock scope, type annotations
- C1: Remove _reserved_vram decrement from _scheduler_loop reaper; sole
  responsibility now belongs to _batch_worker's finally block, eliminating
  the double-decrement race that could drive _reserved_vram negative.
- C2: Move TaskScheduler construction (including VRAM detection httpx call)
  outside _scheduler_lock in get_scheduler(); lock is now only held for the
  final singleton assignment, preventing 2s lock contention on first call.
- I1: Add RunTaskFn type alias (Callable[...]) and use it in __init__ and
  get_scheduler() instead of bare Callable.
- I2: Replace namedtuple TaskSpec with typed NamedTuple class.
- I3: Parameterize _queues annotation as dict[str, deque[TaskSpec]].
- I4: Wrap _queues read in start() with self._lock.
- I5: Replace time.sleep() ordering assertion in test_vram_budget_blocks_second_type
  with event-based synchronization using type_a_started/type_b_started events.
- M2: Use sqlite3.connect() as context manager in _load_queued_tasks.
- M3: Strengthen weak assertion in test_enqueue_returns_false_when_queue_full.
- M4: Add test_reserved_vram_zero_after_task_completes to catch C1 regression.
2026-03-31 09:15:09 -07:00
09a5087c72 test(tasks): add preflight fallback coverage to scheduler tests
Adds test_detect_vram_preflight_fallback to cover the spec path where
cf-orch is unreachable but scripts.preflight.get_gpus() succeeds,
verifying detect_available_vram_gb() returns the summed total VRAM.
Uses sys.modules injection to simulate the preflight module being present.
2026-03-30 23:15:19 -07:00
5801928f8e feat(tasks): add shared VRAM-aware LLM task scheduler
Extract generic batch scheduler into circuitforge_core.tasks.scheduler
so any CircuitForge product can use it. Includes VRAM detection via
cf-orch coordinator (cooperative free-VRAM), preflight fallback, and
unlimited fallback; singleton API; full test coverage (12 tests).
2026-03-30 23:12:23 -07:00
d755e9ea2c test(resources): add integration tests for full lease/eviction cycle 2026-03-30 22:37:06 -07:00
70017abd35 feat(resources): add cf-orch CLI with start, agent, status, install-service commands 2026-03-30 22:27:11 -07:00
4bcd297b18 feat(resources): add cforch-coordinator FastAPI app with lease/node/profile endpoints 2026-03-30 22:01:46 -07:00
cede761d82 feat(resources): add AgentSupervisor and EvictionEngine 2026-03-30 21:44:42 -07:00
7718911652 feat(resources): add cforch-agent FastAPI app with /health /gpu-info /evict 2026-03-30 20:51:08 -07:00
4a857d5339 feat(resources): add EvictionExecutor with SIGTERM/grace/SIGKILL sequence 2026-03-30 20:46:45 -07:00
a79fd10f45 fix(resources): patch subprocess at import site in gpu_monitor tests 2026-03-30 20:45:01 -07:00
3dcbe801f1 feat(resources): add GpuMonitor for nvidia-smi polling 2026-03-30 20:42:57 -07:00
6b239b76e3 fix(resources): rename lambda var; convert asyncio.run test to async 2026-03-30 20:41:03 -07:00
d60503f059 feat(resources): add LeaseManager with VRAM tracking and eviction candidate selection 2026-03-30 20:38:51 -07:00