circuitforge-core

Author	SHA1	Message	Date
pyr0ball	9f7fb45071	feat(video): add cf-video module — Marlin-2B FastAPI service + mock backend + tests Add the circuitforge_core.video package implementing the cf-video inference service managed by cf-orch. Service endpoints: GET /health — liveness check; model name + VRAM POST /caption — dense scene description + timestamped event list POST /find — temporal grounding of a natural-language event query Backend hierarchy: VideoBackend (Protocol) MarlinBackend — NemoStation/Marlin-2B via transformers>=5.7.0 MockVideoBackend — deterministic stub; no GPU required Pydantic request/response models enforce parameter bounds at the API boundary (max_new_tokens ge/le, event min_length=1). Span is serialized as list[float] | None for JSON compatibility. MarlinBackend loads eagerly in __init__ so cf-orch's 2-second liveness poll catches load failures immediately. FORCE_QWENVL_VIDEO_READER env var defaults to torchcodec (faster than av path) before transformers import. pyproject.toml extras: video-marlin — torch, transformers, torchcodec, qwen-vl-utils, av, Pillow video-service — video-marlin + fastapi + uvicorn Test coverage: 46 tests across test_mock_backend.py and test_app.py. All passing without GPU or real video file. Closes: #71	2026-05-25 20:00:37 -07:00
pyr0ball	fb3a4c697d	feat(llm): v0.20.0 — LLMRouter dict init + Ollama embed preflight (closes #59, #60) - LLMRouter.__init__ now accepts a Path | dict; pagepiper ingest scripts pass a runtime-constructed config dict instead of a temp file - _check_ollama_model_pulled() preflight on embed(): checks /api/tags once per backend URL and raises RuntimeError("...Fix: ollama pull <model>") when the configured embedding model is not pulled; silently skips for non-Ollama backends (vLLM, etc.) that don't expose /api/tags - 6 new tests: dict init paths (x2) + preflight scenarios (x4) - Existing embed tests updated to mock requests.get to avoid live Ollama calls	2026-05-05 14:59:49 -07:00
pyr0ball	7526092481	fix(llm): strengthen embed skip-verification test; add DEMO_MODE check to embed()	2026-05-04 16:02:26 -07:00
pyr0ball	8e2d15bcd4	feat(llm): add LLMRouter.embed() for batch embedding generation Adds embed(texts, model_override, fallback_order) to LLMRouter. Only openai_compat backends are tried (Ollama/vLLM expose /v1/embeddings; anthropic and vision_service do not). Uses embedding_model from backend config when present, falls back to the chat model otherwise. Supports cf-orch allocation and raises RuntimeError when all backends are exhausted. 4 tests added (TDD: RED → GREEN), 763 total passing, no regressions.	2026-05-04 15:58:44 -07:00
pyr0ball	a6d906bcbb	fix(vector): explicit rollback, table identifier guard, query scope fix	2026-05-04 15:55:05 -07:00
pyr0ball	0489f1111c	feat(vector): add LocalSQLiteVecStore backed by sqlite-vec Implements the VectorStore ABC using sqlite-vec virtual tables. Two-table design (vec0 virtual + companion meta) supports upsert, top-k ANN query with optional metadata post-filter, delete by ID, and bulk delete_where. Also renames VectorMatch.id → entry_id to avoid shadowing the Python builtin, updating base.py and all tests. Installed: sqlite-vec 0.1.9 Tests: 16 passed (7 base + 9 integration)	2026-05-04 15:41:39 -07:00
pyr0ball	e6c69f25ae	fix(vector): rename VectorMatch.entry_id to id per downstream contract VectorMatch.entry_id renamed to VectorMatch.id to match the API contract expected by downstream consumers (pagepiper T7). The dataclass remains frozen to prevent field reassignment; metadata is kept as plain dict for JSON deserialization compatibility. - Renamed VectorMatch.entry_id field to id - Updated all test references to use .id accessor - Simplified metadata to plain dict (removed MappingProxyType wrapping) - All 7 tests passing	2026-05-04 14:19:14 -07:00
pyr0ball	9492942623	fix(vector): make VectorMatch.metadata immutable; rename id to entry_id	2026-05-04 11:46:24 -07:00
pyr0ball	fe51914902	feat(vector): add VectorStore ABC and VectorMatch dataclass	2026-05-04 11:42:03 -07:00
pyr0ball	ac45067ae7	test(documents): add OCR fallback and edge case tests for PDFExtractor	2026-05-04 08:45:53 -07:00
pyr0ball	408ab64c55	test(documents): add OCR and ImportError coverage for PDFExtractor - Add module-level guards for pytesseract and PIL.Image (enables patching in tests) - Move `import io` from inside _ocr_page to module-level stdlib imports - Extract _ensure_pil_image() helper with TypeError guard so isinstance check does not blow up when Image is patched to a MagicMock in tests - Add 3 new tests: pdfplumber=None ImportError, sparse-page OCR fallback, OCR render failure returns empty chunk - Coverage: 96% (up from 64%)	2026-05-04 08:39:31 -07:00
pyr0ball	bbb146b361	feat(documents): add PDFExtractor text-layer extraction and PageChunk Adds circuitforge_core/documents/pdf.py with: - PageChunk frozen dataclass (page_number, text, source, word_count) - PDFExtractor.chunk_pages() — pdfplumber text-layer per page, OCR fallback via pytesseract for sparse pages - Module-level graceful ImportError guard on pdfplumber (patchable, follows cf-core optional-extra pattern) - pdf and pdf-ocr optional extras declared in pyproject.toml 3 tests, all passing.	2026-05-04 08:33:10 -07:00
pyr0ball	0f5ea86ab0	fix(input/gestures): enforce numpy array immutability in HandLandmarks; add CameraCapture tests - Set points.flags.writeable = False in HandsDetector.detect() so in-place mutation of HandLandmarks.points raises ValueError (frozen=True alone does not protect numpy array contents) - Extend test_handlandmarks_is_immutable to assert ValueError on array mutation - Add test_camera.py with 3 tests covering is_open, frames() yield/break behaviour, and context manager release (was at 0% coverage) - Remove unused `import numpy as np` from camera.py; fix frames() return annotation to Iterator (np.ndarray ref removed with the import)	2026-04-26 20:48:02 -07:00
pyr0ball	a62bff5f1e	test(input/gestures): add full pipeline smoke test	2026-04-26 20:18:40 -07:00
pyr0ball	a31e6099c6	feat(input/gestures): implement HandsDetector wrapping mediapipe Hands	2026-04-26 20:08:05 -07:00
pyr0ball	5a4917d455	style: black format normalizer.py and test_normalizer.py	2026-04-26 20:05:54 -07:00
pyr0ball	460530bb03	feat(input/gestures): implement normalize_hand() with scale/translation invariance	2026-04-26 19:58:00 -07:00
pyr0ball	b2b58913c7	feat: scaffold cf_input.gestures module + gestures-mediapipe dep group	2026-04-26 18:51:45 -07:00
pyr0ball	185057d8ca	feat(reranker): full adapter suite + cf-orch auto-routing (closes #54) Five backends: BGE (FlagEmbedding), Qwen3 (generative yes/no logit scorer, batched forward pass), CrossEncoder (sentence-transformers, covers mxbai-rerank / ms-marco / jina), Cohere (BYOK cloud), Remote (HTTP delegate to cf-reranker service). Mock adapter for tests. 54 tests. cf-reranker FastAPI service app (port 8011) — cf-orch manages as a process, defaults to Qwen3-Reranker-0.6B. make_reranker() auto-detects CF_ORCH_URL and routes to cf-orch cf-reranker when set — cloud apps (Kiwi, Peregrine, Snipe) get remote Qwen3 reranking with zero code changes. Local dev falls back to local BGE. pyproject extras: reranker-bge, reranker-qwen3, reranker-cross-encoder, reranker-cohere, reranker-service.	2026-04-26 09:04:39 -07:00
pyr0ball	82f0b4c3d0	feat: cf_core.reranker — shared reranker module Phase 1 (#54) Trunk + text branch + BGE adapter: - base.py: Reranker Protocol, RerankResult (frozen dataclass), TextReranker base class with rerank() / rerank_batch() built on _score_pairs() - adapters/mock.py: MockTextReranker — Jaccard scoring, no deps, deterministic - adapters/bge.py: BGETextReranker — FlagEmbedding cross-encoder, thread-safe, batched forward pass via rerank_batch(); graceful ImportError if dep missing - __init__.py: rerank() singleton, make_reranker(), reset_reranker(); CF_RERANKER_MODEL / CF_RERANKER_BACKEND / CF_RERANKER_MOCK env vars - pyproject.toml: reranker-bge and reranker-qwen3 optional dep groups - 20 tests, all passing Architecture ready for Phase 2 (Qwen3TextReranker) and Phase 3 (cf-orch remote backend). ImageReranker/AudioReranker branches stubbed in base.py docstring.	2026-04-21 12:25:01 -07:00
pyr0ball	1553ff1630	feat: add activitypub module — actor, objects, signing, delivery, Lemmy, inbox (closes #51) CFActor (frozen dataclass, RSA keygen), AS2 object constructors (Note, Offer, Request, Create), HTTP Signatures (draft-cavage-http-signatures-08, rsa-sha256), signed delivery via requests, Lemmy REST client (JWT auth), FastAPI inbox router with optional signature verification. Digest header re-verified against actual body bytes on verify_signature() to prevent body-swap attacks. inbox.py omits __future__ annotations to avoid FastAPI's annotation-resolution-against-module-globals constraint. 105 tests. Bumps to v0.14.0.	2026-04-20 13:18:03 -07:00
pyr0ball	f9b9fa5283	feat: add currency_code preference + format_currency utility (closes #52) Adds circuitforge_core.preferences.currency with get/set_currency_code() and format_currency(). Priority chain: store → CURRENCY_DEFAULT env → USD. Formatting uses babel when available; falls back to a 30-currency symbol table with correct ISO 4217 minor-unit decimal places (0 for JPY, KRW, etc.). Consumed by Snipe, Kiwi, Peregrine, Crossbill. Bumps to v0.13.0.	2026-04-20 13:06:04 -07:00
pyr0ball	aa057b20e2	feat: add job_quality deterministic trust scorer (closes #48) 12 signal functions covering staleness, repost patterns, salary transparency, ATS blackhole detection, and enrichment signals. All pure functions — no LLM, no network, no I/O. trust_score = 1 - sum(triggered weights), clamped to [0,1]. confidence reflects fraction of signals with available evidence. Salary transparency enforced for CO/CA/NY/WA/IL/MA. ATS blackhole patterns: Lever, Greenhouse, Workday, iCIMS, Taleo. 83 tests (models, all 12 signals individually, scorer). Bumps to v0.12.0.	2026-04-20 13:02:57 -07:00
pyr0ball	80eeae5460	feat: audio module, musicgen tests, SQLCipher PRAGMA hardening #45 — db/base.py: PRAGMA key=? parameterized form instead of f-string interpolation. Regression tests added (skip when pysqlcipher3 absent). #50 — circuitforge_core.audio: shared PCM/signal utilities (MIT, numpy-only) - convert.py: pcm_to_float32, float32_to_pcm, bytes_to_float32 - gate.py: is_silent, rms (RMS energy gate) - resample.py: resample (scipy.signal.resample_poly; numpy linear fallback) - buffer.py: ChunkAccumulator (window-based chunk collector + flush) Replaces hand-rolled equivalents in cf-voice stt.py + context.py. 34 tests, all passing. #49 — tests/test_musicgen/: 21 tests covering mock backend, factory, and FastAPI app endpoints. musicgen module was already implemented; tests were the missing piece to close the issue.	2026-04-20 11:10:49 -07:00
pyr0ball	ffb95a5a30	feat(community): add SharedStore base class with typed pg read/write methods Implements SharedStore with get_post_by_slug, list_posts (with JSONB filter support), insert_post, and delete_post. _cursor_to_dict handles both real psycopg2 tuple rows and mock dict rows for clean unit tests. Also promotes community __init__.py imports from try/except guards to unconditional now that db.py and store.py both exist.	2026-04-12 22:03:27 -07:00
pyr0ball	f74457d11f	feat(community): add CommunityDB connection pool and migration runner	2026-04-12 21:38:14 -07:00
pyr0ball	2e9e3fdc4b	feat(community): add CommunityPost frozen dataclass with element snapshot schema	2026-04-12 20:51:29 -07:00
pyr0ball	69a338bd98	feat(text): add OpenAI-compat /v1/chat/completions endpoint Adds POST /v1/chat/completions to the cf-text FastAPI service so it can be used as an openai_compat backend in LLMRouter without any router changes. The endpoint accepts the standard OpenAI chat request format and returns a standard chat.completion response. 4 tests added; all 36 text tests pass.	2026-04-12 17:04:58 -07:00
pyr0ball	8c1daf3b6c	feat: cf-vision managed service (#43) SigLIP so400m-patch14-384 as default backend (classify + embed, ~1.4 GB VRAM). VLM backend (moondream2, LLaVA, Qwen-VL, etc.) as callable alternative for caption generation and VQA. Follows the same factory/Protocol/mock pattern as cf-stt and cf-tts. New module: circuitforge_core.vision - backends/base.py — VisionBackend Protocol, VisionResult, make_vision_backend() - backends/mock.py — MockVisionBackend (no GPU, deterministic) - backends/siglip.py — SigLIPBackend: sigmoid zero-shot classify + L2 embed - backends/vlm.py — VLMBackend: AutoModelForVision2Seq caption + prompt classify - __init__.py — process singleton; classify(), embed(), caption(), make_backend() - app.py — FastAPI service (port 8006): /health /classify /embed /caption Backend selection: CF_VISION_BACKEND=siglip|vlm, auto-detected from model path. VLM backend: supports_embed=False, caption()/classify() only. SigLIP backend: supports_caption=False, classify()/embed() only. 52 new tests, 385 total passing. Closes #43.	2026-04-09 06:53:43 -07:00
pyr0ball	80b0d5fd34	feat: v0.9.0 — cf-text, pipeline crystallization engine, multimodal pipeline, a11y preferences Closes #33, #37, #38, #41, #42. ## cf-text (closes #41) - New module: `circuitforge_core.text` — direct local inference bypassing ollama/vllm - Backends: llama.cpp (GGUF), transformers (HF), mock - Auto-detects backend from file extension; CF_TEXT_BACKEND env override - Optional 4-bit/8-bit quantisation via bitsandbytes (CF_TEXT_4BIT / CF_TEXT_8BIT) - process-level singleton + per-request `make_backend()` path ## Pipeline crystallization engine (closes #33, #37) - FPGA→ASIC model: LLM-discovered paths → deterministic workflows after N approvals - `models.py`: PipelineRun (incl. review_duration_ms + output_modified per #37), CrystallizedWorkflow, Step, hash_input() - `recorder.py`: append-only JSON run log under ~/.config/circuitforge/pipeline/ - `crystallizer.py`: threshold check, majority/most-recent step strategy, rubber-stamp warning (review_duration_ms < 5s triggers warnings.warn) - `registry.py`: exact + fuzzy match, deactivate-without-delete, colon-safe filenames - `executor.py`: deterministic steps with transparent LLM fallback ## Multimodal chunked pipeline (closes #42) - `pipeline/multimodal.py`: cf-docuvision pages → cf-text streaming - `run()` yields PageResult per page (progressive, no full-doc buffer) - `stream()` yields (page_idx, token) tuples for token-level UI rendering - `vram_serialise` flag + `swap_fn` hook for 8GB GPU VRAM management - `prompt_fn` callback for product-specific prompt construction ## Accessibility preferences (closes #38) - `preferences/accessibility.py`: PREF_REDUCED_MOTION, PREF_HIGH_CONTRAST, PREF_FONT_SIZE, PREF_SCREEN_READER with get/set helpers - Exported from preferences package __init__ ## LLM router fix - cf-orch backends: skip reachability pre-check; allocation starts the service - Static backends: reachability check remains in place	2026-04-08 23:17:18 -07:00
pyr0ball	f3bc4ac605	feat: add CF_LICENSE_KEY validation via Heimdall (closes #26) Introduces circuitforge_core.config.license with validate_license() and get_license_tier(). Both functions are safe to call when CF_LICENSE_KEY is absent, returning free tier gracefully. Results are cached 30 min per (key, product) pair. CF_LICENSE_URL env var overrides the default Heimdall endpoint. Re-exports added to config.__init__. Existing test_config.py moved into tests/test_config/ package to co-locate with new test_license.py (10 tests; 204 total passing).	2026-04-05 21:16:57 -07:00
pyr0ball	f0a9ec5c37	fix: raise 502 on label creation failure; narrow subprocess exception scope	2026-04-05 17:36:52 -07:00
pyr0ball	0a15ad9522	feat: add circuitforge_core.api.feedback — shared feedback router factory (closes #23) Adds make_feedback_router(repo, product, demo_mode_fn) which returns a FastAPI APIRouter with GET /status and POST / endpoints. Handles Forgejo label creation/reuse, issue body assembly (including repro steps for bugs), demo mode gating, and FORGEJO_API_TOKEN presence checks. 12 tests covering all status/submit paths, mock Forgejo interaction, and body content assertions. Also adds fastapi>=0.110 and httpx>=0.27 to [dev] optional deps.	2026-04-05 17:31:02 -07:00
pyr0ball	c244260d1c	feat!: strip resources/ from MIT core — moves to circuitforge-orch (v0.8.0) BREAKING CHANGE: circuitforge_core.resources is no longer available. Import CFOrchClient from circuitforge_orch.client instead. cf-orch CLI entry point is now in the circuitforge-orch package.	2026-04-04 22:34:27 -07:00
pyr0ball	2259382d0b	refactor: replace coordinator-aware TaskScheduler with Protocol + LocalScheduler (MIT); update LLMRouter import path	2026-04-04 22:26:06 -07:00
pyr0ball	090a86ce1b	refactor: update LLMRouter lazy import — circuitforge_core.resources.client → circuitforge_orch.client	2026-04-04 22:16:17 -07:00
pyr0ball	ccd2a35deb	test: affiliates integration tests — full wrap_url round-trip	2026-04-04 18:28:27 -07:00
pyr0ball	7837fbcad2	feat: affiliates router — wrap_url() with opt-out, BYOK, and CF env-var resolution	2026-04-04 18:20:21 -07:00
pyr0ball	73cec07bd2	feat: affiliates disclosure — per-retailer tooltip copy + first-encounter banner constants	2026-04-04 18:14:58 -07:00
pyr0ball	4c3f3a95a5	feat: affiliates programs — AffiliateProgram, registry, eBay EPN + Amazon Associates builders	2026-04-04 18:12:45 -07:00
pyr0ball	d719ea2309	feat: preferences public helpers — get_user_preference / set_user_preference (closes #22 self-hosted)	2026-04-04 18:10:24 -07:00
pyr0ball	0d9d030320	feat: preferences LocalFileStore — YAML-backed single-user preference store	2026-04-04 18:07:35 -07:00
pyr0ball	9ee31a09c1	feat: preferences dot-path utilities (get_path, set_path)	2026-04-04 18:04:44 -07:00
pyr0ball	3deae056de	feat: local-first LLM config + hosted coordinator auth LLMRouter env-var auto-config: - No llm.yaml required — auto-configures from ANTHROPIC_API_KEY, OPENAI_API_KEY, or OLLAMA_HOST on first use - Bare-metal self-hosters can run any CF product with just env vars - Falls back to FileNotFoundError with actionable message only when no env vars are set either CFOrchClient auth: - Reads CF_LICENSE_KEY env var (or explicit api_key param) - Sends Authorization: Bearer <key> on all allocation/release requests - Required for the hosted public coordinator; no-op for local deployments HeimdallAuthMiddleware (new): - FastAPI middleware for cf-orch coordinator - Enabled by HEIMDALL_URL env var; self-hosted deployments skip it - 5-min TTL cache (matching Kiwi cloud session) keeps Heimdall off the per-allocation hot path - /api/health exempt; free-tier keys rejected with 403 + reason - 13 tests covering cache TTL, tier ranking, and middleware gating	2026-04-03 08:32:15 -07:00
pyr0ball	8d87ed4c9f	feat: manage.py cross-platform product manager (closes #6) - circuitforge_core.manage module — replaces bash-only manage.sh - config.py: ManageConfig from manage.toml (TOML via tomllib/tomli) app name, default_url, docker compose_file/project, native services Falls back to directory name when no manage.toml present - docker_mode.py: DockerManager wrapping 'docker compose' (v2 plugin) or 'docker-compose' (v1 fallback); docker_available() probe Commands: start, stop, restart, status, logs, build - native_mode.py: NativeManager with PID file process management platformdirs for platform-appropriate PID/log paths Windows-compatible log tailing (polling, no tail -f) Cross-platform kill: SIGTERM→SIGKILL on Unix, taskkill /F on Windows - cli.py: typer CLI — start/stop/restart/status/logs/build/open/install-shims Mode auto-detection: Docker available + compose file → docker; else native --mode docker|native|auto override - templates/manage.sh: bash shim (conda, venv, python3 detection) - templates/manage.ps1: PowerShell shim (same detection, Windows) - templates/manage.toml.example: annotated config template - __main__.py: python -m circuitforge_core.manage entry point - pyproject.toml: manage extras group (platformdirs, typer) cf-manage console script; version bumped to 0.5.0 - 36 tests: config (6), docker_mode (9), native_mode (21)	2026-04-02 23:04:35 -07:00
pyr0ball	7bb6b76bd5	feat: ollama adopt-if-running + health_path in ProcessSpec (#16) - ProcessSpec: adopt (bool) and health_path (str, default /health) fields - ServiceManager: adopt=True probes health_path before spawning; is_running() uses health probe for adopt services rather than proc table + socket check - _probe_health() helper: urllib GET on localhost:port+path, returns bool - Agent /services/{service}/start: returns adopted=True when service was already running; coordinator sets state=running immediately (no probe wait) - ServiceInstance: health_path field (default /health) - service_registry.upsert_instance(): health_path kwarg - Probe loop uses inst.health_path instead of hardcoded /health - coordinator allocate_service: looks up health_path from profile spec via _get_health_path() and stores on ServiceInstance - All GPU profiles (2/4/6/8/16/24 GB + cpu-16/32): ollama managed block with adopt=true, health_path=/api/tags, port 11434 - 11 new tests	2026-04-02 22:09:42 -07:00
pyr0ball	a54a530493	feat: agent watchdog — persist known nodes + auto-reconnect after coordinator restart closes #15 - NodeStore: SQLite persistence for known agent nodes (~/.local/share/circuitforge/cf-orch-nodes.db) - upsert on every register(); prune_stale() for 30-day cleanup - survives coordinator restarts — data readable by next process - AgentSupervisor.restore_from_store(): reload known nodes on startup, mark all offline; heartbeat loop brings back any that respond - AgentSupervisor.register(): persists to NodeStore on every call - cli.py coordinator: NodeStore wired in; restore_from_store() called before uvicorn starts - cli.py agent: one-shot registration replaced with persistent reconnect loop (daemon thread, 30 s interval) — coordinator restart → nodes reappear within one cycle with no manual intervention on agent hosts - 16 new tests: NodeStore (8) + AgentSupervisor watchdog (8)	2026-04-02 22:01:55 -07:00
pyr0ball	cd9864b5e8	feat: hardware detection, cf-docuvision service, documents ingestion pipeline Closes #5, #7, #8, #13 ## hardware module (closes #5) - HardwareSpec, LLMBackendConfig, LLMConfig dataclasses - VramTier ladder (CPU / 2 / 4 / 6 / 8 / 16 / 24 GB) with select_tier() - generate_profile() maps HardwareSpec → LLMConfig for llm.yaml generation - detect_hardware() with nvidia-smi / rocm-smi / system_profiler / cpu fallback - 31 tests across tiers, generator, and detect ## cf-docuvision service (closes #8) - FastAPI service wrapping ByteDance/Dolphin-v2 (Qwen2.5-VL backbone) - POST /extract: image_b64 or image_path + hint → ExtractResponse - Lazy model loading; JSON-structured output with plain-text fallback - ProcessSpec managed blocks added to all four GPU profiles (6/8/16/24 GB) - 14 tests ## documents module (closes #7) - StructuredDocument, Element, ParsedTable dataclasses (frozen, composable) - DocuvisionClient: thin HTTP client for cf-docuvision POST /extract - ingest(): primary cf-docuvision path → LLMRouter vision fallback → empty doc - CF_DOCUVISION_URL env var for URL override - 22 tests ## coordinator probe loop (closes #13) - _run_instance_probe_loop: starting → running on 200; starting → stopped on timeout - 4 async tests with CancelledError-based tick control	2026-04-02 18:53:25 -07:00
pyr0ball	bd132851ec	fix(orch): tighten VRAM pre-flight to require full max_mb free (not half) max_mb // 2 was too loose — Qwen2.5-3B needs ~5.9 GB on an 8 GB card but the threshold only required 3.25 GB free, allowing Ollama to hold 4.5 GB while a load attempt was still dispatched (causing OOM crash). - node_selector: can_fit = free_mb >= service_max_mb (was // 2) - coordinator /start: same threshold fix + updated error message - tests: two new node_selector tests pin the full-ceiling semantics; updated stale docstring in coordinator app test	2026-04-02 16:44:36 -07:00
pyr0ball	c78341fc6f	feat(orch): replace Ouro/vllm-Docker with generic HF inference server; add ProcessSpec - Add circuitforge_core/resources/inference/llm_server.py: generic OpenAI-compatible FastAPI server for any HuggingFace causal LM (Phi-4-mini-instruct, Qwen2.5-3B-Instruct) - Add service_manager.py + service_probe.py: ProcessSpec start/stop/is_running support (Popen-based; socket probe confirms readiness before marking running) - Update all 4 public GPU profiles to use ProcessSpec→llm_server instead of Docker vllm: 6gb (max_mb 5500), 8gb (max_mb 6500), 16gb/24gb (max_mb 9000) - Model candidates: Phi-4-mini-instruct first (7.2GB), Qwen2.5-3B-Instruct fallback (5.8GB) - Remove ouro_server.py (Ouro incompatible with transformers 5.x; vllm Docker also incompatible) - Add 17 tests for ServiceManager ProcessSpec (start/stop/is_running/list/get_url)	2026-04-02 15:33:08 -07:00
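
The VRAM pre-flight change in bd132851ec can be illustrated with a small sketch. This is not the node_selector code itself: the function names `can_fit_half` and `can_fit_full` are hypothetical, and the figures (an 8 GB card, Ollama holding 4.5 GB, max_mb 6500 for the 8gb profile) are taken from the commit messages above, with GB converted to MB at 1024 MB per GB as an assumption.

```python
# Illustrative only: hypothetical helpers mirroring the two thresholds
# described in commit bd132851ec, not the actual node_selector code.

def can_fit_half(free_mb: int, service_max_mb: int) -> bool:
    """Old check: only half of the service's VRAM ceiling had to be free."""
    return free_mb >= service_max_mb // 2


def can_fit_full(free_mb: int, service_max_mb: int) -> bool:
    """New check: the full VRAM ceiling must be free."""
    return free_mb >= service_max_mb


TOTAL_MB = 8 * 1024          # 8 GB card (assumed 1024 MB per GB)
OLLAMA_MB = int(4.5 * 1024)  # Ollama already holding 4.5 GB
SERVICE_MAX_MB = 6500        # max_mb for the 8gb profile (from c78341fc6f)

free_mb = TOTAL_MB - OLLAMA_MB  # 3584 MB left on the card

old_result = can_fit_half(free_mb, SERVICE_MAX_MB)  # 3584 >= 3250
new_result = can_fit_full(free_mb, SERVICE_MAX_MB)  # 3584 >= 6500

print(f"free={free_mb} MB old_check={old_result} new_check={new_result}")
```

Under these numbers the old half-ceiling check admits the load (3584 MB free against a 3250 MB threshold), which matches the OOM scenario the commit describes, while the full-ceiling check rejects it.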


87 commits