Missing from the initial extras list; required by QwenVLVideoProcessor
at inference time. On CUDA 13 nodes it must be installed from the PyTorch
nightly cu130 index to avoid a torch version downgrade:
pip install --index-url https://download.pytorch.org/whl/nightly/cu130 torch torchvision
Discovered during Muninn deployment (2026-05-26).
Humans own design, architecture, code review, testing, and
verification. LLMs are part of our development workflow.
Links to circuitforge.tech/positions for our full position.
CUDA defaults to FASTEST_FIRST device ordering, which does not match
nvidia-smi's PCI bus order on multi-GPU nodes. On Muninn, the RTX 3090
is cuda:0 and the Quadro RTX 4000 is cuda:1 — the opposite of nvidia-smi.
Two fixes:
1. Set CUDA_DEVICE_ORDER=PCI_BUS_ID so --gpu-id always matches nvidia-smi
and the muninn.yaml profile GPU index assignments.
2. Use direct assignment (os.environ[...] = ...) instead of setdefault —
setdefault silently no-ops if CUDA_VISIBLE_DEVICES is already present
in the environment (conda activation, prior run, system default).
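The difference between the two assignment styles can be sketched as follows. This is a minimal illustration, not the service code: pin_gpu is a hypothetical helper name, and the env parameter exists only so the sketch can run against a plain dict. Both variables only take effect if they are set before CUDA is initialized in the process.

```python
import os
from collections.abc import MutableMapping


def pin_gpu(gpu_id: int, env: MutableMapping[str, str] = os.environ) -> None:
    """Hypothetical helper: pin one GPU using nvidia-smi's numbering."""
    # PCI_BUS_ID makes CUDA enumerate devices in the same order as nvidia-smi.
    env["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    # Direct assignment replaces any value inherited from the shell.
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_id)


# A stale value left behind by a prior run or a conda activation script.
inherited = {"CUDA_VISIBLE_DEVICES": "1"}

# setdefault keeps the stale value, so the requested GPU is silently ignored.
inherited.setdefault("CUDA_VISIBLE_DEVICES", "0")
stale_value = inherited["CUDA_VISIBLE_DEVICES"]  # still "1"

# Direct assignment wins regardless of what was there before.
pin_gpu(0, inherited)
```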
Add the circuitforge_core.video package implementing the cf-video inference
service managed by cf-orch.
Service endpoints:
GET /health — liveness check; model name + VRAM
POST /caption — dense scene description + timestamped event list
POST /find — temporal grounding of a natural-language event query
Backend hierarchy:
VideoBackend (Protocol)
MarlinBackend — NemoStation/Marlin-2B via transformers>=5.7.0
MockVideoBackend — deterministic stub; no GPU required
Pydantic request/response models enforce parameter bounds at the API
boundary (max_new_tokens ge/le, event min_length=1). Span is serialized
as list[float] | None for JSON compatibility.
MarlinBackend loads the model eagerly in __init__ so cf-orch's 2-second
liveness poll catches load failures immediately. The FORCE_QWENVL_VIDEO_READER
env var is defaulted to torchcodec (faster than the av path) before
transformers is imported.
pyproject.toml extras:
video-marlin — torch, transformers, torchcodec, qwen-vl-utils, av, Pillow
video-service — video-marlin + fastapi + uvicorn
Test coverage: 46 tests across test_mock_backend.py and test_app.py.
All passing without GPU or real video file.
Closes: #71
Adds three-layer dedup infrastructure for community recipe posts:
- Migration 006: similar_to_ref self-FK, title lower() index, recipe_id index
- CommunityPost.similar_to_ref optional field (frozen dataclass, defaults None)
- SharedStore.search_similar_posts(): title ILIKE + recipe_id match, ordered by relevance
- insert_post() wires similar_to_ref into the INSERT
- LLMRouter.__init__ now accepts a Path | dict; pagepiper ingest scripts
pass a runtime-constructed config dict instead of a temp file
- _check_ollama_model_pulled() preflight on embed(): checks /api/tags once
per backend URL and raises RuntimeError("...Fix: ollama pull <model>")
when the configured embedding model is not pulled; silently skips for
non-Ollama backends (vLLM, etc.) that don't expose /api/tags
- 6 new tests: dict init paths (x2) + preflight scenarios (x4)
- Existing embed tests updated to mock requests.get to avoid live Ollama calls
Adds embed(texts, model_override, fallback_order) to LLMRouter. Only
openai_compat backends are tried (Ollama/vLLM expose /v1/embeddings;
anthropic and vision_service do not). Uses embedding_model from backend
config when present, falls back to the chat model otherwise. Supports
cf-orch allocation and raises RuntimeError when all backends are exhausted.
4 tests added (TDD: RED → GREEN), 763 total passing, no regressions.
Implements the VectorStore ABC using sqlite-vec virtual tables.
Two-table design (vec0 virtual + companion meta) supports upsert,
top-k ANN query with optional metadata post-filter, delete by ID,
and bulk delete_where. Also renames VectorMatch.id → entry_id to
avoid shadowing the Python builtin, updating base.py and all tests.
Installed: sqlite-vec 0.1.9
Tests: 16 passed (7 base + 9 integration)
VectorMatch.entry_id renamed to VectorMatch.id to match the API contract
expected by downstream consumers (pagepiper T7). The dataclass remains frozen
to prevent field reassignment; metadata is kept as plain dict for JSON
deserialization compatibility.
- Renamed VectorMatch.entry_id field to id
- Updated all test references to use .id accessor
- Simplified metadata to plain dict (removed MappingProxyType wrapping)
- All 7 tests passing
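The trade-off noted above (frozen fields, plain dict metadata) can be illustrated with a small sketch. The score field and the sample values are assumptions for illustration; only id and metadata are named in this change.

```python
import json
from dataclasses import FrozenInstanceError, asdict, dataclass, field


@dataclass(frozen=True)
class VectorMatch:
    id: str
    score: float  # assumed field, not confirmed by this commit
    metadata: dict = field(default_factory=dict)


match = VectorMatch(id="doc-1", score=0.5, metadata={"page": 3})

# frozen=True blocks rebinding a field on the instance...
try:
    match.id = "doc-2"  # type: ignore[misc]
    reassigned = True
except FrozenInstanceError:
    reassigned = False

# ...but the dict itself stays mutable, and serializes without conversion.
match.metadata["source"] = "pagepiper"
payload = json.dumps(asdict(match), sort_keys=True)
```

A MappingProxyType wrapper would have made the metadata read-only, but json.dumps cannot serialize it directly, which is the compatibility cost this change avoids.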
- Add module-level guards for pytesseract and PIL.Image (enables patching in tests)
- Move `import io` from inside _ocr_page to module-level stdlib imports
- Extract _ensure_pil_image() helper with TypeError guard so isinstance check
does not blow up when Image is patched to a MagicMock in tests
- Add 3 new tests: pdfplumber=None ImportError, sparse-page OCR fallback,
OCR render failure returns empty chunk
- Coverage: 96% (up from 64%)
- Set points.flags.writeable = False in HandsDetector.detect() so in-place
mutation of HandLandmarks.points raises ValueError (frozen=True alone does not
protect numpy array contents)
- Extend test_handlandmarks_is_immutable to assert ValueError on array mutation
- Add test_camera.py with 3 tests covering is_open, frames() yield/break
behaviour, and context manager release (was at 0% coverage)
- Remove unused `import numpy as np` from camera.py; fix frames() return
annotation to Iterator (np.ndarray ref removed with the import)
Five backends: BGE (FlagEmbedding), Qwen3 (generative yes/no logit scorer,
batched forward pass), CrossEncoder (sentence-transformers, covers mxbai-rerank
/ ms-marco / jina), Cohere (BYOK cloud), Remote (HTTP delegate to cf-reranker
service). Mock adapter for tests. 54 tests.
cf-reranker FastAPI service app (port 8011) — cf-orch manages as a process,
defaults to Qwen3-Reranker-0.6B.
make_reranker() auto-detects CF_ORCH_URL and routes to cf-orch cf-reranker
when set — cloud apps (Kiwi, Peregrine, Snipe) get remote Qwen3 reranking
with zero code changes. Local dev falls back to local BGE.
pyproject extras: reranker-bge, reranker-qwen3, reranker-cross-encoder,
reranker-cohere, reranker-service.
Extracted from kiwi/avocet where it was duplicated. Reads llm.yaml via
the same path LLMRouter uses — products can now import detect_byok from
cf-core instead of maintaining their own copy.
Extracts the JWT validation + Heimdall tier resolution + guest session pattern
that was duplicated across kiwi and peregrine into a single reusable module.
CloudSessionFactory is parameterized by product name. Products instantiate it
once at module level and call .dependency() to get a FastAPI-compatible Depends()
function. .require_tier(min_tier) returns a dependency factory for gated routes.
CloudUser carries:
user_id — Directus UUID, "local" (self-hosted), "local-dev" (bypass), "anon-<uuid>"
tier — free | paid | premium | ultra | local
product — which CF product this session is for
has_byok — whether user has a configured LLM backend
meta — dict for product-specific extras (household_id, license_key, etc.)
Products can pass extra_meta= to attach product-specific fields without
subclassing. The module is FastAPI-only (fastapi is a lazy import so local-mode
products that never hit cloud paths don't pay the import cost).
- backends/ollama.py: routes requests to a running Ollama instance via HTTP API
- backends/vllm.py: routes requests to vllm's OpenAI-compatible API
(/v1/chat/completions); cf-text holds no GPU memory in proxy mode
- hardware/tiers.py: register cf-musicgen in 8GB, 16GB, and 32GB VRAM tiers
- tts/app.py: use inline type comment for _backend to avoid runtime global warning
- tts/backends/base.py: minor style cleanup
- create_app: add gpu_ids param; when set, exports CUDA_VISIBLE_DEVICES=<ids>
so HuggingFace Accelerate auto-shards across all listed devices
- CLI: add --gpu-ids arg (e.g. "0,1"); overrides --gpu-id when provided
- backends/base.py: propagate gpu_ids through TextBackend.generate
so backends can be aware of the visible device set
Single-GPU deployments are unaffected — --gpu-id=0 remains the default.
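The precedence between the two flags could look roughly like this. It is a sketch only: build_parser and visible_devices are hypothetical names, and the real CLI may parse or validate the list differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    parser.add_argument("--gpu-id", type=int, default=0)
    parser.add_argument("--gpu-ids", default=None, help='e.g. "0,1"')
    return parser


def visible_devices(args: argparse.Namespace) -> str:
    """Return the value to export as CUDA_VISIBLE_DEVICES."""
    if args.gpu_ids:
        # --gpu-ids wins when provided; normalize stray whitespace.
        parts = [p.strip() for p in args.gpu_ids.split(",")]
        return ",".join(p for p in parts if p)
    # Single-GPU default path is unchanged.
    return str(args.gpu_id)


single = visible_devices(build_parser().parse_args([]))
multi = visible_devices(
    build_parser().parse_args(["--gpu-id", "1", "--gpu-ids", "0, 1"])
)
```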
Adds community subcategory tagging for corpus recipes (kiwi#118).
Any product with a recipe corpus can use this to let users tag recipes
into browse taxonomy locations that FTS missed.
- 005_recipe_tags.sql: recipe_tags (per-recipe taxonomy tag with upvote
counter) + recipe_tag_votes (dedup table; submitter self-vote at insert)
- store.py: submit_recipe_tag(), upvote_recipe_tag(), get_recipe_tag_by_id(),
list_tags_for_recipe(), get_accepted_recipe_ids_for_subcategory()
Acceptance threshold: upvotes >= 2 (submitter counts as 1, one more needed).
Tags keyed as recipe_source='corpus' for future community-recipe extension.
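The voting rule can be modelled in memory as below. This is an illustrative sketch, not the SQL-backed store: RecipeTag, submit_tag, and upvote are hypothetical names, and a set of voter IDs stands in for the recipe_tag_votes dedup table.

```python
from dataclasses import dataclass, field

ACCEPT_THRESHOLD = 2  # submitter counts as 1, one more vote needed


@dataclass
class RecipeTag:
    recipe_id: int
    subcategory: str
    voters: set[str] = field(default_factory=set)

    @property
    def upvotes(self) -> int:
        return len(self.voters)

    @property
    def accepted(self) -> bool:
        return self.upvotes >= ACCEPT_THRESHOLD


def submit_tag(recipe_id: int, subcategory: str, submitter: str) -> RecipeTag:
    # The submitter's own vote is recorded at insert time.
    return RecipeTag(recipe_id, subcategory, {submitter})


def upvote(tag: RecipeTag, voter: str) -> bool:
    """Record a vote; return False if this voter already voted."""
    if voter in tag.voters:
        return False
    tag.voters.add(voter)
    return True


tag = submit_tag(42, "soups", "alice")
repeat_vote = upvote(tag, "alice")  # False: the self-vote already counts
second_vote = upvote(tag, "bob")    # True: tag now meets the threshold
```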
Names cf-text, cf-voice, cf-vision as trunk services with the cf_orch
allocation block pattern. Documents all backend types (openai_compat,
anthropic, vision_service) and the env-var auto-detection path.
Adds circuitforge_core.preferences.currency with get/set_currency_code()
and format_currency(). Priority chain: store → CURRENCY_DEFAULT env → USD.
Formatting uses babel when available; falls back to a 30-currency symbol
table with correct ISO 4217 minor-unit decimal places (0 for JPY, KRW, etc.).
Consumed by Snipe, Kiwi, Peregrine, Crossbill. Bumps to v0.13.0.
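A minimal sketch of the priority chain and the non-babel fallback, assuming a dict-like store with a "currency_code" key. resolve_currency and format_fallback are hypothetical names, and the tables here are a small excerpt for illustration rather than the 30-currency table.

```python
import os
from typing import Mapping, Optional

# ISO 4217 currencies with zero minor-unit digits (excerpt).
ZERO_DECIMAL = {"JPY", "KRW"}
SYMBOLS = {"USD": "$", "EUR": "€", "JPY": "¥", "KRW": "₩"}


def resolve_currency(
    store: Mapping[str, str], env: Optional[Mapping[str, str]] = None
) -> str:
    """Priority chain: store, then CURRENCY_DEFAULT env var, then USD."""
    env = os.environ if env is None else env
    code = store.get("currency_code") or env.get("CURRENCY_DEFAULT") or "USD"
    return code.upper()


def format_fallback(amount: float, code: str) -> str:
    """Symbol-table formatting for when babel is not installed."""
    digits = 0 if code in ZERO_DECIMAL else 2
    # Unknown currencies fall back to the ISO code as a prefix.
    symbol = SYMBOLS.get(code, code + " ")
    return f"{symbol}{amount:,.{digits}f}"


usd = format_fallback(1234.5, "USD")  # "$1,234.50"
jpy = format_fallback(1500, "JPY")    # "¥1,500"
```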
12 signal functions covering staleness, repost patterns, salary transparency,
ATS blackhole detection, and enrichment signals. All pure functions — no LLM,
no network, no I/O. trust_score = 1 - sum(triggered weights), clamped to [0,1].
confidence reflects fraction of signals with available evidence.
Salary transparency enforced for CO/CA/NY/WA/IL/MA. ATS blackhole patterns:
Lever, Greenhouse, Workday, iCIMS, Taleo.
83 tests (models, all 12 signals individually, scorer). Bumps to v0.12.0.
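The scoring formula can be sketched as a pure function. The signal names and weights below are made up for illustration, and representing "no evidence" as None is an assumption about how the real models encode it.

```python
from typing import Optional

# name -> (triggered, weight); triggered is None when no evidence exists.
Signals = dict[str, tuple[Optional[bool], float]]


def score(signals: Signals) -> tuple[float, float]:
    """Return (trust_score, confidence) for a set of evaluated signals."""
    penalty = sum(weight for triggered, weight in signals.values() if triggered)
    # trust_score = 1 - sum(triggered weights), clamped to [0, 1]
    trust = min(1.0, max(0.0, 1.0 - penalty))
    # confidence = fraction of signals that had evidence either way
    with_evidence = sum(1 for triggered, _ in signals.values() if triggered is not None)
    confidence = with_evidence / len(signals) if signals else 0.0
    return trust, confidence


example: Signals = {
    "stale_posting": (True, 0.25),
    "repost_pattern": (False, 0.5),
    "salary_missing": (None, 0.25),  # no evidence available
    "ats_blackhole": (True, 0.25),
}
trust, confidence = score(example)  # (0.5, 0.75)
```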
Add early validation in create_app(): raise ValueError with a clear
message when model_path is empty and mock=False. Prevents the cryptic
HFValidationError that surfaced when cf-orch passed an empty {model}
arg (cf-orch #46). Surfaces the real problem at the service layer
rather than deep inside the HuggingFace loader.
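The guard amounts to something like the following sketch. The return value is a placeholder dict rather than a FastAPI app, and the error wording is illustrative, not the actual message.

```python
def create_app(model_path: str, mock: bool = False) -> dict:
    """Sketch of the early check; the real function builds the service app."""
    if not mock and not (model_path or "").strip():
        # Fail at the service boundary instead of inside the model loader.
        raise ValueError(
            "model_path is empty; pass a model path or ID, or use mock=True"
        )
    return {"model_path": model_path, "mock": mock}


try:
    create_app("")
    error_message = None
except ValueError as exc:
    error_message = str(exc)

mock_app = create_app("", mock=True)  # mock mode needs no model path
```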