Add the circuitforge_core.video package implementing the cf-video inference
service managed by cf-orch.
Service endpoints:
GET /health — liveness check; model name + VRAM
POST /caption — dense scene description + timestamped event list
POST /find — temporal grounding of a natural-language event query
Backend hierarchy:
VideoBackend (Protocol)
MarlinBackend — NemoStation/Marlin-2B via transformers>=5.7.0
MockVideoBackend — deterministic stub; no GPU required
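
As a rough illustration of the hierarchy above, the sketch below shows how a structural Protocol lets the mock stand in for the real backend without inheritance. The method names (caption, find), their parameters, and the Span alias are assumptions made for this example; they are not taken from the actual package.

```python
from typing import Optional, Protocol, Tuple, runtime_checkable

# Hypothetical alias: (start_seconds, end_seconds) of a grounded event.
Span = Tuple[float, float]


@runtime_checkable
class VideoBackend(Protocol):
    """Structural interface; method names here are illustrative assumptions."""

    def caption(self, video_path: str, max_new_tokens: int) -> str: ...

    def find(self, video_path: str, event: str) -> Optional[Span]: ...


class MockVideoBackend:
    """Deterministic stub: identical inputs always give identical outputs."""

    def caption(self, video_path: str, max_new_tokens: int) -> str:
        return f"mock caption for {video_path}"

    def find(self, video_path: str, event: str) -> Optional[Span]:
        # Fixed span so tests can assert on an exact value without a GPU.
        return (0.0, 1.0)
```

Because the Protocol is structural, MockVideoBackend satisfies it without subclassing, which keeps the test double free of any torch or transformers import.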
Pydantic request/response models enforce parameter bounds at the API
boundary (ge/le limits on max_new_tokens, min_length=1 on event). Span is
serialized as list[float] | None for JSON compatibility.
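
A minimal sketch of this kind of boundary validation follows. The model names, the video_path field, the default of 256, and the 1..2048 range are placeholders chosen for illustration; the real bounds live in the package and are not stated here.

```python
from typing import List, Optional

from pydantic import BaseModel, Field, ValidationError


class FindRequest(BaseModel):
    # Hypothetical request model; field set and limits are assumptions.
    video_path: str
    event: str = Field(..., min_length=1)  # reject empty queries
    max_new_tokens: int = Field(default=256, ge=1, le=2048)


class FindResponse(BaseModel):
    # A two-element [start, end] list, or None when nothing was found.
    span: Optional[List[float]] = None


def span_to_json(span):
    """Convert an internal (start, end) tuple into a JSON-friendly list."""
    return None if span is None else [float(span[0]), float(span[1])]


try:
    FindRequest(video_path="clip.mp4", event="")
except ValidationError as exc:
    print(f"rejected at the boundary: {len(exc.errors())} error(s)")
```

Validating in the models means the backend never sees an out-of-range value, and FastAPI turns the ValidationError into a 422 response on its own.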
MarlinBackend loads the model eagerly in __init__ so that cf-orch's
2-second liveness poll catches load failures immediately. The
FORCE_QWENVL_VIDEO_READER env var is defaulted to torchcodec (faster than
the av path) before transformers is imported.
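
One way the env var default could be applied is sketched below, assuming os.environ.setdefault is used so that a value already exported by the operator is kept. The helper name configure_video_reader is hypothetical.

```python
import os


def configure_video_reader(default: str = "torchcodec") -> str:
    """Set FORCE_QWENVL_VIDEO_READER only if it is not already set.

    Returns the value in effect. Hypothetical helper for illustration.
    """
    return os.environ.setdefault("FORCE_QWENVL_VIDEO_READER", default)


# Must run before the heavy imports, which read the variable at import time.
configure_video_reader()
# import transformers  # the real import would follow here
```

The ordering matters: a default applied after the import would have no effect on a reader that was already selected.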
pyproject.toml extras:
video-marlin — torch, transformers, torchcodec, qwen-vl-utils, av, Pillow
video-service — video-marlin + fastapi + uvicorn
Test coverage: 46 tests across test_mock_backend.py and test_app.py.
All pass without a GPU or a real video file.
Closes: #71