Monocular depth source — Depth Anything v2 + MiDaS fallback (PyTorch) #13

Open
opened 2026-04-26 21:46:23 -07:00 by pyr0ball · 0 comments

Implement MonocularDepthSource — single webcam + PyTorch for depth estimation.

Model tiers (auto-selected by available VRAM):

  • Depth Anything v2 (default, ~2GB VRAM or CPU): current state-of-the-art monocular depth model. Load via HuggingFace transformers. Outputs a relative depth map, not absolute metric depth.
  • MiDaS v3 Small (fallback, CPU-friendly, ~400MB): faster and smaller, 10–15 fps on CPU. Lower-resolution depth map.
  • ZoeDepth (optional, ~3GB VRAM): metric depth in meters. Useful for proximity-to-camera slider control.
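The tier auto-selection could look roughly like the sketch below. It is only an illustration: `DepthTier`, `free_vram_gb`, `select_tier`, and the `want_metric` flag are hypothetical names, the thresholds simply reuse the approximate VRAM figures from the list, and falling back to MiDaS whenever no GPU is found is an assumption (Depth Anything v2 can also run on CPU).

```python
from enum import Enum
from typing import Optional


class DepthTier(Enum):
    DEPTH_ANYTHING_V2 = "depth-anything-v2"
    MIDAS_SMALL = "midas-v3-small"
    ZOEDEPTH = "zoedepth"


# Approximate VRAM needs in GB, copied from the tier list (assumed thresholds).
DEPTH_ANYTHING_GB = 2.0
ZOEDEPTH_GB = 3.0


def free_vram_gb() -> Optional[float]:
    """Return free CUDA memory in GB, or None without a GPU or without torch."""
    try:
        import torch  # imported lazily so the selector works without PyTorch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    free_bytes, _total_bytes = torch.cuda.mem_get_info()
    return free_bytes / 1024**3


def select_tier(vram_gb: Optional[float], want_metric: bool = False) -> DepthTier:
    """Pick a model tier from free VRAM; None means CPU only."""
    # ZoeDepth is opt-in: only chosen when metric depth is requested and it fits.
    if want_metric and vram_gb is not None and vram_gb >= ZOEDEPTH_GB:
        return DepthTier.ZOEDEPTH
    if vram_gb is not None and vram_gb >= DEPTH_ANYTHING_GB:
        return DepthTier.DEPTH_ANYTHING_V2
    # Assumption: with too little VRAM or no GPU, prefer the CPU-friendly fallback.
    return DepthTier.MIDAS_SMALL
```

Calling `select_tier(free_vram_gb())` at startup would pick a tier once. Keeping the decision in a pure function separate from the hardware probe makes it easy to unit-test without a GPU.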

Conda env: runs in merlin-vision env (separate from cf env — PyTorch is large). manage.sh creates/manages this env automatically.

Output: relative Z-values sufficient for gesture discrimination. get_3d_landmarks() maps normalized depth to approximate Z in the same coordinate space as Phase A.
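One way the depth-to-Z mapping might work is sketched below. This is not the planned implementation: `attach_relative_z` and `z_scale` are hypothetical names, landmarks are assumed to arrive as (x, y) pairs normalized to [0, 1], and the depth map is min-max normalized per frame. The real Phase A coordinate space and the sign convention of the model output (nearer = larger or smaller) would need to be checked and may require flipping the value.

```python
from typing import List, Sequence, Tuple

Landmark2D = Tuple[float, float]          # (x, y), each normalized to [0, 1]
Landmark3D = Tuple[float, float, float]   # (x, y, z)


def attach_relative_z(
    landmarks: Sequence[Landmark2D],
    depth: Sequence[Sequence[float]],
    z_scale: float = 1.0,
) -> List[Landmark3D]:
    """Sample a relative depth map at each landmark and return (x, y, z)."""
    height, width = len(depth), len(depth[0])

    # Min-max normalize per frame: relative depth has no fixed scale.
    d_min = min(min(row) for row in depth)
    d_max = max(max(row) for row in depth)
    span = d_max - d_min

    result: List[Landmark3D] = []
    for x, y in landmarks:
        # Convert normalized coordinates to pixel indices, clamped to the map.
        col = min(max(int(x * width), 0), width - 1)
        row = min(max(int(y * height), 0), height - 1)
        # A flat depth map carries no information, so z falls back to 0.
        norm = (depth[row][col] - d_min) / span if span > 0 else 0.0
        result.append((x, y, norm * z_scale))
    return result
```

The function accepts nested lists to stay dependency-free; a NumPy array indexed the same way would also work. Because normalization is per frame, Z values are only comparable within a frame, which matches the stated goal of gesture discrimination rather than metric measurement.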

Optional dep group: merlin[depth-mono] → torch>=2.0, transformers>=4.38.

pyr0ball added this to the Phase B — Depth and High-Fidelity Spatial Input milestone 2026-04-26 21:46:23 -07:00
Reference: Circuit-Forge/raven#13