feat: cf_core.musicgen — MusicGen HTTP service (Sparrow blocker) #49

Closed

opened 2026-04-17 14:54:50 -07:00 by pyr0ball · 0 comments

Owner

Context

Sparrow (music continuation studio, SPRW) needs circuitforge_core.musicgen.app to exist as a standalone FastAPI HTTP service. It is already referenced in the cf-orch node profiles:

# heimdall.yaml / navi.yaml
cf-musicgen:
  max_mb: 8000
  args_template: "-m circuitforge_core.musicgen.app --model facebook/musicgen-melody --port {port} --gpu-id {gpu_id}"

cf-orch spawns it via python -m circuitforge_core.musicgen.app. Sparrow's MusicGenClient allocates it from cf-orch and calls it directly.
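
To illustrate how the `args_template` above could become a launch command, here is a minimal sketch. The issue does not show how cf-orch expands the template, so `build_command` is a hypothetical helper, and the use of `str.format` plus `shlex.split` is an assumption based only on the `{port}` and `{gpu_id}` placeholders.

```python
import shlex
import sys

# Template string copied from the node profile above.
ARGS_TEMPLATE = (
    "-m circuitforge_core.musicgen.app --model facebook/musicgen-melody "
    "--port {port} --gpu-id {gpu_id}"
)


def build_command(port: int, gpu_id: int, python: str = sys.executable) -> list:
    """Fill the placeholders and split the result into an argv list.

    Hypothetical helper: cf-orch's real expansion logic is not shown in this issue.
    """
    filled = ARGS_TEMPLATE.format(port=port, gpu_id=gpu_id)
    return [python, *shlex.split(filled)]


if __name__ == "__main__":
    # Print the argv list that would be handed to a process spawner.
    print(build_command(port=8123, gpu_id=0))
```

The resulting list can be passed straight to `subprocess.Popen`, which avoids shell quoting issues.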

Required API

POST /continue
  {
    audio_path: str,          # path to source audio file (wav/mp3)
    prompt: str,              # text conditioning
    duration_s: float,        # output length (5-30s)
    cfg_coef: float,          # classifier-free guidance (default 3.0)
    prompt_duration_s: float  # how much of source to use as melody prompt (default 10.0)
  }
  -> { audio_path: str, duration_s: float }

GET /health
  -> { status: "ready" | "loading" | "error", model: str, gpu_id: int }
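
The request and response shapes above can be sketched as typed records. In the FastAPI service these would presumably be Pydantic models; plain dataclasses are used here only to keep the sketch dependency-free. The class names `ContinueRequest` and `ContinueResponse` are hypothetical, and rejecting a `duration_s` outside 5-30 s (rather than clamping it) is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContinueRequest:
    audio_path: str                  # path to source audio file (wav/mp3)
    prompt: str                      # text conditioning
    duration_s: float                # output length, 5-30 s per the spec
    cfg_coef: float = 3.0            # classifier-free guidance (spec default)
    prompt_duration_s: float = 10.0  # seconds of source used as melody prompt

    def __post_init__(self) -> None:
        # Assumption: out-of-range durations are rejected, not clamped.
        if not 5.0 <= self.duration_s <= 30.0:
            raise ValueError("duration_s must be between 5 and 30 seconds")


@dataclass(frozen=True)
class ContinueResponse:
    audio_path: str    # path to the generated wav file
    duration_s: float  # length of the generated audio
```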

POST /continue is synchronous: it blocks until generation completes. Sparrow wraps the call in a background asyncio task and pushes SSE status events when it finishes.
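
The background-task pattern described above might look like the following sketch. `blocking_continue` is a stand-in for the real HTTP call, and the event fields (`status`, `detail`) are hypothetical; the issue does not define Sparrow's SSE payloads.

```python
import asyncio
import time
from typing import Awaitable, Callable


def blocking_continue(payload: dict) -> dict:
    """Stand-in for the synchronous POST /continue call (hypothetical)."""
    time.sleep(0.05)  # pretend generation takes a while
    return {"audio_path": "/tmp/out.wav", "duration_s": payload["duration_s"]}


async def run_generation(
    payload: dict, notify: Callable[[dict], Awaitable[None]]
) -> None:
    """Run the blocking call in a worker thread, then emit one status event."""
    try:
        result = await asyncio.to_thread(blocking_continue, payload)
        await notify({"status": "done", **result})
    except Exception as exc:
        await notify({"status": "error", "detail": str(exc)})


async def main() -> list:
    events = []

    async def notify(event: dict) -> None:
        events.append(event)  # a real app would push this over SSE

    # The event loop stays free for other requests while the thread works.
    task = asyncio.create_task(run_generation({"duration_s": 10.0}, notify))
    await task
    return events


if __name__ == "__main__":
    print(asyncio.run(main()))
```

`asyncio.to_thread` keeps the blocking request off the event loop, so other Sparrow requests are still served during generation.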

Implementation notes

Model: facebook/musicgen-melody (melody-conditioned continuation)
Input audio is read from disk; MusicGen uses prompt_duration_s seconds of it as the melody conditioning
Output: writes a new wav file and returns its path
--gpu-id selects the CUDA device and --port selects the listen port (both are passed by the cf-orch agent)
--model is overridable (e.g. facebook/musicgen-small for lower-VRAM nodes)
Mock mode (CF_MUSICGEN_MOCK=1): returns a copy of the source file so Sparrow can be developed without a GPU
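
A minimal sketch of the CLI and mock switch described in these notes follows. The flag names and the `CF_MUSICGEN_MOCK` variable come from the issue; making `--port` required, defaulting `--gpu-id` to 0, and the helper names `parse_args` and `use_mock` are assumptions.

```python
import argparse
import os


def parse_args(argv=None) -> argparse.Namespace:
    """Parse the flags that cf-orch passes via args_template."""
    parser = argparse.ArgumentParser(prog="circuitforge_core.musicgen.app")
    parser.add_argument("--model", default="facebook/musicgen-melody",
                        help="model name; override for lower-VRAM nodes")
    parser.add_argument("--port", type=int, required=True,
                        help="listen port (assumed required)")
    parser.add_argument("--gpu-id", type=int, default=0,
                        help="CUDA device index (default 0 is an assumption)")
    return parser.parse_args(argv)


def use_mock(environ=os.environ) -> bool:
    """True when CF_MUSICGEN_MOCK=1, i.e. no GPU inference should run."""
    return environ.get("CF_MUSICGEN_MOCK") == "1"


if __name__ == "__main__":
    args = parse_args()
    backend = "mock" if use_mock() else "real"
    print(f"{backend} backend, model={args.model}, "
          f"port={args.port}, gpu_id={args.gpu_id}")
```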

Module structure

circuitforge_core/
└── musicgen/
    ├── __init__.py
    ├── app.py          FastAPI app + CLI entrypoint
    ├── generator.py    MusicGenGenerator class (lazy model load, generate())
    └── mock.py         MockMusicGenGenerator (copies source, adds silence)
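
As an illustration of what mock.py could contain, here is a sketch of `MockMusicGenGenerator` built on the standard-library `wave` module. The issue only says the mock "copies source, adds silence", so the `generate()` signature, appending `duration_s` seconds of silence, and reporting the total output length are all assumptions. The sketch handles wav input only, not mp3.

```python
import wave


class MockMusicGenGenerator:
    """GPU-free stand-in: copies the source wav and appends silence."""

    def generate(self, audio_path: str, out_path: str, duration_s: float) -> dict:
        # Read the source audio and its format parameters.
        with wave.open(audio_path, "rb") as src:
            params = src.getparams()
            frames = src.readframes(src.getnframes())

        # Assumption: "adds silence" means appending duration_s seconds.
        silence_frames = int(duration_s * params.framerate)
        bytes_per_frame = params.sampwidth * params.nchannels

        with wave.open(out_path, "wb") as dst:
            dst.setparams(params)
            dst.writeframes(frames)
            # Zero bytes are silence for 16-bit PCM (8-bit wav is unsigned,
            # so a real mock would need 0x80 there).
            dst.writeframes(b"\x00" * (silence_frames * bytes_per_frame))

        total_s = (params.nframes + silence_frames) / params.framerate
        return {"audio_path": out_path, "duration_s": total_s}
```

Because it returns the same shape as the real `POST /continue` response, the HTTP layer can swap generators without changing its handler.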

Tier / licensing

BSL 1.1 (real inference)
Mock + HTTP interface: MIT
Follows the same pattern as cf_core.stt and cf_core.tts

Blocks

Circuit-Forge/sparrow backend: MusicGenClient cannot be completed without this

Cross-product note

Filed as part of Sparrow architecture review. A follow-up issue should extract shared audio utilities (PCM conversion, resampling, energy gate) into cf_core.audio — needed by both cf-voice and Sparrow.
