FastAPI backend (port 8522):
- Session lifecycle: POST /session/start, DELETE /session/{id}/end, GET /session/{id}
- SSE stream: GET /session/{id}/stream — per-subscriber asyncio.Queue fan-out, 15s heartbeat
- History: GET /session/{id}/history with min_confidence + limit filters
- Audio: WS /session/{id}/audio — binary PCM ingestion stub (real inference in v0.2.x)
- Export: GET /session/{id}/export — downloadable JSON session log
- ContextClassifier background task per session (CF_VOICE_MOCK=1 in dev)
- ToneEvent SSE wire format per cf-core#40 (locked field names)
- Tier gate: CFG-LNNT- prefix check, 402 for paid features
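The per-subscriber fan-out behind the SSE stream can be sketched as follows. This is an illustrative minimal version, not the actual cf-core implementation; the class and method names are hypothetical:

```python
import asyncio


class ToneEventFanOut:
    """Per-session broadcaster: each SSE subscriber gets its own queue,
    so a slow consumer never blocks the others. (Illustrative sketch;
    the real service also pushes a heartbeat event every 15s.)"""

    def __init__(self) -> None:
        self._queues: set[asyncio.Queue] = set()

    def subscribe(self) -> asyncio.Queue:
        q: asyncio.Queue = asyncio.Queue()
        self._queues.add(q)
        return q

    def unsubscribe(self, q: asyncio.Queue) -> None:
        self._queues.discard(q)

    def publish(self, event: dict) -> None:
        # Deliver to every live subscriber without awaiting any of them.
        for q in self._queues:
            q.put_nowait(event)
```

Each `GET /session/{id}/stream` handler would call `subscribe()`, drain its queue into SSE frames, and `unsubscribe()` on disconnect.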
Vue 3 frontend (port 8521, Vite + UnoCSS + Pinia):
- NowPanel: affect-aware border tint, subtext, prosody flags, shift indicator
- HistoryStrip: horizontal scroll, last 8 events with affect color
- ComposeBar: start/stop session, SSE connection lifecycle
- useToneStream: EventSource composable
- useAudioCapture: AudioWorklet → Int16 PCM → WebSocket (v0.1.x stub)
- audio-processor.js: 100ms chunk accumulator in AudioWorklet thread
- Respects prefers-reduced-motion globally
26 tests passing, manage.sh, Dockerfile, compose.yml included.
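For reference, the 100ms chunks the AudioWorklet accumulates work out to a fixed frame size given the 16 kHz mono Int16 format the audio endpoint expects (constants below restate the summary's numbers; nothing else is assumed):

```python
SAMPLE_RATE_HZ = 16_000   # mono PCM, per the audio endpoint contract
BYTES_PER_SAMPLE = 2      # Int16
CHUNK_MS = 100            # accumulator window in audio-processor.js

samples_per_chunk = SAMPLE_RATE_HZ * CHUNK_MS // 1000   # 1600 samples
bytes_per_chunk = samples_per_chunk * BYTES_PER_SAMPLE  # 3200 bytes per binary frame
```

So each WebSocket binary frame is a constant 3200 bytes, which the `{"ok": true, "bytes": ...}` acknowledgement echoes back.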
49 lines
1.9 KiB
Python
# app/api/audio.py — WebSocket audio ingestion endpoint
#
# Receives raw PCM Int16 audio chunks from the browser's AudioWorkletProcessor.
# Each message is a binary frame: 16kHz mono Int16 PCM.
# The backend accumulates chunks until cf-voice processes them.
#
# Notation v0.1.x: audio is accepted and acknowledged but inference runs
# through the background ContextClassifier (started at session creation),
# not inline here. This endpoint is wired for the real audio path
# (Navigation v0.2.x) where chunks feed the STT + diarizer directly.

from __future__ import annotations

import logging

from fastapi import APIRouter, WebSocket, WebSocketDisconnect

from app.services import session_store

logger = logging.getLogger(__name__)

router = APIRouter(prefix="/session", tags=["audio"])


@router.websocket("/{session_id}/audio")
async def audio_ws(websocket: WebSocket, session_id: str) -> None:
    """
    WebSocket endpoint for binary PCM audio upload.

    Clients (browser AudioWorkletProcessor) send binary frames.
    Server acknowledges each frame with {"ok": true}.

    In mock mode (CF_VOICE_MOCK=1) the session's ContextClassifier generates
    synthetic frames independently -- audio sent here is accepted but not
    processed. Real inference wiring happens in Navigation v0.2.x.
    """
    session = session_store.get_session(session_id)
    if session is None:
        # Accept the handshake first: a close sent before accept() is
        # surfaced as a plain handshake rejection, and the custom close
        # code/reason (4004) would never reach the client.
        await websocket.accept()
        await websocket.close(code=4004, reason=f"Session {session_id} not found")
        return

    await websocket.accept()
    logger.info("Audio WS connected for session %s", session_id)

    try:
        while True:
            data = await websocket.receive_bytes()
            # Notation v0.1.x: acknowledge receipt; real inference in v0.2.x
            await websocket.send_json({"ok": True, "bytes": len(data)})
    except WebSocketDisconnect:
        logger.info("Audio WS disconnected for session %s", session_id)