cf-voice/cf_voice
pyr0ball fed6388b99 feat: real inference pipeline — STT, tone classifier, diarization, mic capture
- cf_voice/stt.py: WhisperSTT async wrapper (faster-whisper, thread-pool executor,
  rolling 50-word session prompt for cross-chunk context continuity)
- cf_voice/classify.py: ToneClassifier — wav2vec2 SER + librosa prosody flags
  (energy, ZCR speech rate, YIN pitch contour) mapped to AFFECT_LABELS
- cf_voice/diarize.py: Diarizer async wrapper around pyannote/speaker-diarization-3.1;
  speaker_at() helper for Navigation v0.2.x wiring
- cf_voice/capture.py: MicVoiceIO — sounddevice 16kHz mono capture, 2s window
  accumulation, parallel STT+classify tasks, shift_magnitude from confidence delta
- cf_voice/io.py: make_io() now returns MicVoiceIO when CF_VOICE_MOCK is unset
- cf_voice/context.py: classify_chunk() split into mock/real paths; real path
  decodes base64 PCM and runs ToneClassifier synchronously (cf-orch endpoint)
- pyproject.toml: inference extras expanded (faster-whisper, sounddevice,
  librosa, python-dotenv)
- .env.example: HF_TOKEN, CF_VOICE_WHISPER_MODEL, CF_VOICE_DEVICE, CF_VOICE_MOCK,
  CF_VOICE_CONFIDENCE_THRESHOLD
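
The capture.py bullet above mentions 2s window accumulation at 16kHz mono and a shift_magnitude derived from a confidence delta. The following is a minimal sketch of how those two pieces could look, not the code in this commit. The names `WindowAccumulator` and `shift_magnitude` as written here, the 16-bit sample width, and the use of an absolute difference are all assumptions made for illustration.

```python
SAMPLE_RATE = 16_000        # 16kHz mono, as stated in the commit message
WINDOW_SECONDS = 2          # 2s window, as stated in the commit message
BYTES_PER_SAMPLE = 2        # assumption: 16-bit PCM samples
WINDOW_BYTES = SAMPLE_RATE * WINDOW_SECONDS * BYTES_PER_SAMPLE


class WindowAccumulator:
    """Hypothetical helper: buffer raw PCM chunks and emit fixed-size windows."""

    def __init__(self, window_bytes=WINDOW_BYTES):
        self._buf = bytearray()
        self._window_bytes = window_bytes

    def push(self, chunk):
        """Append a chunk and return every complete window now available."""
        self._buf.extend(chunk)
        windows = []
        while len(self._buf) >= self._window_bytes:
            windows.append(bytes(self._buf[: self._window_bytes]))
            del self._buf[: self._window_bytes]
        return windows


def shift_magnitude(prev_confidence, confidence):
    """Assumed definition: absolute change in confidence between windows."""
    if prev_confidence is None:
        return 0.0  # no previous window to compare against
    return abs(confidence - prev_confidence)
```

Each window returned by `push()` would then be handed to the STT and classify tasks; leftover bytes stay buffered until the next chunk arrives.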

Prior art ported from: Plex-Scripts/transcription/diarization.py (pyannote
setup), devl/ogma/backend/speech/transcription_engine.py (faster-whisper
preprocessing and session prompt pattern).
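
As an illustration of the session prompt pattern, a rolling 50-word prompt can be kept in a bounded deque and passed to the next transcription call. This is a sketch under assumptions, not the code in stt.py: the class name `RollingPrompt` and its methods are hypothetical.

```python
from collections import deque


class RollingPrompt:
    """Hypothetical helper: keep the most recent words of transcribed text."""

    def __init__(self, max_words=50):
        # A deque with maxlen silently drops the oldest words when full.
        self._words = deque(maxlen=max_words)

    def update(self, text):
        """Add the words of a newly transcribed chunk."""
        self._words.extend(text.split())

    def text(self):
        """Return the current prompt as a single space-joined string."""
        return " ".join(self._words)
```

With faster-whisper, the result of `text()` could be supplied as the `initial_prompt` argument of `WhisperModel.transcribe()` so each chunk is decoded with the tail of the previous transcript as context.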
2026-04-06 17:33:51 -07:00
| File | Last commit | Date |
| --- | --- | --- |
| `__init__.py` | feat: initial cf-voice stub — VoiceFrame API, mock IO, context classifier | 2026-04-06 16:03:07 -07:00 |
| `capture.py` | feat: real inference pipeline — STT, tone classifier, diarization, mic capture | 2026-04-06 17:33:51 -07:00 |
| `classify.py` | feat: real inference pipeline — STT, tone classifier, diarization, mic capture | 2026-04-06 17:33:51 -07:00 |
| `cli.py` | feat: initial cf-voice stub — VoiceFrame API, mock IO, context classifier | 2026-04-06 16:03:07 -07:00 |
| `context.py` | feat: real inference pipeline — STT, tone classifier, diarization, mic capture | 2026-04-06 17:33:51 -07:00 |
| `diarize.py` | feat: real inference pipeline — STT, tone classifier, diarization, mic capture | 2026-04-06 17:33:51 -07:00 |
| `events.py` | feat: AudioEvent models, classify_chunk() for per-chunk request-response path | 2026-04-06 16:53:10 -07:00 |
| `io.py` | feat: real inference pipeline — STT, tone classifier, diarization, mic capture | 2026-04-06 17:33:51 -07:00 |
| `models.py` | feat: initial cf-voice stub — VoiceFrame API, mock IO, context classifier | 2026-04-06 16:03:07 -07:00 |
| `stt.py` | feat: real inference pipeline — STT, tone classifier, diarization, mic capture | 2026-04-06 17:33:51 -07:00 |