cf-voice

Circuit-Forge/cf-voice

Commit graph

Author	SHA1	Message	Date

pyr0ball

24f04b67db

feat: full voice pipeline — AST acoustic, accent, privacy, prosody, dimensional, trajectory, telephony, FastAPI app

New modules shipped (from Linnet integration):
- acoustic.py: AST (MIT/ast-finetuned-audioset-10-10-0.4593) replaces YAMNet stub;
  527 AudioSet classes mapped to queue/speaker/environ/scene labels; _LABEL_MAP
  includes hold_music, ringback, DTMF, background_shift, AMD signal chain
- accent.py: facebook/mms-lid-126 language ID → regional accent labels
  (en_gb, en_us, en_au, fr, es, de, zh, …); lazy-loaded, gated by CF_VOICE_ACCENT
- privacy.py: compound privacy risk scorer — public_env, background_voices,
  nature scene, accent signals; returns 0–3 score without storing any audio
- prosody.py: openSMILE-backed prosody extractor (sarcasm_risk, flat_f0_score,
  speech_rate, pitch_range); mock mode returns neutral values
- dimensional.py: audeering/wav2vec2-large-robust-12-ft-emotion-msp-dim
  valence/arousal/dominance scorer; gated by CF_VOICE_DIMENSIONAL
- trajectory.py: rolling buffer for arousal/valence deltas, trend detection
  (escalating/suppressed/stable), coherence scoring, suppression/reframe flags
- telephony.py: TelephonyBackend Protocol + MockTelephonyBackend + SignalWireBackend
  + FreeSWITCHBackend; CallSession dataclass; make_telephony() factory
- app.py: FastAPI service (port 8007) — /health + /classify; accepts base64 PCM
  chunks, returns full AudioEventOut including dimensional/prosody/accent fields
- prefs.py: voice preference helpers (elcor_mode, confidence_threshold,
  whisper_model, elcor_prior_frames); cf-core and env-var fallback
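As an illustration of the trajectory.py entry above, here is a minimal sketch of a rolling buffer that turns arousal readings into a trend label. It is not the module's actual code: the class name `ArousalTrajectory`, the window size, the threshold, and the reading of "suppressed" as a sustained drop in arousal are all assumptions made for this example. Only the three trend labels come from the commit message.

```python
from collections import deque
from statistics import fmean
from typing import Deque, List


class ArousalTrajectory:
    """Hypothetical rolling buffer; names and defaults are illustrative only."""

    def __init__(self, window: int = 5, threshold: float = 0.05) -> None:
        # deque(maxlen=...) silently evicts the oldest reading once full.
        self._values: Deque[float] = deque(maxlen=window)
        self.threshold = threshold

    def push(self, arousal: float) -> None:
        self._values.append(arousal)

    def deltas(self) -> List[float]:
        # Differences between consecutive readings still in the buffer.
        vals = list(self._values)
        return [b - a for a, b in zip(vals, vals[1:])]

    def trend(self) -> str:
        d = self.deltas()
        if not d:
            return "stable"  # fewer than two readings: nothing to compare
        mean_delta = fmean(d)
        if mean_delta > self.threshold:
            return "escalating"
        if mean_delta < -self.threshold:
            return "suppressed"  # assumption: falling arousal maps to this label
        return "stable"
```

A caller would push one arousal value per classified chunk and read `trend()` after each push; valence could be tracked the same way with a second buffer.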

Tests: fix stale tests (YAMNetAcousticBackend → ASTAcousticBackend, scene field
added to AcousticResult, speaker_at gap now resolves dominant speaker not UNKNOWN,
make_io real path returns MicVoiceIO when sounddevice installed). 78 tests passing.

Closes #2, #3.

2026-04-18 22:36:58 -07:00

pyr0ball

335d51f02f

feat: lock ToneEvent SSE wire format (cf-core#40)

- AudioEvent: add speaker_id field (was on VoiceFrame only; needed on all events)
- ToneEvent: add session_id field for session correlation across embedded consumers
- README: full wire format documentation — JSON shape, field reference table,
  SSE envelope, Elcor mode subtext table, module license map
- ToneEvent docstring references cf-core#40 as the wire format spec
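To make the wire-format change concrete, the sketch below serializes an event into a Server-Sent Events frame and parses it back. The authoritative JSON shape lives in the README and cf-core#40, neither of which is reproduced here, so treat this as a guess at the general pattern: the `speaker_id` and `session_id` fields come from the commit message, while `label`, `confidence`, the `tone` event name, and both helper functions are assumptions.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class ToneEvent:
    # label and confidence are assumed fields; speaker_id and session_id
    # are the two fields named in the commit message.
    label: str
    confidence: float
    speaker_id: Optional[str] = None
    session_id: Optional[str] = None


def to_sse(event: ToneEvent, event_name: str = "tone") -> str:
    """Format one event as an SSE frame: an event line, a data line, a blank line."""
    payload = json.dumps(asdict(event), separators=(",", ":"))
    return f"event: {event_name}\ndata: {payload}\n\n"


def parse_sse(frame: str) -> dict:
    """Recover the JSON payload from a single-data-line SSE frame."""
    for line in frame.splitlines():
        if line.startswith("data:"):
            return json.loads(line[len("data:"):].strip())
    raise ValueError("frame has no data line")
```

Carrying `session_id` inside the payload, rather than relying on the connection, lets a consumer that multiplexes several streams correlate events without extra bookkeeping.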

Closes cf-core#40

2026-04-06 17:51:09 -07:00

pyr0ball

35fc0a088c

feat: initial cf-voice stub — VoiceFrame API, mock IO, context classifier

- VoiceFrame dataclass: label, confidence, speaker_id, shift_magnitude, timestamp
- MockVoiceIO: async generator of synthetic frames on a timer (CF_VOICE_MOCK=1)
- ContextClassifier: passthrough stub wrapping VoiceIO; _enrich() hook for real classifiers
- make_io() factory: mock mode auto-detected from env, raises NotImplementedError for real audio
- cf-voice-demo CLI entry point for quick smoke-testing
- 12 tests passing; editable install via pip install -e ../cf-voice
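The pieces listed in this commit could fit together roughly as follows. This is a sketch under stated assumptions, not the repository's code: the `VoiceFrame` field names and the `CF_VOICE_MOCK=1` switch come from the commit message, but the field types and defaults, the `frames()` method, the constructor parameters, and the `collect()` helper are hypothetical.

```python
import asyncio
import os
import random
import time
from dataclasses import dataclass, field
from typing import AsyncIterator, List, Mapping, Optional


@dataclass
class VoiceFrame:
    # Field names are from the commit message; types and defaults are assumed.
    label: str
    confidence: float
    speaker_id: Optional[str] = None
    shift_magnitude: float = 0.0
    timestamp: float = field(default_factory=time.time)


class MockVoiceIO:
    """Emits synthetic frames on a timer (hypothetical interface)."""

    def __init__(self, interval_s: float = 0.01, seed: int = 0) -> None:
        self.interval_s = interval_s
        self._rng = random.Random(seed)

    async def frames(self, count: int) -> AsyncIterator[VoiceFrame]:
        for _ in range(count):
            await asyncio.sleep(self.interval_s)
            yield VoiceFrame(
                label=self._rng.choice(["calm", "tense"]),
                confidence=self._rng.uniform(0.5, 1.0),
                speaker_id="mock-speaker",
                shift_magnitude=self._rng.random(),
            )


def make_io(env: Optional[Mapping[str, str]] = None) -> MockVoiceIO:
    """Return the mock backend when CF_VOICE_MOCK=1, else refuse."""
    env = os.environ if env is None else env
    if env.get("CF_VOICE_MOCK") == "1":
        return MockVoiceIO()
    raise NotImplementedError("real audio capture is not part of this sketch")


async def collect(io: MockVoiceIO, count: int) -> List[VoiceFrame]:
    return [frame async for frame in io.frames(count)]
```

Running `asyncio.run(collect(make_io({"CF_VOICE_MOCK": "1"}), 3))` yields three synthetic frames, while calling `make_io({})` raises `NotImplementedError`, mirroring the behavior the commit describes for real audio.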

2026-04-06 16:03:07 -07:00

3 commits