feat: YAMNet acoustic event classifier — queue/environ/speaker type #2

Closed
opened 2026-04-12 10:56:21 -07:00 by pyr0ball · 0 comments
Owner

Context

See Linnet#5. The YAMNetAcousticBackend in acoustic.py is currently a stub that raises NotImplementedError. This issue covers the real implementation.

Requirements

  • Load google/yamnet via TensorFlow Hub (or PyTorch port)
  • Map YAMNet class outputs to cf-voice event buckets:
    • queue — hold music, elevator music, phone beep, keypad tone
    • environ — indoor room, outdoor, vehicle interior, crowd
    • speaker — speech (single), speech (crowd), silence
  • classify_window(audio_bytes, timestamp) returns AcousticResult(queue, environ, speaker)
  • Graceful degradation: if YAMNet is unavailable (for example, TensorFlow is not installed), fall back to MockAcousticBackend
  • Opt-in via the CF_VOICE_ACOUSTIC=1 env var (off by default until the model is confirmed on Heimdall)
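
A rough sketch of how the bucket mapping and the opt-in gate above might look. It assumes the per-class scores have already been computed by the model, so it does not implement classify_window itself. The contents of _YAMNET_MAP, the event label strings, the 0.3 threshold, the optional-string fields on AcousticResult, and the helper names result_from_scores and use_yamnet are all hypothetical. The YAMNet class names are illustrative and should be checked against the class map that ships with the model.

```python
import importlib.util
import os
from dataclasses import dataclass
from typing import Dict, Mapping, Optional, Tuple

# Hypothetical mapping: YAMNet display name -> (bucket, cf-voice event label).
# Class names are illustrative; verify them against the model's class map.
_YAMNET_MAP: Dict[str, Tuple[str, str]] = {
    "Music": ("queue", "hold_music"),
    "Dial tone": ("queue", "phone_beep"),
    "Telephone dialing, DTMF": ("queue", "keypad_tone"),
    "Inside, small room": ("environ", "indoor_room"),
    "Vehicle": ("environ", "vehicle_interior"),
    "Crowd": ("environ", "crowd"),
    "Speech": ("speaker", "speech_single"),
    "Silence": ("speaker", "silence"),
}


@dataclass
class AcousticResult:
    # Assumed shape: one optional event label per bucket.
    queue: Optional[str] = None
    environ: Optional[str] = None
    speaker: Optional[str] = None


def result_from_scores(
    scores: Mapping[str, float], threshold: float = 0.3
) -> AcousticResult:
    """Pick the highest-scoring mapped event per bucket, if above threshold."""
    best: Dict[str, Tuple[float, str]] = {}
    for name, score in scores.items():
        mapped = _YAMNET_MAP.get(name)
        if mapped is None or score < threshold:
            continue  # unmapped class or too weak to report
        bucket, event = mapped
        if bucket not in best or score > best[bucket][0]:
            best[bucket] = (score, event)
    return AcousticResult(**{b: ev for b, (_, ev) in best.items()})


def use_yamnet(env: Mapping[str, str] = os.environ) -> bool:
    """True only when CF_VOICE_ACOUSTIC=1 is set and TensorFlow is importable."""
    if env.get("CF_VOICE_ACOUSTIC") != "1":
        return False
    return importlib.util.find_spec("tensorflow") is not None
```

A backend factory could then return YAMNetAcousticBackend when use_yamnet() is true and MockAcousticBackend otherwise, which keeps the fallback decision in one place.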

Label expansion for linnet#20

Once base YAMNet works, extend _YAMNET_MAP with:

  • Birdsong, traffic, street crossing, rain, background voices
    These feed the acoustic fingerprinting / privacy risk scorer.

Tracking

Linnet#5 (base), Linnet#20 (privacy extension)

pyr0ball added this to the v0.2.0 — Navigation milestone 2026-04-12 10:56:21 -07:00
pyr0ball added the backlog, inference, acoustic labels 2026-04-12 10:56:21 -07:00
Reference: Circuit-Forge/cf-voice#2