feat: audio domain tagging for benchmark datasets #25

Open
opened 2026-04-10 21:35:23 -07:00 by pyr0ball · 0 comments
Owner

Context: Audio benchmark datasets mix very different recording conditions: acted studio speech, naturalistic conversation, broadcast panel shows, and call centre audio. Pooling them into a single set hides per-domain failure modes. For example, testing against British comedy panel show audio ("As Yet Untitled") showed SER (speech emotion recognition) models predicting "neutral" across the board on naturalistic speech with non-North-American accents.

Scope:

  • Extend dataset schema with optional audio_domain string field
  • Implement suggested taxonomy: acted_na, acted_eu, naturalistic_en_gb, naturalistic_en_us, broadcast, call_centre, phone_degraded
  • Labeling UI shows domain badge alongside sample and allows editing
  • Export format (JSON + CSV) includes audio_domain field
  • Schema change is backward-compatible (field is optional, existing datasets unaffected)
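A minimal sketch of how the optional field might look, assuming samples are modelled as a Python dataclass. The `Sample` type, its other fields, and `sample_from_dict` are hypothetical names used only for illustration; only `audio_domain` and the taxonomy values come from this issue. Rejecting values outside the suggested taxonomy is also an assumption, since the issue only asks for a string field.

```python
from dataclasses import dataclass
from typing import Optional

# Taxonomy values copied from the scope list in this issue.
AUDIO_DOMAINS = frozenset({
    "acted_na",
    "acted_eu",
    "naturalistic_en_gb",
    "naturalistic_en_us",
    "broadcast",
    "call_centre",
    "phone_degraded",
})


@dataclass
class Sample:
    """Hypothetical sample record; only audio_domain is taken from the issue."""
    sample_id: str
    audio_path: str
    label: str
    audio_domain: Optional[str] = None  # optional, so untagged datasets stay valid

    def __post_init__(self) -> None:
        # Assumption: values outside the suggested taxonomy are rejected.
        if self.audio_domain is not None and self.audio_domain not in AUDIO_DOMAINS:
            raise ValueError(f"unknown audio_domain: {self.audio_domain!r}")


def sample_from_dict(record: dict) -> Sample:
    """Build a Sample from a stored record, tolerating a missing audio_domain key."""
    return Sample(
        sample_id=record["sample_id"],
        audio_path=record["audio_path"],
        label=record["label"],
        audio_domain=record.get("audio_domain"),  # None for legacy records
    )
```

Defaulting the field to `None` is what keeps the change backward-compatible in this sketch: a record written before the schema change loads without the key and carries no domain tag.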

Out of scope: Automatic domain prediction (see separate issue for lightweight domain classifier).

Acceptance criteria:

  • Dataset schema accepts optional audio_domain string field
  • Labeling UI shows domain badge and allows editing
  • Export includes domain tag
  • Existing datasets with no domain tag load and export without errors
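The export criteria could be checked with something like the sketch below, which assumes samples are available as plain dicts. The `EXPORT_FIELDS` column set and both function names are hypothetical; the issue only specifies that JSON and CSV exports include `audio_domain` and that untagged datasets export without errors.

```python
import csv
import io
import json

# Hypothetical export columns; only audio_domain is specified by the issue.
EXPORT_FIELDS = ["sample_id", "label", "audio_domain"]


def export_json(samples: list) -> str:
    """Serialise samples to JSON, emitting null for a missing audio_domain."""
    rows = [{field: sample.get(field) for field in EXPORT_FIELDS} for sample in samples]
    return json.dumps(rows, indent=2)


def export_csv(samples: list) -> str:
    """Serialise samples to CSV, emitting an empty cell for a missing audio_domain."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=EXPORT_FIELDS,
        restval="",             # fills columns absent from a legacy record
        extrasaction="ignore",  # skips keys that are not export columns
        lineterminator="\n",
    )
    writer.writeheader()
    for sample in samples:
        # Write None as an empty cell rather than relying on writer defaults.
        writer.writerow({k: ("" if v is None else v) for k, v in sample.items()})
    return buffer.getvalue()
```

A legacy record with no `audio_domain` key passes through both functions unchanged apart from the added empty column, which is one way to cover the last criterion.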

Related: circuitforge-plans/avocet/ — audio model evaluation extension; see also cf-voice/Linnet SER evaluation work

pyr0ball added the enhancement label 2026-04-10 21:35:23 -07:00
Reference: Circuit-Forge/avocet#25