feat: lightweight audio domain classifier as a labeling assist task #27


Open

opened 2026-04-10 21:35:40 -07:00 by pyr0ball · 0 comments

pyr0ball commented

2026-04-10 21:35:40 -07:00

Owner

Context: Manually tagging hundreds of audio samples by domain is tedious and inconsistent. A lightweight pre-classifier could suggest the domain tag automatically, leaving humans to confirm or correct. This also has a second use: the same classifier would drive inference-time routing in cf-voice (routing to accent-aware or domain-specific SER models at runtime).

Scope:

- [ ] Add an audio domain classification task to the Avocet labeling pipeline (optional pre-label step, not required)
- [ ] Use a lightweight model (e.g. ECAPA-TDNN or AST fine-tuned on domain categories) to predict `audio_domain` before a sample enters the card-stack UI
- [ ] Predicted domain shown in labeling UI as a pre-filled suggestion: labeler confirms or overrides, never forced
- [ ] Track labeler overrides separately from model predictions for calibration reporting
- [ ] Ties into cf-orch#34 (accent/dialect-aware model routing): domain classifier output should be exportable in a format suitable for inference-time routing config
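
A rough sketch of how the optional pre-label step, the separate override tracking, and the routing export could fit together. Everything named here is hypothetical and not existing Avocet or cf-orch API: `DomainLabel`, `pre_label`, `export_routing_config`, the `audio_domain_routes` key, and the `classifier` callable (a stand-in for whichever fine-tuned model is chosen).

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class DomainLabel:
    """Hypothetical record: prediction and labeler decision live in separate fields."""
    sample_id: str
    predicted_domain: Optional[str] = None  # model suggestion, never overwritten
    confirmed_domain: Optional[str] = None  # labeler's final choice

    @property
    def overridden(self) -> bool:
        # True only when both values exist and the labeler disagreed with the model.
        return (
            self.predicted_domain is not None
            and self.confirmed_domain is not None
            and self.confirmed_domain != self.predicted_domain
        )


def pre_label(
    sample_id: str,
    audio_path: str,
    classifier: Optional[Callable[[str], str]],
    enabled: bool = True,
) -> DomainLabel:
    """Attach a suggestion if the step is enabled; otherwise leave the sample unlabeled."""
    label = DomainLabel(sample_id=sample_id)
    if enabled and classifier is not None:
        label.predicted_domain = classifier(audio_path)
    return label


def export_routing_config(domain_to_model: Dict[str, str]) -> str:
    """Serialize a domain -> model mapping as JSON (assumed format, not a cf-orch spec)."""
    return json.dumps({"audio_domain_routes": domain_to_model}, indent=2, sort_keys=True)
```

Keeping `predicted_domain` immutable after the pre-label step means an override never destroys the original suggestion, which is what the calibration report needs later.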

Out of scope: Training the domain classifier itself from scratch (start with a fine-tuned off-the-shelf model). Integration with cf-voice runtime routing (tracked in cf-orch#34).

Acceptance criteria:

- [ ] Domain classifier runs as an optional pre-label step (can be disabled per dataset)
- [ ] Predicted domain shown in labeling UI as a suggestion, not a forced value
- [ ] Labeler override is tracked separately from model prediction for calibration
- [ ] Calibration report shows the per-domain agreement rate between model predictions and labeler decisions
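
For the calibration report, one possible reading of "agreement rate per domain" is sketched below. It assumes each labeled sample yields a `(predicted_domain, confirmed_domain)` pair and groups by the predicted domain; grouping by the labeler's domain instead would be an equally valid choice that the issue does not settle. The function name and output shape are illustrative only.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple

Pair = Tuple[Optional[str], Optional[str]]  # (predicted_domain, confirmed_domain)


def agreement_by_domain(pairs: Iterable[Pair]) -> Dict[str, dict]:
    """Per predicted domain: sample count, agreements, and agreement rate."""
    counts = defaultdict(lambda: {"total": 0, "agreed": 0})
    for predicted, confirmed in pairs:
        # Skip samples with no suggestion (step disabled) or no labeler decision yet.
        if predicted is None or confirmed is None:
            continue
        row = counts[predicted]
        row["total"] += 1
        row["agreed"] += int(predicted == confirmed)
    return {
        domain: {**c, "agreement_rate": c["agreed"] / c["total"]}
        for domain, c in sorted(counts.items())
    }
```

Samples from datasets where the pre-label step is disabled drop out of the report automatically, since they carry no prediction.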

Related: Depends on the audio domain tagging issue. Cross-references cf-orch#34 (accent/dialect-aware model routing). See also `circuitforge-plans/avocet/` (audio model evaluation extension).
