watch: SoulX-Transcriber — Chinese diarization leader, not practical yet for EN/low-resource #7
Reference: Circuit-Forge/cf-voice#7
Source: https://huggingface.co/Soul-AILab/SoulX-Transcriber
What it is
SoulX-Transcriber is a unified end-to-end diarization + transcription model based on Qwen3-Omni-30B-A3B (MoE, Apache 2.0). It handles speaker attribution and timestamped segmentation in a single pass.
Why not yet
What's genuinely good
A diarization error rate (DER) of 2.89% on AISHELL-4 (Chinese meeting transcription) is state-of-the-art. If CF ever targets Chinese-language institutional or enterprise users (education, corporate meetings), this is worth revisiting on high-VRAM hardware (A100/H100 class).
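For reference when comparing DER figures across backends, here is a minimal sketch of how the metric is conventionally computed: missed speech, false-alarm speech, and speaker-confusion time, summed and divided by total reference speech time. The function name and the durations are hypothetical, chosen only for illustration; they are not taken from the SoulX-Transcriber evaluation.

```python
def diarization_error_rate(
    missed: float,
    false_alarm: float,
    confusion: float,
    total_speech: float,
) -> float:
    """Return DER as a fraction; all arguments are durations in seconds."""
    if total_speech <= 0:
        raise ValueError("total_speech must be positive")
    return (missed + false_alarm + confusion) / total_speech


# Illustrative numbers only (not measured results).
der = diarization_error_rate(
    missed=20.0, false_alarm=15.0, confusion=15.0, total_speech=1000.0
)
print(f"{der:.2%}")  # prints 5.00%
```

Published DER numbers can differ in scoring details such as the forgiveness collar and whether overlapped speech is scored, so figures are only directly comparable under the same protocol.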
cf-voice backend comparison
Watch trigger
Re-evaluate if: (a) a distilled/quantized version releases under 8B active params, or (b) CF adds a Chinese-language product tier.