feat: acoustic environment fingerprinting + privacy risk scoring #20
Reference: Circuit-Forge/linnet#20
Extends `cf_voice.acoustic` and `cf_voice.events` to classify the acoustic scene around the primary speaker, enabling privacy-aware session behaviour in Linnet and downstream products (Osprey AMD, Egret DSAR recording detection).

Why this matters
The acoustic environment is identifiable. A specific birdsong species combined with a regional accent and a traffic pattern can narrow a location to a neighbourhood. The fingerprinting system must score privacy risk locally before deciding what to log, transmit, or surface, including anything sent to cloud inference. Local-first is load-bearing here, not aspirational.

Scope: four new signal types
All flow through the existing `AcousticResult` / `AudioEvent` pipeline in `cf_voice.acoustic`.

1. Scene classification: `event_type: "scene"`
Broad acoustic scene category. Primary input to privacy risk scoring.
Proposed `SCENE_LABELS`: `indoor_quiet`, `indoor_crowd`, `outdoor_urban`, `outdoor_nature`, `vehicle`, `public_transit`

Backend: AST/YAMNet acoustic scene model (AudioSet "acoustic scene" subset). New `SceneBackend` protocol in `acoustic.py` alongside `AcousticBackend`.

2. Extended environ labels: `event_type: "environ"` expansion
Expand the current telephony-only set to cover general-purpose acoustic events:
- `birdsong`, `wind`, `rain`, `water`
- `traffic`, `crowd_chatter`, `street_crossing_signal`, `construction`
- `hvac`, `keyboard_typing`, `restaurant_ambience`

Backend: expand `_YAMNET_MAP` / `_AST_MAP` in `acoustic.py`. Builds on #5 (YAMNet).

3. Accent / language identification: `event_type: "accent"`
Regional accent of primary speaker. Accent alone is not high-risk, but combined with specific birdsong or a quiet rural background it becomes location-identifying; the privacy scorer accounts for this compound signal.
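The compound signal described above can be illustrated with a minimal sketch: accent is treated as location-identifying only when it co-occurs with a distinctive background. Everything named here is an assumption for illustration and not part of the proposal: the `AccentHint` container, the function `is_location_identifying`, the 0.6 confidence threshold, and the choice of `birdsong` and `outdoor_nature` as the distinctive cues.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class AccentHint:
    """Hypothetical container for an accent result (language, region, confidence)."""
    language: str
    region: str
    confidence: float


# Assumption: background cues that turn a regional accent into a location hint.
DISTINCTIVE_ENVIRON = frozenset({"birdsong"})
DISTINCTIVE_SCENES = frozenset({"outdoor_nature"})


def is_location_identifying(
    accent: Optional[AccentHint],
    scene: str,
    environ: Iterable[str],
    min_confidence: float = 0.6,  # illustrative threshold, not from the issue
) -> bool:
    """Return True only when a confident regional accent co-occurs with a
    distinctive background; accent on its own never triggers."""
    if accent is None or not accent.region or accent.confidence < min_confidence:
        return False
    has_distinctive_environ = bool(DISTINCTIVE_ENVIRON & set(environ))
    return scene in DISTINCTIVE_SCENES or has_distinctive_environ
```

Keeping the check as a pure function of already-classified labels means it can run per audio window on the local device without touching raw audio again.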
Fields: `language: str`, `region: str`, `confidence: float`

Backend: `facebook/mms-lid-126` for language, wav2vec2 accent fine-tune for region. New `cf_voice/accent.py`. Lazy-loaded, gated by `CF_VOICE_ACCENT=1` (off by default because of GPU cost and privacy sensitivity).

4. Background speaker presence: `SPEAKER_LABELS` expansion
Add `background_voices` label: detectable via VAD + speaker count from pyannote or silero-vad. Distinct from primary speaker classification (#8).

Privacy risk scoring (`cf_voice/privacy.py`)
A `privacy_risk` value (low/moderate/high) derived locally per audio window from the combined fingerprint. Never sent to cloud. Never logged server-side.

Example combined fingerprints:
- `outdoor_urban` + `crowd_chatter` + `traffic`
- `indoor_quiet` + `background_voices`
- `outdoor_nature` + `birdsong` + regional accent
- `indoor_quiet` + no background voices

Risk gates (Linnet)
- `high`: warn user before sending audio chunk to cloud STT/inference; offer local-only fallback
- `moderate`: attach `privacy_flags` to session state, no blocking action by default
- `low`: proceed normally; no annotation

Implementation order
1. `ENVIRON_LABELS` + label maps (builds on #5)
2. `SceneBackend` protocol + `MockSceneBackend` + `ASTSceneBackend` stub in `acoustic.py`
3. `cf_voice/accent.py`: `AccentClassifier` with lazy-load + `CF_VOICE_ACCENT` gate
4. `cf_voice/privacy.py`: `score_privacy_risk(scene, environ, speaker, accent) -> PrivacyRisk`
5. `privacy_risk` in `GET /session/{id}`; add `scene-event` + `accent-event` SSE types

cf-voice changes required
- `cf_voice/events.py`: `SCENE_LABELS`, `ACCENT_LABELS`, expanded `ENVIRON_LABELS`
- `cf_voice/acoustic.py`: `SceneBackend` protocol, mock, AST stub
- `cf_voice/accent.py`: `AccentClassifier` (new module)
- `cf_voice/privacy.py`: `PrivacyRisk` dataclass + scoring function
- `cf_voice/context.py`: wire scene + accent classifiers into `_classify_real_async`

Linnet changes
- `app/api/sessions.py`: `privacy_risk` field in `GET /session/{id}` response
- `app/api/events.py`: `scene-event`, `accent-event` SSE event types
- `app/models/`: `SceneEvent`, `AccentEvent` models
- `useToneStream` listener for new event types; NowPanel subtle indicator
- `compose.cloud.yml`: `CF_VOICE_ACCENT=0` default (opt-in, expensive)

Non-goals
- Sending `privacy_risk=high` audio chunks to cloud STT without explicit user consent
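
The three gates listed under "Risk gates (Linnet)", together with the consent non-goal above, could be dispatched roughly as follows. This is a sketch under stated assumptions, not the Linnet implementation: `RiskLevel`, `GateAction`, `SessionState` and its fields, and `gate_chunk` are hypothetical names, and the issue's own `PrivacyRisk` dataclass may be shaped differently.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class RiskLevel(str, Enum):
    """Hypothetical enum for the low/moderate/high privacy_risk values."""
    LOW = "low"
    MODERATE = "moderate"
    HIGH = "high"


class GateAction(str, Enum):
    SEND_TO_CLOUD = "send_to_cloud"
    ASK_USER = "ask_user"      # warn the user and offer the local-only fallback
    LOCAL_ONLY = "local_only"


@dataclass
class SessionState:
    """Hypothetical stand-in for Linnet session state."""
    cloud_consent_for_high_risk: bool = False
    local_only: bool = False
    privacy_flags: List[str] = field(default_factory=list)


def gate_chunk(risk: RiskLevel, session: SessionState, flags: List[str]) -> GateAction:
    """Decide what to do with one audio chunk before any cloud call is made."""
    if session.local_only:
        return GateAction.LOCAL_ONLY
    if risk is RiskLevel.HIGH:
        # High risk never reaches cloud STT without explicit consent.
        if session.cloud_consent_for_high_risk:
            return GateAction.SEND_TO_CLOUD
        return GateAction.ASK_USER
    if risk is RiskLevel.MODERATE:
        # Annotate session state only; no blocking action by default.
        for flag in flags:
            if flag not in session.privacy_flags:
                session.privacy_flags.append(flag)
    # Moderate (after annotation) and low both proceed normally.
    return GateAction.SEND_TO_CLOUD
```

A caller would run this per audio window, after local scoring and before any network request, so that a `high` window is held back until the user has answered the prompt.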