diff --git a/LICENSE b/LICENSE
new file mode 100644
index 0000000..15622c4
--- /dev/null
+++ b/LICENSE
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2024 CircuitForge LLC
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
diff --git a/README.md b/README.md
index cc4595f..940ca38 100644
--- a/README.md
+++ b/README.md
@@ -4,6 +4,42 @@ CircuitForge voice annotation pipeline. Produces `VoiceFrame` objects from a liv
 
 **Status:** Notation v0.1.x — real inference pipeline live (faster-whisper STT, wav2vec2 SER, librosa prosody, pyannote diarization). Mock mode available for dev/CI without GPU or mic.
 
+---
+
+## Prerequisites
+
+### Start here: mock mode (no GPU, no HuggingFace account)
+
+If you are integrating against the cf-voice API or running CI, start with mock mode. No hardware, no model download, no accounts required:
+
+```bash
+pip install -e ../cf-voice
+CF_VOICE_MOCK=1 python -m cf_voice.app --port 8007
+```
+
+Mock mode emits synthetic `VoiceFrame` objects on a timer. All API surface is identical to real inference.
+
+### Moving to real inference: HuggingFace gated models
+
+Speaker diarization uses two gated HuggingFace models. Before your `HF_TOKEN` will authorise a download, you must individually accept the licence terms for each:
+
+1. [pyannote/speaker-diarization-3.1](https://huggingface.co/pyannote/speaker-diarization-3.1) — click "Agree and access repository"
+2. [pyannote/segmentation-3.0](https://huggingface.co/pyannote/segmentation-3.0) — click "Agree and access repository"
+
+This is a one-time step per HuggingFace account. If you skip it, the service will fail at startup with a `401 Unauthorized` error from HuggingFace.
+
+> **Licence note:** The pyannote models are CC BY 4.0. Attribution is required in any distributed product. Set `HF_TOKEN` to a token belonging to an account that has accepted the above terms — using a shared or third-party token that has not individually accepted the terms violates the licence.
+
+### Hardware
+
+| Component | Minimum |
+|---|---|
+| GPU | Any CUDA-capable GPU |
+| VRAM | 4GB+ recommended |
+| Microphone | Required for live capture (not needed for file processing or mock mode) |
+
+---
+
 ## Install
 
 ```bash