diff --git a/.gitignore b/.gitignore index d27ccb5..c30b7d3 100644 --- a/.gitignore +++ b/.gitignore @@ -1,3 +1,6 @@ +# Developer context (BSL 1.1 + docs-location policy) +CLAUDE.md + # Credentials secrets.py config/.env diff --git a/CLAUDE.md b/CLAUDE.md deleted file mode 100644 index 6a34638..0000000 --- a/CLAUDE.md +++ /dev/null @@ -1,165 +0,0 @@ -# Minerva — Developer Context - -**Product code:** `MNRV` -**Status:** Concept / early prototype -**Domain:** Privacy-first, local-only voice assistant hardware platform - ---- - -## What Minerva Is - -A 100% local, FOSS voice assistant hardware platform. No cloud. No subscriptions. No data leaving the local network. - -The goal is a reference hardware + software stack for a privacy-first voice assistant that anyone can build, extend, or self-host — including people without technical backgrounds if the assembly docs are good enough. - -Core design principles (same as all CF products): -- **Local-first inference** — Whisper STT, Piper TTS, Mycroft Precise wake word all run on the host server -- **Edge where possible** — wake word detection moves to edge hardware over time (K210 → ESP32-S3 → custom) -- **No cloud dependency** — Home Assistant optional, not required -- **100% FOSS stack** - ---- - -## Hardware Targets - -### Phase 1 (current): Maix Duino (K210) -- K210 dual-core RISC-V @ 400MHz with KPU neural accelerator -- Audio: I2S microphone + speaker output -- Connectivity: ESP32 WiFi/BLE co-processor -- Programming: MaixPy (MicroPython) -- Status: server-side wake word working; edge inference in progress - -### Phase 2: ESP32-S3 -- More accessible, cheaper, better WiFi -- On-device wake word with Espressif ESP-SR -- See `docs/ESP32_S3_VOICE_ASSISTANT_SPEC.md` - -### Phase 3: Custom hardware -- Dedicated PCB for CF reference platform -- Hardware-accelerated wake word + VAD -- Designed for accessibility: large buttons, LED feedback, easy mounting - ---- - -## Software Stack - -### Edge device (Maix Duino / ESP32-S3) 
-- Firmware: MaixPy or ESP-IDF -- Client: `hardware/maixduino/maix_voice_client.py` -- Audio: I2S capture and playback -- Network: WiFi → Minerva server - -### Server (runs on Heimdall or any Linux box) -- Voice server: `scripts/voice_server.py` (Flask + Whisper + Precise) -- Enhanced version: `scripts/voice_server_enhanced.py` (adds speaker ID via pyannote) -- STT: Whisper (local) -- Wake word: Mycroft Precise -- TTS: Piper -- Home Assistant: REST API integration (optional) -- Conda env: `whisper_cli` (existing on Heimdall) - ---- - -## Directory Structure - -``` -minerva/ -├── docs/ # Architecture, guides, reference docs -│ ├── maix-voice-assistant-architecture.md -│ ├── MYCROFT_PRECISE_GUIDE.md -│ ├── PRECISE_DEPLOYMENT.md -│ ├── ESP32_S3_VOICE_ASSISTANT_SPEC.md -│ ├── HARDWARE_BUYING_GUIDE.md -│ ├── LCD_CAMERA_FEATURES.md -│ ├── K210_PERFORMANCE_VERIFICATION.md -│ ├── WAKE_WORD_ADVANCED.md -│ ├── ADVANCED_WAKE_WORD_TOPICS.md -│ └── QUESTIONS_ANSWERED.md -├── scripts/ # Server-side scripts -│ ├── voice_server.py # Core Flask + Whisper + Precise server -│ ├── voice_server_enhanced.py # + speaker identification (pyannote) -│ ├── setup_voice_assistant.sh # Server setup -│ ├── setup_precise.sh # Mycroft Precise training environment -│ └── download_pretrained_models.sh -├── hardware/ -│ └── maixduino/ # K210 edge device scripts -│ ├── maix_voice_client.py # Production client -│ ├── maix_simple_record_test.py # Audio capture test -│ ├── maix_test_simple.py # Hardware/network test -│ ├── maix_debug_wifi.py # WiFi diagnostics -│ ├── maix_discover_modules.py # Module discovery -│ ├── secrets.py.example # WiFi/server credential template -│ ├── MICROPYTHON_QUIRKS.md -│ └── README.md -├── config/ -│ └── .env.example # Server config template -├── models/ # Wake word models (gitignored, large) -└── CLAUDE.md # This file -``` - ---- - -## Credentials / Secrets - -**Never commit real credentials.** Pattern: - -- Server: copy `config/.env.example` → `config/.env`, fill in real 
values -- Edge device: copy `hardware/maixduino/secrets.py.example` → `secrets.py`, fill in WiFi + server URL - -Both files are gitignored. `.example` files are committed as templates. - ---- - -## Running the Server - -```bash -# Activate environment -conda activate whisper_cli - -# Basic server (Whisper + Precise wake word) -python scripts/voice_server.py \ - --enable-precise \ - --precise-model models/hey-minerva.net \ - --precise-sensitivity 0.5 - -# Enhanced server (+ speaker identification) -python scripts/voice_server_enhanced.py \ - --enable-speaker-id \ - --hf-token $HF_TOKEN - -# Test health -curl http://localhost:5000/health -curl http://localhost:5000/wake-word/status -``` - ---- - -## Connection to CF Voice Infrastructure - -Minerva is the **hardware platform** for cf-voice. As `circuitforge_core.voice` matures: - -- `cf_voice.io` (STT/TTS) → replaces the ad hoc Whisper/Piper calls in `voice_server.py` -- `cf_voice.context` (parallel classifier) → augments Mycroft Precise with tone/environment detection -- `cf_voice.telephony` → future: Minerva as an always-on household linnet node - -Minerva hardware + cf-voice software = the CF reference voice assistant stack. - ---- - -## Roadmap - -See Forgejo milestones on this repo. High-level: - -1. **Alpha — Server-side pipeline** — Whisper + Precise + Piper working end-to-end on Heimdall -2. **Beta — Edge wake word** — wake word on K210 or ESP32-S3; audio only streams post-wake -3. **Hardware v1** — documented reference build; buying guide; assembly instructions -4. **cf-voice integration** — Minerva uses cf_voice modules from circuitforge-core -5. 
**Platform** — multiple hardware targets; custom PCB design - ---- - -## Related - -- `cf-voice` module design: `circuitforge-plans/circuitforge-core/2026-04-06-cf-voice-design.md` -- `linnet` product: real-time tone annotation, will eventually embed Minerva as a hardware node -- Heimdall server: primary dev/deployment target (10.1.10.71 on LAN) diff --git a/LICENSE b/LICENSE new file mode 100644 index 0000000..f182a44 --- /dev/null +++ b/LICENSE @@ -0,0 +1,28 @@ +MIT License + +Copyright (c) 2026 CircuitForge LLC + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + +--- + +Note: As Minerva integrates with BSL 1.1 components from circuitforge-core +(cf_voice modules, tier enforcement), those specific files will carry BSL 1.1 +headers and this LICENSE file will be updated to include the BSL 1.1 text. +The pipeline infrastructure (wake word, STT, TTS, audio I/O) remains MIT. 
diff --git a/THIRD_PARTY_LICENSES.md b/THIRD_PARTY_LICENSES.md new file mode 100644 index 0000000..4458d05 --- /dev/null +++ b/THIRD_PARTY_LICENSES.md @@ -0,0 +1,36 @@ +# Third-Party Licenses + +## Mycroft Precise + +- **License:** Apache 2.0 +- **Source:** https://github.com/MycroftAI/mycroft-precise +- **Usage:** Wake word detection engine. Downloaded at setup time; not bundled in this repo. +- **Note:** If you distribute a packaged Minerva build that includes the `precise-engine` binary, you must include the Apache 2.0 NOTICE file from the Mycroft Precise repository. + +## OpenAI Whisper + +- **License:** MIT +- **Source:** https://github.com/openai/whisper +- **Usage:** Speech-to-text inference. + +## Piper TTS + +- **License:** MIT +- **Source:** https://github.com/rhasspy/piper +- **Usage:** Text-to-speech synthesis. + +## pyannote.audio + +- **License:** MIT (model weights require separate HuggingFace user agreement) +- **Source:** https://github.com/pyannote/pyannote-audio +- **Usage:** Speaker identification and diarization (optional, in `voice_server_enhanced.py`). +- **Note:** To use speaker identification you must accept pyannote's terms on HuggingFace + and provide your own `HF_TOKEN`. Minerva does not distribute pyannote model weights. + +## MaixPy / K210 Firmware + +- **License:** Apache 2.0 (MaixPy); some Kendryte/Sipeed components are proprietary. +- **Source:** https://github.com/sipeed/MaixPy +- **Usage:** MicroPython firmware for Maix Duino (K210) edge devices. +- **Note:** Firmware is NOT included in this repository. 
Obtain directly from Sipeed: + https://dl.sipeed.com/MAIX/MaixPy/release/ diff --git a/hardware/maixduino/maix_voice_client.py b/hardware/maixduino/maix_voice_client.py index 9d9f056..a269757 100755 --- a/hardware/maixduino/maix_voice_client.py +++ b/hardware/maixduino/maix_voice_client.py @@ -34,12 +34,17 @@ import gc # ----- Configuration ----- -# WiFi Settings -WIFI_SSID = "YourSSID" -WIFI_PASSWORD = "YourPassword" +# Load credentials from secrets.py (copy secrets.py.example → secrets.py, gitignored) +try: + from secrets import SECRETS +except ImportError: + SECRETS = {} + +WIFI_SSID = SECRETS.get("wifi_ssid", "YourSSID") +WIFI_PASSWORD = SECRETS.get("wifi_password", "") # Server Settings -VOICE_SERVER_URL = "http://10.1.10.71:5000" +VOICE_SERVER_URL = SECRETS.get("voice_server_url", "http://10.1.10.71:5000") PROCESS_ENDPOINT = "/process" # Audio Settings
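
---

For reference, the `secrets.py` that the updated client code reads (via `SECRETS.get(...)`) would look something like the sketch below. This is illustrative only — the committed template is `hardware/maixduino/secrets.py.example`, and the placeholder values here simply mirror the defaults removed/kept by this diff:

```python
# secrets.py — gitignored; copy from secrets.py.example and fill in real values.
# Keys match what maix_voice_client.py reads with SECRETS.get(...).
SECRETS = {
    "wifi_ssid": "YourSSID",          # WiFi network name
    "wifi_password": "YourPassword",  # WiFi passphrase
    "voice_server_url": "http://10.1.10.71:5000",  # Minerva voice server
}
```

Because the client falls back to `SECRETS = {}` on `ImportError`, a missing or incomplete `secrets.py` degrades to the placeholder defaults rather than crashing at import time on the device.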