minerva/docs/PRECISE_DEPLOYMENT.md
pyr0ball 173f7f37d4 feat: import mycroft-precise work as Minerva foundation
Ports prior voice assistant research and prototypes from devl/Devops
into the Minerva repo. Includes:

- docs/: architecture, wake word guides, ESP32-S3 spec, hardware buying guide
- scripts/: voice_server.py, voice_server_enhanced.py, setup scripts
- hardware/maixduino/: edge device scripts with WiFi credentials scrubbed
  (replaced hardcoded password with secrets.py pattern)
- config/.env.example: server config template
- .gitignore: excludes .env, secrets.py, model blobs, ELF firmware
- CLAUDE.md: Minerva product context and connection to cf-voice roadmap
2026-04-06 22:21:12 -07:00

Mycroft Precise Deployment Guide

Quick Reference: Server vs Edge Detection

Server-Side Detection

Setup:

# 1. On Heimdall: Setup Precise
./setup_precise.sh --wake-word "hey computer"

# 2. Train your model (follow scripts in ~/precise-models/hey-computer/)
cd ~/precise-models/hey-computer
./1-record-wake-word.sh
./2-record-not-wake-word.sh
# Organize samples, then:
./3-train-model.sh
./4-test-model.sh

# 3. Start voice server with Precise
cd ~/voice-assistant
conda activate precise
python voice_server.py \
    --enable-precise \
    --precise-model ~/precise-models/hey-computer/hey-computer.net \
    --precise-sensitivity 0.5

Architecture:

  • Maix Duino → Continuous audio stream → Heimdall
  • Heimdall runs Precise on audio stream
  • On wake word: Process command with Whisper
  • Response → TTS → Stream back to Maix Duino

Pros: Easier setup, better accuracy, simple updates

Cons: More network traffic, requires stable connection
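
The server-side flow above can be read as a two-state loop: listen for the wake word, then buffer the command and hand it to transcription. The sketch below illustrates that idea only. `StreamSession`, the injected `is_wake_word` and `transcribe` callables, and the fixed `command_chunks` count are hypothetical and not taken from voice_server.py. A real server would more likely end the command on silence than on a chunk count.

```python
from enum import Enum
from typing import Callable, List, Optional


class State(Enum):
    LISTENING = "listening"   # waiting for the wake word
    RECORDING = "recording"   # buffering the spoken command


class StreamSession:
    """Tracks one device's audio stream (hypothetical helper)."""

    def __init__(self, is_wake_word: Callable[[bytes], bool],
                 transcribe: Callable[[bytes], str],
                 command_chunks: int = 3):
        self.is_wake_word = is_wake_word      # stands in for Precise
        self.transcribe = transcribe          # stands in for Whisper
        self.command_chunks = command_chunks  # chunks to buffer after the wake word
        self.state = State.LISTENING
        self.buffer: List[bytes] = []

    def feed(self, chunk: bytes) -> Optional[str]:
        """Consume one chunk; return the transcript once a command is complete."""
        if self.state is State.LISTENING:
            if self.is_wake_word(chunk):
                self.state = State.RECORDING
                self.buffer = []
            return None
        self.buffer.append(chunk)
        if len(self.buffer) < self.command_chunks:
            return None
        audio = b"".join(self.buffer)
        self.state = State.LISTENING          # go back to waiting for the wake word
        self.buffer = []
        return self.transcribe(audio)
```

Passing the detector and transcriber in as callables keeps the loop testable without a microphone or a loaded model.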

Edge Detection (Advanced - Future Phase)

Setup:

# 1. Train model on Heimdall (same as above)
# 2. Convert to KMODEL for K210
# 3. Deploy to Maix Duino
# (See MYCROFT_PRECISE_GUIDE.md for detailed conversion steps)

Architecture:

  • Maix Duino runs Precise locally on K210
  • Only sends audio after wake word detected
  • Lower latency, less network traffic

Pros: Lower latency, less bandwidth, works offline

Cons: Complex conversion, lower accuracy, harder updates

Phase-by-Phase Deployment

Phase 1: Server Setup (Day 1)

# On Heimdall
ssh alan@10.1.10.71

# 1. Setup voice assistant base
./setup_voice_assistant.sh

# 2. Setup Mycroft Precise
./setup_precise.sh --wake-word "hey computer"

# 3. Configure environment
vim ~/voice-assistant/config/.env

Update .env:

HA_URL=http://your-home-assistant:8123
HA_TOKEN=your_token_here
PRECISE_MODEL=/home/alan/precise-models/hey-computer/hey-computer.net
PRECISE_SENSITIVITY=0.5
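
Before starting the server, it can help to confirm that the .env file has the four keys shown above and that the placeholder token has been replaced. The following is a minimal sketch. `parse_env` and `check_env` are hypothetical helpers, and the parser handles only plain KEY=VALUE lines, not the full dotenv syntax.

```python
from typing import Dict, List

REQUIRED_KEYS = ("HA_URL", "HA_TOKEN", "PRECISE_MODEL", "PRECISE_SENSITIVITY")


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing " # comment", then whitespace and surrounding quotes.
        values[key.strip()] = value.split(" #", 1)[0].strip().strip('"')
    return values


def check_env(values: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the config looks sane."""
    problems = [f"missing {key}" for key in REQUIRED_KEYS if not values.get(key)]
    if values.get("HA_TOKEN") == "your_token_here":
        problems.append("HA_TOKEN is still the placeholder")
    raw = values.get("PRECISE_SENSITIVITY")
    if raw:
        try:
            if not 0.0 <= float(raw) <= 1.0:
                problems.append("PRECISE_SENSITIVITY must be between 0 and 1")
        except ValueError:
            problems.append("PRECISE_SENSITIVITY is not a number")
    return problems
```

Running `check_env(parse_env(open(path).read()))` against config/.env returns an empty list when nothing obvious is wrong.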

Phase 2: Wake Word Training (Day 1-2)

# Navigate to training directory
cd ~/precise-models/hey-computer
conda activate precise

# Record samples (30-60 minutes)
./1-record-wake-word.sh    # Record 50-100 wake word samples
./2-record-not-wake-word.sh # Record 200-500 negative samples

# Organize samples
# Move 80% of wake-word recordings to wake-word/
# Move 20% of wake-word recordings to test/wake-word/
# Move 80% of not-wake-word to not-wake-word/
# Move 20% of not-wake-word to test/not-wake-word/

# Train model (30-60 minutes)
./3-train-model.sh

# Test model
./4-test-model.sh

# Evaluate on test set
./5-evaluate-model.sh

# Tune threshold
./6-tune-threshold.sh
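
The 80/20 organization step is easy to get wrong by hand, so a small script may be worth it. This sketch assumes the raw recordings sit in a separate staging folder (the `recordings/` name in the usage note is hypothetical) and moves them into the training and test folders named above. `split_samples` is an illustrative helper, not one of the generated scripts.

```python
import random
import shutil
from pathlib import Path
from typing import Tuple


def split_samples(source: Path, train_dir: Path, test_dir: Path,
                  train_fraction: float = 0.8, seed: int = 0) -> Tuple[int, int]:
    """Move .wav files from source into train_dir/test_dir; return (train, test) counts."""
    files = sorted(source.glob("*.wav"))
    random.Random(seed).shuffle(files)        # fixed seed keeps the split reproducible
    cut = round(len(files) * train_fraction)  # number of files that go to training
    train_dir.mkdir(parents=True, exist_ok=True)
    test_dir.mkdir(parents=True, exist_ok=True)
    for index, path in enumerate(files):
        target = train_dir if index < cut else test_dir
        shutil.move(str(path), str(target / path.name))
    return cut, len(files) - cut
```

For example, `split_samples(Path("recordings"), Path("wake-word"), Path("test/wake-word"))` handles the positive samples. A second call with `not-wake-word` and `test/not-wake-word` handles the negatives.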

Phase 3: Server Integration (Day 2)

Option A: Manual Testing

cd ~/voice-assistant
conda activate precise

# Start server with Precise enabled
python voice_server.py \
    --enable-precise \
    --precise-model ~/precise-models/hey-computer/hey-computer.net \
    --precise-sensitivity 0.5 \
    --ha-url http://your-ha:8123 \
    --ha-token your_token

Option B: Systemd Service

Update systemd service to use Precise environment:

sudo vim /etc/systemd/system/voice-assistant.service

[Unit]
Description=Voice Assistant with Wake Word Detection
After=network.target

[Service]
Type=simple
User=alan
WorkingDirectory=/home/alan/voice-assistant
Environment="PATH=/home/alan/miniconda3/envs/precise/bin:/usr/local/bin:/usr/bin:/bin"
EnvironmentFile=/home/alan/voice-assistant/config/.env
ExecStart=/home/alan/miniconda3/envs/precise/bin/python voice_server.py \
    --enable-precise \
    --precise-model /home/alan/precise-models/hey-computer/hey-computer.net \
    --precise-sensitivity 0.5
Restart=on-failure
RestartSec=10
StandardOutput=append:/home/alan/voice-assistant/logs/voice_assistant.log
StandardError=append:/home/alan/voice-assistant/logs/voice_assistant_error.log

[Install]
WantedBy=multi-user.target

Enable and start:

sudo systemctl daemon-reload
sudo systemctl enable voice-assistant
sudo systemctl start voice-assistant
sudo systemctl status voice-assistant

Phase 4: Maix Duino Setup (Day 2-3)

For server-side wake word detection, Maix Duino streams audio:

Update maix_voice_client.py:

# Use simplified mode - just stream audio
# Server handles wake word detection
CONTINUOUS_STREAM = True  # Enable continuous streaming
WAKE_WORD_CHECK_INTERVAL = 0  # Server-side detection

Flash and test:

  1. Copy updated script to SD card
  2. Boot Maix Duino
  3. Check serial console for connection
  4. Speak wake word
  5. Verify server logs show detection

Phase 5: Testing & Tuning (Day 3-7)

Test Wake Word Detection

# Monitor server logs
journalctl -u voice-assistant -f

# Or check detections via API
curl http://10.1.10.71:5000/wake-word/detections

Test End-to-End Flow

  1. Say wake word: "Hey Computer"
  2. Wait for LED/beep on Maix Duino
  3. Say command: "Turn on the living room lights"
  4. Verify HA command executes
  5. Hear TTS response

Monitor Performance

# Check wake word log
tail -f ~/voice-assistant/logs/wake_words.log

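# Count logged detections; compare against how many times you actually
# said the wake word to estimate the false positive rate
grep "wake_word" ~/voice-assistant/logs/wake_words.log | wc -l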

# Check accuracy
# Should see detections when you say wake word
# Should NOT see detections during normal conversation
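
A raw detection count only becomes a false positive rate once it is compared with what was actually spoken. One way to do that is to note the times you said the wake word during a test session and match them against the logged detection times. The sketch below assumes both are available as timestamps in seconds. `score_detections` and the 2-second tolerance are illustrative choices, and extracting timestamps from wake_words.log is left out because the log format is not specified here.

```python
from typing import List, Tuple


def score_detections(detections: List[float], spoken: List[float],
                     tolerance: float = 2.0) -> Tuple[int, int, int]:
    """Return (true_positives, false_positives, misses).

    detections: times (seconds) at which the server logged a wake word.
    spoken: times at which the wake word was actually said.
    A detection is correct if it falls within `tolerance` seconds after
    an utterance that has not already been matched.
    """
    unmatched = sorted(spoken)
    true_pos = 0
    false_pos = 0
    for t in sorted(detections):
        match = next((s for s in unmatched if 0 <= t - s <= tolerance), None)
        if match is None:
            false_pos += 1           # nothing was said shortly before this detection
        else:
            unmatched.remove(match)  # each utterance can be matched only once
            true_pos += 1
    return true_pos, false_pos, len(unmatched)
```

With detections at 10.5, 30.2 and 55.0 and utterances at 10.0, 30.0 and 70.0, the result is two correct detections, one false positive and one miss.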

Tune Sensitivity

If too many false positives:

# Increase threshold (more conservative)
# Edit systemd service or restart with:
python voice_server.py --precise-sensitivity 0.7

If missing wake words:

# Decrease threshold (more aggressive)
python voice_server.py --precise-sensitivity 0.3
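
The trade-off between the two adjustments can be made concrete by sweeping the value over a set of scored test clips. This sketch follows the guide's convention that a higher value is more conservative, so a clip triggers when its score is at or above the threshold. It is worth checking how voice_server.py maps `--precise-sensitivity` onto Precise, since conventions can differ. The scores in the usage note are made-up illustration data, not measurements.

```python
from typing import List, Tuple


def rates_at(threshold: float,
             scored: List[Tuple[float, bool]]) -> Tuple[float, float]:
    """Return (false_positive_rate, miss_rate) at a given threshold.

    scored: (model_score, is_wake_word) pairs. A clip triggers when
    model_score >= threshold. Needs at least one clip of each kind.
    """
    positives = [score for score, is_wake in scored if is_wake]
    negatives = [score for score, is_wake in scored if not is_wake]
    false_pos = sum(score >= threshold for score in negatives) / len(negatives)
    missed = sum(score < threshold for score in positives) / len(positives)
    return false_pos, missed
```

With illustrative scores of 0.9, 0.6 and 0.4 for wake-word clips and 0.65, 0.35, 0.1 and 0.05 for negatives, a threshold of 0.3 gives a 50% false positive rate and no misses. At 0.7 there are no false positives, but two of the three wake words are missed.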

Collect Hard Examples

# When you notice false positives, record them
cd ~/precise-models/hey-computer
precise-collect -f not-wake-word/false-positive-$(date +%s).wav

# When wake word is missed, record it
precise-collect -f wake-word/missed-$(date +%s).wav

# After collecting 10-20 examples, retrain
./3-train-model.sh

Monitoring Commands

Check System Status

# Service status
sudo systemctl status voice-assistant

# Server health
curl http://10.1.10.71:5000/health

# Wake word status
curl http://10.1.10.71:5000/wake-word/status

# Recent detections
curl http://10.1.10.71:5000/wake-word/detections
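
The same checks can be scripted for cron or a monitoring job. The sketch below uses only the standard library and assumes the status endpoint returns a JSON object containing an "enabled" field, as the troubleshooting section suggests. Any other fields in the response are unknown, so the code ignores them. `fetch` and `wake_word_enabled` are hypothetical helper names.

```python
import json
import urllib.request

BASE_URL = "http://10.1.10.71:5000"  # server address used throughout this guide


def fetch(path: str, timeout: float = 5.0) -> str:
    """GET an endpoint on the voice server and return the body as text."""
    with urllib.request.urlopen(BASE_URL + path, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def wake_word_enabled(body: str) -> bool:
    """True only if the body is a JSON object with "enabled": true."""
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        return False
    return isinstance(data, dict) and data.get("enabled") is True
```

Calling `wake_word_enabled(fetch("/wake-word/status"))` from a scheduled job gives a simple yes/no signal that can feed an alert.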

View Logs

# Real-time server logs
journalctl -u voice-assistant -f

# Last 50 lines
journalctl -u voice-assistant -n 50

# Specific log file
tail -f ~/voice-assistant/logs/voice_assistant.log

# Wake word detections
tail -f ~/voice-assistant/logs/wake_words.log

# Maix Duino serial console
screen /dev/ttyUSB0 115200

Performance Metrics

# CPU usage (expect roughly 5-10% when idle, with spikes during processing)
top -p $(pgrep -f voice_server.py)

# Memory usage (the [v] keeps grep from matching its own process)
ps aux | grep '[v]oice_server.py'

# Network traffic (if streaming audio)
iftop -i eth0  # or your network interface

Troubleshooting

Wake Word Not Detecting

Check model is loaded:

curl http://10.1.10.71:5000/wake-word/status
# Should show: "enabled": true

Test model directly:

conda activate precise
precise-listen ~/precise-models/hey-computer/hey-computer.net
# Speak wake word - should see "!"

Check sensitivity:

# Try lower threshold
precise-listen ~/precise-models/hey-computer/hey-computer.net -t 0.3

Verify audio input:

# Test microphone (16 kHz, 16-bit, mono; arecord otherwise defaults to 8-bit, 8 kHz)
arecord -f S16_LE -r 16000 -c 1 -d 5 test.wav
aplay test.wav

Too Many False Positives

Increase threshold:

# Edit the service file or restart with a higher --precise-sensitivity value
python voice_server.py --precise-sensitivity 0.7

Retrain with false positives:

cd ~/precise-models/hey-computer
# Record false triggers in not-wake-word/
precise-collect -f not-wake-word/false-triggers.wav
# Add to not-wake-word training set
./3-train-model.sh

Server Won't Start with Precise

Check Precise installation:

conda activate precise
python -c "from precise_runner import PreciseRunner; print('OK')"

Check engine:

precise-engine --version
# Should show: Precise v0.3.0

Check model file:

ls -lh ~/precise-models/hey-computer/hey-computer.net
file ~/precise-models/hey-computer/hey-computer.net

Check permissions:

chmod +x /usr/local/bin/precise-engine
chmod 644 ~/precise-models/hey-computer/hey-computer.net

Audio Quality Issues

Test audio path:

# Record test on server
arecord -f S16_LE -r 16000 -c 1 -d 5 test.wav

# Transcribe with Whisper
conda activate voice-assistant
python -c "
import whisper
model = whisper.load_model('base')
result = model.transcribe('test.wav')
print(result['text'])
"

If poor quality:

  • Check microphone connection
  • Verify sample rate (16kHz)
  • Test with USB microphone
  • Check for interference/noise
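
The sample-rate check in the list above can be automated by reading the WAV header. This is a small sketch using Python's standard `wave` module. The expected format (16 kHz, mono, 16-bit) matches the arecord command used in this section. `wav_problems` is a hypothetical helper name.

```python
import wave
from typing import List


def wav_problems(path: str, rate: int = 16000, channels: int = 1,
                 sample_width: int = 2) -> List[str]:
    """Compare a WAV header with the expected format (16 kHz, mono, 16-bit by default)."""
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != rate:
            problems.append(f"sample rate is {wav.getframerate()} Hz, expected {rate}")
        if wav.getnchannels() != channels:
            problems.append(f"{wav.getnchannels()} channels, expected {channels}")
        if wav.getsampwidth() != sample_width:
            problems.append(f"{wav.getsampwidth() * 8}-bit samples, expected {sample_width * 8}-bit")
    return problems
```

An empty list from `wav_problems("test.wav")` means the recording matches the expected format, so any remaining quality issue lies elsewhere in the audio path.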

Maix Duino Connection Issues

Check WiFi:

# In Maix Duino serial console
import network
wlan = network.WLAN(network.STA_IF)
print(wlan.isconnected())
print(wlan.ifconfig())

Check server reachability:

# From Maix Duino
import urequests
response = urequests.get('http://10.1.10.71:5000/health')
print(response.json())

Check audio streaming:

# On Heimdall, monitor network
sudo tcpdump -i any -n host <maix-duino-ip>
# Should see continuous packets when streaming
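
To judge whether the observed traffic is plausible, it helps to know the expected data rate. The calculation below assumes the device streams uncompressed PCM in the format used elsewhere in this guide (16 kHz, 16-bit, mono). The client's actual wire format may differ, and protocol overhead is not included.

```python
from typing import Tuple


def stream_bandwidth(sample_rate: int = 16000, bytes_per_sample: int = 2,
                     channels: int = 1) -> Tuple[int, float]:
    """Return (bytes per second, kilobits per second) for raw PCM audio."""
    bytes_per_second = sample_rate * bytes_per_sample * channels
    return bytes_per_second, bytes_per_second * 8 / 1000


# 16000 samples/s * 2 bytes * 1 channel = 32000 bytes/s = 256 kbit/s
rate_bytes, rate_kbit = stream_bandwidth()
print(f"{rate_bytes} B/s, {rate_kbit} kbit/s, {rate_bytes * 3600 / 1e6} MB per hour")
```

Under these assumptions a continuous stream carries about 32 KB/s, or roughly 115 MB per hour per device.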

Optimization Tips

Reduce Latency

  1. Use smaller Whisper model:

    # Edit .env
    WHISPER_MODEL=base  # or tiny
    
  2. Optimize Precise sensitivity:

    # Find sweet spot between false positives and latency
    # Lower threshold = faster trigger but more false positives
    
  3. Pre-load models:

    # Models load on startup, not first request
    # Adds ~30s startup time but eliminates first-request delay
    

Improve Accuracy

  1. Use larger Whisper model:

    WHISPER_MODEL=large
    
  2. Train more wake word samples:

    # Aim for 100+ high-quality samples
    # Diverse speakers, conditions, distances
    
  3. Increase training epochs:

    # In 3-train-model.sh
    precise-train -e 120 hey-computer.net .  # vs default 60
    

Reduce False Positives

  1. Collect hard negatives:

    # Record TV, music, similar phrases
    # Add to not-wake-word training set
    
  2. Increase threshold:

    --precise-sensitivity 0.7  # vs default 0.5
    
  3. Use ensemble model:

    # Run multiple models, require agreement
    # Advanced - requires code modification
    

Production Checklist

  • Wake word model trained with 50+ samples
  • Model tested with <5% false positive rate
  • Server service enabled and auto-starting
  • Home Assistant token configured
  • Maix Duino WiFi configured
  • End-to-end test successful
  • Logs rotating properly
  • Monitoring in place
  • Backup of trained model
  • Documentation updated

Backup and Recovery

Backup Trained Model

# Backup model
cp ~/precise-models/hey-computer/hey-computer.net \
   ~/precise-models/hey-computer/hey-computer.net.backup

# Backup to another host
scp ~/precise-models/hey-computer/hey-computer.net \
    user@backup-host:/path/to/backups/
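
A single .backup file is overwritten on every backup, which loses the last known-good model if a retrain goes badly. One alternative is a timestamped copy that is verified after writing. This is a sketch using only the standard library. `backup_model` and the naming scheme are illustrative, not part of the setup scripts.

```python
import hashlib
import shutil
import time
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hex SHA-256 digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def backup_model(model: Path, backup_dir: Path) -> Path:
    """Copy the model under a timestamped name and verify the copy."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = backup_dir / f"{model.name}.{stamp}.backup"
    shutil.copy2(model, target)               # copy2 also preserves timestamps
    if sha256_of(target) != sha256_of(model):
        raise IOError(f"backup {target} does not match {model}")
    return target
```

Running this before each `./3-train-model.sh` keeps one restorable copy per training run.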

Restore from Backup

# Restore model
cp ~/precise-models/hey-computer/hey-computer.net.backup \
   ~/precise-models/hey-computer/hey-computer.net

# Restart service
sudo systemctl restart voice-assistant

Next Steps

Once basic server-side detection is working:

  1. Add more intents - Expand Home Assistant control
  2. Implement TTS playback - Complete the audio response loop
  3. Multi-room support - Deploy multiple Maix Duino units
  4. Voice profiles - Train model on family members
  5. Edge deployment - Convert model for K210 (advanced)

Support

Log an Issue

# Collect debug info
echo "=== System Info ===" > debug.log
uname -a >> debug.log
conda list >> debug.log
echo "=== Service Status ===" >> debug.log
systemctl status voice-assistant >> debug.log
echo "=== Recent Logs ===" >> debug.log
journalctl -u voice-assistant -n 100 >> debug.log
echo "=== Wake Word Status ===" >> debug.log
curl http://10.1.10.71:5000/wake-word/status >> debug.log

Then share debug.log when asking for help.

Common Issues Database

| Symptom | Likely Cause | Solution |
|---|---|---|
| No wake detection | Model not loaded | Check /wake-word/status |
| Service won't start | Missing dependencies | Reinstall Precise |
| High false positives | Low threshold | Increase to 0.7+ |
| Missing wake words | High threshold | Decrease to 0.3-0.4 |
| Poor transcription | Bad audio quality | Check microphone |
| HA commands fail | Wrong token | Update .env |
| High CPU usage | Large Whisper model | Use smaller model |

Conclusion

With Mycroft Precise, you have complete control over your wake word detection. Start with server-side detection for easier debugging, collect good training data, and tune the threshold for your environment. Once it works well, you can optionally move to edge detection for lower latency.

The key to success: Quality training data > Quantity

Happy voice assisting! 🎙️