
# Your Questions Answered - Quick Reference

## TL;DR: Yes, Yes, and Multiple Options!

### Q1: Pre-trained "Hey Mycroft" Model?

**Answer: YES! ✅**

Download and use immediately:
```bash
./quick_start_hey_mycroft.sh
# Done in 5 minutes - no training!
```

The pre-trained model works great and saves you 1-2 hours of training time.

### Q2: Multiple Wake Words?

**Answer: YES! ✅ (with considerations)**

**Server-side (Heimdall):** Easy, run 3-5 wake words
```bash
python voice_server_enhanced.py \
    --enable-precise \
    --multi-wake-word
```

**Edge (K210):** Feasible for 1-2, challenging for 3+

### Q3: Adopting New Users' Voices?

**Answer: Multiple approaches ✅**

- **Best option:** Train one model with everyone's voices upfront
- **Alternative:** Incremental retraining as new users join
- **Advanced:** Speaker identification with personalization

---

## Detailed Answers

### 1. Pre-trained "Hey Mycroft" Model

#### Where to Get It

```bash
# Quick start script does this for you
wget https://github.com/MycroftAI/precise-data/raw/models-dev/hey-mycroft.tar.gz
tar xzf hey-mycroft.tar.gz
```

#### How to Use

**Instant deployment:**
```bash
python voice_server.py \
    --enable-precise \
    --precise-model ~/precise-models/pretrained/hey-mycroft.net
```

**Fine-tune with your voice:**
```bash
# Record 20-30 samples of your voice saying "Hey Mycroft"
precise-collect

# Fine-tune from pre-trained
precise-train -e 30 my-hey-mycroft.net . \
    --from-checkpoint ~/precise-models/pretrained/hey-mycroft.net
```

#### Advantages

- ✅ **Zero training time** - Works immediately
- ✅ **Proven accuracy** - Tested by thousands
- ✅ **Good baseline** - Already includes diverse voices
- ✅ **Easy fine-tuning** - Add your voice in 30 mins vs 60+ mins from scratch

#### When to Use Pre-trained vs Custom

**Use Pre-trained "Hey Mycroft" when:**
- You want to test quickly
- "Hey Mycroft" is an acceptable wake word
- You want proven accuracy out-of-box

**Train Custom when:**
- You want a different wake word ("Hey Computer", "Jarvis", etc.)
- You need maximum accuracy for your specific environment
- You want a family-specific wake word

**Hybrid (Recommended):**
- Start with pre-trained "Hey Mycroft"
- Test and learn the system
- Fine-tune with your samples
- Or add a custom wake word later

---

### 2. Multiple Wake Words

#### Can You Have Multiple?

**Yes!** There are three options:

#### Option A: Server-Side (Recommended)

**Easy implementation:**
```bash
# Use the enhanced server
python voice_server_enhanced.py \
    --enable-precise \
    --multi-wake-word
```

**Configured wake words:**
- "Hey Mycroft" (pre-trained)
- "Hey Computer" (custom)
- "Jarvis" (custom)

**Resource impact:**
- 3 models = ~15-30% CPU (Heimdall handles this easily)
- ~300-600MB RAM
- Each model runs independently
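
One way to picture "each model runs independently" is that every detector scores the same audio chunk and the server acts on the best score above a threshold. The sketch below is hypothetical: `pick_wake_word`, the stub detectors, and the 0.5 threshold are illustrative assumptions, not the actual API of `voice_server_enhanced.py`.

```python
from typing import Callable, Dict, Optional

# A detector takes a chunk of raw audio and returns a confidence in [0, 1].
# These callables stand in for real wake word engines (assumption).
Detector = Callable[[bytes], float]


def pick_wake_word(
    chunk: bytes,
    detectors: Dict[str, Detector],
    threshold: float = 0.5,
) -> Optional[str]:
    """Return the name of the best-scoring detector, or None if none fire."""
    best_name = None
    best_score = threshold
    for name, detect in detectors.items():
        score = detect(chunk)  # each model scores the same chunk on its own
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


# Stub detectors with fixed scores, for illustration only.
detectors = {
    "hey_mycroft": lambda chunk: 0.91,
    "hey_computer": lambda chunk: 0.12,
    "jarvis": lambda chunk: 0.40,
}
print(pick_wake_word(b"\x00" * 2048, detectors))
```

Picking the single best score avoids triggering two assistants at once when two models fire on the same chunk.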

**Example use cases:**
```text
"Hey Mycroft, what's the time?" → General assistant
"Jarvis, run diagnostics"        → Personal assistant mode
"Emergency, call help"           → Priority/emergency mode
```

#### Option B: Edge (K210)

**Feasible for 1-2 wake words:**
```python
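# Sequential checking: return the first model that fires, else None.
# detect_wake_word() is assumed to be defined elsewhere in the K210 firmware.
def first_detected(models=('hey-mycroft.kmodel', 'emergency.kmodel')):
    for model in models:
        if detect_wake_word(model):
            return model
    return None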
```

**Limitations:**
- +50-100ms latency per additional model
- Memory constraints (6MB total for all models)
- More models = more power consumption

**Recommendation:**
- K210: 1 wake word (optimal)
- K210: 2 wake words (acceptable)
- K210: 3+ wake words (not recommended)
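
The trade-off behind this recommendation can be made concrete with a little arithmetic. This sketch uses only the figures quoted in this section (50-100 ms per additional model, 6 MB for all models); the function name and the example model sizes are made up for illustration.

```python
EXTRA_LATENCY_MS = (50, 100)   # added latency per additional model (from above)
MODEL_MEMORY_BUDGET_MB = 6.0   # total memory for all models (from above)


def k210_budget(model_sizes_mb):
    """Estimate added latency and check the memory budget for a set of models."""
    extra_models = max(len(model_sizes_mb) - 1, 0)
    low, high = (extra_models * ms for ms in EXTRA_LATENCY_MS)
    total_mb = sum(model_sizes_mb)
    return {
        "added_latency_ms": (low, high),
        "total_mb": total_mb,
        "fits": total_mb <= MODEL_MEMORY_BUDGET_MB,
    }


# Hypothetical model sizes in MB; measure your own .kmodel files.
print(k210_budget([2.5]))
print(k210_budget([2.5, 2.5]))
print(k210_budget([2.5, 2.5, 2.5]))
```

With these assumed sizes, two models still fit the budget and add 50-100 ms, while a third pushes the total past 6 MB.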

#### Option C: Contextual Wake Words

Different wake words for different purposes:
```python
wake_word_contexts = {
    'hey_mycroft': 'general_assistant',
    'emergency': 'priority_emergency',
    'goodnight': 'bedtime_routine',
}
```
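
A mapping like this becomes useful once each context is tied to a handler. The following is a minimal sketch under assumptions: the handler functions, `HANDLERS`, and `route` are hypothetical names, and the real server may wire contexts differently.

```python
wake_word_contexts = {
    "hey_mycroft": "general_assistant",
    "emergency": "priority_emergency",
    "goodnight": "bedtime_routine",
}


def general_assistant(command: str) -> str:
    return f"assistant: {command}"


def priority_emergency(command: str) -> str:
    return f"EMERGENCY: {command}"


def bedtime_routine(command: str) -> str:
    return "bedtime: lights off, doors locked"


# Context name -> handler (hypothetical handlers defined above).
HANDLERS = {
    "general_assistant": general_assistant,
    "priority_emergency": priority_emergency,
    "bedtime_routine": bedtime_routine,
}


def route(wake_word: str, command: str) -> str:
    """Look up the context for a wake word and run its handler."""
    # Unknown wake words fall back to the general assistant.
    context = wake_word_contexts.get(wake_word, "general_assistant")
    return HANDLERS[context](command)


print(route("emergency", "call help"))
print(route("hey_mycroft", "what's the time?"))
```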

#### Should You Use Multiple?

**One wake word is usually enough!**

Commercial products (Alexa, Google Assistant) typically respond to a single wake phrase per device, and they work fine.

**Use multiple when:**
- Different family members want different wake words
- You want context-specific behaviors (emergency vs. general)
- You enjoy the flexibility

**Start with one, add more later if needed.**

---

### 3. Adopting New Users' Voices

#### Challenge

Same wake word, different voices:
- Mom says "Hey Mycroft" (soprano)
- Dad says "Hey Mycroft" (bass)
- Kids say "Hey Mycroft" (high-pitched)

All need to work!

#### Solution 1: Diverse Training (Recommended)

**During initial training, have everyone record samples:**

```bash
cd ~/precise-models/family-hey-mycroft

# Session 1: Mom records 30 samples
precise-collect  # Mom speaks "Hey Mycroft" 30 times

# Session 2: Dad records 30 samples
precise-collect  # Dad speaks "Hey Mycroft" 30 times

# Session 3: Kids record 20 samples each
precise-collect  # Kids speak "Hey Mycroft" 40 times total

# Train one model with all voices
precise-train -e 60 family-hey-mycroft.net .

# Deploy
python voice_server.py \
    --enable-precise \
    --precise-model family-hey-mycroft.net
```

**Pros:**
- ✅ One model works for everyone
- ✅ Simple deployment
- ✅ No switching needed
- ✅ Works from day one

**Cons:**
- ❌ Need everyone's time upfront
- ❌ Slightly lower per-person accuracy than individual models
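
Before training on the family dataset from Solution 1, it can help to confirm that every recording session actually landed in the dataset. The sketch below counts `.wav` files per folder. It assumes the dataset uses `wake-word/`, `not-wake-word/`, and matching `test/` subfolders; adjust the folder names if your setup differs. `count_samples` is a hypothetical helper, not part of Precise.

```python
from pathlib import Path


def count_samples(dataset_dir):
    """Count .wav files in each sample folder under dataset_dir."""
    root = Path(dataset_dir)
    # Assumed folder layout; missing folders are reported as 0.
    folders = ["wake-word", "not-wake-word", "test/wake-word", "test/not-wake-word"]
    return {
        name: len(list((root / name).glob("*.wav"))) if (root / name).is_dir() else 0
        for name in folders
    }


if __name__ == "__main__":
    for folder, count in count_samples(".").items():
        print(f"{folder}: {count} samples")
```

Run it from the dataset directory (for example `~/precise-models/family-hey-mycroft`) after the last recording session.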

#### Solution 2: Incremental Training

**Start with one person, add others over time:**

```bash
# Week 1: Train with Dad's voice
precise-train -e 60 hey-mycroft.net .

# Week 2: Mom wants to use it
# Collect Mom's samples
precise-collect  # Mom records 20-30 samples

# Add to training set
cp mom-samples/* wake-word/

# Retrain from checkpoint (faster!)
precise-train -e 30 hey-mycroft.net . \
    --from-checkpoint hey-mycroft.net

# Now works for both Dad and Mom!

# Week 3: Kids want in
# Repeat process...
```

**Pros:**
- ✅ Don't need everyone upfront
- ✅ Easy to add new users
- ✅ Model improves gradually

**Cons:**
- ❌ New users may have issues initially
- ❌ Requires periodic retraining
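
When merging a new person's recordings into the training set, a plain `cp` can silently overwrite existing files if the recordings share file names. The sketch below is one hedged alternative: it copies each `.wav` with a speaker prefix and skips files that were already merged. `add_speaker_samples` and the prefix scheme are hypothetical, not part of Precise.

```python
import shutil
from pathlib import Path


def add_speaker_samples(src_dir, dest_dir, speaker):
    """Copy .wav files into dest_dir, prefixing each name with the speaker."""
    src, dest = Path(src_dir), Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for wav in sorted(src.glob("*.wav")):
        target = dest / f"{speaker}-{wav.name}"
        if target.exists():
            continue  # already merged on an earlier run
        shutil.copy2(wav, target)
        copied.append(target.name)
    return copied
```

For the Week 2 step above, the call would be `add_speaker_samples("mom-samples", "wake-word", "mom")`, followed by the same retraining command.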

#### Solution 3: Speaker Identification (Advanced)

**Identify who's speaking, use personalized model/settings:**

```bash
# Install speaker ID
pip install pyannote.audio scipy --break-system-packages

# Use enhanced server
python voice_server_enhanced.py \
    --enable-precise \
    --enable-speaker-id \
    --hf-token YOUR_HF_TOKEN
```

**Enroll users:**
```bash
# Record 30-second voice sample from each person
# POST to /speakers/enroll with audio + name

curl -F "name=alan" \
     -F "audio=@alan_voice.wav" \
     http://localhost:5000/speakers/enroll

curl -F "name=sarah" \
     -F "audio=@sarah_voice.wav" \
     http://localhost:5000/speakers/enroll
```

**Benefits:**
```python
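# Different responses per user (turn_on() is assumed to come from your
# home-automation integration)
def turn_on_office_light(speaker):
    if speaker == 'alan':
        turn_on('light.alan_office')
    elif speaker == 'sarah':
        turn_on('light.sarah_office')

# Different permissions: returns a refusal message, or None if allowed
def check_permission(speaker, command):
    if speaker == 'kids' and command.startswith('buy'):
        return "Sorry, kids can't make purchases"
    return None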
```

**Pros:**
- ✅ Personalized responses
- ✅ User-specific settings
- ✅ Better accuracy (optimized per voice)
- ✅ Can track who said what

**Cons:**
- ❌ More complex
- ❌ Privacy considerations
- ❌ Additional CPU/RAM (~10% + 200MB)
- ❌ Requires voice enrollment
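
Conceptually, speaker identification often works by comparing an embedding of the incoming audio against the embeddings stored at enrollment. The sketch below illustrates that idea with cosine similarity. It is not how `voice_server_enhanced.py` or pyannote.audio is implemented; the function names, the 0.7 threshold, and the toy vectors are all assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify_speaker(embedding, enrolled, threshold=0.7):
    """Return the enrolled name most similar to embedding, or None."""
    best_name, best_score = None, threshold
    for name, reference in enrolled.items():
        score = cosine_similarity(embedding, reference)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


# Toy 3-dimensional embeddings; real speaker embeddings are much larger.
enrolled = {"alan": [0.9, 0.1, 0.0], "sarah": [0.1, 0.9, 0.2]}
print(identify_speaker([0.8, 0.2, 0.1], enrolled))  # close to alan's vector
print(identify_speaker([0.0, 0.0, 1.0], enrolled))  # matches nobody well
```

Returning `None` for low-similarity audio lets the server fall back to default, non-personalized behavior for guests or unenrolled family members.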

#### Solution 4: Pre-trained Model (Easiest)

**"Hey Mycroft" already includes diverse voices!**

```bash
# Just use it - already trained on many voices
./quick_start_hey_mycroft.sh
```

The community model was trained with:
- Male and female voices
- Different accents
- Different ages
- Various environments

**It should work for most family members out-of-box!**

Then fine-tune if needed.

---

## Recommended Path for Your Situation

### Scenario: Family of 3-4 People

**Week 1: Quick Start**
```bash
# Use pre-trained "Hey Mycroft"
./quick_start_hey_mycroft.sh

# Test with all family members
# Likely works for everyone already!
```

**Week 2: Fine-tune if Needed**
```bash
# If someone has issues:
# Have them record 20 samples
# Fine-tune the model

precise-train -e 30 family-hey-mycroft.net . \
    --from-checkpoint ~/precise-models/pretrained/hey-mycroft.net
```

**Week 3: Add Features**
```bash
# If you want personalization:
python voice_server_enhanced.py \
    --enable-speaker-id

# Enroll each family member
```

### Scenario: Just You (or 1-2 People)

**Option 1: Pre-trained**
```bash
./quick_start_hey_mycroft.sh
# Done!
```

**Option 2: Custom Wake Word**
```bash
# Train custom "Hey Computer"
cd ~/precise-models/hey-computer
./1-record-wake-word.sh  # 50 samples
./2-record-not-wake-word.sh  # 200 samples
./3-train-model.sh
```

### Scenario: Multiple People + Multiple Wake Words

**Full setup:**
```bash
# Pre-trained for family
./quick_start_hey_mycroft.sh

# Personal wake word for Dad
cd ~/precise-models/jarvis
# Train custom wake word

# Emergency wake word
cd ~/precise-models/emergency
# Train emergency wake word

# Run multi-wake-word server
python voice_server_enhanced.py \
    --enable-precise \
    --multi-wake-word \
    --enable-speaker-id
```

---

## Quick Decision Matrix

| Your Situation | Recommendation |
|----------------|----------------|
| **Just getting started** | Pre-trained "Hey Mycroft" |
| **Want different wake word** | Train custom model |
| **Family of 3-4** | Pre-trained + fine-tune if needed |
| **Want personalization** | Add speaker ID |
| **Multiple purposes** | Multiple wake words (server-side) |
| **Deploying to K210** | 1 wake word, no speaker ID |

---

## Files to Use

**Quick start with pre-trained:**
- `quick_start_hey_mycroft.sh` - Zero training, 5 minutes!

**Multiple wake words:**
- `voice_server_enhanced.py` - Multi-wake-word + speaker ID support

**Training custom:**
- `setup_precise.sh` - Setup training environment
- Scripts in `~/precise-models/your-wake-word/`

**Documentation:**
- `WAKE_WORD_ADVANCED.md` - Comprehensive, detailed guide
- `PRECISE_DEPLOYMENT.md` - Production deployment

---

## Summary

- ✅ **Yes**, pre-trained "Hey Mycroft" exists and works great
- ✅ **Yes**, you can have multiple wake words (server-side is easy)
- ✅ **Yes**, there are multiple approaches to multi-user support

**Recommended approach:**
1. Start with `./quick_start_hey_mycroft.sh` (5 mins)
2. Test with all family members
3. Fine-tune if anyone has issues
4. Add speaker ID later if you want personalization
5. Consider multiple wake words only if you have specific use cases

**Keep it simple!** One pre-trained wake word works for most people.

---

## Next Actions

**Ready to start?**

```bash
# 5-minute quick start
./quick_start_hey_mycroft.sh

# Or read more first
cat WAKE_WORD_ADVANCED.md
```

**Questions?**
- Pre-trained models: See WAKE_WORD_ADVANCED.md § Pre-trained
- Multiple wake words: See WAKE_WORD_ADVANCED.md § Multiple Wake Words
- Voice adaptation: See WAKE_WORD_ADVANCED.md § Voice Adaptation

**Happy voice assisting! 🎙️**