# Your Questions Answered - Quick Reference

## TL;DR: Yes, Yes, and Multiple Options!

### Q1: Pre-trained "Hey Mycroft" Model?

**Answer: YES! ✅**

Download and use immediately:

```bash
./quick_start_hey_mycroft.sh
# Done in 5 minutes - no training!
```

The pre-trained model works great and saves you 1-2 hours of training time.

### Q2: Multiple Wake Words?

**Answer: YES! ✅ (with considerations)**

**Server-side (Heimdall):** Easy, run 3-5 wake words

```bash
python voice_server_enhanced.py \
    --enable-precise \
    --multi-wake-word
```

**Edge (K210):** Feasible for 1-2, challenging for 3+

### Q3: Adopting New Users' Voices?

**Answer: Multiple approaches ✅**

- **Best option:** Train one model with everyone's voices upfront
- **Alternative:** Incremental retraining as new users join
- **Advanced:** Speaker identification with personalization

---

## Detailed Answers

### 1. Pre-trained "Hey Mycroft" Model

#### Where to Get It

```bash
# Quick start script does this for you
wget https://github.com/MycroftAI/precise-data/raw/models-dev/hey-mycroft.tar.gz
tar xzf hey-mycroft.tar.gz
```

#### How to Use

**Instant deployment:**

```bash
python voice_server.py \
    --enable-precise \
    --precise-model ~/precise-models/pretrained/hey-mycroft.net
```

**Fine-tune with your voice:**

```bash
# Record 20-30 samples of your voice saying "Hey Mycroft"
precise-collect

# Fine-tune from pre-trained
precise-train -e 30 my-hey-mycroft.net . \
    --from-checkpoint ~/precise-models/pretrained/hey-mycroft.net
```

#### Advantages

- ✅ **Zero training time** - Works immediately
- ✅ **Proven accuracy** - Tested by thousands
- ✅ **Good baseline** - Already includes diverse voices
- ✅ **Easy fine-tuning** - Add your voice in 30 mins vs 60+ mins from scratch

#### When to Use Pre-trained vs Custom

**Use Pre-trained "Hey Mycroft" when:**

- You want to test quickly
- "Hey Mycroft" is an acceptable wake word
- You want proven accuracy out-of-box

**Train Custom when:**

- You want a different wake word ("Hey Computer", "Jarvis", etc.)
- You need maximum accuracy for your specific environment
- You want a family-specific wake word

**Hybrid (Recommended):**

- Start with pre-trained "Hey Mycroft"
- Test and learn the system
- Fine-tune with your samples
- Or add custom wake word later

---

### 2. Multiple Wake Words

#### Can You Have Multiple?

**Yes!** Options:

#### Option A: Server-Side (Recommended)

**Easy implementation:**

```bash
# Use the enhanced server
python voice_server_enhanced.py \
    --enable-precise \
    --multi-wake-word
```

**Configured wake words:**

- "Hey Mycroft" (pre-trained)
- "Hey Computer" (custom)
- "Jarvis" (custom)

**Resource impact:**

- 3 models = ~15-30% CPU (Heimdall handles easily)
- ~300-600MB RAM
- Each model runs independently

**Example use cases:**

```text
"Hey Mycroft, what's the time?"  → General assistant
"Jarvis, run diagnostics"        → Personal assistant mode
"Emergency, call help"           → Priority/emergency mode
```

#### Option B: Edge (K210)

**Feasible for 1-2 wake words:**

```python
# Sequential checking: return the first model that fires, or None.
# detect_wake_word() is assumed to be defined elsewhere in the K210 firmware.
def check_wake_words(models=('hey-mycroft.kmodel', 'emergency.kmodel')):
    for model in models:
        if detect_wake_word(model):
            return model
    return None
```

**Limitations:**

- +50-100ms latency per additional model
- Memory constraints (6MB total for all models)
- More models = more power consumption

**Recommendation:**

- K210: 1 wake word (optimal)
- K210: 2 wake words (acceptable)
- K210: 3+ wake words (not recommended)

#### Option C: Contextual Wake Words

Different wake words for different purposes:

```python
wake_word_contexts = {
    'hey_mycroft': 'general_assistant',
    'emergency': 'priority_emergency',
    'goodnight': 'bedtime_routine',
}
```

#### Should You Use Multiple?

**One wake word is usually enough!**

Commercial products (Alexa, Google) typically use one wake word, and they work fine.

**Use multiple when:**

- Different family members want different wake words
- You want context-specific behaviors (emergency vs. general)
- You enjoy the flexibility

**Start with one, add more later if needed.**

---

### 3. Adopting New Users' Voices

#### Challenge

Same wake word, different voices:

- Mom says "Hey Mycroft" (soprano)
- Dad says "Hey Mycroft" (bass)
- Kids say "Hey Mycroft" (high-pitched)

All need to work!

#### Solution 1: Diverse Training (Recommended)

**During initial training, have everyone record samples:**

```bash
cd ~/precise-models/family-hey-mycroft

# Session 1: Mom records 30 samples
precise-collect  # Mom speaks "Hey Mycroft" 30 times

# Session 2: Dad records 30 samples
precise-collect  # Dad speaks "Hey Mycroft" 30 times

# Session 3: Kids record 20 samples each
precise-collect  # Kids speak "Hey Mycroft" 40 times total

# Train one model with all voices
precise-train -e 60 family-hey-mycroft.net .

# Deploy
python voice_server.py \
    --enable-precise \
    --precise-model family-hey-mycroft.net
```

**Pros:**

- ✅ One model works for everyone
- ✅ Simple deployment
- ✅ No switching needed
- ✅ Works from day one

**Cons:**

- ❌ Need everyone's time upfront
- ❌ Slightly lower per-person accuracy than individual models

#### Solution 2: Incremental Training

**Start with one person, add others over time:**

```bash
# Week 1: Train with Dad's voice
precise-train -e 60 hey-mycroft.net .

# Week 2: Mom wants to use it
# Collect Mom's samples
precise-collect  # Mom records 20-30 samples

# Add to training set
cp mom-samples/* wake-word/

# Retrain from checkpoint (faster!)
precise-train -e 30 hey-mycroft.net . \
    --from-checkpoint hey-mycroft.net

# Now works for both Dad and Mom!

# Week 3: Kids want in
# Repeat process...
```

**Pros:**

- ✅ Don't need everyone upfront
- ✅ Easy to add new users
- ✅ Model improves gradually

**Cons:**

- ❌ New users may have issues initially
- ❌ Requires periodic retraining

#### Solution 3: Speaker Identification (Advanced)

**Identify who's speaking, use personalized model/settings:**

```bash
# Install speaker ID
pip install pyannote.audio scipy --break-system-packages

# Use enhanced server
python voice_server_enhanced.py \
    --enable-precise \
    --enable-speaker-id \
    --hf-token YOUR_HF_TOKEN
```

**Enroll users:**

```bash
# Record 30-second voice sample from each person
# POST to /speakers/enroll with audio + name
curl -F "name=alan" \
     -F "audio=@alan_voice.wav" \
     http://localhost:5000/speakers/enroll

curl -F "name=sarah" \
     -F "audio=@sarah_voice.wav" \
     http://localhost:5000/speakers/enroll
```

**Benefits:**

```python
# Illustrative handler; turn_on() is assumed to be provided by the server.
def handle(speaker, command):
    # Different responses per user
    if speaker == 'alan':
        turn_on('light.alan_office')
    elif speaker == 'sarah':
        turn_on('light.sarah_office')

    # Different permissions
    if speaker == 'kids' and command.startswith('buy'):
        return "Sorry, kids can't make purchases"
```

**Pros:**

- ✅ Personalized responses
- ✅ User-specific settings
- ✅ Better accuracy (optimized per voice)
- ✅ Can track who said what

**Cons:**

- ❌ More complex
- ❌ Privacy considerations
- ❌ Additional CPU/RAM (~10% + 200MB)
- ❌ Requires voice enrollment

#### Solution 4: Pre-trained Model (Easiest)

**"Hey Mycroft" already includes diverse voices!**

```bash
# Just use it - already trained on many voices
./quick_start_hey_mycroft.sh
```

The community model was trained with:

- Male and female voices
- Different accents
- Different ages
- Various environments

**It should work for most family members out-of-box!**

Then fine-tune if needed.

---

## Recommended Path for Your Situation

### Scenario: Family of 3-4 People

**Week 1: Quick Start**

```bash
# Use pre-trained "Hey Mycroft"
./quick_start_hey_mycroft.sh

# Test with all family members
# Likely works for everyone already!
```

**Week 2: Fine-tune if Needed**

```bash
# If someone has issues:
# Have them record 20 samples
# Fine-tune the model
precise-train -e 30 family-hey-mycroft.net . \
    --from-checkpoint ~/precise-models/pretrained/hey-mycroft.net
```

**Week 3: Add Features**

```bash
# If you want personalization:
python voice_server_enhanced.py \
    --enable-speaker-id

# Enroll each family member
```

### Scenario: Just You (or 1-2 People)

**Option 1: Pre-trained**

```bash
./quick_start_hey_mycroft.sh
# Done!
```

**Option 2: Custom Wake Word**

```bash
# Train custom "Hey Computer"
cd ~/precise-models/hey-computer
./1-record-wake-word.sh      # 50 samples
./2-record-not-wake-word.sh  # 200 samples
./3-train-model.sh
```

### Scenario: Multiple People + Multiple Wake Words

**Full setup:**

```bash
# Pre-trained for family
./quick_start_hey_mycroft.sh

# Personal wake word for Dad
cd ~/precise-models/jarvis
# Train custom wake word

# Emergency wake word
cd ~/precise-models/emergency
# Train emergency wake word

# Run multi-wake-word server
python voice_server_enhanced.py \
    --enable-precise \
    --multi-wake-word \
    --enable-speaker-id
```

---

## Quick Decision Matrix

| Your Situation | Recommendation |
|----------------|----------------|
| **Just getting started** | Pre-trained "Hey Mycroft" |
| **Want different wake word** | Train custom model |
| **Family of 3-4** | Pre-trained + fine-tune if needed |
| **Want personalization** | Add speaker ID |
| **Multiple purposes** | Multiple wake words (server-side) |
| **Deploying to K210** | 1 wake word, no speaker ID |

---

## Files to Use

**Quick start with pre-trained:**

- `quick_start_hey_mycroft.sh` - Zero training, 5 minutes!

**Multiple wake words:**

- `voice_server_enhanced.py` - Multi-wake-word + speaker ID support

**Training custom:**

- `setup_precise.sh` - Setup training environment
- Scripts in `~/precise-models/your-wake-word/`

**Documentation:**

- `WAKE_WORD_ADVANCED.md` - Detailed guide (this is comprehensive!)
- `PRECISE_DEPLOYMENT.md` - Production deployment

---

## Summary

- ✅ **Yes**, pre-trained "Hey Mycroft" exists and works great
- ✅ **Yes**, you can have multiple wake words (server-side is easy)
- ✅ **Yes**, multiple approaches for multi-user support

**Recommended approach:**

1. Start with `./quick_start_hey_mycroft.sh` (5 mins)
2. Test with all family members
3. Fine-tune if anyone has issues
4. Add speaker ID later if you want personalization
5. Consider multiple wake words only if you have specific use cases

**Keep it simple!** One pre-trained wake word works for most people.

---

## Next Actions

**Ready to start?**

```bash
# 5-minute quick start
./quick_start_hey_mycroft.sh

# Or read more first
cat WAKE_WORD_ADVANCED.md
```

**Questions?**

- Pre-trained models: See WAKE_WORD_ADVANCED.md § Pre-trained
- Multiple wake words: See WAKE_WORD_ADVANCED.md § Multiple Wake Words
- Voice adaptation: See WAKE_WORD_ADVANCED.md § Voice Adaptation

**Happy voice assisting! 🎙️**
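
---

## Appendix: Routing Sketch

The contextual wake-word mapping from Option C and the per-speaker permission check from Solution 3 can be combined in a few lines. This is a minimal sketch, not the code in `voice_server_enhanced.py`: the `route` function, the `RESTRICTED_SPEAKERS` set, and the returned strings are hypothetical names chosen for illustration, and letting the emergency context skip the permission check is an assumption.

```python
from typing import Optional

# Wake word -> context, copied from the Option C mapping.
WAKE_WORD_CONTEXTS = {
    "hey_mycroft": "general_assistant",
    "emergency": "priority_emergency",
    "goodnight": "bedtime_routine",
}

# Hypothetical: enrolled speaker labels that may not make purchases.
RESTRICTED_SPEAKERS = {"kids"}


def route(wake_word: str, speaker: Optional[str], command: str) -> str:
    """Return a short description of how a command would be handled."""
    context = WAKE_WORD_CONTEXTS.get(wake_word)
    if context is None:
        # No model is configured for this wake word.
        return "ignored: unknown wake word"

    # Assumption: emergency requests bypass per-speaker restrictions.
    if context == "priority_emergency":
        return f"{context}: {command}"

    # Per-speaker permission check, as in the Solution 3 example.
    if speaker in RESTRICTED_SPEAKERS and command.lower().startswith("buy"):
        return "denied: kids can't make purchases"

    return f"{context}: {command}"
```

For example, `route("hey_mycroft", "kids", "buy a pony")` returns the denial message, while `route("emergency", "kids", "call help")` is passed through under the emergency context. Passing `speaker=None` models a server running without speaker ID, in which case only the wake-word context applies.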