# Wake Word Quick Reference Card

## 🎯 TL;DR: What Should I Do?

### Recommendation for Your Setup

**Week 1:** Use pre-trained "Hey Mycroft"
```bash
./download_pretrained_models.sh --model hey-mycroft
precise-listen ~/precise-models/pretrained/hey-mycroft.net
```

**Week 2-3:** Fine-tune with all family members' voices
```bash
cd ~/precise-models/hey-mycroft-family
precise-train -e 30 custom.net . --from-checkpoint ../pretrained/hey-mycroft.net
```

**Week 4+:** Add speaker identification
```bash
pip install resemblyzer
python enroll_speaker.py --name Alan --duration 20
python enroll_speaker.py --name [Family] --duration 20
```

**Month 2+:** Add second wake word (Hey Jarvis for Plex?)
```bash
./download_pretrained_models.sh --model hey-jarvis
# Run both in parallel on server
```

---

## 📋 Pre-trained Models

### Available Models (Ready to Use!)

| Wake Word | Download | Best For |
|-----------|----------|----------|
| **Hey Mycroft** ⭐ | `--model hey-mycroft` | Default choice, most data |
| **Hey Jarvis** | `--model hey-jarvis` | Pop culture, media control |
| **Christopher** | `--model christopher` | Unique, less common |
| **Hey Ezra** | `--model hey-ezra` | Alternative option |

### Quick Download

```bash
# Download one
./download_pretrained_models.sh --model hey-mycroft

# Download all
./download_pretrained_models.sh --test-all

# Test immediately
precise-listen ~/precise-models/pretrained/hey-mycroft.net
```

---

## 🔢 Multiple Wake Words

### Option 1: Multiple Models (Server-Side) ⭐ RECOMMENDED

**What:** Run 2-3 different wake word models simultaneously
**Where:** Heimdall (server)
**Performance:** ~15-30% CPU for 3 models

```bash
# Start with multiple wake words
python voice_server.py \
  --enable-precise \
  --precise-models "\
hey-mycroft:~/models/hey-mycroft.net:0.5,\
hey-jarvis:~/models/hey-jarvis.net:0.5"
```

**Pros:**
- ✅ Can identify which wake word was used
- ✅ Different contexts (Mycroft=commands, Jarvis=media)
- ✅ Easy to add/remove wake words
- ✅ Each can have
  different sensitivity

**Cons:**
- ❌ Only works server-side (not on Maix Duino)
- ❌ Higher CPU usage (but still reasonable)

**Use When:**
- You want different wake words for different purposes
- Server has CPU to spare (yours does!)
- Want flexibility to add wake words later

### Option 2: Single Multi-Phrase Model (Edge-Compatible)

**What:** One model responds to multiple phrases
**Where:** Server OR Maix Duino
**Performance:** Same as single model

```bash
# Train on multiple phrases
cd ~/precise-models/multi-wake
# Record "Hey Mycroft" samples → wake-word/
# Record "Hey Computer" samples → wake-word/
# Record negatives → not-wake-word/
precise-train -e 60 multi-wake.net .
```

**Pros:**
- ✅ Single model = less compute
- ✅ Works on edge (K210)
- ✅ Simple deployment

**Cons:**
- ❌ Can't tell which wake word was used
- ❌ May reduce accuracy
- ❌ Higher false positive risk

**Use When:**
- Deploying to Maix Duino (edge)
- Want backup wake words
- Don't care which was used

---

## 👥 Multi-User Support

### Option 1: Inclusive Training ⭐ START HERE

**What:** One model, all voices
**How:** All family members record samples

```bash
cd ~/precise-models/family-wake
# Alice records 30 samples
# Bob records 30 samples
# You record 30 samples
precise-train -e 60 family-wake.net .
```

**Pros:**
- ✅ Everyone can use it
- ✅ Simple deployment
- ✅ Single model

**Cons:**
- ❌ Can't identify who spoke
- ❌ No personalization

**Use When:**
- Just getting started
- Don't need to know who spoke
- Want simplicity

### Option 2: Speaker Identification (Week 4+)

**What:** Detect wake word, then identify speaker
**How:** Voice embeddings (resemblyzer or pyannote)

```bash
# Install
pip install resemblyzer

# Enroll users
python enroll_speaker.py --name Alan --duration 20
python enroll_speaker.py --name Alice --duration 20
python enroll_speaker.py --name Bob --duration 20

# Server identifies speaker automatically
```

**Pros:**
- ✅ Personalized responses
- ✅ User-specific permissions
- ✅ Better privacy
- ✅ Track preferences

**Cons:**
- ❌ More complex
- ❌ Requires enrollment
- ❌ +100-200ms latency
- ❌ May fail with similar voices

**Use When:**
- Want personalization
- Need user-specific commands
- Ready for advanced features

### Option 3: Per-User Wake Words (Advanced)

**What:** Each person has their own wake word
**How:** Multiple models, one per person

```bash
# Alice: "Hey Mycroft"
# Bob: "Hey Jarvis"
# You: "Hey Computer"
# Run all 3 models in parallel
```

**Pros:**
- ✅ Automatic user ID
- ✅ Highest accuracy per user
- ✅ Clear separation

**Cons:**
- ❌ 3x models = 3x CPU
- ❌ Users must remember their word
- ❌ Server-only (not edge)

**Use When:**
- Need automatic user ID
- Have CPU to spare
- Users want their own wake word

---

## 🎯 Decision Tree

```
START: Want to use voice assistant
│
├─ Single user or don't care who spoke?
│  └─ Use: Inclusive Training (Option 1)
│     └─ Download: Hey Mycroft (pre-trained)
│
├─ Multiple users AND need to know who spoke?
│  └─ Use: Speaker Identification (Option 2)
│     └─ Start with: Hey Mycroft + resemblyzer
│
├─ Want different wake words for different purposes?
│  └─ Use: Multiple Models (Option 1)
│     └─ Download: Hey Mycroft + Hey Jarvis
│
└─ Deploying to Maix Duino (edge)?
   └─ Use: Single Multi-Phrase Model (Option 2)
      └─ Train: Custom model with 2-3 phrases
```

---

## 📊 Comparison Table

| Feature | Inclusive | Speaker ID | Per-User Wake | Multiple Wake |
|---------|-----------|------------|---------------|---------------|
| **Setup Time** | 2 hours | 4 hours | 6 hours | 3 hours |
| **Complexity** | ⭐ Easy | ⭐⭐⭐ Medium | ⭐⭐⭐⭐ Hard | ⭐⭐ Easy |
| **CPU Usage** | 5-10% | 10-15% | 15-30% | 15-30% |
| **Latency** | 100ms | 300ms | 100ms | 100ms |
| **User ID** | ❌ No | ✅ Yes | ✅ Yes | ❌ No |
| **Edge Deploy** | ✅ Yes | ⚠️ Maybe | ❌ No | ⚠️ Partial |
| **Personalize** | ❌ No | ✅ Yes | ✅ Yes | ⚠️ Partial |

---

## 🚀 Recommended Timeline

### Week 1: Get It Working

```bash
# Use pre-trained Hey Mycroft
./download_pretrained_models.sh --model hey-mycroft

# Test it
precise-listen ~/precise-models/pretrained/hey-mycroft.net

# Deploy to server
python voice_server.py --enable-precise \
  --precise-model ~/precise-models/pretrained/hey-mycroft.net
```

### Week 2-3: Make It Yours

```bash
# Fine-tune with your family's voices
cd ~/precise-models/hey-mycroft-family

# Have everyone record 20-30 samples
precise-collect  # Alice
precise-collect  # Bob
precise-collect  # You

# Train
precise-train -e 30 custom.net . \
  --from-checkpoint ../pretrained/hey-mycroft.net
```

### Week 4+: Add Intelligence

```bash
# Speaker identification
pip install resemblyzer
python enroll_speaker.py --name Alan --duration 20
python enroll_speaker.py --name Alice --duration 20

# Now server knows who's speaking!
```

### Month 2+: Expand Features

```bash
# Add second wake word for media control
./download_pretrained_models.sh --model hey-jarvis

# Run both: Mycroft for commands, Jarvis for Plex
python voice_server.py --enable-precise \
  --precise-models "mycroft:hey-mycroft.net:0.5,jarvis:hey-jarvis.net:0.5"
```

---

## 💡 Pro Tips

### Wake Word Selection
- ✅ **DO:** Choose clear, distinct wake words
- ✅ **DO:** Test in your environment
- ❌ **DON'T:** Use similar-sounding words
- ❌ **DON'T:** Use common phrases

### Training
- ✅ **DO:** Include all intended users
- ✅ **DO:** Record in various conditions
- ✅ **DO:** Add false positives to training
- ❌ **DON'T:** Rush the training process

### Deployment
- ✅ **DO:** Start simple (one wake word)
- ✅ **DO:** Test thoroughly before adding features
- ✅ **DO:** Monitor false positive rate
- ❌ **DON'T:** Deploy too many wake words at once

### Speaker ID
- ✅ **DO:** Use 20+ seconds for enrollment
- ✅ **DO:** Re-enroll if accuracy drops
- ✅ **DO:** Test threshold values
- ❌ **DON'T:** Expect 100% accuracy

---

## 🔧 Quick Commands

```bash
# Download pre-trained model
./download_pretrained_models.sh --model hey-mycroft

# Test model
precise-listen ~/precise-models/pretrained/hey-mycroft.net

# Fine-tune from pre-trained
precise-train -e 30 custom.net . \
  --from-checkpoint ~/precise-models/pretrained/hey-mycroft.net

# Enroll speaker
python enroll_speaker.py --name Alan --duration 20

# Start with single wake word
python voice_server.py --enable-precise \
  --precise-model hey-mycroft.net

# Start with multiple wake words
python voice_server.py --enable-precise \
  --precise-models "mycroft:hey-mycroft.net:0.5,jarvis:hey-jarvis.net:0.5"

# Check status
curl http://10.1.10.71:5000/wake-word/status

# Monitor detections
curl http://10.1.10.71:5000/wake-word/detections
```

---

## 📚 See Also

- **Full guide:** [ADVANCED_WAKE_WORD_TOPICS.md](ADVANCED_WAKE_WORD_TOPICS.md)
- **Training:** [MYCROFT_PRECISE_GUIDE.md](MYCROFT_PRECISE_GUIDE.md)
- **Deployment:** [PRECISE_DEPLOYMENT.md](PRECISE_DEPLOYMENT.md)
- **Getting started:** [QUICKSTART.md](QUICKSTART.md)

---

## ❓ FAQ

**Q: Can I use "Hey Mycroft" right away?**
A: Yes! Download it with `./download_pretrained_models.sh --model hey-mycroft`.

**Q: How many wake words can I run at once?**
A: 2-3 comfortably on the server. The Maix Duino can handle 1.

**Q: Can I train my own custom wake word?**
A: Yes! See MYCROFT_PRECISE_GUIDE.md, Phase 2.

**Q: Does speaker ID work with multiple wake words?**
A: Yes! Wake word detected → speaker identified → personalized response.

**Q: Can I use this on the Maix Duino?**
A: Start server-side (recommended), then convert to KMODEL (advanced).

**Q: How accurate is speaker identification?**
A: 85-95% with good enrollment. Re-enroll if accuracy drops.

**Q: What if someone has a cold?**
A: Accuracy may drop temporarily. The system should recover once their voice returns to normal.

**Q: Can kids use it?**
A: Yes! Include their voices in training or enroll them separately.

---

**Quick Decision:** Start with pre-trained Hey Mycroft. Add features later!

```bash
./download_pretrained_models.sh --model hey-mycroft
precise-listen ~/precise-models/pretrained/hey-mycroft.net
# It just works! ✨
```
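
---

## 🧩 Appendix: What "Server Identifies Speaker" Computes

One step this card references but never shows: the matching logic behind the speaker-ID option. With resemblyzer, each enrolled user is stored as a fixed-size voice embedding, and a fresh utterance is accepted as a given user when the cosine similarity between embeddings clears a threshold. Below is a minimal sketch of that matching step, with random 256-dimensional vectors standing in for real `VoiceEncoder` embeddings; the 0.75 threshold is an illustrative assumption to tune, not a resemblyzer default.

```python
import numpy as np

def identify_speaker(embedding, enrolled, threshold=0.75):
    """Return (name, score) for the best cosine match among enrolled
    users, or (None, score) when the best score misses the threshold."""
    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    best_name, best_score = None, -1.0
    for name, ref in enrolled.items():
        score = cosine(embedding, ref)
        if score > best_score:
            best_name, best_score = name, score

    if best_score < threshold:
        return None, best_score  # unknown speaker
    return best_name, best_score

# Simulated enrollment database. Real code would store
# VoiceEncoder().embed_utterance(wav) per person instead.
rng = np.random.default_rng(0)
enrolled = {"Alan": rng.normal(size=256), "Alice": rng.normal(size=256)}

# A new utterance from Alan: his enrolled voice plus a little noise.
query = enrolled["Alan"] + 0.1 * rng.normal(size=256)
name, score = identify_speaker(query, enrolled)
```

The +100-200ms latency noted above comes from computing the embedding, not this comparison. The threshold is the knob behind the "Test threshold values" tip: raising it trades fewer false accepts for more "unknown speaker" results.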