# MicroPython/MaixPy Quirks and Compatibility Notes

**Date:** 2025-12-03
**MicroPython Version:** v0.6.2-89-gd8901fd22 on 2024-06-17
**Hardware:** Sipeed Maixduino (K210)

This document captures all the compatibility issues and workarounds discovered while developing the voice assistant client for Maixduino.

---

## String Formatting

### ❌ F-strings NOT supported
```python
# WRONG - SyntaxError
message = f"IP: {ip}"
temperature = f"Temp: {temp}°C"
```

### ✅ Use string concatenation
```python
# CORRECT
message = "IP: " + str(ip)
temperature = "Temp: " + str(temp) + "°C"
```

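Concatenation gets noisy once several values are involved. As a sketch, `str.format()` and `%` formatting are two alternatives that avoid f-string syntax. Both are part of upstream MicroPython, but whether this particular MaixPy build includes them is an assumption worth checking at the REPL. The values of `ip` and `temp` below are made up for illustration.

```python
# Illustrative values; on the device these would come from the network
# stack and a sensor reading.
ip = "192.168.1.10"
temp = 23.5

# str.format(): positional placeholder, no f-string syntax required
message = "IP: {}".format(ip)

# printf-style formatting with one decimal place
temperature = "Temp: %.1f°C" % temp

print(message)
print(temperature)
```

If either call fails on the device, plain concatenation with `str()` remains the safe fallback.
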
---

## Conditional Expressions (Ternary Operator)

### ❌ Inline ternary expressions NOT supported
```python
# WRONG - SyntaxError on this firmware build
plural = "s" if count > 1 else ""
message = "Found " + str(count) + " item" + ("s" if count > 1 else "")
```

### ✅ Use explicit if/else blocks
```python
# CORRECT
if count > 1:
    plural = "s"
else:
    plural = ""
message = "Found " + str(count) + " item" + plural
```

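When the same choice is needed in several places, the explicit block can be wrapped in a small helper so call sites stay on one line. This is a minimal sketch; the name `pluralize` is hypothetical and not part of the original client code.

```python
def pluralize(count, word):
    """Return e.g. '1 item' or '3 items' without a conditional expression."""
    if count == 1:
        suffix = ""
    else:
        suffix = "s"
    return str(count) + " " + word + suffix


message = "Found " + pluralize(3, "item")
print(message)
```

The helper tests `count == 1` rather than `count > 1`, so a count of zero reads "0 items".
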
---

## String Methods

### ❌ decode() doesn't accept keyword arguments
```python
# WRONG - TypeError: function doesn't take keyword arguments
text = response.decode('utf-8', errors='ignore')
```

### ✅ Use positional arguments only (or catch exceptions)
```python
# CORRECT
try:
    text = response.decode('utf-8')
except Exception:
    # Fall back to the repr-style string (e.g. "b'...'") if decoding fails
    text = str(response)
```

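The `str(response)` fallback yields a `b'...'` style string, which is awkward to parse afterwards. A possible alternative, sketched below, approximates `errors='ignore'` by keeping only printable ASCII when decoding fails. The helper name `safe_decode` is hypothetical, and whether this firmware raises on invalid UTF-8 at all is an assumption.

```python
def safe_decode(raw):
    """Decode bytes as UTF-8, dropping non-ASCII bytes if decoding fails."""
    try:
        return raw.decode('utf-8')
    except Exception:
        out = ""
        for b in raw:  # iterating bytes yields integers
            # Keep printable ASCII plus tab, LF and CR
            if 32 <= b < 127 or b in (9, 10, 13):
                out += chr(b)
        return out


print(safe_decode(b"HTTP/1.1 200 OK"))
print(safe_decode(b"a\xffb"))
```

Building the string character by character allocates on every step, so this is only suitable for short responses such as HTTP status lines.
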
---

## Display/LCD Color Format

### ❌ RGB tuples NOT accepted
```python
# WRONG - TypeError: can't convert tuple to int
COLOR_RED = (255, 0, 0)
lcd.draw_string(10, 50, "Hello", COLOR_RED, 0)
```

### ✅ Use bit-packed integers
```python
# CORRECT - Pack RGB into a single integer (24-bit layout shown)
def rgb_to_int(r, g, b):
    return (r << 16) | (g << 8) | b

COLOR_RED = rgb_to_int(255, 0, 0)
lcd.draw_string(10, 50, "Hello", COLOR_RED, 0)
```

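Many small LCD controllers take 16-bit RGB565 values rather than 24-bit ones. If colors come out wrong with the packing above, the sketch below shows the RGB565 layout (5 bits red, 6 bits green, 5 bits blue). Whether the `lcd` module on this firmware expects this layout, and in which byte order, is an assumption to verify on hardware. The function name `rgb888_to_rgb565` is hypothetical.

```python
def rgb888_to_rgb565(r, g, b):
    """Pack 8-bit R, G, B components into one 16-bit RGB565 integer."""
    # Top 5 bits of red, top 6 bits of green, top 5 bits of blue
    return ((r & 0xF8) << 8) | ((g & 0xFC) << 3) | (b >> 3)


COLOR_RED = rgb888_to_rgb565(255, 0, 0)
COLOR_GREEN = rgb888_to_rgb565(0, 255, 0)
COLOR_BLUE = rgb888_to_rgb565(0, 0, 255)

print(hex(COLOR_RED), hex(COLOR_GREEN), hex(COLOR_BLUE))
```
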
---

## Network - WiFi Module

### ❌ Standard network.WLAN NOT available
```python
# WRONG - AttributeError: 'module' object has no attribute 'WLAN'
import network
nic = network.WLAN(network.STA_IF)
```

### ✅ Use network.ESP32_SPI for Maixduino
```python
# CORRECT - Requires full pin configuration
from network import ESP32_SPI
from fpioa_manager import fm

# Register all 6 SPI pins
fm.register(25, fm.fpioa.GPIOHS10, force=True)  # CS
fm.register(8, fm.fpioa.GPIOHS11, force=True)   # RST
fm.register(9, fm.fpioa.GPIOHS12, force=True)   # RDY
fm.register(28, fm.fpioa.GPIOHS13, force=True)  # MOSI
fm.register(26, fm.fpioa.GPIOHS14, force=True)  # MISO
fm.register(27, fm.fpioa.GPIOHS15, force=True)  # SCLK

nic = ESP32_SPI(
    cs=fm.fpioa.GPIOHS10,
    rst=fm.fpioa.GPIOHS11,
    rdy=fm.fpioa.GPIOHS12,
    mosi=fm.fpioa.GPIOHS13,
    miso=fm.fpioa.GPIOHS14,
    sclk=fm.fpioa.GPIOHS15
)

nic.connect(SSID, PASSWORD)
```

### ❌ active() method NOT available
```python
# WRONG - AttributeError: 'ESP32_SPI' object has no attribute 'active'
nic.active(True)
```

### ✅ Just use connect() directly
```python
# CORRECT
nic.connect(SSID, PASSWORD)
```

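Because `connect()` can be slow or fail outright, it may help to wrap it in a bounded retry loop. The sketch below assumes the `nic` object exposes `isconnected()` alongside `connect()`; the helper name `connect_with_retry` and its parameters are hypothetical, and the function takes the interface as an argument so it can be exercised off-device with a stand-in object.

```python
import time


def connect_with_retry(nic, ssid, password, attempts=3, delay_s=2):
    """Try nic.connect() up to `attempts` times; return True once connected."""
    for attempt in range(attempts):
        try:
            nic.connect(ssid, password)
        except Exception as e:
            # Report the failure and fall through to the connectivity check
            print("connect attempt " + str(attempt + 1) + " failed: " + str(e))
        if nic.isconnected():  # assumed to exist on the interface object
            return True
        time.sleep(delay_s)
    return False
```

On the device this would be called as `connect_with_retry(nic, SSID, PASSWORD)` after the pin registration shown above.
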
---

## I2S Audio

### ❌ record() does not return raw bytes
```python
# WRONG - record() returns an Audio object; using the result where bytes
# are expected raises TypeError: object with buffer protocol required
chunk = i2s_dev.record(1024)
```

### ✅ Returns Audio object, use to_bytes()
```python
# CORRECT
audio_obj = i2s_dev.record(total_bytes)
audio_data = audio_obj.to_bytes()
```

**Note:** Audio data often comes in unexpected formats:
- Expected: 16-bit mono PCM
- Reality: Often 32-bit and/or stereo (up to 4x the expected size)
- Solution: Implement format detection and conversion

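The detection and conversion step from the note above can be sketched as follows. The byte layout is an assumption, not something confirmed for this firmware: little-endian frames of two 32-bit channels, with the useful 16 bits in the upper half of the first channel. The name `to_mono16_inplace` and its parameters are hypothetical; adjust `frame_bytes` and `offset` to whatever the captured data actually looks like.

```python
def to_mono16_inplace(buf, frame_bytes=8, offset=2):
    """Compact frames to 16-bit mono inside the same bytearray.

    Copies two bytes per frame, starting `offset` bytes into each frame,
    towards the front of `buf`. Returns the number of valid bytes.
    """
    frames = len(buf) // frame_bytes
    for i in range(frames):
        src = i * frame_bytes + offset
        # The write position (2*i) never overtakes the read position (src)
        buf[2 * i] = buf[src]
        buf[2 * i + 1] = buf[src + 1]
    return frames * 2


# Two fake frames: [left 32-bit][right 32-bit], right channel filled with 9s
raw = bytearray([0, 0, 0x34, 0x12, 9, 9, 9, 9,
                 0, 0, 0x78, 0x56, 9, 9, 9, 9])
expected_size = 4                    # two 16-bit mono samples
ratio = len(raw) // expected_size    # 4x larger suggests 32-bit stereo
if ratio == 4:
    valid = to_mono16_inplace(raw, 8, 2)
else:
    valid = len(raw)
print(ratio, valid)
```

Working in place avoids allocating a second buffer, which matters given the memory limits described in the next section.
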
---

## Memory Management

### Memory is VERY limited (~6MB total, much less available)

**Problems encountered:**
- Allocating large bytearrays can fail (observed above ~100KB)
- Multiple allocations cause fragmentation
- Creating new buffers is costly, so in-place operations are preferred

### ❌ Creating new buffers
```python
# WRONG - MemoryError on large data
compressed = bytearray()
for i in range(0, len(data), 4):
    compressed.extend(data[i:i+2])  # Allocates new memory
```

### ✅ Work with smaller chunks or compress during transmission
```python
# CORRECT - Process in smaller pieces
chunk_size = 512
for i in range(0, len(data), chunk_size):
    chunk = data[i:i+chunk_size]
    process_chunk(chunk)  # Handle incrementally
```

**Solutions implemented:**
1. Reduce recording duration (3s → 1s)
2. Compress audio (μ-law: 50% size reduction)
3. Stream transmission in small chunks (512 bytes)
4. Add delays between sends to prevent buffer overflow

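The μ-law step in the list above can be done in place, so no second buffer is needed. The sketch below follows the standard G.711 μ-law encoding and is not necessarily the exact code used on the device; the names `linear_to_ulaw` and `compress_inplace` are hypothetical. It assumes the input is little-endian signed 16-bit mono PCM.

```python
BIAS = 0x84    # 132, the standard G.711 mu-law bias
CLIP = 32635   # largest magnitude that still fits after adding the bias


def linear_to_ulaw(sample):
    """Encode one signed 16-bit PCM sample as an 8-bit mu-law byte."""
    sign = 0
    if sample < 0:
        sign = 0x80
        sample = -sample
    if sample > CLIP:
        sample = CLIP
    sample += BIAS
    # Find the segment number from the highest set bit (bit 14 down to bit 7)
    exponent = 7
    mask = 0x4000
    while exponent > 0 and (sample & mask) == 0:
        exponent -= 1
        mask >>= 1
    mantissa = (sample >> (exponent + 3)) & 0x0F
    return ~(sign | (exponent << 4) | mantissa) & 0xFF


def compress_inplace(buf):
    """Overwrite the front of `buf` with mu-law bytes; return their count."""
    samples = len(buf) // 2
    for i in range(samples):
        s = buf[2 * i] | (buf[2 * i + 1] << 8)
        if s >= 0x8000:
            s -= 0x10000          # convert to signed
        buf[i] = linear_to_ulaw(s)  # write index never passes read index
    return samples


pcm = bytearray(b"\x00\x00\xff\x7f\x00\x80")  # samples: 0, 32767, -32768
count = compress_inplace(pcm)
print(count, bytes(pcm[:count]))
```

Only the first `count` bytes of the buffer are meaningful afterwards; the server side needs a matching μ-law decoder.
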
---

## String Operations

### ❌ Arithmetic in string concatenation
```python
# WRONG - SyntaxError (sometimes)
message = "Count: #" + str(count + 1)
```

### ✅ Separate arithmetic from concatenation
```python
# CORRECT
next_count = count + 1
message = "Count: #" + str(next_count)
```

---

## Bytearray Operations

### ❌ Item deletion NOT supported
```python
# WRONG - TypeError: 'bytearray' object doesn't support item deletion
del audio_data[expected_size:]
```

### ✅ Create new bytearray with slice
```python
# CORRECT
audio_data = audio_data[:expected_size]
# Or create new buffer
trimmed = bytearray(expected_size)
trimmed[:] = audio_data[:expected_size]
```

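Both forms above copy the data, which briefly doubles memory use. Where the trimmed data is only going to be read (for example, handed to a socket), a `memoryview` slice may avoid the copy. This is a sketch that assumes `memoryview` is enabled in the firmware build, which is worth confirming at the REPL.

```python
audio_data = bytearray(b"\x01\x02\x03\x04\x05\x06\x07\x08")
expected_size = 4

# A memoryview slice refers to the same memory instead of copying it
view = memoryview(audio_data)[:expected_size]
print(len(view))

# Changes to the underlying buffer are visible through the view
audio_data[0] = 0x09
print(view[0])
```
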
---

## HTTP Requests

### ❌ urequests module NOT available
```python
# WRONG - ImportError: no module named 'urequests'
import urequests
response = urequests.post(url, data=data)
```

### ✅ Use raw socket HTTP
```python
# CORRECT
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((host, port))

# Manual HTTP headers
headers = "POST /path HTTP/1.1\r\n"
headers += "Host: " + host + "\r\n"
headers += "Content-Type: audio/wav\r\n"
headers += "Content-Length: " + str(len(data)) + "\r\n"
headers += "Connection: close\r\n\r\n"

s.send(headers.encode())
s.send(data)

response = s.recv(1024)
s.close()
```

**Socket I/O errors common:**
- `[Errno 5] EIO` - Buffer overflow or disconnect
- Solutions:
  - Send smaller chunks (512-1024 bytes)
  - Add delays between sends (`time.sleep_ms(10)`)
  - Enable keepalive if supported

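The first two mitigations can be combined in one helper that replaces the single `s.send(data)` call. This is a minimal sketch with hypothetical names (`send_in_chunks`, `pause_ms`); it falls back to `time.sleep()` where `time.sleep_ms()` does not exist so it can be tried on a desktop, and it assumes `send()` returns either the byte count or nothing.

```python
import time


def pause_ms(ms):
    # time.sleep_ms exists on MicroPython; fall back to time.sleep elsewhere
    if hasattr(time, "sleep_ms"):
        time.sleep_ms(ms)
    else:
        time.sleep(ms / 1000)


def send_in_chunks(sock, data, chunk_size=512, delay_ms=10):
    """Send `data` through `sock` in small pieces, pausing between sends."""
    total = len(data)
    sent = 0
    while sent < total:
        n = sock.send(data[sent:sent + chunk_size])
        if n is None:
            # Some implementations return nothing; assume the chunk went out
            n = min(chunk_size, total - sent)
        if n <= 0:
            raise OSError("send made no progress")
        sent += n
        pause_ms(delay_ms)
    return sent
```

With the socket from the example above, the body would be sent as `send_in_chunks(s, data)` after the headers.
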
---

## Best Practices for MaixPy

1. **Avoid complex expressions** - Break into simple steps
2. **Pre-allocate when possible** - Reduce fragmentation
3. **Use small buffers** - 512-1024 byte chunks work well
4. **Add delays in loops** - Prevent watchdog/buffer issues
5. **Explicit type conversions** - Always use `str()`, `int()`, etc.
6. **Test incrementally** - Memory errors appear suddenly
7. **Monitor serial output** - Errors often give hints
8. **Simplify, simplify** - Complexity = bugs in MicroPython

---

## Testing Methodology

When porting Python code to MaixPy:

1. Start with the simplest version (hardcoded values)
2. Test each function individually via REPL
3. Add features incrementally
4. Watch for memory errors (usually allocation failures)
5. If an error occurs, simplify the last change
6. Use print statements liberally (no debugger available)

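For step 4, printing the free heap at a few checkpoints makes it easier to see which change pushed memory over the edge. The sketch below uses `gc.mem_free()`, which MicroPython provides; the helper names `free_memory` and `checkpoint` are hypothetical, and the `hasattr` guard only exists so the snippet also runs on a desktop Python.

```python
import gc


def free_memory():
    """Return free heap bytes after a collection, or None if unavailable."""
    gc.collect()
    if hasattr(gc, "mem_free"):
        return gc.mem_free()
    return None  # e.g. CPython on a desktop


def checkpoint(label):
    """Print and return the free heap size with a label."""
    free = free_memory()
    print(label + ": " + str(free) + " bytes free")
    return free


checkpoint("after imports")
```

Calling `checkpoint()` before and after each newly added feature keeps the print-based debugging from step 6 focused on memory.
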
---

## Hardware-Specific Notes

### Maixduino ESP32 WiFi
- Requires manual pin registration
- 6 pins must be configured (CS, RST, RDY, MOSI, MISO, SCLK)
- Connection can be slow (20+ seconds)
- Stability improves with smaller packet sizes

### I2S Microphone
- Returns Audio objects, not raw bytes
- Format often differs from what was configured
- May return stereo when mono is requested
- May return 32-bit when 16-bit is requested
- Always implement format detection/conversion

### BOOT Button (GPIO 16)
- Active low (0 = pressed, 1 = released)
- Requires pull-up configuration
- Debounce by waiting for release
- Can be used without interrupts (polling is fine)

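The polling and debounce approach for the BOOT button can be sketched without touching hardware by passing in a function that reads the pin. The name `wait_for_press` and the `read_pin` parameter are hypothetical; on the device, `read_pin` would be the `value` method of a GPIO input configured with a pull-up.

```python
import time


def wait_for_press(read_pin, poll_s=0.01):
    """Block until an active-low button is pressed and then released."""
    while read_pin() == 1:   # 1 = released, keep waiting
        time.sleep(poll_s)
    while read_pin() == 0:   # 0 = pressed, wait for release (debounce)
        time.sleep(poll_s)
    return True


# Simulated pin readings: released, released, pressed, pressed, released
readings = iter([1, 1, 0, 0, 1])
pressed = wait_for_press(lambda: next(readings), 0)
print(pressed)
```

Waiting for the release before returning means one physical press triggers one recording, even if the button is held down.
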
---

## Resources

- **MaixPy Documentation:** https://maixpy.sipeed.com/
- **K210 Datasheet:** https://canaan.io/product/kendryteai
- **ESP32 SPI Firmware:** https://github.com/sipeed/MaixPy_scripts/tree/master/network

---

## Summary of Successful Patterns

```text
# Audio recording and transmission pipeline
1. Record audio → Audio object (128KB for 1 second)
2. Convert to bytes → to_bytes() (still 128KB)
3. Detect format → Check size vs expected
4. Convert to mono 16-bit → In-place copy (32KB)
5. Compress with μ-law → 50% reduction (16KB)
6. Send in chunks → 512 bytes at a time with delays
7. Parse response → Simple string operations

# Total: ~87% size reduction (128KB → 16KB), fits in memory!
```

This approach works reliably on K210 with ~6MB RAM.

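The sizes in the pipeline can be checked with a few lines of arithmetic. The sketch assumes a 16 kHz sample rate, which is not stated in these notes but is consistent with the 128KB, 32KB and 16KB figures for one second of audio.

```python
SAMPLE_RATE = 16000   # Hz; an assumption, not stated in the notes
DURATION_S = 1

# bytes = rate * seconds * channels * bytes per sample
raw_bytes = SAMPLE_RATE * DURATION_S * 2 * 4      # stereo, 32-bit
mono16_bytes = SAMPLE_RATE * DURATION_S * 1 * 2   # mono, 16-bit
ulaw_bytes = SAMPLE_RATE * DURATION_S * 1 * 1     # mono, 8-bit mu-law

reduction = 100 * (raw_bytes - ulaw_bytes) / raw_bytes
chunks = (ulaw_bytes + 511) // 512                # 512-byte sends, rounded up

print(raw_bytes, mono16_bytes, ulaw_bytes)
print(str(reduction) + "% smaller, " + str(chunks) + " chunks")
```

Under these assumptions the payload shrinks from 128,000 to 16,000 bytes, a reduction of 87.5%, and goes out in 32 sends.
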
---

**Last Updated:** 2025-12-03
**Status:** Fully tested and working