feat: fine-tune pipeline for writing voice model #37

Open
opened 2026-04-22 07:06:45 -07:00 by pyr0ball · 0 comments

Summary

Fine-tune the top-ranked model from the voice benchmark (#prev) on Alan's writing corpus to produce a voice-matched local model for Magpie draft generation.

Depends on

  • Voice benchmark ticket (run first, identify base model)

Approach

Dataset prep

  • Format voice corpus as instruction-tuning pairs:
    • Input: thread title + thread body + signal reason
    • Output: reply in Alan's voice
  • Augment with rephrased variants to reduce overfitting
  • Target: 200-500 training pairs
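
A minimal sketch of the pair format described above, assuming each corpus record carries a thread title, thread body, signal reason, and reply. The `input`/`output` keys, the prompt layout, and the helper names are illustrative assumptions; the schema that `scripts/finetune.py` actually expects is not stated in this ticket.

```python
import json


def build_pair(title: str, body: str, signal_reason: str, reply: str) -> dict:
    """Turn one corpus record into an instruction-tuning pair.

    The key names and prompt layout are hypothetical, not the harness's schema.
    """
    prompt = (
        f"Thread title: {title.strip()}\n"
        f"Thread body: {body.strip()}\n"
        f"Signal reason: {signal_reason.strip()}"
    )
    return {"input": prompt, "output": reply.strip()}


def to_jsonl(pairs: list[dict]) -> str:
    """Serialize pairs as JSON Lines, one training example per line."""
    return "\n".join(json.dumps(p, ensure_ascii=False) for p in pairs)


# Placeholder record for illustration only; not taken from the real corpus.
record = {
    "title": "Example thread title",
    "body": "Example thread body text.",
    "signal_reason": "example signal reason",
    "reply": "Example reply text.",
}
pair = build_pair(**record)
print(to_jsonl([pair]))
```

Rephrased variants would be additional records passed through the same function, so the augmented and original pairs share one format.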

Fine-tuning

  • Reuse Avocet's existing fine-tune harness (scripts/finetune.py)
  • Method: QLoRA (LoRA adapters trained on a 4-bit quantized base model); expected to fit on a single RTX 4000 (8 GB)
  • Base: winner from benchmark (likely Mistral-7B or similar)
  • Epochs: 3, eval every 50 steps
  • Output: LoRA adapter merged into the base weights, exported as GGUF quantized to Q4_K_M for cf-orch serving
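
As a rough check on how the epoch count and eval interval interact with the dataset target, the arithmetic can be sketched as below. The effective batch size of 8 is an assumption chosen for illustration; the ticket does not specify one, and the function name is hypothetical.

```python
import math


def training_schedule(
    num_pairs: int,
    epochs: int = 3,
    eval_every: int = 50,
    effective_batch_size: int = 8,  # assumption: not specified in the ticket
) -> dict:
    """Estimate optimizer steps and how many evals fire during training."""
    steps_per_epoch = math.ceil(num_pairs / effective_batch_size)
    total_steps = steps_per_epoch * epochs
    eval_points = total_steps // eval_every
    return {
        "steps_per_epoch": steps_per_epoch,
        "total_steps": total_steps,
        "eval_points": eval_points,
    }


# Both ends of the 200-500 pair target from the dataset prep section.
for n in (200, 500):
    print(n, training_schedule(n))
```

Under this assumed batch size, 200 pairs gives 75 total steps and a single eval point, while 500 pairs gives 189 steps and three. If the corpus lands near the low end of the target, a shorter eval interval or smaller batch may be worth considering.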

Eval

  • Blind comparison: fine-tuned vs base model on held-out thread samples
  • Pass criterion: human evaluators prefer the fine-tuned output in at least 70% of comparisons
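
One way the blind comparison could be run is sketched below: each held-out thread gets its two outputs shown under randomized labels A and B, and the rater's picks are scored against the hidden mapping. The data shapes, function names, and placeholder samples are assumptions for illustration, not part of the existing harness.

```python
import random


def make_blind_trials(samples, seed=0):
    """Randomize which side (A or B) shows the fine-tuned output.

    samples: list of (thread_id, base_output, finetuned_output) tuples.
    """
    rng = random.Random(seed)
    trials = []
    for thread_id, base_out, tuned_out in samples:
        tuned_is_a = rng.random() < 0.5
        a, b = (tuned_out, base_out) if tuned_is_a else (base_out, tuned_out)
        trials.append({
            "thread_id": thread_id,
            "A": a,
            "B": b,
            "tuned_label": "A" if tuned_is_a else "B",  # hidden from the rater
        })
    return trials


def preference_rate(trials, choices):
    """Fraction of trials where the rater picked the fine-tuned output.

    choices: mapping of thread_id -> "A" or "B".
    """
    wins = sum(1 for t in trials if choices[t["thread_id"]] == t["tuned_label"])
    return wins / len(trials)


def passes(rate, threshold=0.70):
    """Apply the 70% pass threshold from this ticket."""
    return rate >= threshold


# Placeholder samples and simulated rater picks, for illustration only.
samples = [(i, f"base reply {i}", f"tuned reply {i}") for i in range(10)]
trials = make_blind_trials(samples)
choices = {}
for i, t in enumerate(trials):
    other = "B" if t["tuned_label"] == "A" else "A"
    choices[t["thread_id"]] = t["tuned_label"] if i < 7 else other
rate = preference_rate(trials, choices)
print(rate, passes(rate))
```

Keeping `tuned_label` out of whatever the rater sees is what makes the comparison blind; the fixed seed only makes the label assignment reproducible.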

Output

  • models/voice-v1.gguf (gitignored, stored in /devl/models/)
  • Model card: docs/voice-model-v1.md (training params, eval results, known quirks)
  • cf-orch registration: add to models inventory so Magpie can route to it
Reference: Circuit-Forge/avocet#37