feat: writing style benchmark harness for local text-gen models #36
Summary
Build a benchmark harness that runs all available local text-gen models against a writing style evaluation corpus and scores them for voice match. Output is a ranked model table to inform fine-tune base selection.
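The ranked table could be produced along these lines. This is only a sketch: the helper name `render_ranked_table`, the column headers, and the convention that a higher score means a closer voice match are assumptions for illustration, and the issue does not yet fix a scoring method.

```python
from typing import Mapping


def render_ranked_table(scores: Mapping[str, float]) -> str:
    """Render per-model scores as a Markdown table, best score first.

    Hypothetical helper: assumes a higher score means a closer voice match.
    """
    # Sort by score descending; break ties by model name for stable output.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    lines = ["| Rank | Model | Voice match |", "| --- | --- | --- |"]
    for rank, (model, score) in enumerate(ranked, start=1):
        lines.append(f"| {rank} | {model} | {score:.3f} |")
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder model names and scores, not real results.
    print(render_ranked_table({"model-a": 0.5, "model-b": 0.75}))
```

Sorting on `(-score, name)` keeps the ordering deterministic when two models tie, which makes successive benchmark reports easier to diff.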
Steps
1. Corpus collection
data/voice_corpus/ -- plain text files, one per sample
2. Prompt design
3. Models to benchmark
(/api/v1/models for current inventory)
4. Scoring
benchmark_results/voice_YYYY-MM-DD.md ranked table
Acceptance criteria
python scripts/benchmark_voice.py --models all --samples data/voice_corpus/
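
A possible skeleton for the command-line entry point is sketched below. Only the `--models` and `--samples` flags and the `benchmark_results/voice_YYYY-MM-DD.md` output path come from this issue. The function names, the `.txt` extension for corpus files, and the comma-separated form of `--models` are assumptions, and model discovery and scoring are deliberately left out.

```python
import argparse
from datetime import date
from pathlib import Path


def parse_args(argv=None):
    """Parse the two flags named in the acceptance criteria."""
    parser = argparse.ArgumentParser(
        description="Benchmark local text-gen models for voice match."
    )
    # Assumption: 'all' or a comma-separated list of model names.
    parser.add_argument("--models", default="all")
    parser.add_argument("--samples", type=Path, required=True,
                        help="directory of plain text samples, one per file")
    return parser.parse_args(argv)


def load_samples(samples_dir: Path) -> dict[str, str]:
    """Read every sample file into a {filename: text} mapping."""
    # Assumption: corpus files use a .txt extension.
    return {
        path.name: path.read_text(encoding="utf-8")
        for path in sorted(samples_dir.glob("*.txt"))
    }


def results_path(today: date, out_dir: Path = Path("benchmark_results")) -> Path:
    """Build the dated report path, e.g. benchmark_results/voice_2024-01-02.md."""
    return out_dir / f"voice_{today.isoformat()}.md"


if __name__ == "__main__":
    args = parse_args()
    samples = load_samples(args.samples)
    print(f"{len(samples)} samples loaded; report would go to {results_path(date.today())}")
```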