Imitation pipeline: cover letter generation #28
Context: Peregrine generates personalized cover letters via LLM as its primary AI feature. A fine-tuned model already exists (`alex-cover-writer:latest`) but was trained on one user's 62 letters. Avocet is positioned to collect multi-user training data at scale for a generalizable replacement.

What Peregrine uses this for:
Given a job title, company name, and job-description (JD) excerpt, the model writes a full 3-4 paragraph cover letter in the candidate's established voice. The system prompt injects the candidate's name, career summary, and a personality/voice descriptor. Three past cover letters, retrieved by TF-IDF cosine similarity, are included as few-shot style examples. An optional mission-alignment hint is injected into paragraph 3 when the company matches a preferred industry (music, animal welfare, education, social impact, healthcare). Recruiter-context framing is injected for Jobgether listings. A refinement path accepts a prior draft plus free-text feedback and asks the model to revise.
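The retrieval step above could be sketched with a stdlib-only TF-IDF implementation (function names here are illustrative, not Peregrine's actual code):

```python
import math
from collections import Counter

def tfidf_vectors(docs: list[str]) -> list[dict[str, float]]:
    """Build sparse TF-IDF vectors for a small corpus (stdlib-only sketch)."""
    tokenized = [doc.lower().split() for doc in docs]
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))  # document frequency per term
    n = len(docs)
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append(
            {t: (c / len(tokens)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return vectors

def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def top_k_similar(query: str, letters: list[str], k: int = 3) -> list[str]:
    """Return the k past letters most similar to the query JD excerpt."""
    vecs = tfidf_vectors(letters + [query])
    qvec = vecs[-1]
    scored = sorted(
        zip(letters, vecs[:-1]),
        key=lambda pair: cosine(qvec, pair[1]),
        reverse=True,
    )
    return [letter for letter, _ in scored[:k]]
```

In production this would more likely use `scikit-learn`'s `TfidfVectorizer`; the point is only that the few-shot examples are chosen by lexical similarity to the JD, not by recency.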
Input/output schema:
`max_tokens=1200` cap; post-processed to trim at the first sign-off.

Current model/fallback chain:
`claude_code` → `ollama` (`alex-cover-writer:latest`) → `vllm` → `copilot` → `anthropic`. `alex-cover-writer:latest` is the live fine-tuned model (Llama-3.2-3B-Instruct, QLoRA rank 16, 10 epochs, 62-letter corpus).

Recommended model domain:
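The fallback chain plus the sign-off trim could look like the sketch below. The provider callables and the exact sign-off rule (keep through the sign-off line) are assumptions for illustration; the real providers are client objects, not plain functions:

```python
from collections.abc import Callable

SIGN_OFFS = {"sincerely", "best regards", "warm regards", "best"}

def trim_at_sign_off(text: str) -> str:
    """Keep output up to and including the first sign-off line (assumed rule)."""
    lines = text.splitlines()
    for i, line in enumerate(lines):
        if line.strip().rstrip(",").lower() in SIGN_OFFS:
            return "\n".join(lines[: i + 1])
    return text  # no sign-off found; return unchanged

def generate_with_fallback(
    prompt: str, providers: list[tuple[str, Callable[[str], str]]]
) -> str:
    """Try each provider in order; the first that succeeds wins."""
    last_err: Exception | None = None
    for name, call in providers:
        try:
            return trim_at_sign_off(call(prompt))
        except Exception as err:
            last_err = err  # provider unavailable or errored; try the next one
    raise RuntimeError("all providers failed") from last_err
```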
Instruction-following model in the 3B-7B range; generation task with a strong style-imitation requirement. Voice consistency is the primary quality axis, not factual accuracy. A 3B model fine-tuned on diverse user corpora should outperform a zero-shot 7B.
Can Avocet produce training data for it?
Yes — directly. Avocet's label tool can be extended with a cover letter review card type. Existing Peregrine outputs (user-approved letters that were actually submitted) are high-quality silver labels. The input prompt structure is deterministic and already logged.
Suggested data collection approach:
`staging.db` (`cover_letter` column on `applied` jobs) exported as silver-label JSONL using the existing `scripts/prepare_training_data.py` pattern.

Related:
`circuitforge-plans/peregrine/superpowers/`; Peregrine `scripts/generate_cover_letter.py`
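The suggested silver-label export from `staging.db` could follow this shape. The table name (`jobs`), column names other than `cover_letter`, the `status = 'applied'` filter, and the prompt/completion JSONL fields are assumptions for illustration:

```python
import json
import sqlite3

def export_silver_labels(db_path: str, out_path: str) -> int:
    """Dump approved cover letters from staging.db as prompt/completion JSONL."""
    conn = sqlite3.connect(db_path)
    rows = conn.execute(
        "SELECT title, company, jd_excerpt, cover_letter "
        "FROM jobs WHERE status = 'applied' AND cover_letter IS NOT NULL"
    ).fetchall()
    conn.close()
    with open(out_path, "w", encoding="utf-8") as f:
        for title, company, jd, letter in rows:
            f.write(json.dumps({
                "prompt": f"Job: {title} at {company}\nJD: {jd}",
                "completion": letter,
            }) + "\n")
    return len(rows)  # number of silver-label examples written
```

Because the prompt structure is deterministic and already logged, the same builder used at inference time should be reused here so that training prompts match serving prompts exactly.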