Imitation pipeline: culture-fit survey analysis #31
Context: Peregrine's Survey Assistant page (7_Survey.py) helps candidates answer employer culture-fit surveys by analyzing question text or a screenshot and recommending answers. This is a two-mode pipeline: vision (screenshot input) and text (pasted survey). Both are candidates for fine-tuned replacement.
**What Peregrine uses this for:**

The user pastes survey question text or uploads a screenshot. In Quick mode the model outputs a lettered answer recommendation with a one-sentence rationale per question (format: `1. B — reason`). In Analysis mode it evaluates each answer option individually before making a recommendation. A system prompt frames the model as an advisor who knows the candidate values collaboration, communication, growth, and impact.

**Input/output schema (text path):**

- System prompt: "You are a job application advisor helping a candidate answer a culture-fit survey. The candidate values collaborative teamwork, clear communication, growth, and impact. Choose answers that present them in the best professional light."
- Output: `"1. B — brief reason\n2. A — brief reason\n..."` — one line per question
- Fallback chain: `research_fallback_order` from `config/llm.yaml`

**Input/output schema (vision/screenshot path):**
"1. B — brief reason"format extracted from the screenshotvision_fallback_orderfromconfig/llm.yaml(vision_service → claude_code → anthropic)Current model/fallback chain:
research_fallback_order(typicallyclaude_code → vllm → ollama_research → ...)vision_fallback_order(vision_service [moondream2] → claude_code → anthropic); non-vision backends skipped automatically when images are presentRecommended model domain:
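The skip behavior in the fallback chain above can be sketched as follows; the set of vision-capable backends here is an assumption for illustration, not Peregrine's actual configuration:

```python
# Sketch of fallback selection: try backends in configured order, skipping
# non-vision backends when the request includes an image.
# VISION_CAPABLE is an assumed set, drawn from the vision_fallback_order list.
VISION_CAPABLE = {"vision_service", "claude_code", "anthropic"}


def backends_to_try(order: list[str], has_image: bool) -> list[str]:
    """Return the fallback chain, filtered for vision support when needed."""
    if not has_image:
        return list(order)
    return [b for b in order if b in VISION_CAPABLE]
```

With an image present, a text-oriented chain like `["claude_code", "vllm", "ollama_research"]` collapses to just the vision-capable entries.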
**Can Avocet produce training data for it?**
Yes for the text path. Avocet's label tool is well-suited: present the survey text plus the model's recommendation and ask the labeler whether each answer choice is correct, with the option to override. The `survey_responses` table in Peregrine's `staging.db` already stores `raw_input`, `llm_output`, `mode`, and `source` per response — existing accepted outputs are silver labels.

**Suggested data collection approach:**
- `survey_responses` rows where `source = text_paste` and the user did not override the output (implicit acceptance signal)

**Related:** Peregrine `app/pages/7_Survey.py`; `survey_responses` table in `staging.db`
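A minimal sketch of harvesting these rows as silver labels, assuming the column names listed above. The line parser for the `1. B — brief reason` output format and the helper names are illustrative, and filtering out user-overridden rows is omitted because the issue does not say how overrides are recorded:

```python
import re
import sqlite3

# Quick-mode output lines look like "1. B — brief reason";
# accept either an em dash or a hyphen as the separator.
LINE_RE = re.compile(r"^\s*(\d+)\.\s*([A-Z])\s*[—-]\s*(.+)$")


def parse_recommendations(llm_output: str) -> dict[int, tuple[str, str]]:
    """Map question number -> (answer letter, rationale)."""
    parsed = {}
    for line in llm_output.splitlines():
        m = LINE_RE.match(line)
        if m:
            parsed[int(m.group(1))] = (m.group(2), m.group(3).strip())
    return parsed


def fetch_silver_labels(conn: sqlite3.Connection) -> list[dict]:
    """Pull pasted-text rows from survey_responses as (input, parsed output) pairs.

    Excluding user-overridden rows would need whatever override signal
    Peregrine actually stores; that filter is left out of this sketch.
    """
    rows = conn.execute(
        "SELECT raw_input, llm_output, mode FROM survey_responses "
        "WHERE source = 'text_paste'"
    ).fetchall()
    return [
        {"input": raw, "labels": parse_recommendations(out), "mode": mode}
        for raw, out, mode in rows
    ]
```

Each resulting record pairs the original survey text with a per-question `(letter, rationale)` mapping, which is the shape a fine-tuning dataset for the text path would need.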