SFT corrections: add failure_category field for richer candidate classification #16
Reference: Circuit-Forge/avocet#16
Context: The current Corrections tab only captures what to do with a candidate (correct/discard/flag), not why it failed. During the first live benchmark run we identified at least two distinct failure modes that need to be distinguished for useful SFT training data: scoring artifacts (model was right, benchmark pattern was too strict) and style violations (correct answer but violated instruction constraints like verbosity or unsolicited advice). Without categorization, all corrections are mixed together and the fine-tune harness can't filter by failure type.
Scope:
- Add `failure_category` field to `SubmitRequest` Pydantic model (`app/sft.py`)
- Store `failure_category` on the candidate record when submitting
- Include `failure_category` in the SFT export JSONL
- Add a category selector to `SftCard.vue` (dropdown or chip group, shown before submitting)
- Update `SftQueueItem` TypeScript interface to include the field
- Update `useSftStore` submit action to pass category

Failure category taxonomy:
- `scoring_artifact`: model was right, benchmark pattern too strict
- `style_violation`: correct answer but violated instruction constraints (verbosity, format, unsolicited advice)
- `partial_answer`: right direction, missing key elements
- `wrong_answer`: genuinely incorrect reasoning
- `format_error`: correct content but wrong output structure
- `hallucination`: invented facts

Out of scope: Changing how the benchmark harness scores candidates; adding category-based filtering to the export endpoint (follow-on).
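A minimal sketch of how the taxonomy above could be validated and carried into an export line. It uses only the standard library so it runs on its own; the real `SubmitRequest` is a Pydantic model, and the same `Enum` could presumably be used as the field type there. The names `FailureCategory`, `SubmitRequestSketch`, `to_export_line`, and the fields `candidate_id` and `action` are hypothetical, and making the category optional is an assumption, not something this issue specifies.

```python
import json
from dataclasses import asdict, dataclass
from enum import Enum
from typing import Optional


class FailureCategory(str, Enum):
    """Taxonomy from this issue; values are the proposed field strings."""
    SCORING_ARTIFACT = "scoring_artifact"
    STYLE_VIOLATION = "style_violation"
    PARTIAL_ANSWER = "partial_answer"
    WRONG_ANSWER = "wrong_answer"
    FORMAT_ERROR = "format_error"
    HALLUCINATION = "hallucination"


@dataclass
class SubmitRequestSketch:
    # Hypothetical stand-in; the actual SubmitRequest in app/sft.py may differ.
    candidate_id: str
    action: str  # "correct", "discard", or "flag"
    # Assumption: optional, so submissions without a category stay valid.
    failure_category: Optional[FailureCategory] = None

    def __post_init__(self) -> None:
        # Coerce raw strings to the enum; unknown values raise ValueError.
        if self.failure_category is not None:
            self.failure_category = FailureCategory(self.failure_category)


def to_export_line(req: SubmitRequestSketch) -> str:
    """Serialize one record as a single JSONL line, category as a plain string."""
    record = asdict(req)
    category = req.failure_category
    record["failure_category"] = category.value if category is not None else None
    return json.dumps(record, sort_keys=True)


if __name__ == "__main__":
    print(to_export_line(SubmitRequestSketch("c1", "correct", "scoring_artifact")))
```

Rejecting unknown strings at submit time keeps the export free of misspelled categories, which matters once the fine-tune harness starts filtering on this field.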
Acceptance criteria:
Related: Builds on `feat/sft-corrections` (PR #15); discovered during first ollama benchmark run 2026-04-08