SFT corrections: add failure_category field for richer candidate classification #16

Closed
opened 2026-04-08 21:43:15 -07:00 by pyr0ball · 0 comments
Owner

Context: The current Corrections tab only captures what to do with a candidate (correct/discard/flag), not why it failed. During the first live benchmark run we identified at least two distinct failure modes that need to be distinguished for useful SFT training data: scoring artifacts (model was right, benchmark pattern was too strict) and style violations (correct answer but violated instruction constraints like verbosity or unsolicited advice). Without categorization, all corrections are mixed together and the fine-tune harness can't filter by failure type.

Scope:

  • Add optional failure_category field to SubmitRequest Pydantic model (app/sft.py)
  • Store failure_category on the candidate record when submitting
  • Include failure_category in the SFT export JSONL
  • Add category selector to SftCard.vue (dropdown or chip group, shown before submitting)
  • Update SftQueueItem TypeScript interface to include the field
  • Update useSftStore submit action to pass category
  • Add tests for new field (backend + frontend)
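
The backend side of the scope above can be sketched roughly as follows. This is a minimal sketch, not the actual `app/sft.py`: only `failure_category` is specified by this issue, and the other field names (`candidate_id`, `action`, `corrected_text`) are assumptions about the existing `SubmitRequest` shape.

```python
from typing import Optional
from pydantic import BaseModel

class SubmitRequest(BaseModel):
    # Assumed existing fields -- placeholders for whatever SubmitRequest
    # already carries in app/sft.py.
    candidate_id: str
    action: str  # "correct" | "discard" | "flag"
    corrected_text: Optional[str] = None
    # New field from this issue: optional so existing clients and records
    # that never send it remain valid.
    failure_category: Optional[str] = None

# Omitting the field is valid and defaults to None.
req = SubmitRequest(candidate_id="c-1", action="correct")
assert req.failure_category is None
```

Making the field `Optional` with a `None` default keeps old payloads parsing unchanged, which is what the acceptance criteria require.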

Failure category taxonomy:

  • scoring_artifact — model was right, benchmark pattern too strict
  • style_violation — correct answer but violated instruction constraints (verbosity, format, unsolicited advice)
  • partial_answer — right direction, missing key elements
  • wrong_answer — genuinely incorrect reasoning
  • format_error — correct content but wrong output structure
  • hallucination — invented facts
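
One way to keep the taxonomy as a single source of truth is a module-level constant the backend validates against (the frontend chip group could be generated from the same list). This is a sketch of one possible approach, not a prescribed implementation; the function name is hypothetical.

```python
# Single source of truth for the taxonomy defined in this issue.
FAILURE_CATEGORIES = frozenset({
    "scoring_artifact",
    "style_violation",
    "partial_answer",
    "wrong_answer",
    "format_error",
    "hallucination",
})

def validate_category(value):
    """Accept None (field is optional) or a known category; reject anything else."""
    if value is not None and value not in FAILURE_CATEGORIES:
        raise ValueError(f"unknown failure_category: {value!r}")
    return value

assert validate_category(None) is None
assert validate_category("hallucination") == "hallucination"
```

Alternatively, a `typing.Literal` on the Pydantic field would push this validation into the model itself; either way, unknown strings should be rejected so the export stays filterable.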

Out of scope: Changing how the benchmark harness scores candidates; adding category-based filtering to the export endpoint (follow-on).

Acceptance criteria:

  • Reviewer can optionally select a failure category before submitting a correction
  • Category is stored on the candidate record and appears in exported JSONL
  • Null/omitted category is valid (field is optional, existing records unaffected)
  • Backend and frontend tests cover the new field
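
The export behavior in the criteria above can be sketched like this; the record shape (`prompt`/`completion`) is an assumption about the existing JSONL format, and only the `failure_category` key is specified by this issue.

```python
import json

def export_record(candidate: dict) -> str:
    """Serialize one candidate as a JSONL line, carrying the new field."""
    return json.dumps({
        "prompt": candidate["prompt"],
        "completion": candidate["completion"],
        # .get() yields None (JSON null) for records that predate the field.
        "failure_category": candidate.get("failure_category"),
    })

# An existing record without the field exports with failure_category: null.
line = export_record({"prompt": "p", "completion": "c"})
assert json.loads(line)["failure_category"] is None
```

Emitting an explicit `null` rather than omitting the key keeps every JSONL line the same shape, which simplifies the follow-on category filtering.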

Related: Builds on feat/sft-corrections (PR #15); discovered during first ollama benchmark run 2026-04-08

pyr0ball added this to the Beta — Benchmark Harness milestone 2026-04-08 21:43:15 -07:00
pyr0ball added the enhancement label 2026-04-08 21:43:15 -07:00
Reference: Circuit-Forge/avocet#16