feat: SFT failure_category — classify why a model response was wrong #17

Merged

pyr0ball merged 1 commit from feat/sft-failure-category into main

2026-04-08 22:19:20 -07:00

pyr0ball commented

2026-04-08 22:10:43 -07:00

Owner

Closes #16

Summary

- Adds an optional failure_category field (6-value Literal enum) to SubmitRequest and to stored candidate records
- Reviewers can now tag why a model failed: scoring_artifact, style_violation, partial_answer, wrong_answer, format_error, hallucination
- The category is forwarded in the SFT export JSONL so the fine-tune harness can filter by failure type
- UI: a chip-group selector appears during the correct/discard/flag flow; a two-step confirm for discard/flag reveals the chips before emitting; the field is optional, so submission is never blocked
- 14 new tests (3 backend, 11 frontend); 28+111 total passing
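
As a rough illustration of the first two points, here is a stdlib-only sketch of a six-value Literal type with an optional field. It is not the code in app/sft.py: the 422 response suggests the real SubmitRequest is validated by the web framework's request model, whereas this sketch validates by hand in a dataclass. The candidate_id and action fields and the ALLOWED_CATEGORIES name are hypothetical.

```python
from dataclasses import dataclass
from typing import Literal, Optional, get_args

# The six categories listed in this PR.
FailureCategory = Literal[
    "scoring_artifact",
    "style_violation",
    "partial_answer",
    "wrong_answer",
    "format_error",
    "hallucination",
]

# Hypothetical helper: the Literal's values as a set, for runtime checks.
ALLOWED_CATEGORIES = frozenset(get_args(FailureCategory))


@dataclass
class SubmitRequest:
    # candidate_id and action are assumed fields, for illustration only.
    candidate_id: str
    action: str
    # Optional: omitting it must never block a submission.
    failure_category: Optional[FailureCategory] = None

    def __post_init__(self) -> None:
        # Reject anything outside the taxonomy; None stays valid.
        if (
            self.failure_category is not None
            and self.failure_category not in ALLOWED_CATEGORIES
        ):
            raise ValueError(
                f"invalid failure_category: {self.failure_category!r}"
            )
```

In this sketch an unknown value such as "nonsense" raises ValueError; in the actual endpoint the equivalent rejection surfaces as the 422 described in the test plan.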

Note: This branch is based on feat/sft-corrections (PR #15). Should be merged after #15 lands.

Test plan

- Select a failure category chip before correcting; verify it appears in the exported JSONL
- Submit a correction without selecting a category; verify failure_category: null in export
- POST failure_category: "nonsense" directly to /api/sft/submit; verify 422
- Undo a correction; verify the undo works regardless of whether a category was set
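
The export checks above, and the harness-side filtering this field is meant to enable, could be scripted along these lines. This is a minimal sketch: it assumes each exported JSONL line is a JSON object carrying a failure_category key that is null when untagged. The filter_records function and the id field in the sample are hypothetical and not part of this PR.

```python
import json
from typing import Iterable, Iterator, Optional, Set


def filter_records(
    lines: Iterable[str],
    include: Optional[Set[str]] = None,
    exclude: Optional[Set[str]] = None,
) -> Iterator[dict]:
    """Yield parsed JSONL records, filtered by failure_category."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # None when the reviewer did not pick a category.
        category = record.get("failure_category")
        if exclude and category in exclude:
            continue
        if include is not None and category not in include:
            continue
        yield record


# Hypothetical sample export; only failure_category comes from this PR.
SAMPLE = [
    '{"id": "a", "failure_category": "scoring_artifact"}',
    '{"id": "b", "failure_category": "wrong_answer"}',
    '{"id": "c", "failure_category": null}',
]

# Drop scoring artifacts, keep everything else (including untagged rows).
kept = list(filter_records(SAMPLE, exclude={"scoring_artifact"}))

# Train only on genuine wrong answers.
wrong_only = list(filter_records(SAMPLE, include={"wrong_answer"}))
```

Untagged records pass an exclude filter but not an include filter, which keeps "no category selected" distinct from every named category.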


pyr0ball added 1 commit 2026-04-08 22:10:43 -07:00

feat: add failure_category field to SFT corrections (#16) 9633d9a535

Adds optional failure_category to SubmitRequest and candidate records so
reviewers can classify why a model response was wrong, not just what to do
with it. Enables the fine-tune harness to filter training data by failure
type (e.g. exclude scoring artifacts, train only on genuine wrong answers).

Taxonomy: scoring_artifact | style_violation | partial_answer |
          wrong_answer | format_error | hallucination

- app/sft.py: FailureCategory Literal type; SubmitRequest.failure_category;
  stored on candidate record in POST /submit correct branch
- tests/test_sft.py: 3 new tests (stores value, null round-trip, 422 on invalid)
- stores/sft.ts: SftFailureCategory type exported; SftQueueItem + SftLastAction
  updated; setLastAction accepts optional category param
- SftCard.vue: chip-group selector shown during correct/discard/flag flow;
  two-step confirm for discard/flag reveals chips before emitting; category
  forwarded in all emit payloads
- CorrectionsView.vue: handleCorrect/Discard/Flag accept and forward category
  to POST /api/sft/submit body and store.setLastAction
- SftCard.test.ts: 11 new tests covering chip visibility, selection,
  single-active enforcement, pending-action flow, emit payloads, cancel