Avocet by Circuit Forge LLC — email classifier training tool: multi-account IMAP fetch, card-stack labeling UI, benchmark harness
Adds optional failure_category to SubmitRequest and candidate records so
reviewers can classify why a model response was wrong, not just what to do
with it. Enables the fine-tune harness to filter training data by failure
type (e.g. exclude scoring artifacts, train only on genuine wrong answers).
Taxonomy: scoring_artifact | style_violation | partial_answer |
wrong_answer | format_error | hallucination
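
A minimal sketch of how a fine-tune harness might filter by failure type, assuming candidate records are plain dicts with an optional `failure_category` key. The helper `filter_by_failure` and the record shape are hypothetical and not taken from the repository; only the six category names come from the taxonomy above.

```python
from typing import Iterable, Literal, Optional, get_args

# Taxonomy copied from the commit message.
FailureCategory = Literal[
    "scoring_artifact",
    "style_violation",
    "partial_answer",
    "wrong_answer",
    "format_error",
    "hallucination",
]


def filter_by_failure(
    records: Iterable[dict],
    include: Optional[Iterable[str]] = None,
    exclude: Optional[Iterable[str]] = None,
) -> list:
    """Hypothetical helper: keep records by failure_category.

    Records without a category (missing key or None) are kept unless
    an explicit include list is given.
    """
    valid = set(get_args(FailureCategory))
    include_set = set(include) if include is not None else None
    exclude_set = set(exclude or ())

    # Reject category names that are not part of the taxonomy.
    unknown = ((include_set or set()) | exclude_set) - valid
    if unknown:
        raise ValueError(f"unknown failure categories: {sorted(unknown)}")

    kept = []
    for rec in records:
        cat = rec.get("failure_category")  # optional field; may be None
        if cat in exclude_set:
            continue
        if include_set is not None and cat not in include_set:
            continue
        kept.append(rec)
    return kept
```

Under these assumptions, `filter_by_failure(records, exclude=["scoring_artifact"])` drops scoring artifacts, and `filter_by_failure(records, include=["wrong_answer"])` keeps only genuine wrong answers.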
- app/sft.py: FailureCategory Literal type; SubmitRequest.failure_category;
stored on the candidate record in the "correct" branch of POST /submit
- tests/test_sft.py: 3 new tests (stores value, null round-trip, 422 on invalid)
- stores/sft.ts: SftFailureCategory type exported; SftQueueItem + SftLastAction
updated; setLastAction accepts optional category param
- SftCard.vue: chip-group selector shown during correct/discard/flag flow;
two-step confirm for discard/flag reveals chips before emitting; category
forwarded in all emit payloads
- CorrectionsView.vue: handleCorrect/Discard/Flag accept and forward category
to POST /api/sft/submit body and store.setLastAction
- SftCard.test.ts: 11 new tests covering chip visibility, selection,
single-active enforcement, pending-action flow, emit payloads, cancel
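
The backend behavior described for app/sft.py and tests/test_sft.py (store the value, round-trip null, reject invalid values) can be sketched with the standard library alone. This is an illustrative stand-in, not the repository's code: the real `SubmitRequest` presumably relies on the web framework's validation to produce the 422, and the class `SubmitRequestSketch`, its `id` and `action` fields, and `apply_to_candidate` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Literal, Optional, get_args

# Taxonomy copied from the commit message.
FailureCategory = Literal[
    "scoring_artifact",
    "style_violation",
    "partial_answer",
    "wrong_answer",
    "format_error",
    "hallucination",
]


@dataclass
class SubmitRequestSketch:
    """Hypothetical stand-in for SubmitRequest; field names are assumed."""

    id: str
    action: str
    failure_category: Optional[str] = None  # optional, defaults to null

    def __post_init__(self) -> None:
        # Mirrors the "422 on invalid" test: anything outside the
        # taxonomy is rejected, while None is allowed.
        allowed = get_args(FailureCategory)
        if self.failure_category is not None and self.failure_category not in allowed:
            raise ValueError(f"invalid failure_category: {self.failure_category!r}")


def apply_to_candidate(candidate: dict, req: SubmitRequestSketch) -> dict:
    """Return a copy of the candidate record with the category stored on it."""
    updated = dict(candidate)
    updated["failure_category"] = req.failure_category
    return updated
```

In this sketch a `ValueError` plays the role that an HTTP 422 response plays in the real endpoint.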

| Name |
|---|
| app |
| config |
| data |
| scripts |
| tests |
| web |
| .gitignore |
| environment.yml |
| manage.sh |
| PRIVACY.md |
| pytest.ini |
| requirements.txt |