
# Avocet
**Email classifier training tool — label, benchmark, fine-tune.**
[Releases](https://git.opensourcesolarpunk.com/Circuit-Forge/avocet/releases) · [License](LICENSE) · [circuitforge.tech](https://circuitforge.tech)
---
## What is Avocet?
Avocet is the internal data pipeline Circuit Forge uses to build, evaluate, and fine-tune email classifiers. It implements a three-stage workflow. First, human labelers review emails one at a time in a drag-to-bucket UI and produce a ground-truth dataset. Second, the benchmark harness scores any number of HuggingFace zero-shot models against that dataset and produces a ranked comparison. Third, the fine-tune harness adapts the best-scoring base model to the labeled distribution. The output feeds directly into Peregrine's email classification layer. Neither the label tool nor the benchmark requires an LLM API key: all inference runs locally via HuggingFace Transformers.
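
The benchmark stage described above can be pictured with a small sketch. This is not Avocet's actual code: the record fields `text` and `label`, and the helper names `load_labeled`, `accuracy`, and `rank_models`, are assumptions made for illustration, and plain accuracy stands in for whatever metric the real harness reports. Each "model" is just a callable that maps an email body to a label.

```python
import json
from typing import Callable, Iterable

# Hypothetical record shape: {"text": "...", "label": "..."}.
# The real ground-truth schema may differ.
Classifier = Callable[[str], str]


def load_labeled(lines: Iterable[str]) -> list[dict]:
    """Parse JSONL lines into records, skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


def accuracy(classify: Classifier, records: list[dict]) -> float:
    """Fraction of records whose predicted label matches the human label."""
    if not records:
        return 0.0
    hits = sum(1 for r in records if classify(r["text"]) == r["label"])
    return hits / len(records)


def rank_models(models: dict[str, Classifier], records: list[dict]) -> list[tuple[str, float]]:
    """Score every model against the same records and sort best-first."""
    scores = [(name, accuracy(fn, records)) for name, fn in models.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)
```

In practice each callable would wrap a locally loaded zero-shot model; keeping the scorer independent of the model backend makes the ranking logic easy to test with stand-in functions.
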
---
## Quick Start
```bash
git clone https://git.opensourcesolarpunk.com/Circuit-Forge/avocet.git
cd avocet
# Copy config template and fill in your IMAP credentials
cp config/label_tool.yaml.example config/label_tool.yaml
# Start the label tool (Vue SPA + FastAPI, port 8503)
./manage.sh start
./manage.sh open
```
---
## Features
- **Drag-to-bucket label UI** — ASMR-style card interface; drag emails into labeled buckets or discard without queuing noise into the training set
- **Targeted IMAP fetch** — pull emails by date range, sender, or subject filter across multiple accounts without flooding the queue
- **Email classifier benchmark** — score any HuggingFace zero-shot model against your labeled JSONL; side-by-side comparison on live IMAP emails
- **Planning benchmark** — evaluate LLMs on structured planning tasks; compare models head-to-head with verbose diff output
- **Writing style benchmark** — compare Ollama models on writing style coherence; scan local disk for existing outputs
- **Fine-tune harness** — HuggingFace Transformers fine-tuning from labeled ground truth; classifier adapter interface for swapping backends at runtime
- **Local inference first** — no API key required; GPU optional; designed to run on developer hardware
- **Hot-reload dev mode** — uvicorn `--reload` + Vite HMR (hot module replacement) for fast iteration on both API and UI
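
To illustrate what a classifier adapter interface for swapping backends at runtime could look like, here is a minimal sketch using `typing.Protocol`. Every name in it (`ClassifierAdapter`, `KeywordAdapter`, `ConstantAdapter`, `BACKENDS`, `classify_with`) is hypothetical; Avocet's real interface may be shaped differently.

```python
from typing import Protocol


class ClassifierAdapter(Protocol):
    """Hypothetical interface: any object with this method counts as a backend."""

    def classify(self, text: str, labels: list[str]) -> str: ...


class KeywordAdapter:
    """Toy backend: returns the first label whose name appears in the text."""

    def classify(self, text: str, labels: list[str]) -> str:
        lowered = text.lower()
        for label in labels:
            if label.lower() in lowered:
                return label
        # Fall back to the last label; assumes `labels` is non-empty.
        return labels[-1]


class ConstantAdapter:
    """Toy backend: always answers with one fixed label."""

    def __init__(self, label: str) -> None:
        self.label = label

    def classify(self, text: str, labels: list[str]) -> str:
        return self.label


# Registry keyed by backend name, so the caller can choose at runtime.
BACKENDS: dict[str, ClassifierAdapter] = {
    "keyword": KeywordAdapter(),
    "constant": ConstantAdapter("other"),
}


def classify_with(backend: str, text: str, labels: list[str]) -> str:
    """Dispatch to the named backend; raises KeyError for unknown names."""
    return BACKENDS[backend].classify(text, labels)
```

Because the protocol is structural, a backend wrapping a HuggingFace model would only need to expose the same `classify` method to be registered alongside these toy ones.
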
---
## CLI Reference
All operations go through `manage.sh`.
### Label Tool
```bash
./manage.sh start      # Build Vue SPA and start FastAPI on port 8503
./manage.sh stop       # Stop FastAPI server
./manage.sh restart    # Stop, rebuild, and restart
./manage.sh status     # Show running state and port
./manage.sh logs       # Tail the API log
./manage.sh open       # Open http://localhost:8503 in browser
./manage.sh dev        # Hot-reload: uvicorn --reload + Vite HMR
./manage.sh test       # Run pytest suite
```
### Email Classifier Benchmark
```bash
./manage.sh benchmark [args]       # Run benchmark_classifier.py
./manage.sh list-models            # List available zero-shot models
./manage.sh score                  # Score models against labeled JSONL
./manage.sh score --include-slow   # Include large/slow models
./manage.sh compare --limit 30     # Side-by-side comparison on live IMAP emails
```
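
As a rough picture of what scoring a model against labeled data can involve, the sketch below computes per-label F1 and macro-F1 from parallel lists of human labels and predictions. It is an assumption that `score` reports anything like these metrics; the function names `per_label_f1` and `macro_f1` are illustrative only and are not taken from `benchmark_classifier.py`.

```python
from collections import Counter


def per_label_f1(gold: list[str], predicted: list[str]) -> dict[str, float]:
    """Return F1 for every label seen in either list."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label was wrong
            fn[g] += 1  # true label was missed
    scores = {}
    for label in sorted(set(gold) | set(predicted)):
        # F1 = 2*TP / (2*TP + FP + FN)
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(scores: dict[str, float]) -> float:
    """Unweighted mean of per-label F1, so rare labels count equally."""
    return sum(scores.values()) / len(scores) if scores else 0.0
```

Macro averaging is one reasonable choice when email categories are imbalanced, since a model that ignores a rare label is penalized as much as one that ignores a common label.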
### Planning Benchmark
```bash
./manage.sh plans-bench [args] # Run benchmark_plans.py
./manage.sh plans-list # List available models
./manage.sh plans-run