docs(readme): landing page rewrite — three-stage pipeline explained, full CLI reference, data flow diagram, label table
parent 258bbdc0af
commit 6f9aad126e
1 changed file with 124 additions and 56 deletions
180
README.md
|
|
@@ -1,22 +1,119 @@
|
||||||
# Avocet — Email Classifier Training Tool
|
<div align="center">
|
||||||
|
<img src="docs/avocet-logo.svg" alt="Avocet" height="96" />
|
||||||
|
|
||||||
> *Part of the CircuitForge LLC internal infrastructure suite.*
|
# Avocet
|
||||||
|
|
||||||
**Status:** Internal beta — label tool and benchmark harness complete. Used to build training data for Peregrine's email classifier.
|
**Email classifier training tool — label, benchmark, fine-tune.**
|
||||||
|
|
||||||
|
[]()
|
||||||
|
[](LICENSE)
|
||||||
|
[]()
|
||||||
|
[](https://circuitforge.tech)
|
||||||
|
</div>
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## What it does
|
## What is Avocet?
|
||||||
|
|
||||||
Avocet is the data pipeline for building and benchmarking email classifiers. It has two layers:
|
Avocet is the internal data pipeline CircuitForge uses to build, evaluate, and fine-tune email classifiers. It implements a three-stage workflow. First, human labelers review emails one at a time in a drag-to-bucket UI and produce a ground-truth dataset. Second, the benchmark harness scores any number of HuggingFace zero-shot models against that dataset and produces a ranked comparison. Third, the fine-tune harness adapts the best-scoring base model to the labeled distribution. The output feeds directly into Peregrine's email classification layer. The label tool and benchmark need no LLM API key: all inference runs locally via HuggingFace Transformers.
|
||||||
|
|
||||||
**No LLM required.** Avocet uses zero-shot HuggingFace classification models — no API key, no cloud inference, no GPU required for the label tool. The benchmark harness can optionally export LLM-labeled emails from a Peregrine staging DB, but human labeling via the card-stack UI is the primary workflow.
|
---
|
||||||
|
|
||||||
**Layer 1 — Label tool**
|
## Quick Start
|
||||||
Card-stack UI for building ground-truth classifier benchmark data. Fetch emails from one or more IMAP accounts (with targeted date-range and sender/subject filters), review them card-by-card, and label each with a job-search category. Labeled output feeds the benchmark harness.
|
|
||||||
|
|
||||||
**Layer 2 — Benchmark harness**
|
```bash
|
||||||
Scores HuggingFace zero-shot classification models against the labeled dataset. Supports slow/large model inclusion, visual side-by-side comparison on live emails, and export of LLM-labeled emails from a Peregrine staging DB.
|
git clone https://git.opensourcesolarpunk.com/Circuit-Forge/avocet.git
|
||||||
|
cd avocet
|
||||||
|
|
||||||
|
# Copy config template and fill in your IMAP credentials
|
||||||
|
cp config/label_tool.yaml.example config/label_tool.yaml
|
||||||
|
|
||||||
|
# Start the label tool (Vue SPA + FastAPI, port 8503)
|
||||||
|
./manage.sh start
|
||||||
|
./manage.sh open
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Features
|
||||||
|
|
||||||
|
- **Drag-to-bucket label UI**: ASMR-style card interface; drag emails into labeled buckets, or discard them so that noise never reaches the training set
|
||||||
|
- **Targeted IMAP fetch** — pull emails by date range, sender, or subject filter across multiple accounts without flooding the queue
|
||||||
|
- **Email classifier benchmark** — score any HuggingFace zero-shot model against your labeled JSONL; side-by-side comparison on live IMAP emails
|
||||||
|
- **Planning benchmark** — evaluate LLMs on structured planning tasks; compare models head-to-head with verbose diff output
|
||||||
|
- **Writing style benchmark** — compare Ollama models on writing style coherence; scan local disk for existing outputs
|
||||||
|
- **Fine-tune harness** — HuggingFace Transformers fine-tuning from labeled ground truth; classifier adapter interface for swapping backends at runtime
|
||||||
|
- **Local inference first** — no API key required; GPU optional; designed to run on developer hardware
|
||||||
|
- **Hot-reload dev mode** — uvicorn `--reload` + Vite HMR (hot module replacement) for fast iteration on both API and UI
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## CLI Reference
|
||||||
|
|
||||||
|
All operations go through `manage.sh`.
|
||||||
|
|
||||||
|
### Label Tool
|
||||||
|
|
||||||
|
```bash
|
||||||
|
./manage.sh start # Build Vue SPA and start FastAPI on port 8503
|
||||||
|
./manage.sh stop # Stop FastAPI server
|
||||||
|
./manage.sh restart # Stop, rebuild, and restart
|
||||||
|
./manage.sh status # Show running state and port
|
||||||
|
./manage.sh logs # Tail the API log
|
||||||
|
./manage.sh open # Open http://localhost:8503 in browser
|
||||||
|
./manage.sh dev # Hot-reload: uvicorn --reload + Vite HMR
|
||||||
|
./manage.sh test # Run pytest suite
|
||||||
|
```
|
||||||
|
|
||||||
|
### Email Classifier Benchmark
|
||||||
|
|
||||||
|
```bash
|
||||||
|
./manage.sh benchmark [args] # Run benchmark_classifier.py
|
||||||
|
./manage.sh list-models # List available zero-shot models
|
||||||
|
./manage.sh score # Score models against labeled JSONL
|
||||||
|
./manage.sh score --include-slow # Include large/slow models
|
||||||
|
./manage.sh compare --limit 30 # Side-by-side comparison on live IMAP emails
|
||||||
|
```
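
The scoring step can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not the code in `benchmark_classifier.py`: the `text` and `label` field names, the example labels, and the stand-in classifier functions are all hypothetical. A real run would presumably wrap HuggingFace zero-shot models behind the same callable shape.

```python
import json
from typing import Callable, Iterable


def load_jsonl(lines: Iterable[str]) -> list[dict]:
    """Parse JSONL: one JSON object per non-empty line."""
    return [json.loads(line) for line in lines if line.strip()]


def score_models(
    records: list[dict],
    models: dict[str, Callable[[str], str]],
) -> list[tuple[str, float]]:
    """Return (model name, accuracy) pairs, best model first.

    Assumes each record has hypothetical "text" and "label" fields.
    """
    rankings = []
    for name, classify in models.items():
        correct = sum(1 for r in records if classify(r["text"]) == r["label"])
        accuracy = correct / len(records) if records else 0.0
        rankings.append((name, accuracy))
    return sorted(rankings, key=lambda item: item[1], reverse=True)


# Two labeled emails standing in for a ground-truth JSONL file.
sample = [
    '{"text": "We would like to schedule an interview", "label": "interview"}',
    '{"text": "Unfortunately we will not move forward", "label": "rejection"}',
]
records = load_jsonl(sample)

# Stand-in classifiers; real models would be HuggingFace pipelines.
models = {
    "always-interview": lambda text: "interview",
    "keyword": lambda text: "rejection" if "Unfortunately" in text else "interview",
}
rankings = score_models(records, models)
for name, accuracy in rankings:
    print(f"{name}: {accuracy:.2f}")
```

Accuracy is used here only because it is the simplest ranking metric; the actual harness may report other metrics.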
|
||||||
|
|
||||||
|
### Planning Benchmark
|
||||||
|
|
||||||
|
```bash
|
||||||
|
./manage.sh plans-bench [args] # Run benchmark_plans.py
|
||||||
|
./manage.sh plans-list # List available models
|
||||||
|
./manage.sh plans-run <model> [args] # Run a single model (verbose)
|
||||||
|
./manage.sh plans-compare <m1> <m2> [...] # Compare models side-by-side
|
||||||
|
```
|
||||||
|
|
||||||
|
### Writing Style Benchmark
|
||||||
|
|
||||||
|
```bash
|
||||||
|
./manage.sh style-bench [args] # Run benchmark_style.py
|
||||||
|
./manage.sh style-list # List available Ollama models
|
||||||
|
./manage.sh style-run [args] # Run writing style benchmark
|
||||||
|
./manage.sh style-last # Print most recent benchmark report
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Data Flow
|
||||||
|
|
||||||
|
```
|
||||||
|
IMAP accounts
|
||||||
|
→ fetch (targeted or wide)
|
||||||
|
→ email_label_queue.jsonl
|
||||||
|
|
||||||
|
email_label_queue.jsonl
|
||||||
|
→ label tool drag-to-bucket UI
|
||||||
|
→ email_score.jsonl (ground truth)
|
||||||
|
|
||||||
|
email_score.jsonl
|
||||||
|
→ benchmark harness
|
||||||
|
→ model rankings
|
||||||
|
|
||||||
|
best model
|
||||||
|
→ fine-tune harness
|
||||||
|
→ Peregrine classifier adapter
|
||||||
|
```
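
The queue-to-ground-truth step in the flow above can be sketched in a few lines. This is an illustration under assumptions, not Avocet's implementation: the `id`, `subject`, and `label` fields and the `apply_decision` helper are hypothetical, and in-memory lists stand in for `email_label_queue.jsonl` and `email_score.jsonl`. Passing `None` as the label models a discard, which drops the email from the queue without adding it to the ground truth.

```python
import json
from typing import Optional


def apply_decision(
    queue: list[dict],
    scored: list[dict],
    email_id: str,
    label: Optional[str],
) -> None:
    """Move one email out of the queue.

    A label appends the email to the scored set; None discards it.
    """
    for index, email in enumerate(queue):
        if email["id"] == email_id:
            item = queue.pop(index)
            if label is not None:
                scored.append({**item, "label": label})
            return
    raise KeyError(f"email {email_id!r} is not in the queue")


def to_jsonl(records: list[dict]) -> str:
    """Serialize records as JSONL, one object per line."""
    return "".join(json.dumps(record) + "\n" for record in records)


queue = [
    {"id": "a", "subject": "Interview invite"},
    {"id": "b", "subject": "Weekly newsletter"},
]
scored: list[dict] = []

apply_decision(queue, scored, "a", "interview")  # labeled: goes to ground truth
apply_decision(queue, scored, "b", None)         # discarded: never scored

print(to_jsonl(scored), end="")
```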
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
|
|
@@ -38,69 +135,40 @@ Scores HuggingFace zero-shot classification models against the labeled dataset.
|
||||||
|
|
||||||
## Stack
|
## Stack
|
||||||
|
|
||||||
| Layer | Tech |
|
| Layer | Technology |
|
||||||
|-------|------|
|
|-------|-----------|
|
||||||
| Label UI | Streamlit (port 8503, auto-increments on collision) |
|
| Label UI | Vue 3 SPA (Vite) |
|
||||||
|
| API | FastAPI + uvicorn (port 8503) |
|
||||||
| Benchmark | Python + HuggingFace Transformers |
|
| Benchmark | Python + HuggingFace Transformers |
|
||||||
| Email fetch | IMAP (multi-account, targeted date/sender/subject filter) |
|
| Email fetch | IMAP (multi-account, targeted date/sender/subject filter) |
|
||||||
| Data | JSONL (`data/email_label_queue.jsonl`, `data/email_score.jsonl`) |
|
| Data | JSONL (`data/email_label_queue.jsonl`, `data/email_score.jsonl`) |
|
||||||
| Config | `config/label_tool.yaml` (gitignored — see `.example`) |
|
| Runtime | SQLite |
|
||||||
|
| Config | `config/label_tool.yaml` (gitignored — `.example` committed) |
|
||||||
Conda environments:
|
|
||||||
- `job-seeker` — label tool UI
|
|
||||||
- `job-seeker-classifiers` — benchmark harness (separate env for heavy deps)
|
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Running
|
## Logo
|
||||||
|
|
||||||
```bash
|
The Avocet logo (`avocet_v1_poly.svg`) lives in the shared graphics repo. Copy it to `docs/avocet-logo.svg` so that it renders in this README.
|
||||||
./manage.sh start # start label tool UI (port collision-safe from 8503)
|
|
||||||
./manage.sh stop # stop
|
|
||||||
./manage.sh restart # restart
|
|
||||||
./manage.sh status # show running state and port
|
|
||||||
./manage.sh logs # tail label tool log
|
|
||||||
./manage.sh open # open in browser
|
|
||||||
```
|
|
||||||
|
|
||||||
Benchmark:
|
|
||||||
```bash
|
|
||||||
./manage.sh benchmark --list-models # list available zero-shot models
|
|
||||||
./manage.sh score # score models against labeled JSONL
|
|
||||||
./manage.sh score --include-slow # include large/slow models
|
|
||||||
./manage.sh compare --limit 30 # visual comparison on live IMAP emails
|
|
||||||
```
|
|
||||||
|
|
||||||
Dev:
|
|
||||||
```bash
|
|
||||||
./manage.sh test # run pytest suite
|
|
||||||
```
|
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Data flow
|
## About
|
||||||
|
|
||||||
```
|
Avocet is internal CircuitForge infrastructure, open source as a reference implementation. It is not a user-facing product. The primary consumer is [Peregrine](https://git.opensourcesolarpunk.com/Circuit-Forge/peregrine), CircuitForge's job-search pipeline tool.
|
||||||
IMAP accounts → fetch (targeted or wide) → email_label_queue.jsonl
|
|
||||||
→ label tool card UI → email_score.jsonl
|
|
||||||
→ benchmark harness → model rankings
|
|
||||||
→ best model → Peregrine classifier adapter
|
|
||||||
```
|
|
||||||
|
|
||||||
Targeted fetch: date range + sender/subject filter for pulling historical emails on specific senders or topics without flooding the queue.
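
A targeted fetch of this kind maps naturally onto an IMAP `SEARCH` command. The sketch below builds the search criteria string from optional date, sender, and subject filters. It is an assumption about how such a filter could be expressed, not Avocet's fetch code: `build_search_criteria` and its parameters are hypothetical names.

```python
from datetime import date
from typing import Optional

# IMAP dates use English month abbreviations regardless of locale.
_MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
           "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def imap_date(d: date) -> str:
    """Format a date as DD-Mon-YYYY, as IMAP SEARCH expects."""
    return f"{d.day:02d}-{_MONTHS[d.month - 1]}-{d.year}"


def quote(value: str) -> str:
    """Wrap a value in an IMAP quoted string, escaping \\ and "."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def build_search_criteria(
    since: Optional[date] = None,
    before: Optional[date] = None,
    sender: Optional[str] = None,
    subject: Optional[str] = None,
) -> str:
    """Combine the given filters; with no filters, match ALL messages."""
    parts = []
    if since is not None:
        parts.append(f"SINCE {imap_date(since)}")
    if before is not None:
        parts.append(f"BEFORE {imap_date(before)}")
    if sender:
        parts.append(f"FROM {quote(sender)}")
    if subject:
        parts.append(f"SUBJECT {quote(subject)}")
    return "(" + " ".join(parts) + ")" if parts else "ALL"


criteria = build_search_criteria(
    since=date(2024, 1, 1),
    before=date(2024, 2, 1),
    sender="recruiter@example.com",
    subject="interview",
)
print(criteria)
# With the standard library, the string could then be passed to
# imaplib's IMAP4.search(None, criteria) on a selected mailbox.
```

Space-separated search keys are combined with a logical AND by the server, so each added filter narrows the result set.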
|
Docs: [docs.circuitforge.tech/avocet](https://docs.circuitforge.tech/avocet)
|
||||||
|
|
||||||
Discard: removes an email from the queue without writing to the score file — for emails that don't belong in the training set.
|
## Forgejo-primary
|
||||||
|
|
||||||
---
|
Avocet is developed and maintained on Forgejo at [git.opensourcesolarpunk.com/Circuit-Forge/avocet](https://git.opensourcesolarpunk.com/Circuit-Forge/avocet). GitHub and Codeberg are read-only mirrors.
|
||||||
|
|
||||||
## Classifier adapters
|
|
||||||
|
|
||||||
`app/classifier_adapters.py` provides a common interface for swapping classifier backends. `RerankerAdapter` falls back to the label name when no `LABEL_DESCRIPTIONS` entry is configured for a label.
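
The fallback behaviour amounts to a dictionary lookup with a default. The sketch below shows the idea only: the entries in `LABEL_DESCRIPTIONS` and the `describe_label` helper are hypothetical, and the actual `RerankerAdapter` code may be structured differently.

```python
# Hypothetical entries; the real mapping lives in the Avocet codebase.
LABEL_DESCRIPTIONS: dict[str, str] = {
    "interview": "An invitation to schedule an interview",
}


def describe_label(label: str, descriptions: dict[str, str] = LABEL_DESCRIPTIONS) -> str:
    """Return the configured description, or the bare label name if none exists."""
    return descriptions.get(label, label)


print(describe_label("interview"))
print(describe_label("rejection"))  # no entry configured, so the name is used as-is
```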
|
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## License
|
## License
|
||||||
|
|
||||||
BSL 1.1 — internal tool, not user-facing.
|
[Business Source License 1.1](LICENSE) — classifier training is an AI feature under the CircuitForge licensing model.
|
||||||
|
|
||||||
© 2026 Circuit Forge LLC
|
Free for personal non-commercial self-hosting. Commercial use or SaaS re-hosting requires a paid license. Converts to MIT after 4 years.
|
||||||
|
|
||||||
|
© 2026 Circuit Forge LLC — Privacy · Safety · Accessibility
|
||||||
|
|
|
||||||