diff --git a/README.md b/README.md
index 77f4798..235ea0e 100644
--- a/README.md
+++ b/README.md
@@ -1,22 +1,119 @@
-# Avocet — Email Classifier Training Tool
+
+ Avocet
-> *Part of the CircuitForge LLC internal infrastructure suite.*
+ # Avocet
-**Status:** Internal beta — label tool and benchmark harness complete. Used to build training data for Peregrine's email classifier.
+ **Email classifier training tool — label, benchmark, fine-tune.**
+
+ [![Status: Internal Beta](https://img.shields.io/badge/status-internal%20beta-blue)]()
+ [![License: BSL 1.1](https://img.shields.io/badge/license-BSL%201.1-orange)](LICENSE)
+ [![Stack: Vue 3 + FastAPI](https://img.shields.io/badge/stack-Vue%203%20%2B%20FastAPI-brightgreen)]()
+ [![CircuitForge](https://img.shields.io/badge/by-CircuitForge-black)](https://circuitforge.tech)
+
 ---
-## What it does
+## What is Avocet?
-Avocet is the data pipeline for building and benchmarking email classifiers. It has two layers:
+Avocet is the internal data pipeline Circuit Forge uses to build, evaluate, and fine-tune email classifiers. It implements a three-stage workflow: human labelers review emails one at a time in a drag-to-bucket UI and produce a ground-truth dataset; the benchmark harness scores any number of HuggingFace zero-shot models against that dataset and produces a ranked comparison; and the fine-tune harness adapts the best-scoring base model to the labeled distribution. The output feeds directly into Peregrine's email classification layer. No LLM API key required for the label tool or benchmark — all inference runs locally via HuggingFace Transformers.
-**No LLM required.** Avocet uses zero-shot HuggingFace classification models — no API key, no cloud inference, no GPU required for the label tool. The benchmark harness can optionally export LLM-labeled emails from a Peregrine staging DB, but human labeling via the card-stack UI is the primary workflow.
+---
-**Layer 1 — Label tool**
-Card-stack UI for building ground-truth classifier benchmark data. Fetch emails from one or more IMAP accounts (with targeted date-range and sender/subject filters), review them card-by-card, and label each with a job-search category. Labeled output feeds the benchmark harness.
+## Quick Start
-**Layer 2 — Benchmark harness**
-Scores HuggingFace zero-shot classification models against the labeled dataset. Supports slow/large model inclusion, visual side-by-side comparison on live emails, and export of LLM-labeled emails from a Peregrine staging DB.
+```bash
+git clone https://git.opensourcesolarpunk.com/Circuit-Forge/avocet.git
+cd avocet
+
+# Copy config template and fill in your IMAP credentials
+cp config/label_tool.yaml.example config/label_tool.yaml
+
+# Start the label tool (Vue SPA + FastAPI, port 8503)
+./manage.sh start
+./manage.sh open
+```
+
+---
+
+## Features
+
+- **Drag-to-bucket label UI** — ASMR-style card interface; drag emails into labeled buckets or discard without queuing noise into the training set
+- **Targeted IMAP fetch** — pull emails by date range, sender, or subject filter across multiple accounts without flooding the queue
+- **Email classifier benchmark** — score any HuggingFace zero-shot model against your labeled JSONL; side-by-side comparison on live IMAP emails
+- **Planning benchmark** — evaluate LLMs on structured planning tasks; compare models head-to-head with verbose diff output
+- **Writing style benchmark** — compare Ollama models on writing style coherence; scan local disk for existing outputs
+- **Fine-tune harness** — HuggingFace Transformers fine-tuning from labeled ground truth; classifier adapter interface for swapping backends at runtime
+- **Local inference first** — no API key required; GPU optional; designed to run on developer hardware
+- **Hot-reload dev mode** — uvicorn `--reload` + Vite HMR (hot module replacement) for fast iteration on both API and UI
+
+---
+
+## CLI Reference
+
+All operations go through `manage.sh`.
+
+### Label Tool
+
+```bash
+./manage.sh start      # Build Vue SPA and start FastAPI on port 8503
+./manage.sh stop       # Stop FastAPI server
+./manage.sh restart    # Stop, rebuild, and restart
+./manage.sh status     # Show running state and port
+./manage.sh logs       # Tail the API log
+./manage.sh open       # Open http://localhost:8503 in browser
+./manage.sh dev        # Hot-reload: uvicorn --reload + Vite HMR
+./manage.sh test       # Run pytest suite
+```
+
+### Email Classifier Benchmark
+
+```bash
+./manage.sh benchmark [args]       # Run benchmark_classifier.py
+./manage.sh list-models            # List available zero-shot models
+./manage.sh score                  # Score models against labeled JSONL
+./manage.sh score --include-slow   # Include large/slow models
+./manage.sh compare --limit 30     # Side-by-side comparison on live IMAP emails
+```
+
+### Planning Benchmark
+
+```bash
+./manage.sh plans-bench [args]     # Run benchmark_plans.py
+./manage.sh plans-list             # List available models
+./manage.sh plans-run [args]       # Run a single model (verbose)
+./manage.sh plans-compare [...]    # Compare models side-by-side
+```
+
+### Writing Style Benchmark
+
+```bash
+./manage.sh style-bench [args]     # Run benchmark_style.py
+./manage.sh style-list             # List available Ollama models
+./manage.sh style-run [args]       # Run writing style benchmark
+./manage.sh style-last             # Print most recent benchmark report
+```
+
+---
+
+## Data Flow
+
+```
+IMAP accounts
+ → fetch (targeted or wide)
+ → email_label_queue.jsonl
+
+email_label_queue.jsonl
+ → label tool drag-to-bucket UI
+ → email_score.jsonl (ground truth)
+
+email_score.jsonl
+ → benchmark harness
+ → model rankings
+
+best model
+ → fine-tune harness
+ → Peregrine classifier adapter
+```
 ---
@@ -38,69 +135,40 @@ Scores HuggingFace zero-shot classification models against the labeled dataset.
 ## Stack
-| Layer | Tech |
-|-------|------|
-| Label UI | Streamlit (port 8503, auto-increments on collision) |
+| Layer | Technology |
+|-------|-----------|
+| Label UI | Vue 3 SPA (Vite) |
+| API | FastAPI + uvicorn (port 8503) |
 | Benchmark | Python + HuggingFace Transformers |
 | Email fetch | IMAP (multi-account, targeted date/sender/subject filter) |
 | Data | JSONL (`data/email_label_queue.jsonl`, `data/email_score.jsonl`) |
-| Config | `config/label_tool.yaml` (gitignored — see `.example`) |
-
-Conda environments:
-- `job-seeker` — label tool UI
-- `job-seeker-classifiers` — benchmark harness (separate env for heavy deps)
+| Runtime | SQLite |
+| Config | `config/label_tool.yaml` (gitignored — `.example` committed) |
 ---
-## Running
+## Logo
-```bash
-./manage.sh start      # start label tool UI (port collision-safe from 8503)
-./manage.sh stop       # stop
-./manage.sh restart    # restart
-./manage.sh status     # show running state and port
-./manage.sh logs       # tail label tool log
-./manage.sh open       # open in browser
-```
-
-Benchmark:
-```bash
-./manage.sh benchmark --list-models   # list available zero-shot models
-./manage.sh score                     # score models against labeled JSONL
-./manage.sh score --include-slow      # include large/slow models
-./manage.sh compare --limit 30        # visual comparison on live IMAP emails
-```
-
-Dev:
-```bash
-./manage.sh test       # run pytest suite
-```
+The Avocet logo (`avocet_v1_poly.svg`) lives in the shared graphics repo. Copy it to `docs/avocet-logo.svg` to render correctly in this README.
 ---
-## Data flow
+## About
-```
-IMAP accounts → fetch (targeted or wide) → email_label_queue.jsonl
-→ label tool card UI → email_score.jsonl
-→ benchmark harness → model rankings
-→ best model → Peregrine classifier adapter
-```
+Avocet is internal CircuitForge infrastructure, open source as a reference implementation. It is not a user-facing product. The primary consumer is [Peregrine](https://git.opensourcesolarpunk.com/Circuit-Forge/peregrine), CircuitForge's job-search pipeline tool.
-Targeted fetch: date range + sender/subject filter for pulling historical emails on specific senders or topics without flooding the queue.
+Docs: [docs.circuitforge.tech/avocet](https://docs.circuitforge.tech/avocet)
-Discard: removes an email from the queue without writing to the score file — for emails that don't belong in the training set.
+## Forgejo-primary
----
-
-## Classifier adapters
-
-`app/classifier_adapters.py` provides a common interface for swapping classifier backends. Falls back to the label name when no `LABEL_DESCRIPTIONS` entry is configured for a label (RerankerAdapter).
+Avocet is developed and maintained on Forgejo at [git.opensourcesolarpunk.com/Circuit-Forge/avocet](https://git.opensourcesolarpunk.com/Circuit-Forge/avocet). GitHub and Codeberg are read-only mirrors.
 ---
 ## License
-BSL 1.1 — internal tool, not user-facing.
+[Business Source License 1.1](LICENSE) — classifier training is an AI feature under the CircuitForge licensing model.
-© 2026 Circuit Forge LLC
+Free for personal non-commercial self-hosting. Commercial use or SaaS re-hosting requires a paid license. Converts to MIT after 4 years.
+
+© 2026 Circuit Forge LLC — Privacy · Safety · Accessibility