docs(readme): landing page rewrite — three-stage pipeline explained, full CLI reference, data flow diagram, label table

2026-05-06 08:51:46 -07:00
commit 6f9aad126e
parent 258bbdc0af
1 changed file with 124 additions and 56 deletions
--- a/README.md
+++ b/README.md
@@ -1,22 +1,119 @@
-# Avocet — Email Classifier Training Tool
+<div align="center">
+  <img src="docs/avocet-logo.svg" alt="Avocet" height="96" />

-> *Part of the CircuitForge LLC internal infrastructure suite.*
+  # Avocet

-**Status:** Internal beta — label tool and benchmark harness complete. Used to build training data for Peregrine's email classifier.
+  **Email classifier training tool — label, benchmark, fine-tune.**
+
+  [![Status: Internal Beta](https://img.shields.io/badge/status-internal%20beta-blue)]()
+  [![License: BSL 1.1](https://img.shields.io/badge/license-BSL%201.1-orange)](LICENSE)
+  [![Stack: Vue 3 + FastAPI](https://img.shields.io/badge/stack-Vue%203%20%2B%20FastAPI-brightgreen)]()
+  [![CircuitForge](https://img.shields.io/badge/by-CircuitForge-black)](https://circuitforge.tech)
+</div>

 ---

-## What it does
+## What is Avocet?

-Avocet is the data pipeline for building and benchmarking email classifiers. It has two layers:
+Avocet is the internal data pipeline CircuitForge uses to build, evaluate, and fine-tune email classifiers. It implements a three-stage workflow: (1) human labelers review emails one at a time in a drag-to-bucket UI and produce a ground-truth dataset; (2) the benchmark harness scores any number of HuggingFace zero-shot models against that dataset and produces a ranked comparison; (3) the fine-tune harness adapts the best-scoring base model to the labeled distribution. The output feeds directly into Peregrine's email classification layer. The label tool and the benchmark need no LLM API key: all inference runs locally via HuggingFace Transformers.

-**No LLM required.** Avocet uses zero-shot HuggingFace classification models — no API key, no cloud inference, no GPU required for the label tool. The benchmark harness can optionally export LLM-labeled emails from a Peregrine staging DB, but human labeling via the card-stack UI is the primary workflow.
+---

-**Layer 1 — Label tool**
-Card-stack UI for building ground-truth classifier benchmark data. Fetch emails from one or more IMAP accounts (with targeted date-range and sender/subject filters), review them card-by-card, and label each with a job-search category. Labeled output feeds the benchmark harness.
+## Quick Start

-**Layer 2 — Benchmark harness**
-Scores HuggingFace zero-shot classification models against the labeled dataset. Supports slow/large model inclusion, visual side-by-side comparison on live emails, and export of LLM-labeled emails from a Peregrine staging DB.
+```bash
+git clone https://git.opensourcesolarpunk.com/Circuit-Forge/avocet.git
+cd avocet
+
+# Copy config template and fill in your IMAP credentials
+cp config/label_tool.yaml.example config/label_tool.yaml
+
+# Start the label tool (Vue SPA + FastAPI, port 8503)
+./manage.sh start
+./manage.sh open
+```
+
+---
+
+## Features
+
+- **Drag-to-bucket label UI** — ASMR-style card interface; drag each email into a labeled bucket, or discard it so that it never enters the training set
+- **Targeted IMAP fetch** — pull emails by date range, sender, or subject filter across multiple accounts without flooding the queue
+- **Email classifier benchmark** — score any HuggingFace zero-shot model against your labeled JSONL; side-by-side comparison on live IMAP emails
+- **Planning benchmark** — evaluate LLMs on structured planning tasks; compare models head-to-head with verbose diff output
+- **Writing style benchmark** — compare Ollama models on writing style coherence; scan local disk for existing outputs
+- **Fine-tune harness** — HuggingFace Transformers fine-tuning from labeled ground truth; classifier adapter interface for swapping backends at runtime
+- **Local inference first** — no API key required; GPU optional; designed to run on developer hardware
+- **Hot-reload dev mode** — uvicorn `--reload` + Vite HMR (hot module replacement) for fast iteration on both API and UI
+
+---
+
+## CLI Reference
+
+All operations go through `manage.sh`.
+
+### Label Tool
+
+```bash
+./manage.sh start          # Build Vue SPA and start FastAPI on port 8503
+./manage.sh stop           # Stop FastAPI server
+./manage.sh restart        # Stop, rebuild, and restart
+./manage.sh status         # Show running state and port
+./manage.sh logs           # Tail the API log
+./manage.sh open           # Open http://localhost:8503 in browser
+./manage.sh dev            # Hot-reload: uvicorn --reload + Vite HMR
+./manage.sh test           # Run pytest suite
+```
+
+### Email Classifier Benchmark
+
+```bash
+./manage.sh benchmark [args]       # Run benchmark_classifier.py
+./manage.sh list-models            # List available zero-shot models
+./manage.sh score                  # Score models against labeled JSONL
+./manage.sh score --include-slow   # Include large/slow models
+./manage.sh compare --limit 30     # Side-by-side comparison on live IMAP emails
+```
+
+### Planning Benchmark
+
+```bash
+./manage.sh plans-bench [args]              # Run benchmark_plans.py
+./manage.sh plans-list                      # List available models
+./manage.sh plans-run <model> [args]        # Run a single model (verbose)
+./manage.sh plans-compare <m1> <m2> [...]   # Compare models side-by-side
+```
+
+### Writing Style Benchmark
+
+```bash
+./manage.sh style-bench [args]     # Run benchmark_style.py
+./manage.sh style-list             # List available Ollama models
+./manage.sh style-run [args]       # Run writing style benchmark
+./manage.sh style-last             # Print most recent benchmark report
+```
+
+---
+
+## Data Flow
+
+```
+IMAP accounts
+  → fetch (targeted or wide)
+  → email_label_queue.jsonl
+
+email_label_queue.jsonl
+  → label tool drag-to-bucket UI
+  → email_score.jsonl (ground truth)
+
+email_score.jsonl
+  → benchmark harness
+  → model rankings
+
+best model
+  → fine-tune harness
+  → Peregrine classifier adapter
+```

 ---
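
The Data Flow diagram added in the hunk above can be sketched in a few lines of Python. This is a minimal illustration and not Avocet's own code: the function name `label_queue`, the `subject` field, and the `interview` label are hypothetical, and the `decide` callback stands in for the human choice made in the drag-to-bucket UI. Only the two JSONL file names come from the README. The old README states that discarding removes an email from the queue without writing it to the score file, and the sketch follows that rule.

```python
import json
from pathlib import Path
from tempfile import TemporaryDirectory


def label_queue(queue_path, score_path, decide):
    """Drain the label queue into the ground-truth file.

    decide(email) returns a label string, or None to discard the email.
    Discarded emails leave the queue but are never written to the score file.
    """
    labeled = discarded = 0
    with open(queue_path, encoding="utf-8") as queue, \
         open(score_path, "a", encoding="utf-8") as scores:
        for line in queue:
            if not line.strip():
                continue  # tolerate blank lines in the JSONL file
            email = json.loads(line)
            label = decide(email)
            if label is None:
                discarded += 1
                continue
            scores.write(json.dumps({**email, "label": label}) + "\n")
            labeled += 1
    # Every queued email has now been labeled or discarded.
    Path(queue_path).write_text("", encoding="utf-8")
    return labeled, discarded


def demo():
    """Run the sketch on two made-up emails in a temporary directory."""
    with TemporaryDirectory() as tmp:
        queue = Path(tmp) / "email_label_queue.jsonl"
        scores = Path(tmp) / "email_score.jsonl"
        emails = [
            {"subject": "Interview on Tuesday"},
            {"subject": "50% off sneakers"},
        ]
        queue.write_text(
            "".join(json.dumps(e) + "\n" for e in emails), encoding="utf-8"
        )

        # Hypothetical stand-in for the human labeler.
        def decide(email):
            return "interview" if "Interview" in email["subject"] else None

        counts = label_queue(queue, scores, decide)
        rows = [
            json.loads(line)
            for line in scores.read_text(encoding="utf-8").splitlines()
        ]
        return counts, rows
```

Calling `demo()` labels the first email and discards the second, so only one row reaches the ground-truth file.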

@@ -38,69 +135,40 @@ Scores HuggingFace zero-shot classification models against the labeled dataset.

 ## Stack

-| Layer | Tech |
-|-------|------|
-| Label UI | Streamlit (port 8503, auto-increments on collision) |
+| Layer | Technology |
+|-------|-----------|
+| Label UI | Vue 3 SPA (Vite) |
+| API | FastAPI + uvicorn (port 8503) |
 | Benchmark | Python + HuggingFace Transformers |
 | Email fetch | IMAP (multi-account, targeted date/sender/subject filter) |
 | Data | JSONL (`data/email_label_queue.jsonl`, `data/email_score.jsonl`) |
-| Config | `config/label_tool.yaml` (gitignored — see `.example`) |
-
-Conda environments:
-- `job-seeker` — label tool UI
-- `job-seeker-classifiers` — benchmark harness (separate env for heavy deps)
+| Runtime | SQLite |
+| Config | `config/label_tool.yaml` (gitignored — `.example` committed) |

 ---

-## Running
+## Logo

-```bash
-./manage.sh start              # start label tool UI (port collision-safe from 8503)
-./manage.sh stop               # stop
-./manage.sh restart            # restart
-./manage.sh status             # show running state and port
-./manage.sh logs               # tail label tool log
-./manage.sh open               # open in browser
-```
-
-Benchmark:
-```bash
-./manage.sh benchmark --list-models    # list available zero-shot models
-./manage.sh score                      # score models against labeled JSONL
-./manage.sh score --include-slow       # include large/slow models
-./manage.sh compare --limit 30         # visual comparison on live IMAP emails
-```
-
-Dev:
-```bash
-./manage.sh test               # run pytest suite
-```
+The Avocet logo (`avocet_v1_poly.svg`) lives in the shared graphics repo. Copy it to `docs/avocet-logo.svg` so that the image at the top of this README renders.

 ---

-## Data flow
+## About

-```
-IMAP accounts → fetch (targeted or wide) → email_label_queue.jsonl
-→ label tool card UI → email_score.jsonl
-→ benchmark harness → model rankings
-→ best model → Peregrine classifier adapter
-```
+Avocet is internal CircuitForge infrastructure, open source as a reference implementation. It is not a user-facing product. The primary consumer is [Peregrine](https://git.opensourcesolarpunk.com/Circuit-Forge/peregrine), CircuitForge's job-search pipeline tool.

-Targeted fetch: date range + sender/subject filter for pulling historical emails on specific senders or topics without flooding the queue.
+Docs: [docs.circuitforge.tech/avocet](https://docs.circuitforge.tech/avocet)

-Discard: removes an email from the queue without writing to the score file — for emails that don't belong in the training set.
+## Forgejo-primary

----
-
-## Classifier adapters
-
-`app/classifier_adapters.py` provides a common interface for swapping classifier backends. Falls back to the label name when no `LABEL_DESCRIPTIONS` entry is configured for a label (RerankerAdapter).
+Avocet is developed and maintained on Forgejo at [git.opensourcesolarpunk.com/Circuit-Forge/avocet](https://git.opensourcesolarpunk.com/Circuit-Forge/avocet). GitHub and Codeberg are read-only mirrors.

 ---

 ## License

-BSL 1.1 — internal tool, not user-facing.
+[Business Source License 1.1](LICENSE) — classifier training is an AI feature under the CircuitForge licensing model.

-© 2026 Circuit Forge LLC
+Free for personal non-commercial self-hosting. Commercial use or SaaS re-hosting requires a paid license. Converts to MIT after 4 years.
+
+© 2026 Circuit Forge LLC — Privacy · Safety · Accessibility
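
The benchmark stage described in this README ("score models against labeled JSONL" and produce "model rankings") might look roughly like the sketch below. It is an assumption-laden illustration, not the contents of `benchmark_classifier.py`: the README does not name the scoring metric, so plain accuracy is assumed, and the label names, the `subject` field, and both toy classifiers are hypothetical. In a real run each callable would more plausibly wrap a HuggingFace `pipeline("zero-shot-classification", ...)` call; simple functions are used here so the example runs without downloading a model.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, str]
Classifier = Callable[[str], str]

# Hypothetical ground-truth rows, shaped like lines of email_score.jsonl.
DATASET: List[Row] = [
    {"subject": "Interview on Tuesday", "label": "interview"},
    {"subject": "Unfortunately we went with another candidate", "label": "rejected"},
    {"subject": "Your application was received", "label": "neutral"},
]


def keyword_model(text: str) -> str:
    """Toy classifier that matches on keywords."""
    lowered = text.lower()
    if "interview" in lowered:
        return "interview"
    if "unfortunately" in lowered:
        return "rejected"
    return "neutral"


def constant_model(text: str) -> str:
    """Baseline that always predicts the same label."""
    return "neutral"


def accuracy(classify: Classifier, dataset: List[Row]) -> float:
    """Fraction of rows whose predicted label equals the ground truth."""
    if not dataset:
        raise ValueError("dataset is empty")
    correct = sum(1 for row in dataset if classify(row["subject"]) == row["label"])
    return correct / len(dataset)


def rank_models(
    models: Dict[str, Classifier], dataset: List[Row]
) -> List[Tuple[str, float]]:
    """Score every model and return (name, accuracy) pairs, best first."""
    scored = [(name, accuracy(fn, dataset)) for name, fn in models.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


MODELS: Dict[str, Classifier] = {
    "constant": constant_model,
    "keyword": keyword_model,
}

if __name__ == "__main__":
    for name, score in rank_models(MODELS, DATASET):
        print(f"{name}: {score:.2f}")
```

Keeping each model behind a plain `str -> str` callable means the ranking code does not care which backend produced the prediction, which is in the spirit of the classifier adapter interface the README mentions.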