Scrape → Store → Process pipeline for building email classifier
benchmark data across the CircuitForge menagerie.
- app/label_tool.py — Streamlit card-stack UI, multi-account IMAP fetch,
  6-bucket labeling, undo/skip, keyboard shortcuts (1-6/S/U)
- scripts/classifier_adapters.py — ZeroShotAdapter (+ two_pass),
  GLiClassAdapter, RerankerAdapter; ABC with lazy model loading
- scripts/benchmark_classifier.py — 13-model registry, --score,
  --compare, --list-models, --export-db; uses label_tool.yaml for IMAP
- tests/ — 20 tests, all passing, zero model downloads required
- config/label_tool.yaml.example — multi-account IMAP template
- data/email_score.jsonl.example — sample labeled data for CI
Labels: interview_scheduled, offer_received, rejected,
positive_response, survey_received, neutral
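
The "ABC with lazy model loading" design mentioned above can be sketched roughly as follows. This is a minimal illustration and not the code in scripts/classifier_adapters.py: the names `ClassifierAdapter`, `KeywordAdapter`, `_load`, and `_predict` are hypothetical, and the keyword lookup is a toy stand-in for a real model so that nothing has to be downloaded. Only the six label names come from the commit message.

```python
from abc import ABC, abstractmethod

# The six labels listed in the commit message.
LABELS = [
    "interview_scheduled",
    "offer_received",
    "rejected",
    "positive_response",
    "survey_received",
    "neutral",
]


class ClassifierAdapter(ABC):
    """Hypothetical base class: the model is built on first use, not in __init__."""

    def __init__(self, model_id: str) -> None:
        self.model_id = model_id
        self._model = None  # nothing is loaded or downloaded yet

    @abstractmethod
    def _load(self):
        """Build and return the underlying model object."""

    @abstractmethod
    def _predict(self, model, text: str) -> str:
        """Return one of LABELS for the given email text."""

    @property
    def model(self):
        # Lazy loading: pay the load cost only when a prediction is requested.
        if self._model is None:
            self._model = self._load()
        return self._model

    def classify(self, text: str) -> str:
        label = self._predict(self.model, text)
        if label not in LABELS:
            raise ValueError(f"unexpected label: {label!r}")
        return label


class KeywordAdapter(ClassifierAdapter):
    """Toy stand-in for a real model so the sketch runs offline."""

    load_count = 0  # counts how many times _load has run

    def _load(self):
        type(self).load_count += 1
        return {
            "interview": "interview_scheduled",
            "offer": "offer_received",
            "unfortunately": "rejected",
        }

    def _predict(self, model, text: str) -> str:
        lowered = text.lower()
        for keyword, label in model.items():
            if keyword in lowered:
                return label
        return "neutral"
```

Constructing an adapter is cheap under this pattern, which is one plausible way a test suite could run with zero model downloads: tests can instantiate adapters and substitute `_load` without touching a real model.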
# config/label_tool.yaml — Multi-account IMAP config for the email label tool
# Copy to config/label_tool.yaml and fill in your credentials.
# This file is gitignored.

accounts:
  - name: "Gmail"
    host: "imap.gmail.com"
    port: 993
    username: "you@gmail.com"
    password: "your-app-password"  # Use an App Password, not your login password
    folder: "INBOX"
    days_back: 90

  - name: "Outlook"
    host: "outlook.office365.com"
    port: 993
    username: "you@outlook.com"
    password: "your-app-password"
    folder: "INBOX"
    days_back: 90

# Optional: limit emails fetched per account per run (0 = unlimited)
max_per_account: 500
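
As a rough illustration of how a config like this might be consumed, the sketch below turns the parsed mapping into account objects and builds an IMAP `SINCE` search criterion from `days_back`. It is an assumption-laden sketch, not the label tool's actual code: `Account`, `parse_accounts`, `since_criterion`, and `fetch_message_ids` are hypothetical names, the mapping is assumed to come from something like PyYAML's `yaml.safe_load`, and treating `max_per_account` as "keep the most recent N messages" is a guess. Only the key names and the "0 = unlimited" rule come from the template.

```python
import imaplib
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class Account:
    """One entry under `accounts:` in the config (field names match the YAML keys)."""
    name: str
    host: str
    port: int
    username: str
    password: str
    folder: str = "INBOX"
    days_back: int = 90


def parse_accounts(cfg: dict) -> List[Account]:
    """Turn the already-parsed YAML mapping into Account objects."""
    return [Account(**entry) for entry in cfg.get("accounts", [])]


def since_criterion(days_back: int, today: Optional[date] = None) -> str:
    """Build an IMAP SEARCH criterion such as 'SINCE 01-Jan-2024'."""
    cutoff = (today or date.today()) - timedelta(days=days_back)
    # IMAP expects English month abbreviations, so avoid locale-dependent strftime.
    months = "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split()
    return f"SINCE {cutoff.day:02d}-{months[cutoff.month - 1]}-{cutoff.year}"


def fetch_message_ids(account: Account, max_per_account: int = 0) -> List[bytes]:
    """Connect over IMAP4 SSL and return matching message ids (needs network access)."""
    with imaplib.IMAP4_SSL(account.host, account.port) as conn:
        conn.login(account.username, account.password)
        conn.select(account.folder, readonly=True)  # read-only: do not mark as seen
        _status, data = conn.search(None, since_criterion(account.days_back))
        ids = data[0].split()
    # 0 means unlimited, per the config comment; otherwise keep the last N ids
    # (assumed here to be the most recent messages).
    return ids[-max_per_account:] if max_per_account > 0 else ids
```

With the template values, `parse_accounts` would yield two accounts, and each would be searched with a criterion covering the last 90 days. Opening the folder read-only is a design choice for a labeling tool, since fetching should not change the state of the user's mailbox.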