Scrape → Store → Process pipeline for building email classifier
benchmark data across the CircuitForge menagerie.
- app/label_tool.py — Streamlit card-stack UI, multi-account IMAP fetch,
6-bucket labeling, undo/skip, keyboard shortcuts (1-6/S/U)
- scripts/classifier_adapters.py — ZeroShotAdapter (+ two_pass),
GLiClassAdapter, RerankerAdapter; ABC with lazy model loading
- scripts/benchmark_classifier.py — 13-model registry, --score,
--compare, --list-models, --export-db; uses label_tool.yaml for IMAP
- tests/ — 20 tests, all passing, zero model downloads required
- config/label_tool.yaml.example — multi-account IMAP template
- data/email_score.jsonl.example — sample labeled data for CI
Labels: interview_scheduled, offer_received, rejected,
positive_response, survey_received, neutral
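
The "ABC with lazy model loading" design named for scripts/classifier_adapters.py can be sketched roughly as below. This is a minimal illustration, not the repository's code: the class names ClassifierAdapter and KeywordAdapter, the methods _load, _predict and classify, and the keyword table are all hypothetical. Only the six labels come from the list above. The stub backend stands in for a real model so that, like the test suite, the sketch runs without any model download.

```python
from abc import ABC, abstractmethod

# The six labels listed in the commit message.
LABELS = [
    "interview_scheduled", "offer_received", "rejected",
    "positive_response", "survey_received", "neutral",
]


class ClassifierAdapter(ABC):
    """Hypothetical base class: the model is built on first use, not in __init__."""

    def __init__(self, model_id: str) -> None:
        self.model_id = model_id
        self._model = None  # nothing is loaded at construction time

    @abstractmethod
    def _load(self):
        """Build and return the underlying model object."""

    @abstractmethod
    def _predict(self, model, text: str) -> str:
        """Return one of LABELS for the given text."""

    def classify(self, text: str) -> str:
        # Lazy loading: pay the load cost only on the first call.
        if self._model is None:
            self._model = self._load()
        label = self._predict(self._model, text)
        if label not in LABELS:
            raise ValueError(f"unexpected label: {label!r}")
        return label


class KeywordAdapter(ClassifierAdapter):
    """Stand-in backend (assumption): a keyword table instead of a real model."""

    def __init__(self, model_id: str = "keyword-stub") -> None:
        super().__init__(model_id)
        self.load_count = 0  # lets a test confirm the model loads only once

    def _load(self):
        self.load_count += 1
        return {
            "interview": "interview_scheduled",
            "offer": "offer_received",
            "unfortunately": "rejected",
        }

    def _predict(self, model, text: str) -> str:
        lowered = text.lower()
        for keyword, label in model.items():
            if keyword in lowered:
                return label
        return "neutral"
```

Constructing an adapter is cheap under this scheme, so a registry can hold many adapters and only the ones actually scored ever load their weights.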
25 lines
428 B
YAML
name: job-seeker-classifiers
channels:
  - conda-forge
  - defaults
dependencies:
  - python=3.11
  - pip
  - pip:
    # UI
    - streamlit>=1.32
    - pyyaml>=6.0

    # Classifier backends (heavy — install selectively)
    - transformers>=4.40
    - torch>=2.2
    - accelerate>=0.27

    # Optional: GLiClass adapter
    # - gliclass

    # Optional: BGE reranker adapter
    # - FlagEmbedding

    # Dev
    - pytest>=8.0
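
Because the GLiClass and BGE reranker packages are commented out as optional, code that uses them may want to check what is installed before building an adapter. The sketch below is one way to do that with the standard library. The function name available_backends and the adapter-to-module mapping are assumptions for illustration, and it assumes the import names match the package names in the environment file (gliclass, FlagEmbedding).

```python
from importlib.util import find_spec

# Assumed mapping from adapter name to the top-level module it needs.
OPTIONAL_BACKENDS = {
    "GLiClassAdapter": "gliclass",
    "RerankerAdapter": "FlagEmbedding",
}


def available_backends(modules=OPTIONAL_BACKENDS):
    """Report which optional modules can be imported, without importing them."""
    return {
        adapter: find_spec(module) is not None
        for adapter, module in modules.items()
    }


if __name__ == "__main__":
    for adapter, present in available_backends().items():
        print(f"{adapter}: {'installed' if present else 'missing'}")
```

Using find_spec avoids importing the heavy packages just to test for their presence.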