# Architecture This page describes Peregrine's system structure, layer boundaries, and key design decisions. --- ## System Overview ``` ┌─────────────────────────────────────────────────────────────┐ │ Docker Compose │ │ │ │ ┌──────────┐ ┌──────────┐ ┌───────┐ ┌───────────────┐ │ │ │ app │ │ ollama │ │ vllm │ │ vision │ │ │ │ :8501 │ │ :11434 │ │ :8000 │ │ :8002 │ │ │ │Streamlit │ │ Local LLM│ │ vLLM │ │ Moondream2 │ │ │ └────┬─────┘ └──────────┘ └───────┘ └───────────────┘ │ │ │ │ │ ┌────┴───────┐ ┌─────────────┐ │ │ │ searxng │ │ staging.db │ │ │ │ :8888 │ │ (SQLite) │ │ │ └────────────┘ └─────────────┘ │ └─────────────────────────────────────────────────────────────┘ ┌─────────────────────────────────────────────────────────────┐ │ Streamlit App Layer │ │ │ │ app/app.py (entry point, navigation, sidebar task badge) │ │ │ │ app/pages/ │ │ 0_Setup.py First-run wizard (gates everything) │ │ 1_Job_Review.py Approve / reject queue │ │ 2_Settings.py All user configuration │ │ 4_Apply.py Cover letter gen + PDF export │ │ 5_Interviews.py Kanban: phone_screen → hired │ │ 6_Interview_Prep.py Research brief + practice Q&A │ │ 7_Survey.py Culture-fit survey assistant │ │ │ │ app/wizard/ │ │ step_hardware.py ... step_integrations.py │ │ tiers.py Feature gate definitions │ └─────────────────────────────────────────────────────────────┘ ┌─────────────────────────────────────────────────────────────┐ │ Scripts Layer │ │ (framework-independent — could be called by FastAPI) │ │ │ │ discover.py JobSpy + custom board orchestration │ │ match.py Resume keyword scoring │ │ db.py All SQLite helpers (single source) │ │ llm_router.py LLM fallback chain │ │ generate_cover_letter.py Cover letter generation │ │ company_research.py Pre-interview research brief │ │ task_runner.py Background daemon thread executor │ │ imap_sync.py IMAP email fetch + classify │ │ sync.py Push to external integrations │ │ user_profile.py UserProfile wrapper for user.yaml │ │ preflight.py Port + resource check │ │ │ │ custom_boards/ Per-board scrapers │ │ integrations/ Per-service integration drivers │ │ vision_service/ FastAPI Moondream2 inference server │ └─────────────────────────────────────────────────────────────┘ ┌─────────────────────────────────────────────────────────────┐ │ Config Layer │ │ │ │ config/user.yaml Personal data + wizard state │ │ config/llm.yaml LLM backends + fallback chains │ │ config/search_profiles.yaml Job search configuration │ │ config/resume_keywords.yaml Scoring keywords │ │ config/blocklist.yaml Excluded companies/domains │ │ config/email.yaml IMAP credentials │ │ config/integrations/ Per-integration credentials │ └─────────────────────────────────────────────────────────────┘ ┌─────────────────────────────────────────────────────────────┐ │ Database Layer │ │ │ │ staging.db (SQLite, local, gitignored) │ │ │ │ jobs Core pipeline — all job data │ │ job_contacts Email thread log per job │ │ company_research LLM-generated research briefs │ │ background_tasks Async task queue state │ │ survey_responses Culture-fit survey Q&A pairs │ └─────────────────────────────────────────────────────────────┘ ``` --- ## Layer Boundaries ### App layer (app/) The Streamlit UI layer. Its only responsibilities are: - Reading from `scripts/db.py` helpers - Calling `scripts/` functions directly or via `task_runner.submit_task()` - Rendering results to the browser The app layer does not contain business logic. Database queries, LLM calls, and integrations all live in `scripts/`. ### Scripts layer (scripts/) This is the stable public API of Peregrine. Scripts are designed to be framework-independent — they do not import Streamlit and can be called from a CLI, FastAPI endpoint, or background thread without modification. All personal data access goes through `scripts/user_profile.py` (`UserProfile` class). Scripts never read `config/user.yaml` directly. All database access goes through `scripts/db.py`. No script does raw SQLite outside of `db.py`. ### Config layer (config/) Plain YAML files. Gitignored files contain secrets; `.example` files are committed as templates. --- ## Background Tasks `scripts/task_runner.py` provides a simple background thread executor for long-running LLM tasks. ```python from scripts.task_runner import submit_task # Queue a cover letter generation task submit_task(db_path, task_type="cover_letter", job_id=42) # Queue a company research task submit_task(db_path, task_type="company_research", job_id=42) ``` Tasks are recorded in the `background_tasks` table with statuses: `queued → running → completed / failed`. **Dedup rule:** Only one `queued` or `running` task per `(task_type, job_id)` pair is allowed at a time. Submitting a duplicate is a silent no-op. **On startup:** `app/app.py` resets any `running` or `queued` rows to `failed` to clear tasks that were interrupted by a server restart. **Sidebar indicator:** `app/app.py` polls the `background_tasks` table every 3 seconds via a Streamlit fragment and displays a badge in the sidebar. --- ## LLM Router `scripts/llm_router.py` provides a single `complete()` call that tries backends in priority order and falls back transparently. See [LLM Router](../reference/llm-router.md) for full documentation. --- ## Key Design Decisions ### scripts/ is framework-independent The scripts layer was deliberately kept free of Streamlit imports. This means the full pipeline can be migrated to a FastAPI or Celery backend without rewriting business logic. ### All personal data via UserProfile `scripts/user_profile.py` is the single source of truth for all user data. This makes it easy to swap the storage backend (e.g. from YAML to a database) without touching every script. ### SQLite as staging layer `staging.db` acts as the staging layer between discovery and external integrations. This lets discovery, matching, and the UI all run independently without network dependencies. External integrations (Notion, Airtable, etc.) are push-only and optional. ### Tier system in app/wizard/tiers.py `FEATURES` is a single dict that maps feature key → minimum tier. `can_use(tier, feature)` is the single gating function. New features are added to `FEATURES` in one place. ### Vision service is a separate process Moondream2 requires `torch` and `transformers`, which are incompatible with the lightweight main conda environment. The vision service runs as a separate FastAPI process in a separate conda environment (`job-seeker-vision`), keeping the main env free of GPU dependencies.