Compare commits
227 commits: da43578806 ... b4116e8bae
Commits in this range (SHA1 only):

b4116e8bae 00a567768b 1ce283bb79 ab564741f4 869cb2f197 27d6fc01fc e034a07509 2b9a6c8a22 e62548a22e 3267a895b0
522534d28e 37119cb332 8d9e17d749 4d08e64acf fc6ef88a05 952b21377f 9c87ed1cf2 a1a1141616 27d4b0e732 95378c106e
07c627cdb0 bcd918fb67 207d3816b3 3984a9c743 4d055f6bcd 28e66001a3 535c0ae9e0 3d7f6f7ff1 52470759a4 d51066e8c2
905db2f147 eef2478948 beb1553821 61dc2122e4 0f80b698ff 097def4bba 1a50bc1392 d1fb4abd56 6c7499752c 42f0e6261c
1e12da45f1 b80e4de050 7489c1c12a 97ab8b94e5 bd0e9240eb 5344dc8e7a fba6796b8a f759f5fbc0 530f4346d1 db26b9aaf9
97b695c3e3 72320315e2 37dcdec754 ce19e00cfe 8f9955fa96 5a1fceda84 634e31968f 2fdf6f725e fbd47368ff 2124b24e3d
88f28c2b41 28cc03ba70 7de630e065 1cf6e370b1 9d2ed1d00d 1b500b9f26 d1c5c89da7 bf8eee8a62 d3f86f2143 8da36f251c
89f11b0cae 84862b8ab8 5827386789 7ca348b97f 329baf013f 67634d459a 5124d18770 92e0ea0ba1 0e30096a88 2bae1a92ed
dbcd2710ae 5f1c372c0a efe71150e3 8166204c05 11997f8a13 e5d606ab4b db3dff268a e9b389feb6 483ca00f1a ecad32cd6f
d05cb91401 3d17122334 2ab396bad0 199daebb87 f7f438df70 e1f65d8fe9 20f9933e99 60dab647f2 cad7b9ba35 5f466fa107
c3dc05fe34 1efb033b6f 2d9b8d10f9 791e11d5d5 86613d0218 5254212cb4 435f2e71fd 0d6aa5975e 476ede4267 a2f4102d78
0306b3716d adc3526470 75499bc250 1e5d354209 bc7e3c8952 044b25e838 43bf30fac5 39e8194679 7dab560938 30a2962797
9b24599832 7e96e57d92 6febea216e 207fbdbb69 ca1e4b062a 88908ceca2 be28aba07f 637e8379b6 128ab11763 efc7a1f0bc
e4b6456bc9 488fa71891 ea708321e4 85f0f648b0 2df61eedd2 a7fe4d9ff4 ae7c985fab 6dd89a0863 c287392c39 b4f7a7317d
2fe0e0e2f2 657f9c4060 3b2870ddf1 bef92d667e de8fb1ddc7 fe09e23f4c 8caf7b6356 8887955e7d d13505e760 64487a6abb
84b9490f46 e54208fc14 01a341e4c5 d6545cf496 9fb207c15c f35fec33e9 35056161d7 8ff134addd 5739d1935b 52f912f938
124b950ca3 c3f3fa97a7 26fc97dfe5 8e3f58cf46 2662bab1e6 0174a5396d d0371e8525 3aac7b167f 5e63cd731c 946924524d
feb7bab43e e94695ef1a 4e1748ca62 67aaf7c0b7 11662dde4a f26f948377 6258b9e34d bd326162f1 f08f1b16d0 bdbbc06702
46d10f5daa d8348e4906 a149b65d5d f9e974a957 f78ac24657 41019269a2 41c7954b9d 85e8034093 09a4b38a99 e1cc0e9210
7efbf95840 350591bc48 ca17994e00 fd215a22f6 1a74793804 4c7f74c669 4748cd3672 51e48f8eee 9b0ca6457a 3f85c00359
beb32e576d d3b941134e 27112c7ed2 0546c0e289 1dbb91dc31 edb169959a eac747d999 5d2428f1b9 dc770d151b e332b8a069
c7fb9a00f1 7abf753469 cf185dfbaf 633a7f2d1c af5237e3c2 f13c49d5f1 1a68b07076 aacde4f623 bb656194e1 e40128e289
46790a64d3 306c90c9da 33d3994fb8 a8fa1eb115 f28d91d4d7 af41d14241 6493cf5c5b
217 changed files with 24581 additions and 13480 deletions

.dockerignore (new file, 20 lines)
@@ -0,0 +1,20 @@
.git
__pycache__
*.pyc
*.pyo
staging.db
config/user.yaml
config/notion.yaml
config/email.yaml
config/tokens.yaml
config/craigslist.yaml
.streamlit.pid
.streamlit.log
aihawk/
docs/
tests/
.env
data/
log/
unsloth_compiled_cache/
resume_matcher/

.env.example (new file, 38 lines)
@@ -0,0 +1,38 @@
# .env.example — copy to .env
# Auto-generated by the setup wizard, or fill in manually.
# NEVER commit .env to git.

STREAMLIT_PORT=8501
OLLAMA_PORT=11434
VLLM_PORT=8000
SEARXNG_PORT=8888
VISION_PORT=8002
VISION_MODEL=vikhyatk/moondream2
VISION_REVISION=2025-01-09

DOCS_DIR=~/Documents/JobSearch
OLLAMA_MODELS_DIR=~/models/ollama
VLLM_MODELS_DIR=~/models/vllm
VLLM_MODEL=Ouro-1.4B
OLLAMA_DEFAULT_MODEL=llama3.2:3b

# API keys (required for remote profile)
ANTHROPIC_API_KEY=
OPENAI_COMPAT_URL=
OPENAI_COMPAT_KEY=

# Feedback button — Forgejo issue filing
FORGEJO_API_TOKEN=
FORGEJO_REPO=pyr0ball/peregrine
FORGEJO_API_URL=https://git.opensourcesolarpunk.com/api/v1
# GITHUB_TOKEN=  # future — enable when public mirror is active
# GITHUB_REPO=   # future

# Cloud multi-tenancy (compose.cloud.yml only — do not set for local installs)
CLOUD_MODE=false
CLOUD_DATA_ROOT=/devl/menagerie-data
DIRECTUS_JWT_SECRET=  # must match website/.env DIRECTUS_SECRET value
CF_SERVER_SECRET=     # random 64-char hex — generate: openssl rand -hex 32
PLATFORM_DB_URL=postgresql://cf_platform:<password>@host.docker.internal:5433/circuitforge_platform
HEIMDALL_URL=http://cf-license:8000  # internal Docker URL; override for external access
HEIMDALL_ADMIN_TOKEN=  # must match ADMIN_TOKEN in circuitforge-license .env
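
A note on usage: nothing below is part of the committed file. It is a minimal sketch of how the template is meant to be consumed, using only the target filename and the `openssl rand -hex 32` command already mentioned in the comments above.

```bash
# Copy the template, then fill in values; .env itself is ignored by git and Docker
cp .env.example .env

# Generate the random 64-char hex value the CF_SERVER_SECRET comment asks for
openssl rand -hex 32
```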

.gitea/ISSUE_TEMPLATE/bug_report.md (new file, 30 lines)
@@ -0,0 +1,30 @@
---
name: Bug report
about: Something isn't working correctly
labels: bug
---

## Describe the bug

<!-- A clear description of what went wrong. -->

## Steps to reproduce

1.
2.
3.

## Expected behaviour

## Actual behaviour

<!-- Paste relevant log output below (redact any API keys or personal info): -->

```

## Environment

- Peregrine version: <!-- output of `./manage.sh status` or git tag -->
- OS:
- Runtime: Docker / conda-direct
- GPU profile: remote / cpu / single-gpu / dual-gpu

.gitea/ISSUE_TEMPLATE/feature_request.md (new file, 26 lines)
@@ -0,0 +1,26 @@
---
name: Feature request
about: Suggest an improvement or new capability
labels: enhancement
---

## Problem statement

<!-- What are you trying to do that's currently hard or impossible? -->

## Proposed solution

## Alternatives considered

## Which tier would this belong to?

- [ ] Free
- [ ] Paid
- [ ] Premium
- [ ] Ultra (human-in-the-loop)
- [ ] Not sure

## Would you be willing to contribute a PR?

- [ ] Yes
- [ ] No

.githooks/commit-msg (new executable file, 32 lines)
@@ -0,0 +1,32 @@
#!/usr/bin/env bash
# .githooks/commit-msg — enforces conventional commit format
# Format: type: description OR type(scope): description
set -euo pipefail

RED='\033[0;31m'; YELLOW='\033[1;33m'; NC='\033[0m'

VALID_TYPES="feat|fix|docs|chore|test|refactor|perf|ci|build"
MSG_FILE="$1"
MSG=$(head -1 "$MSG_FILE")

if [[ -z "${MSG// }" ]]; then
    echo -e "${RED}Commit rejected:${NC} Commit message is empty."
    exit 1
fi

if ! echo "$MSG" | grep -qE "^($VALID_TYPES)(\(.+\))?: .+"; then
    echo -e "${RED}Commit rejected:${NC} Message does not follow conventional commit format."
    echo ""
    echo -e "  Required: ${YELLOW}type: description${NC} or ${YELLOW}type(scope): description${NC}"
    echo -e "  Valid types: ${YELLOW}$VALID_TYPES${NC}"
    echo ""
    echo -e "  Your message: ${YELLOW}$MSG${NC}"
    echo ""
    echo -e "  Examples:"
    echo -e "    ${YELLOW}feat: add cover letter refinement${NC}"
    echo -e "    ${YELLOW}fix(wizard): handle missing user.yaml gracefully${NC}"
    echo -e "    ${YELLOW}docs: update tier system reference${NC}"
    exit 1
fi

exit 0
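
Since the hook only inspects the first line of the message with a single `grep -E`, a candidate message can be dry-run against the same pattern before committing. This is an illustrative check, not part of the repository:

```bash
VALID_TYPES="feat|fix|docs|chore|test|refactor|perf|ci|build"

# Passes: conventional type plus optional scope
echo "fix(wizard): handle missing user.yaml gracefully" \
  | grep -qE "^($VALID_TYPES)(\(.+\))?: .+" && echo accepted || echo rejected

# Fails: no type prefix, so the hook would reject it
echo "update stuff" \
  | grep -qE "^($VALID_TYPES)(\(.+\))?: .+" && echo accepted || echo rejected
```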

.githooks/pre-commit (new executable file, 84 lines)
@@ -0,0 +1,84 @@
#!/usr/bin/env bash
# .githooks/pre-commit — blocks sensitive files and credential patterns from being committed
set -euo pipefail

RED='\033[0;31m'; YELLOW='\033[1;33m'; BOLD='\033[1m'; NC='\033[0m'

BLOCKED=0
STAGED=$(git diff --cached --name-only --diff-filter=ACM 2>/dev/null)

if [[ -z "$STAGED" ]]; then
    exit 0
fi

# ── Blocked filenames ──────────────────────────────────────────────────────────
BLOCKED_FILES=(
    ".env"
    ".env.local"
    ".env.production"
    ".env.staging"
    "*.pem"
    "*.key"
    "*.p12"
    "*.pfx"
    "id_rsa"
    "id_ecdsa"
    "id_ed25519"
    "id_dsa"
    "*.ppk"
    "secrets.yml"
    "secrets.yaml"
    "credentials.json"
    "service-account*.json"
    "*.keystore"
    "htpasswd"
    ".htpasswd"
)

while IFS= read -r file; do
    filename="$(basename "$file")"
    for pattern in "${BLOCKED_FILES[@]}"; do
        # shellcheck disable=SC2254
        case "$filename" in
            $pattern)
                echo -e "${RED}BLOCKED:${NC} ${BOLD}$file${NC} matches blocked filename pattern '${YELLOW}$pattern${NC}'"
                BLOCKED=1
                ;;
        esac
    done
done <<< "$STAGED"

# ── Blocked content patterns ───────────────────────────────────────────────────
declare -A CONTENT_PATTERNS=(
    ["RSA/EC private key header"]="-----BEGIN (RSA|EC|DSA|OPENSSH) PRIVATE KEY"
    ["AWS access key"]="AKIA[0-9A-Z]{16}"
    ["GitHub token"]="ghp_[A-Za-z0-9]{36}"
    ["Generic API key assignment"]="(api_key|API_KEY|secret_key|SECRET_KEY)\s*=\s*['\"][A-Za-z0-9_\-]{16,}"
    ["Stripe secret key"]="sk_(live|test)_[A-Za-z0-9]{24,}"
    ["Forgejo/Gitea token (40 hex chars)"]="[a-f0-9]{40}"
)

while IFS= read -r file; do
    # Skip binary files
    if git diff --cached -- "$file" | grep -qP "^\+.*\x00"; then
        continue
    fi
    for label in "${!CONTENT_PATTERNS[@]}"; do
        pattern="${CONTENT_PATTERNS[$label]}"
        matches=$(git diff --cached -- "$file" | grep "^+" | grep -cP "$pattern" 2>/dev/null || true)
        if [[ "$matches" -gt 0 ]]; then
            echo -e "${RED}BLOCKED:${NC} ${BOLD}$file${NC} contains pattern matching '${YELLOW}$label${NC}'"
            BLOCKED=1
        fi
    done
done <<< "$STAGED"

# ── Result ─────────────────────────────────────────────────────────────────────
if [[ "$BLOCKED" -eq 1 ]]; then
    echo ""
    echo -e "${RED}Commit rejected.${NC} Remove sensitive files/content before committing."
    echo -e "To bypass in an emergency: ${YELLOW}git commit --no-verify${NC} (use with extreme caution)"
    exit 1
fi

exit 0
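
The hook reads the staged diff, so it can be exercised directly without making a commit. The snippet below is a hedged smoke test (the fake key value and throwaway filename are invented for illustration): it stages a file containing an AWS-style key, runs the hook, and then unstages the file.

```bash
# Stage a file with a fake AWS-style access key (AKIA + 16 uppercase letters/digits)
printf 'api_key = "AKIAIOSFODNN7EXAMPLE"\n' > scratch_secret_test.py
git add scratch_secret_test.py

# Run the hook directly; it should print BLOCKED lines and exit non-zero
bash .githooks/pre-commit || echo "hook blocked the staged secret as expected"

# Clean up
git rm --cached -q scratch_secret_test.py && rm scratch_secret_test.py
```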

.github/ISSUE_TEMPLATE/bug_report.md (new file, vendored, 30 lines)
@@ -0,0 +1,30 @@
---
name: Bug report
about: Something isn't working correctly
labels: bug
---

## Describe the bug

<!-- A clear description of what went wrong. -->

## Steps to reproduce

1.
2.
3.

## Expected behaviour

## Actual behaviour

<!-- Paste relevant log output below (redact any API keys or personal info): -->

```

## Environment

- Peregrine version: <!-- output of `./manage.sh status` or git tag -->
- OS:
- Runtime: Docker / conda-direct
- GPU profile: remote / cpu / single-gpu / dual-gpu

.github/ISSUE_TEMPLATE/config.yml (new file, vendored, 5 lines)
@@ -0,0 +1,5 @@
blank_issues_enabled: false
contact_links:
  - name: Security vulnerability
    url: mailto:security@circuitforge.tech
    about: Do not open a public issue for security vulnerabilities. Email us instead.

.github/ISSUE_TEMPLATE/feature_request.md (new file, vendored, 26 lines)
@@ -0,0 +1,26 @@
---
name: Feature request
about: Suggest an improvement or new capability
labels: enhancement
---

## Problem statement

<!-- What are you trying to do that's currently hard or impossible? -->

## Proposed solution

## Alternatives considered

## Which tier would this belong to?

- [ ] Free
- [ ] Paid
- [ ] Premium
- [ ] Ultra (human-in-the-loop)
- [ ] Not sure

## Would you be willing to contribute a PR?

- [ ] Yes
- [ ] No

.github/ISSUE_TEMPLATE/support_request.md (new file, vendored, 26 lines)
@@ -0,0 +1,26 @@
---
name: Support Request
about: Ask a question or get help using Peregrine
title: '[Support] '
labels: question
assignees: ''
---

## What are you trying to do?

<!-- Describe what you're trying to accomplish -->

## What have you tried?

<!-- Steps you've already taken, docs you've read, etc. -->

## Environment

- OS: <!-- e.g. Ubuntu 22.04, macOS 14 -->
- Install method: <!-- Docker / Podman / source -->
- Peregrine version: <!-- run `./manage.sh status` or check the UI footer -->
- LLM backend: <!-- Ollama / vLLM / OpenAI / other -->

## Logs or screenshots

<!-- Paste relevant output from `./manage.sh logs` or attach a screenshot -->

.github/pull_request_template.md (new file, vendored, 27 lines)
@@ -0,0 +1,27 @@
## Summary

<!-- What does this PR do? -->

## Related issue(s)

Closes #

## Type of change

- [ ] feat — new feature
- [ ] fix — bug fix
- [ ] docs — documentation only
- [ ] chore — tooling, deps, refactor
- [ ] test — test coverage

## Testing

<!-- What did you run to verify this works? -->

```bash
pytest tests/ -v
```

## CLA

- [ ] I agree that my contribution is licensed under the project's [BSL 1.1](./LICENSE-BSL) terms.

.github/workflows/ci.yml (new file, vendored, 29 lines)
@@ -0,0 +1,29 @@
name: CI

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4

      - name: Install system dependencies
        run: sudo apt-get update -q && sudo apt-get install -y libsqlcipher-dev

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.11"
          cache: pip

      - name: Install dependencies
        run: pip install -r requirements.txt

      - name: Run tests
        run: pytest tests/ -v --tb=short
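
The workflow boils down to three shell steps, so the same checks can be reproduced on a local Debian/Ubuntu machine before pushing. A sketch of the equivalent commands, assuming a Python 3.11 environment is already active:

```bash
# Mirror of the CI job's steps on a local Ubuntu host
sudo apt-get update -q && sudo apt-get install -y libsqlcipher-dev
pip install -r requirements.txt
pytest tests/ -v --tb=short
```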

.gitignore (modified, vendored, +30 lines)
@@ -18,3 +18,33 @@ log/
 unsloth_compiled_cache/
 data/survey_screenshots/*
 !data/survey_screenshots/.gitkeep
+config/user.yaml
+config/plain_text_resume.yaml
+config/.backup-*
+config/integrations/*.yaml
+!config/integrations/*.yaml.example
+
+# companyScraper runtime artifacts
+scrapers/.cache/
+scrapers/.debug/
+scrapers/raw_scrapes/
+
+compose.override.yml
+config/license.json
+config/user.yaml.working
+
+# Claude context files — kept out of version control
+CLAUDE.md
+
+data/email_score.jsonl
+data/email_label_queue.jsonl
+data/email_compare_sample.jsonl
+
+config/label_tool.yaml
+config/server.yaml
+
+demo/data/*.db
+demo/seed_demo.py
+
+# Git worktrees
+.worktrees/

.gitleaks.toml (new file, 32 lines)
@@ -0,0 +1,32 @@
# peregrine/.gitleaks.toml — per-repo allowlists extending the shared base config
[extend]
path = "/Library/Development/CircuitForge/circuitforge-hooks/gitleaks.toml"

[allowlist]
description = "Peregrine-specific allowlists"
paths = [
    'docs/plans/.*',       # plan docs contain example tokens and placeholders
    'docs/reference/.*',   # reference docs (globally excluded in base config)
    'tests/.*',            # test fixtures use fake phone numbers as job IDs
    'scripts/integrations/apple_calendar\.py',  # you@icloud.com is a placeholder comment
    # Streamlit app files: key= params are widget identifiers, not secrets
    'app/feedback\.py',
    'app/pages/2_Settings\.py',
    'app/pages/7_Survey\.py',
    # SearXNG default config: change-me-in-production is a well-known public placeholder
    'docker/searxng/settings\.yml',
]
regexes = [
    # Job listing numeric IDs (look like phone numbers to the phone rule)
    '\d{10}\.html',   # Craigslist listing IDs
    '\d{10}\/',       # LinkedIn job IDs in URLs
    # Localhost port patterns (look like phone numbers)
    'localhost:\d{4,5}',
    # Unix epoch timestamps in the 2025–2026 range (10-digit, look like phone numbers)
    '174\d{7}',
    # Example / placeholder license key patterns
    'CFG-[A-Z]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}',
    # Phone number false positives: 555 area code variants not caught by base allowlist
    '555\) \d{3}-\d{4}',
    '555-\d{3}-\d{4}',
]
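
The `[extend]` block pulls in a shared base config from a path on the maintainer's machine, and the allowlists above only suppress known false positives on top of it. To check the rules locally, an ordinary gitleaks v8 invocation against this config should work; the command below is standard gitleaks usage, not something this commit adds:

```bash
# Scan the working tree using the repo-level config (which extends the shared base)
gitleaks detect --source . --config .gitleaks.toml
```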

CHANGELOG.md (new file, 129 lines)
@@ -0,0 +1,129 @@
# Changelog

All notable changes to Peregrine are documented here.
Format follows [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).

---

## [Unreleased]

---

## [0.4.0] — 2026-03-13

### Added
- **LinkedIn profile import** — one-click import from a public LinkedIn profile URL
  (Playwright headless Chrome, no login required) or from a LinkedIn data export zip.
  Staged to `linkedin_stage.json` so the profile is parsed once and reused across
  sessions without repeated network requests. Available on all tiers including Free.
- `scripts/linkedin_utils.py` — HTML parser with ordered CSS selector fallbacks;
  extracts name, experience, education, skills, certifications, summary
- `scripts/linkedin_scraper.py` — Playwright URL scraper + export zip CSV parser;
  atomic staging file write; URL validation; robust error handling
- `scripts/linkedin_parser.py` — staging file reader; re-runs HTML parser on stored
  raw HTML so selector improvements apply without re-scraping
- `app/components/linkedin_import.py` — shared Streamlit widget (status bar, preview,
  URL import, advanced zip upload) used by both wizard and Settings
- Wizard step 3: new "🔗 LinkedIn" tab alongside Upload and Build Manually
- Settings → Resume Profile: collapsible "Import from LinkedIn" expander
- Dockerfile: Playwright Chromium install added to Docker image

### Fixed
- **Cloud mode perpetual onboarding loop** — wizard gate in `app.py` now reads
  `get_config_dir()/user.yaml` (per-user in cloud, repo-level locally) instead of a
  hardcoded repo path; completing the wizard now correctly exits it in cloud mode
- **Cloud resume YAML path** — wizard step 3 writes resume to per-user `CONFIG_DIR`
  instead of the shared repo `config/` (would have merged all cloud users' data)
- **Cloud session redirect** — missing/invalid session token now JS-redirects to
  `circuitforge.tech/login` instead of showing a raw error message
- Removed remaining AIHawk UI references (`Home.py`, `4_Apply.py`, `migrate.py`)

---

## [0.3.0] — 2026-03-06

### Added
- **Feedback button** — in-app issue reporting with screenshot paste support; posts
  directly to Forgejo as structured issues; available from sidebar on all pages
  (`app/feedback.py`, `scripts/feedback_api.py`, `app/components/paste_image.py`)
- **BYOK cloud backend detection** — `scripts/byok_guard.py`: pure Python detection
  engine with full unit test coverage (18 tests); classifies backends as cloud or local
  based on type, `base_url` heuristic, and opt-out `local: true` flag
- **BYOK activation warning** — one-time acknowledgment required in Settings when a
  new cloud LLM backend is enabled; shows data inventory (what leaves your machine,
  what stays local), provider policy links; ack state persisted to `config/user.yaml`
  under `byok_acknowledged_backends`
- **Sidebar cloud LLM indicator** — amber badge on every page when any cloud backend
  is active; links to Settings; disappears when reverted to local-only config
- **LLM suggest: search terms** — three-angle analysis from resume (job titles,
  skills keywords, and exclude terms to filter irrelevant listings)
- **LLM suggest: resume keywords** — skills gap analysis against job descriptions
- **LLM Suggest button** in Settings → Search → Skills & Keywords section
- **Backup/restore script** (`scripts/backup.py`) — multi-instance and legacy support
- `PRIVACY.md` — short-form privacy notice linked from Settings

### Changed
- Settings save button for LLM Backends now gates on cloud acknowledgment before
  writing `config/llm.yaml`

### Fixed
- Settings widget crash on certain rerun paths
- Docker service controls in Settings → System tab
- `DEFAULT_DB` now respects `STAGING_DB` environment variable (was silently ignoring it)
- `generate()` in cover letter refinement now correctly passes `max_tokens` kwarg

### Security / Privacy
- Full test suite anonymized — fictional "Alex Rivera" replaces all real personal data
  in test fixtures (`tests/test_cover_letter.py`, `test_imap_sync.py`,
  `test_classifier_adapters.py`, `test_db.py`)
- Complete PII scrub from git history: real name, email address, and phone number
  removed from all 161 commits across both branches via `git filter-repo`

---

## [0.2.0] — 2026-02-26

### Added
- Cover letter iterative refinement: "Refine with Feedback" expander in Apply Workspace;
  `generate()` accepts `previous_result`/`feedback`; task params passed through `submit_task`
- Expanded first-run wizard: 7-step onboarding with GPU detection, tier selection,
  resume upload/parsing, LLM inference test, search profile builder, integration cards
- Tier system: free / paid / premium feature gates (`app/wizard/tiers.py`)
- 13 integration drivers: Notion, Google Sheets, Airtable, Google Drive, Dropbox,
  OneDrive, MEGA, Nextcloud, Google Calendar, Apple Calendar, Slack, Discord,
  Home Assistant — with auto-discovery registry
- Resume parser: PDF (pdfplumber) and DOCX (python-docx) + LLM structuring
- `wizard_generate` background task type with iterative refinement (feedback loop)
- Dismissible setup banners on Home page (13 contextual prompts)
- Developer tab in Settings: tier override selectbox and wizard reset button
- Integrations tab in Settings: connect / test / disconnect all 12 non-Notion drivers
- HuggingFace token moved to Developer tab
- `params` column in `background_tasks` for wizard task payloads
- `wizard_complete`, `wizard_step`, `tier`, `dev_tier_override`, `dismissed_banners`,
  `effective_tier` added to UserProfile
- MkDocs documentation site (Material theme, 20 pages)
- `LICENSE-MIT` and `LICENSE-BSL`, `CONTRIBUTING.md`, `CHANGELOG.md`

### Changed
- `app.py` wizard gate now checks `wizard_complete` flag in addition to file existence
- Settings tabs reorganised: Integrations tab added, Developer tab conditionally shown
- HF token removed from Services tab (now Developer-only)

### Removed
- Dead `app/pages/3_Resume_Editor.py` (functionality lives in Settings → Resume Profile)

---

## [0.1.0] — 2026-02-01

### Added
- Initial release: JobSpy discovery pipeline, SQLite staging, Streamlit UI
- Job Review, Apply Workspace, Interviews kanban, Interview Prep, Survey Assistant
- LLM router with fallback chain (Ollama, vLLM, Claude Code wrapper, Anthropic)
- Notion sync, email sync with IMAP classifier, company research with SearXNG
- Background task runner with daemon threads
- Vision service (moondream2) for survey screenshot analysis
- Adzuna, The Ladders, and Craigslist custom board scrapers
- Docker Compose profiles: remote, cpu, single-gpu, dual-gpu
- `setup.sh` cross-platform dependency installer
- `scripts/preflight.py` and `scripts/migrate.py`

CLAUDE.md (deleted, 212 lines removed)
@@ -1,212 +0,0 @@
# Job Seeker Platform — Claude Context

## Project
Automated job discovery + resume matching + application pipeline for Alex Rivera.

Full pipeline:
```
JobSpy → discover.py → SQLite (staging.db) → match.py → Job Review UI
        → Apply Workspace (cover letter + PDF) → Interviews kanban
        → phone_screen → interviewing → offer → hired
        ↓
      Notion DB (synced via sync.py)
```

## Environment
- Python env: `conda run -n job-seeker <cmd>` — always use this, never bare python
- Run tests: `/devl/miniconda3/envs/job-seeker/bin/pytest tests/ -v`
  (use direct binary — `conda run pytest` can spawn runaway processes)
- Run discovery: `conda run -n job-seeker python scripts/discover.py`
- Recreate env: `conda env create -f environment.yml`
- pytest.ini scopes test collection to `tests/` only — never widen this

## ⚠️ AIHawk env isolation — CRITICAL
- NEVER `pip install -r aihawk/requirements.txt` into the job-seeker env
- AIHawk pulls torch + CUDA (~7GB) which causes OOM during test runs
- AIHawk must run in its own env: `conda create -n aihawk-env python=3.12`
- job-seeker env must stay lightweight (no torch, no sentence-transformers, no CUDA)

## Web UI (Streamlit)
- Run: `bash scripts/manage-ui.sh start` → http://localhost:8501
- Manage: `start | stop | restart | status | logs`
- Direct binary: `/devl/miniconda3/envs/job-seeker/bin/streamlit run app/app.py`
- Entry point: `app/app.py` (uses `st.navigation()` — do NOT run `app/Home.py` directly)
- `staging.db` is gitignored — SQLite staging layer between discovery and Notion

### Pages
| Page | File | Purpose |
|------|------|---------|
| Home | `app/Home.py` | Dashboard, discovery trigger, danger-zone purge |
| Job Review | `app/pages/1_Job_Review.py` | Batch approve/reject with sorting |
| Settings | `app/pages/2_Settings.py` | LLM backends, search profiles, Notion, services |
| Resume Profile | Settings → Resume Profile tab | Edit AIHawk YAML profile (was standalone `3_Resume_Editor.py`) |
| Apply Workspace | `app/pages/4_Apply.py` | Cover letter gen + PDF export + mark applied + reject listing |
| Interviews | `app/pages/5_Interviews.py` | Kanban: phone_screen→interviewing→offer→hired |
| Interview Prep | `app/pages/6_Interview_Prep.py` | Live reference sheet during calls + Practice Q&A |
| Survey Assistant | `app/pages/7_Survey.py` | Culture-fit survey help: text paste + screenshot (moondream2) |

## Job Status Pipeline
```
pending → approved/rejected (Job Review)
approved → applied (Apply Workspace — mark applied)
approved → rejected (Apply Workspace — reject listing button)
applied → survey (Interviews — "📋 Survey" button; pre-kanban section)
applied → phone_screen (Interviews — triggers company research)
survey → phone_screen (Interviews — after survey completed)
phone_screen → interviewing
interviewing → offer
offer → hired
any stage → rejected (rejection_stage captured for analytics)
applied/approved → synced (sync.py → Notion)
```

## SQLite Schema (`staging.db`)
### `jobs` table key columns
- Standard: `id, title, company, url, source, location, is_remote, salary, description`
- Scores: `match_score, keyword_gaps`
- Dates: `date_found, applied_at, survey_at, phone_screen_at, interviewing_at, offer_at, hired_at`
- Interview: `interview_date, rejection_stage`
- Content: `cover_letter, notion_page_id`

### Additional tables
- `job_contacts` — email thread log per job (direction, subject, from/to, body, received_at)
- `company_research` — LLM-generated brief per job (company_brief, ceo_brief, talking_points, raw_output, accessibility_brief)
- `background_tasks` — async LLM task queue (task_type, job_id, status: queued/running/completed/failed)
- `survey_responses` — per-job Q&A pairs (survey_name, received_at, source, raw_input, image_path, mode, llm_output, reported_score)

## Scripts
| Script | Purpose |
|--------|---------|
| `scripts/discover.py` | JobSpy + custom board scrape → SQLite insert |
| `scripts/custom_boards/adzuna.py` | Adzuna Jobs API (app_id + app_key in config/adzuna.yaml) |
| `scripts/custom_boards/theladders.py` | The Ladders scraper via curl_cffi + __NEXT_DATA__ SSR parse |
| `scripts/match.py` | Resume keyword matching → match_score |
| `scripts/sync.py` | Push approved/applied jobs to Notion |
| `scripts/llm_router.py` | LLM fallback chain (reads config/llm.yaml) |
| `scripts/generate_cover_letter.py` | Cover letter via LLM; detects mission-aligned companies (music/animal welfare/education) and injects Para 3 hint |
| `scripts/company_research.py` | Pre-interview brief via LLM + optional SearXNG scrape; includes Inclusion & Accessibility section |
| `scripts/prepare_training_data.py` | Extract cover letter JSONL for fine-tuning |
| `scripts/finetune_local.py` | Unsloth QLoRA fine-tune on local GPU |
| `scripts/db.py` | All SQLite helpers (single source of truth) |
| `scripts/task_runner.py` | Background thread executor — `submit_task(db, type, job_id)` dispatches daemon threads for LLM jobs |
| `scripts/vision_service/main.py` | FastAPI moondream2 inference on port 8002; `manage-vision.sh` lifecycle |

## LLM Router
- Config: `config/llm.yaml`
- Cover letter fallback order: `claude_code → ollama (alex-cover-writer:latest) → vllm → copilot → anthropic`
- Research fallback order: `claude_code → vllm (__auto__, ouroboros) → ollama_research (llama3.1:8b) → ...`
- `alex-cover-writer:latest` is cover-letter only — it doesn't follow structured markdown prompts for research
- `LLMRouter.complete()` accepts `fallback_order=` override for per-task routing
- `LLMRouter.complete()` accepts `images: list[str]` (base64) — vision backends only; non-vision backends skipped when images present
- Vision fallback order config key: `vision_fallback_order: [vision_service, claude_code, anthropic]`
- `vision_service` backend type: POST to `/analyze`; skipped automatically when no images provided
- Claude Code wrapper: `/Library/Documents/Post Fight Processing/server-openai-wrapper-v2.js`
- Copilot wrapper: `/Library/Documents/Post Fight Processing/manage-copilot.sh start`

## Fine-Tuned Model
- Model: `alex-cover-writer:latest` registered in Ollama
- Base: `unsloth/Llama-3.2-3B-Instruct` (QLoRA, rank 16, 10 epochs)
- Training data: 62 cover letters from `/Library/Documents/JobSearch/`
- JSONL: `/Library/Documents/JobSearch/training_data/cover_letters.jsonl`
- Adapter: `/Library/Documents/JobSearch/training_data/finetune_output/adapter/`
- Merged: `/Library/Documents/JobSearch/training_data/gguf/alex-cover-writer/`
- Re-train: `conda run -n ogma python scripts/finetune_local.py`
  (uses `ogma` env with unsloth + trl; pin to GPU 0 with `CUDA_VISIBLE_DEVICES=0`)

## Background Tasks
- Cover letter gen and company research run as daemon threads via `scripts/task_runner.py`
- Tasks survive page navigation; results written to existing tables when done
- On server restart, `app.py` startup clears any stuck `running`/`queued` rows to `failed`
- Dedup: only one queued/running task per `(task_type, job_id)` at a time
- Sidebar indicator (`app/app.py`) polls every 3s via `@st.fragment(run_every=3)`
- ⚠️ Streamlit fragment + sidebar: use `with st.sidebar: _fragment()` — sidebar context must WRAP the call, not be inside the fragment body

## Vision Service
- Script: `scripts/vision_service/main.py` (FastAPI, port 8002)
- Model: `vikhyatk/moondream2` revision `2025-01-09` — lazy-loaded on first `/analyze` (~1.8GB download)
- GPU: 4-bit quantization when CUDA available (~1.5GB VRAM); CPU fallback
- Conda env: `job-seeker-vision` — separate from job-seeker (torch + transformers live here)
- Create env: `conda env create -f scripts/vision_service/environment.yml`
- Manage: `bash scripts/manage-vision.sh start|stop|restart|status|logs`
- Survey page degrades gracefully to text-only when vision service is down
- ⚠️ Never install vision deps (torch, bitsandbytes, transformers) into the job-seeker env

## Company Research
- Script: `scripts/company_research.py`
- Auto-triggered when a job moves to `phone_screen` in the Interviews kanban
- Three-phase: (1) SearXNG company scrape → (1b) SearXNG news snippets → (2) LLM synthesis
- SearXNG scraper: `/Library/Development/scrapers/companyScraper.py`
- SearXNG Docker: run `docker compose up -d` from `/Library/Development/scrapers/SearXNG/` (port 8888)
- `beautifulsoup4` and `fake-useragent` are installed in job-seeker env (required for scraper)
- News search hits `/search?format=json` — JSON format must be enabled in `searxng-config/settings.yml`
- ⚠️ `settings.yml` owned by UID 977 (container user) — use `docker cp` to update, not direct writes
- ⚠️ `settings.yml` requires `use_default_settings: true` at the top or SearXNG fails schema validation
- `companyScraper` calls `sys.exit()` on missing deps — use `except BaseException` not `except Exception`

## Email Classifier Labels
Six labels: `interview_request`, `rejection`, `offer`, `follow_up`, `survey_received`, `other`
- `survey_received` — links or requests to complete a culture-fit survey/assessment

## Services (managed via Settings → Services tab)
| Service | Port | Notes |
|---------|------|-------|
| Streamlit UI | 8501 | `bash scripts/manage-ui.sh start` |
| Ollama | 11434 | `sudo systemctl start ollama` |
| Claude Code Wrapper | 3009 | `manage-services.sh start` in Post Fight Processing |
| GitHub Copilot Wrapper | 3010 | `manage-copilot.sh start` in Post Fight Processing |
| vLLM Server | 8000 | Manual start only |
| SearXNG | 8888 | `docker compose up -d` in scrapers/SearXNG/ |
| Vision Service | 8002 | `bash scripts/manage-vision.sh start` — moondream2 survey screenshot analysis |

## Notion
- DB: "Tracking Job Applications" (ID: `1bd75cff-7708-8007-8c00-f1de36620a0a`)
- `config/notion.yaml` is gitignored (live token); `.example` is committed
- Field names are non-obvious — always read from `field_map` in `config/notion.yaml`
- "Salary" = Notion title property (unusual — it's the page title field)
- "Job Source" = `multi_select` type
- "Role Link" = URL field
- "Status of Application" = status field; new listings use "Application Submitted"
- Sync pushes `approved` + `applied` jobs; marks them `synced` after

## Key Config Files
- `config/notion.yaml` — gitignored, has token + field_map
- `config/notion.yaml.example` — committed template
- `config/search_profiles.yaml` — titles, locations, boards, custom_boards, exclude_keywords, mission_tags (per profile)
- `config/llm.yaml` — LLM backend priority chain + enabled flags
- `config/tokens.yaml` — gitignored, stores HF token (chmod 600)
- `config/adzuna.yaml` — gitignored, Adzuna API app_id + app_key
- `config/adzuna.yaml.example` — committed template

## Custom Job Board Scrapers
- `scripts/custom_boards/adzuna.py` — Adzuna Jobs API; credentials in `config/adzuna.yaml`
- `scripts/custom_boards/theladders.py` — The Ladders SSR scraper; needs `curl_cffi` installed
- Scrapers registered in `CUSTOM_SCRAPERS` dict in `discover.py`
- Activated per-profile via `custom_boards: [adzuna, theladders]` in `search_profiles.yaml`
- `enrich_all_descriptions()` in `enrich_descriptions.py` covers all sources (not just Glassdoor)
- Home page "Fill Missing Descriptions" button dispatches `enrich_descriptions` task

## Mission Alignment & Accessibility
- Preferred industries: music, animal welfare, children's education (hardcoded in `generate_cover_letter.py`)
- `detect_mission_alignment(company, description)` injects a Para 3 hint into cover letters for aligned companies
- Company research includes an "Inclusion & Accessibility" section (8th section of the brief) in every brief
- Accessibility search query in `_SEARCH_QUERIES` hits SearXNG for ADA/ERG/disability signals
- `accessibility_brief` column in `company_research` table; shown in Interview Prep under ♿ section
- This info is for personal decision-making ONLY — never disclosed in applications
- In generalization: these become `profile.mission_industries` + `profile.accessibility_priority` in `user.yaml`

## Document Rule
Resumes and cover letters live in `/Library/Documents/JobSearch/` or Notion — never committed to this repo.

## AIHawk (LinkedIn Easy Apply)
- Cloned to `aihawk/` (gitignored)
- Config: `aihawk/data_folder/plain_text_resume.yaml` — search FILL_IN for gaps
- Self-ID: non-binary, pronouns any, no disability/drug-test disclosure
- Run: `conda run -n job-seeker python aihawk/main.py`
- Playwright: `conda run -n job-seeker python -m playwright install chromium`

## Git Remote
- Forgejo self-hosted at https://git.opensourcesolarpunk.com (username: pyr0ball)
- `git remote add origin https://git.opensourcesolarpunk.com/pyr0ball/job-seeker.git`

## Subagents
Use `general-purpose` subagent type (not `Bash`) when tasks require file writes.

CONTRIBUTING.md (new file, 83 lines)
@@ -0,0 +1,83 @@
# Contributing to Peregrine

Thanks for your interest. Peregrine is developed primarily at
[git.opensourcesolarpunk.com](https://git.opensourcesolarpunk.com/pyr0ball/peregrine).
GitHub and Codeberg are push mirrors — issues and PRs are welcome on either platform.

---

## License

Peregrine is licensed under **[BSL 1.1](./LICENSE-BSL)** — Business Source License.

What this means for you:

| Use case | Allowed? |
|----------|----------|
| Personal self-hosting, non-commercial | ✅ Free |
| Contributing code, fixing bugs, writing docs | ✅ Free |
| Commercial SaaS / hosted service | 🔒 Requires a paid license |
| After 4 years from each release date | ✅ Converts to MIT |

**By submitting a pull request you agree that your contribution is licensed under the
project's BSL 1.1 terms.** The PR template includes this as a checkbox.

---

## Dev Setup

See [`docs/getting-started/installation.md`](docs/getting-started/installation.md) for
full instructions.

**Quick start (Docker — recommended):**

```bash
git clone https://git.opensourcesolarpunk.com/pyr0ball/peregrine.git
cd peregrine
./setup.sh        # installs deps, activates git hooks
./manage.sh start
```

**Conda (no Docker):**

```bash
conda run -n job-seeker pip install -r requirements.txt
streamlit run app/app.py
```

---

## Commit Format

Hooks enforce [Conventional Commits](https://www.conventionalcommits.org/):

```
type: short description
type(scope): short description
```

Valid types: `feat` `fix` `docs` `chore` `test` `refactor` `perf` `ci` `build`

The hook will tell you exactly what went wrong if your message is rejected.

---

## Pull Request Process

1. Fork and branch from `main`
2. Write tests first (we use `pytest`)
3. Run `pytest tests/ -v` — all tests must pass
4. Open a PR on GitHub or Codeberg
5. PRs are reviewed and cherry-picked to Forgejo (the canonical repo) — you don't need a Forgejo account

---

## Reporting Issues

Use the issue templates:

- **Bug** — steps to reproduce, version, OS, Docker or conda, logs
- **Feature** — problem statement, proposed solution, which tier it belongs to

**Security issues:** Do **not** open a public issue. Email `security@circuitforge.tech`.
See [SECURITY.md](./SECURITY.md).

Dockerfile (new file, 30 lines)
@@ -0,0 +1,30 @@
# Dockerfile
FROM python:3.11-slim

WORKDIR /app

# System deps for companyScraper (beautifulsoup4, fake-useragent, lxml) and PDF gen
# libsqlcipher-dev: required to build pysqlcipher3 (SQLCipher AES-256 encryption for cloud mode)
RUN apt-get update && apt-get install -y --no-install-recommends \
    gcc libffi-dev curl libsqlcipher-dev \
    && rm -rf /var/lib/apt/lists/*

COPY requirements.txt .
# Install Python dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Install Playwright browser (cached separately from Python deps so requirements
# changes don't bust the ~600–900 MB Chromium layer and vice versa)
RUN playwright install chromium && playwright install-deps chromium

# Bundle companyScraper (company research web scraper)
COPY scrapers/ /app/scrapers/

COPY . .

EXPOSE 8501

CMD ["streamlit", "run", "app/app.py", \
     "--server.port=8501", \
     "--server.headless=true", \
     "--server.fileWatcherType=none"]
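
The compose files normally build and run this image, but it can also be exercised by hand. In the sketch below the image tag is illustrative; only the exposed port 8501 and the `.env` convention come from the files in this diff.

```bash
# Hypothetical manual build/run; `./manage.sh start` via compose is the supported path
docker build -t peregrine-app .
docker run --rm -p 8501:8501 --env-file .env peregrine-app
```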

Dockerfile.finetune (new file, 38 lines)
@@ -0,0 +1,38 @@
# Dockerfile.finetune — Cover letter LoRA fine-tuner (QLoRA via unsloth)
# Large image (~12-15 GB after build). Built once, cached on rebuilds.
# GPU strongly recommended. CPU fallback works but training is very slow.
#
# Tested base: pytorch/pytorch:2.3.0-cuda12.1-cudnn8-runtime
# If your GPU requires a different CUDA version, change the FROM line and
# reinstall bitsandbytes for the matching CUDA (e.g. bitsandbytes-cuda121).
FROM pytorch/pytorch:2.3.0-cuda12.1-cudnn8-runtime

WORKDIR /app

# Build tools needed by bitsandbytes CUDA kernels and unsloth
RUN apt-get update && apt-get install -y --no-install-recommends \
    gcc g++ git libgomp1 \
    && rm -rf /var/lib/apt/lists/*

# Install training stack.
# unsloth detects CUDA version automatically from the base image.
RUN pip install --no-cache-dir \
    "unsloth @ git+https://github.com/unslothai/unsloth.git" \
    "datasets>=2.18" "trl>=0.8" peft transformers \
    "bitsandbytes>=0.43.0" accelerate sentencepiece \
    requests pyyaml

COPY scripts/ /app/scripts/
COPY config/ /app/config/

ENV PYTHONUNBUFFERED=1
# Pin to GPU 0; overridable at runtime with --env CUDA_VISIBLE_DEVICES=
ENV CUDA_VISIBLE_DEVICES=0

# Runtime env vars injected by compose.yml:
#   OLLAMA_URL — Ollama API base (default: http://ollama:11434)
#   OLLAMA_MODELS_MOUNT — finetune container's mount path for ollama models volume
#   OLLAMA_MODELS_OLLAMA_PATH — Ollama container's mount path for same volume
#   DOCS_DIR — cover letters + training data root (default: /docs)

ENTRYPOINT ["python", "scripts/finetune_local.py"]
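
For reference, the comments above list the variables compose injects at runtime; a hand-run equivalent might look like the sketch below. The image tag, the host-side Ollama URL, and the host documents path are assumptions for illustration only; `make finetune` is the path the Makefile actually wires up.

```bash
# Illustrative direct run (assumed names/paths); prefer `make finetune` via compose
docker build -f Dockerfile.finetune -t peregrine-finetune .
docker run --rm --gpus all \
  -e OLLAMA_URL=http://host.docker.internal:11434 \
  -e DOCS_DIR=/docs \
  -v "$HOME/Documents/JobSearch:/docs" \
  peregrine-finetune
```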

LICENSE-BSL (new file, 26 lines)
@@ -0,0 +1,26 @@
Business Source License 1.1

Licensor: Circuit Forge LLC
Licensed Work: Peregrine — AI-powered job search pipeline
Copyright (c) 2026 Circuit Forge LLC
Additional Use Grant: You may use the Licensed Work for personal,
  non-commercial job searching purposes only.
Change Date: 2030-01-01
Change License: MIT License

For the full Business Source License 1.1 text, see:
https://mariadb.com/bsl11/

---

This license applies to the following components of Peregrine:

- scripts/llm_router.py
- scripts/generate_cover_letter.py
- scripts/company_research.py
- scripts/task_runner.py
- scripts/resume_parser.py
- scripts/imap_sync.py
- scripts/vision_service/
- scripts/integrations/
- app/

LICENSE-MIT (new file, 35 lines)
@@ -0,0 +1,35 @@
MIT License

Copyright (c) 2026 Circuit Forge LLC

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

---

This license applies to the following components of Peregrine:

- scripts/discover.py
- scripts/custom_boards/
- scripts/match.py
- scripts/db.py
- scripts/migrate.py
- scripts/preflight.py
- scripts/user_profile.py
- setup.sh
- Makefile
84 Makefile Normal file
@ -0,0 +1,84 @@
# Makefile — Peregrine convenience targets
# Usage: make <target>

.PHONY: setup preflight start stop restart logs test prepare-training finetune clean help

PROFILE ?= remote
PYTHON ?= python3

# Auto-detect container engine: prefer docker compose, fall back to podman
COMPOSE ?= $(shell \
	command -v docker >/dev/null 2>&1 && docker compose version >/dev/null 2>&1 \
	&& echo "docker compose" \
	|| (command -v podman >/dev/null 2>&1 \
	&& podman compose version >/dev/null 2>&1 \
	&& echo "podman compose" \
	|| echo "podman-compose"))

# GPU profiles require an overlay for NVIDIA device reservations.
# Docker uses deploy.resources (compose.gpu.yml); Podman uses CDI device specs (compose.podman-gpu.yml).
# Generate CDI spec for Podman first: sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
#
# NOTE: When explicit -f flags are used, Docker Compose does NOT auto-detect
# compose.override.yml. We must include it explicitly when present.
OVERRIDE_FILE := $(wildcard compose.override.yml)
COMPOSE_OVERRIDE := $(if $(OVERRIDE_FILE),-f compose.override.yml,)
DUAL_GPU_MODE ?= $(shell grep -m1 '^DUAL_GPU_MODE=' .env 2>/dev/null | cut -d= -f2 || echo ollama)

COMPOSE_FILES := -f compose.yml $(COMPOSE_OVERRIDE)
ifneq (,$(findstring podman,$(COMPOSE)))
ifneq (,$(findstring gpu,$(PROFILE)))
COMPOSE_FILES := -f compose.yml $(COMPOSE_OVERRIDE) -f compose.podman-gpu.yml
endif
else
ifneq (,$(findstring gpu,$(PROFILE)))
COMPOSE_FILES := -f compose.yml $(COMPOSE_OVERRIDE) -f compose.gpu.yml
endif
endif
ifeq ($(PROFILE),dual-gpu)
COMPOSE_FILES += --profile dual-gpu-$(DUAL_GPU_MODE)
endif

# 'remote' means base services only — no services are tagged 'remote' in compose.yml,
# so --profile remote is a no-op with Docker and a fatal error on old podman-compose.
# Only pass --profile for profiles that actually activate optional services.
PROFILE_ARG := $(if $(filter remote,$(PROFILE)),,--profile $(PROFILE))

setup: ## Install dependencies (Docker or Podman + NVIDIA toolkit)
	@bash setup.sh

preflight: ## Check ports + system resources; write .env
	@$(PYTHON) scripts/preflight.py

start: preflight ## Preflight check then start Peregrine (PROFILE=remote|cpu|single-gpu|dual-gpu)
	$(COMPOSE) $(COMPOSE_FILES) $(PROFILE_ARG) up -d

stop: ## Stop all Peregrine services
	$(COMPOSE) down

restart: ## Stop services, re-run preflight (ports now free), then start
	$(COMPOSE) down
	@$(PYTHON) scripts/preflight.py
	$(COMPOSE) $(COMPOSE_FILES) $(PROFILE_ARG) up -d

logs: ## Tail app logs
	$(COMPOSE) logs -f app

test: ## Run the test suite
	@$(PYTHON) -m pytest tests/ -v

prepare-training: ## Scan docs_dir for cover letters and build training JSONL
	$(COMPOSE) $(COMPOSE_FILES) run --rm app python scripts/prepare_training_data.py

finetune: ## Fine-tune your personal cover letter model (run prepare-training first)
	@echo "Starting fine-tune (30-90 min on GPU, much longer on CPU)..."
	$(COMPOSE) $(COMPOSE_FILES) -f compose.gpu.yml --profile finetune run --rm finetune

clean: ## Remove containers, images, and data volumes (DESTRUCTIVE)
	@echo "WARNING: This will delete all Peregrine containers and data."
	@read -p "Type 'yes' to confirm: " confirm && [ "$$confirm" = "yes" ]
	$(COMPOSE) down --rmi local --volumes

help: ## Show this help
	@grep -E '^[a-zA-Z_-]+:.*?## .*$$' $(MAKEFILE_LIST) | \
	awk 'BEGIN {FS = ":.*?## "}; {printf "  \033[36m%-12s\033[0m %s\n", $$1, $$2}'
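For reference, here is roughly what the variable logic above expands to for a single-GPU start on a Docker host where preflight has written `compose.override.yml` — a sketch of the expansion, not output captured from `make`:

```bash
# Approximate expansion of `make start PROFILE=single-gpu` (Docker engine,
# compose.override.yml present). The dual-gpu profile would additionally
# append --profile dual-gpu-$(DUAL_GPU_MODE).
python3 scripts/preflight.py
docker compose -f compose.yml -f compose.override.yml -f compose.gpu.yml \
  --profile single-gpu up -d
```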
7 PRIVACY.md Normal file
@ -0,0 +1,7 @@
# Privacy Policy

CircuitForge LLC's privacy policy applies to this product and is published at:

**<https://circuitforge.tech/privacy>**

Last reviewed: March 2026.
206 README.md Normal file
@ -0,0 +1,206 @@
# Peregrine

> **Primary development** happens at [git.opensourcesolarpunk.com](https://git.opensourcesolarpunk.com/pyr0ball/peregrine) — GitHub and Codeberg are push mirrors. Issues and PRs are welcome on either platform.

[License: BSL 1.1](./LICENSE-BSL)
[CI](https://github.com/CircuitForge/peregrine/actions/workflows/ci.yml)

**AI-powered job search pipeline — by [Circuit Forge LLC](https://circuitforge.tech)**

> *"Don't be evil, for real and forever."*

Automates the full job search lifecycle: discovery → matching → cover letters → applications → interview prep.
Privacy-first, local-first. Your data never leaves your machine.

---

## Quick Start

**1. Clone and install dependencies** (Docker, NVIDIA toolkit if needed):

```bash
git clone https://git.opensourcesolarpunk.com/pyr0ball/peregrine
cd peregrine
./manage.sh setup
```

**2. Start Peregrine:**

```bash
./manage.sh start                        # remote profile (API-only, no GPU)
./manage.sh start --profile cpu          # local Ollama (CPU, or Metal GPU on Apple Silicon — see below)
./manage.sh start --profile single-gpu   # Ollama + Vision on GPU 0 (NVIDIA only)
./manage.sh start --profile dual-gpu     # Ollama + Vision + vLLM (GPU 0 + 1) (NVIDIA only)
```

Or use `make` directly:

```bash
make start                      # remote profile
make start PROFILE=single-gpu
```

**3.** Open http://localhost:8501 — the setup wizard guides you through the rest.

> **macOS / Apple Silicon:** Docker Desktop must be running. For Metal GPU-accelerated inference, install Ollama natively before starting — `setup.sh` will prompt you to do this. See [Apple Silicon GPU](#apple-silicon-gpu) below.

> **Windows:** Not supported — use WSL2 with Ubuntu.

### Installing to `/opt` or other system directories

If you clone into a root-owned directory (e.g. `sudo git clone ... /opt/peregrine`), two things need fixing:

**1. Git ownership warning** (`fatal: detected dubious ownership`) — `./manage.sh setup` fixes this automatically. If you need git to work *before* running setup:

```bash
git config --global --add safe.directory /opt/peregrine
```

**2. Preflight write access** — preflight writes `.env` and `compose.override.yml` into the repo directory. Fix ownership once:

```bash
sudo chown -R $USER:$USER /opt/peregrine
```

After that, run everything without `sudo`.

### Podman

Podman is rootless by default — **no `sudo` needed.** `./manage.sh setup` will configure `podman-compose` if it isn't already present.

### Docker

After `./manage.sh setup`, log out and back in for docker group membership to take effect; until then, prefix commands with `sudo`. After re-login, `sudo` is no longer needed.
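If you would rather not log out, the standard Docker post-install steps usually work in the current shell — a sketch, assuming `setup.sh` follows Docker's usual group-based setup:

```bash
# Convenience steps until your next login.
sudo usermod -aG docker "$USER"   # harmless if setup.sh already added you
newgrp docker                     # open a shell with the new group active
docker ps                         # should now work without sudo
```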
---

## Inference Profiles

| Profile | Services started | Use case |
|---------|-----------------|----------|
| `remote` | app + searxng | No GPU; LLM calls go to Anthropic / OpenAI |
| `cpu` | app + ollama + searxng | No GPU; local models on CPU. On Apple Silicon, use with native Ollama for Metal acceleration — see below. |
| `single-gpu` | app + ollama + vision + searxng | One **NVIDIA** GPU: cover letters, research, vision |
| `dual-gpu` | app + ollama + vllm + vision + searxng | Two **NVIDIA** GPUs: GPU 0 = Ollama, GPU 1 = vLLM |

### Apple Silicon GPU

Docker Desktop on macOS runs in a Linux VM — it cannot access the Apple GPU. Metal-accelerated inference requires Ollama to run **natively** on the host.

`setup.sh` handles this automatically: it offers to install Ollama via Homebrew, starts it as a background service, and explains what happens next. If Ollama is running on port 11434 when you start Peregrine, preflight detects it, stubs out the Docker Ollama container, and routes inference through the native process — which uses Metal automatically.

To do it manually:

```bash
brew install ollama
brew services start ollama        # starts at login, uses Metal GPU
./manage.sh start --profile cpu   # preflight adopts native Ollama; Docker container is skipped
```

The `cpu` profile label is a slight misnomer in this context — Ollama will be running on the GPU. `single-gpu` and `dual-gpu` profiles are NVIDIA-specific and not applicable on Mac.
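Before starting, you can sanity-check that the native Ollama service is actually listening — a quick check, assuming Ollama's default port 11434 and Homebrew service management:

```bash
brew services list | grep ollama         # should show "started"
curl -s http://localhost:11434/api/tags  # lists local models if Ollama is reachable
```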
---

## First-Run Wizard

On first launch the setup wizard walks through seven steps:

1. **Hardware** — detects NVIDIA GPUs (Linux) or Apple Silicon GPU (macOS) and recommends a profile
2. **Tier** — choose free, paid, or premium (or use `dev_tier_override` for local testing)
3. **Identity** — name, email, phone, LinkedIn, career summary
4. **Resume** — upload a PDF/DOCX for LLM parsing, or use the guided form builder
5. **Inference** — configure LLM backends and API keys
6. **Search** — job titles, locations, boards, keywords, blocklist
7. **Integrations** — optional cloud storage, calendar, and notification services

Wizard state is saved after each step — a crash or browser close resumes where you left off.
Re-enter the wizard any time via **Settings → Developer → Reset wizard**.

---

## Features

| Feature | Tier |
|---------|------|
| Job discovery (JobSpy + custom boards) | Free |
| Resume keyword matching & gap analysis | Free |
| Document storage sync (Google Drive, Dropbox, OneDrive, MEGA, Nextcloud) | Free |
| Webhook notifications (Discord, Home Assistant) | Free |
| **Cover letter generation** | Free with LLM¹ |
| **Company research briefs** | Free with LLM¹ |
| **Interview prep & practice Q&A** | Free with LLM¹ |
| **Survey assistant** (culture-fit Q&A, screenshot analysis) | Free with LLM¹ |
| **AI wizard helpers** (career summary, bullet expansion, skill suggestions) | Free with LLM¹ |
| Managed cloud LLM (no API key needed) | Paid |
| Email sync & auto-classification | Paid |
| Job tracking integrations (Notion, Airtable, Google Sheets) | Paid |
| Calendar sync (Google, Apple) | Paid |
| Slack notifications | Paid |
| CircuitForge shared cover-letter model | Paid |
| Cover letter model fine-tuning (your writing, your model) | Premium |
| Multi-user support | Premium |

¹ **BYOK unlock:** configure any LLM backend — a local [Ollama](https://ollama.com) or vLLM instance,
or your own API key (Anthropic, OpenAI-compatible) — and all AI features marked **Free with LLM**
unlock at no charge. The paid tier earns its price by providing managed cloud inference so you
don't need a key at all, plus integrations and email sync.

---

## Email Sync

Monitors your inbox for job-related emails and automatically updates job stages (interview requests, rejections, survey links, offers).

Configure in **Settings → Email**. Requires IMAP access and, for Gmail, an App Password.
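Under the hood the Settings page writes an email config into `config/email.yaml`; the app checks it for an IMAP host and username. As a rough sketch only — field names beyond those two are assumptions:

```bash
# Minimal sketch of config/email.yaml. Only imap_host and username are
# confirmed fields; the password key name is hypothetical.
cat > config/email.yaml <<'EOF'
imap_host: imap.gmail.com
username: you@example.com
password: "your-app-password"   # for Gmail, a Google App Password
EOF
```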
---

## Integrations

Connect external services in **Settings → Integrations**:

- **Job tracking:** Notion, Airtable, Google Sheets
- **Document storage:** Google Drive, Dropbox, OneDrive, MEGA, Nextcloud
- **Calendar:** Google Calendar, Apple Calendar (CalDAV)
- **Notifications:** Slack, Discord (webhook), Home Assistant

---

## CLI Reference (`manage.sh`)

`manage.sh` is the single entry point for all common operations — no need to remember Make targets or Docker commands.

```
./manage.sh setup               Install Docker/Podman + NVIDIA toolkit
./manage.sh start [--profile P] Preflight check then start services
./manage.sh stop                Stop all services
./manage.sh restart             Restart all services
./manage.sh status              Show running containers
./manage.sh logs [service]      Tail logs (default: app)
./manage.sh update              Pull latest images + rebuild app container
./manage.sh preflight           Check ports + resources; write .env
./manage.sh test                Run test suite
./manage.sh prepare-training    Scan docs for cover letters → training JSONL
./manage.sh finetune            Run LoRA fine-tune (needs --profile single-gpu+)
./manage.sh open                Open the web UI in your browser
./manage.sh clean               Remove containers, images, volumes (asks to confirm)
```

---

## Developer Docs

Full documentation at: https://docs.circuitforge.tech/peregrine

- [Installation guide](https://docs.circuitforge.tech/peregrine/getting-started/installation/)
- [Adding a custom job board scraper](https://docs.circuitforge.tech/peregrine/developer-guide/adding-scrapers/)
- [Adding an integration](https://docs.circuitforge.tech/peregrine/developer-guide/adding-integrations/)
- [Contributing](https://docs.circuitforge.tech/peregrine/developer-guide/contributing/)

---

## License

Core discovery pipeline: [MIT](LICENSE-MIT)
AI features (cover letter generation, company research, interview prep, UI): [BSL 1.1](LICENSE-BSL)

© 2026 Circuit Forge LLC
26 SECURITY.md Normal file
@ -0,0 +1,26 @@
# Security Policy

## Reporting a Vulnerability

**Do not open a GitHub or Codeberg issue for security vulnerabilities.**

Email: `security@circuitforge.tech`

Include:
- A description of the vulnerability
- Steps to reproduce
- Potential impact
- Any suggested fix (optional)

**Response target:** 72 hours for acknowledgement, 14 days for triage.

We follow responsible disclosure — we will coordinate a fix and release before any
public disclosure and will credit you in the release notes unless you prefer to remain
anonymous.

## Supported Versions

| Version | Supported |
|---------|-----------|
| Latest release | ✅ |
| Older releases | ❌ — please upgrade |
156 app/Home.py
@ -8,15 +8,81 @@ import sys
from pathlib import Path

import streamlit as st
+import yaml

sys.path.insert(0, str(Path(__file__).parent.parent))

-from scripts.db import DEFAULT_DB, init_db, get_job_counts, purge_jobs, purge_email_data, \
+from scripts.user_profile import UserProfile
+
+_USER_YAML = Path(__file__).parent.parent / "config" / "user.yaml"
+_profile = UserProfile(_USER_YAML) if UserProfile.exists(_USER_YAML) else None
+_name = _profile.name if _profile else "Job Seeker"
+
+from scripts.db import init_db, get_job_counts, purge_jobs, purge_email_data, \
    purge_non_remote, archive_jobs, kill_stuck_tasks, get_task_for_job, get_active_tasks, \
    insert_job, get_existing_urls
from scripts.task_runner import submit_task
+from app.cloud_session import resolve_session, get_db_path

-init_db(DEFAULT_DB)
+resolve_session("peregrine")
+init_db(get_db_path())
+
+
+def _email_configured() -> bool:
+    _e = Path(__file__).parent.parent / "config" / "email.yaml"
+    if not _e.exists():
+        return False
+    import yaml as _yaml
+    _cfg = _yaml.safe_load(_e.read_text()) or {}
+    return bool(_cfg.get("username") or _cfg.get("user") or _cfg.get("imap_host"))
+
+
+def _notion_configured() -> bool:
+    _n = Path(__file__).parent.parent / "config" / "notion.yaml"
+    if not _n.exists():
+        return False
+    import yaml as _yaml
+    _cfg = _yaml.safe_load(_n.read_text()) or {}
+    return bool(_cfg.get("token"))
+
+
+def _keywords_configured() -> bool:
+    _k = Path(__file__).parent.parent / "config" / "resume_keywords.yaml"
+    if not _k.exists():
+        return False
+    import yaml as _yaml
+    _cfg = _yaml.safe_load(_k.read_text()) or {}
+    return bool(_cfg.get("keywords") or _cfg.get("required") or _cfg.get("preferred"))
+
+
+_SETUP_BANNERS = [
+    {"key": "connect_cloud", "text": "Connect a cloud service for resume/cover letter storage",
+     "link_label": "Settings → Integrations",
+     "done": _notion_configured},
+    {"key": "setup_email", "text": "Set up email sync to catch recruiter outreach",
+     "link_label": "Settings → Email",
+     "done": _email_configured},
+    {"key": "setup_email_labels", "text": "Set up email label filters for auto-classification",
+     "link_label": "Settings → Email (label guide)",
+     "done": _email_configured},
+    {"key": "tune_mission", "text": "Tune your mission preferences for better cover letters",
+     "link_label": "Settings → My Profile"},
+    {"key": "configure_keywords", "text": "Configure keywords and blocklist for smarter search",
+     "link_label": "Settings → Search",
+     "done": _keywords_configured},
+    {"key": "upload_corpus", "text": "Upload your cover letter corpus for voice fine-tuning",
+     "link_label": "Settings → Fine-Tune"},
+    {"key": "configure_linkedin", "text": "Configure LinkedIn Easy Apply automation",
+     "link_label": "Settings → Integrations"},
+    {"key": "setup_searxng", "text": "Set up company research with SearXNG",
+     "link_label": "Settings → Services"},
+    {"key": "target_companies", "text": "Build a target company list for focused outreach",
+     "link_label": "Settings → Search"},
+    {"key": "setup_notifications", "text": "Set up notifications for stage changes",
+     "link_label": "Settings → Integrations"},
+    {"key": "tune_model", "text": "Tune a custom cover letter model on your writing",
+     "link_label": "Settings → Fine-Tune"},
+    {"key": "review_training", "text": "Review and curate training data for model tuning",
+     "link_label": "Settings → Fine-Tune"},
+    {"key": "setup_calendar", "text": "Set up calendar sync to track interview dates",
+     "link_label": "Settings → Integrations"},
+]


def _dismissible(key: str, status: str, msg: str) -> None:

@ -64,7 +130,7 @@ def _queue_url_imports(db_path: Path, urls: list) -> int:
    return queued


-st.title("🔍 Alex's Job Search")
+st.title(f"🔍 {_name}'s Job Search")
st.caption("Discover → Review → Sync to Notion")

st.divider()

@ -72,7 +138,7 @@ st.divider()
@st.fragment(run_every=10)
def _live_counts():
-    counts = get_job_counts(DEFAULT_DB)
+    counts = get_job_counts(get_db_path())
    col1, col2, col3, col4, col5 = st.columns(5)
    col1.metric("Pending Review", counts.get("pending", 0))
    col2.metric("Approved", counts.get("approved", 0))

@ -91,18 +157,18 @@ with left:
    st.subheader("Find New Jobs")
    st.caption("Scrapes all configured boards and adds new listings to your review queue.")

-    _disc_task = get_task_for_job(DEFAULT_DB, "discovery", 0)
+    _disc_task = get_task_for_job(get_db_path(), "discovery", 0)
    _disc_running = _disc_task and _disc_task["status"] in ("queued", "running")

    if st.button("🚀 Run Discovery", use_container_width=True, type="primary",
                 disabled=bool(_disc_running)):
-        submit_task(DEFAULT_DB, "discovery", 0)
+        submit_task(get_db_path(), "discovery", 0)
        st.rerun()

    if _disc_running:
        @st.fragment(run_every=4)
        def _disc_status():
-            t = get_task_for_job(DEFAULT_DB, "discovery", 0)
+            t = get_task_for_job(get_db_path(), "discovery", 0)
            if t and t["status"] in ("queued", "running"):
                lbl = "Queued…" if t["status"] == "queued" else "Scraping job boards… this may take a minute"
                st.info(f"⏳ {lbl}")

@ -120,18 +186,18 @@ with enrich_col:
    st.subheader("Enrich Descriptions")
    st.caption("Re-fetch missing descriptions for any listing (LinkedIn, Indeed, Glassdoor, Adzuna, The Ladders, generic).")

-    _enrich_task = get_task_for_job(DEFAULT_DB, "enrich_descriptions", 0)
+    _enrich_task = get_task_for_job(get_db_path(), "enrich_descriptions", 0)
    _enrich_running = _enrich_task and _enrich_task["status"] in ("queued", "running")

    if st.button("🔍 Fill Missing Descriptions", use_container_width=True, type="primary",
                 disabled=bool(_enrich_running)):
-        submit_task(DEFAULT_DB, "enrich_descriptions", 0)
+        submit_task(get_db_path(), "enrich_descriptions", 0)
        st.rerun()

    if _enrich_running:
        @st.fragment(run_every=4)
        def _enrich_status():
-            t = get_task_for_job(DEFAULT_DB, "enrich_descriptions", 0)
+            t = get_task_for_job(get_db_path(), "enrich_descriptions", 0)
            if t and t["status"] in ("queued", "running"):
                st.info("⏳ Fetching descriptions…")
            else:

@ -146,10 +212,10 @@ with enrich_col:
with mid:
    unscored = sum(1 for j in __import__("scripts.db", fromlist=["get_jobs_by_status"])
-                   .get_jobs_by_status(DEFAULT_DB, "pending")
+                   .get_jobs_by_status(get_db_path(), "pending")
                   if j.get("match_score") is None and j.get("description"))
    st.subheader("Score Listings")
-    st.caption(f"Run TF-IDF match scoring against Alex's resume. {unscored} pending job{'s' if unscored != 1 else ''} unscored.")
+    st.caption(f"Run TF-IDF match scoring against {_name}'s resume. {unscored} pending job{'s' if unscored != 1 else ''} unscored.")
    if st.button("📊 Score All Unscored Jobs", use_container_width=True, type="primary",
                 disabled=unscored == 0):
        with st.spinner("Scoring…"):

@ -167,7 +233,7 @@ with mid:
        st.rerun()

with right:
-    approved_count = get_job_counts(DEFAULT_DB).get("approved", 0)
+    approved_count = get_job_counts(get_db_path()).get("approved", 0)
    st.subheader("Send to Notion")
    st.caption("Push all approved jobs to your Notion tracking database.")
    if approved_count == 0:

@ -179,7 +245,7 @@ with right:
    ):
        with st.spinner("Syncing to Notion…"):
            from scripts.sync import sync_to_notion
-            count = sync_to_notion(DEFAULT_DB)
+            count = sync_to_notion(get_db_path())
            st.success(f"Synced {count} job{'s' if count != 1 else ''} to Notion!")
            st.rerun()

@ -194,18 +260,18 @@ with email_left:
               "New recruiter outreach is added to your Job Review queue.")

with email_right:
-    _email_task = get_task_for_job(DEFAULT_DB, "email_sync", 0)
+    _email_task = get_task_for_job(get_db_path(), "email_sync", 0)
    _email_running = _email_task and _email_task["status"] in ("queued", "running")

    if st.button("📧 Sync Emails", use_container_width=True, type="primary",
                 disabled=bool(_email_running)):
-        submit_task(DEFAULT_DB, "email_sync", 0)
+        submit_task(get_db_path(), "email_sync", 0)
        st.rerun()

    if _email_running:
        @st.fragment(run_every=4)
        def _email_status():
-            t = get_task_for_job(DEFAULT_DB, "email_sync", 0)
+            t = get_task_for_job(get_db_path(), "email_sync", 0)
            if t and t["status"] in ("queued", "running"):
                st.info("⏳ Syncing emails…")
            else:

@ -240,7 +306,7 @@ with url_tab:
                 disabled=not (url_text or "").strip()):
        _urls = [u.strip() for u in url_text.strip().splitlines() if u.strip().startswith("http")]
        if _urls:
-            _n = _queue_url_imports(DEFAULT_DB, _urls)
+            _n = _queue_url_imports(get_db_path(), _urls)
            if _n:
                st.success(f"Queued {_n} job{'s' if _n != 1 else ''} for import. Check Job Review shortly.")
            else:

@ -263,7 +329,7 @@ with csv_tab:
    if _csv_urls:
        st.caption(f"Found {len(_csv_urls)} URL(s) in CSV.")
        if st.button("📥 Import CSV Jobs", key="add_csv_btn", use_container_width=True):
-            _n = _queue_url_imports(DEFAULT_DB, _csv_urls)
+            _n = _queue_url_imports(get_db_path(), _csv_urls)
            st.success(f"Queued {_n} job{'s' if _n != 1 else ''} for import.")
            st.rerun()
    else:

@ -273,7 +339,7 @@ with csv_tab:
    @st.fragment(run_every=3)
    def _scrape_status():
        import sqlite3 as _sq
-        conn = _sq.connect(DEFAULT_DB)
+        conn = _sq.connect(get_db_path())
        conn.row_factory = _sq.Row
        rows = conn.execute(
            """SELECT bt.status, bt.error, j.title, j.company, j.url

@ -320,7 +386,7 @@ with st.expander("⚠️ Danger Zone", expanded=False):
        st.warning("Are you sure? This cannot be undone.")
        c1, c2 = st.columns(2)
        if c1.button("Yes, purge", type="primary", use_container_width=True):
-            deleted = purge_jobs(DEFAULT_DB, statuses=["pending", "rejected"])
+            deleted = purge_jobs(get_db_path(), statuses=["pending", "rejected"])
            st.success(f"Purged {deleted} jobs.")
            st.session_state.pop("confirm_purge", None)
            st.rerun()

@ -338,7 +404,7 @@ with st.expander("⚠️ Danger Zone", expanded=False):
        st.warning("This deletes all email contacts and email-sourced jobs. Cannot be undone.")
        c1, c2 = st.columns(2)
        if c1.button("Yes, purge emails", type="primary", use_container_width=True):
-            contacts, jobs = purge_email_data(DEFAULT_DB)
+            contacts, jobs = purge_email_data(get_db_path())
            st.success(f"Purged {contacts} email contacts, {jobs} email jobs.")
            st.session_state.pop("confirm_purge", None)
            st.rerun()

@ -347,11 +413,11 @@ with st.expander("⚠️ Danger Zone", expanded=False):
            st.rerun()

    with tasks_col:
-        _active = get_active_tasks(DEFAULT_DB)
+        _active = get_active_tasks(get_db_path())
        st.markdown("**Kill stuck tasks**")
        st.caption(f"Force-fail all queued/running background tasks. Currently **{len(_active)}** active.")
        if st.button("⏹ Kill All Tasks", use_container_width=True, disabled=len(_active) == 0):
-            killed = kill_stuck_tasks(DEFAULT_DB)
+            killed = kill_stuck_tasks(get_db_path())
            st.success(f"Killed {killed} task(s).")
            st.rerun()

@ -365,8 +431,8 @@ with st.expander("⚠️ Danger Zone", expanded=False):
        st.warning("This will delete ALL pending, approved, and rejected jobs, then re-scrape. Applied and synced records are kept.")
        c1, c2 = st.columns(2)
        if c1.button("Yes, wipe + scrape", type="primary", use_container_width=True):
-            purge_jobs(DEFAULT_DB, statuses=["pending", "approved", "rejected"])
-            submit_task(DEFAULT_DB, "discovery", 0)
+            purge_jobs(get_db_path(), statuses=["pending", "approved", "rejected"])
+            submit_task(get_db_path(), "discovery", 0)
            st.session_state.pop("confirm_purge", None)
            st.rerun()
        if c2.button("Cancel ", use_container_width=True):

@ -387,7 +453,7 @@ with st.expander("⚠️ Danger Zone", expanded=False):
        st.warning("Deletes all pending jobs. Rejected jobs are kept. Cannot be undone.")
        c1, c2 = st.columns(2)
        if c1.button("Yes, purge pending", type="primary", use_container_width=True):
-            deleted = purge_jobs(DEFAULT_DB, statuses=["pending"])
+            deleted = purge_jobs(get_db_path(), statuses=["pending"])
            st.success(f"Purged {deleted} pending jobs.")
            st.session_state.pop("confirm_purge", None)
            st.rerun()

@ -405,7 +471,7 @@ with st.expander("⚠️ Danger Zone", expanded=False):
        st.warning("Deletes all non-remote jobs not yet applied to. Cannot be undone.")
        c1, c2 = st.columns(2)
        if c1.button("Yes, purge on-site", type="primary", use_container_width=True):
-            deleted = purge_non_remote(DEFAULT_DB)
+            deleted = purge_non_remote(get_db_path())
            st.success(f"Purged {deleted} non-remote jobs.")
            st.session_state.pop("confirm_purge", None)
            st.rerun()

@ -423,7 +489,7 @@ with st.expander("⚠️ Danger Zone", expanded=False):
        st.warning("Deletes all approved-but-not-applied jobs. Cannot be undone.")
        c1, c2 = st.columns(2)
        if c1.button("Yes, purge approved", type="primary", use_container_width=True):
-            deleted = purge_jobs(DEFAULT_DB, statuses=["approved"])
+            deleted = purge_jobs(get_db_path(), statuses=["approved"])
            st.success(f"Purged {deleted} approved jobs.")
            st.session_state.pop("confirm_purge", None)
            st.rerun()

@ -448,7 +514,7 @@ with st.expander("⚠️ Danger Zone", expanded=False):
        st.info("Jobs will be archived (not deleted) — URLs are kept for dedup.")
        c1, c2 = st.columns(2)
        if c1.button("Yes, archive", type="primary", use_container_width=True):
-            archived = archive_jobs(DEFAULT_DB, statuses=["pending", "rejected"])
+            archived = archive_jobs(get_db_path(), statuses=["pending", "rejected"])
            st.success(f"Archived {archived} jobs.")
            st.session_state.pop("confirm_purge", None)
            st.rerun()

@ -466,10 +532,38 @@ with st.expander("⚠️ Danger Zone", expanded=False):
        st.info("Approved jobs will be archived (not deleted).")
        c1, c2 = st.columns(2)
        if c1.button("Yes, archive approved", type="primary", use_container_width=True):
-            archived = archive_jobs(DEFAULT_DB, statuses=["approved"])
+            archived = archive_jobs(get_db_path(), statuses=["approved"])
            st.success(f"Archived {archived} approved jobs.")
            st.session_state.pop("confirm_purge", None)
            st.rerun()
        if c2.button("Cancel ", use_container_width=True):
            st.session_state.pop("confirm_purge", None)
            st.rerun()

+
+# ── Setup banners ─────────────────────────────────────────────────────────────
+if _profile and _profile.wizard_complete:
+    _dismissed = set(_profile.dismissed_banners)
+    _pending_banners = [
+        b for b in _SETUP_BANNERS
+        if b["key"] not in _dismissed and not b.get("done", lambda: False)()
+    ]
+    if _pending_banners:
+        st.divider()
+        st.markdown("#### Finish setting up Peregrine")
+        for banner in _pending_banners:
+            _bcol, _bdismiss = st.columns([10, 1])
+            with _bcol:
+                _ic, _lc = st.columns([3, 1])
+                _ic.info(f"💡 {banner['text']}")
+                with _lc:
+                    st.write("")
+                    st.page_link("pages/2_Settings.py", label=banner['link_label'], icon="⚙️")
+            with _bdismiss:
+                st.write("")
+                if st.button("✕", key=f"dismiss_banner_{banner['key']}", help="Dismiss"):
+                    _data = yaml.safe_load(_USER_YAML.read_text()) if _USER_YAML.exists() else {}
+                    _data.setdefault("dismissed_banners", [])
+                    if banner["key"] not in _data["dismissed_banners"]:
+                        _data["dismissed_banners"].append(banner["key"])
+                    _USER_YAML.write_text(yaml.dump(_data, default_flow_style=False, allow_unicode=True))
+                    st.rerun()
0 app/__init__.py Normal file

92 app/app.py
@ -7,22 +7,32 @@ a "System" section so it doesn't crowd the navigation.
Run: streamlit run app/app.py
     bash scripts/manage-ui.sh start
"""
+import logging
+import os
+import subprocess
import sys
from pathlib import Path

sys.path.insert(0, str(Path(__file__).parent.parent))

+logging.basicConfig(level=logging.WARNING, format="%(name)s %(levelname)s: %(message)s")
+
+IS_DEMO = os.environ.get("DEMO_MODE", "").lower() in ("1", "true", "yes")
+
import streamlit as st
from scripts.db import DEFAULT_DB, init_db, get_active_tasks
+from app.feedback import inject_feedback_button
+from app.cloud_session import resolve_session, get_db_path, get_config_dir
import sqlite3

st.set_page_config(
-    page_title="Job Seeker",
+    page_title="Peregrine",
    page_icon="💼",
    layout="wide",
)

-init_db(DEFAULT_DB)
+resolve_session("peregrine")
+init_db(get_db_path())

# ── Startup cleanup — runs once per server process via cache_resource ──────────
@st.cache_resource

@ -32,12 +42,12 @@ def _startup() -> None:
    2. Auto-queues re-runs for any research generated without SearXNG data,
       if SearXNG is now reachable.
    """
-    conn = sqlite3.connect(DEFAULT_DB)
-    conn.execute(
-        "UPDATE background_tasks SET status='failed', error='Interrupted by server restart',"
-        " finished_at=datetime('now') WHERE status IN ('queued','running')"
-    )
-    conn.commit()
+    # Reset only in-flight tasks — queued tasks survive for the scheduler to resume.
+    # MUST run before any submit_task() call in this function.
+    from scripts.db import reset_running_tasks
+    reset_running_tasks(get_db_path())
+
+    conn = sqlite3.connect(get_db_path())

    # Auto-recovery: re-run LLM-only research when SearXNG is available
    try:

@ -53,7 +63,7 @@ def _startup() -> None:
            _ACTIVE_STAGES,
        ).fetchall()
        for (job_id,) in rows:
-            submit_task(str(DEFAULT_DB), "company_research", job_id)
+            submit_task(str(get_db_path()), "company_research", job_id)
    except Exception:
        pass  # never block startup

@ -61,6 +71,26 @@ def _startup() -> None:
_startup()

+# Silent license refresh on startup — no-op if unreachable
+try:
+    from scripts.license import refresh_if_needed as _refresh_license
+    _refresh_license()
+except Exception:
+    pass
+
+# ── First-run wizard gate ───────────────────────────────────────────────────────
+from scripts.user_profile import UserProfile as _UserProfile
+_USER_YAML = get_config_dir() / "user.yaml"
+
+_show_wizard = not IS_DEMO and (
+    not _UserProfile.exists(_USER_YAML)
+    or not _UserProfile(_USER_YAML).wizard_complete
+)
+if _show_wizard:
+    _setup_page = st.Page("pages/0_Setup.py", title="Setup", icon="👋")
+    st.navigation({"": [_setup_page]}).run()
+    st.stop()
+
# ── Navigation ─────────────────────────────────────────────────────────────────
# st.navigation() must be called before any sidebar writes so it can establish
# the navigation structure first; sidebar additions come after.

@ -85,7 +115,7 @@ pg = st.navigation(pages)
# The sidebar context WRAPS the fragment call — do not write to st.sidebar inside it.
@st.fragment(run_every=3)
def _task_indicator():
-    tasks = get_active_tasks(DEFAULT_DB)
+    tasks = get_active_tasks(get_db_path())
    if not tasks:
        return
    st.divider()

@ -105,6 +135,8 @@ def _task_indicator():
            label = "Enriching"
        elif task_type == "scrape_url":
            label = "Scraping URL"
+        elif task_type == "wizard_generate":
+            label = "Wizard generation"
        elif task_type == "enrich_craigslist":
            label = "Enriching listing"
        else:

@ -113,7 +145,47 @@ def _task_indicator():
        detail = f" · {stage}" if stage else (f" — {t.get('company')}" if t.get("company") else "")
        st.caption(f"{icon} {label}{detail}")

+
+@st.cache_resource
+def _get_version() -> str:
+    try:
+        return subprocess.check_output(
+            ["git", "describe", "--tags", "--always"],
+            cwd=Path(__file__).parent.parent,
+            text=True,
+        ).strip()
+    except Exception:
+        return "dev"
+
+
with st.sidebar:
+    if IS_DEMO:
+        st.info(
+            "**Public demo** — read-only sample data. "
+            "AI features and data saves are disabled.\n\n"
+            "[Get your own instance →](https://circuitforge.tech/software/peregrine)",
+            icon="🔒",
+        )
    _task_indicator()

+    # Cloud LLM indicator — shown whenever any cloud backend is active
+    _llm_cfg_path = Path(__file__).parent.parent / "config" / "llm.yaml"
+    try:
+        import yaml as _yaml
+        from scripts.byok_guard import cloud_backends as _cloud_backends
+        _active_cloud = _cloud_backends(_yaml.safe_load(_llm_cfg_path.read_text(encoding="utf-8")) or {})
+    except Exception:
+        _active_cloud = []
+    if _active_cloud:
+        _provider_names = ", ".join(b.replace("_", " ").title() for b in _active_cloud)
+        st.warning(
+            f"**Cloud LLM active**\n\n"
+            f"{_provider_names}\n\n"
+            "AI features send content to this provider. "
+            "[Change in Settings](2_Settings)",
+            icon="🔓",
+        )
+
+    st.divider()
+    st.caption(f"Peregrine {_get_version()}")
+    inject_feedback_button(page=pg.title)

pg.run()
187 app/cloud_session.py Normal file
@ -0,0 +1,187 @@
# peregrine/app/cloud_session.py
"""
Cloud session middleware for multi-tenant Peregrine deployment.

In local-first mode (CLOUD_MODE unset or false), all functions are no-ops.
In cloud mode (CLOUD_MODE=true), resolves the Directus session JWT from the
X-CF-Session header, validates it, and injects user_id + db_path into
st.session_state.

All Peregrine pages call get_db_path() instead of DEFAULT_DB directly to
transparently support both local and cloud deployments.
"""
import logging
import os
import re
import hmac
import hashlib
from pathlib import Path

import requests
import streamlit as st

from scripts.db import DEFAULT_DB

log = logging.getLogger(__name__)

CLOUD_MODE: bool = os.environ.get("CLOUD_MODE", "").lower() in ("1", "true", "yes")
CLOUD_DATA_ROOT: Path = Path(os.environ.get("CLOUD_DATA_ROOT", "/devl/menagerie-data"))
DIRECTUS_JWT_SECRET: str = os.environ.get("DIRECTUS_JWT_SECRET", "")
SERVER_SECRET: str = os.environ.get("CF_SERVER_SECRET", "")

# Heimdall license server — internal URL preferred when running on the same host
HEIMDALL_URL: str = os.environ.get("HEIMDALL_URL", "https://license.circuitforge.tech")
HEIMDALL_ADMIN_TOKEN: str = os.environ.get("HEIMDALL_ADMIN_TOKEN", "")


def _extract_session_token(cookie_header: str) -> str:
    """Extract cf_session value from a Cookie header string."""
    m = re.search(r'(?:^|;)\s*cf_session=([^;]+)', cookie_header)
    return m.group(1).strip() if m else ""


@st.cache_data(ttl=300, show_spinner=False)
def _fetch_cloud_tier(user_id: str, product: str) -> str:
    """Call Heimdall to resolve the current cloud tier for this user.

    Cached per (user_id, product) for 5 minutes to avoid hammering Heimdall
    on every Streamlit rerun. Returns "free" on any error so the app degrades
    gracefully rather than blocking the user.
    """
    if not HEIMDALL_ADMIN_TOKEN:
        log.warning("HEIMDALL_ADMIN_TOKEN not set — defaulting tier to free")
        return "free"
    try:
        resp = requests.post(
            f"{HEIMDALL_URL}/admin/cloud/resolve",
            json={"user_id": user_id, "product": product},
            headers={"Authorization": f"Bearer {HEIMDALL_ADMIN_TOKEN}"},
            timeout=5,
        )
        if resp.status_code == 200:
            return resp.json().get("tier", "free")
        if resp.status_code == 404:
            # No cloud key yet — user signed up before provision ran; return free.
            return "free"
        log.warning("Heimdall resolve returned %s — defaulting tier to free", resp.status_code)
    except Exception as exc:
        log.warning("Heimdall tier resolve failed: %s — defaulting to free", exc)
    return "free"


def validate_session_jwt(token: str) -> str:
    """Validate a Directus session JWT and return the user UUID. Raises on failure."""
    import jwt  # PyJWT — lazy import so local mode never needs it
    payload = jwt.decode(token, DIRECTUS_JWT_SECRET, algorithms=["HS256"])
    user_id = payload.get("id") or payload.get("sub")
    if not user_id:
        raise ValueError("JWT missing user id claim")
    return user_id


def _user_data_path(user_id: str, app: str) -> Path:
    return CLOUD_DATA_ROOT / user_id / app


def derive_db_key(user_id: str) -> str:
    """Derive a per-user SQLCipher encryption key from the server secret."""
    return hmac.new(
        SERVER_SECRET.encode(),
        user_id.encode(),
        hashlib.sha256,
    ).hexdigest()


def _render_auth_wall(message: str = "Please sign in to continue.") -> None:
    """Render a branded sign-in prompt and halt the page."""
    st.markdown(
        """
        <style>
        [data-testid="stSidebar"] { display: none; }
        [data-testid="collapsedControl"] { display: none; }
        </style>
        """,
        unsafe_allow_html=True,
    )
    col = st.columns([1, 2, 1])[1]
    with col:
        st.markdown("## 🦅 Peregrine")
        st.info(message, icon="🔒")
        st.link_button(
            "Sign in to CircuitForge",
            url=f"https://circuitforge.tech/login?next=/peregrine",
            use_container_width=True,
        )


def resolve_session(app: str = "peregrine") -> None:
    """
    Call at the top of each Streamlit page.
    In local mode: no-op.
    In cloud mode: reads X-CF-Session header, validates JWT, creates user
    data directory on first visit, and sets st.session_state keys:
      - user_id: str
      - db_path: Path
      - db_key: str (SQLCipher key for this user)
      - cloud_tier: str (free | paid | premium | ultra — resolved from Heimdall)
    Idempotent — skips if user_id already in session_state.
    """
    if not CLOUD_MODE:
        return
    if st.session_state.get("user_id"):
        return

    cookie_header = st.context.headers.get("x-cf-session", "")
    session_jwt = _extract_session_token(cookie_header)
    if not session_jwt:
        _render_auth_wall("Please sign in to access Peregrine.")
        st.stop()

    try:
        user_id = validate_session_jwt(session_jwt)
    except Exception:
        _render_auth_wall("Your session has expired. Please sign in again.")
        st.stop()

    user_path = _user_data_path(user_id, app)
    user_path.mkdir(parents=True, exist_ok=True)
    (user_path / "config").mkdir(exist_ok=True)
    (user_path / "data").mkdir(exist_ok=True)

    st.session_state["user_id"] = user_id
    st.session_state["db_path"] = user_path / "staging.db"
    st.session_state["db_key"] = derive_db_key(user_id)
    st.session_state["cloud_tier"] = _fetch_cloud_tier(user_id, app)


def get_db_path() -> Path:
    """
    Return the active db_path for this session.
    Cloud: user-scoped path from session_state.
    Local: DEFAULT_DB (from STAGING_DB env var or repo default).
    """
    return st.session_state.get("db_path", DEFAULT_DB)


def get_config_dir() -> Path:
    """
    Return the config directory for this session.
    Cloud: per-user path (<data_root>/<user_id>/peregrine/config/) so each
           user's YAML files (user.yaml, plain_text_resume.yaml, etc.) are
           isolated and never shared across tenants.
    Local: repo-level config/ directory.
    """
    if CLOUD_MODE and st.session_state.get("db_path"):
        return Path(st.session_state["db_path"]).parent / "config"
    return Path(__file__).parent.parent / "config"


def get_cloud_tier() -> str:
    """
    Return the current user's cloud tier.
    Cloud mode: resolved from Heimdall at session start (cached 5 min).
    Local mode: always returns "local" so pages can distinguish self-hosted from cloud.
    """
    if not CLOUD_MODE:
        return "local"
    return st.session_state.get("cloud_tier", "free")
1
app/components/__init__.py
Normal file
1
app/components/__init__.py
Normal file
|
|
@ -0,0 +1 @@
|
||||||
|
# app/components/__init__.py
|
||||||
192
app/components/linkedin_import.py
Normal file
192
app/components/linkedin_import.py
Normal file
|
|
@ -0,0 +1,192 @@
# app/components/linkedin_import.py
"""
Shared LinkedIn import widget.

Usage in a page:
    from app.components.linkedin_import import render_linkedin_tab

    # At top of page render — check for pending import:
    _li_data = st.session_state.pop("_linkedin_extracted", None)
    if _li_data:
        st.session_state["_parsed_resume"] = _li_data
        st.rerun()

    # Inside the LinkedIn tab:
    with tab_linkedin:
        render_linkedin_tab(config_dir=CONFIG_DIR, tier=tier)
"""
from __future__ import annotations

import json
import re
from datetime import datetime, timezone
from pathlib import Path

import streamlit as st

_LINKEDIN_PROFILE_RE = re.compile(r"https?://(www\.)?linkedin\.com/in/", re.I)


def _stage_path(config_dir: Path) -> Path:
    return config_dir / "linkedin_stage.json"


def _load_stage(config_dir: Path) -> dict | None:
    path = _stage_path(config_dir)
    if not path.exists():
        return None
    try:
        return json.loads(path.read_text())
    except Exception:
        return None


def _days_ago(iso_ts: str) -> str:
    try:
        dt = datetime.fromisoformat(iso_ts)
        delta = datetime.now(timezone.utc) - dt
        days = delta.days
        if days == 0:
            return "today"
        if days == 1:
            return "yesterday"
        return f"{days} days ago"
    except Exception:
        return "unknown"


def _do_scrape(url: str, config_dir: Path) -> None:
    """Validate URL, run scrape, update state."""
    if not _LINKEDIN_PROFILE_RE.match(url):
        st.error("Please enter a LinkedIn profile URL (linkedin.com/in/…)")
        return

    with st.spinner("Fetching LinkedIn profile… (10–20 seconds)"):
        try:
            from scripts.linkedin_scraper import scrape_profile
            scrape_profile(url, _stage_path(config_dir))
            st.success("Profile imported successfully.")
            st.rerun()
        except ValueError as e:
            st.error(str(e))
        except RuntimeError as e:
            st.warning(str(e))
        except Exception as e:
            st.error(f"Unexpected error: {e}")


def render_linkedin_tab(config_dir: Path, tier: str) -> None:
    """
    Render the LinkedIn import UI.

    When the user clicks "Use this data", writes the extracted dict to
    st.session_state["_linkedin_extracted"] and calls st.rerun().

    Caller reads: data = st.session_state.pop("_linkedin_extracted", None)
    """
    stage = _load_stage(config_dir)

    # ── Staged data status bar ────────────────────────────────────────────────
    if stage:
        scraped_at = stage.get("scraped_at", "")
        source_label = "LinkedIn export" if stage.get("source") == "export_zip" else "LinkedIn profile"
        col_info, col_refresh = st.columns([4, 1])
        col_info.caption(f"Last imported from {source_label}: {_days_ago(scraped_at)}")
        if col_refresh.button("🔄 Refresh", key="li_refresh"):
            url = stage.get("url")
            if url:
                _do_scrape(url, config_dir)
            else:
                st.info("Original URL not available — paste the URL below to re-import.")

    # ── URL import ────────────────────────────────────────────────────────────
    st.markdown("**Import from LinkedIn profile URL**")
    url_input = st.text_input(
        "LinkedIn profile URL",
        placeholder="https://linkedin.com/in/your-name",
        label_visibility="collapsed",
        key="li_url_input",
    )
    if st.button("🔗 Import from LinkedIn", key="li_import_btn", type="primary"):
        if not url_input.strip():
            st.warning("Please enter your LinkedIn profile URL.")
        else:
            _do_scrape(url_input.strip(), config_dir)

    st.caption(
        "Imports from your public LinkedIn profile. No login or credentials required. "
        "Scraping typically takes 10–20 seconds."
    )
    st.info(
        "**LinkedIn limits public profile data.** Without logging in, LinkedIn only "
        "exposes your name, About summary, current employer, and certifications — "
        "past roles, education, and skills are hidden behind their login wall. "
        "For your full career history use the **data export zip** option below.",
        icon="ℹ️",
    )

    # ── Section preview + use button ─────────────────────────────────────────
    if stage:
        from scripts.linkedin_parser import parse_stage
        extracted, err = parse_stage(_stage_path(config_dir))

        if err:
            st.warning(f"Could not read staged data: {err}")
        else:
            st.divider()
            st.markdown("**Preview**")
            col1, col2, col3 = st.columns(3)
            col1.metric("Experience entries", len(extracted.get("experience", [])))
            col2.metric("Skills", len(extracted.get("skills", [])))
            col3.metric("Certifications", len(extracted.get("achievements", [])))

            if extracted.get("career_summary"):
                with st.expander("Summary"):
                    st.write(extracted["career_summary"])

            if extracted.get("experience"):
                with st.expander(f"Experience ({len(extracted['experience'])} entries)"):
                    for exp in extracted["experience"]:
                        st.markdown(f"**{exp.get('title')}** @ {exp.get('company')} · {exp.get('date_range', '')}")

            if extracted.get("education"):
                with st.expander("Education"):
                    for edu in extracted["education"]:
                        st.markdown(f"**{edu.get('school')}** — {edu.get('degree')} {edu.get('field', '')}".strip())

            if extracted.get("skills"):
                with st.expander("Skills"):
                    st.write(", ".join(extracted["skills"]))

            st.divider()
            if st.button("✅ Use this data", key="li_use_btn", type="primary"):
                st.session_state["_linkedin_extracted"] = extracted
                st.rerun()

    # ── Advanced: data export ─────────────────────────────────────────────────
    with st.expander("⬇️ Import from LinkedIn data export (advanced)", expanded=False):
        st.caption(
            "Download your LinkedIn data: **Settings & Privacy → Data Privacy → "
            "Get a copy of your data → Request archive → Fast file**. "
            "The Fast file is available immediately and contains your profile, "
            "experience, education, and skills."
        )
        zip_file = st.file_uploader(
            "Upload LinkedIn export zip", type=["zip"], key="li_zip_upload"
        )
        if zip_file is not None:
            if st.button("📦 Parse export", key="li_parse_zip"):
                with st.spinner("Parsing export archive…"):
                    try:
                        from scripts.linkedin_scraper import parse_export_zip
                        extracted = parse_export_zip(
                            zip_file.read(), _stage_path(config_dir)
                        )
                        st.success(
                            f"Imported {len(extracted.get('experience', []))} experience entries, "
                            f"{len(extracted.get('skills', []))} skills. "
                            "Click 'Use this data' above to apply."
                        )
                        st.rerun()
                    except Exception as e:
                        st.error(f"Failed to parse export: {e}")
31  app/components/paste_image.py  Normal file
@@ -0,0 +1,31 @@
"""
Paste-from-clipboard / drag-and-drop image component.

Uses st.components.v1.declare_component so JS can return image bytes to Python
(st.components.v1.html() is one-way only). No build step required — the
frontend is a single index.html file.
"""
from __future__ import annotations

import base64
from pathlib import Path

import streamlit.components.v1 as components

_FRONTEND = Path(__file__).parent / "paste_image_ui"

_paste_image = components.declare_component("paste_image", path=str(_FRONTEND))


def paste_image_component(key: str | None = None) -> bytes | None:
    """
    Render the paste/drop zone. Returns PNG/JPEG bytes when an image is
    pasted or dropped, or None if nothing has been submitted yet.
    """
    result = _paste_image(key=key)
    if result:
        try:
            return base64.b64decode(result)
        except Exception:
            return None
    return None
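
For orientation, a minimal sketch of how a page can consume this component, mirroring the pattern app/feedback.py uses later in this diff; the widget key and variable names below are illustrative, not part of the commit:

    import streamlit as st
    from app.components.paste_image import paste_image_component

    # hypothetical caller — any Streamlit page in this app
    img_bytes = paste_image_component(key="screenshot_paste")
    if img_bytes:
        # bytes arrive already base64-decoded (PNG/JPEG)
        st.image(img_bytes, caption="Pasted image")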
142  app/components/paste_image_ui/index.html  Normal file
@@ -0,0 +1,142 @@
<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <style>
    * { box-sizing: border-box; margin: 0; padding: 0; }
    body {
      font-family: -apple-system, BlinkMacSystemFont, "Source Sans Pro", sans-serif;
      background: transparent;
    }
    .zone {
      width: 100%;
      min-height: 72px;
      border: 2px dashed var(--border, #ccc);
      border-radius: 8px;
      display: flex;
      align-items: center;
      justify-content: center;
      flex-direction: column;
      gap: 6px;
      padding: 12px 16px;
      cursor: pointer;
      outline: none;
      transition: border-color 0.15s, background 0.15s;
      color: var(--text-muted, #888);
      font-size: 13px;
      text-align: center;
      user-select: none;
    }
    .zone:focus { border-color: var(--primary, #ff4b4b); background: var(--primary-faint, rgba(255,75,75,0.06)); }
    .zone.dragover { border-color: var(--primary, #ff4b4b); background: var(--primary-faint, rgba(255,75,75,0.06)); }
    .zone.done { border-style: solid; border-color: #00c853; color: #00c853; }
    .icon { font-size: 22px; line-height: 1; }
    .hint { font-size: 11px; opacity: 0.7; }
    .status { margin-top: 5px; font-size: 11px; text-align: center; color: var(--text-muted, #888); min-height: 16px; }
  </style>
</head>
<body>
  <div class="zone" id="zone" tabindex="0" role="button"
       aria-label="Click to focus, then paste with Ctrl+V, or drag and drop an image">
    <span class="icon">📋</span>
    <span id="mainMsg"><strong>Click here</strong>, then <strong>Ctrl+V</strong> to paste</span>
    <span class="hint" id="hint">or drag & drop an image file</span>
  </div>
  <div class="status" id="status"></div>

  <script>
    const zone = document.getElementById('zone');
    const status = document.getElementById('status');
    const mainMsg = document.getElementById('mainMsg');
    const hint = document.getElementById('hint');

    // ── Streamlit handshake ─────────────────────────────────────────────────
    window.parent.postMessage({ type: "streamlit:componentReady", apiVersion: 1 }, "*");

    function setHeight() {
      const h = document.body.scrollHeight + 4;
      window.parent.postMessage({ type: "streamlit:setFrameHeight", height: h }, "*");
    }
    setHeight();

    // ── Theme ───────────────────────────────────────────────────────────────
    window.addEventListener("message", (e) => {
      if (e.data && e.data.type === "streamlit:render") {
        const t = e.data.args && e.data.args.theme;
        if (!t) return;
        const r = document.documentElement;
        r.style.setProperty("--primary", t.primaryColor || "#ff4b4b");
        r.style.setProperty("--primary-faint", (t.primaryColor || "#ff4b4b") + "10");
        r.style.setProperty("--text-muted", t.textColor ? t.textColor + "99" : "#888");
        r.style.setProperty("--border", t.textColor ? t.textColor + "33" : "#ccc");
        document.body.style.background = t.backgroundColor || "transparent";
      }
    });

    // ── Image handling ──────────────────────────────────────────────────────
    function markDone() {
      zone.classList.add('done');
      // Clear children and rebuild with safe DOM methods
      while (zone.firstChild) zone.removeChild(zone.firstChild);
      const icon = document.createElement('span');
      icon.className = 'icon';
      icon.textContent = '\u2705';
      const msg = document.createElement('span');
      msg.textContent = 'Image ready \u2014 remove or replace below';
      zone.appendChild(icon);
      zone.appendChild(msg);
      setHeight();
    }

    function sendImage(blob) {
      const reader = new FileReader();
      reader.onload = function(ev) {
        const dataUrl = ev.target.result;
        const b64 = dataUrl.slice(dataUrl.indexOf(',') + 1);
        window.parent.postMessage({ type: "streamlit:setComponentValue", value: b64 }, "*");
        markDone();
      };
      reader.readAsDataURL(blob);
    }

    function findImageItem(items) {
      if (!items) return null;
      for (let i = 0; i < items.length; i++) {
        if (items[i].type && items[i].type.indexOf('image/') === 0) return items[i];
      }
      return null;
    }

    // Ctrl+V paste (works over HTTP — uses paste event, not Clipboard API)
    document.addEventListener('paste', function(e) {
      const item = findImageItem(e.clipboardData && e.clipboardData.items);
      if (item) { sendImage(item.getAsFile()); e.preventDefault(); }
    });

    // Drag and drop
    zone.addEventListener('dragover', function(e) {
      e.preventDefault();
      zone.classList.add('dragover');
    });
    zone.addEventListener('dragleave', function() {
      zone.classList.remove('dragover');
    });
    zone.addEventListener('drop', function(e) {
      e.preventDefault();
      zone.classList.remove('dragover');
      const files = e.dataTransfer && e.dataTransfer.files;
      if (files && files.length) {
        for (let i = 0; i < files.length; i++) {
          if (files[i].type.indexOf('image/') === 0) { sendImage(files[i]); return; }
        }
      }
      // Fallback: dataTransfer items (e.g. dragged from browser)
      const item = findImageItem(e.dataTransfer && e.dataTransfer.items);
      if (item) sendImage(item.getAsFile());
    });

    // Click to focus so Ctrl+V lands in this iframe
    zone.addEventListener('click', function() { zone.focus(); });
  </script>
</body>
</html>
247  app/feedback.py  Normal file
@@ -0,0 +1,247 @@
"""
Floating feedback button + dialog — thin Streamlit shell.
All business logic lives in scripts/feedback_api.py.
"""
from __future__ import annotations

import os
import sys
from pathlib import Path

sys.path.insert(0, str(Path(__file__).parent.parent))

import streamlit as st

# ── CSS: float the button to the bottom-right corner ─────────────────────────
# Targets the button by its aria-label (set via `help=` parameter).
_FLOAT_CSS = """
<style>
button[aria-label="Send feedback or report a bug"] {
    position: fixed !important;
    bottom: 2rem !important;
    right: 2rem !important;
    z-index: 9999 !important;
    border-radius: 25px !important;
    padding: 0.5rem 1.25rem !important;
    box-shadow: 0 4px 16px rgba(0,0,0,0.25) !important;
    font-size: 0.9rem !important;
}
</style>
"""


@st.dialog("Send Feedback", width="large")
def _feedback_dialog(page: str) -> None:
    """Two-step feedback dialog: form → consent/attachments → submit."""
    from scripts.feedback_api import (
        collect_context, collect_logs, collect_listings,
        build_issue_body, create_forgejo_issue, upload_attachment,
    )
    from scripts.db import DEFAULT_DB

    # ── Initialise step counter ───────────────────────────────────────────────
    if "fb_step" not in st.session_state:
        st.session_state.fb_step = 1

    # ═════════════════════════════════════════════════════════════════════════
    # STEP 1 — Form
    # ═════════════════════════════════════════════════════════════════════════
    if st.session_state.fb_step == 1:
        st.subheader("What's on your mind?")

        fb_type = st.selectbox(
            "Type", ["Bug", "Feature Request", "Other"], key="fb_type"
        )
        fb_title = st.text_input(
            "Title", placeholder="Short summary of the issue or idea", key="fb_title"
        )
        fb_desc = st.text_area(
            "Description",
            placeholder="Describe what happened or what you'd like to see...",
            key="fb_desc",
        )
        if fb_type == "Bug":
            st.text_area(
                "Reproduction steps",
                placeholder="1. Go to...\n2. Click...\n3. See error",
                key="fb_repro",
            )

        col_cancel, _, col_next = st.columns([1, 3, 1])
        with col_cancel:
            if st.button("Cancel"):
                _clear_feedback_state()
                st.rerun()  # intentionally closes the dialog
        with col_next:
            if st.button("Next →", type="primary"):
                # Read widget values NOW (same rerun as the click — values are
                # available here even on first click). Copy to non-widget keys
                # so they survive step 2's render (Streamlit removes widget
                # state for widgets that are no longer rendered).
                title = fb_title.strip()
                desc = fb_desc.strip()
                if not title or not desc:
                    st.error("Please fill in both Title and Description.")
                else:
                    st.session_state.fb_data_type = fb_type
                    st.session_state.fb_data_title = title
                    st.session_state.fb_data_desc = desc
                    st.session_state.fb_data_repro = st.session_state.get("fb_repro", "")
                    st.session_state.fb_step = 2

    # ═════════════════════════════════════════════════════════════════════════
    # STEP 2 — Consent + attachments
    # ═════════════════════════════════════════════════════════════════════════
    elif st.session_state.fb_step == 2:
        st.subheader("Optional: attach diagnostic data")

        # ── Diagnostic data toggle + preview ─────────────────────────────────
        include_diag = st.toggle(
            "Include diagnostic data (logs + recent listings)", key="fb_diag"
        )
        if include_diag:
            with st.expander("Preview what will be sent", expanded=True):
                st.caption("**App logs (last 100 lines, PII masked):**")
                st.code(collect_logs(100), language=None)
                st.caption("**Recent listings (title / company / URL only):**")
                for j in collect_listings(DEFAULT_DB, 5):
                    st.write(f"- {j['title']} @ {j['company']} — {j['url']}")

        # ── Screenshot ────────────────────────────────────────────────────────
        st.divider()
        st.caption("**Screenshot** (optional)")

        from app.components.paste_image import paste_image_component

        # Keyed so we can reset the component when the user removes the image
        if "fb_paste_key" not in st.session_state:
            st.session_state.fb_paste_key = 0

        pasted = paste_image_component(key=f"fb_paste_{st.session_state.fb_paste_key}")
        if pasted:
            st.session_state.fb_screenshot = pasted

        st.caption("or upload a file:")
        uploaded = st.file_uploader(
            "Upload screenshot",
            type=["png", "jpg", "jpeg"],
            label_visibility="collapsed",
            key="fb_upload",
        )
        if uploaded:
            st.session_state.fb_screenshot = uploaded.read()

        if st.session_state.get("fb_screenshot"):
            st.image(
                st.session_state["fb_screenshot"],
                caption="Screenshot preview — this will be attached to the issue",
                use_container_width=True,
            )
            if st.button("🗑 Remove screenshot"):
                st.session_state.pop("fb_screenshot", None)
                st.session_state.fb_paste_key = st.session_state.get("fb_paste_key", 0) + 1
                # no st.rerun() — button click already re-renders the dialog

        # ── Attribution consent ───────────────────────────────────────────────
        st.divider()
        submitter: str | None = None
        try:
            import yaml
            _ROOT = Path(__file__).parent.parent
            user = yaml.safe_load((_ROOT / "config" / "user.yaml").read_text()) or {}
            name = (user.get("name") or "").strip()
            email = (user.get("email") or "").strip()
            if name or email:
                label = f"Include my name & email in the report: **{name}** ({email})"
                if st.checkbox(label, key="fb_attr"):
                    submitter = f"{name} <{email}>"
        except Exception:
            pass

        # ── Navigation ────────────────────────────────────────────────────────
        col_back, _, col_submit = st.columns([1, 3, 2])
        with col_back:
            if st.button("← Back"):
                st.session_state.fb_step = 1
                # no st.rerun() — button click already re-renders the dialog

        with col_submit:
            if st.button("Submit Feedback", type="primary"):
                _submit(page, include_diag, submitter, collect_context,
                        collect_logs, collect_listings, build_issue_body,
                        create_forgejo_issue, upload_attachment, DEFAULT_DB)


def _submit(page, include_diag, submitter, collect_context, collect_logs,
            collect_listings, build_issue_body, create_forgejo_issue,
            upload_attachment, db_path) -> None:
    """Handle form submission: build body, file issue, upload screenshot."""
    with st.spinner("Filing issue…"):
        context = collect_context(page)
        attachments: dict = {}
        if include_diag:
            attachments["logs"] = collect_logs(100)
            attachments["listings"] = collect_listings(db_path, 5)
        if submitter:
            attachments["submitter"] = submitter

        fb_type = st.session_state.get("fb_data_type", "Other")
        type_key = {"Bug": "bug", "Feature Request": "feature", "Other": "other"}.get(
            fb_type, "other"
        )
        labels = ["beta-feedback", "needs-triage"]
        labels.append(
            {"bug": "bug", "feature": "feature-request"}.get(type_key, "question")
        )

        form = {
            "type": type_key,
            "description": st.session_state.get("fb_data_desc", ""),
            "repro": st.session_state.get("fb_data_repro", "") if type_key == "bug" else "",
        }

        body = build_issue_body(form, context, attachments)

        try:
            result = create_forgejo_issue(
                st.session_state.get("fb_data_title", "Feedback"), body, labels
            )
            screenshot = st.session_state.get("fb_screenshot")
            if screenshot:
                upload_attachment(result["number"], screenshot)

            _clear_feedback_state()
            st.success(f"Issue filed! [View on Forgejo]({result['url']})")
            st.balloons()

        except Exception as exc:
            st.error(f"Failed to file issue: {exc}")


def _clear_feedback_state() -> None:
    for key in [
        "fb_step",
        "fb_type", "fb_title", "fb_desc", "fb_repro",  # widget keys
        "fb_data_type", "fb_data_title", "fb_data_desc", "fb_data_repro",  # saved data
        "fb_diag", "fb_upload", "fb_attr", "fb_screenshot", "fb_paste_key",
    ]:
        st.session_state.pop(key, None)


def inject_feedback_button(page: str = "Unknown") -> None:
    """
    Inject the floating feedback button. Call once per page render in app.py.
    Hidden automatically in DEMO_MODE.
    """
    if os.environ.get("DEMO_MODE", "").lower() in ("1", "true", "yes"):
        return
    if not os.environ.get("FORGEJO_API_TOKEN"):
        return  # silently skip if not configured

    st.markdown(_FLOAT_CSS, unsafe_allow_html=True)
    if st.button(
        "💬 Feedback",
        key="__feedback_floating_btn__",
        help="Send feedback or report a bug",
    ):
        _feedback_dialog(page)
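
As the docstring notes, the button is meant to be injected once per page render from app.py. A minimal sketch of that call site, with the page name purely illustrative:

    # hypothetical snippet inside app.py, after the page body has rendered
    from app.feedback import inject_feedback_button

    inject_feedback_button(page="Dashboard")  # no-op in DEMO_MODE or without FORGEJO_API_TOKEN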
744  app/pages/0_Setup.py  Normal file
@@ -0,0 +1,744 @@
"""
First-run setup wizard orchestrator.
Shown by app.py when user.yaml is absent OR wizard_complete is False.
Seven steps: hardware → tier → resume → identity → inference → search → integrations (optional).
Steps 1-6 are mandatory; step 7 is optional and can be skipped.
Each step writes to user.yaml on "Next" for crash recovery.
"""
from __future__ import annotations
import json
import sys
from pathlib import Path

sys.path.insert(0, str(Path(__file__).parent.parent.parent))

import streamlit as st
import yaml

from app.cloud_session import resolve_session, get_db_path, get_config_dir
resolve_session("peregrine")

_ROOT = Path(__file__).parent.parent.parent
CONFIG_DIR = get_config_dir()  # per-user dir in cloud; repo config/ locally
USER_YAML = CONFIG_DIR / "user.yaml"
STEPS = 6  # mandatory steps
STEP_LABELS = ["Hardware", "Tier", "Resume", "Identity", "Inference", "Search"]


# ── Helpers ────────────────────────────────────────────────────────────────────

def _load_yaml() -> dict:
    if USER_YAML.exists():
        return yaml.safe_load(USER_YAML.read_text()) or {}
    return {}


def _save_yaml(updates: dict) -> None:
    existing = _load_yaml()
    existing.update(updates)
    CONFIG_DIR.mkdir(parents=True, exist_ok=True)
    USER_YAML.write_text(
        yaml.dump(existing, default_flow_style=False, allow_unicode=True)
    )


def _detect_gpus() -> list[str]:
    """Detect GPUs. Prefers env vars written by preflight (works inside Docker)."""
    import os
    import subprocess
    # Preflight writes PEREGRINE_GPU_NAMES to .env; compose passes it to the container.
    # This is the reliable path when running inside Docker without nvidia-smi access.
    env_names = os.environ.get("PEREGRINE_GPU_NAMES", "").strip()
    if env_names:
        return [n.strip() for n in env_names.split(",") if n.strip()]
    # Fallback: try nvidia-smi directly (works when running bare or with GPU passthrough)
    try:
        out = subprocess.check_output(
            ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
            text=True, timeout=5,
        )
        return [l.strip() for l in out.strip().splitlines() if l.strip()]
    except Exception:
        return []


def _suggest_profile(gpus: list[str]) -> str:
    import os
    # If preflight already ran and wrote a profile recommendation, use it.
    recommended = os.environ.get("RECOMMENDED_PROFILE", "").strip()
    if recommended:
        return recommended
    if len(gpus) >= 2:
        return "dual-gpu"
    if len(gpus) == 1:
        return "single-gpu"
    return "remote"


def _submit_wizard_task(section: str, input_data: dict) -> int:
    """Submit a wizard_generate background task. Returns task_id."""
    from scripts.task_runner import submit_task
    params = json.dumps({"section": section, "input": input_data})
    task_id, _ = submit_task(get_db_path(), "wizard_generate", 0, params=params)
    return task_id


def _poll_wizard_task(section: str) -> dict | None:
    """Return the most recent wizard_generate task row for a given section, or None."""
    import sqlite3
    conn = sqlite3.connect(get_db_path())
    conn.row_factory = sqlite3.Row
    row = conn.execute(
        "SELECT * FROM background_tasks "
        "WHERE task_type='wizard_generate' AND params LIKE ? "
        "ORDER BY id DESC LIMIT 1",
        (f'%"section": "{section}"%',),
    ).fetchone()
    conn.close()
    return dict(row) if row else None


def _generation_widget(section: str, label: str, tier: str,
                       feature_key: str, input_data: dict) -> str | None:
    """Render a generation button + polling fragment.

    Returns the generated result string if completed and not yet applied, else None.
    Call this inside a step to add LLM generation support.
    The caller decides whether to auto-populate a field with the result.
    """
    from app.wizard.tiers import can_use, tier_label as tl, has_configured_llm

    _has_byok = has_configured_llm()
    if not can_use(tier, feature_key, has_byok=_has_byok):
        st.caption(f"{tl(feature_key, has_byok=_has_byok)} {label}")
        return None

    col_btn, col_fb = st.columns([2, 5])
    if col_btn.button(f"\u2728 {label}", key=f"gen_{section}"):
        _submit_wizard_task(section, input_data)
        st.rerun()

    with st.expander("\u270f\ufe0f Request changes (optional)", expanded=False):
        prev = st.session_state.get(f"_gen_result_{section}", "")
        feedback = st.text_area(
            "Describe what to change", key=f"_feedback_{section}",
            placeholder="e.g. Make it shorter and emphasise leadership",
            height=60,
        )
        if prev and st.button(f"\u21ba Regenerate with feedback", key=f"regen_{section}"):
            _submit_wizard_task(section, {**input_data,
                                          "previous_result": prev,
                                          "feedback": feedback})
            st.rerun()

    # Polling fragment
    result_key = f"_gen_result_{section}"

    @st.fragment(run_every=3)
    def _poll():
        task = _poll_wizard_task(section)
        if not task:
            return
        status = task.get("status")
        if status in ("queued", "running"):
            stage = task.get("stage") or "Queued"
            st.info(f"\u23f3 {stage}\u2026")
        elif status == "completed":
            payload = json.loads(task.get("error") or "{}")
            result = payload.get("result", "")
            if result and result != st.session_state.get(result_key):
                st.session_state[result_key] = result
                st.rerun()
        elif status == "failed":
            st.warning(f"Generation failed: {task.get('error', 'unknown error')}")

    _poll()

    return st.session_state.get(result_key)


# ── Wizard state init ──────────────────────────────────────────────────────────

if "wizard_step" not in st.session_state:
    saved = _load_yaml()
    last_completed = saved.get("wizard_step", 0)
    st.session_state.wizard_step = min(last_completed + 1, STEPS + 1)  # resume at next step

step = st.session_state.wizard_step
saved_yaml = _load_yaml()
_tier = saved_yaml.get("dev_tier_override") or saved_yaml.get("tier", "free")

st.title("\U0001f44b Welcome to Peregrine")
st.caption("Complete the setup to start your job search. Progress saves automatically.")
st.progress(
    min((step - 1) / STEPS, 1.0),
    text=f"Step {min(step, STEPS)} of {STEPS}" if step <= STEPS else "Almost done!",
)
st.divider()


# ── Step 1: Hardware ───────────────────────────────────────────────────────────
if step == 1:
    from app.cloud_session import CLOUD_MODE as _CLOUD_MODE
    if _CLOUD_MODE:
        # Cloud deployment: always single-gpu (Heimdall), skip hardware selection
        _save_yaml({"inference_profile": "single-gpu", "wizard_step": 1})
        st.session_state.wizard_step = 2
        st.rerun()

    from app.wizard.step_hardware import validate, PROFILES

    st.subheader("Step 1 \u2014 Hardware Detection")
    gpus = _detect_gpus()
    suggested = _suggest_profile(gpus)

    if gpus:
        st.success(f"Detected {len(gpus)} GPU(s): {', '.join(gpus)}")
    else:
        st.info("No NVIDIA GPUs detected. 'Remote' or 'CPU' mode recommended.")

    profile = st.selectbox(
        "Inference mode", PROFILES, index=PROFILES.index(suggested),
        help="Controls which Docker services start. Change later in Settings \u2192 Services.",
    )
    if profile in ("single-gpu", "dual-gpu") and not gpus:
        st.warning(
            "No GPUs detected \u2014 GPU profiles require the NVIDIA Container Toolkit. "
            "See README for install instructions."
        )

    if st.button("Next \u2192", type="primary", key="hw_next"):
        errs = validate({"inference_profile": profile})
        if errs:
            st.error("\n".join(errs))
        else:
            _save_yaml({"inference_profile": profile, "wizard_step": 1})
            st.session_state.wizard_step = 2
            st.rerun()


# ── Step 2: Tier ───────────────────────────────────────────────────────────────
elif step == 2:
    from app.cloud_session import CLOUD_MODE as _CLOUD_MODE
    if _CLOUD_MODE:
        # Cloud mode: tier already resolved from Heimdall at session init
        cloud_tier = st.session_state.get("cloud_tier", "free")
        _save_yaml({"tier": cloud_tier, "wizard_step": 2})
        st.session_state.wizard_step = 3
        st.rerun()

    from app.wizard.step_tier import validate

    st.subheader("Step 2 \u2014 Choose Your Plan")
    st.caption(
        "**Free** is fully functional for self-hosted local use. "
        "**Paid/Premium** unlock LLM-assisted features."
    )

    tier_options = {
        "free": "\U0001f193 **Free** \u2014 Local discovery, apply workspace, interviews kanban",
        "paid": "\U0001f4bc **Paid** \u2014 + AI career summary, company research, email classifier, calendar sync",
        "premium": "\u2b50 **Premium** \u2014 + Voice guidelines, model fine-tuning, multi-user",
    }
    from app.wizard.tiers import TIERS
    current_tier = saved_yaml.get("tier", "free")
    selected_tier = st.radio(
        "Plan",
        list(tier_options.keys()),
        format_func=lambda x: tier_options[x],
        index=TIERS.index(current_tier) if current_tier in TIERS else 0,
    )

    col_back, col_next = st.columns([1, 4])
    if col_back.button("\u2190 Back", key="tier_back"):
        st.session_state.wizard_step = 1
        st.rerun()
    if col_next.button("Next \u2192", type="primary", key="tier_next"):
        errs = validate({"tier": selected_tier})
        if errs:
            st.error("\n".join(errs))
        else:
            _save_yaml({"tier": selected_tier, "wizard_step": 2})
            st.session_state.wizard_step = 3
            st.rerun()


# ── Step 3: Resume ─────────────────────────────────────────────────────────────
elif step == 3:
    from app.wizard.step_resume import validate

    st.subheader("Step 3 \u2014 Resume")
    st.caption("Upload your resume for fast parsing, or build it section by section.")

    # Read LinkedIn import result before tabs render (spec: "at step render time")
    _li_data = st.session_state.pop("_linkedin_extracted", None)
    if _li_data:
        st.session_state["_parsed_resume"] = _li_data

    tab_upload, tab_builder, tab_linkedin = st.tabs([
        "\U0001f4ce Upload", "\U0001f4dd Build Manually", "\U0001f517 LinkedIn"
    ])

    with tab_upload:
        uploaded = st.file_uploader("Upload PDF, DOCX, or ODT", type=["pdf", "docx", "odt"])
        if uploaded and st.button("Parse Resume", type="primary", key="parse_resume"):
            from scripts.resume_parser import (
                extract_text_from_pdf, extract_text_from_docx,
                extract_text_from_odt, structure_resume,
            )
            file_bytes = uploaded.read()
            ext = uploaded.name.rsplit(".", 1)[-1].lower()
            if ext == "pdf":
                raw_text = extract_text_from_pdf(file_bytes)
            elif ext == "odt":
                raw_text = extract_text_from_odt(file_bytes)
            else:
                raw_text = extract_text_from_docx(file_bytes)
            with st.spinner("Parsing\u2026"):
                parsed, parse_err = structure_resume(raw_text)

            # Diagnostic: show raw extraction + detected fields regardless of outcome
            with st.expander("🔍 Parse diagnostics", expanded=not bool(parsed and any(
                parsed.get(k) for k in ("name", "experience", "skills")
            ))):
                st.caption("**Raw extracted text (first 800 chars)**")
                st.code(raw_text[:800] if raw_text else "(empty)", language="text")
                if parsed:
                    st.caption("**Detected fields**")
                    st.json({k: (v[:3] if isinstance(v, list) else v) for k, v in parsed.items()})

            if parsed and any(parsed.get(k) for k in ("name", "experience", "skills")):
                st.session_state["_parsed_resume"] = parsed
                st.session_state["_raw_resume_text"] = raw_text
                _save_yaml({"_raw_resume_text": raw_text[:8000]})
                st.success("Parsed! Review the builder tab to edit entries.")
            elif parsed:
                # Parsed but empty — show what we got and let them proceed or build manually
                st.session_state["_parsed_resume"] = parsed
                st.warning("Resume text was extracted but no fields were recognised. "
                           "Check the diagnostics above — the section headers may use unusual labels. "
                           "You can still fill in the Build tab manually.")
            else:
                st.warning("Auto-parse failed \u2014 switch to the Build tab and add entries manually.")
                if parse_err:
                    st.caption(f"Reason: {parse_err}")

    with tab_builder:
        parsed = st.session_state.get("_parsed_resume", {})
        experience = st.session_state.get(
            "_experience",
            parsed.get("experience") or saved_yaml.get("experience", []),
        )

        for i, entry in enumerate(experience):
            with st.expander(
                f"{entry.get('title', 'Entry')} @ {entry.get('company', '?')}",
                expanded=(i == len(experience) - 1),
            ):
                entry["company"] = st.text_input("Company", entry.get("company", ""), key=f"co_{i}")
                entry["title"] = st.text_input("Title", entry.get("title", ""), key=f"ti_{i}")
                raw_bullets = st.text_area(
                    "Responsibilities (one per line)",
                    "\n".join(entry.get("bullets", [])),
                    key=f"bu_{i}", height=80,
                )
                entry["bullets"] = [b.strip() for b in raw_bullets.splitlines() if b.strip()]
                if st.button("Remove entry", key=f"rm_{i}"):
                    experience.pop(i)
                    st.session_state["_experience"] = experience
                    st.rerun()

        if st.button("\uff0b Add work experience entry", key="add_exp"):
            experience.append({"company": "", "title": "", "bullets": []})
            st.session_state["_experience"] = experience
            st.rerun()

        # Bullet expansion generation
        if experience:
            all_bullets = "\n".join(
                b for e in experience for b in e.get("bullets", [])
            )
            _generation_widget(
                section="expand_bullets",
                label="Expand bullet points",
                tier=_tier,
                feature_key="llm_expand_bullets",
                input_data={"bullet_notes": all_bullets},
            )

    with tab_linkedin:
        from app.components.linkedin_import import render_linkedin_tab
        render_linkedin_tab(config_dir=CONFIG_DIR, tier=_tier)

    col_back, col_next = st.columns([1, 4])
    if col_back.button("\u2190 Back", key="resume_back"):
        st.session_state.wizard_step = 2
        st.rerun()
    if col_next.button("Next \u2192", type="primary", key="resume_next"):
        parsed = st.session_state.get("_parsed_resume", {})
        experience = (
            parsed.get("experience") or
            st.session_state.get("_experience", [])
        )
        errs = validate({"experience": experience})
        if errs:
            st.error("\n".join(errs))
        else:
            resume_yaml_path = CONFIG_DIR / "plain_text_resume.yaml"
            resume_yaml_path.parent.mkdir(parents=True, exist_ok=True)
            resume_data = {**parsed, "experience": experience} if parsed else {"experience": experience}
            resume_yaml_path.write_text(
                yaml.dump(resume_data, default_flow_style=False, allow_unicode=True)
            )
            _save_yaml({"wizard_step": 3})
            st.session_state.wizard_step = 4
            st.rerun()


# ── Step 4: Identity ───────────────────────────────────────────────────────────
elif step == 4:
    from app.wizard.step_identity import validate

    st.subheader("Step 4 \u2014 Your Identity")
    st.caption("Used in cover letter PDFs, LLM prompts, and the app header.")

    c1, c2 = st.columns(2)
    name = c1.text_input("Full Name *", saved_yaml.get("name", ""))
    email = c1.text_input("Email *", saved_yaml.get("email", ""))
    phone = c2.text_input("Phone", saved_yaml.get("phone", ""))
    linkedin = c2.text_input("LinkedIn URL", saved_yaml.get("linkedin", ""))

    # Career summary with optional LLM generation — resume text available now (step 3 ran first)
    summary_default = st.session_state.get("_gen_result_career_summary") or saved_yaml.get("career_summary", "")
    summary = st.text_area(
        "Career Summary *", value=summary_default, height=120,
        placeholder="Experienced professional with X years in [field]. Specialise in [skills].",
        help="Injected into cover letter and research prompts as your professional context.",
    )

    gen_result = _generation_widget(
        section="career_summary",
        label="Generate from resume",
        tier=_tier,
        feature_key="llm_career_summary",
        input_data={"resume_text": saved_yaml.get("_raw_resume_text", "")},
    )
    if gen_result and gen_result != summary:
        st.info(f"\u2728 Suggested summary \u2014 paste it above if it looks good:\n\n{gen_result}")

    col_back, col_next = st.columns([1, 4])
    if col_back.button("\u2190 Back", key="ident_back"):
        st.session_state.wizard_step = 3
        st.rerun()
    if col_next.button("Next \u2192", type="primary", key="ident_next"):
        errs = validate({"name": name, "email": email, "career_summary": summary})
        if errs:
            st.error("\n".join(errs))
        else:
            _save_yaml({
                "name": name, "email": email, "phone": phone,
                "linkedin": linkedin, "career_summary": summary,
                "wizard_complete": False, "wizard_step": 4,
            })
            st.session_state.wizard_step = 5
            st.rerun()


# ── Step 5: Inference ──────────────────────────────────────────────────────────
elif step == 5:
    from app.cloud_session import CLOUD_MODE as _CLOUD_MODE
    if _CLOUD_MODE:
        # Cloud deployment: inference is managed server-side; skip this step
        _save_yaml({"wizard_step": 5})
        st.session_state.wizard_step = 6
        st.rerun()

    from app.wizard.step_inference import validate

    st.subheader("Step 5 \u2014 Inference & API Keys")
    profile = saved_yaml.get("inference_profile", "remote")

    if profile == "remote":
        st.info("Remote mode: at least one external API key is required.")
        anthropic_key = st.text_input("Anthropic API Key", type="password", placeholder="sk-ant-\u2026")
        openai_url = st.text_input("OpenAI-compatible endpoint (optional)",
                                   placeholder="https://api.together.xyz/v1")
        openai_key = st.text_input("Endpoint API Key (optional)", type="password",
                                   key="oai_key") if openai_url else ""
    else:
        st.info(f"Local mode ({profile}): Ollama provides inference.")
        anthropic_key = openai_url = openai_key = ""

    with st.expander("Advanced \u2014 Service Ports & Hosts"):
        st.caption("Change only if services run on non-default ports or remote hosts.")
        svc = dict(saved_yaml.get("services", {}))
        for svc_name, default_host, default_port in [
            ("ollama", "ollama", 11434),    # Docker service name
            ("vllm", "vllm", 8000),         # Docker service name
            ("searxng", "searxng", 8080),   # Docker internal port (host-mapped: 8888)
        ]:
            c1, c2 = st.columns([3, 1])
            svc[f"{svc_name}_host"] = c1.text_input(
                f"{svc_name} host",
                svc.get(f"{svc_name}_host", default_host),
                key=f"h_{svc_name}",
            )
            svc[f"{svc_name}_port"] = int(c2.number_input(
                "port",
                value=int(svc.get(f"{svc_name}_port", default_port)),
                step=1, key=f"p_{svc_name}",
            ))

    confirmed = st.session_state.get("_inf_confirmed", False)
    test_label = "\U0001f50c Test Ollama connection" if profile != "remote" else "\U0001f50c Test LLM connection"
    if st.button(test_label, key="inf_test"):
        if profile == "remote":
            from scripts.llm_router import LLMRouter
            try:
                r = LLMRouter().complete("Reply with only: OK")
                if r and r.strip():
                    st.success("LLM responding.")
                    st.session_state["_inf_confirmed"] = True
                    confirmed = True
            except Exception as e:
                st.error(f"LLM test failed: {e}")
        else:
            import requests
            ollama_url = f"http://{svc.get('ollama_host','localhost')}:{svc.get('ollama_port',11434)}"
            try:
                requests.get(f"{ollama_url}/api/tags", timeout=5)
                st.success("Ollama is running.")
                st.session_state["_inf_confirmed"] = True
                confirmed = True
            except Exception:
                st.warning("Ollama not responding \u2014 you can skip this check and configure later.")
                st.session_state["_inf_confirmed"] = True
                confirmed = True

    col_back, col_next = st.columns([1, 4])
    if col_back.button("\u2190 Back", key="inf_back"):
        st.session_state.wizard_step = 4
        st.rerun()
    if col_next.button("Next \u2192", type="primary", key="inf_next", disabled=not confirmed):
        errs = validate({"endpoint_confirmed": confirmed})
        if errs:
            st.error("\n".join(errs))
        else:
            # Write API keys to .env
            env_path = _ROOT / ".env"
            env_lines = env_path.read_text().splitlines() if env_path.exists() else []

            def _set_env(lines: list[str], key: str, val: str) -> list[str]:
                for i, l in enumerate(lines):
                    if l.startswith(f"{key}="):
                        lines[i] = f"{key}={val}"
                        return lines
                lines.append(f"{key}={val}")
                return lines

            if anthropic_key:
                env_lines = _set_env(env_lines, "ANTHROPIC_API_KEY", anthropic_key)
            if openai_url:
                env_lines = _set_env(env_lines, "OPENAI_COMPAT_URL", openai_url)
            if openai_key:
                env_lines = _set_env(env_lines, "OPENAI_COMPAT_KEY", openai_key)
            if anthropic_key or openai_url:
                env_path.write_text("\n".join(env_lines) + "\n")

            _save_yaml({"services": svc, "wizard_step": 5})
            st.session_state.wizard_step = 6
            st.rerun()


# ── Step 6: Search ─────────────────────────────────────────────────────────────
elif step == 6:
    from app.wizard.step_search import validate

    st.subheader("Step 6 \u2014 Job Search Preferences")
    st.caption("Set up what to search for. You can refine these in Settings \u2192 Search later.")

    titles = st.session_state.get("_titles", saved_yaml.get("_wiz_titles", []))
    locations = st.session_state.get("_locations", saved_yaml.get("_wiz_locations", []))

    c1, c2 = st.columns(2)

    with c1:
        st.markdown("**Job Titles**")
        for i, t in enumerate(titles):
            tc1, tc2 = st.columns([5, 1])
            tc1.text(t)
            if tc2.button("\u00d7", key=f"rmtitle_{i}"):
                titles.pop(i)
                st.session_state["_titles"] = titles
                st.rerun()
        new_title = st.text_input("Add title", key="new_title_wiz",
                                  placeholder="Software Engineer, Product Manager\u2026")
        ac1, ac2 = st.columns([4, 1])
        if ac2.button("\uff0b", key="add_title"):
            if new_title.strip() and new_title.strip() not in titles:
                titles.append(new_title.strip())
                st.session_state["_titles"] = titles
                st.rerun()

        # LLM title suggestions
        _generation_widget(
            section="job_titles",
            label="Suggest job titles",
            tier=_tier,
            feature_key="llm_job_titles",
            input_data={
                "resume_text": saved_yaml.get("_raw_resume_text", ""),
                "current_titles": str(titles),
            },
        )

    with c2:
        st.markdown("**Locations**")
        for i, l in enumerate(locations):
            lc1, lc2 = st.columns([5, 1])
            lc1.text(l)
            if lc2.button("\u00d7", key=f"rmloc_{i}"):
                locations.pop(i)
                st.session_state["_locations"] = locations
                st.rerun()
        new_loc = st.text_input("Add location", key="new_loc_wiz",
                                placeholder="Remote, New York NY, San Francisco CA\u2026")
        ll1, ll2 = st.columns([4, 1])
        if ll2.button("\uff0b", key="add_loc"):
            if new_loc.strip():
                locations.append(new_loc.strip())
                st.session_state["_locations"] = locations
                st.rerun()

    col_back, col_next = st.columns([1, 4])
    if col_back.button("\u2190 Back", key="search_back"):
        st.session_state.wizard_step = 5
        st.rerun()
    if col_next.button("Next \u2192", type="primary", key="search_next"):
        errs = validate({"job_titles": titles, "locations": locations})
        if errs:
            st.error("\n".join(errs))
        else:
            search_profile_path = CONFIG_DIR / "search_profiles.yaml"
            existing_profiles = {}
            if search_profile_path.exists():
                existing_profiles = yaml.safe_load(search_profile_path.read_text()) or {}
            profiles_list = existing_profiles.get("profiles", [])
            # Update or create "default" profile
            default_idx = next(
                (i for i, p in enumerate(profiles_list) if p.get("name") == "default"), None
            )
            default_profile = {
                "name": "default",
                "job_titles": titles,
                "locations": locations,
                "remote_only": False,
                "boards": ["linkedin", "indeed", "glassdoor", "zip_recruiter"],
            }
            if default_idx is not None:
                profiles_list[default_idx] = default_profile
            else:
                profiles_list.insert(0, default_profile)
            search_profile_path.write_text(
                yaml.dump({"profiles": profiles_list},
                          default_flow_style=False, allow_unicode=True)
            )
            _save_yaml({"wizard_step": 6})
            st.session_state.wizard_step = 7
            st.rerun()


# ── Step 7: Integrations (optional) ───────────────────────────────────────────
elif step == 7:
    st.subheader("Step 7 \u2014 Integrations (Optional)")
    st.caption(
        "Connect cloud services, calendars, and notification tools. "
        "You can add or change these any time in Settings \u2192 Integrations."
    )

    from scripts.integrations import REGISTRY
    from app.wizard.step_integrations import get_available, is_connected
    from app.wizard.tiers import tier_label

    available = get_available(_tier)

    for name, cls in sorted(REGISTRY.items(), key=lambda x: (x[0] not in available, x[0])):
        is_conn = is_connected(name, CONFIG_DIR)
        icon = "\u2705" if is_conn else "\u25cb"
        lock = tier_label(f"{name}_sync") or tier_label(f"{name}_notifications")

        with st.expander(f"{icon} {cls.label} {lock}"):
            if name not in available:
                st.caption(f"Upgrade to {cls.tier} to unlock {cls.label}.")
                continue

            inst = cls()
            config: dict = {}
            for field in inst.fields():
                val = st.text_input(
                    field["label"],
                    type="password" if field["type"] == "password" else "default",
                    placeholder=field.get("placeholder", ""),
                    help=field.get("help", ""),
                    key=f"int_{name}_{field['key']}",
                )
                config[field["key"]] = val

            required_filled = all(
                config.get(f["key"])
                for f in inst.fields()
                if f.get("required")
            )
            if st.button(f"Connect {cls.label}", key=f"conn_{name}",
                         disabled=not required_filled):
                inst.connect(config)
                with st.spinner(f"Testing {cls.label} connection\u2026"):
                    if inst.test():
                        inst.save_config(config, CONFIG_DIR)
                        st.success(f"{cls.label} connected!")
                        st.rerun()
                    else:
                        st.error(
                            f"Connection test failed for {cls.label}. "
                            "Double-check your credentials."
                        )

    st.divider()
    col_back, col_skip, col_finish = st.columns([1, 1, 3])

    if col_back.button("\u2190 Back", key="int_back"):
        st.session_state.wizard_step = 6
        st.rerun()

    if col_skip.button("Skip \u2192"):
        st.session_state.wizard_step = 8  # trigger Finish
        st.rerun()

    if col_finish.button("\U0001f389 Finish Setup", type="primary", key="finish_btn"):
        st.session_state.wizard_step = 8
        st.rerun()


# ── Finish ─────────────────────────────────────────────────────────────────────
elif step >= 8:
    with st.spinner("Finalising setup\u2026"):
        from scripts.user_profile import UserProfile
        from scripts.generate_llm_config import apply_service_urls

        try:
            profile_obj = UserProfile(USER_YAML)
            if (CONFIG_DIR / "llm.yaml").exists():
                apply_service_urls(profile_obj, CONFIG_DIR / "llm.yaml")
        except Exception:
            pass  # don't block finish on llm.yaml errors

        data = _load_yaml()
        data["wizard_complete"] = True
        data.pop("wizard_step", None)
        USER_YAML.write_text(
            yaml.dump(data, default_flow_style=False, allow_unicode=True)
        )

    st.success("\u2705 Setup complete! Loading Peregrine\u2026")
    st.session_state.clear()
    st.rerun()
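
For orientation, the keys that the _save_yaml calls above accumulate into user.yaml by the time the finish step runs, shown as a Python dict with illustrative values (the real file is YAML written via yaml.dump, and the exact contents depend on which steps ran and on cloud mode):

    # illustrative shape only — not part of the diff
    user_yaml = {
        "inference_profile": "single-gpu",
        "tier": "paid",
        "name": "Jane Doe",
        "email": "jane@example.com",
        "phone": "555-0100",
        "linkedin": "https://linkedin.com/in/jane-doe",
        "career_summary": "Experienced professional with 10 years in data engineering.",
        "services": {"ollama_host": "ollama", "ollama_port": 11434},
        "wizard_complete": True,  # wizard_step is popped at finish
    }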
File diff suppressed because it is too large
|
|
@@ -1,191 +0,0 @@
# app/pages/3_Resume_Editor.py
"""
Resume Editor — form-based editor for Alex's AIHawk profile YAML.
FILL_IN fields highlighted in amber.
"""
import sys
from pathlib import Path
sys.path.insert(0, str(Path(__file__).parent.parent.parent))

import streamlit as st
import yaml

st.set_page_config(page_title="Resume Editor", page_icon="📝", layout="wide")
st.title("📝 Resume Editor")
st.caption("Edit Alex's application profile used by AIHawk for LinkedIn Easy Apply.")

RESUME_PATH = Path(__file__).parent.parent.parent / "aihawk" / "data_folder" / "plain_text_resume.yaml"

if not RESUME_PATH.exists():
    st.error(f"Resume file not found at `{RESUME_PATH}`. Is AIHawk cloned?")
    st.stop()

data = yaml.safe_load(RESUME_PATH.read_text()) or {}


def field(label: str, value: str, key: str, help: str = "", password: bool = False) -> str:
    """Render a text input, highlighted amber if value is FILL_IN or empty."""
    needs_attention = str(value).startswith("FILL_IN") or value == ""
    if needs_attention:
        st.markdown(
            '<p style="color:#F59E0B;font-size:0.8em;margin-bottom:2px">⚠️ Needs your attention</p>',
            unsafe_allow_html=True,
        )
    return st.text_input(label, value=value or "", key=key, help=help,
                         type="password" if password else "default")


st.divider()

# ── Personal Info ─────────────────────────────────────────────────────────────
with st.expander("👤 Personal Information", expanded=True):
    info = data.get("personal_information", {})
    col1, col2 = st.columns(2)
    with col1:
        name = field("First Name", info.get("name", ""), "pi_name")
        email = field("Email", info.get("email", ""), "pi_email")
        phone = field("Phone", info.get("phone", ""), "pi_phone")
        city = field("City", info.get("city", ""), "pi_city")
    with col2:
        surname = field("Last Name", info.get("surname", ""), "pi_surname")
        linkedin = field("LinkedIn URL", info.get("linkedin", ""), "pi_linkedin")
        zip_code = field("Zip Code", info.get("zip_code", ""), "pi_zip")
        dob = field("Date of Birth", info.get("date_of_birth", ""), "pi_dob",
                    help="Format: MM/DD/YYYY")

# ── Education ─────────────────────────────────────────────────────────────────
with st.expander("🎓 Education"):
    edu_list = data.get("education_details", [{}])
    updated_edu = []
    degree_options = ["Bachelor's Degree", "Master's Degree", "Some College",
                      "Associate's Degree", "High School", "Other"]
    for i, edu in enumerate(edu_list):
        st.markdown(f"**Entry {i+1}**")
        col1, col2 = st.columns(2)
        with col1:
            inst = field("Institution", edu.get("institution", ""), f"edu_inst_{i}")
            field_study = st.text_input("Field of Study", edu.get("field_of_study", ""), key=f"edu_field_{i}")
            start = st.text_input("Start Year", edu.get("start_date", ""), key=f"edu_start_{i}")
        with col2:
            current_level = edu.get("education_level", "Some College")
            level_idx = degree_options.index(current_level) if current_level in degree_options else 2
            level = st.selectbox("Degree Level", degree_options, index=level_idx, key=f"edu_level_{i}")
            end = st.text_input("Completion Year", edu.get("year_of_completion", ""), key=f"edu_end_{i}")
        updated_edu.append({
            "education_level": level, "institution": inst, "field_of_study": field_study,
            "start_date": start, "year_of_completion": end, "final_evaluation_grade": "", "exam": {},
        })
        st.divider()

# ── Experience ────────────────────────────────────────────────────────────────
with st.expander("💼 Work Experience"):
    exp_list = data.get("experience_details", [{}])
    if "exp_count" not in st.session_state:
        st.session_state.exp_count = len(exp_list)
    if st.button("+ Add Experience Entry"):
        st.session_state.exp_count += 1
        exp_list.append({})

    updated_exp = []
    for i in range(st.session_state.exp_count):
        exp = exp_list[i] if i < len(exp_list) else {}
        st.markdown(f"**Position {i+1}**")
        col1, col2 = st.columns(2)
        with col1:
            pos = field("Job Title", exp.get("position", ""), f"exp_pos_{i}")
            company = field("Company", exp.get("company", ""), f"exp_co_{i}")
            period = field("Employment Period", exp.get("employment_period", ""), f"exp_period_{i}",
                           help="e.g. 01/2022 - Present")
        with col2:
            location = st.text_input("Location", exp.get("location", ""), key=f"exp_loc_{i}")
            industry = st.text_input("Industry", exp.get("industry", ""), key=f"exp_ind_{i}")

        responsibilities = st.text_area(
            "Key Responsibilities (one per line)",
            value="\n".join(
                r.get(f"responsibility_{j+1}", "") if isinstance(r, dict) else str(r)
                for j, r in enumerate(exp.get("key_responsibilities", []))
            ),
            key=f"exp_resp_{i}", height=100,
        )
        skills = st.text_input(
            "Skills (comma-separated)",
            value=", ".join(exp.get("skills_acquired", [])),
            key=f"exp_skills_{i}",
        )
        resp_list = [{"responsibility_1": r.strip()} for r in responsibilities.splitlines() if r.strip()]
        skill_list = [s.strip() for s in skills.split(",") if s.strip()]
        updated_exp.append({
            "position": pos, "company": company, "employment_period": period,
            "location": location, "industry": industry,
            "key_responsibilities": resp_list, "skills_acquired": skill_list,
        })
        st.divider()

# ── Preferences ───────────────────────────────────────────────────────────────
with st.expander("⚙️ Preferences & Availability"):
    wp = data.get("work_preferences", {})
    sal = data.get("salary_expectations", {})
    avail = data.get("availability", {})
    col1, col2 = st.columns(2)
    with col1:
        salary_range = st.text_input("Salary Range (USD)", sal.get("salary_range_usd", ""),
                                     key="pref_salary", help="e.g. 120000 - 180000")
        notice = st.text_input("Notice Period", avail.get("notice_period", "2 weeks"), key="pref_notice")
    with col2:
        remote_work = st.checkbox("Open to Remote", value=wp.get("remote_work", "Yes") == "Yes", key="pref_remote")
        relocation = st.checkbox("Open to Relocation", value=wp.get("open_to_relocation", "No") == "Yes", key="pref_reloc")
        assessments = st.checkbox("Willing to complete assessments",
                                  value=wp.get("willing_to_complete_assessments", "Yes") == "Yes", key="pref_assess")
        bg_checks = st.checkbox("Willing to undergo background checks",
                                value=wp.get("willing_to_undergo_background_checks", "Yes") == "Yes", key="pref_bg")
        drug_tests = st.checkbox("Willing to undergo drug tests",
                                 value=wp.get("willing_to_undergo_drug_tests", "No") == "Yes", key="pref_drug")

# ── Self-ID ───────────────────────────────────────────────────────────────────
with st.expander("🏳️🌈 Self-Identification (optional)"):
    sid = data.get("self_identification", {})
    col1, col2 = st.columns(2)
    with col1:
        gender = st.text_input("Gender identity", sid.get("gender", "Non-binary"), key="sid_gender",
                               help="Select 'Non-binary' or 'Prefer not to say' when options allow")
        pronouns = st.text_input("Pronouns", sid.get("pronouns", "Any"), key="sid_pronouns")
        ethnicity = field("Ethnicity", sid.get("ethnicity", ""), "sid_ethnicity",
                          help="'Prefer not to say' is always an option")
    with col2:
        vet_options = ["No", "Yes", "Prefer not to say"]
        veteran = st.selectbox("Veteran status", vet_options,
                               index=vet_options.index(sid.get("veteran", "No")), key="sid_vet")
        dis_options = ["Prefer not to say", "No", "Yes"]
        disability = st.selectbox("Disability disclosure", dis_options,
                                  index=dis_options.index(sid.get("disability", "Prefer not to say")),
                                  key="sid_dis")

st.divider()

# ── Save ──────────────────────────────────────────────────────────────────────
if st.button("💾 Save Resume Profile", type="primary", use_container_width=True):
    data["personal_information"] = {
        **data.get("personal_information", {}),
        "name": name, "surname": surname, "email": email, "phone": phone,
        "city": city, "zip_code": zip_code, "linkedin": linkedin, "date_of_birth": dob,
    }
    data["education_details"] = updated_edu
    data["experience_details"] = updated_exp
    data["salary_expectations"] = {"salary_range_usd": salary_range}
    data["availability"] = {"notice_period": notice}
    data["work_preferences"] = {
        **data.get("work_preferences", {}),
        "remote_work": "Yes" if remote_work else "No",
        "open_to_relocation": "Yes" if relocation else "No",
        "willing_to_complete_assessments": "Yes" if assessments else "No",
        "willing_to_undergo_background_checks": "Yes" if bg_checks else "No",
        "willing_to_undergo_drug_tests": "Yes" if drug_tests else "No",
    }
    data["self_identification"] = {
        "gender": gender, "pronouns": pronouns, "veteran": veteran,
        "disability": disability, "ethnicity": ethnicity,
    }
    RESUME_PATH.write_text(yaml.dump(data, default_flow_style=False, allow_unicode=True))
    st.success("✅ Profile saved!")
    st.balloons()
@@ -14,19 +14,28 @@ import streamlit as st
 import streamlit.components.v1 as components
 import yaml

+from scripts.user_profile import UserProfile
+
+_USER_YAML = Path(__file__).parent.parent.parent / "config" / "user.yaml"
+_profile = UserProfile(_USER_YAML) if UserProfile.exists(_USER_YAML) else None
+_name = _profile.name if _profile else "Job Seeker"
+
 from scripts.db import (
     DEFAULT_DB, init_db, get_jobs_by_status,
     update_cover_letter, mark_applied, update_job_status,
     get_task_for_job,
 )
 from scripts.task_runner import submit_task
+from app.cloud_session import resolve_session, get_db_path
+from app.telemetry import log_usage_event

-DOCS_DIR = Path("/Library/Documents/JobSearch")
+DOCS_DIR = _profile.docs_dir if _profile else Path.home() / "Documents" / "JobSearch"
-RESUME_YAML = Path(__file__).parent.parent.parent / "aihawk" / "data_folder" / "plain_text_resume.yaml"
+RESUME_YAML = Path(__file__).parent.parent.parent / "config" / "plain_text_resume.yaml"

 st.title("🚀 Apply Workspace")

-init_db(DEFAULT_DB)
+resolve_session("peregrine")
+init_db(get_db_path())

 # ── PDF generation ─────────────────────────────────────────────────────────────
 def _make_cover_letter_pdf(job: dict, cover_letter: str, output_dir: Path) -> Path:

@@ -70,13 +79,16 @@ def _make_cover_letter_pdf(job: dict, cover_letter: str, output_dir: Path) -> Pa
         textColor=dark, leading=16, spaceAfter=12, alignment=TA_LEFT,
     )

+    display_name = _profile.name.upper() if _profile else "YOUR NAME"
+    contact_line = " · ".join(filter(None, [
+        _profile.email if _profile else "",
+        _profile.phone if _profile else "",
+        _profile.linkedin if _profile else "",
+    ]))
+
     story = [
-        Paragraph("ALEX RIVERA", name_style),
-        Paragraph(
-            "alex@example.com · (555) 867-5309 · "
-            "linkedin.com/in/AlexMcCann · hirealexmccann.site",
-            contact_style,
-        ),
+        Paragraph(display_name, name_style),
+        Paragraph(contact_line, contact_style),
         HRFlowable(width="100%", thickness=1, color=teal, spaceBefore=8, spaceAfter=0),
         Paragraph(datetime.now().strftime("%B %d, %Y"), date_style),
     ]

@@ -88,7 +100,7 @@ def _make_cover_letter_pdf(job: dict, cover_letter: str, output_dir: Path) -> Pa
     story += [
         Spacer(1, 6),
-        Paragraph("Warm regards,<br/><br/>Alex Rivera", body_style),
+        Paragraph(f"Warm regards,<br/><br/>{_profile.name if _profile else 'Your Name'}", body_style),
     ]

     doc.build(story)

@@ -96,7 +108,7 @@ def _make_cover_letter_pdf(job: dict, cover_letter: str, output_dir: Path) -> Pa
 # ── Application Q&A helper ─────────────────────────────────────────────────────
 def _answer_question(job: dict, question: str) -> str:
-    """Call the LLM to answer an application question in Alex's voice.
+    """Call the LLM to answer an application question in the user's voice.

     Uses research_fallback_order (claude_code → vllm → ollama_research)
     rather than the default cover-letter order — the fine-tuned cover letter

@@ -106,21 +118,22 @@ def _answer_question(job: dict, question: str) -> str:
     router = LLMRouter()
     fallback = router.config.get("research_fallback_order") or router.config.get("fallback_order")
     description_snippet = (job.get("description") or "")[:1200].strip()
-    prompt = f"""You are answering job application questions for Alex Rivera, a customer success leader.
+    _persona_summary = (
+        _profile.career_summary[:200] if _profile and _profile.career_summary
+        else "a professional with experience in their field"
+    )
+    prompt = f"""You are answering job application questions for {_name}.

 Background:
-- 6+ years in customer success, technical account management, and CS leadership
-- Most recent role: led Americas Customer Success at UpGuard (cybersecurity SaaS), NPS consistently ≥95
-- Also founder of M3 Consulting, a CS advisory practice for SaaS startups
-- Based in SF Bay Area; open to remote/hybrid; pronouns: any
+{_persona_summary}

-Role she's applying to: {job.get("title", "")} at {job.get("company", "")}
+Role they're applying to: {job.get("title", "")} at {job.get("company", "")}
 {f"Job description excerpt:{chr(10)}{description_snippet}" if description_snippet else ""}

 Application Question:
 {question}

-Answer in Alex's voice — specific, warm, and confident. If the question specifies a word or character limit, respect it. Answer only the question with no preamble or sign-off."""
+Answer in {_name}'s voice — specific, warm, and confident. If the question specifies a word or character limit, respect it. Answer only the question with no preamble or sign-off."""
     return router.complete(prompt, fallback_order=fallback).strip()

@@ -146,7 +159,7 @@ def _copy_btn(text: str, label: str = "📋 Copy", done: str = "✅ Copied!", he
 )

 # ── Job selection ──────────────────────────────────────────────────────────────
-approved = get_jobs_by_status(DEFAULT_DB, "approved")
+approved = get_jobs_by_status(get_db_path(), "approved")
 if not approved:
     st.info("No approved jobs — head to Job Review to approve some listings first.")
     st.stop()

@@ -209,17 +222,17 @@ with col_tools:
     if _cl_key not in st.session_state:
         st.session_state[_cl_key] = job.get("cover_letter") or ""

-    _cl_task = get_task_for_job(DEFAULT_DB, "cover_letter", selected_id)
+    _cl_task = get_task_for_job(get_db_path(), "cover_letter", selected_id)
     _cl_running = _cl_task and _cl_task["status"] in ("queued", "running")

     if st.button("✨ Generate / Regenerate", use_container_width=True, disabled=bool(_cl_running)):
-        submit_task(DEFAULT_DB, "cover_letter", selected_id)
+        submit_task(get_db_path(), "cover_letter", selected_id)
         st.rerun()

     if _cl_running:
         @st.fragment(run_every=3)
         def _cl_status_fragment():
-            t = get_task_for_job(DEFAULT_DB, "cover_letter", selected_id)
+            t = get_task_for_job(get_db_path(), "cover_letter", selected_id)
             if t and t["status"] in ("queued", "running"):
                 lbl = "Queued…" if t["status"] == "queued" else "Generating via LLM…"
                 st.info(f"⏳ {lbl}")

@@ -245,6 +258,32 @@ with col_tools:
         label_visibility="collapsed",
     )

+    # ── Iterative refinement ──────────────────────
+    if cl_text and not _cl_running:
+        with st.expander("✏️ Refine with Feedback"):
+            st.caption("Describe what to change. The current draft is passed to the LLM as context.")
+            _fb_key = f"fb_{selected_id}"
+            feedback_text = st.text_area(
+                "Feedback",
+                placeholder="e.g. Shorten the second paragraph and add a line about cross-functional leadership.",
+                height=80,
+                key=_fb_key,
+                label_visibility="collapsed",
+            )
+            if st.button("✨ Regenerate with Feedback", use_container_width=True,
+                         disabled=not (feedback_text or "").strip(),
+                         key=f"cl_refine_{selected_id}"):
+                import json as _json
+                submit_task(
+                    get_db_path(), "cover_letter", selected_id,
+                    params=_json.dumps({
+                        "previous_result": cl_text,
+                        "feedback": feedback_text.strip(),
+                    }),
+                )
+                st.session_state.pop(_fb_key, None)
+                st.rerun()
+
     # Copy + Save row
     c1, c2 = st.columns(2)
     with c1:

@@ -252,7 +291,7 @@ with col_tools:
         _copy_btn(cl_text, label="📋 Copy Letter")
     with c2:
         if st.button("💾 Save draft", use_container_width=True):
-            update_cover_letter(DEFAULT_DB, selected_id, cl_text)
+            update_cover_letter(get_db_path(), selected_id, cl_text)
             st.success("Saved!")

     # PDF generation

@@ -261,8 +300,10 @@ with col_tools:
         with st.spinner("Generating PDF…"):
             try:
                 pdf_path = _make_cover_letter_pdf(job, cl_text, DOCS_DIR)
-                update_cover_letter(DEFAULT_DB, selected_id, cl_text)
+                update_cover_letter(get_db_path(), selected_id, cl_text)
                 st.success(f"Saved: `{pdf_path.name}`")
+                if user_id := st.session_state.get("user_id"):
+                    log_usage_event(user_id, "peregrine", "cover_letter_generated")
             except Exception as e:
                 st.error(f"PDF error: {e}")

@@ -276,13 +317,15 @@ with col_tools:
     with c4:
         if st.button("✅ Mark as Applied", use_container_width=True, type="primary"):
             if cl_text:
-                update_cover_letter(DEFAULT_DB, selected_id, cl_text)
+                update_cover_letter(get_db_path(), selected_id, cl_text)
-            mark_applied(DEFAULT_DB, [selected_id])
+            mark_applied(get_db_path(), [selected_id])
             st.success("Marked as applied!")
+            if user_id := st.session_state.get("user_id"):
+                log_usage_event(user_id, "peregrine", "job_applied")
             st.rerun()

     if st.button("🚫 Reject listing", use_container_width=True):
-        update_job_status(DEFAULT_DB, [selected_id], "rejected")
+        update_job_status(get_db_path(), [selected_id], "rejected")
         # Advance selectbox to next job so list doesn't snap to first item
         current_idx = ids.index(selected_id) if selected_id in ids else 0
         if current_idx + 1 < len(ids):

@@ -346,7 +389,7 @@ with col_tools:
         st.markdown("---")
     else:
-        st.warning("Resume YAML not found — check that AIHawk is cloned.")
+        st.warning("Resume profile not found — complete setup or upload a resume in Settings → Resume Profile.")

     # ── Application Q&A ───────────────────────────────────────────────────────
     with st.expander("💬 Answer Application Questions"):
@@ -22,6 +22,12 @@ sys.path.insert(0, str(Path(__file__).parent.parent.parent))
 import streamlit as st

+from scripts.user_profile import UserProfile
+
+_USER_YAML = Path(__file__).parent.parent.parent / "config" / "user.yaml"
+_profile = UserProfile(_USER_YAML) if UserProfile.exists(_USER_YAML) else None
+_name = _profile.name if _profile else "Job Seeker"
+
 from scripts.db import (
     DEFAULT_DB, init_db,
     get_interview_jobs, advance_to_stage, reject_at_stage,

@@ -186,19 +192,21 @@ def _email_modal(job: dict) -> None:
     with st.spinner("Drafting…"):
         try:
             from scripts.llm_router import complete
+            _persona = (
+                f"{_name} is a {_profile.career_summary[:120] if _profile and _profile.career_summary else 'professional'}"
+            )
             draft = complete(
                 prompt=(
                     f"Draft a professional, warm reply to this email.\n\n"
                     f"From: {last.get('from_addr', '')}\n"
                     f"Subject: {last.get('subject', '')}\n\n"
                     f"{last.get('body', '')}\n\n"
-                    f"Context: Alex Rivera is a Customer Success / "
-                    f"Technical Account Manager applying for "
+                    f"Context: {_persona} applying for "
                     f"{job.get('title')} at {job.get('company')}."
                 ),
                 system=(
-                    "You are Alex Rivera's professional email assistant. "
-                    "Write concise, warm, and professional replies in her voice. "
+                    f"You are {_name}'s professional email assistant. "
+                    "Write concise, warm, and professional replies in their voice. "
                     "Keep it to 3–5 sentences unless more is needed."
                 ),
             )
@@ -13,6 +13,12 @@ sys.path.insert(0, str(Path(__file__).parent.parent.parent))
 import streamlit as st

+from scripts.user_profile import UserProfile
+
+_USER_YAML = Path(__file__).parent.parent.parent / "config" / "user.yaml"
+_profile = UserProfile(_USER_YAML) if UserProfile.exists(_USER_YAML) else None
+_name = _profile.name if _profile else "Job Seeker"
+
 from scripts.db import (
     DEFAULT_DB, init_db,
     get_interview_jobs, get_contacts, get_research,

@@ -231,7 +237,7 @@ with col_prep:
         system=(
             f"You are a recruiter at {job.get('company')} conducting "
             f"a phone screen for the {job.get('title')} role. "
-            f"Ask one question at a time. After Alex answers, give "
+            f"Ask one question at a time. After {_name} answers, give "
             f"brief feedback (1–2 sentences), then ask your next question. "
             f"Be professional but warm."
         ),

@@ -253,7 +259,7 @@ with col_prep:
         "content": (
             f"You are a recruiter at {job.get('company')} conducting "
             f"a phone screen for the {job.get('title')} role. "
-            f"Ask one question at a time. After Alex answers, give "
+            f"Ask one question at a time. After {_name} answers, give "
             f"brief feedback (1–2 sentences), then ask your next question."
         ),
     }

@@ -265,7 +271,7 @@ with col_prep:
     router = LLMRouter()
     # Build prompt from history for single-turn backends
     convo = "\n\n".join(
-        f"{'Interviewer' if m['role'] == 'assistant' else 'Alex'}: {m['content']}"
+        f"{'Interviewer' if m['role'] == 'assistant' else _name}: {m['content']}"
         for m in history
     )
     response = router.complete(

@@ -331,12 +337,12 @@ with col_context:
                     f"From: {last.get('from_addr', '')}\n"
                     f"Subject: {last.get('subject', '')}\n\n"
                     f"{last.get('body', '')}\n\n"
-                    f"Context: Alex is a CS/TAM professional applying "
+                    f"Context: {_name} is a professional applying "
                     f"for {job.get('title')} at {job.get('company')}."
                 ),
                 system=(
-                    "You are Alex Rivera's professional email assistant. "
-                    "Write concise, warm, and professional replies in her voice."
+                    f"You are {_name}'s professional email assistant. "
+                    "Write concise, warm, and professional replies in their voice."
                 ),
             )
             st.session_state[f"draft_{selected_id}"] = draft
app/telemetry.py (new file, 127 lines)
@@ -0,0 +1,127 @@
# peregrine/app/telemetry.py
"""
Usage event telemetry for cloud-hosted Peregrine.

In local-first mode (CLOUD_MODE unset/false), all functions are no-ops —
no network calls, no DB writes, no imports of psycopg2.

In cloud mode, events are written to the platform Postgres DB ONLY after
confirming the user's telemetry consent.

THE HARD RULE: if telemetry_consent.all_disabled is True for a user,
nothing is written, no exceptions. This function is the ONLY path to
usage_events — no feature may write there directly.
"""
import os
import json
from typing import Any

CLOUD_MODE: bool = os.environ.get("CLOUD_MODE", "").lower() in ("1", "true", "yes")
PLATFORM_DB_URL: str = os.environ.get("PLATFORM_DB_URL", "")

_platform_conn = None


def get_platform_conn():
    """Lazy psycopg2 connection to the platform Postgres DB. Reconnects if closed."""
    global _platform_conn
    if _platform_conn is None or _platform_conn.closed:
        import psycopg2
        _platform_conn = psycopg2.connect(PLATFORM_DB_URL)
    return _platform_conn


def get_consent(user_id: str) -> dict:
    """
    Fetch telemetry consent for the user.
    Returns safe defaults if record doesn't exist yet:
    - usage_events_enabled: True (new cloud users start opted-in, per onboarding disclosure)
    - all_disabled: False
    """
    conn = get_platform_conn()
    with conn.cursor() as cur:
        cur.execute(
            "SELECT all_disabled, usage_events_enabled "
            "FROM telemetry_consent WHERE user_id = %s",
            (user_id,)
        )
        row = cur.fetchone()
    if row is None:
        return {"all_disabled": False, "usage_events_enabled": True}
    return {"all_disabled": row[0], "usage_events_enabled": row[1]}


def log_usage_event(
    user_id: str,
    app: str,
    event_type: str,
    metadata: dict[str, Any] | None = None,
) -> None:
    """
    Write a usage event to the platform DB if consent allows.

    Silent no-op in local mode. Silent no-op if telemetry is disabled.
    Swallows all exceptions — telemetry must never crash the app.

    Args:
        user_id: Directus user UUID (from st.session_state["user_id"])
        app: App slug ('peregrine', 'falcon', etc.)
        event_type: Snake_case event label ('cover_letter_generated', 'job_applied', etc.)
        metadata: Optional JSON-serialisable dict — NO PII
    """
    if not CLOUD_MODE:
        return

    try:
        consent = get_consent(user_id)
        if consent.get("all_disabled") or not consent.get("usage_events_enabled", True):
            return

        conn = get_platform_conn()
        with conn.cursor() as cur:
            cur.execute(
                "INSERT INTO usage_events (user_id, app, event_type, metadata) "
                "VALUES (%s, %s, %s, %s)",
                (user_id, app, event_type, json.dumps(metadata) if metadata else None),
            )
        conn.commit()
    except Exception:
        # Telemetry must never crash the app
        pass


def update_consent(user_id: str, **fields) -> None:
    """
    UPSERT telemetry consent for a user.

    Accepted keyword args (all optional, any subset may be provided):
        all_disabled: bool
        usage_events_enabled: bool
        content_sharing_enabled: bool
        support_access_enabled: bool

    Safe to call in cloud mode only — no-op in local mode.
    Swallows all exceptions so the Settings UI is never broken by a DB hiccup.
    """
    if not CLOUD_MODE:
        return
    allowed = {"all_disabled", "usage_events_enabled", "content_sharing_enabled", "support_access_enabled"}
    cols = {k: v for k, v in fields.items() if k in allowed}
    if not cols:
        return
    try:
        conn = get_platform_conn()
        col_names = ", ".join(cols)
        placeholders = ", ".join(["%s"] * len(cols))
        set_clause = ", ".join(f"{k} = EXCLUDED.{k}" for k in cols)
        col_vals = list(cols.values())
        with conn.cursor() as cur:
            cur.execute(
                f"INSERT INTO telemetry_consent (user_id, {col_names}) "
                f"VALUES (%s, {placeholders}) "
                f"ON CONFLICT (user_id) DO UPDATE SET {set_clause}, updated_at = NOW()",
                [user_id] + col_vals,
            )
        conn.commit()
    except Exception:
        pass
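A minimal call-site sketch for the module above, assuming the Directus user id is already in st.session_state as the docstring describes; the event label and metadata keys here are illustrative, not a fixed schema:

# Hypothetical call site for app/telemetry.py; label and metadata are assumptions.
import streamlit as st
from app.telemetry import log_usage_event, update_consent

user_id = st.session_state.get("user_id")
if user_id:
    # No-op when CLOUD_MODE is unset, and whenever consent disallows the write.
    log_usage_event(user_id, "peregrine", "cover_letter_generated",
                    metadata={"source": "apply_workspace"})
    # Settings toggle: opt the user out of everything in one call.
    update_consent(user_id, all_disabled=True)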
app/wizard/__init__.py (new file, 0 lines)

app/wizard/step_hardware.py (new file, 14 lines)
@@ -0,0 +1,14 @@
"""Step 1 — Hardware detection and inference profile selection."""

PROFILES = ["remote", "cpu", "single-gpu", "dual-gpu"]


def validate(data: dict) -> list[str]:
    """Return list of validation errors. Empty list = step passes."""
    errors = []
    profile = data.get("inference_profile", "")
    if not profile:
        errors.append("Inference profile is required.")
    elif profile not in PROFILES:
        errors.append(f"Invalid inference profile '{profile}'. Choose: {', '.join(PROFILES)}.")
    return errors
app/wizard/step_identity.py (new file, 13 lines)
@@ -0,0 +1,13 @@
"""Step 3 — Identity (name, email, phone, linkedin, career_summary)."""


def validate(data: dict) -> list[str]:
    """Return list of validation errors. Empty list = step passes."""
    errors = []
    if not (data.get("name") or "").strip():
        errors.append("Full name is required.")
    if not (data.get("email") or "").strip():
        errors.append("Email address is required.")
    if not (data.get("career_summary") or "").strip():
        errors.append("Career summary is required.")
    return errors
app/wizard/step_inference.py (new file, 9 lines)
@@ -0,0 +1,9 @@
"""Step 5 — LLM inference backend configuration and key entry."""


def validate(data: dict) -> list[str]:
    """Return list of validation errors. Empty list = step passes."""
    errors = []
    if not data.get("endpoint_confirmed"):
        errors.append("At least one working LLM endpoint must be confirmed.")
    return errors
app/wizard/step_integrations.py (new file, 36 lines)
@@ -0,0 +1,36 @@
"""Step 7 — Optional integrations (cloud storage, calendars, notifications).

This step is never mandatory — validate() always returns [].
Helper functions support the wizard UI for tier-filtered integration cards.
"""
from __future__ import annotations
from pathlib import Path


def validate(data: dict) -> list[str]:
    """Integrations step is optional — never blocks Finish."""
    return []


def get_available(tier: str) -> list[str]:
    """Return list of integration names available for the given tier.

    An integration is available if the user's tier meets or exceeds the
    integration's minimum required tier (as declared by cls.tier).
    """
    from scripts.integrations import REGISTRY
    from app.wizard.tiers import TIERS

    available = []
    for name, cls in REGISTRY.items():
        try:
            if TIERS.index(tier) >= TIERS.index(cls.tier):
                available.append(name)
        except ValueError:
            pass  # unknown tier string — skip
    return available


def is_connected(name: str, config_dir: Path) -> bool:
    """Return True if a live config file exists for this integration."""
    return (config_dir / "integrations" / f"{name}.yaml").exists()
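A hedged sketch of how a wizard page might consume these helpers, assuming REGISTRY entries expose a label attribute as the integrations step of the wizard does; the CONFIG_DIR value and the rendering are assumptions for the example:

# Illustrative wizard call site; CONFIG_DIR and the layout are assumptions.
from pathlib import Path
import streamlit as st
from scripts.integrations import REGISTRY
from app.wizard import step_integrations

CONFIG_DIR = Path("config")  # assumed location for this sketch
tier = st.session_state.get("tier", "free")

for name in step_integrations.get_available(tier):
    cls = REGISTRY[name]
    connected = step_integrations.is_connected(name, CONFIG_DIR)
    st.write(f"{cls.label}: {'connected' if connected else 'not connected'}")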
app/wizard/step_resume.py (new file, 10 lines)
@@ -0,0 +1,10 @@
"""Step 4 — Resume (upload or guided form builder)."""


def validate(data: dict) -> list[str]:
    """Return list of validation errors. Empty list = step passes."""
    errors = []
    experience = data.get("experience") or []
    if not experience:
        errors.append("At least one work experience entry is required.")
    return errors
app/wizard/step_search.py (new file, 13 lines)
@@ -0,0 +1,13 @@
"""Step 6 — Job search preferences (titles, locations, boards, keywords)."""


def validate(data: dict) -> list[str]:
    """Return list of validation errors. Empty list = step passes."""
    errors = []
    titles = data.get("job_titles") or []
    locations = data.get("locations") or []
    if not titles:
        errors.append("At least one job title is required.")
    if not locations:
        errors.append("At least one location is required.")
    return errors
app/wizard/step_tier.py (new file, 13 lines)
@@ -0,0 +1,13 @@
"""Step 2 — Tier selection (free / paid / premium)."""
from app.wizard.tiers import TIERS


def validate(data: dict) -> list[str]:
    """Return list of validation errors. Empty list = step passes."""
    errors = []
    tier = data.get("tier", "")
    if not tier:
        errors.append("Tier selection is required.")
    elif tier not in TIERS:
        errors.append(f"Invalid tier '{tier}'. Choose: {', '.join(TIERS)}.")
    return errors
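The step modules above all share one contract: validate(data) returns a list of error strings, and an empty list means the step passes. A sketch of a driver loop built on that contract; the step ordering, session-state keys, and button wiring are assumptions, not the wizard's actual implementation:

# Illustrative driver loop; ordering and session keys are assumptions.
import streamlit as st
from app.wizard import (
    step_hardware, step_tier, step_identity, step_resume,
    step_inference, step_search, step_integrations,
)

STEPS = [step_hardware, step_tier, step_identity, step_resume,
         step_inference, step_search, step_integrations]

idx = st.session_state.get("wizard_step", 1) - 1
errors = STEPS[idx].validate(st.session_state.get("wizard_data", {}))
for err in errors:
    st.error(err)
if st.button("Next", disabled=bool(errors)):
    st.session_state.wizard_step = idx + 2
    st.rerun()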
app/wizard/tiers.py (new file, 160 lines)
@@ -0,0 +1,160 @@
"""
Tier definitions and feature gates for Peregrine.

Tiers: free < paid < premium
FEATURES maps feature key → minimum tier required.
Features not in FEATURES are available to all tiers (free).

BYOK policy
-----------
Features in BYOK_UNLOCKABLE are gated only because CircuitForge would otherwise
be providing the LLM compute. When a user has any configured LLM backend (local
ollama/vllm or their own API key), those features unlock regardless of tier.
Pass has_byok=has_configured_llm() to can_use() at call sites.

Features that stay gated even with BYOK:
- Integrations (Notion sync, calendars, etc.) — infrastructure we run
- llm_keywords_blocklist — orchestration pipeline over background keyword data
- email_classifier — training pipeline, not a single LLM call
- shared_cover_writer_model — our fine-tuned model weights
- model_fine_tuning — GPU infrastructure
- multi_user — account infrastructure
"""
from __future__ import annotations

from pathlib import Path

TIERS = ["free", "paid", "premium"]

# Maps feature key → minimum tier string required.
# Features absent from this dict are free (available to all).
FEATURES: dict[str, str] = {
    # Wizard LLM generation — BYOK-unlockable (pure LLM calls)
    "llm_career_summary": "paid",
    "llm_expand_bullets": "paid",
    "llm_suggest_skills": "paid",
    "llm_voice_guidelines": "premium",
    "llm_job_titles": "paid",
    "llm_mission_notes": "paid",

    # Orchestration — stays gated (background data pipeline, not just an LLM call)
    "llm_keywords_blocklist": "paid",

    # App features — BYOK-unlockable (pure LLM calls over job/profile data)
    "company_research": "paid",
    "interview_prep": "paid",
    "survey_assistant": "paid",

    # Orchestration / infrastructure — stays gated
    "email_classifier": "paid",
    "model_fine_tuning": "premium",
    "shared_cover_writer_model": "paid",
    "multi_user": "premium",

    # Integrations — stays gated (infrastructure CircuitForge operates)
    "notion_sync": "paid",
    "google_sheets_sync": "paid",
    "airtable_sync": "paid",
    "google_calendar_sync": "paid",
    "apple_calendar_sync": "paid",
    "slack_notifications": "paid",
}

# Features that unlock when the user supplies any LLM backend (local or BYOK).
# These are pure LLM-call features — the only reason they're behind a tier is
# because CircuitForge would otherwise be providing the compute.
BYOK_UNLOCKABLE: frozenset[str] = frozenset({
    "llm_career_summary",
    "llm_expand_bullets",
    "llm_suggest_skills",
    "llm_voice_guidelines",
    "llm_job_titles",
    "llm_mission_notes",
    "company_research",
    "interview_prep",
    "survey_assistant",
})

# Free integrations (not in FEATURES):
# google_drive_sync, dropbox_sync, onedrive_sync, mega_sync,
# nextcloud_sync, discord_notifications, home_assistant

_LLM_CFG = Path(__file__).parent.parent.parent / "config" / "llm.yaml"


def has_configured_llm(config_path: Path | None = None) -> bool:
    """Return True if at least one non-vision LLM backend is enabled in llm.yaml.

    Local backends (ollama, vllm) count — the policy is "you're providing the
    compute", whether that's your own hardware or your own API key.
    """
    import yaml
    path = config_path or _LLM_CFG
    try:
        with open(path) as f:
            cfg = yaml.safe_load(f) or {}
        return any(
            b.get("enabled", True) and b.get("type") != "vision_service"
            for b in cfg.get("backends", {}).values()
        )
    except Exception:
        return False


def can_use(tier: str, feature: str, has_byok: bool = False) -> bool:
    """Return True if the given tier has access to the feature.

    has_byok: pass has_configured_llm() to unlock BYOK_UNLOCKABLE features
        for users who supply their own LLM backend regardless of tier.

    Returns True for unknown features (not gated).
    Returns False for unknown/invalid tier strings.
    """
    required = FEATURES.get(feature)
    if required is None:
        return True  # not gated — available to all
    if has_byok and feature in BYOK_UNLOCKABLE:
        return True
    try:
        return TIERS.index(tier) >= TIERS.index(required)
    except ValueError:
        return False  # invalid tier string


def tier_label(feature: str, has_byok: bool = False) -> str:
    """Return a display label for a locked feature, or '' if free/unlocked."""
    if has_byok and feature in BYOK_UNLOCKABLE:
        return ""
    required = FEATURES.get(feature)
    if required is None:
        return ""
    return "🔒 Paid" if required == "paid" else "⭐ Premium"


def effective_tier(
    profile=None,
    license_path=None,
    public_key_path=None,
) -> str:
    """Return the effective tier for this installation.

    Priority:
    1. profile.dev_tier_override (developer mode override)
    2. License JWT verification (offline RS256 check)
    3. "free" (fallback)

    license_path and public_key_path default to production paths when None.
    Pass explicit paths in tests to avoid touching real files.
    """
    if profile and getattr(profile, "dev_tier_override", None):
        return profile.dev_tier_override

    from scripts.license import effective_tier as _license_tier
    from pathlib import Path as _Path

    kwargs = {}
    if license_path is not None:
        kwargs["license_path"] = _Path(license_path)
    if public_key_path is not None:
        kwargs["public_key_path"] = _Path(public_key_path)
    return _license_tier(**kwargs)
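A short sketch of the gate's intended call pattern, following the module's own docstrings; run_company_research() is a placeholder for whatever a page does once the feature is unlocked:

# Illustrative feature-gate check; run_company_research() is hypothetical.
from app.wizard.tiers import can_use, effective_tier, has_configured_llm, tier_label

tier = effective_tier()          # dev override → license JWT → "free"
byok = has_configured_llm()      # local ollama/vllm or the user's own API key

if can_use(tier, "company_research", has_byok=byok):
    run_company_research()
else:
    label = tier_label("company_research", has_byok=byok)
    print(f"Company research is locked ({label}).")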
compose.cloud.yml (new file, 57 lines)
@@ -0,0 +1,57 @@
# compose.cloud.yml — Multi-tenant cloud stack for menagerie.circuitforge.tech/peregrine
#
# Each authenticated user gets their own encrypted SQLite data tree at
#   /devl/menagerie-data/<user-id>/peregrine/
#
# Caddy injects the Directus session cookie as X-CF-Session header before forwarding.
# cloud_session.py resolves user_id → per-user db_path at session init.
#
# Usage:
#   docker compose -f compose.cloud.yml --project-name peregrine-cloud up -d
#   docker compose -f compose.cloud.yml --project-name peregrine-cloud down
#   docker compose -f compose.cloud.yml --project-name peregrine-cloud logs app -f

services:
  app:
    build: .
    container_name: peregrine-cloud
    ports:
      - "8505:8501"
    volumes:
      - /devl/menagerie-data:/devl/menagerie-data  # per-user data trees
    environment:
      - CLOUD_MODE=true
      - CLOUD_DATA_ROOT=/devl/menagerie-data
      - DIRECTUS_JWT_SECRET=${DIRECTUS_JWT_SECRET}
      - CF_SERVER_SECRET=${CF_SERVER_SECRET}
      - PLATFORM_DB_URL=${PLATFORM_DB_URL}
      - HEIMDALL_URL=${HEIMDALL_URL:-http://cf-license:8000}
      - HEIMDALL_ADMIN_TOKEN=${HEIMDALL_ADMIN_TOKEN}
      - STAGING_DB=/devl/menagerie-data/cloud-default.db  # fallback only — never used
      - DOCS_DIR=/tmp/cloud-docs
      - STREAMLIT_SERVER_BASE_URL_PATH=peregrine
      - PYTHONUNBUFFERED=1
      - DEMO_MODE=false
    depends_on:
      searxng:
        condition: service_healthy
    extra_hosts:
      - "host.docker.internal:host-gateway"
    restart: unless-stopped

  searxng:
    image: searxng/searxng:latest
    volumes:
      - ./docker/searxng:/etc/searxng:ro
    healthcheck:
      test: ["CMD", "wget", "-q", "--spider", "http://localhost:8080/"]
      interval: 10s
      timeout: 5s
      retries: 3
    restart: unless-stopped
    # No host port — internal only

networks:
  default:
    external: true
    name: caddy-proxy_caddy-internal
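The per-user data tree described in the header comments can be illustrated with a small path helper. This is a sketch of the layout only, under the stated assumptions (the CLOUD_DATA_ROOT mount and one peregrine subfolder per user); it is not the actual cloud_session.py resolver, and the staging.db filename is assumed:

# Sketch of the per-user layout under CLOUD_DATA_ROOT; not the real resolver.
import os
from pathlib import Path

CLOUD_DATA_ROOT = Path(os.environ.get("CLOUD_DATA_ROOT", "/devl/menagerie-data"))

def user_db_path(user_id: str) -> Path:
    """Per-user SQLite path, e.g. /devl/menagerie-data/<user-id>/peregrine/staging.db."""
    tree = CLOUD_DATA_ROOT / user_id / "peregrine"
    tree.mkdir(parents=True, exist_ok=True)
    return tree / "staging.db"  # filename assumed for this sketch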
compose.demo.yml (new file, 52 lines)
@@ -0,0 +1,52 @@
# compose.demo.yml — Public demo stack for demo.circuitforge.tech/peregrine
#
# Runs a fully isolated, neutered Peregrine instance:
#   - DEMO_MODE=true: blocks all LLM inference in llm_router.py
#   - demo/config/: pre-seeded demo user profile, all backends disabled
#   - demo/data/: isolated SQLite DB (no personal job data)
#   - No personal documents mounted
#   - Port 8504 (separate from the personal instance on 8502)
#
# Usage:
#   docker compose -f compose.demo.yml --project-name peregrine-demo up -d
#   docker compose -f compose.demo.yml --project-name peregrine-demo down
#
# Caddy demo.circuitforge.tech/peregrine* → host port 8504

services:

  app:
    build: .
    ports:
      - "8504:8501"
    volumes:
      - ./demo/config:/app/config
      - ./demo/data:/app/data
      # No /docs mount — demo has no personal documents
    environment:
      - DEMO_MODE=true
      - STAGING_DB=/app/data/staging.db
      - DOCS_DIR=/tmp/demo-docs
      - STREAMLIT_SERVER_BASE_URL_PATH=peregrine
      - PYTHONUNBUFFERED=1
      - PYTHONLOGGING=WARNING
      # No API keys — inference is blocked by DEMO_MODE before any key is needed
    depends_on:
      searxng:
        condition: service_healthy
    extra_hosts:
      - "host.docker.internal:host-gateway"
    restart: unless-stopped

  searxng:
    image: searxng/searxng:latest
    volumes:
      - ./docker/searxng:/etc/searxng:ro
    healthcheck:
      test: ["CMD", "wget", "-q", "--spider", "http://localhost:8080/"]
      interval: 10s
      timeout: 5s
      retries: 3
    restart: unless-stopped
    # No host port published — internal only; demo app uses it for job description enrichment
    # (non-AI scraping is allowed; only LLM inference is blocked)
compose.gpu.yml (new file, 55 lines)
@@ -0,0 +1,55 @@
# compose.gpu.yml — Docker NVIDIA GPU overlay
#
# Adds NVIDIA GPU reservations to Peregrine services.
# Applied automatically by `make start PROFILE=single-gpu|dual-gpu` when Docker is detected.
# Manual: docker compose -f compose.yml -f compose.gpu.yml --profile single-gpu up -d
#
# Prerequisites:
#   sudo nvidia-ctk runtime configure --runtime=docker
#   sudo systemctl restart docker
#
services:
  ollama:
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ["0"]
              capabilities: [gpu]

  ollama_research:
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ["1"]
              capabilities: [gpu]

  vision:
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ["0"]
              capabilities: [gpu]

  vllm:
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ["1"]
              capabilities: [gpu]

  finetune:
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ["0"]
              capabilities: [gpu]
compose.podman-gpu.yml (new file, 51 lines)
@@ -0,0 +1,51 @@
# compose.podman-gpu.yml — Podman GPU override
#
# Replaces Docker-specific `driver: nvidia` reservations with CDI device specs
# for rootless Podman. Applied automatically via `make start PROFILE=single-gpu|dual-gpu`
# when podman/podman-compose is detected, or manually:
#   podman-compose -f compose.yml -f compose.podman-gpu.yml --profile single-gpu up -d
#
# Prerequisites:
#   sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
#   (requires nvidia-container-toolkit >= 1.14)
#
services:
  ollama:
    devices:
      - nvidia.com/gpu=0
    deploy:
      resources:
        reservations:
          devices: []

  ollama_research:
    devices:
      - nvidia.com/gpu=1
    deploy:
      resources:
        reservations:
          devices: []

  vision:
    devices:
      - nvidia.com/gpu=0
    deploy:
      resources:
        reservations:
          devices: []

  vllm:
    devices:
      - nvidia.com/gpu=1
    deploy:
      resources:
        reservations:
          devices: []

  finetune:
    devices:
      - nvidia.com/gpu=0
    deploy:
      resources:
        reservations:
          devices: []
compose.yml (new file, 127 lines)
@@ -0,0 +1,127 @@
# compose.yml — Peregrine by Circuit Forge LLC
# Profiles: remote | cpu | single-gpu | dual-gpu-ollama | dual-gpu-vllm | dual-gpu-mixed
services:

  app:
    build: .
    command: >
      bash -c "streamlit run app/app.py
      --server.port=8501
      --server.headless=true
      --server.fileWatcherType=none
      2>&1 | tee /app/data/.streamlit.log"
    ports:
      - "${STREAMLIT_PORT:-8501}:8501"
    volumes:
      - ./config:/app/config
      - ./data:/app/data
      - ${DOCS_DIR:-~/Documents/JobSearch}:/docs
      - /var/run/docker.sock:/var/run/docker.sock
      - /usr/bin/docker:/usr/bin/docker:ro
    environment:
      - STAGING_DB=/app/data/staging.db
      - DOCS_DIR=/docs
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY:-}
      - OPENAI_COMPAT_URL=${OPENAI_COMPAT_URL:-}
      - OPENAI_COMPAT_KEY=${OPENAI_COMPAT_KEY:-}
      - PEREGRINE_GPU_COUNT=${PEREGRINE_GPU_COUNT:-0}
      - PEREGRINE_GPU_NAMES=${PEREGRINE_GPU_NAMES:-}
      - RECOMMENDED_PROFILE=${RECOMMENDED_PROFILE:-remote}
      - STREAMLIT_SERVER_BASE_URL_PATH=${STREAMLIT_BASE_URL_PATH:-}
      - FORGEJO_API_TOKEN=${FORGEJO_API_TOKEN:-}
      - FORGEJO_REPO=${FORGEJO_REPO:-}
      - FORGEJO_API_URL=${FORGEJO_API_URL:-}
      - PYTHONUNBUFFERED=1
      - PYTHONLOGGING=WARNING
    depends_on:
      searxng:
        condition: service_healthy
    extra_hosts:
      - "host.docker.internal:host-gateway"
    restart: unless-stopped

  searxng:
    image: searxng/searxng:latest
    ports:
      - "${SEARXNG_PORT:-8888}:8080"
    volumes:
      - ./docker/searxng:/etc/searxng:ro
    healthcheck:
      test: ["CMD", "wget", "-q", "--spider", "http://localhost:8080/"]
      interval: 10s
      timeout: 5s
      retries: 3
    restart: unless-stopped

  ollama:
    image: ollama/ollama:latest
    ports:
      - "${OLLAMA_PORT:-11434}:11434"
    volumes:
      - ${OLLAMA_MODELS_DIR:-~/models/ollama}:/root/.ollama
      - ./docker/ollama/entrypoint.sh:/entrypoint.sh
    environment:
      - OLLAMA_MODELS=/root/.ollama
      - DEFAULT_OLLAMA_MODEL=${OLLAMA_DEFAULT_MODEL:-llama3.2:3b}
    entrypoint: ["/bin/bash", "/entrypoint.sh"]
    profiles: [cpu, single-gpu, dual-gpu-ollama, dual-gpu-vllm, dual-gpu-mixed]
    restart: unless-stopped

  ollama_research:
    image: ollama/ollama:latest
    ports:
      - "${OLLAMA_RESEARCH_PORT:-11435}:11434"
    volumes:
      - ${OLLAMA_MODELS_DIR:-~/models/ollama}:/root/.ollama
      - ./docker/ollama/entrypoint.sh:/entrypoint.sh
    environment:
      - OLLAMA_MODELS=/root/.ollama
      - DEFAULT_OLLAMA_MODEL=${OLLAMA_RESEARCH_MODEL:-llama3.2:3b}
    entrypoint: ["/bin/bash", "/entrypoint.sh"]
    profiles: [dual-gpu-ollama, dual-gpu-mixed]
    restart: unless-stopped

  vision:
    build:
      context: .
      dockerfile: scripts/vision_service/Dockerfile
    ports:
      - "${VISION_PORT:-8002}:8002"
    environment:
      - VISION_MODEL=${VISION_MODEL:-vikhyatk/moondream2}
      - VISION_REVISION=${VISION_REVISION:-2025-01-09}
    profiles: [single-gpu, dual-gpu-ollama, dual-gpu-vllm, dual-gpu-mixed]
    restart: unless-stopped

  vllm:
    image: vllm/vllm-openai:latest
    ports:
      - "${VLLM_PORT:-8000}:8000"
    volumes:
      - ${VLLM_MODELS_DIR:-~/models/vllm}:/models
    command: >
      --model /models/${VLLM_MODEL:-Ouro-1.4B}
      --trust-remote-code
      --max-model-len 4096
      --gpu-memory-utilization 0.75
      --enforce-eager
      --max-num-seqs 8
      --cpu-offload-gb ${CPU_OFFLOAD_GB:-0}
    profiles: [dual-gpu-vllm, dual-gpu-mixed]
    restart: unless-stopped

  finetune:
    build:
      context: .
      dockerfile: Dockerfile.finetune
    volumes:
      - ${DOCS_DIR:-~/Documents/JobSearch}:/docs
      - ${OLLAMA_MODELS_DIR:-~/models/ollama}:/ollama-models
      - ./config:/app/config
    environment:
      - DOCS_DIR=/docs
      - OLLAMA_URL=http://ollama:11434
      - OLLAMA_MODELS_MOUNT=/ollama-models
      - OLLAMA_MODELS_OLLAMA_PATH=/root/.ollama
    profiles: [finetune]
    restart: "no"
@@ -3,7 +3,8 @@
 # Company name blocklist — partial case-insensitive match on the company field.
 # e.g. "Amazon" blocks any listing where company contains "amazon".
-companies: []
+companies:
+  - jobgether

 # Industry/content blocklist — blocked if company name OR job description contains any keyword.
 # Use this for industries you will never work in regardless of company.
3 config/integrations/airtable.yaml.example Normal file
@@ -0,0 +1,3 @@
api_key: "patXXX..."
base_id: "appXXX..."
table_name: "Jobs"

4 config/integrations/apple_calendar.yaml.example Normal file
@@ -0,0 +1,4 @@
caldav_url: "https://caldav.icloud.com/"
username: "you@icloud.com"
app_password: "xxxx-xxxx-xxxx-xxxx"
calendar_name: "Interviews"

1 config/integrations/discord.yaml.example Normal file
@@ -0,0 +1 @@
webhook_url: "https://discord.com/api/webhooks/..."

2 config/integrations/dropbox.yaml.example Normal file
@@ -0,0 +1,2 @@
access_token: "sl...."
folder_path: "/Peregrine"

2 config/integrations/google_calendar.yaml.example Normal file
@@ -0,0 +1,2 @@
calendar_id: "primary"
credentials_json: "~/credentials/google-calendar-sa.json"

2 config/integrations/google_drive.yaml.example Normal file
@@ -0,0 +1,2 @@
folder_id: "your-google-drive-folder-id"
credentials_json: "~/credentials/google-drive-sa.json"

3 config/integrations/google_sheets.yaml.example Normal file
@@ -0,0 +1,3 @@
spreadsheet_id: "your-spreadsheet-id"
sheet_name: "Jobs"
credentials_json: "~/credentials/google-sheets-sa.json"

3 config/integrations/home_assistant.yaml.example Normal file
@@ -0,0 +1,3 @@
base_url: "http://homeassistant.local:8123"
token: "eyJ0eXAiOiJKV1Qi..."
notification_service: "notify.mobile_app_my_phone"

3 config/integrations/mega.yaml.example Normal file
@@ -0,0 +1,3 @@
email: "you@example.com"
password: "your-mega-password"
folder_path: "/Peregrine"

4 config/integrations/nextcloud.yaml.example Normal file
@@ -0,0 +1,4 @@
host: "https://nextcloud.example.com"
username: "your-username"
password: "your-app-password"
folder_path: "/Peregrine"

2 config/integrations/notion.yaml.example Normal file
@@ -0,0 +1,2 @@
token: "secret_..."
database_id: "32-character-notion-db-id"

3 config/integrations/onedrive.yaml.example Normal file
@@ -0,0 +1,3 @@
client_id: "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
client_secret: "your-client-secret"
folder_path: "/Peregrine"

2 config/integrations/slack.yaml.example Normal file
@@ -0,0 +1,2 @@
webhook_url: "https://hooks.slack.com/services/..."
channel: "#job-alerts"
@@ -3,48 +3,55 @@ backends:
     api_key_env: ANTHROPIC_API_KEY
     enabled: false
     model: claude-sonnet-4-6
-    type: anthropic
     supports_images: true
+    type: anthropic
   claude_code:
     api_key: any
     base_url: http://localhost:3009/v1
     enabled: false
     model: claude-code-terminal
-    type: openai_compat
     supports_images: true
+    type: openai_compat
   github_copilot:
     api_key: any
     base_url: http://localhost:3010/v1
     enabled: false
     model: gpt-4o
-    type: openai_compat
     supports_images: false
+    type: openai_compat
   ollama:
     api_key: ollama
-    base_url: http://localhost:11434/v1
+    base_url: http://host.docker.internal:11434/v1
     enabled: true
-    model: alex-cover-writer:latest
-    type: openai_compat
+    model: llama3.2:3b
     supports_images: false
+    type: openai_compat
   ollama_research:
     api_key: ollama
-    base_url: http://localhost:11434/v1
+    base_url: http://host.docker.internal:11434/v1
    enabled: true
-    model: llama3.1:8b
-    type: openai_compat
+    model: llama3.2:3b
     supports_images: false
+    type: openai_compat
+  vision_service:
+    base_url: http://host.docker.internal:8002
+    enabled: true
+    supports_images: true
+    type: vision_service
   vllm:
     api_key: ''
-    base_url: http://localhost:8000/v1
+    base_url: http://host.docker.internal:8000/v1
     enabled: true
     model: __auto__
-    type: openai_compat
     supports_images: false
-  vision_service:
-    base_url: http://localhost:8002
-    enabled: false
-    type: vision_service
-    supports_images: true
+    type: openai_compat
+  vllm_research:
+    api_key: ''
+    base_url: http://host.docker.internal:8000/v1
+    enabled: true
+    model: __auto__
+    supports_images: false
+    type: openai_compat
 fallback_order:
   - ollama
   - claude_code
@@ -53,7 +60,7 @@ fallback_order:
   - anthropic
 research_fallback_order:
   - claude_code
-  - vllm
+  - vllm_research
   - ollama_research
   - github_copilot
   - anthropic
@@ -61,6 +68,3 @@ vision_fallback_order:
   - vision_service
   - claude_code
   - anthropic
-# Note: 'ollama' (alex-cover-writer) intentionally excluded — research
-# must never use the fine-tuned writer model, and this also avoids evicting
-# the writer from GPU memory while a cover letter task is in flight.
@@ -21,21 +21,21 @@ backends:
     supports_images: false
   ollama:
     api_key: ollama
-    base_url: http://localhost:11434/v1
+    base_url: http://ollama:11434/v1  # Docker service name; use localhost:11434 outside Docker
     enabled: true
-    model: alex-cover-writer:latest
+    model: llama3.2:3b
     type: openai_compat
     supports_images: false
   ollama_research:
     api_key: ollama
-    base_url: http://localhost:11434/v1
+    base_url: http://ollama:11434/v1  # Docker service name; use localhost:11434 outside Docker
     enabled: true
-    model: llama3.1:8b
+    model: llama3.2:3b
     type: openai_compat
     supports_images: false
   vllm:
     api_key: ''
-    base_url: http://localhost:8000/v1
+    base_url: http://vllm:8000/v1  # Docker service name; use localhost:8000 outside Docker
     enabled: true
     model: __auto__
     type: openai_compat
@@ -64,3 +64,14 @@ vision_fallback_order:
 # Note: 'ollama' (alex-cover-writer) intentionally excluded — research
 # must never use the fine-tuned writer model, and this also avoids evicting
 # the writer from GPU memory while a cover letter task is in flight.
+
+# ── Scheduler — LLM batch queue optimizer ─────────────────────────────────────
+# The scheduler batches LLM tasks by model type to avoid GPU model switching.
+# VRAM budgets are conservative peak estimates (GB) for each task type.
+# Increase if your models are larger; decrease if tasks share GPU memory well.
+scheduler:
+  vram_budgets:
+    cover_letter: 2.5      # alex-cover-writer:latest (~2GB GGUF + headroom)
+    company_research: 5.0  # llama3.1:8b or vllm model
+    wizard_generate: 2.5   # same model family as cover_letter
+  max_queue_depth: 500     # max pending tasks per type before drops (with logged warning)
@@ -1,4 +1,15 @@
 profiles:
+  - boards:
+      - linkedin
+      - indeed
+      - glassdoor
+      - zip_recruiter
+    job_titles:
+      - Customer Service Specialist
+    locations:
+      - San Francisco CA
+    name: default
+    remote_only: false
   - boards:
       - linkedin
       - indeed
14 config/server.yaml.example Normal file
@@ -0,0 +1,14 @@
# config/server.yaml — Peregrine deployment / server settings
# Copy to config/server.yaml and edit. Gitignored — do not commit.
# Changes require restarting Peregrine to take effect (./manage.sh restart).

# base_url_path: URL prefix when serving Peregrine behind a reverse proxy.
# Leave empty ("") for direct access (http://localhost:8502).
# Set to "peregrine" when proxied at https://example.com/peregrine.
# Maps to STREAMLIT_BASE_URL_PATH in .env → STREAMLIT_SERVER_BASE_URL_PATH
# in the container. See: https://docs.streamlit.io/develop/api-reference/configuration/config.toml#server
base_url_path: ""

# server_port: Port Streamlit listens on inside the container (usually 8501).
# The external/host port is set via STREAMLIT_PORT in .env.
server_port: 8501
193 config/skills_suggestions.yaml Normal file
@@ -0,0 +1,193 @@
# skills_suggestions.yaml — Bundled tag suggestions for the Skills & Keywords UI.
# Shown as searchable options in the multiselect. Users can add custom tags beyond these.
# Future: community aggregate (paid tier) will supplement this list from anonymised installs.

skills:
  # ── Customer Success & Account Management ──
  - Customer Success
  - Technical Account Management
  - Account Management
  - Customer Onboarding
  - Renewal Management
  - Churn Prevention
  - Expansion Revenue
  - Executive Relationship Management
  - Escalation Management
  - QBR Facilitation
  - Customer Advocacy
  - Voice of the Customer
  - Customer Health Scoring
  - Success Planning
  - Customer Education
  - Implementation Management
  # ── Revenue & Operations ──
  - Revenue Operations
  - Sales Operations
  - Pipeline Management
  - Forecasting
  - Contract Negotiation
  - Upsell & Cross-sell
  - ARR / MRR Management
  - NRR Optimization
  - Quota Attainment
  # ── Leadership & Management ──
  - Team Leadership
  - People Management
  - Cross-functional Collaboration
  - Change Management
  - Stakeholder Management
  - Executive Presentation
  - Strategic Planning
  - OKR Setting
  - Hiring & Recruiting
  - Coaching & Mentoring
  - Performance Management
  # ── Project & Program Management ──
  - Project Management
  - Program Management
  - Agile / Scrum
  - Kanban
  - Risk Management
  - Resource Planning
  - Process Improvement
  - SOP Development
  # ── Technical Skills ──
  - SQL
  - Python
  - Data Analysis
  - Tableau
  - Looker
  - Power BI
  - Excel / Google Sheets
  - REST APIs
  - Salesforce
  - HubSpot
  - Gainsight
  - Totango
  - ChurnZero
  - Zendesk
  - Intercom
  - Jira
  - Confluence
  - Notion
  - Slack
  - Zoom
  # ── Communications & Writing ──
  - Executive Communication
  - Technical Writing
  - Proposal Writing
  - Presentation Skills
  - Public Speaking
  - Stakeholder Communication
  # ── Compliance & Security ──
  - Compliance
  - Risk Assessment
  - SOC 2
  - ISO 27001
  - GDPR
  - Security Awareness
  - Vendor Management

domains:
  # ── Software & Tech ──
  - B2B SaaS
  - Enterprise Software
  - Cloud Infrastructure
  - Developer Tools
  - Cybersecurity
  - Data & Analytics
  - AI / ML Platform
  - FinTech
  - InsurTech
  - LegalTech
  - HR Tech
  - MarTech
  - AdTech
  - DevOps / Platform Engineering
  - Open Source
  # ── Industry Verticals ──
  - Healthcare / HealthTech
  - Education / EdTech
  - Non-profit / Social Impact
  - Government / GovTech
  - E-commerce / Retail
  - Manufacturing
  - Financial Services
  - Media & Entertainment
  - Music Industry
  - Logistics & Supply Chain
  - Real Estate / PropTech
  - Energy / CleanTech
  - Hospitality & Travel
  # ── Market Segments ──
  - Enterprise
  - Mid-Market
  - SMB / SME
  - Startup
  - Fortune 500
  - Public Sector
  - International / Global
  # ── Business Models ──
  - Subscription / SaaS
  - Marketplace
  - Usage-based Pricing
  - Professional Services
  - Self-serve / PLG

keywords:
  # ── CS Metrics & Outcomes ──
  - NPS
  - CSAT
  - CES
  - Churn Rate
  - Net Revenue Retention
  - Gross Revenue Retention
  - Logo Retention
  - Time-to-Value
  - Product Adoption
  - Feature Utilisation
  - Health Score
  - Customer Lifetime Value
  # ── Sales & Growth ──
  - ARR
  - MRR
  - GRR
  - NRR
  - Expansion ARR
  - Pipeline Coverage
  - Win Rate
  - Average Contract Value
  - Land & Expand
  - Multi-threading
  # ── Process & Delivery ──
  - Onboarding
  - Implementation
  - Knowledge Transfer
  - Escalation
  - SLA
  - Root Cause Analysis
  - Post-mortem
  - Runbook
  - Playbook Development
  - Feedback Loop
  - Product Roadmap Input
  # ── Team & Culture ──
  - Cross-functional
  - Distributed Team
  - Remote-first
  - High-growth
  - Fast-paced
  - Autonomous
  - Data-driven
  - Customer-centric
  - Empathetic Leadership
  - Inclusive Culture
  # ── Job-seeker Keywords ──
  - Strategic
  - Proactive
  - Hands-on
  - Scalable Processes
  - Operational Excellence
  - Business Impact
  - Executive Visibility
  - Player-Coach
66 config/user.yaml.example Normal file
@@ -0,0 +1,66 @@
# config/user.yaml.example
# Copy to config/user.yaml and fill in your details.
# The first-run wizard will create this file automatically.

name: "Your Name"
email: "you@example.com"
phone: "555-000-0000"
linkedin: "linkedin.com/in/yourprofile"
career_summary: >
  Experienced professional with X years in [your field].
  Specialise in [key skills]. Known for [strength].

nda_companies: []  # e.g. ["FormerEmployer"] — masked in research briefs

# Optional: industries you genuinely care about.
# When a company/JD matches an industry, the cover letter prompt injects
# your personal note so Para 3 can reflect authentic alignment.
# Leave a value empty ("") to use a sensible generic default.
mission_preferences:
  music: ""           # e.g. "I've played in bands for 15 years and care deeply about how artists get paid"
  animal_welfare: ""  # e.g. "I volunteer at my local shelter every weekend"
  education: ""       # e.g. "I tutored underserved kids for 3 years and care deeply about literacy"
  social_impact: ""   # e.g. "I want my work to reach people who need help most"
  health: ""          # e.g. "I care about people navigating rare or poorly-understood health conditions"
# Note: if left empty, Para 3 defaults to focusing on the people the company
# serves — not the industry. Fill in for a more personal connection.

# Optional: how you write and communicate. Used to shape cover letter voice.
# e.g. "Warm and direct. Cares about people first. Finds rare and complex situations fascinating."
candidate_voice: ""

# Set to true to include optional identity-related sections in research briefs.
# Both are for your personal decision-making only — never included in applications.

# Adds a disability inclusion & accessibility section (ADA, ERGs, WCAG signals).
candidate_accessibility_focus: false

# Adds an LGBTQIA+ inclusion section (ERGs, non-discrimination policies, culture signals).
candidate_lgbtq_focus: false

tier: free  # free | paid | premium
dev_tier_override: null  # overrides tier locally (for testing only)
wizard_complete: false
wizard_step: 0
dismissed_banners: []

docs_dir: "~/Documents/JobSearch"
ollama_models_dir: "~/models/ollama"
vllm_models_dir: "~/models/vllm"

inference_profile: "remote"  # remote | cpu | single-gpu | dual-gpu

services:
  streamlit_port: 8501
  ollama_host: ollama    # Docker service name; use "localhost" if running outside Docker
  ollama_port: 11434
  ollama_ssl: false
  ollama_ssl_verify: true
  vllm_host: vllm        # Docker service name; use "localhost" if running outside Docker
  vllm_port: 8000
  vllm_ssl: false
  vllm_ssl_verify: true
  searxng_host: searxng  # Docker service name; use "localhost" if running outside Docker
  searxng_port: 8080     # internal Docker port; use 8888 for host-mapped access
  searxng_ssl: false
  searxng_ssl_verify: true
8 data/email_score.jsonl.example Normal file
@@ -0,0 +1,8 @@
{"subject": "Interview Invitation — Senior Engineer", "body": "Hi Alex, we'd love to schedule a 30-min phone screen. Are you available Thursday at 2pm? Please reply to confirm.", "label": "interview_scheduled"}
{"subject": "Your application to Acme Corp", "body": "Thank you for your interest in the Senior Engineer role. After careful consideration, we have decided to move forward with other candidates whose experience more closely matches our current needs.", "label": "rejected"}
{"subject": "Offer Letter — Product Manager at Initech", "body": "Dear Alex, we are thrilled to extend an offer of employment for the Product Manager position. Please find the attached offer letter outlining compensation and start date.", "label": "offer_received"}
{"subject": "Quick question about your background", "body": "Hi Alex, I came across your profile and would love to connect. We have a few roles that seem like a great match. Would you be open to a brief chat this week?", "label": "positive_response"}
{"subject": "Company Culture Survey — Acme Corp", "body": "Alex, as part of our evaluation process, we invite all candidates to complete our culture fit assessment. The survey takes approximately 15 minutes. Please click the link below.", "label": "survey_received"}
{"subject": "Application Received — DataCo", "body": "Thank you for submitting your application for the Data Engineer role at DataCo. We have received your materials and will be in touch if your qualifications match our needs.", "label": "neutral"}
{"subject": "Following up on your application", "body": "Hi Alex, I wanted to follow up on your recent application. Your background looks interesting and we'd like to learn more. Can we set up a quick call?", "label": "positive_response"}
{"subject": "We're moving forward with other candidates", "body": "Dear Alex, thank you for taking the time to interview with us. After thoughtful consideration, we have decided not to move forward with your candidacy at this time.", "label": "rejected"}
68 demo/config/llm.yaml Normal file
@@ -0,0 +1,68 @@
# Demo LLM config — all backends disabled.
# DEMO_MODE=true in the environment blocks the router before any backend is tried,
# so these values are never actually used. Kept for schema completeness.
backends:
  anthropic:
    api_key_env: ANTHROPIC_API_KEY
    enabled: false
    model: claude-sonnet-4-6
    supports_images: true
    type: anthropic
  claude_code:
    api_key: any
    base_url: http://localhost:3009/v1
    enabled: false
    model: claude-code-terminal
    supports_images: true
    type: openai_compat
  github_copilot:
    api_key: any
    base_url: http://localhost:3010/v1
    enabled: false
    model: gpt-4o
    supports_images: false
    type: openai_compat
  ollama:
    api_key: ollama
    base_url: http://localhost:11434/v1
    enabled: false
    model: llama3.2:3b
    supports_images: false
    type: openai_compat
  ollama_research:
    api_key: ollama
    base_url: http://localhost:11434/v1
    enabled: false
    model: llama3.2:3b
    supports_images: false
    type: openai_compat
  vision_service:
    base_url: http://localhost:8002
    enabled: false
    supports_images: true
    type: vision_service
  vllm:
    api_key: ''
    base_url: http://localhost:8000/v1
    enabled: false
    model: __auto__
    supports_images: false
    type: openai_compat
  vllm_research:
    api_key: ''
    base_url: http://localhost:8000/v1
    enabled: false
    model: __auto__
    supports_images: false
    type: openai_compat
fallback_order:
  - ollama
  - vllm
  - anthropic
research_fallback_order:
  - vllm_research
  - ollama_research
  - anthropic
vision_fallback_order:
  - vision_service
  - anthropic
44 demo/config/user.yaml Normal file
@@ -0,0 +1,44 @@
candidate_accessibility_focus: false
candidate_lgbtq_focus: false
candidate_voice: Clear, direct, and human. Focuses on impact over jargon.
career_summary: 'Experienced software engineer with a background in full-stack development,
  cloud infrastructure, and data pipelines. Passionate about building tools that help
  people navigate complex systems.

  '
dev_tier_override: null
dismissed_banners:
  - connect_cloud
  - setup_email
docs_dir: /docs
email: demo@circuitforge.tech
inference_profile: remote
linkedin: ''
mission_preferences:
  animal_welfare: ''
  education: ''
  health: ''
  music: ''
  social_impact: Want my work to reach people who need it most.
name: Demo User
nda_companies: []
ollama_models_dir: ~/models/ollama
phone: ''
services:
  ollama_host: localhost
  ollama_port: 11434
  ollama_ssl: false
  ollama_ssl_verify: true
  searxng_host: searxng
  searxng_port: 8080
  searxng_ssl: false
  searxng_ssl_verify: true
  streamlit_port: 8501
  vllm_host: localhost
  vllm_port: 8000
  vllm_ssl: false
  vllm_ssl_verify: true
tier: free
vllm_models_dir: ~/models/vllm
wizard_complete: true
wizard_step: 0
0 demo/data/.gitkeep Normal file
10 docker/ollama/entrypoint.sh Executable file
@@ -0,0 +1,10 @@
#!/usr/bin/env bash
# Start Ollama server and pull a default model if none are present
ollama serve &
sleep 5
if [ -z "$(ollama list 2>/dev/null | tail -n +2)" ]; then
  MODEL="${DEFAULT_OLLAMA_MODEL:-llama3.2:3b}"
  echo "No models found — pulling $MODEL..."
  ollama pull "$MODEL"
fi
wait
8 docker/searxng/settings.yml Normal file
@@ -0,0 +1,8 @@
use_default_settings: true
search:
  formats:
    - html
    - json
server:
  secret_key: "change-me-in-production"
  bind_address: "0.0.0.0:8080"
0 docs/.gitkeep Normal file
197 docs/backlog.md Normal file
@@ -0,0 +1,197 @@
# Peregrine — Feature Backlog

Unscheduled ideas and deferred features. Roughly grouped by area.

See also: `circuitforge-plans/shared/2026-03-07-launch-checklist.md` for pre-launch blockers
(legal docs, Stripe live keys, website deployment, demo DB ownership fix).

---

## Launch Blockers (tracked in shared launch checklist)

- **ToS + Refund Policy** — required before live Stripe charges. Files go in `website/content/legal/`.
- **Stripe live key rotation** — swap test keys to live in `website/.env` (zero code changes).
- **Website deployment to bastion** — Caddy route for Nuxt frontend at `circuitforge.tech`.
- **Demo DB ownership** — `demo/data/staging.db` is root-owned (Docker artifact); fix with `sudo chown alan:alan` then re-run `demo/seed_demo.py`.

---

## Post-Launch / Infrastructure

- **Accessibility Statement** — WCAG 2.1 conformance doc at `website/content/legal/accessibility.md`. High credibility value for ND audience.
- **Data deletion request process** — published procedure at `website/content/legal/data-deletion.md` (GDPR/CCPA; references `privacy@circuitforge.tech`).
- **Uptime Kuma monitors** — 6 monitors need to be added manually (website, Heimdall, demo, Directus, Forgejo, Peregrine container health).
- **Directus admin password rotation** — change from `changeme-set-via-ui-on-first-run` before website goes public.

---

## Discovery — Community Scraper Plugin System

Design doc: `circuitforge-plans/peregrine/2026-03-07-community-scraper-plugin-design.md`

**Summary:** Add a `scripts/plugins/` directory with auto-discovery and a documented MIT-licensed
plugin API. Separates CF-built custom scrapers (paid, BSL 1.1, in `scripts/custom_boards/`) from
community-contributed and CF-freebie scrapers (free, MIT, in `scripts/plugins/`).

**Implementation tasks:**
- [ ] Add `scripts/plugins/` with `__init__.py`, `README.md`, and `example_plugin.py`
- [ ] Add `config/plugins/` directory with `.gitkeep`; gitignore `config/plugins/*.yaml` (not `.example`)
- [ ] Update `discover.py`: `load_plugins()` auto-discovery + tier gate (`custom_boards` = paid, `plugins` = free); see the sketch after this list
- [ ] Update `search_profiles.yaml` schema: add `plugins:` list + `plugin_config:` block
- [ ] Migrate `scripts/custom_boards/craigslist.py` → `scripts/plugins/craigslist.py` (CF freebie)
- [ ] Settings UI: render `CONFIG_SCHEMA` fields for installed plugins (Settings → Search)
- [ ] Rewrite `docs/developer-guide/adding-scrapers.md` to document the plugin API
- [ ] Add `scripts/plugins/LICENSE` (MIT) to make the dual-license split explicit

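A minimal sketch of what the `load_plugins()` auto-discovery could look like, assuming plugins live in `scripts/plugins/` and each exposes a module-level `scrape(profile, db_path)` callable. The module path, skip rules, and return shape here are illustrative, not the final API:

```python
# Hypothetical auto-discovery helper for discover.py; names and layout are assumptions.
import importlib
import pkgutil
from pathlib import Path

PLUGINS_DIR = Path(__file__).parent / "plugins"  # assumed location: scripts/plugins/


def load_plugins() -> dict:
    """Map plugin name -> module for every plugin exposing scrape(profile, db_path)."""
    plugins = {}
    for info in pkgutil.iter_modules([str(PLUGINS_DIR)]):
        if info.name.startswith("_") or info.name == "example_plugin":
            continue  # skip private modules and the shipped example
        module = importlib.import_module(f"scripts.plugins.{info.name}")
        if callable(getattr(module, "scrape", None)):
            plugins[info.name] = module
    return plugins
```

With something like this in place, the tier gate can treat everything returned by `load_plugins()` as free-tier, while `scripts/custom_boards/` stays paid.
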
**CF freebie candidates** (future, after plugin system ships):
- Dice.com (tech-focused, no API key)
- We Work Remotely (remote-only, clean HTML)
- Wellfound / AngelList (startup roles)

---

## Discovery — Jobgether Non-Headless Scraper

Design doc: `peregrine/docs/superpowers/specs/2026-03-15-jobgether-integration-design.md`

**Background:** Headless Playwright is blocked by Cloudflare Turnstile on all `jobgether.com` pages.
A non-headless Playwright instance backed by `Xvfb` (virtual framebuffer) renders as a real browser and
bypasses Turnstile. Heimdall already has Xvfb available.

**Live-inspection findings (2026-03-15):**
- Search URL: `https://jobgether.com/search-offers?keyword=<query>`
- Job cards: `div.new-opportunity` — one per listing
- Card URL: `div.new-opportunity > a[href*="/offer/"]` (`href` attr)
- Title: `#offer-body h3`
- Company: `#offer-body p.font-medium`
- Dedup: existing URL-based dedup in `discover.py` covers Jobgether↔other-board overlap

**Implementation tasks (blocked until Xvfb-Playwright integration is in place):**
- [ ] Add `Xvfb` launch helper to `scripts/custom_boards/` (shared util, or inline in scraper)
- [ ] Implement `scripts/custom_boards/jobgether.py` using `p.chromium.launch(headless=False)` with `DISPLAY=:99`
- [ ] Pre-launch `Xvfb :99 -screen 0 1280x720x24` (or assert `DISPLAY` is already set)
- [ ] Register `jobgether` in `discover.py` `CUSTOM_SCRAPERS` (currently omitted — no viable scraper)
- [ ] Add `jobgether` to `custom_boards` in remote-eligible profiles in `config/search_profiles.yaml`
- [ ] Remove or update the "Jobgether discovery scraper — decided against" note in the design spec

**Pre-condition:** Validate Xvfb approach manually (headless=False + `DISPLAY=:99`) before implementing.
The `filter-api.jobgether.com` endpoint still requires auth and `robots.txt` still blocks bots —
confirm Turnstile acceptance is the only remaining blocker before beginning.
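
A rough manual-validation sketch of the approach described above. The selectors come from the live-inspection findings; the Xvfb invocation, display number, and query string are assumptions to adjust during validation:

```python
# Manual validation only: confirm Turnstile acceptance with a non-headless browser under Xvfb.
import os
import subprocess

from playwright.sync_api import sync_playwright

xvfb = subprocess.Popen(["Xvfb", ":99", "-screen", "0", "1280x720x24"])  # virtual framebuffer
os.environ["DISPLAY"] = ":99"

try:
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=False)  # renders as a real window on :99
        page = browser.new_page()
        page.goto("https://jobgether.com/search-offers?keyword=customer+success")  # example query
        page.wait_for_selector("div.new-opportunity", timeout=30_000)
        cards = page.query_selector_all("div.new-opportunity > a[href*='/offer/']")
        print(f"Found {len(cards)} job cards")
        browser.close()
finally:
    xvfb.terminate()
```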

---

## Settings / Data Management

- **Backup / Restore / Teleport** — Settings panel option to export a full config snapshot (user.yaml + all gitignored configs) as a zip, restore from a snapshot, and "teleport" (export + import to a new machine or Docker volume). Useful for migrations, multi-machine setups, and safe wizard testing.
- **Complete Google Drive integration test()** — `scripts/integrations/google_drive.py` `test()` currently only checks that the credentials file exists (TODO comment). Implement actual Google Drive API call using `google-api-python-client` to verify the token works.

---

## First-Run Wizard

- **Wire real LLM test in Step 5 (Inference)** — `app/wizard/step_inference.py` validates an `endpoint_confirmed` boolean flag only. Replace with an actual LLM call: submit a minimal prompt to the configured endpoint, show pass/fail, and only set `endpoint_confirmed: true` on success. Should test whichever backend the user selected (Ollama, vLLM, Anthropic, etc.). A sketch of such a check is below.
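
A minimal sketch of what that check could look like for the OpenAI-compatible backends (Ollama, vLLM, the Claude Code proxy); the function name and return shape are illustrative, and the Anthropic backend would need its own client call:

```python
# Hypothetical smoke test for Step 5; base_url/model come from the user's configured backend.
import requests


def test_llm_endpoint(base_url: str, api_key: str, model: str) -> bool:
    """Send a minimal prompt to an OpenAI-compatible endpoint and report pass/fail."""
    try:
        r = requests.post(
            f"{base_url.rstrip('/')}/chat/completions",
            headers={"Authorization": f"Bearer {api_key}"},
            json={
                "model": model,
                "messages": [{"role": "user", "content": "Reply with the single word: ok"}],
                "max_tokens": 5,
            },
            timeout=30,
        )
        r.raise_for_status()
        return bool(r.json()["choices"][0]["message"]["content"].strip())
    except Exception:
        return False
```

Step 5 would then set `endpoint_confirmed: true` only when this returns `True`.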

---

## LinkedIn Import

Shipped in v0.4.0. Ongoing maintenance and known decisions:

- **Selector maintenance** — LinkedIn changes their DOM periodically. When import stops working, update
CSS selectors in `scripts/linkedin_utils.py` only (all other files import from there). Real `data-section`
attribute values (as of 2025 DOM): `summary`, `currentPositionsDetails`, `educationsDetails`,
`certifications`, `posts`, `volunteering`, `publications`, `projects`.

- **Data export zip is the recommended path for full history** — LinkedIn's unauthenticated public profile
page is server-side degraded: experience titles, past roles, education, and skills are blurred/omitted.
Only available without login: name, About summary (truncated), current employer name, certifications.
The "Import from LinkedIn data export zip" expander (Settings → Resume Profile and Wizard step 3) is the
correct path for full career history. UI already shows an `ℹ️` callout explaining this.

- **LinkedIn OAuth — decided: not viable** — LinkedIn's OAuth API is restricted to approved partner
programs. Even if approved, it only grants name + email (not career history, experience, or skills).
This is a deliberate LinkedIn platform restriction, not a technical gap. Do not pursue this path.

- **Selector test harness** (future) — A lightweight test that fetches a known-public LinkedIn profile
and asserts at least N fields non-empty would catch DOM breakage before users report it. Low priority
until selector breakage becomes a recurring support issue.

---

## Cover Letter / Resume Generation

- ~~**Iterative refinement feedback loop**~~ — ✅ Done (`94225c9`): `generate()` accepts `previous_result`/`feedback`; task_runner parses params JSON; Apply Workspace has "Refine with Feedback" expander. Same pattern available for wizard `expand_bullets` via `_run_wizard_generate`.

---

## Apply / Browser Integration

- **Browser autofill extension** — Chrome/Firefox extension that reads job application forms and auto-fills from the user's profile + generated cover letter; syncs submitted applications back into the pipeline automatically. (Phase 2 paid+ feature per business plan.)

---

## Ultra Tier — Managed Applications (White-Glove Service)

- **Concept** — A human-in-the-loop concierge tier where a trained operator submits applications on the user's behalf, powered by AI-generated artifacts (cover letter, company research, survey responses). AI handles ~80% of the work; operator handles form submission, CAPTCHAs, and complex custom questions.
- **Pricing model** — Per-application or bundle pricing rather than flat "X apps/month" — application complexity varies too much for flat pricing to be sustainable.
- **Operator interface** — Thin admin UI (separate from user-facing app) that reads from the same `staging.db`: shows candidate profile, job listing, generated cover letter, company brief, and a "Mark submitted" button. New job status `queued_for_operator` to represent the handoff.
- **Key unlock** — Browser autofill extension (above) becomes the operator's primary tool; pre-fills forms from profile + cover letter, operator reviews and submits.
- **Tier addition** — Add `"ultra"` to `TIERS` in `app/wizard/tiers.py`; gate `"managed_applications"` feature. The existing tier system is designed to accommodate this cleanly. (See the sketch after this list.)
- **Quality / trust** — Each submission requires explicit per-job user approval before operator acts. Full audit trail (who submitted, when, what was sent). Clear ToS around representation.
- **Bootstrap strategy** — Waitlist + small trusted operator team initially to validate workflow before scaling or automating further. Don't build operator tooling until the manual flow is proven.
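
A hedged sketch of that tier addition. The exact shape of `TIERS` in `app/wizard/tiers.py` is assumed here; only the `FEATURES` mapping style follows the pattern documented in the developer guide:

```python
# Hypothetical additions to app/wizard/tiers.py; the structure of TIERS is an assumption.
TIERS: list[str] = ["free", "paid", "premium", "ultra"]  # "ultra" appended as the top tier

FEATURES: dict[str, str] = {
    # ...existing entries...
    "managed_applications": "ultra",  # operator-submitted applications (Ultra only)
}
```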

---

## Container Runtime

- ~~**Podman support**~~ — ✅ Done: `Makefile` auto-detects `docker compose` / `podman compose` / `podman-compose`; `compose.podman-gpu.yml` CDI override for GPU profiles; `setup.sh` detects existing Podman and skips Docker install.
- **FastAPI migration path** — When concurrent-user scale demands it: port Streamlit pages to FastAPI + React/HTMX, keep `scripts/` layer unchanged, replace daemon threads with Celery + Redis. The `scripts/` separation already makes this clean.

---

## Email Sync

See also: `docs/plans/email-sync-testing-checklist.md` for outstanding test coverage items.

---

## Circuit Forge LLC — Product Expansion ("Heinous Tasks" Platform)

The core insight: the Peregrine pipeline architecture (monitor → AI assist → human approval → execute) is domain-agnostic. Job searching is the proof-of-concept. The same pattern applies to any task that is high-stakes, repetitive, opaque, or just deeply unpleasant.

Each product ships as a **separate app** sharing the same underlying scaffold (pipeline engine, LLM router, background tasks, wizard, tier system, operator interface for Ultra tier). The business is Circuit Forge LLC; the brand positioning is: *"AI for the tasks you hate most."*

### Candidate products (rough priority order)

- **Falcon** — Government form assistance. Benefits applications, disability claims, FAFSA, immigration forms, small business permits. AI pre-fills from user profile, flags ambiguous questions, generates supporting statements. High value: mistakes here are costly and correction is slow.

- **Osprey** — Customer service queue management. Monitors hold queues, auto-navigates IVR trees via speech synthesis, escalates to human agent at the right moment, drafts complaint letters and dispute emails with the right tone and regulatory citations (CFPB, FCC, etc.). Tracks ticket status across cases.

- **Kestrel** — DMV / government appointment booking. Monitors appointment availability for DMV, passport offices, Social Security offices, USCIS biometrics, etc. Auto-books the moment a slot opens. Sends reminders with checklist of required documents.

- **Harrier** — Insurance navigation. Prior authorization tracking, claim dispute drafting, EOB reconciliation, appeal letters. High willingness-to-pay: a denied $50k claim is worth paying to fight.

- **Merlin** — Rental / housing applications. Monitors listings, auto-applies to matching properties, generates cover letters for competitive rental markets, tracks responses, flags lease red flags.

- **Ibis** — Healthcare coordination. The sacred ibis was the symbol of Thoth, Egyptian god of medicine — the name carries genuine medical heritage. Referral tracking, specialist waitlist monitoring, prescription renewal reminders, medical record request management, prior auth paper trails.

- **Tern** — Travel planning. The Arctic tern makes the longest migration of any animal (44,000 miles/year, pole to pole) — the ultimate traveler. Flight/hotel monitoring, itinerary generation, visa requirement research, travel insurance comparison, rebooking assistance on disruption.

- **Wren** — Contractor engagement. Wrens are legendary nest-builders — meticulous, structural, persistent. Contractor discovery, quote comparison, scope-of-work generation, milestone tracking, dispute documentation, lien waiver management.

- **Martin** — Car / home maintenance. The house martin nests on the exterior of buildings and returns to the same site every year to maintain it — almost too on-the-nose. Service scheduling, maintenance history tracking, recall monitoring, warranty tracking, finding trusted local providers.

### Shared architecture decisions

- **Separate repos, shared `circuitforge-core` package** — pipeline engine, LLM router, background task runner, wizard framework, tier system, operator interface all extracted into a private PyPI package that each product imports.
- **Same Docker Compose scaffold** — each product is a `compose.yml` away from deployment.
- **Same Ultra tier model** — operator interface reads from product's DB, human-in-the-loop for tasks that can't be automated (CAPTCHAs, phone calls, wet signatures).
- **Prove Peregrine first** — don't extract `circuitforge-core` until the second product is actively being built. Premature extraction is over-engineering.

### What makes this viable
- Each domain has the same pain profile: high-stakes, time-sensitive, opaque processes with inconsistent UX.
- Users are highly motivated to pay — the alternative is hours of their own time on hold or filling out forms.
- The human-in-the-loop (Ultra) model handles the hardest cases without requiring full automation.
- Regulatory moat: knowing which citations matter (CFPB for billing disputes, ADA for accommodation requests) is defensible knowledge that gets baked into prompts over time.

---
249 docs/developer-guide/adding-integrations.md Normal file
@@ -0,0 +1,249 @@
# Adding an Integration

Peregrine's integration system is auto-discovered — add a class and a config example, and it appears in the wizard and Settings automatically. No registration step is needed.

---

## Step 1 — Create the integration module

Create `scripts/integrations/myservice.py`:

```python
# scripts/integrations/myservice.py

from scripts.integrations.base import IntegrationBase


class MyServiceIntegration(IntegrationBase):
    name = "myservice"    # must be unique; matches config filename
    label = "My Service"  # display name shown in the UI
    tier = "free"         # "free" | "paid" | "premium"

    def fields(self) -> list[dict]:
        """Return form field definitions for the connection card in the wizard/Settings UI."""
        return [
            {
                "key": "api_key",
                "label": "API Key",
                "type": "password",  # "text" | "password" | "url" | "checkbox"
                "placeholder": "sk-...",
                "required": True,
                "help": "Get your key at myservice.com/settings/api",
            },
            {
                "key": "workspace_id",
                "label": "Workspace ID",
                "type": "text",
                "placeholder": "ws_abc123",
                "required": True,
                "help": "Found in your workspace URL",
            },
        ]

    def connect(self, config: dict) -> bool:
        """
        Store credentials in memory. Return True if all required fields are present.
        Does NOT verify credentials — call test() for that.
        """
        self._api_key = config.get("api_key", "").strip()
        self._workspace_id = config.get("workspace_id", "").strip()
        return bool(self._api_key and self._workspace_id)

    def test(self) -> bool:
        """
        Verify the stored credentials actually work.
        Returns True on success, False on any failure.
        """
        try:
            import requests
            r = requests.get(
                "https://api.myservice.com/v1/ping",
                headers={"Authorization": f"Bearer {self._api_key}"},
                params={"workspace": self._workspace_id},
                timeout=5,
            )
            return r.ok
        except Exception:
            return False

    def sync(self, jobs: list[dict]) -> int:
        """
        Optional: push jobs to the external service.
        Return the count of successfully synced jobs.
        The default implementation in IntegrationBase returns 0 (no-op).
        Only override this if your integration supports job syncing
        (e.g. Notion, Airtable, Google Sheets).
        """
        synced = 0
        for job in jobs:
            try:
                self._push_job(job)
                synced += 1
            except Exception as e:
                print(f"[myservice] sync error for job {job.get('id')}: {e}")
        return synced

    def _push_job(self, job: dict) -> None:
        import requests
        requests.post(
            "https://api.myservice.com/v1/records",
            headers={"Authorization": f"Bearer {self._api_key}"},
            json={
                "workspace": self._workspace_id,
                "title": job.get("title", ""),
                "company": job.get("company", ""),
                "status": job.get("status", "pending"),
                "url": job.get("url", ""),
            },
            timeout=10,
        ).raise_for_status()
```

---

## Step 2 — Create the config example file

Create `config/integrations/myservice.yaml.example`:

```yaml
# config/integrations/myservice.yaml.example
# Copy to config/integrations/myservice.yaml and fill in your credentials.
# This file is gitignored — never commit the live credentials.
api_key: ""
workspace_id: ""
```

The live credentials file (`config/integrations/myservice.yaml`) is gitignored automatically via the `config/integrations/` entry in `.gitignore`.

---

## Step 3 — Auto-discovery

No registration step is needed. The integration registry (`scripts/integrations/__init__.py`) imports all `.py` files in the `integrations/` directory and discovers subclasses of `IntegrationBase` automatically.
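
For orientation, the discovery pattern is roughly the following. This is a sketch only; the real `scripts/integrations/__init__.py` may differ in its details:

```python
# Sketch of subclass discovery for an integrations package; not a copy of the real registry.
import importlib
import pkgutil

from scripts.integrations.base import IntegrationBase


def discover_integrations() -> dict[str, type[IntegrationBase]]:
    """Import every module in the package and collect IntegrationBase subclasses by name."""
    package = importlib.import_module("scripts.integrations")
    for info in pkgutil.iter_modules(package.__path__):
        if info.name != "base":
            importlib.import_module(f"scripts.integrations.{info.name}")
    return {cls.name: cls for cls in IntegrationBase.__subclasses__()}
```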

On next startup, `myservice` will appear in:
- The first-run wizard Step 7 (Integrations)
- **Settings → Integrations** with a connection card rendered from `fields()`

---

## Step 4 — Tier-gate new features (optional)

If you want to gate a specific action (not just the integration itself) behind a tier, add an entry to `app/wizard/tiers.py`:

```python
FEATURES: dict[str, str] = {
    # ...existing entries...
    "myservice_sync": "paid",  # or "free" | "premium"
}
```

Then guard the action in the relevant UI page:

```python
import streamlit as st

from app.wizard.tiers import can_use
from scripts.user_profile import UserProfile

user = UserProfile()
if can_use(user.tier, "myservice_sync"):
    ...  # show the sync button
else:
    st.info("MyService sync requires a Paid plan.")
```

---

## Step 5 — Write a test

Create or add to `tests/test_integrations.py`:

```python
# tests/test_integrations.py (add to existing file)

from unittest.mock import patch, MagicMock

from scripts.integrations.myservice import MyServiceIntegration


def test_fields_returns_required_keys():
    integration = MyServiceIntegration()
    fields = integration.fields()
    assert len(fields) >= 1
    for field in fields:
        assert "key" in field
        assert "label" in field
        assert "type" in field
        assert "required" in field


def test_connect_returns_true_with_valid_config():
    integration = MyServiceIntegration()
    result = integration.connect({"api_key": "sk-abc", "workspace_id": "ws-123"})
    assert result is True


def test_connect_returns_false_with_missing_required_field():
    integration = MyServiceIntegration()
    result = integration.connect({"api_key": "", "workspace_id": "ws-123"})
    assert result is False


def test_test_returns_true_on_200():
    integration = MyServiceIntegration()
    integration.connect({"api_key": "sk-abc", "workspace_id": "ws-123"})

    mock_resp = MagicMock()
    mock_resp.ok = True

    # Patch requests.get globally — test() imports requests inside the method.
    with patch("requests.get", return_value=mock_resp):
        assert integration.test() is True


def test_test_returns_false_on_error():
    integration = MyServiceIntegration()
    integration.connect({"api_key": "sk-abc", "workspace_id": "ws-123"})

    with patch("requests.get", side_effect=Exception("timeout")):
        assert integration.test() is False


def test_is_configured_reflects_file_presence(tmp_path):
    config_dir = tmp_path / "config"
    config_dir.mkdir()
    (config_dir / "integrations").mkdir()

    assert MyServiceIntegration.is_configured(config_dir) is False

    (config_dir / "integrations" / "myservice.yaml").write_text("api_key: sk-abc\n")
    assert MyServiceIntegration.is_configured(config_dir) is True
```

---

## IntegrationBase Reference

All integrations inherit from `scripts/integrations/base.py`. Here is the full interface:

| Method / attribute | Required | Description |
|-------------------|----------|-------------|
| `name: str` | Yes | Machine key — must be unique. Matches the YAML config filename. |
| `label: str` | Yes | Human-readable display name for the UI. |
| `tier: str` | Yes | Minimum tier: `"free"`, `"paid"`, or `"premium"`. |
| `fields() -> list[dict]` | Yes | Returns form field definitions. Each dict: `key`, `label`, `type`, `placeholder`, `required`, `help`. |
| `connect(config: dict) -> bool` | Yes | Stores credentials in memory. Returns `True` if required fields are present. Does NOT verify credentials. |
| `test() -> bool` | Yes | Makes a real network call to verify stored credentials. Returns `True` on success. |
| `sync(jobs: list[dict]) -> int` | No | Pushes jobs to the external service. Returns count synced. Default is a no-op returning 0. |
| `config_path(config_dir: Path) -> Path` | Inherited | Returns `config_dir / "integrations" / f"{name}.yaml"`. |
| `is_configured(config_dir: Path) -> bool` | Inherited | Returns `True` if the config YAML file exists. |
| `save_config(config: dict, config_dir: Path)` | Inherited | Writes config dict to the YAML file. Call after `test()` returns `True`. |
| `load_config(config_dir: Path) -> dict` | Inherited | Loads and returns the YAML config, or `{}` if not configured. |
|
|
||||||
|
### Field type values
|
||||||
|
|
||||||
|
| `type` value | UI widget rendered |
|
||||||
|
|-------------|-------------------|
|
||||||
|
| `"text"` | Plain text input |
|
||||||
|
| `"password"` | Password input (masked) |
|
||||||
|
| `"url"` | URL input |
|
||||||
|
| `"checkbox"` | Boolean checkbox |
|
||||||
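
As a concrete illustration, here is a hypothetical `fields()` return value that exercises each widget type above. The dict keys follow the `fields()` contract from the interface table; the field names, placeholders, and help text are invented for the example:

```python
def fields(self) -> list[dict]:
    # Illustrative only: one entry per supported widget type.
    return [
        {"key": "api_key", "label": "API key", "type": "password",
         "placeholder": "sk-abc123", "required": True, "help": "Generated in your account settings."},
        {"key": "base_url", "label": "Base URL", "type": "url",
         "placeholder": "https://api.example.com", "required": True, "help": ""},
        {"key": "workspace_id", "label": "Workspace", "type": "text",
         "placeholder": "ws-123", "required": False, "help": ""},
        {"key": "sync_closed", "label": "Sync closed jobs", "type": "checkbox",
         "placeholder": "", "required": False, "help": ""},
    ]
```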
docs/developer-guide/adding-scrapers.md (new file, 244 lines)

# Adding a Custom Job Board Scraper

Peregrine supports pluggable custom job board scrapers. Standard boards use the JobSpy library. Custom scrapers handle boards with non-standard APIs, paywalls, or SSR-rendered pages.

This guide walks through adding a new scraper from scratch.

---

## Step 1 — Create the scraper module

Create `scripts/custom_boards/myboard.py`. Every custom scraper must implement one function:

```python
# scripts/custom_boards/myboard.py
from datetime import datetime

import requests


def scrape(profile: dict, db_path: str) -> list[dict]:
    """
    Scrape job listings from MyBoard for the given search profile.

    Args:
        profile: The active search profile dict from search_profiles.yaml.
            Keys include: titles (list), locations (list),
            hours_old (int), results_per_board (int).
        db_path: Absolute path to staging.db. Use this if you need to
            check for existing URLs before returning.

    Returns:
        List of job dicts. Each dict must contain at minimum:
            title (str) — job title
            company (str) — company name
            url (str) — canonical job URL (used as unique key)
            source (str) — board identifier, e.g. "myboard"
            location (str) — "Remote" or "City, State"
            is_remote (bool) — True if remote
            salary (str) — salary string or "" if unknown
            description (str) — full job description text or "" if unavailable
            date_found (str) — ISO 8601 datetime string, e.g. "2026-02-25T12:00:00"
    """
    jobs = []

    for title in profile.get("titles", []):
        for location in profile.get("locations", []):
            results = _fetch_from_myboard(title, location, profile)
            jobs.extend(results)

    return jobs


def _fetch_from_myboard(title: str, location: str, profile: dict) -> list[dict]:
    """Internal helper — call the board's API and transform results."""
    params = {
        "q": title,
        "l": location,
        "limit": profile.get("results_per_board", 50),
    }

    try:
        resp = requests.get(
            "https://api.myboard.com/jobs",
            params=params,
            timeout=15,
        )
        resp.raise_for_status()
        data = resp.json()
    except Exception as e:
        print(f"[myboard] fetch error: {e}")
        return []

    jobs = []
    for item in data.get("results", []):
        jobs.append({
            "title": item.get("title", ""),
            "company": item.get("company", ""),
            "url": item.get("url", ""),
            "source": "myboard",
            "location": item.get("location", ""),
            "is_remote": "remote" in item.get("location", "").lower(),
            "salary": item.get("salary", ""),
            "description": item.get("description", ""),
            "date_found": datetime.utcnow().isoformat(),
        })

    return jobs
```

### Required fields

| Field | Type | Notes |
|-------|------|-------|
| `title` | str | Job title |
| `company` | str | Company name |
| `url` | str | **Unique key** — must be stable and canonical |
| `source` | str | Short board identifier, e.g. `"myboard"` |
| `location` | str | `"Remote"` or `"City, ST"` |
| `is_remote` | bool | `True` if remote |
| `salary` | str | Salary string or `""` |
| `description` | str | Full description text or `""` |
| `date_found` | str | ISO 8601 UTC datetime |

### Deduplication

`discover.py` deduplicates by `url` before inserting into the database. If a job with the same URL already exists, it is silently skipped. You do not need to handle deduplication inside your scraper.

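If the board's API is expensive or rate-limited, you can still pre-filter against `db_path` and skip URLs that are already staged. A sketch, assuming the `jobs` table keys rows by a `url` column as described above:

```python
import sqlite3


def _already_seen(db_path: str, url: str) -> bool:
    """Optional pre-filter: True if this URL is already in staging.db."""
    try:
        with sqlite3.connect(db_path) as conn:
            row = conn.execute(
                "SELECT 1 FROM jobs WHERE url = ? LIMIT 1", (url,)
            ).fetchone()
        return row is not None
    except sqlite3.Error:
        # Table missing or DB not initialised yet: let discover.py dedup as usual.
        return False
```
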
### Rate limiting

Be a good citizen:

- Add a `time.sleep(0.5)` between paginated requests (see the sketch after this list)
- Respect `Retry-After` headers
- Do not scrape faster than a human browsing the site
- If the site provides an official API, prefer that over scraping HTML

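A minimal pagination sketch illustrating the first two points. The `page` query parameter and the 429/`Retry-After` behaviour are assumptions about MyBoard's API, not part of the scraper contract:

```python
import time

import requests


def _fetch_all_pages(params: dict, max_pages: int = 5) -> list[dict]:
    """Politely fetch up to max_pages of results from the (hypothetical) API."""
    results = []
    for page in range(1, max_pages + 1):
        for _attempt in range(3):
            resp = requests.get(
                "https://api.myboard.com/jobs",
                params={**params, "page": page},
                timeout=15,
            )
            if resp.status_code != 429:
                break
            # Honour the server's requested back-off before retrying this page.
            time.sleep(int(resp.headers.get("Retry-After", "5")))
        resp.raise_for_status()
        batch = resp.json().get("results", [])
        if not batch:
            break
        results.extend(batch)
        time.sleep(0.5)  # pause between paginated requests
    return results
```
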
### Credentials

If your scraper requires API keys or credentials:

- Create `config/myboard.yaml.example` as a template
- Create `config/myboard.yaml` (gitignored) for live credentials
- Read it in your scraper with `yaml.safe_load(open("config/myboard.yaml"))` (a more defensive version is sketched below)
- Document the credential setup in comments at the top of your module

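A sketch of that read with basic error handling, so a missing credentials file skips the board instead of crashing discovery (the file name and keys mirror the bullets above):

```python
from pathlib import Path

import yaml


def _load_credentials() -> dict:
    """Return the contents of config/myboard.yaml, or {} if it is not set up."""
    path = Path("config/myboard.yaml")
    if not path.exists():
        print("[myboard] config/myboard.yaml not found, skipping board")
        return {}
    with path.open() as f:
        return yaml.safe_load(f) or {}
```
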
---

## Step 2 — Register the scraper

Open `scripts/discover.py` and add your scraper to the `CUSTOM_SCRAPERS` dict:

```python
from scripts.custom_boards import adzuna, theladders, craigslist, myboard

CUSTOM_SCRAPERS = {
    "adzuna": adzuna.scrape,
    "theladders": theladders.scrape,
    "craigslist": craigslist.scrape,
    "myboard": myboard.scrape,  # add this line
}
```

---

## Step 3 — Activate in a search profile

Open `config/search_profiles.yaml` and add `myboard` to `custom_boards` in any profile:

```yaml
profiles:
  - name: cs_leadership
    boards:
      - linkedin
      - indeed
    custom_boards:
      - adzuna
      - myboard  # add this line
    titles:
      - Customer Success Manager
    locations:
      - Remote
```

---

## Step 4 — Write a test

Create `tests/test_myboard.py`. Mock the HTTP call to avoid hitting the live API during tests:

```python
# tests/test_myboard.py

from unittest.mock import patch

from scripts.custom_boards.myboard import scrape

MOCK_RESPONSE = {
    "results": [
        {
            "title": "Customer Success Manager",
            "company": "Acme Corp",
            "url": "https://myboard.com/jobs/12345",
            "location": "Remote",
            "salary": "$80,000 - $100,000",
            "description": "We are looking for a CSM...",
        }
    ]
}


def test_scrape_returns_correct_shape():
    profile = {
        "titles": ["Customer Success Manager"],
        "locations": ["Remote"],
        "results_per_board": 10,
        "hours_old": 240,
    }

    with patch("scripts.custom_boards.myboard.requests.get") as mock_get:
        mock_get.return_value.ok = True
        mock_get.return_value.raise_for_status = lambda: None
        mock_get.return_value.json.return_value = MOCK_RESPONSE

        jobs = scrape(profile, db_path="nonexistent.db")

    assert len(jobs) == 1
    job = jobs[0]

    # Required fields
    assert "title" in job
    assert "company" in job
    assert "url" in job
    assert "source" in job
    assert "location" in job
    assert "is_remote" in job
    assert "salary" in job
    assert "description" in job
    assert "date_found" in job

    assert job["source"] == "myboard"
    assert job["title"] == "Customer Success Manager"
    assert job["url"] == "https://myboard.com/jobs/12345"


def test_scrape_handles_http_error_gracefully():
    profile = {
        "titles": ["Customer Success Manager"],
        "locations": ["Remote"],
        "results_per_board": 10,
        "hours_old": 240,
    }

    with patch("scripts.custom_boards.myboard.requests.get") as mock_get:
        mock_get.side_effect = Exception("Connection refused")

        jobs = scrape(profile, db_path="nonexistent.db")

    assert jobs == []
```

---

## Existing Scrapers as Reference

| Scraper | Notes |
|---------|-------|
| `scripts/custom_boards/adzuna.py` | REST API with `app_id` + `app_key` authentication |
| `scripts/custom_boards/theladders.py` | SSR scraper using `curl_cffi` to parse `__NEXT_DATA__` JSON embedded in the page |
| `scripts/custom_boards/craigslist.py` | RSS feed scraper |
docs/developer-guide/architecture.md (new file, 286 lines)

# Architecture

This page describes Peregrine's system structure, layer boundaries, and key design decisions.

---

## System Overview

### Pipeline

```mermaid
flowchart LR
    sources["JobSpy\nCustom Boards"]
    discover["discover.py"]
    db[("staging.db\nSQLite")]
    match["match.py\nScoring"]
    review["Job Review\nApprove / Reject"]
    apply["Apply Workspace\nCover letter + PDF"]
    kanban["Interviews\nphone_screen → hired"]
    sync["sync.py"]
    notion["Notion DB"]

    sources --> discover --> db --> match --> review --> apply --> kanban
    db --> sync --> notion
```

### Docker Compose Services

Three compose files serve different deployment contexts:

| File | Project name | Port | Purpose |
|------|-------------|------|---------|
| `compose.yml` | `peregrine` | 8502 | Local self-hosted install (default) |
| `compose.demo.yml` | `peregrine-demo` | 8504 | Public demo at `demo.circuitforge.tech/peregrine` — `DEMO_MODE=true`, no LLM |
| `compose.cloud.yml` | `peregrine-cloud` | 8505 | Cloud managed instance at `menagerie.circuitforge.tech/peregrine` — `CLOUD_MODE=true`, per-user data |

```mermaid
flowchart TB
    subgraph local["compose.yml (local)"]
        app_l["**app** :8502\nStreamlit UI"]
        ollama_l["**ollama**\nLocal LLM"]
        vllm_l["**vllm**\nvLLM"]
        vision_l["**vision**\nMoondream2"]
        searxng_l["**searxng**\nWeb Search"]
        db_l[("staging.db\nSQLite")]
    end

    subgraph cloud["compose.cloud.yml (cloud)"]
        app_c["**app** :8505\nStreamlit UI\nCLOUD_MODE=true"]
        searxng_c["**searxng**\nWeb Search"]
        db_c[("menagerie-data/\n<user-id>/staging.db\nSQLCipher")]
        pg[("Postgres\nplatform DB\n:5433")]
    end
```

Solid lines = always connected. Dashed lines = optional/profile-dependent backends.

### Streamlit App Layer

```mermaid
flowchart TD
    entry["app/app.py\nEntry point · navigation · sidebar task badge"]

    setup["0_Setup.py\nFirst-run wizard\n⚠️ Gates everything"]
    review["1_Job_Review.py\nApprove / reject queue"]
    settings["2_Settings.py\nAll user configuration"]
    apply["4_Apply.py\nCover letter gen + PDF export"]
    interviews["5_Interviews.py\nKanban: phone_screen → hired"]
    prep["6_Interview_Prep.py\nResearch brief + practice Q&A"]
    survey["7_Survey.py\nCulture-fit survey assistant"]
    wizard["app/wizard/\nstep_hardware.py … step_integrations.py\ntiers.py — feature gate definitions"]

    entry --> setup
    entry --> review
    entry --> settings
    entry --> apply
    entry --> interviews
    entry --> prep
    entry --> survey
    setup <-.->|wizard steps| wizard
```

### Scripts Layer

Framework-independent — no Streamlit imports. Can be called from CLI, FastAPI, or background threads.

| Script | Purpose |
|--------|---------|
| `discover.py` | JobSpy + custom board orchestration |
| `match.py` | Resume keyword scoring |
| `db.py` | All SQLite helpers (single source of truth) |
| `llm_router.py` | LLM fallback chain |
| `generate_cover_letter.py` | Cover letter generation |
| `company_research.py` | Pre-interview research brief |
| `task_runner.py` | Background daemon thread executor |
| `imap_sync.py` | IMAP email fetch + classify |
| `sync.py` | Push to external integrations |
| `user_profile.py` | `UserProfile` wrapper for `user.yaml` |
| `preflight.py` | Port + resource check |
| `custom_boards/` | Per-board scrapers |
| `integrations/` | Per-service integration drivers |
| `vision_service/` | FastAPI Moondream2 inference server |

### Config Layer

Plain YAML files. Gitignored files contain secrets; `.example` files are committed as templates.

| File | Purpose |
|------|---------|
| `config/user.yaml` | Personal data + wizard state |
| `config/llm.yaml` | LLM backends + fallback chains |
| `config/search_profiles.yaml` | Job search configuration |
| `config/resume_keywords.yaml` | Scoring keywords |
| `config/blocklist.yaml` | Excluded companies/domains |
| `config/email.yaml` | IMAP credentials |
| `config/integrations/` | Per-integration credentials |

### Database Layer

**Local mode** — `staging.db`: SQLite, single file, gitignored.

**Cloud mode** — Hybrid:

- **Postgres (platform layer):** account data, subscriptions, telemetry consent. Shared across all users.
- **SQLite-per-user (content layer):** each user's job data in an isolated, SQLCipher-encrypted file at `/devl/menagerie-data/<user-id>/peregrine/staging.db`. Schema is identical to local — the app sees no difference.

#### Local SQLite tables

| Table | Purpose |
|-------|---------|
| `jobs` | Core pipeline — all job data |
| `job_contacts` | Email thread log per job |
| `company_research` | LLM-generated research briefs |
| `background_tasks` | Async task queue state |
| `survey_responses` | Culture-fit survey Q&A pairs |

#### Postgres platform tables (cloud only)

| Table | Purpose |
|-------|---------|
| `subscriptions` | User tier, license JWT, product |
| `usage_events` | Anonymous usage telemetry (consent-gated) |
| `telemetry_consent` | Per-user telemetry preferences + hard kill switch |
| `support_access_grants` | Time-limited support session grants |

---

### Cloud Session Middleware

`app/cloud_session.py` handles multi-tenant routing transparently:

```
Request → Caddy injects X-CF-Session header (from Directus session cookie)
        → resolve_session() validates JWT, derives db_path + db_key
        → all DB calls use get_db_path() instead of DEFAULT_DB
```

Key functions:

| Function | Purpose |
|----------|---------|
| `resolve_session(app)` | Called at top of every page — no-op in local mode |
| `get_db_path()` | Returns per-user `db_path` (cloud) or `DEFAULT_DB` (local) |
| `derive_db_key(user_id)` | `HMAC(SERVER_SECRET, user_id)` — deterministic per-user SQLCipher key |

The app code never branches on `CLOUD_MODE` except at the entry points (`resolve_session` and `get_db_path`). Everything downstream is transparent.

### Telemetry (cloud only)

`app/telemetry.py` is the **only** path to the `usage_events` table. No feature may write there directly.

```python
from app.telemetry import log_usage_event

log_usage_event(user_id, "peregrine", "cover_letter_generated", {"words": 350})
```

- Complete no-op when `CLOUD_MODE=false`
- Checks `telemetry_consent.all_disabled` first — if set, nothing is written, no exceptions
- Swallows all exceptions so telemetry never crashes the app

---

## Layer Boundaries

### App layer (app/)

The Streamlit UI layer. Its only responsibilities are:

- Reading from `scripts/db.py` helpers
- Calling `scripts/` functions directly or via `task_runner.submit_task()`
- Rendering results to the browser

The app layer does not contain business logic. Database queries, LLM calls, and integrations all live in `scripts/`.

### Scripts layer (scripts/)

This is the stable public API of Peregrine. Scripts are designed to be framework-independent — they do not import Streamlit and can be called from a CLI, FastAPI endpoint, or background thread without modification.

All personal data access goes through `scripts/user_profile.py` (`UserProfile` class). Scripts never read `config/user.yaml` directly.

All database access goes through `scripts/db.py`. No script does raw SQLite outside of `db.py`.

### Config layer (config/)

Plain YAML files. Gitignored files contain secrets; `.example` files are committed as templates.

---

## Background Tasks

`scripts/task_runner.py` provides a simple background thread executor for long-running LLM tasks.

```python
from scripts.task_runner import submit_task

# Queue a cover letter generation task
submit_task(db_path, task_type="cover_letter", job_id=42)

# Queue a company research task
submit_task(db_path, task_type="company_research", job_id=42)
```

Tasks are recorded in the `background_tasks` table with the following state machine:

```mermaid
stateDiagram-v2
    [*] --> queued : submit_task()
    queued --> running : daemon picks up
    running --> completed
    running --> failed
    queued --> failed : server restart clears stuck tasks
    completed --> [*]
    failed --> [*]
```

**Dedup rule:** Only one `queued` or `running` task per `(task_type, job_id)` pair is allowed at a time. Submitting a duplicate is a silent no-op.

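A sketch of the guard that enforces this inside `submit_task()`; the column names are assumed from the table description above:

```python
def _has_active_task(conn, task_type: str, job_id: int) -> bool:
    """True if a queued or running task already exists for this (task_type, job_id)."""
    row = conn.execute(
        "SELECT 1 FROM background_tasks"
        " WHERE task_type = ? AND job_id = ?"
        " AND status IN ('queued', 'running') LIMIT 1",
        (task_type, job_id),
    ).fetchone()
    return row is not None
```
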
**On startup:** `app/app.py` resets any `running` or `queued` rows to `failed` to clear tasks that were interrupted by a server restart.

**Sidebar indicator:** `app/app.py` polls the `background_tasks` table every 3 seconds via a Streamlit fragment and displays a badge in the sidebar.

---

## LLM Router

`scripts/llm_router.py` provides a single `complete()` call that tries backends in priority order and falls back transparently. See [LLM Router](../reference/llm-router.md) for full documentation.

---

## Key Design Decisions

### scripts/ is framework-independent

The scripts layer was deliberately kept free of Streamlit imports. This means the full pipeline can be migrated to a FastAPI or Celery backend without rewriting business logic.

### All personal data via UserProfile

`scripts/user_profile.py` is the single source of truth for all user data. This makes it easy to swap the storage backend (e.g. from YAML to a database) without touching every script.

### SQLite as staging layer

`staging.db` acts as the staging layer between discovery and external integrations. This lets discovery, matching, and the UI all run independently without network dependencies. External integrations (Notion, Airtable, etc.) are push-only and optional.

### Tier system in app/wizard/tiers.py

`FEATURES` is a single dict that maps feature key → minimum tier. `can_use(tier, feature)` is the single gating function. New features are added to `FEATURES` in one place.

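A minimal sketch of that shape; the feature keys here are illustrative, not the real gate table:

```python
# app/wizard/tiers.py (illustrative shape only)
_TIER_ORDER = {"free": 0, "paid": 1, "premium": 2}

FEATURES = {
    "job_discovery": "free",
    "cover_letters": "paid",
    "fine_tuning": "premium",
}


def can_use(tier: str, feature: str) -> bool:
    """Single gating function: can this tier use this feature?"""
    required = FEATURES.get(feature, "premium")
    return _TIER_ORDER.get(tier, 0) >= _TIER_ORDER[required]
```
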
### Vision service is a separate process

Moondream2 requires `torch` and `transformers`, which are incompatible with the lightweight main conda environment. The vision service runs as a separate FastAPI process in a separate conda environment (`job-seeker-vision`), keeping the main env free of GPU dependencies.

### Cloud mode is a transparent layer, not a fork

`CLOUD_MODE=true` activates two entry points (`resolve_session`, `get_db_path`) and the telemetry middleware. Every other line of app code is unchanged. There is no cloud branch, no conditional imports, no schema divergence. The local-first architecture is preserved end-to-end; the cloud layer sits on top of it.

### SQLite-per-user instead of shared Postgres

Each cloud user gets their own encrypted SQLite file. This means:

- No SQL migrations when the schema changes — new users get the latest schema, existing users keep their file as-is
- Zero risk of cross-user data leakage at the DB layer
- GDPR deletion is `rm -rf /devl/menagerie-data/<user-id>/` — auditable and complete
- The app can be tested locally with `CLOUD_MODE=false` without any Postgres dependency

The Postgres platform DB holds only account metadata (subscriptions, consent, telemetry) — never job search content.
docs/developer-guide/cloud-deployment.md (new file, 198 lines)

# Cloud Deployment

This page covers operating the Peregrine cloud managed instance at `menagerie.circuitforge.tech/peregrine`.

---

## Architecture Overview

```
Browser → Caddy (bastion) → host:8505 → peregrine-cloud container
                                │
        ┌───────────────────────┼──────────────────────────┐
        │                       │                          │
cloud_session.py      /devl/menagerie-data/          Postgres :5433
(session routing)     <user-id>/peregrine/           (platform DB)
                      staging.db (SQLCipher)
```

Caddy injects the Directus session cookie as `X-CF-Session`. `cloud_session.py` validates the JWT, derives the per-user db path and SQLCipher key, and injects both into `st.session_state`. All downstream DB calls are transparent — the app never knows it's multi-tenant.

---

## Compose File

```bash
# Start
docker compose -f compose.cloud.yml --project-name peregrine-cloud --env-file .env up -d

# Stop
docker compose -f compose.cloud.yml --project-name peregrine-cloud down

# Logs
docker compose -f compose.cloud.yml --project-name peregrine-cloud logs app -f

# Rebuild after code changes
docker compose -f compose.cloud.yml --project-name peregrine-cloud build app
docker compose -f compose.cloud.yml --project-name peregrine-cloud up -d
```

---

## Required Environment Variables

These must be present in `.env` (gitignored) before starting the cloud stack:

| Variable | Description | Where to find |
|----------|-------------|---------------|
| `CLOUD_MODE` | Must be `true` | Hardcoded in compose.cloud.yml |
| `CLOUD_DATA_ROOT` | Host path for per-user data trees | `/devl/menagerie-data` |
| `DIRECTUS_JWT_SECRET` | Directus signing secret — validates session JWTs | `website/.env` → `DIRECTUS_SECRET` |
| `CF_SERVER_SECRET` | Server secret for SQLCipher key derivation | Generate: `openssl rand -base64 32 \| tr -d '/=+' \| cut -c1-32` |
| `PLATFORM_DB_URL` | Postgres connection string for platform DB | `postgresql://cf_platform:<pass>@host.docker.internal:5433/circuitforge_platform` |

!!! warning "SECRET ROTATION"
    `CF_SERVER_SECRET` is used to derive all per-user SQLCipher keys via `HMAC(secret, user_id)`. Rotating this secret renders all existing user databases unreadable. Do not rotate it without a migration plan.

---

## Data Root

User data lives at `/devl/menagerie-data/` on the host, bind-mounted into the container:

```
/devl/menagerie-data/
  <directus-user-uuid>/
    peregrine/
      staging.db   ← SQLCipher-encrypted (AES-256)
      config/      ← llm.yaml, server.yaml, user.yaml, etc.
      data/        ← documents, exports, attachments
```

The directory is created automatically on first login. The SQLCipher key for each user is derived deterministically: `HMAC-SHA256(CF_SERVER_SECRET, user_id)`.

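In Python terms the derivation is roughly the following; the exact output encoding is an implementation detail of `cloud_session.py`:

```python
import hashlib
import hmac
import os


def derive_db_key(user_id: str) -> str:
    """HMAC-SHA256(CF_SERVER_SECRET, user_id), hex-encoded."""
    secret = os.environ["CF_SERVER_SECRET"].encode()
    return hmac.new(secret, user_id.encode(), hashlib.sha256).hexdigest()
```
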
### GDPR / Data deletion

To fully delete a user's data:

```bash
# Remove all content data
rm -rf /devl/menagerie-data/<user-id>/

# Remove platform DB rows (cascades)
docker exec cf-platform-db psql -U cf_platform -d circuitforge_platform \
  -c "DELETE FROM subscriptions WHERE user_id = '<user-id>';"
```

---

## Platform Database

The Postgres platform DB runs as `cf-platform-db` in the website compose stack (port 5433 on host).

```bash
# Connect
docker exec cf-platform-db psql -U cf_platform -d circuitforge_platform

# Check tables
\dt

# View telemetry consent for a user
SELECT * FROM telemetry_consent WHERE user_id = '<uuid>';

# View recent usage events
SELECT user_id, event_type, occurred_at FROM usage_events
ORDER BY occurred_at DESC LIMIT 20;
```

The schema is initialised on container start from `platform-db/init.sql` in the website repo.

---

## Telemetry

`app/telemetry.py` is the **only** entry point to `usage_events`. Never write to that table directly.

```python
from app.telemetry import log_usage_event

# Fires in cloud mode only; no-op locally
log_usage_event(user_id, "peregrine", "cover_letter_generated", {"words": 350})
```

Events are blocked if:

1. `telemetry_consent.all_disabled = true` (hard kill switch, overrides all)
2. `telemetry_consent.usage_events_enabled = false`

The user controls both from Settings → 🔒 Privacy.

---

## Backup / Restore (Cloud Mode)

The Settings → 💾 Data tab handles backup/restore transparently. In cloud mode:

- **Export:** the SQLCipher-encrypted DB is decrypted before zipping — the downloaded `.zip` is a portable plain SQLite archive, compatible with any local Docker install.
- **Import:** a plain SQLite backup is re-encrypted with the user's key on restore.

The user's `base_dir` in cloud mode is `get_db_path().parent` (`/devl/menagerie-data/<user-id>/peregrine/`), not the app root.

---

## Routing (Caddy)

`menagerie.circuitforge.tech` in `/devl/caddy-proxy/Caddyfile`:

```caddy
menagerie.circuitforge.tech {
    encode gzip zstd
    handle /peregrine* {
        reverse_proxy http://host.docker.internal:8505 {
            header_up X-CF-Session {header.Cookie}
        }
    }
    handle {
        respond "This app is not yet available in the managed cloud — check back soon." 503
    }
    log {
        output file /data/logs/menagerie.circuitforge.tech.log
        format json
    }
}
```

`header_up X-CF-Session {header.Cookie}` passes the full cookie header so `cloud_session.py` can extract the Directus session token.

!!! note "Caddy inode gotcha"
    After editing the Caddyfile, run `docker restart caddy-proxy` — not `caddy reload`. The Edit tool creates a new inode; Docker bind mounts pin to the original inode and `caddy reload` re-reads the stale one.

---

## Demo Instance

The public demo at `demo.circuitforge.tech/peregrine` runs separately:

```bash
# Start demo
docker compose -f compose.demo.yml --project-name peregrine-demo up -d

# Rebuild after code changes
docker compose -f compose.demo.yml --project-name peregrine-demo build app
docker compose -f compose.demo.yml --project-name peregrine-demo up -d
```

`DEMO_MODE=true` blocks all LLM inference calls at `llm_router.py`. Discovery, job enrichment, and the UI work normally. Demo data lives in `demo/config/` and `demo/data/` — isolated from personal data.

---

## Adding a New App to the Cloud

To onboard a new menagerie app (e.g. `falcon`) to the cloud:

1. Add `resolve_session("falcon")` at the top of each page (calls `cloud_session.py` with the app slug)
2. Replace `DEFAULT_DB` references with `get_db_path()` (see the sketch after this list)
3. Add `app/telemetry.py` import and `log_usage_event()` calls at key action points
4. Create `compose.cloud.yml` following the Peregrine pattern (port, `CLOUD_MODE=true`, data mount)
5. Add a Caddy `handle /falcon*` block in `menagerie.circuitforge.tech`, routing to the new port
6. `cloud_session.py` automatically creates `<data_root>/<user-id>/falcon/` on first login
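
Steps 1 and 2 amount to a few lines at the top of each page. A sketch for a hypothetical `falcon` page, using the functions described in the architecture docs:

```python
# Hypothetical top of a falcon Streamlit page (sketch only)
from app.cloud_session import resolve_session, get_db_path

resolve_session("falcon")   # validates the cloud session; no-op when CLOUD_MODE=false
db_path = get_db_path()     # per-user encrypted DB in cloud mode, DEFAULT_DB locally

# ...page logic reads and writes through scripts/db.py helpers using db_path...
```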
docs/developer-guide/contributing.md (new file, 120 lines)

# Contributing

Thank you for your interest in contributing to Peregrine. This guide covers the development environment, code standards, test requirements, and pull request process.

!!! note "License"
    Peregrine uses a dual licence. The discovery pipeline (`scripts/discover.py`, `scripts/match.py`, `scripts/db.py`, `scripts/custom_boards/`) is MIT. All AI features, the UI, and everything else are BSL 1.1.

    Do not add `Co-Authored-By:` trailers or AI-attribution notices to commits — this is a commercial repository.

---

## Fork and Clone

```bash
git clone https://git.circuitforge.io/circuitforge/peregrine
cd peregrine
```

Create a feature branch from `main`:

```bash
git checkout -b feat/my-feature
```

---

## Dev Environment Setup

Peregrine's Python dependencies are managed with conda. The same `job-seeker` environment is used for both the legacy personal app and Peregrine.

```bash
# Create the environment from the lockfile
conda env create -f environment.yml

# Activate
conda activate job-seeker
```

Alternatively, install from `requirements.txt` into an existing Python 3.12 environment:

```bash
pip install -r requirements.txt
```

!!! warning "Keep the env lightweight"
    Do not add `torch`, `sentence-transformers`, `bitsandbytes`, `transformers`, or any other CUDA/GPU package to the main environment. These live in separate conda environments (`job-seeker-vision` for the vision service, `ogma` for fine-tuning). Adding them to the main env causes out-of-memory failures during test runs.

---

## Running Tests

```bash
conda run -n job-seeker python -m pytest tests/ -v
```

Or with the direct binary (avoids runaway process spawning):

```bash
/path/to/miniconda3/envs/job-seeker/bin/pytest tests/ -v
```

The `pytest.ini` file scopes collection to the `tests/` directory only — do not widen this.

All tests must pass before submitting a PR. See [Testing](testing.md) for patterns and conventions.

---

## Code Style

- **PEP 8** for all Python code — use `flake8` or `ruff` to check
- **Type hints preferred** on function signatures — not required but strongly encouraged
- **Docstrings** on all public functions and classes
- **No print statements** in library code (`scripts/`); use Python's `logging` module or return a status value (see the sketch after this list). `print` is acceptable in one-off scripts and `discover.py`-style entry points.

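A minimal sketch of the logging pattern for a `scripts/` module:

```python
import logging

logger = logging.getLogger(__name__)


def classify_messages(messages: list[dict]) -> int:
    """Example scripts/ helper that logs instead of printing."""
    logger.info("classifying %d messages", len(messages))
    kept = [m for m in messages if m.get("subject")]
    logger.debug("kept %d messages with a subject line", len(kept))
    return len(kept)
```
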
---

## Branch Naming

| Prefix | Use for |
|--------|---------|
| `feat/` | New features |
| `fix/` | Bug fixes |
| `docs/` | Documentation only |
| `refactor/` | Code reorganisation without behaviour change |
| `test/` | Test additions or corrections |
| `chore/` | Dependency updates, CI, tooling |

Example: `feat/add-greenhouse-scraper`, `fix/email-imap-timeout`, `docs/add-integration-guide`

---

## PR Checklist

Before opening a pull request:

- [ ] All tests pass: `conda run -n job-seeker python -m pytest tests/ -v`
- [ ] New behaviour is covered by at least one test
- [ ] No new dependencies added to `environment.yml` or `requirements.txt` without a clear justification in the PR description
- [ ] Documentation updated if the PR changes user-visible behaviour (update the relevant page in `docs/`)
- [ ] Config file changes are reflected in the `.example` file
- [ ] No secrets, tokens, or personal data in any committed file
- [ ] Gitignored files (`config/*.yaml`, `staging.db`, `aihawk/`, `.env`) are not committed

---

## What NOT to Do

- Do not commit `config/user.yaml`, `config/notion.yaml`, `config/email.yaml`, `config/adzuna.yaml`, or any `config/integrations/*.yaml` — all are gitignored
- Do not commit `staging.db`
- Do not add `torch`, `bitsandbytes`, `transformers`, or `sentence-transformers` to the main environment
- Do not add `Co-Authored-By:` or AI-attribution lines to commit messages
- Do not force-push to `main`

---

## Getting Help

Open an issue on the repository with the `question` label. Include:

- Your OS and Docker version
- The `inference_profile` from your `config/user.yaml`
- Relevant log output from `make logs`
docs/developer-guide/testing.md (new file, 181 lines)

# Testing

Peregrine has a test suite covering the core scripts layer, LLM router, integrations, wizard steps, and database helpers.

---

## Running the Test Suite

```bash
conda run -n job-seeker python -m pytest tests/ -v
```

Or using the direct binary (recommended to avoid runaway process spawning):

```bash
/path/to/miniconda3/envs/job-seeker/bin/pytest tests/ -v
```

`pytest.ini` scopes test collection to `tests/` only:

```ini
[pytest]
testpaths = tests
```

Do not widen this — the `aihawk/` subtree has its own test files that pull in GPU dependencies.

---

## What Is Covered

The suite currently has approximately 219 tests covering:

| Module | What is tested |
|--------|---------------|
| `scripts/db.py` | CRUD helpers, status transitions, dedup logic |
| `scripts/llm_router.py` | Fallback chain, backend selection, vision routing, error handling |
| `scripts/match.py` | Keyword scoring, gap calculation |
| `scripts/imap_sync.py` | Email parsing, classification label mapping |
| `scripts/company_research.py` | Prompt construction, output parsing |
| `scripts/generate_cover_letter.py` | Mission alignment detection, prompt injection |
| `scripts/task_runner.py` | Task submission, dedup, status transitions |
| `scripts/user_profile.py` | Accessor methods, defaults, YAML round-trip |
| `scripts/integrations/` | Base class contract, per-driver `fields()` and `connect()` |
| `app/wizard/tiers.py` | `can_use()`, `tier_label()`, edge cases |
| `scripts/custom_boards/` | Scraper return shape, HTTP error handling |

---

## Test Structure

Tests live in `tests/`. File naming mirrors the module being tested:

```
tests/
  test_db.py
  test_llm_router.py
  test_match.py
  test_imap_sync.py
  test_company_research.py
  test_cover_letter.py
  test_task_runner.py
  test_user_profile.py
  test_integrations.py
  test_tiers.py
  test_adzuna.py
  test_theladders.py
```

---

## Key Patterns

### tmp_path for YAML files

Use pytest's built-in `tmp_path` fixture for any test that reads or writes YAML config files:

```python
def test_user_profile_reads_name(tmp_path):
    config = tmp_path / "user.yaml"
    config.write_text("name: Alice\nemail: alice@example.com\n")

    from scripts.user_profile import UserProfile
    profile = UserProfile(config_path=config)
    assert profile.name == "Alice"
```

### Mocking LLM calls

Never make real LLM calls in tests. Patch `LLMRouter.complete`:

```python
from unittest.mock import patch

def test_cover_letter_calls_llm(tmp_path):
    with patch("scripts.generate_cover_letter.LLMRouter") as MockRouter:
        MockRouter.return_value.complete.return_value = "Dear Hiring Manager,\n..."
        from scripts.generate_cover_letter import generate
        result = generate(job={...}, user_profile={...})

    assert "Dear Hiring Manager" in result
    MockRouter.return_value.complete.assert_called_once()
```

### Mocking HTTP in scraper tests

```python
from unittest.mock import patch

def test_adzuna_returns_jobs():
    with patch("scripts.custom_boards.adzuna.requests.get") as mock_get:
        mock_get.return_value.ok = True
        mock_get.return_value.raise_for_status = lambda: None
        mock_get.return_value.json.return_value = {"results": [...]}

        from scripts.custom_boards.adzuna import scrape
        jobs = scrape(profile={...}, db_path="nonexistent.db")

    assert len(jobs) > 0
```

### Temporary SQLite files for DB tests

```python
import os
import tempfile

def test_insert_job():
    with tempfile.NamedTemporaryFile(suffix=".db", delete=False) as f:
        db_path = f.name
    try:
        from scripts.db import init_db, insert_job
        init_db(db_path)
        insert_job(db_path, title="CSM", company="Acme", url="https://example.com/1", ...)
        # assert...
    finally:
        os.unlink(db_path)
```

---

## What NOT to Test

- **Streamlit widget rendering** — Streamlit has no headless test support. Do not try to test `st.button()` or `st.text_input()` calls. Test the underlying script functions instead.
- **Real network calls** — always mock HTTP and LLM clients
- **Real GPU inference** — mock the vision service and LLM router

---

## Adding Tests for New Code

### New scraper

Create `tests/test_myboard.py`. Required test cases:

1. Happy path: mock HTTP returns valid data → correct job dict shape
2. HTTP error: mock raises `Exception` → function returns `[]` (does not raise)
3. Empty results: API returns `{"results": []}` → function returns `[]` (see the sketch after this list)

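For example, the empty-results case (3) could reuse the HTTP-mocking pattern above:

```python
from unittest.mock import patch


def test_scrape_returns_empty_list_when_api_has_no_results():
    profile = {
        "titles": ["Customer Success Manager"],
        "locations": ["Remote"],
        "results_per_board": 10,
        "hours_old": 240,
    }

    with patch("scripts.custom_boards.myboard.requests.get") as mock_get:
        mock_get.return_value.raise_for_status = lambda: None
        mock_get.return_value.json.return_value = {"results": []}

        from scripts.custom_boards.myboard import scrape
        jobs = scrape(profile, db_path="nonexistent.db")

    assert jobs == []
```
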
### New integration

Add to `tests/test_integrations.py`. Required test cases:

1. `fields()` returns list of dicts with required keys
2. `connect()` returns `True` with valid config, `False` with missing required field
3. `test()` returns `True` with mocked successful HTTP, `False` with exception
4. `is_configured()` reflects file presence in `tmp_path`

### New wizard step

Add to `tests/test_wizard_steps.py`. Test the step's pure-logic functions (validation, data extraction). Do not test the Streamlit rendering.

### New tier feature gate

Add to `tests/test_tiers.py`:

```python
from app.wizard.tiers import can_use

def test_my_new_feature_requires_paid():
    assert can_use("free", "my_new_feature") is False
    assert can_use("paid", "my_new_feature") is True
    assert can_use("premium", "my_new_feature") is True
```
docs/getting-started/docker-profiles.md (new file, 118 lines)

# Docker Profiles

Peregrine uses Docker Compose profiles to start only the services your hardware can support. Choose a profile with `make start PROFILE=<name>`.

---

## Profile Reference

| Profile | Services started | Use case |
|---------|----------------|----------|
| `remote` | `app`, `searxng` | No GPU. LLM calls go to an external API (Anthropic, OpenAI-compatible). |
| `cpu` | `app`, `ollama`, `searxng` | No GPU. Runs local models on CPU — functional but slow. |
| `single-gpu` | `app`, `ollama`, `vision`, `searxng` | One NVIDIA GPU. Covers cover letters, research, and vision (survey screenshots). |
| `dual-gpu` | `app`, `ollama`, `vllm`, `vision`, `searxng` | Two NVIDIA GPUs. GPU 0 = Ollama (cover letters), GPU 1 = vLLM (research). |

---

## Service Descriptions

| Service | Image / Source | Port | Purpose |
|---------|---------------|------|---------|
| `app` | `Dockerfile` (Streamlit) | 8501 | The main Peregrine UI |
| `ollama` | `ollama/ollama` | 11434 | Local model inference — cover letters and general tasks |
| `vllm` | `vllm/vllm-openai` | 8000 | High-throughput local inference — research tasks |
| `vision` | `scripts/vision_service/` | 8002 | Moondream2 — survey screenshot analysis |
| `searxng` | `searxng/searxng` | 8888 | Private meta-search engine — company research web scraping |

---

## Choosing a Profile

### remote

Use `remote` if:

- You have no NVIDIA GPU
- You plan to use Anthropic Claude or another API-hosted model exclusively
- You want the fastest startup (only two containers)

You must configure at least one external LLM backend in **Settings → LLM Backends**.

### cpu

Use `cpu` if you have no GPU but want to run models locally (e.g. for privacy). It is acceptable for light use; cover letter generation may take several minutes per request.

Pull a model after the container starts:

```bash
docker exec -it peregrine-ollama-1 ollama pull llama3.1:8b
```

### single-gpu

Use `single-gpu` if you have one NVIDIA GPU with at least 8 GB VRAM. This is the recommended profile for most single-user installs. The vision service (Moondream2) starts on the same GPU using 4-bit quantisation (~1.5 GB VRAM).

### dual-gpu

Use `dual-gpu` if you have two or more NVIDIA GPUs. GPU 0 handles Ollama (cover letters, quick tasks), GPU 1 handles vLLM (research, long-context tasks), and the vision service shares GPU 0 with Ollama.

---

## GPU Memory Guidance

| GPU VRAM | Recommended profile | Notes |
|----------|-------------------|-------|
| < 4 GB | `cpu` | GPU too small for practical model loading |
| 4–8 GB | `single-gpu` | Run smaller models (3B–8B parameters) |
| 8–16 GB | `single-gpu` | Run 8B–13B models comfortably |
| 16–24 GB | `single-gpu` | Run 13B–34B models |
| 24 GB+ | `single-gpu` or `dual-gpu` | 70B models with quantisation |

---

## How preflight.py Works

`make start` calls `scripts/preflight.py` before launching Docker. Preflight does the following:

1. **Port conflict detection** — checks whether `STREAMLIT_PORT`, `OLLAMA_PORT`, `VLLM_PORT`, `SEARXNG_PORT`, and `VISION_PORT` are already in use. Reports any conflicts and suggests alternatives.

2. **GPU enumeration** — queries `nvidia-smi` for GPU count and VRAM per card.

3. **RAM check** — reads `/proc/meminfo` (Linux) or `vm_stat` (macOS) to determine available system RAM.

4. **KV cache offload** — if GPU VRAM is less than 10 GB, preflight calculates `CPU_OFFLOAD_GB` (the amount of KV cache to spill to system RAM) and writes it to `.env`. The vLLM container picks this up via `--cpu-offload-gb`.

5. **Profile recommendation** — writes `RECOMMENDED_PROFILE` to `.env`. This is informational; `make start` uses the `PROFILE` variable you specify (defaulting to `remote`).

You can run preflight independently:

```bash
make preflight
# or
python scripts/preflight.py
```

---

## Customising Ports

Edit `.env` before running `make start`:

```bash
STREAMLIT_PORT=8501
OLLAMA_PORT=11434
VLLM_PORT=8000
SEARXNG_PORT=8888
VISION_PORT=8002
```

All containers read from `.env` via the `env_file` directive in `compose.yml`.
docs/getting-started/first-run-wizard.md (new file, 165 lines)

# First-Run Wizard

When you open Peregrine for the first time, the setup wizard launches automatically. It walks through seven steps and saves your progress after each one — if your browser closes or the server restarts, it resumes where you left off.

---

## Step 1 — Hardware

Peregrine detects NVIDIA GPUs using `nvidia-smi` and reports:

- Number of GPUs found
- VRAM per GPU
- Available system RAM

Based on this, it recommends a Docker Compose profile:

| Recommendation | Condition |
|---------------|-----------|
| `remote` | No GPU detected |
| `cpu` | GPU detected but VRAM < 4 GB |
| `single-gpu` | One GPU with VRAM >= 4 GB |
| `dual-gpu` | Two or more GPUs |

You can override the recommendation and select any profile manually. The selection is written to `config/user.yaml` as `inference_profile`.

---

## Step 2 — Tier

Select your Peregrine tier:

| Tier | Description |
|------|-------------|
| **Free** | Job discovery, matching, and basic pipeline — no LLM features |
| **Paid** | Adds cover letters, company research, email sync, integrations, and all AI features |
| **Premium** | Adds fine-tuning and multi-user support |

Your tier is written to `config/user.yaml` as `tier`.

**Dev tier override** — for local testing without a paid licence, set `dev_tier_override: premium` in `config/user.yaml`. This is for development use only and has no effect on production deployments.

See [Tier System](../reference/tier-system.md) for the full feature gate table.

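Taken together, Steps 1 and 2 leave entries like these in `config/user.yaml` (values are examples):

```yaml
# config/user.yaml (excerpt)
inference_profile: single-gpu
tier: paid
# Development only: unlock gated features without a licence
dev_tier_override: premium
```
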
---

## Step 3 — Identity

Enter your personal details. These are stored locally in `config/user.yaml` and used to personalise cover letters and research briefs.

| Field | Description |
|-------|-------------|
| Name | Your full name |
| Email | Primary contact email |
| Phone | Contact phone number |
| LinkedIn | LinkedIn profile URL |
| Career summary | 2–4 sentence professional summary — used in cover letters and interview prep |

**LLM-assisted writing (Paid):** If you have a paid tier, the wizard offers to generate your career summary from a few bullet points using your configured LLM backend.

---

## Step 4 — Resume

Two paths are available:

### Upload PDF or DOCX

Upload your existing resume. The LLM parses it and extracts:

- Work experience (employer, title, dates, bullets)
- Education
- Skills
- Certifications

The extracted data is stored in `config/user.yaml` and used when generating cover letters.

### Guided form builder

Fill in each section manually using structured form fields. Useful if you do not have a digital resume file ready, or if the parser misses something important.

Both paths produce the same data structure. You can mix them — upload first, then edit the result in the form.

---

## Step 5 — Inference

Configure which LLM backends Peregrine uses. Backends are tried in priority order; if the first fails, Peregrine falls back to the next.

Available backend types:

| Type | Examples | Notes |
|------|---------|-------|
| `openai_compat` | Ollama, vLLM, Claude Code wrapper, Copilot wrapper | Any OpenAI-compatible API |
| `anthropic` | Claude via Anthropic API | Requires `ANTHROPIC_API_KEY` env var |
| `vision_service` | Moondream2 local service | Used for survey screenshot analysis only |

For each backend you want to enable:

1. Enter the base URL (e.g. `http://localhost:11434/v1` for Ollama)
2. Enter an API key if required (Anthropic, OpenAI)
3. Click **Test** — Peregrine pings the `/health` endpoint and attempts a short completion

The full backend configuration is written to `config/llm.yaml`. You can edit it directly later via **Settings → LLM Backends**.

!!! tip "Recommended minimum"
    Enable at least Ollama with a general-purpose model (e.g. `llama3.1:8b`) for research tasks, and either Ollama or Anthropic for cover letter generation. The wizard will not block you if no backend is configured, but most features will not work.

---

## Step 6 — Search

Define what jobs to look for. Search configuration is written to `config/search_profiles.yaml`.

| Field | Description |
|-------|-------------|
| Profile name | A label for this search profile (e.g. `cs_leadership`) |
| Job titles | List of titles to search for (e.g. `Customer Success Manager`, `TAM`) |
| Locations | City/region strings or `Remote` |
| Boards | Standard boards: `linkedin`, `indeed`, `glassdoor`, `zip_recruiter`, `google` |
| Custom boards | Additional scrapers: `adzuna`, `theladders`, `craigslist` |
| Exclude keywords | Jobs containing these words in the title are dropped |
| Results per board | Max jobs to fetch per board per run |
| Hours old | Only fetch jobs posted within this many hours |

You can create multiple profiles (e.g. one for remote roles, one for a target industry). Run them all from the Home page or run a specific one.

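A resulting profile might look something like this; the list-style keys match the example in the scraper guide, while the `exclude_keywords` key name is an assumption, so check the committed `.example` file for the exact spelling:

```yaml
# config/search_profiles.yaml (illustrative excerpt)
profiles:
  - name: cs_leadership
    titles:
      - Customer Success Manager
      - Technical Account Manager
    locations:
      - Remote
    boards:
      - linkedin
      - indeed
    custom_boards:
      - adzuna
    exclude_keywords:      # assumed key name for "Exclude keywords"
      - intern
    results_per_board: 50
    hours_old: 72
```
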
|
||||||
|
---
|
||||||
|
|
||||||
|
## Step 7 — Integrations
|
||||||
|
|
||||||
|
Connect optional external services. All integrations are optional — skip this step if you want to use Peregrine without external accounts.
|
||||||
|
|
||||||
|
Available integrations:
|
||||||
|
|
||||||
|
**Job tracking (Paid):** Notion, Airtable, Google Sheets
|
||||||
|
|
||||||
|
**Document storage (Free):** Google Drive, Dropbox, OneDrive, MEGA, Nextcloud
|
||||||
|
|
||||||
|
**Calendar (Paid):** Google Calendar, Apple Calendar (CalDAV)
|
||||||
|
|
||||||
|
**Notifications (Paid for Slack; Free for Discord and Home Assistant):** Slack, Discord, Home Assistant
|
||||||
|
|
||||||
|
Each integration has a connection card with the required credentials. Click **Test** to verify the connection before saving. Credentials are written to `config/integrations/<name>.yaml` (gitignored).
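
For example, a Notion connection card might end up as a small credentials file like the sketch below. The field names are assumptions, not the exact format Peregrine writes.

```yaml
# config/integrations/notion.yaml (hypothetical layout, gitignored)
token: secret_xxxxxxxxxxxx
database_id: 1234567890abcdef
```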
|
||||||
|
|
||||||
|
See [Integrations](../user-guide/integrations.md) for per-service details.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Crash Recovery
|
||||||
|
|
||||||
|
The wizard saves your progress to `config/user.yaml` after each step is completed (`wizard_step` field). If anything goes wrong:
|
||||||
|
|
||||||
|
- Restart Peregrine and navigate to http://localhost:8501
|
||||||
|
- The wizard resumes at the last completed step
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Re-entering the Wizard
|
||||||
|
|
||||||
|
To go through the wizard again (e.g. to change your search profile or swap LLM backends):
|
||||||
|
|
||||||
|
1. Open **Settings**
|
||||||
|
2. Go to the **Developer** tab
|
||||||
|
3. Click **Reset wizard**
|
||||||
|
|
||||||
|
This sets `wizard_complete: false` and `wizard_step: 0` in `config/user.yaml`. Your previously entered data is preserved as defaults.
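
After the reset, the relevant keys in `config/user.yaml` read:

```yaml
wizard_complete: false
wizard_step: 0
```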
|
||||||
134
docs/getting-started/installation.md
Normal file
|
|
@ -0,0 +1,134 @@
|
||||||
|
# Installation
|
||||||
|
|
||||||
|
This page walks through a full Peregrine installation from scratch.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Prerequisites
|
||||||
|
|
||||||
|
- **Git** — to clone the repository
|
||||||
|
- **Internet connection** — `setup.sh` downloads Docker and other dependencies
|
||||||
|
- **Operating system**: Ubuntu/Debian, Fedora/RHEL, Arch Linux, or macOS (with Docker Desktop)
|
||||||
|
|
||||||
|
!!! warning "Windows"
|
||||||
|
Windows is not supported. Use [WSL2 with Ubuntu](https://docs.microsoft.com/windows/wsl/install) instead.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Step 1 — Clone the repository
|
||||||
|
|
||||||
|
```bash
|
||||||
|
git clone https://git.circuitforge.io/circuitforge/peregrine
|
||||||
|
cd peregrine
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Step 2 — Run setup.sh
|
||||||
|
|
||||||
|
```bash
|
||||||
|
bash setup.sh
|
||||||
|
```
|
||||||
|
|
||||||
|
`setup.sh` performs the following automatically:
|
||||||
|
|
||||||
|
1. **Detects your platform** (Ubuntu/Debian, Fedora/RHEL, Arch, macOS)
|
||||||
|
2. **Installs Git** if not already present
|
||||||
|
3. **Installs Docker Engine** and the Docker Compose v2 plugin via the official Docker repositories
|
||||||
|
4. **Adds your user to the `docker` group** so you do not need `sudo` for docker commands (Linux only — log out and back in after this)
|
||||||
|
5. **Detects NVIDIA GPUs** — if `nvidia-smi` is present and working, installs the NVIDIA Container Toolkit and configures Docker to use it
|
||||||
|
6. **Creates `.env` from `.env.example`** — edit `.env` to customise ports and model storage paths before starting
|
||||||
|
|
||||||
|
!!! note "macOS"
|
||||||
|
`setup.sh` installs Docker Desktop via Homebrew (`brew install --cask docker`) and then exits. Launch Docker Desktop, wait for it to finish starting, then re-run the script.
|
||||||
|
|
||||||
|
!!! note "GPU requirement"
|
||||||
|
For GPU support, `nvidia-smi` must return output before you run `setup.sh`. Install your NVIDIA driver first. The Container Toolkit installation will fail silently if the driver is not present.
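
A quick way to confirm the driver and, after setup, the toolkit are working. These are standard NVIDIA/Docker checks, not Peregrine-specific commands, and the CUDA image tag is only an example.

```bash
# Driver check: must print a table of GPUs before running setup.sh
nvidia-smi

# After setup.sh: confirm Docker can see the GPU via the NVIDIA Container Toolkit
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi
```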
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Step 3 — (Optional) Edit .env
|
||||||
|
|
||||||
|
The `.env` file controls ports and volume mount paths. The defaults work for most single-user installs:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Default ports
|
||||||
|
STREAMLIT_PORT=8501
|
||||||
|
OLLAMA_PORT=11434
|
||||||
|
VLLM_PORT=8000
|
||||||
|
SEARXNG_PORT=8888
|
||||||
|
VISION_PORT=8002
|
||||||
|
```
|
||||||
|
|
||||||
|
Change `STREAMLIT_PORT` if 8501 is taken on your machine.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Step 4 — Start Peregrine
|
||||||
|
|
||||||
|
Choose a profile based on your hardware:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
make start # remote — no GPU, use API-only LLMs
|
||||||
|
make start PROFILE=cpu # cpu — local models on CPU (slow)
|
||||||
|
make start PROFILE=single-gpu # single-gpu — one NVIDIA GPU
|
||||||
|
make start PROFILE=dual-gpu # dual-gpu — GPU 0 = Ollama, GPU 1 = vLLM
|
||||||
|
```
|
||||||
|
|
||||||
|
`make start` runs `preflight.py` first, which checks for port conflicts and writes GPU/RAM recommendations back to `.env`. Then it calls `docker compose --profile <PROFILE> up -d`.
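
Under the hood this is roughly equivalent to running the two steps by hand. The exact preflight invocation is an assumption; only the compose command is quoted from the Makefile behaviour described above.

```bash
python3 preflight.py                       # port checks, GPU/RAM recommendations written to .env (invocation assumed)
docker compose --profile single-gpu up -d  # substitute your chosen profile
```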
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Step 5 — Open the UI
|
||||||
|
|
||||||
|
Navigate to **http://localhost:8501** (or whatever `STREAMLIT_PORT` you set).
|
||||||
|
|
||||||
|
The first-run wizard launches automatically. See [First-Run Wizard](first-run-wizard.md) for a step-by-step guide through all seven steps.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Supported Platforms
|
||||||
|
|
||||||
|
| Platform | Tested | Notes |
|
||||||
|
|----------|--------|-------|
|
||||||
|
| Ubuntu 22.04 / 24.04 | Yes | Primary target |
|
||||||
|
| Debian 12 | Yes | |
|
||||||
|
| Fedora 39/40 | Yes | |
|
||||||
|
| RHEL / Rocky / AlmaLinux | Yes | |
|
||||||
|
| Arch Linux / Manjaro | Yes | |
|
||||||
|
| macOS (Apple Silicon) | Yes | Docker Desktop required; no GPU support |
|
||||||
|
| macOS (Intel) | Yes | Docker Desktop required; no GPU support |
|
||||||
|
| Windows | No | Use WSL2 with Ubuntu |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## GPU Support
|
||||||
|
|
||||||
|
Only NVIDIA GPUs are supported. AMD ROCm is not currently supported.
|
||||||
|
|
||||||
|
Requirements:
|
||||||
|
- NVIDIA driver installed and `nvidia-smi` working before running `setup.sh`
|
||||||
|
- CUDA 12.x recommended (CUDA 11.x may work but is untested)
|
||||||
|
- Minimum 8 GB VRAM for `single-gpu` profile with default models
|
||||||
|
- For `dual-gpu`: GPU 0 is assigned to Ollama, GPU 1 to vLLM
|
||||||
|
|
||||||
|
If your GPU has less than 10 GB VRAM, `preflight.py` will calculate a `CPU_OFFLOAD_GB` value and write it to `.env`. The vLLM container picks this up via `--cpu-offload-gb` to overflow KV cache to system RAM.
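
For example, on an 8 GB card `preflight.py` might append something like the following to `.env`. The value shown is illustrative, not a recommendation.

```bash
# Written by preflight.py: amount of KV cache vLLM may spill to system RAM
CPU_OFFLOAD_GB=4
```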
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Stopping Peregrine
|
||||||
|
|
||||||
|
```bash
|
||||||
|
make stop # stop all containers
|
||||||
|
make restart # stop then start again (runs preflight first)
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Reinstalling / Clean State
|
||||||
|
|
||||||
|
```bash
|
||||||
|
make clean # removes containers, images, and data volumes (destructive)
|
||||||
|
```
|
||||||
|
|
||||||
|
You will be prompted to type `yes` to confirm.
|
||||||
65
docs/index.md
Normal file
|
|
@ -0,0 +1,65 @@
|
||||||
|
# Peregrine
|
||||||
|
|
||||||
|
**AI-powered job search pipeline — by [Circuit Forge LLC](https://circuitforge.io)**
|
||||||
|
|
||||||
|
Peregrine automates the full job search lifecycle: discovery, matching, cover letter generation, application tracking, and interview preparation. It is privacy-first and local-first — your data never leaves your machine unless you configure an external integration.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Quick Start
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# 1. Clone and install dependencies
|
||||||
|
git clone https://git.circuitforge.io/circuitforge/peregrine
|
||||||
|
cd peregrine
|
||||||
|
bash setup.sh
|
||||||
|
|
||||||
|
# 2. Start Peregrine
|
||||||
|
make start # no GPU, API-only
|
||||||
|
make start PROFILE=single-gpu # one NVIDIA GPU
|
||||||
|
make start PROFILE=dual-gpu # dual GPU (Ollama + vLLM)
|
||||||
|
|
||||||
|
# 3. Open the UI
|
||||||
|
# http://localhost:8501
|
||||||
|
```
|
||||||
|
|
||||||
|
The first-run wizard guides you through hardware detection, tier selection, identity, resume, LLM configuration, search profiles, and integrations. See [Installation](getting-started/installation.md) for the full walkthrough.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Feature Overview
|
||||||
|
|
||||||
|
| Feature | Free | Paid | Premium |
|
||||||
|
|---------|------|------|---------|
|
||||||
|
| Job discovery (JobSpy + custom boards) | Yes | Yes | Yes |
|
||||||
|
| Resume keyword matching | Yes | Yes | Yes |
|
||||||
|
| Cover letter generation | - | Yes | Yes |
|
||||||
|
| Company research briefs | - | Yes | Yes |
|
||||||
|
| Interview prep & practice Q&A | - | Yes | Yes |
|
||||||
|
| Email sync & auto-classification | - | Yes | Yes |
|
||||||
|
| Survey assistant (culture-fit Q&A) | - | Yes | Yes |
|
||||||
|
| Integration connectors (Notion, Airtable, etc.) | Partial | Yes | Yes |
|
||||||
|
| Calendar sync (Google, Apple) | - | Yes | Yes |
|
||||||
|
| Cover letter model fine-tuning | - | - | Yes |
|
||||||
|
| Multi-user support | - | - | Yes |
|
||||||
|
|
||||||
|
See [Tier System](reference/tier-system.md) for the full feature gate table.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Documentation Sections
|
||||||
|
|
||||||
|
- **[Getting Started](getting-started/installation.md)** — Install, configure, and launch Peregrine
|
||||||
|
- **[User Guide](user-guide/job-discovery.md)** — How to use every feature in the UI
|
||||||
|
- **[Developer Guide](developer-guide/contributing.md)** — Add scrapers, integrations, and contribute code
|
||||||
|
- **[Reference](reference/tier-system.md)** — Tier system, LLM router, and config file schemas
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## License
|
||||||
|
|
||||||
|
Core discovery pipeline: [MIT](https://git.circuitforge.io/circuitforge/peregrine/src/branch/main/LICENSE-MIT)
|
||||||
|
|
||||||
|
AI features (cover letter generation, company research, interview prep, UI): [BSL 1.1](https://git.circuitforge.io/circuitforge/peregrine/src/branch/main/LICENSE-BSL)
|
||||||
|
|
||||||
|
© 2026 Circuit Forge LLC
|
||||||
|
|
@ -1,201 +0,0 @@
|
||||||
# Job Seeker Platform — Design Document
|
|
||||||
**Date:** 2026-02-20
|
|
||||||
**Status:** Approved
|
|
||||||
**Candidate:** Alex Rivera
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Overview
|
|
||||||
|
|
||||||
A monorepo project at `/devl/job-seeker/` that integrates three FOSS tools into a
|
|
||||||
cohesive job search pipeline: automated discovery (JobSpy), resume-to-listing keyword
|
|
||||||
matching (Resume Matcher), and automated application submission (AIHawk). Job listings
|
|
||||||
and interactive documents are tracked in Notion; source documents live in
|
|
||||||
`/Library/Documents/JobSearch/`.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Project Structure
|
|
||||||
|
|
||||||
```
|
|
||||||
/devl/job-seeker/
|
|
||||||
├── config/
|
|
||||||
│ ├── search_profiles.yaml # JobSpy queries (titles, locations, boards)
|
|
||||||
│ ├── llm.yaml # LLM router: backends + fallback order
|
|
||||||
│ └── notion.yaml # Notion DB IDs and field mappings
|
|
||||||
├── aihawk/ # git clone — Auto_Jobs_Applier_AIHawk
|
|
||||||
├── resume_matcher/ # git clone — Resume-Matcher
|
|
||||||
├── scripts/
|
|
||||||
│ ├── discover.py # JobSpy → deduplicate → push to Notion
|
|
||||||
│ ├── match.py # Notion job URL → Resume Matcher → write score back
|
|
||||||
│ └── llm_router.py # LLM abstraction layer with priority fallback chain
|
|
||||||
├── docs/plans/ # Design and implementation docs (no resume files)
|
|
||||||
├── environment.yml # conda env spec (env name: job-seeker)
|
|
||||||
└── .gitignore
|
|
||||||
```
|
|
||||||
|
|
||||||
**Document storage rule:** Resumes, cover letters, and any interactable documents live
|
|
||||||
in `/Library/Documents/JobSearch/` or Notion — never committed to this repo.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Architecture
|
|
||||||
|
|
||||||
### Data Flow
|
|
||||||
|
|
||||||
```
|
|
||||||
JobSpy (LinkedIn / Indeed / Glassdoor / ZipRecruiter)
|
|
||||||
└─▶ discover.py
|
|
||||||
├─ deduplicate by URL against existing Notion records
|
|
||||||
└─▶ Notion DB (Status: "New")
|
|
||||||
|
|
||||||
Notion DB (daily review — decide what to pursue)
|
|
||||||
└─▶ match.py <notion-page-url>
|
|
||||||
├─ fetch job description from listing URL
|
|
||||||
├─ run Resume Matcher vs. /Library/Documents/JobSearch/Alex_Rivera_Resume_02-19-2025.pdf
|
|
||||||
└─▶ write Match Score + Keyword Gaps back to Notion page
|
|
||||||
|
|
||||||
AIHawk (when ready to apply)
|
|
||||||
├─ reads config pointing to same resume + personal_info.yaml
|
|
||||||
├─ llm_router.py → best available LLM backend
|
|
||||||
├─ submits LinkedIn Easy Apply
|
|
||||||
└─▶ Notion status → "Applied"
|
|
||||||
```
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Notion Database Schema
|
|
||||||
|
|
||||||
| Field | Type | Notes |
|
|
||||||
|---------------|----------|------------------------------------------------------------|
|
|
||||||
| Job Title | Title | Primary identifier |
|
|
||||||
| Company | Text | |
|
|
||||||
| Location | Text | |
|
|
||||||
| Remote | Checkbox | |
|
|
||||||
| URL | URL | Deduplication key |
|
|
||||||
| Source | Select | LinkedIn / Indeed / Glassdoor / ZipRecruiter |
|
|
||||||
| Status | Select | New → Reviewing → Applied → Interview → Offer → Rejected |
|
|
||||||
| Match Score | Number | 0–100, written by match.py |
|
|
||||||
| Keyword Gaps | Text | Comma-separated missing keywords from Resume Matcher |
|
|
||||||
| Salary | Text | If listed |
|
|
||||||
| Date Found | Date | Set at discovery time |
|
|
||||||
| Notes | Text | Manual field |
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## LLM Router (`scripts/llm_router.py`)
|
|
||||||
|
|
||||||
Single `complete(prompt, system=None)` interface. On each call: health-check each
|
|
||||||
backend in configured order, use the first that responds. Falls back silently on
|
|
||||||
connection error, timeout, or 5xx. Logs which backend was used.
|
|
||||||
|
|
||||||
All backends except Anthropic use the `openai` Python package (OpenAI-compatible
|
|
||||||
endpoints). Anthropic uses the `anthropic` package.
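
A minimal sketch of that fallback loop, assuming the `config/llm.yaml` layout shown below and the `openai` client. This is not the real `llm_router.py`; the Anthropic branch is omitted for brevity.

```python
# Sketch of the priority-fallback idea behind complete(); not the actual implementation
import logging

import yaml
from openai import OpenAI


def complete(prompt: str, system: str | None = None) -> str:
    cfg = yaml.safe_load(open("config/llm.yaml"))
    for name in cfg["fallback_order"]:
        backend = cfg["backends"][name]
        if backend["type"] != "openai_compat":
            continue  # anthropic branch (anthropic package) omitted for brevity
        try:
            client = OpenAI(base_url=backend["base_url"],
                            api_key=backend.get("api_key") or "none")
            messages = [{"role": "system", "content": system}] if system else []
            messages.append({"role": "user", "content": prompt})
            resp = client.chat.completions.create(model=backend["model"],
                                                  messages=messages)
            logging.info("LLM backend used: %s", name)
            return resp.choices[0].message.content
        except Exception:
            continue  # connection error, timeout, or 5xx: fall back to the next backend
    raise RuntimeError("all LLM backends failed")
```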
|
|
||||||
|
|
||||||
### `config/llm.yaml`
|
|
||||||
|
|
||||||
```yaml
|
|
||||||
fallback_order:
|
|
||||||
- claude_code # port 3009 — Claude via local pipeline (highest quality)
|
|
||||||
- ollama # port 11434 — local, always-on
|
|
||||||
- vllm # port 8000 — start when needed
|
|
||||||
- github_copilot # port 3010 — Copilot via gh token
|
|
||||||
- anthropic # cloud fallback, burns API credits
|
|
||||||
|
|
||||||
backends:
|
|
||||||
claude_code:
|
|
||||||
type: openai_compat
|
|
||||||
base_url: http://localhost:3009/v1
|
|
||||||
model: claude-code-terminal
|
|
||||||
api_key: "any"
|
|
||||||
|
|
||||||
ollama:
|
|
||||||
type: openai_compat
|
|
||||||
base_url: http://localhost:11434/v1
|
|
||||||
model: llama3.2
|
|
||||||
api_key: "ollama"
|
|
||||||
|
|
||||||
vllm:
|
|
||||||
type: openai_compat
|
|
||||||
base_url: http://localhost:8000/v1
|
|
||||||
model: __auto__
|
|
||||||
api_key: ""
|
|
||||||
|
|
||||||
github_copilot:
|
|
||||||
type: openai_compat
|
|
||||||
base_url: http://localhost:3010/v1
|
|
||||||
model: gpt-4o
|
|
||||||
api_key: "any"
|
|
||||||
|
|
||||||
anthropic:
|
|
||||||
type: anthropic
|
|
||||||
model: claude-sonnet-4-6
|
|
||||||
api_key_env: ANTHROPIC_API_KEY
|
|
||||||
```
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Job Search Profile
|
|
||||||
|
|
||||||
### `config/search_profiles.yaml` (initial)
|
|
||||||
|
|
||||||
```yaml
|
|
||||||
profiles:
|
|
||||||
- name: cs_leadership
|
|
||||||
titles:
|
|
||||||
- "Customer Success Manager"
|
|
||||||
- "Director of Customer Success"
|
|
||||||
- "VP Customer Success"
|
|
||||||
- "Head of Customer Success"
|
|
||||||
- "Technical Account Manager"
|
|
||||||
- "Revenue Operations Manager"
|
|
||||||
- "Customer Experience Lead"
|
|
||||||
locations:
|
|
||||||
- "Remote"
|
|
||||||
- "San Francisco Bay Area, CA"
|
|
||||||
boards:
|
|
||||||
- linkedin
|
|
||||||
- indeed
|
|
||||||
- glassdoor
|
|
||||||
- zip_recruiter
|
|
||||||
results_per_board: 25
|
|
||||||
remote_only: false # remote preferred but Bay Area in-person ok
|
|
||||||
hours_old: 72 # listings posted in last 3 days
|
|
||||||
```
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Conda Environment
|
|
||||||
|
|
||||||
New dedicated env `job-seeker` (not base). Core packages:
|
|
||||||
|
|
||||||
- `python-jobspy` — job scraping
|
|
||||||
- `notion-client` — Notion API
|
|
||||||
- `openai` — OpenAI-compatible calls (Ollama, vLLM, Copilot, Claude pipeline)
|
|
||||||
- `anthropic` — Anthropic API fallback
|
|
||||||
- `pyyaml` — config parsing
|
|
||||||
- `pandas` — CSV handling and dedup
|
|
||||||
- Resume Matcher dependencies (sentence-transformers, streamlit — installed from clone)
|
|
||||||
|
|
||||||
Resume Matcher Streamlit UI runs on port **8501** (confirmed clear).
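
A minimal `environment.yml` matching the package list above might look like this. Versions and the Python pin are assumptions; treat it as a sketch, not the committed file.

```yaml
name: job-seeker
channels:
  - conda-forge
dependencies:
  - python=3.11
  - pandas
  - pyyaml
  - pip
  - pip:
      - python-jobspy
      - notion-client
      - openai
      - anthropic
```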
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Port Map
|
|
||||||
|
|
||||||
| Port | Service | Status |
|
|
||||||
|-------|--------------------------------|----------------|
|
|
||||||
| 3009 | Claude Code OpenAI wrapper | Start via manage.sh in Post Flight Processing |
|
|
||||||
| 3010 | GitHub Copilot wrapper | Start via manage-copilot.sh |
|
|
||||||
| 11434 | Ollama | Running |
|
|
||||||
| 8000 | vLLM | Start when needed |
|
|
||||||
| 8501 | Resume Matcher (Streamlit) | Start when needed |
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Out of Scope (this phase)
|
|
||||||
|
|
||||||
- Scheduled/cron automation (run discover.py manually for now)
|
|
||||||
- Email/SMS alerts for new listings
|
|
||||||
- ATS resume rebuild (separate task)
|
|
||||||
- Applications to non-LinkedIn platforms via AIHawk
|
|
||||||
File diff suppressed because it is too large
|
|
@ -1,148 +0,0 @@
|
||||||
# Job Seeker Platform — Web UI Design
|
|
||||||
|
|
||||||
**Date:** 2026-02-20
|
|
||||||
**Status:** Approved
|
|
||||||
|
|
||||||
## Overview
|
|
||||||
|
|
||||||
A Streamlit multi-page web UI that gives Alex (and her partner) a friendly interface to review scraped job listings, curate them before they hit Notion, edit search/LLM/Notion settings, and fill out her AIHawk application profile. Designed to be usable by anyone — no technical knowledge required.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Architecture & Data Flow
|
|
||||||
|
|
||||||
```
|
|
||||||
discover.py → SQLite staging.db (status: pending)
|
|
||||||
↓
|
|
||||||
Streamlit UI
|
|
||||||
review / approve / reject
|
|
||||||
↓
|
|
||||||
"Sync N approved jobs" button
|
|
||||||
↓
|
|
||||||
Notion DB (status: synced)
|
|
||||||
```
|
|
||||||
|
|
||||||
`discover.py` is modified to write to SQLite instead of directly to Notion.
|
|
||||||
A new `sync.py` handles the approved → Notion push.
|
|
||||||
`db.py` provides shared SQLite helpers used by both scripts and UI pages.
|
|
||||||
|
|
||||||
### SQLite Schema (`staging.db`, gitignored)
|
|
||||||
|
|
||||||
```sql
|
|
||||||
CREATE TABLE jobs (
|
|
||||||
id INTEGER PRIMARY KEY AUTOINCREMENT,
|
|
||||||
title TEXT,
|
|
||||||
company TEXT,
|
|
||||||
url TEXT UNIQUE,
|
|
||||||
source TEXT,
|
|
||||||
location TEXT,
|
|
||||||
is_remote INTEGER,
|
|
||||||
salary TEXT,
|
|
||||||
description TEXT,
|
|
||||||
match_score REAL,
|
|
||||||
keyword_gaps TEXT,
|
|
||||||
date_found TEXT,
|
|
||||||
status TEXT DEFAULT 'pending', -- pending / approved / rejected / synced
|
|
||||||
notion_page_id TEXT
|
|
||||||
);
|
|
||||||
```
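
As a rough illustration of how the shared helpers in `db.py` might be used by the UI and `sync.py` against this schema. The function names here are assumptions, not the actual helper API.

```python
# Hypothetical usage of db.py-style helpers against staging.db
import sqlite3

DB = "staging.db"


def get_jobs(status: str = "pending") -> list[dict]:
    conn = sqlite3.connect(DB)
    conn.row_factory = sqlite3.Row
    rows = conn.execute("SELECT * FROM jobs WHERE status=?", (status,)).fetchall()
    conn.close()
    return [dict(r) for r in rows]


def set_status(job_id: int, status: str) -> None:
    conn = sqlite3.connect(DB)
    conn.execute("UPDATE jobs SET status=? WHERE id=?", (status, job_id))
    conn.commit()
    conn.close()

# UI approve button:          set_status(job_id, "approved")
# sync.py after Notion push:  set_status(job_id, "synced")
```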
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Pages
|
|
||||||
|
|
||||||
### Home (Dashboard)
|
|
||||||
- Stat cards: Pending / Approved / Rejected / Synced counts
|
|
||||||
- "Run Discovery" button — runs `discover.py` as subprocess, streams output
|
|
||||||
- "Sync N approved jobs → Notion" button — visible only when approved count > 0
|
|
||||||
- Recent activity list (last 10 jobs found)
|
|
||||||
|
|
||||||
### Job Review
|
|
||||||
- Filterable table/card view of pending jobs
|
|
||||||
- Filters: source (LinkedIn/Indeed/etc), remote only toggle, minimum match score slider
|
|
||||||
- Checkboxes for batch selection
|
|
||||||
- "Approve Selected" / "Reject Selected" buttons
|
|
||||||
- Rejected jobs hidden by default, togglable
|
|
||||||
- Match score shown as colored badge (green ≥70, amber 40–69, red <40)
|
|
||||||
|
|
||||||
### Settings
|
|
||||||
Three tabs:
|
|
||||||
|
|
||||||
**Search** — edit `config/search_profiles.yaml`:
|
|
||||||
- Job titles (add/remove tags)
|
|
||||||
- Locations (add/remove)
|
|
||||||
- Boards checkboxes
|
|
||||||
- Hours old slider
|
|
||||||
- Results per board slider
|
|
||||||
|
|
||||||
**LLM Backends** — edit `config/llm.yaml`:
|
|
||||||
- Fallback order (drag or up/down arrows)
|
|
||||||
- Per-backend: URL, model name, enabled toggle
|
|
||||||
- "Test connection" button per backend
|
|
||||||
|
|
||||||
**Notion** — edit `config/notion.yaml`:
|
|
||||||
- Token field (masked, show/hide toggle)
|
|
||||||
- Database ID
|
|
||||||
- "Test connection" button
|
|
||||||
|
|
||||||
### Resume Editor
|
|
||||||
Sectioned form over `aihawk/data_folder/plain_text_resume.yaml`:
|
|
||||||
- **Personal Info** — name, email, phone, LinkedIn, city, zip
|
|
||||||
- **Education** — list of entries, add/remove buttons
|
|
||||||
- **Experience** — list of entries, add/remove buttons
|
|
||||||
- **Skills & Interests** — tag-style inputs
|
|
||||||
- **Preferences** — salary range, notice period, remote/relocation toggles
|
|
||||||
- **Self-Identification** — gender, pronouns, veteran, disability, ethnicity (with "prefer not to say" options)
|
|
||||||
- **Legal** — work authorization checkboxes
|
|
||||||
|
|
||||||
`FILL_IN` fields highlighted in amber with "Needs your attention" note.
|
|
||||||
Save button writes back to YAML. No raw YAML shown by default.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Theme & Styling
|
|
||||||
|
|
||||||
Central theme at `app/.streamlit/config.toml`:
|
|
||||||
- Dark base, accent color teal/green (job search = growth)
|
|
||||||
- Consistent font (Inter or system sans-serif)
|
|
||||||
- Responsive column layouts — usable on tablet/mobile
|
|
||||||
- No jargon — "Run Discovery" not "Execute scrape", "Sync to Notion" not "Push records"
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## File Layout
|
|
||||||
|
|
||||||
```
|
|
||||||
app/
|
|
||||||
├── .streamlit/
|
|
||||||
│ └── config.toml # central theme
|
|
||||||
├── Home.py # dashboard
|
|
||||||
└── pages/
|
|
||||||
├── 1_Job_Review.py
|
|
||||||
├── 2_Settings.py
|
|
||||||
└── 3_Resume_Editor.py
|
|
||||||
scripts/
|
|
||||||
├── db.py # new: SQLite helpers
|
|
||||||
├── sync.py # new: approved → Notion push
|
|
||||||
├── discover.py # modified: write to SQLite not Notion
|
|
||||||
├── match.py # unchanged
|
|
||||||
└── llm_router.py # unchanged
|
|
||||||
```
|
|
||||||
|
|
||||||
Run: `conda run -n job-seeker streamlit run app/Home.py`
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## New Dependencies
|
|
||||||
|
|
||||||
None — `streamlit` already installed via resume_matcher deps.
|
|
||||||
`sqlite3` is Python stdlib.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Out of Scope
|
|
||||||
|
|
||||||
- Real-time collaboration
|
|
||||||
- Mobile native app
|
|
||||||
- Cover letter editor (handled separately via LoRA fine-tune task)
|
|
||||||
- AIHawk trigger from UI (run manually for now)
|
|
||||||
File diff suppressed because it is too large
|
|
@ -1,100 +0,0 @@
|
||||||
# Background Task Processing — Design
|
|
||||||
|
|
||||||
**Date:** 2026-02-21
|
|
||||||
**Status:** Approved
|
|
||||||
|
|
||||||
## Problem
|
|
||||||
|
|
||||||
Cover letter generation (`4_Apply.py`) and company research (`6_Interview_Prep.py`) call LLM scripts synchronously inside `st.spinner()`. If the user navigates away during generation, Streamlit abandons the in-progress call and the result is lost. Both results are already persisted to SQLite on completion, so if the task kept running in the background the result would be available on return.
|
|
||||||
|
|
||||||
## Solution Overview
|
|
||||||
|
|
||||||
Python threading + SQLite task table. When a user clicks Generate, a daemon thread is spawned immediately and the task is recorded in a new `background_tasks` table. The thread writes results to the existing tables (`jobs.cover_letter`, `company_research`) and marks itself complete/failed. All pages share a sidebar indicator that auto-refreshes while tasks are active. Individual pages show task-level status inline.
|
|
||||||
|
|
||||||
## SQLite Schema
|
|
||||||
|
|
||||||
New table `background_tasks` added in `scripts/db.py`:
|
|
||||||
|
|
||||||
```sql
|
|
||||||
CREATE TABLE IF NOT EXISTS background_tasks (
|
|
||||||
id INTEGER PRIMARY KEY AUTOINCREMENT,
|
|
||||||
task_type TEXT NOT NULL, -- "cover_letter" | "company_research"
|
|
||||||
job_id INTEGER NOT NULL,
|
|
||||||
status TEXT NOT NULL DEFAULT 'queued', -- queued | running | completed | failed
|
|
||||||
error TEXT,
|
|
||||||
created_at DATETIME DEFAULT (datetime('now')),
|
|
||||||
started_at DATETIME,
|
|
||||||
finished_at DATETIME
|
|
||||||
)
|
|
||||||
```
|
|
||||||
|
|
||||||
## Deduplication Rule
|
|
||||||
|
|
||||||
Before inserting a new task, check for an existing `queued` or `running` row with the same `(task_type, job_id)`. If one exists, reject the submission (return the existing task's id). Different task types for the same job (e.g. cover letter + research) are allowed to run concurrently. Different jobs of the same type are allowed concurrently.
|
|
||||||
|
|
||||||
## Components
|
|
||||||
|
|
||||||
### `scripts/task_runner.py` (new)
|
|
||||||
|
|
||||||
- `submit_task(db, task_type, job_id) -> int` — dedup check, insert row, spawn daemon thread, return task id
|
|
||||||
- `_run_task(db, task_id, task_type, job_id)` — thread body: mark running, call generator, save result, mark completed/failed
|
|
||||||
- `get_active_tasks(db) -> list[dict]` — all queued/running rows with job title+company joined
|
|
||||||
- `get_task_for_job(db, task_type, job_id) -> dict | None` — latest task row for a specific job+type
|
|
||||||
|
|
||||||
### `scripts/db.py` (modified)
|
|
||||||
|
|
||||||
- Add `init_background_tasks(conn)` called inside `init_db()`
|
|
||||||
- Add `insert_task`, `update_task_status`, `get_active_tasks`, `get_task_for_job` helpers
|
|
||||||
|
|
||||||
### `app/app.py` (modified)
|
|
||||||
|
|
||||||
- After `st.navigation()`, call `get_active_tasks()` and render sidebar indicator
|
|
||||||
- Use `st.fragment` with `time.sleep(3)` + `st.rerun(scope="fragment")` to poll while tasks are active
|
|
||||||
- Sidebar shows: `⏳ N task(s) running` count + per-task line (type + company name)
|
|
||||||
- Fragment polling stops when active task count reaches zero
|
|
||||||
|
|
||||||
### `app/pages/4_Apply.py` (modified)
|
|
||||||
|
|
||||||
- Generate button calls `submit_task(db, "cover_letter", job_id)` instead of running inline
|
|
||||||
- If a task is `queued`/`running` for the selected job, disable button and show inline status fragment (polls every 3s)
|
|
||||||
- On `completed`, load cover letter from `jobs` row (already saved by thread)
|
|
||||||
- On `failed`, show error message and re-enable button
|
|
||||||
|
|
||||||
### `app/pages/6_Interview_Prep.py` (modified)
|
|
||||||
|
|
||||||
- Generate/Refresh buttons call `submit_task(db, "company_research", job_id)` instead of running inline
|
|
||||||
- Same inline status fragment pattern as Apply page
|
|
||||||
|
|
||||||
## Data Flow
|
|
||||||
|
|
||||||
```
|
|
||||||
User clicks Generate
|
|
||||||
→ submit_task(db, type, job_id)
|
|
||||||
→ dedup check (reject if already queued/running for same type+job)
|
|
||||||
→ INSERT background_tasks row (status=queued)
|
|
||||||
→ spawn daemon thread
|
|
||||||
→ return task_id
|
|
||||||
→ page shows inline "⏳ Queued…" fragment
|
|
||||||
|
|
||||||
Thread runs
|
|
||||||
→ UPDATE status=running, started_at=now
|
|
||||||
→ call generate_cover_letter.generate() OR research_company()
|
|
||||||
→ write result to jobs.cover_letter OR company_research table
|
|
||||||
→ UPDATE status=completed, finished_at=now
|
|
||||||
(on exception: UPDATE status=failed, error=str(e))
|
|
||||||
|
|
||||||
Sidebar fragment (every 3s while active tasks > 0)
|
|
||||||
→ get_active_tasks() → render count + list
|
|
||||||
→ st.rerun(scope="fragment")
|
|
||||||
|
|
||||||
Page fragment (every 3s while task for this job is running)
|
|
||||||
→ get_task_for_job() → render status
|
|
||||||
→ on completed: st.rerun() (full rerun to reload cover letter / research)
|
|
||||||
```
|
|
||||||
|
|
||||||
## What Is Not Changed
|
|
||||||
|
|
||||||
- `generate_cover_letter.generate()` and `research_company()` are called unchanged from the thread
|
|
||||||
- `update_cover_letter()` and `save_research()` DB helpers are reused unchanged
|
|
||||||
- No new Python packages required
|
|
||||||
- No separate worker process — daemon threads die with the Streamlit server, but results already written to SQLite survive
|
|
||||||
|
|
@ -1,933 +0,0 @@
|
||||||
# Background Task Processing Implementation Plan
|
|
||||||
|
|
||||||
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
|
|
||||||
|
|
||||||
**Goal:** Replace synchronous LLM calls in Apply and Interview Prep pages with background threads so cover letter and research generation survive page navigation.
|
|
||||||
|
|
||||||
**Architecture:** A new `background_tasks` SQLite table tracks task state. `scripts/task_runner.py` spawns daemon threads that call existing generator functions and write results via existing DB helpers. The Streamlit sidebar polls active tasks every 3s via `@st.fragment(run_every=3)`; individual pages show per-job status with the same pattern.
|
|
||||||
|
|
||||||
**Tech Stack:** Python `threading` (stdlib), SQLite, Streamlit `st.fragment` (≥1.33 — already installed)
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Task 1: Add background_tasks table and DB helpers
|
|
||||||
|
|
||||||
**Files:**
|
|
||||||
- Modify: `scripts/db.py`
|
|
||||||
- Test: `tests/test_db.py`
|
|
||||||
|
|
||||||
### Step 1: Write the failing tests
|
|
||||||
|
|
||||||
Add to `tests/test_db.py`:
|
|
||||||
|
|
||||||
```python
|
|
||||||
# ── background_tasks tests ────────────────────────────────────────────────────
|
|
||||||
|
|
||||||
def test_init_db_creates_background_tasks_table(tmp_path):
|
|
||||||
"""init_db creates a background_tasks table."""
|
|
||||||
from scripts.db import init_db
|
|
||||||
db_path = tmp_path / "test.db"
|
|
||||||
init_db(db_path)
|
|
||||||
import sqlite3
|
|
||||||
conn = sqlite3.connect(db_path)
|
|
||||||
cur = conn.execute(
|
|
||||||
"SELECT name FROM sqlite_master WHERE type='table' AND name='background_tasks'"
|
|
||||||
)
|
|
||||||
assert cur.fetchone() is not None
|
|
||||||
conn.close()
|
|
||||||
|
|
||||||
|
|
||||||
def test_insert_task_returns_id_and_true(tmp_path):
|
|
||||||
"""insert_task returns (task_id, True) for a new task."""
|
|
||||||
from scripts.db import init_db, insert_job, insert_task
|
|
||||||
db_path = tmp_path / "test.db"
|
|
||||||
init_db(db_path)
|
|
||||||
job_id = insert_job(db_path, {
|
|
||||||
"title": "CSM", "company": "Acme", "url": "https://ex.com/1",
|
|
||||||
"source": "linkedin", "location": "Remote", "is_remote": True,
|
|
||||||
"salary": "", "description": "", "date_found": "2026-02-20",
|
|
||||||
})
|
|
||||||
task_id, is_new = insert_task(db_path, "cover_letter", job_id)
|
|
||||||
assert isinstance(task_id, int) and task_id > 0
|
|
||||||
assert is_new is True
|
|
||||||
|
|
||||||
|
|
||||||
def test_insert_task_deduplicates_active_task(tmp_path):
|
|
||||||
"""insert_task returns (existing_id, False) if a queued/running task already exists."""
|
|
||||||
from scripts.db import init_db, insert_job, insert_task
|
|
||||||
db_path = tmp_path / "test.db"
|
|
||||||
init_db(db_path)
|
|
||||||
job_id = insert_job(db_path, {
|
|
||||||
"title": "CSM", "company": "Acme", "url": "https://ex.com/1",
|
|
||||||
"source": "linkedin", "location": "Remote", "is_remote": True,
|
|
||||||
"salary": "", "description": "", "date_found": "2026-02-20",
|
|
||||||
})
|
|
||||||
first_id, _ = insert_task(db_path, "cover_letter", job_id)
|
|
||||||
second_id, is_new = insert_task(db_path, "cover_letter", job_id)
|
|
||||||
assert second_id == first_id
|
|
||||||
assert is_new is False
|
|
||||||
|
|
||||||
|
|
||||||
def test_insert_task_allows_different_types_same_job(tmp_path):
|
|
||||||
"""insert_task allows cover_letter and company_research for the same job concurrently."""
|
|
||||||
from scripts.db import init_db, insert_job, insert_task
|
|
||||||
db_path = tmp_path / "test.db"
|
|
||||||
init_db(db_path)
|
|
||||||
job_id = insert_job(db_path, {
|
|
||||||
"title": "CSM", "company": "Acme", "url": "https://ex.com/1",
|
|
||||||
"source": "linkedin", "location": "Remote", "is_remote": True,
|
|
||||||
"salary": "", "description": "", "date_found": "2026-02-20",
|
|
||||||
})
|
|
||||||
_, cl_new = insert_task(db_path, "cover_letter", job_id)
|
|
||||||
_, res_new = insert_task(db_path, "company_research", job_id)
|
|
||||||
assert cl_new is True
|
|
||||||
assert res_new is True
|
|
||||||
|
|
||||||
|
|
||||||
def test_update_task_status_running(tmp_path):
|
|
||||||
"""update_task_status('running') sets started_at."""
|
|
||||||
from scripts.db import init_db, insert_job, insert_task, update_task_status
|
|
||||||
import sqlite3
|
|
||||||
db_path = tmp_path / "test.db"
|
|
||||||
init_db(db_path)
|
|
||||||
job_id = insert_job(db_path, {
|
|
||||||
"title": "CSM", "company": "Acme", "url": "https://ex.com/1",
|
|
||||||
"source": "linkedin", "location": "Remote", "is_remote": True,
|
|
||||||
"salary": "", "description": "", "date_found": "2026-02-20",
|
|
||||||
})
|
|
||||||
task_id, _ = insert_task(db_path, "cover_letter", job_id)
|
|
||||||
update_task_status(db_path, task_id, "running")
|
|
||||||
conn = sqlite3.connect(db_path)
|
|
||||||
row = conn.execute("SELECT status, started_at FROM background_tasks WHERE id=?", (task_id,)).fetchone()
|
|
||||||
conn.close()
|
|
||||||
assert row[0] == "running"
|
|
||||||
assert row[1] is not None
|
|
||||||
|
|
||||||
|
|
||||||
def test_update_task_status_completed(tmp_path):
|
|
||||||
"""update_task_status('completed') sets finished_at."""
|
|
||||||
from scripts.db import init_db, insert_job, insert_task, update_task_status
|
|
||||||
import sqlite3
|
|
||||||
db_path = tmp_path / "test.db"
|
|
||||||
init_db(db_path)
|
|
||||||
job_id = insert_job(db_path, {
|
|
||||||
"title": "CSM", "company": "Acme", "url": "https://ex.com/1",
|
|
||||||
"source": "linkedin", "location": "Remote", "is_remote": True,
|
|
||||||
"salary": "", "description": "", "date_found": "2026-02-20",
|
|
||||||
})
|
|
||||||
task_id, _ = insert_task(db_path, "cover_letter", job_id)
|
|
||||||
update_task_status(db_path, task_id, "completed")
|
|
||||||
conn = sqlite3.connect(db_path)
|
|
||||||
row = conn.execute("SELECT status, finished_at FROM background_tasks WHERE id=?", (task_id,)).fetchone()
|
|
||||||
conn.close()
|
|
||||||
assert row[0] == "completed"
|
|
||||||
assert row[1] is not None
|
|
||||||
|
|
||||||
|
|
||||||
def test_update_task_status_failed_stores_error(tmp_path):
|
|
||||||
"""update_task_status('failed') stores error message and sets finished_at."""
|
|
||||||
from scripts.db import init_db, insert_job, insert_task, update_task_status
|
|
||||||
import sqlite3
|
|
||||||
db_path = tmp_path / "test.db"
|
|
||||||
init_db(db_path)
|
|
||||||
job_id = insert_job(db_path, {
|
|
||||||
"title": "CSM", "company": "Acme", "url": "https://ex.com/1",
|
|
||||||
"source": "linkedin", "location": "Remote", "is_remote": True,
|
|
||||||
"salary": "", "description": "", "date_found": "2026-02-20",
|
|
||||||
})
|
|
||||||
task_id, _ = insert_task(db_path, "cover_letter", job_id)
|
|
||||||
update_task_status(db_path, task_id, "failed", error="LLM timeout")
|
|
||||||
conn = sqlite3.connect(db_path)
|
|
||||||
row = conn.execute("SELECT status, error, finished_at FROM background_tasks WHERE id=?", (task_id,)).fetchone()
|
|
||||||
conn.close()
|
|
||||||
assert row[0] == "failed"
|
|
||||||
assert row[1] == "LLM timeout"
|
|
||||||
assert row[2] is not None
|
|
||||||
|
|
||||||
|
|
||||||
def test_get_active_tasks_returns_only_active(tmp_path):
|
|
||||||
"""get_active_tasks returns only queued/running tasks with job info joined."""
|
|
||||||
from scripts.db import init_db, insert_job, insert_task, update_task_status, get_active_tasks
|
|
||||||
db_path = tmp_path / "test.db"
|
|
||||||
init_db(db_path)
|
|
||||||
job_id = insert_job(db_path, {
|
|
||||||
"title": "CSM", "company": "Acme", "url": "https://ex.com/1",
|
|
||||||
"source": "linkedin", "location": "Remote", "is_remote": True,
|
|
||||||
"salary": "", "description": "", "date_found": "2026-02-20",
|
|
||||||
})
|
|
||||||
active_id, _ = insert_task(db_path, "cover_letter", job_id)
|
|
||||||
done_id, _ = insert_task(db_path, "company_research", job_id)
|
|
||||||
update_task_status(db_path, done_id, "completed")
|
|
||||||
|
|
||||||
tasks = get_active_tasks(db_path)
|
|
||||||
assert len(tasks) == 1
|
|
||||||
assert tasks[0]["id"] == active_id
|
|
||||||
assert tasks[0]["company"] == "Acme"
|
|
||||||
assert tasks[0]["title"] == "CSM"
|
|
||||||
|
|
||||||
|
|
||||||
def test_get_task_for_job_returns_latest(tmp_path):
|
|
||||||
"""get_task_for_job returns the most recent task for the given type+job."""
|
|
||||||
from scripts.db import init_db, insert_job, insert_task, update_task_status, get_task_for_job
|
|
||||||
db_path = tmp_path / "test.db"
|
|
||||||
init_db(db_path)
|
|
||||||
job_id = insert_job(db_path, {
|
|
||||||
"title": "CSM", "company": "Acme", "url": "https://ex.com/1",
|
|
||||||
"source": "linkedin", "location": "Remote", "is_remote": True,
|
|
||||||
"salary": "", "description": "", "date_found": "2026-02-20",
|
|
||||||
})
|
|
||||||
first_id, _ = insert_task(db_path, "cover_letter", job_id)
|
|
||||||
update_task_status(db_path, first_id, "completed")
|
|
||||||
second_id, _ = insert_task(db_path, "cover_letter", job_id) # allowed since first is done
|
|
||||||
|
|
||||||
task = get_task_for_job(db_path, "cover_letter", job_id)
|
|
||||||
assert task is not None
|
|
||||||
assert task["id"] == second_id
|
|
||||||
|
|
||||||
|
|
||||||
def test_get_task_for_job_returns_none_when_absent(tmp_path):
|
|
||||||
"""get_task_for_job returns None when no task exists for that job+type."""
|
|
||||||
from scripts.db import init_db, insert_job, get_task_for_job
|
|
||||||
db_path = tmp_path / "test.db"
|
|
||||||
init_db(db_path)
|
|
||||||
job_id = insert_job(db_path, {
|
|
||||||
"title": "CSM", "company": "Acme", "url": "https://ex.com/1",
|
|
||||||
"source": "linkedin", "location": "Remote", "is_remote": True,
|
|
||||||
"salary": "", "description": "", "date_found": "2026-02-20",
|
|
||||||
})
|
|
||||||
assert get_task_for_job(db_path, "cover_letter", job_id) is None
|
|
||||||
```
|
|
||||||
|
|
||||||
### Step 2: Run tests to verify they fail
|
|
||||||
|
|
||||||
```bash
|
|
||||||
/devl/miniconda3/envs/job-seeker/bin/pytest tests/test_db.py -v -k "background_tasks or insert_task or update_task_status or get_active_tasks or get_task_for_job"
|
|
||||||
```
|
|
||||||
|
|
||||||
Expected: FAIL with `ImportError: cannot import name 'insert_task'`
|
|
||||||
|
|
||||||
### Step 3: Implement in scripts/db.py
|
|
||||||
|
|
||||||
Add the DDL constant after `CREATE_COMPANY_RESEARCH`:
|
|
||||||
|
|
||||||
```python
|
|
||||||
CREATE_BACKGROUND_TASKS = """
|
|
||||||
CREATE TABLE IF NOT EXISTS background_tasks (
|
|
||||||
id INTEGER PRIMARY KEY AUTOINCREMENT,
|
|
||||||
task_type TEXT NOT NULL,
|
|
||||||
job_id INTEGER NOT NULL,
|
|
||||||
status TEXT NOT NULL DEFAULT 'queued',
|
|
||||||
error TEXT,
|
|
||||||
created_at DATETIME DEFAULT (datetime('now')),
|
|
||||||
started_at DATETIME,
|
|
||||||
finished_at DATETIME
|
|
||||||
)
|
|
||||||
"""
|
|
||||||
```
|
|
||||||
|
|
||||||
Add `conn.execute(CREATE_BACKGROUND_TASKS)` inside `init_db()`, after the existing three `conn.execute()` calls:
|
|
||||||
|
|
||||||
```python
|
|
||||||
def init_db(db_path: Path = DEFAULT_DB) -> None:
|
|
||||||
"""Create tables if they don't exist, then run migrations."""
|
|
||||||
conn = sqlite3.connect(db_path)
|
|
||||||
conn.execute(CREATE_JOBS)
|
|
||||||
conn.execute(CREATE_JOB_CONTACTS)
|
|
||||||
conn.execute(CREATE_COMPANY_RESEARCH)
|
|
||||||
conn.execute(CREATE_BACKGROUND_TASKS) # ← add this line
|
|
||||||
conn.commit()
|
|
||||||
conn.close()
|
|
||||||
_migrate_db(db_path)
|
|
||||||
```
|
|
||||||
|
|
||||||
Add the four helper functions at the end of `scripts/db.py`:
|
|
||||||
|
|
||||||
```python
|
|
||||||
# ── Background task helpers ───────────────────────────────────────────────────
|
|
||||||
|
|
||||||
def insert_task(db_path: Path = DEFAULT_DB, task_type: str = "",
|
|
||||||
job_id: int = None) -> tuple[int, bool]:
|
|
||||||
"""Insert a new background task.
|
|
||||||
|
|
||||||
Returns (task_id, True) if inserted, or (existing_id, False) if a
|
|
||||||
queued/running task for the same (task_type, job_id) already exists.
|
|
||||||
"""
|
|
||||||
conn = sqlite3.connect(db_path)
|
|
||||||
existing = conn.execute(
|
|
||||||
"SELECT id FROM background_tasks WHERE task_type=? AND job_id=? AND status IN ('queued','running')",
|
|
||||||
(task_type, job_id),
|
|
||||||
).fetchone()
|
|
||||||
if existing:
|
|
||||||
conn.close()
|
|
||||||
return existing[0], False
|
|
||||||
cur = conn.execute(
|
|
||||||
"INSERT INTO background_tasks (task_type, job_id, status) VALUES (?, ?, 'queued')",
|
|
||||||
(task_type, job_id),
|
|
||||||
)
|
|
||||||
task_id = cur.lastrowid
|
|
||||||
conn.commit()
|
|
||||||
conn.close()
|
|
||||||
return task_id, True
|
|
||||||
|
|
||||||
|
|
||||||
def update_task_status(db_path: Path = DEFAULT_DB, task_id: int = None,
|
|
||||||
status: str = "", error: Optional[str] = None) -> None:
|
|
||||||
"""Update a task's status and set the appropriate timestamp."""
|
|
||||||
now = datetime.now().isoformat()[:16]
|
|
||||||
conn = sqlite3.connect(db_path)
|
|
||||||
if status == "running":
|
|
||||||
conn.execute(
|
|
||||||
"UPDATE background_tasks SET status=?, started_at=? WHERE id=?",
|
|
||||||
(status, now, task_id),
|
|
||||||
)
|
|
||||||
elif status in ("completed", "failed"):
|
|
||||||
conn.execute(
|
|
||||||
"UPDATE background_tasks SET status=?, finished_at=?, error=? WHERE id=?",
|
|
||||||
(status, now, error, task_id),
|
|
||||||
)
|
|
||||||
else:
|
|
||||||
conn.execute("UPDATE background_tasks SET status=? WHERE id=?", (status, task_id))
|
|
||||||
conn.commit()
|
|
||||||
conn.close()
|
|
||||||
|
|
||||||
|
|
||||||
def get_active_tasks(db_path: Path = DEFAULT_DB) -> list[dict]:
|
|
||||||
"""Return all queued/running tasks with job title and company joined in."""
|
|
||||||
conn = sqlite3.connect(db_path)
|
|
||||||
conn.row_factory = sqlite3.Row
|
|
||||||
rows = conn.execute("""
|
|
||||||
SELECT bt.*, j.title, j.company
|
|
||||||
FROM background_tasks bt
|
|
||||||
LEFT JOIN jobs j ON j.id = bt.job_id
|
|
||||||
WHERE bt.status IN ('queued', 'running')
|
|
||||||
ORDER BY bt.created_at ASC
|
|
||||||
""").fetchall()
|
|
||||||
conn.close()
|
|
||||||
return [dict(r) for r in rows]
|
|
||||||
|
|
||||||
|
|
||||||
def get_task_for_job(db_path: Path = DEFAULT_DB, task_type: str = "",
|
|
||||||
job_id: int = None) -> Optional[dict]:
|
|
||||||
"""Return the most recent task row for a (task_type, job_id) pair, or None."""
|
|
||||||
conn = sqlite3.connect(db_path)
|
|
||||||
conn.row_factory = sqlite3.Row
|
|
||||||
row = conn.execute(
|
|
||||||
"""SELECT * FROM background_tasks
|
|
||||||
WHERE task_type=? AND job_id=?
|
|
||||||
ORDER BY id DESC LIMIT 1""",
|
|
||||||
(task_type, job_id),
|
|
||||||
).fetchone()
|
|
||||||
conn.close()
|
|
||||||
return dict(row) if row else None
|
|
||||||
```
|
|
||||||
|
|
||||||
### Step 4: Run tests to verify they pass
|
|
||||||
|
|
||||||
```bash
|
|
||||||
/devl/miniconda3/envs/job-seeker/bin/pytest tests/test_db.py -v -k "background_tasks or insert_task or update_task_status or get_active_tasks or get_task_for_job"
|
|
||||||
```
|
|
||||||
|
|
||||||
Expected: all new tests PASS, no regressions
|
|
||||||
|
|
||||||
### Step 5: Run full test suite
|
|
||||||
|
|
||||||
```bash
|
|
||||||
/devl/miniconda3/envs/job-seeker/bin/pytest tests/ -v
|
|
||||||
```
|
|
||||||
|
|
||||||
Expected: all tests PASS
|
|
||||||
|
|
||||||
### Step 6: Commit
|
|
||||||
|
|
||||||
```bash
|
|
||||||
git add scripts/db.py tests/test_db.py
|
|
||||||
git commit -m "feat: add background_tasks table and DB helpers"
|
|
||||||
```
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Task 2: Create scripts/task_runner.py
|
|
||||||
|
|
||||||
**Files:**
|
|
||||||
- Create: `scripts/task_runner.py`
|
|
||||||
- Test: `tests/test_task_runner.py`
|
|
||||||
|
|
||||||
### Step 1: Write the failing tests
|
|
||||||
|
|
||||||
Create `tests/test_task_runner.py`:
|
|
||||||
|
|
||||||
```python
|
|
||||||
import threading
|
|
||||||
import time
|
|
||||||
import pytest
|
|
||||||
from pathlib import Path
|
|
||||||
from unittest.mock import patch, MagicMock
|
|
||||||
import sqlite3
|
|
||||||
|
|
||||||
|
|
||||||
def _make_db(tmp_path):
|
|
||||||
from scripts.db import init_db, insert_job
|
|
||||||
db = tmp_path / "test.db"
|
|
||||||
init_db(db)
|
|
||||||
job_id = insert_job(db, {
|
|
||||||
"title": "CSM", "company": "Acme", "url": "https://ex.com/1",
|
|
||||||
"source": "linkedin", "location": "Remote", "is_remote": True,
|
|
||||||
"salary": "", "description": "Great role.", "date_found": "2026-02-20",
|
|
||||||
})
|
|
||||||
return db, job_id
|
|
||||||
|
|
||||||
|
|
||||||
def test_submit_task_returns_id_and_true(tmp_path):
|
|
||||||
"""submit_task returns (task_id, True) and spawns a thread."""
|
|
||||||
db, job_id = _make_db(tmp_path)
|
|
||||||
with patch("scripts.task_runner._run_task"): # don't actually call LLM
|
|
||||||
from scripts.task_runner import submit_task
|
|
||||||
task_id, is_new = submit_task(db, "cover_letter", job_id)
|
|
||||||
assert isinstance(task_id, int) and task_id > 0
|
|
||||||
assert is_new is True
|
|
||||||
|
|
||||||
|
|
||||||
def test_submit_task_deduplicates(tmp_path):
|
|
||||||
"""submit_task returns (existing_id, False) for a duplicate in-flight task."""
|
|
||||||
db, job_id = _make_db(tmp_path)
|
|
||||||
with patch("scripts.task_runner._run_task"):
|
|
||||||
from scripts.task_runner import submit_task
|
|
||||||
first_id, _ = submit_task(db, "cover_letter", job_id)
|
|
||||||
second_id, is_new = submit_task(db, "cover_letter", job_id)
|
|
||||||
assert second_id == first_id
|
|
||||||
assert is_new is False
|
|
||||||
|
|
||||||
|
|
||||||
def test_run_task_cover_letter_success(tmp_path):
|
|
||||||
"""_run_task marks running→completed and saves cover letter to DB."""
|
|
||||||
db, job_id = _make_db(tmp_path)
|
|
||||||
from scripts.db import insert_task, get_task_for_job, get_jobs_by_status
|
|
||||||
task_id, _ = insert_task(db, "cover_letter", job_id)
|
|
||||||
|
|
||||||
with patch("scripts.generate_cover_letter.generate", return_value="Dear Hiring Manager,\nGreat fit!"):
|
|
||||||
from scripts.task_runner import _run_task
|
|
||||||
_run_task(db, task_id, "cover_letter", job_id)
|
|
||||||
|
|
||||||
task = get_task_for_job(db, "cover_letter", job_id)
|
|
||||||
assert task["status"] == "completed"
|
|
||||||
assert task["error"] is None
|
|
||||||
|
|
||||||
conn = sqlite3.connect(db)
|
|
||||||
row = conn.execute("SELECT cover_letter FROM jobs WHERE id=?", (job_id,)).fetchone()
|
|
||||||
conn.close()
|
|
||||||
assert row[0] == "Dear Hiring Manager,\nGreat fit!"
|
|
||||||
|
|
||||||
|
|
||||||
def test_run_task_company_research_success(tmp_path):
|
|
||||||
"""_run_task marks running→completed and saves research to DB."""
|
|
||||||
db, job_id = _make_db(tmp_path)
|
|
||||||
from scripts.db import insert_task, get_task_for_job, get_research
|
|
||||||
|
|
||||||
task_id, _ = insert_task(db, "company_research", job_id)
|
|
||||||
fake_result = {
|
|
||||||
"raw_output": "raw", "company_brief": "brief",
|
|
||||||
"ceo_brief": "ceo", "talking_points": "points",
|
|
||||||
}
|
|
||||||
with patch("scripts.company_research.research_company", return_value=fake_result):
|
|
||||||
from scripts.task_runner import _run_task
|
|
||||||
_run_task(db, task_id, "company_research", job_id)
|
|
||||||
|
|
||||||
task = get_task_for_job(db, "company_research", job_id)
|
|
||||||
assert task["status"] == "completed"
|
|
||||||
|
|
||||||
research = get_research(db, job_id=job_id)
|
|
||||||
assert research["company_brief"] == "brief"
|
|
||||||
|
|
||||||
|
|
||||||
def test_run_task_marks_failed_on_exception(tmp_path):
|
|
||||||
"""_run_task marks status=failed and stores error when generator raises."""
|
|
||||||
db, job_id = _make_db(tmp_path)
|
|
||||||
from scripts.db import insert_task, get_task_for_job
|
|
||||||
task_id, _ = insert_task(db, "cover_letter", job_id)
|
|
||||||
|
|
||||||
with patch("scripts.generate_cover_letter.generate", side_effect=RuntimeError("LLM timeout")):
|
|
||||||
from scripts.task_runner import _run_task
|
|
||||||
_run_task(db, task_id, "cover_letter", job_id)
|
|
||||||
|
|
||||||
task = get_task_for_job(db, "cover_letter", job_id)
|
|
||||||
assert task["status"] == "failed"
|
|
||||||
assert "LLM timeout" in task["error"]
|
|
||||||
|
|
||||||
|
|
||||||
def test_submit_task_actually_completes(tmp_path):
|
|
||||||
"""Integration: submit_task spawns a thread that completes asynchronously."""
|
|
||||||
db, job_id = _make_db(tmp_path)
|
|
||||||
from scripts.db import get_task_for_job
|
|
||||||
|
|
||||||
with patch("scripts.generate_cover_letter.generate", return_value="Cover letter text"):
|
|
||||||
from scripts.task_runner import submit_task
|
|
||||||
task_id, _ = submit_task(db, "cover_letter", job_id)
|
|
||||||
# Wait for thread to complete (max 5s)
|
|
||||||
for _ in range(50):
|
|
||||||
task = get_task_for_job(db, "cover_letter", job_id)
|
|
||||||
if task and task["status"] in ("completed", "failed"):
|
|
||||||
break
|
|
||||||
time.sleep(0.1)
|
|
||||||
|
|
||||||
task = get_task_for_job(db, "cover_letter", job_id)
|
|
||||||
assert task["status"] == "completed"
|
|
||||||
```
|
|
||||||
|
|
||||||
### Step 2: Run tests to verify they fail
|
|
||||||
|
|
||||||
```bash
|
|
||||||
/devl/miniconda3/envs/job-seeker/bin/pytest tests/test_task_runner.py -v
|
|
||||||
```
|
|
||||||
|
|
||||||
Expected: FAIL with `ModuleNotFoundError: No module named 'scripts.task_runner'`
|
|
||||||
|
|
||||||
### Step 3: Implement scripts/task_runner.py
|
|
||||||
|
|
||||||
Create `scripts/task_runner.py`:
|
|
||||||
|
|
||||||
```python
|
|
||||||
# scripts/task_runner.py
|
|
||||||
"""
|
|
||||||
Background task runner for LLM generation tasks.
|
|
||||||
|
|
||||||
Submitting a task inserts a row in background_tasks and spawns a daemon thread.
|
|
||||||
The thread calls the appropriate generator, writes results to existing tables,
|
|
||||||
and marks the task completed or failed.
|
|
||||||
|
|
||||||
Deduplication: only one queued/running task per (task_type, job_id) is allowed.
|
|
||||||
Different task types for the same job run concurrently (e.g. cover letter + research).
|
|
||||||
"""
|
|
||||||
import sqlite3
|
|
||||||
import threading
|
|
||||||
from pathlib import Path
|
|
||||||
|
|
||||||
from scripts.db import (
|
|
||||||
DEFAULT_DB,
|
|
||||||
insert_task,
|
|
||||||
update_task_status,
|
|
||||||
update_cover_letter,
|
|
||||||
save_research,
|
|
||||||
)
|
|
||||||
|
|
||||||
|
|
||||||
def submit_task(db_path: Path = DEFAULT_DB, task_type: str = "",
|
|
||||||
job_id: int = None) -> tuple[int, bool]:
|
|
||||||
"""Submit a background LLM task.
|
|
||||||
|
|
||||||
Returns (task_id, True) if a new task was queued and a thread spawned.
|
|
||||||
Returns (existing_id, False) if an identical task is already in-flight.
|
|
||||||
"""
|
|
||||||
task_id, is_new = insert_task(db_path, task_type, job_id)
|
|
||||||
if is_new:
|
|
||||||
t = threading.Thread(
|
|
||||||
target=_run_task,
|
|
||||||
args=(db_path, task_id, task_type, job_id),
|
|
||||||
daemon=True,
|
|
||||||
)
|
|
||||||
t.start()
|
|
||||||
return task_id, is_new
|
|
||||||
|
|
||||||
|
|
||||||
def _run_task(db_path: Path, task_id: int, task_type: str, job_id: int) -> None:
|
|
||||||
"""Thread body: run the generator and persist the result."""
|
|
||||||
conn = sqlite3.connect(db_path)
|
|
||||||
conn.row_factory = sqlite3.Row
|
|
||||||
row = conn.execute("SELECT * FROM jobs WHERE id=?", (job_id,)).fetchone()
|
|
||||||
conn.close()
|
|
||||||
if row is None:
|
|
||||||
update_task_status(db_path, task_id, "failed", error=f"Job {job_id} not found")
|
|
||||||
return
|
|
||||||
|
|
||||||
job = dict(row)
|
|
||||||
update_task_status(db_path, task_id, "running")
|
|
||||||
|
|
||||||
try:
|
|
||||||
if task_type == "cover_letter":
|
|
||||||
from scripts.generate_cover_letter import generate
|
|
||||||
result = generate(
|
|
||||||
job.get("title", ""),
|
|
||||||
job.get("company", ""),
|
|
||||||
job.get("description", ""),
|
|
||||||
)
|
|
||||||
update_cover_letter(db_path, job_id, result)
|
|
||||||
|
|
||||||
elif task_type == "company_research":
|
|
||||||
from scripts.company_research import research_company
|
|
||||||
result = research_company(job)
|
|
||||||
save_research(db_path, job_id=job_id, **result)
|
|
||||||
|
|
||||||
else:
|
|
||||||
raise ValueError(f"Unknown task_type: {task_type!r}")
|
|
||||||
|
|
||||||
update_task_status(db_path, task_id, "completed")
|
|
||||||
|
|
||||||
except Exception as exc:
|
|
||||||
update_task_status(db_path, task_id, "failed", error=str(exc))
|
|
||||||
```
|
|
||||||
|
|
||||||
### Step 4: Run tests to verify they pass
|
|
||||||
|
|
||||||
```bash
|
|
||||||
/devl/miniconda3/envs/job-seeker/bin/pytest tests/test_task_runner.py -v
|
|
||||||
```
|
|
||||||
|
|
||||||
Expected: all tests PASS
|
|
||||||
|
|
||||||
### Step 5: Run full test suite
|
|
||||||
|
|
||||||
```bash
|
|
||||||
/devl/miniconda3/envs/job-seeker/bin/pytest tests/ -v
|
|
||||||
```
|
|
||||||
|
|
||||||
Expected: all tests PASS
|
|
||||||
|
|
||||||
### Step 6: Commit
|
|
||||||
|
|
||||||
```bash
|
|
||||||
git add scripts/task_runner.py tests/test_task_runner.py
|
|
||||||
git commit -m "feat: add task_runner — background thread executor for LLM tasks"
|
|
||||||
```
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Task 3: Add sidebar task indicator to app/app.py
|
|
||||||
|
|
||||||
**Files:**
|
|
||||||
- Modify: `app/app.py`
|
|
||||||
|
|
||||||
No new tests needed — this is pure UI wiring.
|
|
||||||
|
|
||||||
### Step 1: Replace the contents of app/app.py
|
|
||||||
|
|
||||||
Current file is 33 lines. Replace entirely with:
|
|
||||||
|
|
||||||
```python
# app/app.py
"""
Streamlit entry point — uses st.navigation() to control the sidebar.
Main workflow pages are listed at the top; Settings is separated into
a "System" section so it doesn't crowd the navigation.

Run: streamlit run app/app.py
     bash scripts/manage-ui.sh start
"""
import sys
from pathlib import Path

sys.path.insert(0, str(Path(__file__).parent.parent))

import streamlit as st
from scripts.db import DEFAULT_DB, init_db, get_active_tasks

st.set_page_config(
    page_title="Job Seeker",
    page_icon="💼",
    layout="wide",
)

init_db(DEFAULT_DB)

# ── Background task sidebar indicator ─────────────────────────────────────────
@st.fragment(run_every=3)
def _task_sidebar() -> None:
    tasks = get_active_tasks(DEFAULT_DB)
    if not tasks:
        return
    with st.sidebar:
        st.divider()
        st.markdown(f"**⏳ {len(tasks)} task(s) running**")
        for t in tasks:
            icon = "⏳" if t["status"] == "running" else "🕐"
            label = "Cover letter" if t["task_type"] == "cover_letter" else "Research"
            st.caption(f"{icon} {label} — {t.get('company') or 'unknown'}")

_task_sidebar()

# ── Navigation ─────────────────────────────────────────────────────────────────
pages = {
    "": [
        st.Page("Home.py", title="Home", icon="🏠"),
        st.Page("pages/1_Job_Review.py", title="Job Review", icon="📋"),
        st.Page("pages/4_Apply.py", title="Apply Workspace", icon="🚀"),
        st.Page("pages/5_Interviews.py", title="Interviews", icon="🎯"),
        st.Page("pages/6_Interview_Prep.py", title="Interview Prep", icon="📞"),
    ],
    "System": [
        st.Page("pages/2_Settings.py", title="Settings", icon="⚙️"),
    ],
}

pg = st.navigation(pages)
pg.run()
```

### Step 2: Smoke-test by running the UI

```bash
bash /devl/job-seeker/scripts/manage-ui.sh restart
```

Navigate to http://localhost:8501 and confirm the app loads without error. The sidebar task indicator does not appear when no tasks are running (correct).

### Step 3: Commit

```bash
git add app/app.py
git commit -m "feat: sidebar background task indicator with 3s auto-refresh"
```

---

## Task 4: Update 4_Apply.py to use background generation

**Files:**
- Modify: `app/pages/4_Apply.py`

No new unit tests — covered by existing test suite for DB layer. Smoke-test in browser.

### Step 1: Add imports at the top of 4_Apply.py

After the existing imports block (after `from scripts.db import ...`), add:

```python
from scripts.db import get_task_for_job
from scripts.task_runner import submit_task
```

So the full import block becomes:

```python
from scripts.db import (
    DEFAULT_DB, init_db, get_jobs_by_status,
    update_cover_letter, mark_applied,
    get_task_for_job,
)
from scripts.task_runner import submit_task
```

### Step 2: Replace the Generate button section

Find this block (around line 174–185):

```python
if st.button("✨ Generate / Regenerate", use_container_width=True):
    with st.spinner("Generating via LLM…"):
        try:
            from scripts.generate_cover_letter import generate as _gen
            st.session_state[_cl_key] = _gen(
                job.get("title", ""),
                job.get("company", ""),
                job.get("description", ""),
            )
            st.rerun()
        except Exception as e:
            st.error(f"Generation failed: {e}")
```

Replace with:

```python
_cl_task = get_task_for_job(DEFAULT_DB, "cover_letter", selected_id)
_cl_running = _cl_task and _cl_task["status"] in ("queued", "running")

if st.button("✨ Generate / Regenerate", use_container_width=True, disabled=bool(_cl_running)):
    submit_task(DEFAULT_DB, "cover_letter", selected_id)
    st.rerun()

if _cl_running:
    @st.fragment(run_every=3)
    def _cl_status_fragment():
        t = get_task_for_job(DEFAULT_DB, "cover_letter", selected_id)
        if t and t["status"] in ("queued", "running"):
            lbl = "Queued…" if t["status"] == "queued" else "Generating via LLM…"
            st.info(f"⏳ {lbl}")
        else:
            st.rerun()  # full page rerun — reloads cover letter from DB
    _cl_status_fragment()
elif _cl_task and _cl_task["status"] == "failed":
    st.error(f"Generation failed: {_cl_task.get('error', 'unknown error')}")
```

Also check the session-state initialiser just below (lines 171–172); it must load the value from the DB after a background task completes. The existing code already does this correctly:

```python
if _cl_key not in st.session_state:
    st.session_state[_cl_key] = job.get("cover_letter") or ""
```

This is fine — `job` is fetched fresh on each full-page rerun, so when the background thread writes to `jobs.cover_letter`, the next full rerun picks it up.

### Step 3: Smoke-test in browser

1. Navigate to Apply Workspace
2. Select an approved job
3. Click "Generate / Regenerate"
4. Navigate away to Home
5. Navigate back to Apply Workspace for the same job
6. Observe: button is disabled and "⏳ Generating via LLM…" shows while running; cover letter appears when done

### Step 4: Commit

```bash
git add app/pages/4_Apply.py
git commit -m "feat: cover letter generation runs in background, survives navigation"
```

---

## Task 5: Update 6_Interview_Prep.py to use background research

**Files:**
- Modify: `app/pages/6_Interview_Prep.py`

### Step 1: Add imports at the top of 6_Interview_Prep.py

After the existing `from scripts.db import (...)` block, add:

```python
from scripts.db import get_task_for_job
from scripts.task_runner import submit_task
```

So the full import block becomes:

```python
from scripts.db import (
    DEFAULT_DB, init_db,
    get_interview_jobs, get_contacts, get_research,
    save_research, get_task_for_job,
)
from scripts.task_runner import submit_task
```

### Step 2: Replace the "no research yet" generate button block

Find this block (around line 99–111):

```python
if not research:
    st.warning("No research brief yet for this job.")
    if st.button("🔬 Generate research brief", type="primary", use_container_width=True):
        with st.spinner("Generating… this may take 30–60 seconds"):
            try:
                from scripts.company_research import research_company
                result = research_company(job)
                save_research(DEFAULT_DB, job_id=selected_id, **result)
                st.success("Done!")
                st.rerun()
            except Exception as e:
                st.error(f"Error: {e}")
    st.stop()
else:
```

Replace with:

```python
_res_task = get_task_for_job(DEFAULT_DB, "company_research", selected_id)
_res_running = _res_task and _res_task["status"] in ("queued", "running")

if not research:
    if not _res_running:
        st.warning("No research brief yet for this job.")
        if _res_task and _res_task["status"] == "failed":
            st.error(f"Last attempt failed: {_res_task.get('error', '')}")
        if st.button("🔬 Generate research brief", type="primary", use_container_width=True):
            submit_task(DEFAULT_DB, "company_research", selected_id)
            st.rerun()

    if _res_running:
        @st.fragment(run_every=3)
        def _res_status_initial():
            t = get_task_for_job(DEFAULT_DB, "company_research", selected_id)
            if t and t["status"] in ("queued", "running"):
                lbl = "Queued…" if t["status"] == "queued" else "Generating… this may take 30–60 seconds"
                st.info(f"⏳ {lbl}")
            else:
                st.rerun()
        _res_status_initial()

    st.stop()
else:
```

### Step 3: Replace the "refresh" button block

Find this block (around line 113–124):

```python
generated_at = research.get("generated_at", "")
col_ts, col_btn = st.columns([3, 1])
col_ts.caption(f"Research generated: {generated_at}")
if col_btn.button("🔄 Refresh", use_container_width=True):
    with st.spinner("Refreshing…"):
        try:
            from scripts.company_research import research_company
            result = research_company(job)
            save_research(DEFAULT_DB, job_id=selected_id, **result)
            st.rerun()
        except Exception as e:
            st.error(f"Error: {e}")
```

Replace with:

```python
generated_at = research.get("generated_at", "")
col_ts, col_btn = st.columns([3, 1])
col_ts.caption(f"Research generated: {generated_at}")
if col_btn.button("🔄 Refresh", use_container_width=True, disabled=bool(_res_running)):
    submit_task(DEFAULT_DB, "company_research", selected_id)
    st.rerun()

if _res_running:
    @st.fragment(run_every=3)
    def _res_status_refresh():
        t = get_task_for_job(DEFAULT_DB, "company_research", selected_id)
        if t and t["status"] in ("queued", "running"):
            lbl = "Queued…" if t["status"] == "queued" else "Refreshing research…"
            st.info(f"⏳ {lbl}")
        else:
            st.rerun()
    _res_status_refresh()
elif _res_task and _res_task["status"] == "failed":
    st.error(f"Refresh failed: {_res_task.get('error', '')}")
```

### Step 4: Smoke-test in browser

1. Move a job to Phone Screen on the Interviews page
2. Navigate to Interview Prep, select that job
3. Click "Generate research brief"
4. Navigate away to Home
5. Navigate back — observe "⏳ Generating…" inline indicator
6. Wait for completion — research sections populate automatically

### Step 5: Run full test suite one final time

```bash
/devl/miniconda3/envs/job-seeker/bin/pytest tests/ -v
```

Expected: all tests PASS

### Step 6: Commit

```bash
git add app/pages/6_Interview_Prep.py
git commit -m "feat: company research generation runs in background, survives navigation"
```

---

## Summary of Changes

| File | Change |
|------|--------|
| `scripts/db.py` | Add `CREATE_BACKGROUND_TASKS`, `init_db` call, 4 new helpers |
| `scripts/task_runner.py` | New file — `submit_task` + `_run_task` thread body |
| `app/app.py` | Add `_task_sidebar` fragment with 3s auto-refresh |
| `app/pages/4_Apply.py` | Generate button → `submit_task`; inline status fragment |
| `app/pages/6_Interview_Prep.py` | Generate/Refresh buttons → `submit_task`; inline status fragments |
| `tests/test_db.py` | 9 new tests for background_tasks helpers |
| `tests/test_task_runner.py` | New file — 6 tests for task_runner |
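
For orientation, here is a minimal sketch of the `background_tasks` schema those `scripts/db.py` helpers operate on. The exact column names and defaults are assumptions inferred from how Tasks 2–5 use the table, not the final DDL:

```python
# scripts/db.py (sketch): assumed schema behind CREATE_BACKGROUND_TASKS.
# Only id, job_id, task_type, status and error are implied by the code above;
# the timestamp columns are illustrative.
CREATE_BACKGROUND_TASKS = """
CREATE TABLE IF NOT EXISTS background_tasks (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    job_id     INTEGER NOT NULL,
    task_type  TEXT NOT NULL,                     -- 'cover_letter' | 'company_research'
    status     TEXT NOT NULL DEFAULT 'queued',    -- queued | running | completed | failed
    error      TEXT,
    created_at TEXT DEFAULT (datetime('now')),
    updated_at TEXT DEFAULT (datetime('now'))
);
"""
```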

@@ -1,91 +0,0 @@
# Email Handling Design

**Date:** 2026-02-21
**Status:** Approved

## Problem

IMAP sync already pulls emails for active pipeline jobs, but two gaps exist:
1. Inbound emails suggesting a stage change (e.g. "let's schedule a call") produce no signal — the recruiter's message just sits in the email log.
2. Recruiter outreach to email addresses not yet in the pipeline is invisible — those leads never enter Job Review.

## Goals

- Surface stage-change suggestions inline on the Interviews kanban card (suggest-only, never auto-advance).
- Capture recruiter leads from unmatched inbound email and surface them in Job Review.
- Make email sync a background task triggerable from the UI (Home page + Interviews sidebar).

## Data Model

**No new tables.** Two columns added to `job_contacts`:

```sql
ALTER TABLE job_contacts ADD COLUMN stage_signal TEXT;
ALTER TABLE job_contacts ADD COLUMN suggestion_dismissed INTEGER DEFAULT 0;
```

- `stage_signal` — one of: `interview_scheduled`, `offer_received`, `rejected`, `positive_response`, `neutral` (or NULL if not yet classified).
- `suggestion_dismissed` — 1 when the user clicks Dismiss; prevents the banner re-appearing.

Email leads reuse the existing `jobs` table with `source = 'email'` and `status = 'pending'`. No new columns needed.

## Components

### 1. Stage Signal Classification (`scripts/imap_sync.py`)

After saving each **inbound** contact row, call `phi3:mini` via Ollama to classify the email into one of the five labels. Store the result in `stage_signal`. If classification fails, default to `NULL` (no suggestion shown).

**Model:** `phi3:mini` via `LLMRouter.complete(model_override="phi3:mini", fallback_order=["ollama_research"])`.
Benchmarked at 100% accuracy / 3.0 s per email on a 12-case test suite. The runner-up, Qwen2.5-3B, was not tested, so phi3:mini is the safer choice.
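
A minimal sketch of what the classification call could look like. Only the model override and fallback order come from this design; the prompt wording, passing the prompt positionally to `LLMRouter.complete()`, and the helper name are assumptions:

```python
# scripts/imap_sync.py (sketch): classify an inbound email into one of the five labels.
# The prompt and the complete(prompt, ...) call shape are assumptions, not the final code.
STAGE_LABELS = {"interview_scheduled", "offer_received", "rejected", "positive_response", "neutral"}

def classify_stage_signal(router, email_body: str) -> str | None:
    prompt = (
        "Classify this recruiter email with exactly one label: "
        "interview_scheduled, offer_received, rejected, positive_response, neutral.\n\n"
        f"{email_body[:2000]}\n\nLabel:"
    )
    try:
        reply = router.complete(prompt, model_override="phi3:mini", fallback_order=["ollama_research"])
        label = reply.strip().split()[0].strip(".,").lower()
        return label if label in STAGE_LABELS else None
    except Exception:
        return None  # classification failure: stage_signal stays NULL, no suggestion shown
```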

### 2. Recruiter Lead Extraction (`scripts/imap_sync.py`)

A second pass after per-job sync: scan INBOX broadly for recruitment-keyword emails that don't match any known pipeline company. For each unmatched email, call **Nemotron 1.5B** (already in use for company research) to extract `{company, title}`. If extraction returns a company name not already in the DB, insert a new job row `source='email', status='pending'`.

**Dedup:** checked by `message_id` against all known contacts (cross-job), plus `url` uniqueness on the jobs table (the email lead URL is set to a synthetic `email://<from_domain>/<message_id>` value).
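
A small sketch of the dedup check and the synthetic lead URL described above; the helper names and the `message_id` column lookup on `job_contacts` are illustrative assumptions:

```python
# scripts/imap_sync.py (sketch): synthetic URL + duplicate check for email leads.
def _lead_url(from_addr: str, message_id: str) -> str:
    domain = from_addr.rsplit("@", 1)[-1].lower()
    return f"email://{domain}/{message_id.strip('<>')}"

def _is_duplicate(conn, message_id: str, url: str) -> bool:
    # seen before as a contact on any job, or already inserted as an email lead
    seen_contact = conn.execute(
        "SELECT 1 FROM job_contacts WHERE message_id = ?", (message_id,)
    ).fetchone()
    seen_job = conn.execute("SELECT 1 FROM jobs WHERE url = ?", (url,)).fetchone()
    return bool(seen_contact or seen_job)
```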

### 3. Background Task (`scripts/task_runner.py`)

New task type: `email_sync` with `job_id = 0`.
`submit_task(db, "email_sync", 0)` → daemon thread → `sync_all()` → returns summary via task `error` field.

Deduplication: only one `email_sync` can be queued/running at a time (existing insert_task logic handles this).
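
Roughly, the extra branch in `_run_task` could look like the helper below (pulled out as a standalone function so the sketch is self-contained). Treating `sync_all()`'s return value as a printable summary is an assumption from this design note:

```python
# scripts/task_runner.py (sketch): the email_sync branch, extracted as a helper.
# In the real file this would be another elif alongside cover_letter / company_research.
from scripts.db import update_task_status

def _run_email_sync(db_path: str, task_id: int) -> None:
    from scripts.imap_sync import sync_all
    try:
        summary = sync_all()
        # per this design, the task's error column doubles as the sync summary
        update_task_status(db_path, task_id, "completed", error=str(summary))
    except Exception as exc:
        update_task_status(db_path, task_id, "failed", error=str(exc))
```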

### 4. UI — Sync Button (Home + Interviews)

**Home.py:** New "Sync Emails" section alongside Find Jobs / Score / Notion sync.
**5_Interviews.py:** The sync button already present in the sidebar is converted from a synchronous `sync_all()` call to `submit_task()` + fragment polling (see the sketch below).
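
A minimal sketch of that conversion, reusing the pattern from Tasks 4–5; the button label and using `get_task_for_job(db, "email_sync", 0)` as the lookup are assumptions:

```python
# app/pages/5_Interviews.py (sketch): sidebar sync button converted to a background task.
sync_task = get_task_for_job(DEFAULT_DB, "email_sync", 0)
sync_running = sync_task and sync_task["status"] in ("queued", "running")

with st.sidebar:
    if st.button("📧 Sync emails", use_container_width=True, disabled=bool(sync_running)):
        submit_task(DEFAULT_DB, "email_sync", 0)
        st.rerun()

    if sync_running:
        @st.fragment(run_every=3)
        def _sync_status():
            t = get_task_for_job(DEFAULT_DB, "email_sync", 0)
            if t and t["status"] in ("queued", "running"):
                st.info("⏳ Syncing emails…")
            else:
                st.rerun()
        _sync_status()
```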

### 5. UI — Email Leads (Job Review)

When `show_status == "pending"`, prepend email leads (`source = 'email'`) at the top of the list with a distinct `📧 Email Lead` badge. Actions are identical to scraped pending jobs (Approve / Reject).

### 6. UI — Stage Suggestion Banner (Interviews Kanban)

Inside `_render_card()`, before the advance/reject buttons, check for unseen stage signals:

```
💡 Email suggests: interview_scheduled
From: sarah@company.com · "Let's book a call"
[→ Move to Phone Screen] [Dismiss]
```

- "Move" calls `advance_to_stage()` + `submit_task("company_research")`, then reruns (see the sketch after this list).
- "Dismiss" calls `dismiss_stage_signal(contact_id)`, then reruns.
- Only the most recent undismissed signal is shown per card.
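
A minimal sketch of the banner logic inside `_render_card()`. The `get_latest_stage_signal` helper, the contact field names, the target stage value, and the exact argument lists for `advance_to_stage()` / `dismiss_stage_signal()` are assumptions; only the overall flow comes from this design:

```python
# app/pages/5_Interviews.py (sketch): stage-suggestion banner inside _render_card(job).
signal = get_latest_stage_signal(DEFAULT_DB, job["id"])  # most recent undismissed signal, or None
if signal:
    st.info(
        f"💡 Email suggests: {signal['stage_signal']}\n\n"
        f"From: {signal['email_from']} · \"{signal['subject']}\""
    )
    move_col, dismiss_col = st.columns(2)
    if move_col.button("→ Move to Phone Screen", key=f"move_{job['id']}"):
        advance_to_stage(DEFAULT_DB, job["id"], "phone_screen")  # assumed signature and stage key
        submit_task(DEFAULT_DB, "company_research", job["id"])
        st.rerun()
    if dismiss_col.button("Dismiss", key=f"dismiss_{job['id']}"):
        dismiss_stage_signal(DEFAULT_DB, signal["id"])  # assumed db-path argument
        st.rerun()
```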

## Error Handling

| Failure | Behaviour |
|---------|-----------|
| IMAP connection fails | Error stored in task `error` field; shown as warning in UI after sync |
| Classifier call fails | `stage_signal` left NULL; no suggestion shown; sync continues |
| Lead extractor fails | Email skipped; appended to `result["errors"]`; sync continues |
| Duplicate `email_sync` task | `insert_task` returns existing id; no new thread spawned |
| LLM extraction returns no company | Email silently skipped (not a lead) |

## Out of Scope

- Auto-advancing pipeline stage (suggest only).
- Sending email replies from the app (draft helper already exists).
- OAuth / token-refresh IMAP (config/email.yaml credentials only).