ci: retrigger after Docker network pool fix

ci: add freeze/** branches to CI trigger
freeze/rc branches were not covered by the push trigger, leaving RC-stage work untested. Adds 'freeze/**' alongside existing patterns.
2026-06-26 20:41:18 -07:00 · 2026-06-26 19:24:40 -07:00 · 2026-06-15 16:52:56 -07:00 · 2026-06-15 09:11:14 -07:00 · 2026-06-14 20:03:40 -07:00 · 2026-06-14 15:21:53 -07:00
58 changed files with 4055 additions and 506 deletions
--- a/.dockerignore
+++ b/.dockerignore
@ -3,17 +3,20 @@ __pycache__
 *.pyc
 *.pyo
 staging.db
+# gitignored secrets — belt-and-suspenders with the RUN rm -f in Dockerfile
 config/user.yaml
+config/plain_text_resume.yaml
 config/notion.yaml
 config/email.yaml
 config/tokens.yaml
 config/craigslist.yaml
+config/adzuna.yaml
+.env
 .streamlit.pid
 .streamlit.log
 aihawk/
 docs/
 tests/
-.env
 data/
 log/
 unsloth_compiled_cache/
--- a/.env.example
+++ b/.env.example
@ -2,10 +2,10 @@
 # Auto-generated by the setup wizard, or fill in manually.
 # NEVER commit .env to git.

-STREAMLIT_PORT=8502
+VUE_PORT=8506
 OLLAMA_PORT=11434
 VLLM_PORT=8000
-CF_TEXT_PORT=8006
+CF_TEXT_PORT=8008
 SEARXNG_PORT=8888
 VISION_PORT=8002
 VISION_MODEL=vikhyatk/moondream2
@ -35,7 +35,8 @@ OPENAI_COMPAT_URL=
 OPENAI_COMPAT_KEY=

 # Feedback button — Forgejo issue filing
-FORGEJO_API_TOKEN=
+FORGEJO_API_TOKEN=             # dev/admin token (your personal account)
+FORGEJO_BOT_TOKEN=             # cf-bugbot bot token — used for in-app feedback; falls back to FORGEJO_API_TOKEN
 FORGEJO_REPO=pyr0ball/peregrine
 FORGEJO_API_URL=https://git.opensourcesolarpunk.com/api/v1
 # GITHUB_TOKEN=          # future — enable when public mirror is active
@ -64,8 +65,28 @@ CF_ORCH_AGENT_PORT=7701
 # Cloud multi-tenancy (compose.cloud.yml only — do not set for local installs)
 CLOUD_MODE=false
 CLOUD_DATA_ROOT=/devl/menagerie-data
+SYNC_DB_PATH=                  # optional; defaults to CLOUD_DATA_ROOT/sync.db
+SYNC_DB_KEY=                   # optional; SQLCipher key for at-rest encryption
 DIRECTUS_JWT_SECRET=           # must match website/.env DIRECTUS_SECRET value
 CF_SERVER_SECRET=              # random 64-char hex — generate: openssl rand -hex 32
 PLATFORM_DB_URL=postgresql://cf_platform:<password>@host.docker.internal:5433/circuitforge_platform
 HEIMDALL_URL=http://cf-license:8000   # internal Docker URL; override for external access
 HEIMDALL_ADMIN_TOKEN=                 # must match ADMIN_TOKEN in circuitforge-license .env
+
+# ── Memory (mnemo sidecar) — opt-in, requires --profile memory ───────────────
+# Launch with: docker compose --profile memory --profile <gpu-profile> up -d
+# Mnemo builds a persistent knowledge graph from conversations and injects
+# relevant context back into LLM prompts. Uses the ollama service as its LLM.
+MNEMO_HOST=mnemo                         # internal service name; change for external sidecar
+MNEMO_PORT=8080
+MNEMO_LLM_PROVIDER=ollama               # ollama | openai | anthropic | custom
+MNEMO_LLM_BASE_URL=http://ollama:11434/v1  # override for external LLM
+MNEMO_LLM_API_KEY=ollama                # "ollama" is a dummy value for local Ollama
+MNEMO_LLM_MODEL=llama3.2:3b            # must be pulled in the ollama container
+
+# ── Rate limiting (LLM generation endpoints) ─────────────────────────────────
+LLM_RATE_COVER_LETTER=20/hour
+LLM_RATE_RESEARCH=10/hour
+LLM_RATE_QA_SUGGEST=60/hour
+LLM_RATE_SURVEY=30/hour
+LLM_RATE_WIZARD=60/hour
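The `LLM_RATE_*` values above use a `count/period` shape. As a rough illustration of what such a string encodes, the sketch below splits one into a request count and a window length in seconds. `parse_rate` and `rate_from_env` are hypothetical helpers, not part of this change; the diff itself hands these strings to a `limiter` imported from `scripts/rate_limit.py`, whose parsing may differ.

```python
import os

# Hypothetical helper for illustration only; not part of the diff.
_PERIOD_SECONDS = {"second": 1, "minute": 60, "hour": 3600, "day": 86400}


def parse_rate(spec: str) -> tuple[int, int]:
    """Split a 'count/period' string into (count, window_seconds)."""
    count_text, sep, period = spec.strip().partition("/")
    if not sep:
        raise ValueError(f"expected 'count/period', got {spec!r}")
    period = period.strip().lower()
    if period not in _PERIOD_SECONDS:
        raise ValueError(f"unknown period {period!r} in {spec!r}")
    count = int(count_text)
    if count <= 0:
        raise ValueError(f"count must be positive in {spec!r}")
    return count, _PERIOD_SECONDS[period]


def rate_from_env(name: str, default: str) -> tuple[int, int]:
    """Read a rate from the environment, treating an empty value as unset."""
    return parse_rate(os.environ.get(name) or default)
```

With the defaults shown in `.env.example`, `parse_rate("20/hour")` describes 20 requests per 3600-second window.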
--- a/.forgejo/workflows/release.yml
+++ b/.forgejo/workflows/release.yml
@ -1,12 +1,20 @@
 # Tag-triggered release workflow.
-# Generates changelog and creates Forgejo release on v* tags.
-# Copied from Circuit-Forge/cf-agents workflows/release.yml
+# Generates changelog, publishes Docker images to GHCR, and creates Forgejo release.
 #
-# Docker push is intentionally disabled — BSL 1.1 registry policy not yet resolved.
-# Tracked in Circuit-Forge/cf-agents#3. Re-enable the Docker steps when that lands.
+# Images published on v* tags:
+#   ghcr.io/circuitforgellc/peregrine:latest        — FastAPI API (includes cf-orch)
+#   ghcr.io/circuitforgellc/peregrine:<tag>
+#   ghcr.io/circuitforgellc/peregrine-web:latest    — Vue SPA (base path /)
+#   ghcr.io/circuitforgellc/peregrine-web:<tag>
 #
-# Required secrets: FORGEJO_RELEASE_TOKEN
-# (GHCR_TOKEN not needed until Docker push is enabled)
+# The cloud image (compose.cloud.yml) is never published — it is built and
+# deployed directly on Heimdall from Dockerfile.cfcore with sibling repos.
+#
+# Required secrets:
+#   FORGEJO_RELEASE_TOKEN    — Forgejo API token for creating releases
+#   GH_GHCR_TOKEN            — GitHub PAT with packages:write for GHCR push
+#   FORGEJO_CF_ORCH_TOKEN    — Forgejo token to install private circuitforge-orch
+#                              during the API image build (BSL client for paid tier)

 name: Release

@ -32,28 +40,56 @@ jobs:
        env:
          OUTPUT: CHANGES.md

-      # ── Docker (disabled — BSL registry policy pending cf-agents#3) ──────────
-      # - name: Set up QEMU
-      #   uses: docker/setup-qemu-action@v3
-      # - name: Set up Buildx
-      #   uses: docker/setup-buildx-action@v3
-      # - name: Log in to GHCR
-      #   uses: docker/login-action@v3
-      #   with:
-      #     registry: ghcr.io
-      #     username: ${{ github.actor }}
-      #     password: ${{ secrets.GHCR_TOKEN }}
-      # - name: Build and push Docker image
-      #   uses: docker/build-push-action@v6
-      #   with:
-      #     context: .
-      #     push: true
-      #     platforms: linux/amd64,linux/arm64
-      #     tags: |
-      #       ghcr.io/circuitforgellc/peregrine:${{ github.ref_name }}
-      #       ghcr.io/circuitforgellc/peregrine:latest
-      #     cache-from: type=gha
-      #     cache-to: type=gha,mode=max
+      # ── Docker setup ─────────────────────────────────────────────────────────
+      - name: Set up QEMU
+        uses: docker/setup-qemu-action@v3
+
+      - name: Set up Buildx
+        uses: docker/setup-buildx-action@v3
+
+      - name: Log in to GHCR
+        uses: docker/login-action@v3
+        with:
+          registry: ghcr.io
+          username: ${{ github.actor }}
+          password: ${{ secrets.GH_GHCR_TOKEN }}
+
+      # ── API image ─────────────────────────────────────────────────────────────
+      # cf-orch (BSL, private) is installed via BuildKit secret — token never
+      # appears in any image layer. Community builds without the secret fall back
+      # to local backends automatically.
+      - name: Build and push API image
+        uses: docker/build-push-action@v6
+        with:
+          context: .
+          file: Dockerfile
+          push: true
+          platforms: linux/amd64,linux/arm64
+          secrets: |
+            forgejo_token=${{ secrets.FORGEJO_CF_ORCH_TOKEN }}
+          tags: |
+            ghcr.io/circuitforgellc/peregrine:${{ github.ref_name }}
+            ghcr.io/circuitforgellc/peregrine:latest
+          cache-from: type=gha,scope=api
+          cache-to: type=gha,mode=max,scope=api
+
+      # ── Web image ─────────────────────────────────────────────────────────────
+      # Published with VITE_BASE_PATH=/ (self-hosted default).
+      # Cloud and demo deployments build locally with VITE_BASE_PATH=/peregrine/.
+      - name: Build and push web image
+        uses: docker/build-push-action@v6
+        with:
+          context: .
+          file: docker/web/Dockerfile
+          push: true
+          platforms: linux/amd64,linux/arm64
+          build-args: |
+            VITE_BASE_PATH=/
+          tags: |
+            ghcr.io/circuitforgellc/peregrine-web:${{ github.ref_name }}
+            ghcr.io/circuitforgellc/peregrine-web:latest
+          cache-from: type=gha,scope=web
+          cache-to: type=gha,mode=max,scope=web

      # ── Forgejo Release ───────────────────────────────────────────────────────
      - name: Create Forgejo release
--- a/.gitignore
+++ b/.gitignore
@ -60,3 +60,4 @@ demo/seed_demo.py
 tests/e2e/results/demo/
 tests/e2e/results/cloud/
 tests/e2e/results/local/
+config/wizard-test/
--- a/Dockerfile
+++ b/Dockerfile
@ -1,30 +1,59 @@
-# Dockerfile
+# Dockerfile — Peregrine release build
+# Self-contained single-repo context. Used for published images and community builds.
+#
+# cf-core: installed from public Forgejo via requirements.txt
+# cf-orch: BSL-licensed cloud inference client; installed only when the
+#          forgejo_token BuildKit secret is present (release CI).
+#          Community builds skip it gracefully — local Ollama/vllm still work.
+#
+# Release CI (Forgejo):
+#   docker buildx build --secret id=forgejo_token,env=FORGEJO_TOKEN -t peregrine:latest .
+#
+# Community / source build:
+#   docker buildx build -t peregrine:latest .
+#
+# Previously this file ran Streamlit (app/app.py). Streamlit was removed in
+# peregrine#104. The runtime is now uvicorn (FastAPI). Dockerfile.cfcore remains
+# for the cloud deployment on Heimdall, where sibling repos are available.
+
 FROM python:3.11-slim

 WORKDIR /app

-# System deps for companyScraper (beautifulsoup4, fake-useragent, lxml) and PDF gen
-# libsqlcipher-dev: required to build pysqlcipher3 (SQLCipher AES-256 encryption for cloud mode)
 RUN apt-get update && apt-get install -y --no-install-recommends \
    gcc libffi-dev curl libsqlcipher-dev git \
    && rm -rf /var/lib/apt/lists/*

 COPY requirements.txt .
-# Install Python dependencies
 RUN pip install --no-cache-dir -r requirements.txt

-# Install Playwright browser (cached separately from Python deps so requirements
-# changes don't bust the ~600–900 MB Chromium layer and vice versa)
+# cf-orch BSL client — cloud inference routing for paid/premium tier.
+# The --mount=type=secret keeps the token out of all image layers.
+# If no secret is provided the pip install is skipped; the app falls back to
+# local backends (Ollama, vllm) and tier gating blocks cloud-orch features.
+RUN --mount=type=secret,id=forgejo_token \
+    TOKEN=$(cat /run/secrets/forgejo_token 2>/dev/null || true) && \
+    if [ -n "$TOKEN" ]; then \
+      pip install --no-cache-dir \
+        "git+https://x-access-token:${TOKEN}@git.opensourcesolarpunk.com/Circuit-Forge/circuitforge-orch.git@main" \
+        && echo "cf-orch installed"; \
+    else \
+      echo "cf-orch skipped (community build — local backends available)"; \
+    fi
+
+# Chromium for Playwright-based scrapers (companyScraper, job board scraping)
 RUN playwright install chromium && playwright install-deps chromium

-# Bundle companyScraper (company research web scraper)
 COPY scrapers/ /app/scrapers/
-
 COPY . .

-EXPOSE 8501
+# Strip gitignored secrets that may exist in a local checkout.
+# Defense-in-depth: .dockerignore already excludes these, but an explicit rm
+# guarantees they never appear in the image even if .dockerignore is misconfigured.
+RUN rm -f config/user.yaml config/plain_text_resume.yaml config/notion.yaml \
+          config/email.yaml config/tokens.yaml config/craigslist.yaml \
+          config/adzuna.yaml .env

-CMD ["streamlit", "run", "app/app.py", \
-     "--server.port=8501", \
-     "--server.headless=true", \
-     "--server.fileWatcherType=none"]
+EXPOSE 8601
+
+CMD ["uvicorn", "dev_api:app", "--host", "0.0.0.0", "--port", "8601"]
--- a/README.md
+++ b/README.md
@ -70,7 +70,7 @@ cd peregrine
 ./manage.sh start
 ```

-Open **http://localhost:8502** — the setup wizard walks you through the rest.
+Open **http://localhost:8506** — the setup wizard walks you through the rest.

 > **macOS / Apple Silicon:** install Ollama natively via Homebrew before starting for Metal GPU-accelerated inference. `install.sh` handles this automatically.
 > **Windows:** use WSL2 with Ubuntu.
@ -78,10 +78,11 @@ Open **http://localhost:8502** — the setup wizard walks you through the rest.
 ### Inference profiles

 ```bash
-./manage.sh start                       # remote — no GPU; LLM calls go to Anthropic / OpenAI
-./manage.sh start --profile cpu         # local Ollama on CPU (or Metal via native Ollama on macOS)
+./manage.sh start                       # cpu — local Ollama on CPU (recommended default)
 ./manage.sh start --profile single-gpu  # Ollama + vision on GPU 0 (NVIDIA only)
 ./manage.sh start --profile dual-gpu    # Ollama + vLLM on two NVIDIA GPUs
+./manage.sh start --profile cf-orch     # no local LLM — route to CircuitForge GPU cluster
+./manage.sh start --profile remote      # no local LLM — use cloud API keys
 ```

 ---
@ -109,7 +110,7 @@ Open **http://localhost:8502** — the setup wizard walks you through the rest.
 | **Voice guidelines** (custom writing style and tone) | Premium with LLM ¹ |
 | Cover letter model fine-tuning — your writing, your model | Premium |
 | Multi-user support | Premium |
-| Human-in-the-loop operator (CAPTCHAs, phone calls, wet signatures) | Ultra |
+| Human-in-the-loop operator (CAPTCHAs, phone calls, wet signatures) | Premium |

 ¹ **BYOK (bring your own key) unlock:** configure any LLM backend — a local [Ollama](https://ollama.com) or vLLM instance, or your own API key (Anthropic, OpenAI-compatible) — and all "Free with LLM" and "Premium with LLM" features unlock at no charge.

@ -180,4 +181,8 @@ Peregrine uses a split license:

 Fine-tuned model weights are proprietary and per-user — not redistributable.

+---
+
+Humans own design, architecture, code review, testing, and verification. LLMs are part of our development workflow. [Our positions on LLM use →](https://circuitforge.tech/positions)
+
 © 2026 Circuit Forge LLC
--- a/app/wizard/step_hardware.py
+++ b/app/wizard/step_hardware.py
@ -1,6 +1,6 @@
 """Step 1 — Hardware detection and inference profile selection."""

-PROFILES = ["remote", "cpu", "single-gpu", "dual-gpu"]
+PROFILES = ["cpu", "single-gpu", "dual-gpu", "cf-orch", "remote"]


 def validate(data: dict) -> list[str]:
--- a/app/wizard/tiers.py
+++ b/app/wizard/tiers.py
@ -41,6 +41,7 @@ FEATURES: dict[str, str] = {
    "llm_voice_guidelines":         "premium",
    "llm_job_titles":               "paid",
    "llm_mission_notes":            "paid",
+    "llm_ai_wizard":                "paid",

    # Orchestration — stays gated (background data pipeline, not just an LLM call)
    "llm_keywords_blocklist":       "paid",
@ -79,6 +80,7 @@ BYOK_UNLOCKABLE: frozenset[str] = frozenset({
    "llm_voice_guidelines",
    "llm_job_titles",
    "llm_mission_notes",
+    "llm_ai_wizard",
    "company_research",
    "interview_prep",
    "survey_assistant",
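The two tables touched in `tiers.py` work together: `FEATURES` maps a feature to its minimum tier, and `BYOK_UNLOCKABLE` lists the features that a user-supplied LLM backend unlocks regardless of tier. A minimal sketch of that gating rule follows. The tier ordering, the `can_use` name and signature, and the "unknown feature is ungated" behavior are assumptions made for illustration; only the feature names and tier strings come from the diff.

```python
# Assumed ordering, lowest to highest; the diff only shows "paid" and "premium".
TIER_ORDER = ["free", "paid", "premium"]

# Small excerpt of the tables shown in the diff.
FEATURES = {
    "llm_voice_guidelines": "premium",
    "llm_ai_wizard": "paid",
    "llm_keywords_blocklist": "paid",  # orchestration: stays gated
}
BYOK_UNLOCKABLE = frozenset({"llm_voice_guidelines", "llm_ai_wizard"})


def can_use(feature: str, tier: str, has_byok: bool = False) -> bool:
    """Hypothetical gate: True if the tier (or a BYOK backend) allows the feature."""
    required = FEATURES.get(feature)
    if required is None:
        return True  # assumption: features missing from the table are ungated
    if has_byok and feature in BYOK_UNLOCKABLE:
        return True
    return TIER_ORDER.index(tier) >= TIER_ORDER.index(required)
```

Under these assumptions a free-tier user with their own backend gets `llm_ai_wizard`, while `llm_keywords_blocklist` stays locked because it is absent from `BYOK_UNLOCKABLE`.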
--- a/compose.demo.yml
+++ b/compose.demo.yml
@ -16,6 +16,7 @@
 services:

  api:
+    image: ghcr.io/circuitforgellc/peregrine:latest
    build: .
    command: >
      bash -c "uvicorn dev_api:app --host 0.0.0.0 --port 8601"
@ -42,6 +43,8 @@ services:
    # No host port — nginx proxies /api/ → api:8601 internally

  web:
+    # Built with VITE_BASE_PATH=/peregrine/ — not the same as the published
+    # peregrine-web:latest image (which uses base path /). Always build locally.
    build:
      context: .
      dockerfile: docker/web/Dockerfile
--- a/compose.wizard-test.yml
+++ b/compose.wizard-test.yml
@ -0,0 +1,62 @@
+# compose.wizard-test.yml — Fresh first-run instance for testing wizard/onboarding flows
+#
+# Spins up on port 8507 with ephemeral storage so every `docker compose restart`
+# gives a completely clean slate. Perfect for exercising the onboarding wizard,
+# AI interview, and first-run UX without touching the real data.
+#
+# Usage:
+#   docker compose -f compose.wizard-test.yml --project-name peregrine-wizard up -d
+#   docker compose -f compose.wizard-test.yml --project-name peregrine-wizard restart api
+#   docker compose -f compose.wizard-test.yml --project-name peregrine-wizard down
+
+services:
+
+  api:
+    image: ghcr.io/circuitforgellc/peregrine:latest   # same image as main compose
+    command: >
+      bash -c "uvicorn dev_api:app --host 0.0.0.0 --port 8601"
+    volumes:
+      - ./config/wizard-test:/app/config   # LLM config only — no user.yaml triggers wizard
+    tmpfs:
+      - /app/data                           # ephemeral DB; wipes on restart → clean first-run every time
+    environment:
+      - STAGING_DB=/app/data/staging.db
+      - DOCS_DIR=/tmp/wizard-test-docs
+      - PYTHONUNBUFFERED=1
+      - CF_ORCH_URL=http://host.docker.internal:7700
+      - CF_APP_NAME=peregrine
+      - GPU_SERVER_URL=http://host.docker.internal:7700
+      - HEIMDALL_URL=http://host.docker.internal:8000    # license check — skip for local testing
+    extra_hosts:
+      - "host.docker.internal:host-gateway"
+    deploy:
+      resources:
+        reservations:
+          devices:
+            - driver: nvidia
+              count: all
+              capabilities: [gpu]
+    depends_on:
+      searxng:
+        condition: service_healthy
+    restart: unless-stopped
+    # No host port — nginx in web proxies /api/ → api:8601
+
+  web:
+    image: ghcr.io/circuitforgellc/peregrine-web:latest   # same image as main compose
+    ports:
+      - "8507:80"
+    depends_on:
+      - api
+    restart: unless-stopped
+
+  searxng:
+    image: searxng/searxng:latest
+    volumes:
+      - ./docker/searxng:/etc/searxng:ro
+    healthcheck:
+      test: ["CMD", "wget", "-q", "--spider", "http://localhost:8080/"]
+      interval: 10s
+      timeout: 5s
+      retries: 3
+    restart: unless-stopped
--- a/compose.yml
+++ b/compose.yml
@ -3,9 +3,10 @@
 services:

  api:
+    image: ghcr.io/circuitforgellc/peregrine:latest
    build:
-      context: ..
-      dockerfile: peregrine/Dockerfile.cfcore
+      context: .
+      dockerfile: Dockerfile
    command: >
      bash -c "uvicorn dev_api:app --host 0.0.0.0 --port 8601"
    volumes:
@ -23,12 +24,15 @@ services:
      - GPU_SERVER_URL=${GPU_SERVER_URL:-${CF_ORCH_URL:-http://host.docker.internal:7700}}
      - CF_ORCH_URL=${CF_ORCH_URL:-${GPU_SERVER_URL:-http://host.docker.internal:7700}}
      - CF_APP_NAME=peregrine
+      - MNEMO_HOST=${MNEMO_HOST:-mnemo}
+      - MNEMO_PORT=${MNEMO_PORT:-8080}
      - PYTHONUNBUFFERED=1
    extra_hosts:
      - "host.docker.internal:host-gateway"
    restart: unless-stopped

  web:
+    image: ghcr.io/circuitforgellc/peregrine-web:latest
    build:
      context: .
      dockerfile: docker/web/Dockerfile
@ -116,6 +120,28 @@ services:
    profiles: [single-gpu, dual-gpu-ollama, dual-gpu-vllm, dual-gpu-mixed]
    restart: unless-stopped

+  mnemo:
+    image: ghcr.io/zaydmulani09/mnemo:latest
+    ports:
+      - "${MNEMO_PORT:-8080}:8080"
+    volumes:
+      - mnemo-data:/data
+    environment:
+      - MNEMO_DB_PATH=/data/mnemo.db
+      - MNEMO_LLM_PROVIDER=${MNEMO_LLM_PROVIDER:-ollama}
+      - MNEMO_LLM_BASE_URL=${MNEMO_LLM_BASE_URL:-http://ollama:11434/v1}
+      - MNEMO_LLM_API_KEY=${MNEMO_LLM_API_KEY:-ollama}
+      - MNEMO_LLM_MODEL=${MNEMO_LLM_MODEL:-llama3.2:3b}
+    depends_on:
+      - ollama
+    healthcheck:
+      test: ["CMD", "wget", "-q", "--spider", "http://localhost:8080/health"]
+      interval: 15s
+      timeout: 5s
+      retries: 3
+    profiles: [memory]
+    restart: unless-stopped
+
  finetune:
    build:
      context: .
@ -131,3 +157,6 @@ services:
      - OLLAMA_MODELS_OLLAMA_PATH=/root/.ollama
    profiles: [finetune]
    restart: "no"
+
+volumes:
+  mnemo-data:
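The `api` service in `compose.yml` resolves `GPU_SERVER_URL` and `CF_ORCH_URL` through nested `${VAR:-default}` expansions, so each variable falls back to the other before falling back to the built-in URL. The sketch below mirrors that precedence in Python. `resolve_urls` is a hypothetical name, and the `or` chain relies on `:-` treating an empty value the same as an unset one.

```python
DEFAULT_URL = "http://host.docker.internal:7700"


def resolve_urls(env: dict[str, str]) -> dict[str, str]:
    """Mirror the nested ${A:-${B:-default}} expansions from compose.yml."""
    gpu = env.get("GPU_SERVER_URL") or env.get("CF_ORCH_URL") or DEFAULT_URL
    orch = env.get("CF_ORCH_URL") or env.get("GPU_SERVER_URL") or DEFAULT_URL
    return {"GPU_SERVER_URL": gpu, "CF_ORCH_URL": orch}
```

Setting only one of the two variables therefore configures both, which matches the backward-compatibility note in `dev-api.py`.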
--- a/config/llm.cloud.yaml
+++ b/config/llm.cloud.yaml
@ -1,4 +1,14 @@
 backends:
+  cf_text:
+    api_key: any
+    base_url: http://host.docker.internal:8008/v1
+    enabled: true
+    model: cf-text
+    supports_images: false
+    type: openai_compat
+    cf_orch:
+      service: cf-text
+      ttl_s: 300
  anthropic:
    api_key_env: ANTHROPIC_API_KEY
    enabled: false
@ -26,6 +36,9 @@ backends:
    model: llama3.1:8b  # generic — no personal fine-tunes in cloud
    supports_images: false
    type: openai_compat
+    cf_orch:
+      service: ollama
+      ttl_s: 300
  ollama_research:
    api_key: ollama
    base_url: http://host.docker.internal:11434/v1
@ -33,6 +46,9 @@ backends:
    model: llama3.1:8b
    supports_images: false
    type: openai_compat
+    cf_orch:
+      service: ollama
+      ttl_s: 300
  vision_service:
    base_url: http://host.docker.internal:8002
    enabled: true
@ -63,9 +79,11 @@ backends:
        - Qwen2.5-3B-Instruct
      ttl_s: 300
 fallback_order:
+- cf_text
 - vllm
 - ollama
 research_fallback_order:
+- cf_text
 - vllm_research
 - ollama_research
 vision_fallback_order:
--- a/config/llm.yaml
+++ b/config/llm.yaml
@ -1,11 +1,14 @@
 backends:
  cf_text:
    api_key: any
-    base_url: http://host.docker.internal:8006/v1
+    base_url: http://host.docker.internal:8008/v1
    enabled: true
    model: cf-text
    supports_images: false
    type: openai_compat
+    cf_orch:
+      service: cf-text
+      ttl_s: 300
  anthropic:
    api_key_env: ANTHROPIC_API_KEY
    enabled: false
@ -33,13 +36,19 @@ backends:
    model: llama3.2:3b
    supports_images: false
    type: openai_compat
+    cf_orch:
+      service: ollama
+      ttl_s: 300
  ollama_research:
    api_key: ollama
-    base_url: http://ollama_research:11434/v1
+    base_url: http://host.docker.internal:11435/v1
    enabled: true
    model: llama3.1:8b
    supports_images: false
    type: openai_compat
+    cf_orch:
+      service: ollama
+      ttl_s: 300
  vision_service:
    base_url: http://vision:8002
    enabled: true
@ -64,6 +73,11 @@ backends:
    model: __auto__
    supports_images: false
    type: openai_compat
+    cf_orch:
+      service: vllm
+      model_candidates:
+      - Qwen2.5-3B-Instruct
+      ttl_s: 300
 fallback_order:
 - cf_text
 - ollama
@ -72,10 +86,10 @@ fallback_order:
 - github_copilot
 - anthropic
 research_fallback_order:
-- claude_code
+- cf_text
 - vllm_research
 - ollama_research
-- cf_text
+- claude_code
 - github_copilot
 - anthropic
 vision_fallback_order:
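The reordered `research_fallback_order` only matters if backends are tried in list order. The sketch below shows one plausible reading of that rule: walk the list and return the first backend that exists and is enabled. `pick_backend` is a hypothetical function, and the real router may also probe health or handle `cf_orch` allocation, which this sketch ignores.

```python
from typing import Optional


def pick_backend(config: dict, order_key: str = "fallback_order") -> Optional[str]:
    """Return the first enabled backend named in the given fallback list."""
    backends = config.get("backends", {})
    for name in config.get(order_key, []):
        backend = backends.get(name)
        # Skip names with no backend entry and backends that are disabled.
        if backend and backend.get("enabled", False):
            return name
    return None


# Trimmed-down config in the shape of config/llm.yaml.
example = {
    "backends": {
        "cf_text": {"enabled": True, "type": "openai_compat"},
        "ollama_research": {"enabled": True, "type": "openai_compat"},
        "anthropic": {"enabled": False},
    },
    "research_fallback_order": ["cf_text", "vllm_research", "ollama_research", "anthropic"],
}
```

With `cf_text` enabled it wins for research tasks; if it were disabled, the missing `vllm_research` entry would be skipped and `ollama_research` would be chosen.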
--- a/config/resume_keywords.yaml
+++ b/config/resume_keywords.yaml
@ -1,17 +1,47 @@
 domains:
 - B2B SaaS
 - enterprise software
+- cybersecurity
 - security
 - compliance
 - post-sale lifecycle
 - SaaS metrics
 - web security
+- risk management
+- Fortune 500
+- enterprise accounts
+- consulting
+- CS advisory
+- startup
 keywords:
 - churn reduction
 - escalation management
 - cross-functional
 - product feedback loop
 - customer advocacy
+- NPS
+- net promoter score
+- QBR
+- quarterly business review
+- executive relationships
+- EBR
+- renewal
+- expansion
+- upsell
+- health score
+- time-to-value
+- TTV
+- onboarding
+- playbook
+- success plan
+- stakeholder management
+- executive sponsor
+- risk identification
+- at-risk accounts
+- forecasting
+- GRR
+- NRR
+- ARR
 skills:
 - Customer Success
 - Technical Account Management
@ -21,3 +51,19 @@ skills:
 - project management
 - onboarding
 - renewal management
+- executive communication
+- CS leadership
+- team building
+- cross-functional collaboration
+- customer segmentation
+- success planning
+- account management
+- risk management
+- Salesforce
+- Gainsight
+- ChurnZero
+- Zendesk
+- Jira
+- Notion
+- Slack
+- Looker
--- a/dev-api.py
+++ b/dev-api.py
@ -8,6 +8,7 @@ import imaplib
 import json
 import logging
 import os
+import ipaddress
 import re
 import socket
 import sqlite3
@ -25,7 +26,7 @@ import yaml
 from bs4 import BeautifulSoup
 from contextlib import asynccontextmanager

-from fastapi import FastAPI, HTTPException, Query, Request, Response, UploadFile
+from fastapi import Depends, FastAPI, HTTPException, Query, Request, Response, UploadFile
 from fastapi.middleware.cors import CORSMiddleware
 from pydantic import BaseModel

@ -38,15 +39,49 @@ if str(PEREGRINE_ROOT) not in sys.path:

 from circuitforge_core.api import make_feedback_router as _make_feedback_router  # noqa: E402
 from circuitforge_core.config.settings import load_env as _load_env  # noqa: E402
+from circuitforge_core.sync import SyncConfig, make_sync_router  # noqa: E402
 from scripts.credential_store import get_credential, set_credential  # noqa: E402
+from scripts.rate_limit import limiter, rate_limit_exceeded_handler  # noqa: E402
+from slowapi.errors import RateLimitExceeded  # noqa: E402

 DB_PATH = os.environ.get("STAGING_DB", "/devl/job-seeker/staging.db")

 _CLOUD_MODE       = os.environ.get("CLOUD_MODE", "").lower() in ("1", "true")
 _CLOUD_DATA_ROOT  = Path(os.environ.get("CLOUD_DATA_ROOT", "/devl/menagerie-data"))
 _DIRECTUS_SECRET  = os.environ.get("DIRECTUS_JWT_SECRET", "")
+
+# Allowlist for cloud user_id values — UUID format only (prevents path traversal)
+_VALID_USER_ID_RE = re.compile(r'^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$', re.IGNORECASE)
+
+# RFC 1918, loopback, link-local, and IPv6 unique-local ranges rejected as IMAP hosts (SSRF guard)
+_PRIVATE_NETS = [
+    ipaddress.ip_network("10.0.0.0/8"),
+    ipaddress.ip_network("172.16.0.0/12"),
+    ipaddress.ip_network("192.168.0.0/16"),
+    ipaddress.ip_network("127.0.0.0/8"),
+    ipaddress.ip_network("169.254.0.0/16"),
+    ipaddress.ip_network("::1/128"),
+    ipaddress.ip_network("fc00::/7"),
+    ipaddress.ip_network("fe80::/10"),
+]
+
+
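+def _is_ssrf_host(host: str) -> bool:
+    """Return True if host resolves to a private/loopback address (SSRF guard)."""
+    try:
+        # getaddrinfo covers IPv4 and IPv6; gethostbyname is IPv4-only, so the
+        # IPv6 ranges in _PRIVATE_NETS could never match.
+        infos = socket.getaddrinfo(host, None)
+        addrs = {ipaddress.ip_address(info[4][0].split("%")[0]) for info in infos}
+        return any(addr in net for addr in addrs for net in _PRIVATE_NETS)
+    except Exception:
+        return True  # fail closed on resolution errors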
 IS_DEMO: bool = os.environ.get("DEMO_MODE", "").lower() in ("1", "true", "yes")

+# ── Rate limiting (LLM generation endpoints) ──────────────────────────────────
+_RL_COVER_LETTER = os.environ.get("LLM_RATE_COVER_LETTER", "20/hour")
+_RL_RESEARCH     = os.environ.get("LLM_RATE_RESEARCH", "10/hour")
+_RL_QA_SUGGEST   = os.environ.get("LLM_RATE_QA_SUGGEST", "60/hour")
+_RL_SURVEY       = os.environ.get("LLM_RATE_SURVEY", "30/hour")
+_RL_WIZARD       = os.environ.get("LLM_RATE_WIZARD", "60/hour")
+
 # Resolve GPU inference server URL.
 # Priority: GPU_SERVER_URL → CF_ORCH_URL (backward compat) → cloud default when licensed.
 # Result is written back to CF_ORCH_URL so all downstream callers need no changes.
@ -81,12 +116,35 @@ def _load_demo_seed(db_path: str, seed_file: str) -> None:
        con.close()


+def _load_data_env() -> None:
+    """Load API keys written by the wizard into the running process.
+
+    The wizard saves keys to <data_dir>/.env (next to staging.db). The main
+    _load_env() call targets the image-baked /app/.env, which is a different
+    path. This helper bridges the gap by filling in env vars that are unset
+    or empty (compose injects empty strings for optional vars).
+    """
+    data_env = Path(DB_PATH).parent / ".env"
+    if not data_env.exists():
+        return
+    for line in data_env.read_text().splitlines():
+        line = line.strip()
+        if not line or line.startswith("#") or "=" not in line:
+            continue
+        key, _, value = line.partition("=")
+        key, value = key.strip(), value.strip()
+        if value and not os.environ.get(key):
+            os.environ[key] = value
+
+
@asynccontextmanager
 async def lifespan(app: FastAPI):
    """Load .env, run migrations, and (in demo mode) seed the demo DB."""
    # Load .env before any runtime env reads — safe because lifespan doesn't run
    # when dev_api is imported by tests (only when uvicorn actually starts).
    _load_env(PEREGRINE_ROOT / ".env")
+    # Also load wizard-saved keys from the data directory (overrides empty compose vars).
+    _load_data_env()
    from scripts.db_migrate import migrate_db
    migrate_db(Path(DB_PATH))

@ -109,6 +167,8 @@ async def lifespan(app: FastAPI):


 app = FastAPI(title="Peregrine Dev API", lifespan=lifespan)
+app.state.limiter = limiter
+app.add_exception_handler(RateLimitExceeded, rate_limit_exceeded_handler)

 app.add_middleware(
    CORSMiddleware,
@ -126,6 +186,43 @@ _feedback_router = _make_feedback_router(
 )
 app.include_router(_feedback_router, prefix="/api/feedback")

+# ── Cross-device sync (cf-core sync module, Paid+ only) ──────────────────────
+
+class _SyncUser:
+    """Minimal user object expected by the cf-core sync router."""
+    def __init__(self, user_id: str) -> None:
+        self.user_id = user_id
+
+def _get_sync_session() -> _SyncUser:
+    """FastAPI dependency: resolves user_id from the per-request DB ContextVar.
+    Returns a fixed 'local' user in single-tenant mode so the prefs/delete
+    endpoints still work for self-hosted users.
+    """
+    db_path = _request_db.get()
+    if db_path:
+        try:
+            user_id = Path(db_path).parts[-3]
+        except IndexError:
+            raise HTTPException(status_code=401, detail="Invalid session")
+    else:
+        user_id = "local"
+    return _SyncUser(user_id)
+
+def _require_paid_sync() -> _SyncUser:
+    """FastAPI dependency: raises 403 unless the resolved tier is paid or premium."""
+    tier = _resolve_cloud_tier()
+    if tier not in ("paid", "premium"):
+        raise HTTPException(status_code=403, detail="Cross-device sync requires a Paid or Premium subscription.")
+    return _get_sync_session()
+
+_sync_router = make_sync_router(
+    product="peregrine",
+    get_session=_get_sync_session,
+    require_paid=_require_paid_sync,
+    config=SyncConfig.from_env("peregrine"),
+)
+app.include_router(_sync_router, prefix="/sync", tags=["sync"])
+
 _log = logging.getLogger("peregrine.session")

 # ── Structured auth logging ───────────────────────────────────────────────────
@ -220,6 +317,10 @@ async def cloud_session_middleware(request: Request, call_next):
    if _CLOUD_MODE and _DIRECTUS_SECRET:
        cookie_header = request.headers.get("X-CF-Session", "")
        user_id = _resolve_cf_user_id(cookie_header)
+        if user_id:
+            if not _VALID_USER_ID_RE.match(user_id):
+                _log.warning("cloud_session_middleware: rejected non-UUID user_id: %s", user_id[:40])
+                user_id = None
        if user_id:
            first_access = user_id not in _seen_users
            if first_access:
@ -512,7 +613,8 @@ def save_cover_letter(job_id: int, body: CoverLetterBody):
 # ── POST /api/jobs/:id/cover_letter/generate ─────────────────────────────────

@app.post("/api/jobs/{job_id}/cover_letter/generate")
-def generate_cover_letter(job_id: int):
+@limiter.limit(_RL_COVER_LETTER)
+def generate_cover_letter(job_id: int, request: Request):
    _demo_guard()
    try:
        from scripts.task_runner import submit_task
@ -565,7 +667,9 @@ def get_research_brief(job_id: int):


@app.post("/api/jobs/{job_id}/research/generate")
-def generate_research(job_id: int):
+@limiter.limit(_RL_RESEARCH)
+def generate_research(job_id: int, request: Request):
+    _demo_guard()
    try:
        from scripts.task_runner import submit_task
        task_id, is_new = submit_task(db_path=Path(_request_db.get() or DB_PATH), task_type="company_research", job_id=job_id)
@ -1085,9 +1189,17 @@ def apply_resume_to_profile(resume_id: int):
    with open(resume_path, "w", encoding="utf-8") as f:
        yaml.dump(current_profile, f, allow_unicode=True, default_flow_style=False)

-    from scripts.db import update_resume_synced_at as _mark_synced
+    from scripts.db import update_resume_synced_at as _mark_synced, set_default_resume as _set_default
    _mark_synced(db_path, resume_id)

+    # Establish this entry as the default so future Profile saves sync back to it
+    _set_default(db_path, resume_id)
+    _user_yaml = db_path.parent / "config" / "user.yaml"
+    if _user_yaml.exists():
+        _prof = yaml.safe_load(_user_yaml.read_text(encoding="utf-8")) or {}
+        _prof["default_resume_id"] = resume_id
+        _user_yaml.write_text(yaml.dump(_prof, default_flow_style=False, allow_unicode=True))
+
    return {
        "ok":             True,
        "backup_id":      backup["id"],
@ -1124,6 +1236,35 @@ def set_job_resume_endpoint(job_id: int, body: dict):
 # context. Avocet then routes these prompts through different local models to
 # compare generation quality against the real Peregrine pipeline.

+_SYNTHETIC_JOB = {
+    "id": 0,
+    "title": "Senior Software Engineer",
+    "company": "Acme Corp",
+    "description": (
+        "We are looking for a Senior Software Engineer to join our platform team. "
+        "You will design and build scalable backend services in Python and Go, "
+        "contribute to our event-driven architecture using Kafka and Redis, and "
+        "mentor junior engineers. We value clear communication, strong code review "
+        "practices, and an ownership mindset.\n\n"
+        "Requirements:\n"
+        "- 5+ years of backend engineering experience\n"
+        "- Proficiency in Python or Go; experience with both is a plus\n"
+        "- Solid understanding of distributed systems and API design (REST/gRPC)\n"
+        "- Experience with containerization (Docker/Kubernetes)\n"
+        "- Comfort working in a remote-first, async team environment\n\n"
+        "Nice to have:\n"
+        "- Experience with Kafka or other message-queue systems\n"
+        "- Open-source contributions\n"
+        "- Familiarity with observability tooling (Prometheus, Grafana)\n"
+    ),
+    "status": "applied",
+    "cover_letter": "",
+    "raw_output": "",
+    "company_brief": "",
+    "ats_gap_report": "",
+    "talking_points": "",
+}
+
 def _imitate_load_profile():
    """Load UserProfile from config/user.yaml, or None if missing."""
    try:
@ -1153,6 +1294,9 @@ def _imitate_cover_letter(db, profile, limit: int) -> dict:
    except Exception:
        corpus = []

+    if not rows:
+        rows = [_SYNTHETIC_JOB]
+
    samples = []
    for r in rows:
        desc = r["description"] or ""
@ -1209,6 +1353,9 @@ def _imitate_company_research(db, profile, limit: int) -> dict:
    except Exception:
        pass

+    if not rows:
+        rows = [_SYNTHETIC_JOB]
+
    samples = []
    for r in rows:
        jd = (r["description"] or "")[:1500].strip()
@ -1266,6 +1413,10 @@ def _imitate_interview_prep(db, profile, limit: int) -> dict:
    ).fetchall()

    name = profile.name if profile else "the candidate"
+
+    if not rows:
+        rows = [_SYNTHETIC_JOB]
+
    samples = []
    for r in rows:
        system_prompt = (
@ -1320,6 +1471,9 @@ def _imitate_ats_resume(db, profile, limit: int) -> dict:
        pass
    resume_block = f"\n## Current Resume\n{resume_text}" if resume_text else ""

+    if not rows:
+        rows = [_SYNTHETIC_JOB]
+
    samples = []
    for r in rows:
        desc = (r["description"] or "")[:1500].strip()
@ -1458,7 +1612,7 @@ def calendar_push(job_id: int):
 # ── Survey endpoints ─────────────────────────────────────────────────────────

 # Module-level imports so tests can patch dev_api.LLMRouter etc.
-from scripts.db import insert_survey_response, get_survey_responses
+from scripts.db import insert_survey_response, get_survey_responses  # noqa: E402



@ -1478,7 +1632,8 @@ class SurveyAnalyzeBody(BaseModel):


@app.post("/api/jobs/{job_id}/survey/analyze")
-def survey_analyze(job_id: int, body: SurveyAnalyzeBody):
+@limiter.limit(_RL_SURVEY)
+def survey_analyze(job_id: int, body: SurveyAnalyzeBody, request: Request):
    if body.mode not in ("quick", "detailed"):
        raise HTTPException(400, f"Invalid mode: {body.mode!r}")
    import json as _json
@ -1693,8 +1848,10 @@ def save_qa(job_id: int, payload: QAPayload):


@app.post("/api/jobs/{job_id}/qa/suggest")
-def suggest_qa_answer(job_id: int, payload: QASuggestPayload):
+@limiter.limit(_RL_QA_SUGGEST)
+def suggest_qa_answer(job_id: int, payload: QASuggestPayload, request: Request):
    """Synchronously generate an LLM answer for an application Q&A question."""
+    _demo_guard()
    db = _get_db()
    job_row = db.execute(
        "SELECT title, company, description FROM jobs WHERE id = ?", (job_id,)
@ -1725,7 +1882,7 @@ def suggest_qa_answer(job_id: int, payload: QASuggestPayload):
                parts.append(f"Summary: {resume_data['career_summary'][:400]}")
            resume_context = "\n".join(parts)
    except Exception:
-        pass
+        _log.warning("suggest_qa_answer: failed to load resume context", exc_info=True)

    prompt = (
        f"You are helping a job applicant answer an application question.\n\n"
@ -2652,6 +2809,9 @@ def get_app_config():
        except Exception:
            wizard_complete = False

+    from app.wizard.tiers import has_configured_llm
+    byok_unlocked = has_configured_llm()
+
    return {
        "isCloud": os.environ.get("CLOUD_MODE", "").lower() in ("1", "true"),
        "isDemo": os.environ.get("DEMO_MODE", "").lower() in ("1", "true", "yes"),
@ -2660,6 +2820,7 @@ def get_app_config():
        "contractedClient": os.environ.get("CONTRACTED_CLIENT", "").lower() in ("1", "true"),
        "inferenceProfile": profile if profile in valid_profiles else "cpu",
        "wizardComplete": wizard_complete,
+        "byokUnlocked": byok_unlocked,
    }


@ -2680,7 +2841,7 @@ def config_user():

 # ── Settings: My Profile endpoints ───────────────────────────────────────────

-from scripts.user_profile import load_user_profile, save_user_profile
+from scripts.user_profile import load_user_profile, save_user_profile  # noqa: E402


 def _user_yaml_path() -> str:
@ -3120,7 +3281,23 @@ async def upload_resume(file: UploadFile):
        resume_path.parent.mkdir(parents=True, exist_ok=True)
        with open(resume_path, "w") as f:
            yaml.dump(result, f, allow_unicode=True, default_flow_style=False)
+
+        # Also add to resume library and mark as default
+        import json as _json
+        from scripts.db import create_resume as _create_r, set_default_resume as _set_default
+        db_path = Path(_request_db.get() or DB_PATH)
+        resume_name = Path(file.filename or "").stem or "Uploaded Resume"
+        library_entry = _create_r(
+            db_path,
+            name=resume_name,
+            text=raw_text,
+            source="upload",
+            struct_json=_json.dumps(result),
+        )
+        _set_default(db_path, library_entry["id"])
+
        result["exists"] = True
+        result["library_id"] = library_entry["id"]
        return {"ok": True, "data": result}
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))
@ -3183,6 +3360,10 @@ def get_search_prefs():
                for b in boards
            ]

+        # Normalize title key — wizard saved "titles", settings canonical is "job_titles"
+        if "titles" in profile and "job_titles" not in profile:
+            profile["job_titles"] = profile.pop("titles")
+
        return profile
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))
@ -3554,6 +3735,8 @@ def test_email(payload: dict):
        username = payload.get("username", "")
        if not all([host, username, password]):
            return {"ok": False, "error": "Missing host, username, or password"}
+        if _is_ssrf_host(host):
+            return {"ok": False, "error": "IMAP host must be a public address"}
        if use_ssl:
            ctx = ssl_mod.create_default_context()
            conn = imaplib.IMAP4_SSL(host, port, ssl_context=ctx)
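Review note: the body of `_is_ssrf_host` is not part of this diff. As a rough illustration only, a guard of this kind could check literal IPs directly and resolve hostnames, rejecting anything that is not globally routable. The name `looks_internal` and its exact policy (for example, treating unresolvable hosts as unsafe) are assumptions, not the project's implementation.

```python
import ipaddress
import socket


def looks_internal(host: str) -> bool:
    """Hypothetical stand-in for _is_ssrf_host: True if host is not a public address."""
    try:
        # Literal IPv4/IPv6 addresses are checked directly, without DNS.
        addrs = [ipaddress.ip_address(host)]
    except ValueError:
        try:
            infos = socket.getaddrinfo(host, None)
        except socket.gaierror:
            return True  # assumption: unresolvable hosts are treated as unsafe
        # Strip any IPv6 scope id ("fe80::1%eth0") before parsing.
        addrs = [ipaddress.ip_address(info[4][0].split("%")[0]) for info in infos]
    # Loopback, RFC 1918, link-local and similar ranges are not "global".
    return any(not addr.is_global for addr in addrs)
```

A check like this only covers the address seen at validation time; a hostname could resolve differently when `imaplib` connects, so it should be read as a first line of defense rather than a complete one.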
@ -3685,6 +3868,26 @@ def save_deploy_config(payload: dict):
    return {"ok": True, "note": "Restart required to apply changes"}


+class OrchUrlPayload(BaseModel):
+    orch_url: str = ""
+
+
+@app.get("/api/settings/system/orch-url")
+def get_orch_url():
+    """Return the saved Orchard coordinator URL."""
+    cfg = _load_wizard_yaml()
+    return {"orch_url": cfg.get("cf_orch_url", "")}
+
+
+@app.post("/api/settings/system/orch-url")
+def save_orch_url(payload: OrchUrlPayload):
+    """Persist the Orchard coordinator URL to user.yaml."""
+    cfg = _load_wizard_yaml()
+    cfg["cf_orch_url"] = payload.orch_url.strip()
+    _save_wizard_yaml(cfg)
+    return {"ok": True}
+
+
 # ── Settings: Fine-Tune ───────────────────────────────────────────────────────

 _TRAINING_JSONL = Path("/Library/Documents/JobSearch/training_data/cover_letters.jsonl")
@ -4148,7 +4351,7 @@ def export_classifier():
 # State is persisted to user.yaml on every step so the wizard can resume
 # after a browser refresh or crash (mirrors the Streamlit wizard behaviour).

-_WIZARD_PROFILES = ("remote", "cpu", "single-gpu", "dual-gpu", "cf-orch")
+_WIZARD_PROFILES = ("cpu", "single-gpu", "dual-gpu", "cf-orch", "remote")
 _WIZARD_TIERS = ("free", "paid", "premium")


@ -4194,7 +4397,7 @@ def _suggest_profile(gpus: list[str]) -> str:
        return "dual-gpu"
    if len(gpus) == 1:
        return "single-gpu"
-    return "remote"
+    return "cpu"


@app.get("/api/wizard/status")
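Review note: with this hunk `_suggest_profile` falls back to `cpu`, and the `wizard_hardware` change later in this diff downgrades that to `remote` when no Ollama is reachable. The combined decision can be sketched as one function. `suggest_profile` is a hypothetical name, and the two-GPU condition is an assumption because that branch is outside the visible context.

```python
def suggest_profile(gpu_count: int, ollama_running: bool) -> str:
    """Sketch of the combined suggestion logic; not the project's function."""
    if gpu_count >= 2:  # assumption: the branch above `return "dual-gpu"` is not shown
        return "dual-gpu"
    if gpu_count == 1:
        return "single-gpu"
    # No GPU: keep "cpu" only if a local Ollama already answers, else go remote.
    return "cpu" if ollama_running else "remote"
```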
@ -4218,6 +4421,7 @@ def wizard_status():
            "linkedin": cfg.get("linkedin", ""),
            "career_summary": cfg.get("career_summary", ""),
            "services": cfg.get("services", {}),
+            "cf_orch_url": cfg.get("cf_orch_url", ""),
        },
    }

@ -4239,8 +4443,8 @@ def wizard_save_step(payload: WizardStepPayload):
    step = payload.step
    data = payload.data

-    if step < 1 or step > 7:
-        raise HTTPException(status_code=400, detail="step must be 1–7")
+    if step < 1 or step > 8:
+        raise HTTPException(status_code=400, detail="step must be 1–8")

    updates: dict = {"wizard_step": step}

@ -4266,13 +4470,16 @@ def wizard_save_step(payload: WizardStepPayload):
            with open(resume_path, "w") as f:
                yaml.dump(resume, f, allow_unicode=True, default_flow_style=False)

-    elif step == 4:
+    elif step in (4, 5):
+        # Step 4 (legacy) or step 5 (current) — identity fields.
+        # Step 4 was the original numbering before the training step was inserted
+        # between resume and identity; both are accepted for backward compat.
        for field in ("name", "email", "phone", "linkedin", "career_summary"):
            if field in data:
                updates[field] = data[field]

-    elif step == 5:
-        # Write API keys to .env (never store in user.yaml)
+    elif step == 6:
+        # Step 6 — inference: API keys + optional Orchard coordinator URL.
        env_path = Path(_wizard_yaml_path()).parent.parent / ".env"
        env_lines = env_path.read_text().splitlines() if env_path.exists() else []

@ -4290,18 +4497,24 @@ def wizard_save_step(payload: WizardStepPayload):
            env_lines = _set_env_key(env_lines, "OPENAI_COMPAT_URL", data["openai_url"])
        if data.get("openai_key"):
            env_lines = _set_env_key(env_lines, "OPENAI_COMPAT_KEY", data["openai_key"])
-        if any(data.get(k) for k in ("anthropic_key", "openai_url", "openai_key")):
+        if data.get("orch_url"):
+            env_lines = _set_env_key(env_lines, "GPU_SERVER_URL", data["orch_url"])
+            updates["cf_orch_url"] = data["orch_url"]
+        if any(data.get(k) for k in ("anthropic_key", "openai_url", "openai_key", "orch_url")):
            env_path.parent.mkdir(parents=True, exist_ok=True)
            env_path.write_text("\n".join(env_lines) + "\n")

        if "services" in data:
            updates["services"] = data["services"]

-    elif step == 6:
-        # Persist search preferences to search_profiles.yaml in canonical format:
-        #   profiles: [{name, titles, locations, boards, ...}]
-        titles = data.get("titles", [])
-        locations = data.get("locations", [])
+    elif step == 7:
+        # Step 7 — search preferences.
+        # Wizard sends { search: { titles, locations, remote_only } }; fall back to
+        # top-level keys for direct API callers that omit the "search" wrapper.
+        search = data.get("search", {})
+        titles = search.get("titles", data.get("titles", data.get("job_titles", [])))
+        locations = search.get("locations", data.get("locations", []))
+        remote_only = search.get("remote_only", data.get("remote_only", False))
        search_path = _search_prefs_path()
        existing_search: dict = {}
        if search_path.exists():
@ -4318,14 +4531,15 @@ def wizard_save_step(payload: WizardStepPayload):
        if default_profile is None:
            default_profile = {"name": "default"}
            profiles_list.append(default_profile)
-        default_profile["titles"] = titles
+        default_profile["job_titles"] = titles
        default_profile["locations"] = locations
+        default_profile["remote_only"] = remote_only
        existing_search["profiles"] = profiles_list
        search_path.parent.mkdir(parents=True, exist_ok=True)
        with open(search_path, "w") as f:
            yaml.dump(existing_search, f, allow_unicode=True, default_flow_style=False)

-    # Step 7 (integrations) has no extra side effects here — connections are
+    # Step 8 (integrations) has no extra side effects here — connections are
    # handled by the existing /api/settings/system/integrations/{id}/connect.

    try:
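Review note: the step 7 lookup order (nested `search` wrapper first, then top-level keys, then `job_titles`) is easy to misread inline. A minimal sketch of the same precedence as a pure function follows; `extract_search_prefs` is a hypothetical helper name used only for illustration.

```python
def extract_search_prefs(data: dict) -> dict:
    """Mirror the step 7 fallbacks: prefer data["search"], then top-level keys."""
    search = data.get("search", {})
    titles = search.get("titles", data.get("titles", data.get("job_titles", [])))
    locations = search.get("locations", data.get("locations", []))
    remote_only = search.get("remote_only", data.get("remote_only", False))
    # The wizard now persists titles under the canonical "job_titles" key.
    return {"job_titles": titles, "locations": locations, "remote_only": remote_only}
```

One consequence of this ordering is that an explicit empty list inside `search` wins over a populated top-level `titles`, since `dict.get` only falls back when the key is absent.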
@ -4352,6 +4566,39 @@ def _fetch_cforch_nodes() -> list[dict]:
        return []


+def _probe_ollama() -> bool:
+    """Return True if Ollama is reachable from inside the container."""
+    candidates = [
+        "http://host.docker.internal:11434/api/tags",
+        "http://ollama:11434/api/tags",
+    ]
+    for url in candidates:
+        try:
+            r = requests.get(url, timeout=2)
+            if r.status_code == 200:
+                return True
+        except Exception:
+            pass
+    return False
+
+
+def _probe_searxng() -> bool:
+    """Return True if SearXNG is reachable from inside the container."""
+    candidates = [
+        "http://searxng:8080/",
+        "http://host.docker.internal:8888/",
+        "http://host.docker.internal:8080/",
+    ]
+    for url in candidates:
+        try:
+            r = requests.get(url, timeout=2)
+            if r.status_code < 500:
+                return True
+        except Exception:
+            pass
+    return False
+
+
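Review note: `_probe_ollama` and `_probe_searxng` share the same loop and differ only in the candidate list and the success test (`== 200` versus `< 500`). If they grow further, one possible refactor is a single helper with an injectable fetcher, which also makes the logic testable without a network. Everything named here (`first_reachable`, `fetch_status`, `is_ok`) is hypothetical.

```python
from typing import Callable, Iterable, Optional


def first_reachable(
    candidates: Iterable[str],
    fetch_status: Callable[[str], int],
    is_ok: Callable[[int], bool] = lambda status: status == 200,
) -> Optional[str]:
    """Return the first URL whose status passes is_ok, or None if none do."""
    for url in candidates:
        try:
            if is_ok(fetch_status(url)):
                return url
        except Exception:
            # Connection errors and timeouts just mean "try the next candidate".
            continue
    return None
```

In the real module `fetch_status` could be something like `lambda u: requests.get(u, timeout=2).status_code`, and the SearXNG probe would pass `is_ok=lambda s: s < 500`.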
@app.get("/api/wizard/hardware")
 def wizard_hardware():
    """Detect local GPUs, suggest an inference profile, and report cf-orch nodes."""
@ -4370,35 +4617,71 @@ def wizard_hardware():
                "vram_free_mb": gpu["vram_free_mb"],
            })

+    ollama_running = _probe_ollama()
+    searxng_running = _probe_searxng()
+
+    # Suggest "remote" only when there is no GPU and no reachable Ollama;
+    # if Ollama is already running, keep the "cpu" suggestion.
+    if suggested == "cpu" and not gpus and not ollama_running:
+        suggested = "remote"
+
    return {
        "gpus": gpus,
        "suggested_profile": suggested,
        "profiles": list(_WIZARD_PROFILES),
        "cf_orch_available": len(orch_nodes) > 0,
        "cf_orch_gpus": orch_summary,
+        "ollama_running": ollama_running,
+        "searxng_running": searxng_running,
    }


+def _container_safe_url(url: str) -> str:
+    """Replace localhost/127.0.0.1 with host.docker.internal so tests reach the host."""
+    import re as _re
+    return _re.sub(r"(https?://)(?:localhost|127\.0\.0\.1)\b", r"\1host.docker.internal", url)
+
+
 class WizardInferenceTestPayload(BaseModel):
    profile: str = "remote"
    anthropic_key: str = ""
    openai_url: str = ""
    openai_key: str = ""
+    orch_url: str = ""
    ollama_host: str = "localhost"
    ollama_port: int = 11434


@app.post("/api/wizard/inference/test")
 def wizard_test_inference(payload: WizardInferenceTestPayload):
-    """Test LLM or Ollama connectivity.
+    """Test LLM, Ollama, or Orchard coordinator connectivity.

-    Always returns {ok, message} — a connection failure is reported as a
-    soft warning (message), not an HTTP error, so the wizard can let the
-    user continue past a temporarily-down Ollama instance.
+    Always returns {ok, message} — a connection failure is a soft warning so
+    the wizard lets the user continue past a temporarily-unreachable service.
    """
-    if payload.profile == "remote":
+    if payload.profile == "cf-orch":
+        orch_url = _container_safe_url(payload.orch_url.rstrip("/")) if payload.orch_url else ""
+        if not orch_url:
+            return {"ok": False, "message": "Enter the Orchard coordinator URL first."}
+        try:
+            resp = requests.get(f"{orch_url}/api/nodes", timeout=5,
+                                headers={"Accept": "application/json"})
+            if resp.status_code == 200:
+                nodes = resp.json().get("nodes", [])
+                n = len(nodes)
+                return {"ok": True, "message": f"Orchard reachable — {n} node(s) online."}
+            return {"ok": False, "message": f"Orchard returned HTTP {resp.status_code}."}
+        except Exception as exc:
+            return {
+                "ok": False,
+                "message": (
+                    f"Cannot reach Orchard at {payload.orch_url} — "
+                    "check the URL and that the coordinator is running. "
+                    f"({exc})"
+                ),
+            }
+
+    elif payload.profile == "remote":
        try:
-            # Temporarily inject key if provided (don't persist yet)
            env_override = {}
            if payload.anthropic_key:
                env_override["ANTHROPIC_API_KEY"] = payload.anthropic_key
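Review note: the regex in `_container_safe_url` ends with `\b`, and a word boundary also sits between `localhost` and a following dot. The sketch below reproduces the pattern from this diff and adds a stricter variant for comparison. The `strict` flag and the lookahead pattern are assumptions offered for discussion, not part of the change.

```python
import re

# Same pattern as _container_safe_url in this diff.
_LOOPBACK = re.compile(r"(https?://)(?:localhost|127\.0\.0\.1)\b")
# Hypothetical stricter variant: the host must be followed by ':', '/', '?', '#',
# or the end of the string.
_LOOPBACK_STRICT = re.compile(r"(https?://)(?:localhost|127\.0\.0\.1)(?=[:/?#]|$)")


def container_safe_url(url: str, strict: bool = False) -> str:
    """Rewrite loopback hosts to host.docker.internal."""
    pattern = _LOOPBACK_STRICT if strict else _LOOPBACK
    return pattern.sub(r"\1host.docker.internal", url)
```

With the original pattern, `http://localhost.example.com` is rewritten to `http://host.docker.internal.example.com`; the strict variant leaves it alone. Whether that edge case matters for user-entered coordinator URLs is a judgment call.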
@ -4422,15 +4705,16 @@ def wizard_test_inference(payload: WizardInferenceTestPayload):
                        os.environ[k] = v
        except Exception as exc:
            return {"ok": False, "message": f"LLM test failed: {exc}"}
+
    else:
-        # Local profile — ping Ollama
-        ollama_url = f"http://{payload.ollama_host}:{payload.ollama_port}"
+        # Local profiles (cpu, single-gpu, dual-gpu) — ping Ollama
+        host = payload.ollama_host or "localhost"
+        ollama_url = _container_safe_url(f"http://{host}:{payload.ollama_port}")
        try:
            resp = requests.get(f"{ollama_url}/api/tags", timeout=5)
            ok = resp.status_code == 200
            message = "Ollama is running." if ok else f"Ollama returned HTTP {resp.status_code}."
        except Exception:
-            # Soft-fail: user can skip and configure later
            return {
                "ok": False,
                "message": (
@ -4469,6 +4753,125 @@ def wizard_complete():
        raise HTTPException(status_code=500, detail=str(e))


+# ── AI Interview Wizard (BSL 1.1) ─────────────────────────────────────────────
+
+_AI_WIZARD_SYSTEM_PROMPT = """You are a friendly, patient assistant helping someone set up their job search profile. Your goal is to gather the following information through natural conversation:
+
+- name (string): their full name
+- email (string): their preferred contact email
+- career_summary (string): 1-2 sentence background summary
+- candidate_voice (string): their preferred writing voice/tone for cover letters
+- mission_preferences (list of strings): industries or causes they care about
+- candidate_accessibility_focus (bool): whether to include accessibility culture in company research
+- candidate_lgbtq_focus (bool): whether to include LGBTQIA+ inclusion signals in company research
+- linkedin (string, optional): their LinkedIn URL
+
+Rules:
+1. Ask one or two questions at a time — never overwhelm
+2. Always remind them they can skip any question
+3. For candidate_voice, offer these options if they struggle: "professional and direct", "warm and conversational", "concise and clear", "enthusiastic and personable"
+4. For candidate_accessibility_focus and candidate_lgbtq_focus, use plain language: "Would you like me to look into whether companies actively support employees with disabilities or neurodivergent needs?" and "Would you like me to check whether companies have strong LGBTQIA+ inclusion policies?"
+5. When you have gathered enough information or the user says they are done, set complete to true
+
+You must ALWAYS respond with valid JSON in this exact format:
+{"reply": "your conversational message here", "extracted_fields": {"name": "...", ...}, "complete": false}
+
+Only include fields in extracted_fields that you are confident about from the conversation. Do not include fields the user hasn't mentioned. Infer complete=true when all required fields (name, email, career_summary) are gathered or when user explicitly says done."""
+
+
+class HistoryMessage(BaseModel):
+    role: str  # "user" or "assistant"
+    content: str
+
+
+class WizardInterviewRequest(BaseModel):
+    history: list[HistoryMessage] = []
+    profile_so_far: dict = {}
+
+
+class WizardFinalizeRequest(BaseModel):
+    profile: dict
+
+
+_WIZARD_ALLOWED_FIELDS: frozenset[str] = frozenset({
+    "name",
+    "email",
+    "career_summary",
+    "candidate_voice",
+    "mission_preferences",
+    "candidate_accessibility_focus",
+    "candidate_lgbtq_focus",
+    "linkedin",
+})
+
+
+@app.post("/api/wizard/ai/interview")
+@limiter.limit(_RL_WIZARD)
+def wizard_ai_interview(request: Request, body: WizardInterviewRequest):
+    """Conduct one turn of the AI-guided profile interview. Tier-gated (BYOK-unlockable)."""
+    from app.wizard.tiers import can_use, has_configured_llm
+
+    tier = _get_effective_tier()
+    if not can_use(tier, "llm_ai_wizard", has_byok=has_configured_llm()):
+        raise HTTPException(402, detail={"error": "tier_required"})
+
+    # Build conversation prompt from history
+    conversation_lines = []
+    for msg in body.history:
+        role = msg.role
+        content = msg.content.replace("\n", " ").replace("\r", "")
+        if role == "user":
+            conversation_lines.append(f"User: {content}")
+        else:
+            conversation_lines.append(f"Assistant: {content}")
+
+    history_block = "\n".join(conversation_lines) if conversation_lines else "User: (starting conversation)"
+
+    # Build profile summary to give LLM context about what's already known
+    if body.profile_so_far:
+        gathered = ", ".join(
+            f"{k}={repr(v)}"
+            for k, v in body.profile_so_far.items()
+            if v not in (None, "", [], {})
+        )
+        profile_context = f"\n\n[Already gathered: {gathered}]" if gathered else ""
+    else:
+        profile_context = ""
+
+    prompt = history_block + profile_context
+
+    try:
+        from scripts.llm_router import LLMRouter
+        response_text = LLMRouter().complete(prompt, system=_AI_WIZARD_SYSTEM_PROMPT)
+    except Exception as exc:
+        raise HTTPException(503, detail={"error": "llm_error", "message": str(exc)})
+
+    try:
+        parsed = json.loads(response_text)
+        return {
+            "reply": parsed.get("reply", ""),
+            "extracted_fields": parsed.get("extracted_fields", {}),
+            "complete": bool(parsed.get("complete", False)),
+        }
+    except (json.JSONDecodeError, AttributeError):
+        return {"reply": response_text, "extracted_fields": {}, "complete": False}
+
+
+@app.post("/api/wizard/ai/finalize")
+def wizard_ai_finalize(request: WizardFinalizeRequest):
+    """Merge AI-collected wizard fields into user.yaml. Only allowed fields are written."""
+    yaml_path = _user_yaml_path()
+    try:
+        current = load_user_profile(yaml_path)
+        updates = {k: v for k, v in request.profile.items() if k in _WIZARD_ALLOWED_FIELDS}
+        merged = {**current, **updates}
+        save_user_profile(yaml_path, merged)
+    except Exception as exc:
+        raise HTTPException(500, detail={"error": "write_error", "message": str(exc)})
+    merged_keys = list(updates.keys())
+    return {"saved": True, "fields": merged_keys}
+
+
 # ── Messaging models ──────────────────────────────────────────────────────────

 class MessageCreateBody(BaseModel):
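Review note: the two AI wizard endpoints rely on a tolerant JSON parse for the model reply and an allowlist filter before writing to `user.yaml`. The sketch below isolates both pieces so their behavior on malformed input is easy to see. The function names are hypothetical; the field set and the fallback shape are copied from the diff.

```python
import json

ALLOWED_FIELDS = frozenset({
    "name", "email", "career_summary", "candidate_voice", "mission_preferences",
    "candidate_accessibility_focus", "candidate_lgbtq_focus", "linkedin",
})


def parse_interview_reply(response_text: str) -> dict:
    """Parse the model's JSON reply; fall back to raw text if it is not a JSON object."""
    try:
        parsed = json.loads(response_text)
        return {
            "reply": parsed.get("reply", ""),
            "extracted_fields": parsed.get("extracted_fields", {}),
            "complete": bool(parsed.get("complete", False)),
        }
    except (json.JSONDecodeError, AttributeError):
        # AttributeError covers valid JSON that is not an object (e.g. a list).
        return {"reply": response_text, "extracted_fields": {}, "complete": False}


def filter_profile_updates(profile: dict) -> dict:
    """Keep only the keys the finalize endpoint is allowed to write."""
    return {k: v for k, v in profile.items() if k in ALLOWED_FIELDS}
```

Note that the allowlist is applied only in `wizard_ai_finalize`; `extracted_fields` returned by the interview endpoint is passed through as the model produced it.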
--- a/docs/getting-started/docker-profiles.md
+++ b/docs/getting-started/docker-profiles.md
@ -1,69 +1,129 @@
 # Docker Profiles

-Peregrine uses Docker Compose profiles to start only the services your hardware can support. Choose a profile with `make start PROFILE=<name>`.
+Peregrine uses Docker Compose profiles to start only the services your hardware supports. Choose a profile with `./manage.sh start --profile <name>`.
+
+`manage.sh` delegates to `make`, which auto-detects Docker vs Podman and applies the correct GPU overlay — `compose.gpu.yml` for Docker, `compose.podman-gpu.yml` for Podman (CDI-based). You do not need to specify the overlay manually.

 ---

 ## Profile Reference

 | Profile | Services started | Use case |
-|---------|----------------|----------|
-| `remote` | `app`, `searxng` | No GPU. LLM calls go to an external API (Anthropic, OpenAI-compatible). |
-| `cpu` | `app`, `ollama`, `searxng` | No GPU. Runs local models on CPU — functional but slow. |
-| `single-gpu` | `app`, `ollama`, `vision`, `searxng` | One NVIDIA GPU. Covers cover letters, research, and vision (survey screenshots). |
-| `dual-gpu` | `app`, `ollama`, `vllm`, `vision`, `searxng` | Two NVIDIA GPUs. GPU 0 = Ollama (cover letters), GPU 1 = vLLM (research). |
+|---------|-----------------|----------|
+| `cpu` | `web`, `api`, `ollama`, `searxng` | No GPU. Local models on CPU. Recommended default for new installs. |
+| `single-gpu` | `web`, `api`, `ollama`, `vision`, `searxng` | One NVIDIA GPU. Covers cover letters, research, and vision. |
+| `dual-gpu` | `web`, `api`, `ollama`, `vllm`, `vision`, `searxng` | Two NVIDIA GPUs. GPU split controlled by `DUAL_GPU_MODE`. |
+| `cf-orch` | `web`, `api`, `searxng` | No local LLM. Inference routed to CircuitForge GPU cluster. Requires Paid license. |
+| `remote` | `web`, `api`, `searxng` | No local LLM. Inference goes to cloud API keys (Anthropic, OpenAI-compatible). |
+| `memory` | (any + memory flag) | Enables RAM-optimised container limits for low-RAM machines. Combine with another profile. |

 ---

 ## Service Descriptions

-| Service | Image / Source | Port | Purpose |
-|---------|---------------|------|---------|
-| `app` | `Dockerfile` (Streamlit) | 8501 | The main Peregrine UI |
+| Service | Image / Source | Host Port | Purpose |
+|---------|---------------|-----------|---------|
+| `web` | `Dockerfile.web` (Nginx + Vue SPA) | `VUE_PORT` (default 8506) | Main UI — serves the Vue frontend and proxies `/api/` to `api` |
+| `api` | `Dockerfile` (FastAPI) | Internal only (proxied through `web`) | REST API — all backend logic |
 | `ollama` | `ollama/ollama` | 11434 | Local model inference — cover letters and general tasks |
-| `vllm` | `vllm/vllm-openai` | 8000 | High-throughput local inference — research tasks |
+| `vllm` | `vllm/vllm-openai` | 8000 | High-throughput inference — research tasks |
 | `vision` | `scripts/vision_service/` | 8002 | Moondream2 — survey screenshot analysis |
-| `searxng` | `searxng/searxng` | 8888 | Private meta-search engine — company research web scraping |
+| `searxng` | `searxng/searxng` | 8888 | Private meta-search — company research web scraping |
+
+The `web` container runs Nginx internally on port 80, mapped to `VUE_PORT` on the host. The Nginx config proxies `/api/` requests to `api:8601` — the FastAPI container is not exposed directly.

 ---

 ## Choosing a Profile

-### remote
-
-Use `remote` if:
- You have no NVIDIA GPU
- You plan to use Anthropic Claude or another API-hosted model exclusively
- You want the fastest startup (only two containers)
-
-You must configure at least one external LLM backend in **Settings → LLM Backends**.
-
 ### cpu

 Use `cpu` if:
- You have no GPU but want to run models locally (e.g. for privacy)
+- You have no GPU but want local inference (good for privacy)
 - Acceptable for light use; cover letter generation may take several minutes per request

-Pull a model after the container starts:
+Pull a model after starting:

 ```bash
-docker exec -it peregrine-ollama-1 ollama pull llama3.1:8b
+docker exec -it peregrine-ollama-1 ollama pull llama3.2:3b
 ```

+`llama3.2:3b` is the recommended CPU model — it runs on machines with 8 GB of system RAM.
+
 ### single-gpu

 Use `single-gpu` if:
 - You have one NVIDIA GPU with at least 8 GB VRAM
 - Recommended for most single-user installs
- The vision service (Moondream2) starts on the same GPU using 4-bit quantisation (~1.5 GB VRAM)
+
+The vision service (Moondream2) starts on the same GPU using 4-bit quantisation (~1.5 GB VRAM). Pull a model after starting:
+
+```bash
+docker exec -it peregrine-ollama-1 ollama pull llama3.1:8b
+```

 ### dual-gpu

 Use `dual-gpu` if:
 - You have two or more NVIDIA GPUs
- GPU 0 handles Ollama (cover letters, quick tasks)
- GPU 1 handles vLLM (research, long-context tasks)
- The vision service shares GPU 0 with Ollama
+- Default: GPU 0 handles Ollama (cover letters), GPU 1 handles vLLM (research)
+
+See [Dual-GPU Modes](#dual-gpu-modes) below to configure how the two GPUs are split.
+
+### cf-orch
+
+Use `cf-orch` if:
+- You have access to a CircuitForge GPU cluster running the cf-orch coordinator
+- No local GPU required — inference is handled by the cluster
+- Requires a Paid or higher license
+
+Set `CF_ORCH_URL` in `.env` to your coordinator address:
+
+```bash
+CF_ORCH_URL=http://10.1.10.71:7700
+```
+
+The wizard hardware step lets you enter the URL interactively and verifies the connection before saving.
+
+### remote
+
+Use `remote` if:
+- You have no local GPU and no cf-orch cluster
+- You are using Anthropic Claude, OpenAI, or another cloud API exclusively
+
+Configure at least one external LLM backend in **Settings → LLM Backends** after first login.
+
+### memory (add-on)
+
+Use the `memory` add-on alongside any profile for machines with limited RAM:
+
+```bash
+./manage.sh start --profile single-gpu --profile memory
+```
+
+This applies conservative container memory limits to prevent the OOM (out-of-memory) killer from terminating containers.
+
+---
+
+## Dual-GPU Modes
+
+When using `dual-gpu`, `DUAL_GPU_MODE` in `.env` controls how the two GPUs are split between Ollama and vLLM:
+
+| Mode | GPU 0 | GPU 1 | Use case |
+|------|-------|-------|----------|
+| `mixed` (default) | Ollama | vLLM | Best overall: fast cover letters + high-throughput research |
+| `ollama` | Ollama | Ollama | Both GPUs run Ollama; no vLLM; useful if vLLM models are too large for one card |
+| `vllm` | vLLM | vLLM | Both GPUs run vLLM (tensor parallel); maximum research throughput |
+
+Set in `.env`:
+
+```bash
+DUAL_GPU_MODE=mixed    # default
+# DUAL_GPU_MODE=ollama
+# DUAL_GPU_MODE=vllm
+```
+
+The Makefile expands `dual-gpu` into `--profile dual-gpu-$(DUAL_GPU_MODE)` before passing it to `docker compose`. The `compose.gpu.yml` overlay defines the `dual-gpu-mixed`, `dual-gpu-ollama`, and `dual-gpu-vllm` profile variants.

 ---
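Review note: the profile expansion described in the Dual-GPU Modes section happens in the Makefile, which is not shown in this part of the diff. Expressed in Python purely for illustration, the mapping would look roughly like this. `expand_profile` is a hypothetical name, and the validation of unknown modes is an assumption rather than documented Makefile behavior.

```python
VALID_MODES = ("mixed", "ollama", "vllm")


def expand_profile(profile: str, dual_gpu_mode: str = "mixed") -> str:
    """Map the user-facing profile to the Compose profile name."""
    if profile != "dual-gpu":
        return profile  # all other profiles pass through unchanged
    if dual_gpu_mode not in VALID_MODES:
        # Assumption: reject unknown modes instead of passing them to compose.
        raise ValueError(f"unknown DUAL_GPU_MODE: {dual_gpu_mode!r}")
    return f"dual-gpu-{dual_gpu_mode}"
```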

@ -75,40 +135,69 @@ Use `dual-gpu` if:
 | 4–8 GB | `single-gpu` | Run smaller models (3B–8B parameters) |
 | 8–16 GB | `single-gpu` | Run 8B–13B models comfortably |
 | 16–24 GB | `single-gpu` | Run 13B–34B models |
-| 24 GB+ | `single-gpu` or `dual-gpu` | 70B models with quantisation |
+| 24 GB+ (one card) | `single-gpu` | 70B models with quantisation |
+| 16+ GB (two cards) | `dual-gpu` | Parallel cover letters + research |

 ---

 ## How preflight.py Works

-`make start` calls `scripts/preflight.py` before launching Docker. Preflight does the following:
+`./manage.sh start` calls `scripts/preflight.py` before launching Docker. Preflight does the following:

-1. **Port conflict detection** — checks whether `STREAMLIT_PORT`, `OLLAMA_PORT`, `VLLM_PORT`, `SEARXNG_PORT`, and `VISION_PORT` are already in use. Reports any conflicts and suggests alternatives.
+1. **Port conflict detection** — checks whether `VUE_PORT`, `OLLAMA_PORT`, `VLLM_PORT`, `SEARXNG_PORT`, and `VISION_PORT` are already in use. Reports any conflicts and suggests alternatives.

-2. **GPU enumeration** — queries `nvidia-smi` for GPU count and VRAM per card.
+2. **External service adoption** — if Ollama or SearXNG are already running on their configured ports (common when using native Ollama on macOS, or a shared SearXNG instance), preflight writes a `compose.override.yml` that stubs out the duplicate containers. The running process is adopted rather than replaced.

-3. **RAM check** — reads `/proc/meminfo` (Linux) or `vm_stat` (macOS) to determine available system RAM.
+3. **GPU enumeration** — queries `nvidia-smi` for GPU count and VRAM per card. On Apple Silicon Macs, falls back to `system_profiler SPDisplaysDataType` and returns unified memory as the VRAM figure.

-4. **KV cache offload** — if GPU VRAM is less than 10 GB, preflight calculates `CPU_OFFLOAD_GB` (the amount of KV cache to spill to system RAM) and writes it to `.env`. The vLLM container picks this up via `--cpu-offload-gb`.
+4. **RAM check** — reads `/proc/meminfo` (Linux) or `vm_stat` (macOS) for available system RAM.

-5. **Profile recommendation** — writes `RECOMMENDED_PROFILE` to `.env`. This is informational; `make start` uses the `PROFILE` variable you specify (defaulting to `remote`).
+5. **KV cache offload** — if GPU VRAM is less than 10 GB, preflight calculates `CPU_OFFLOAD_GB` and writes it to `.env`. The vLLM container picks this up via `--cpu-offload-gb` to overflow the KV cache to system RAM.

-You can run preflight independently:
+6. **Profile recommendation** — writes `RECOMMENDED_PROFILE` to `.env`. This is informational only; `./manage.sh start --profile <name>` uses the profile you specify.
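
The port check in step 1 can be sketched as follows. This is a minimal illustration, not the code in `scripts/preflight.py`: the function names, the default port table, and the "next free port" suggestion strategy are all assumptions made for this example.

```python
import socket

# Hypothetical defaults mirroring the variables named above; the real
# script presumably reads them from .env instead.
DEFAULT_PORTS = {
    "VUE_PORT": 8506,
    "OLLAMA_PORT": 11434,
    "VLLM_PORT": 8000,
    "SEARXNG_PORT": 8888,
    "VISION_PORT": 8002,
}


def port_in_use(port, host="127.0.0.1"):
    """Return True if something already accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising.
        return sock.connect_ex((host, port)) == 0


def find_conflicts(ports):
    """Return the subset of {name: port} whose port is already taken."""
    return {name: port for name, port in ports.items() if port_in_use(port)}


def suggest_free_port(start, limit=100):
    """Return the first free port above `start`, or None if none is found."""
    for candidate in range(start + 1, start + 1 + limit):
        if not port_in_use(candidate):
            return candidate
    return None


if __name__ == "__main__":
    for name, port in find_conflicts(DEFAULT_PORTS).items():
        print(f"{name}={port} is in use; try {suggest_free_port(port)}")
```

A connect-based probe like this only sees listeners on the loopback address, so a service bound to a different interface would be missed; that trade-off is acceptable for a sketch but may not match what preflight actually does.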
+
+Run preflight independently at any time:

 ```bash
-make preflight
+./manage.sh preflight
 # or
-python scripts/preflight.py
+conda run -n cf python scripts/preflight.py
 ```

 ---

+## Podman Support
+
+Podman is fully supported as a Docker drop-in. `install.sh` detects whether Podman or Docker is available, and `manage.sh`/`make` use it automatically.
+
+### GPU setup for Podman (CDI)
+
+Podman uses the CDI (Container Device Interface) standard for GPU passthrough, rather than Docker's `--gpus all` flag. Generate the CDI spec once after driver installation:
+
+```bash
+sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
+```
+
+Without this step, GPU profiles start but containers have no GPU access.
+
+### Rootless Podman
+
+Rootless Podman is supported. If you encounter permission errors on the Docker-compatible socket, ensure `podman.socket` is running for your user:
+
+```bash
+systemctl --user enable --now podman.socket
+```
+
+The `make` layer auto-detects rootless Podman and uses `$XDG_RUNTIME_DIR/podman/podman.sock` instead of `/var/run/docker.sock`.
+
+---
+
 ## Customising Ports

-Edit `.env` before running `make start`:
+Edit `.env` before running `./manage.sh start`:

 ```bash
-STREAMLIT_PORT=8501
+VUE_PORT=8506          # main UI (Vue SPA)
 OLLAMA_PORT=11434
 VLLM_PORT=8000
 SEARXNG_PORT=8888
@ -116,3 +205,15 @@ VISION_PORT=8002
 ```

 All containers read from `.env` via the `env_file` directive in `compose.yml`.
+
+---
+
+## Wizard Test Instance
+
+A separate compose file is available for testing first-run and onboarding wizard flows without touching your main data:
+
+```bash
+docker compose -f compose.wizard-test.yml --project-name peregrine-wizard up -d
+```
+
+The wizard test instance runs on port **8507** with ephemeral storage, so every `docker compose restart` wipes the database back to a clean slate. It uses the same images as the main instance but mounts a minimal LLM config so the wizard detection endpoints work correctly.
--- a/docs/getting-started/installation.md
+++ b/docs/getting-started/installation.md
@ -7,7 +7,7 @@ This page walks through a full Peregrine installation from scratch.
 ## Prerequisites

 - **Git** — to clone the repository
-- **Internet connection** — `install.sh` downloads Docker and other dependencies
+- **Internet connection** — `install.sh` downloads Docker/Podman and other dependencies
 - **Operating system**: Ubuntu/Debian, Fedora/RHEL, Arch Linux, or macOS (with Docker Desktop)

 !!! warning "Windows"
@ -34,16 +34,28 @@ bash install.sh

 1. **Detects your platform** (Ubuntu/Debian, Fedora/RHEL, Arch, macOS)
 2. **Installs Git** if not already present
-3. **Installs Docker Engine** and the Docker Compose v2 plugin via the official Docker repositories
+3. **Installs Docker Engine** (or Podman if Docker is not available) via official repositories
 4. **Adds your user to the `docker` group** so you do not need `sudo` for docker commands (Linux only — log out and back in after this)
-5. **Detects NVIDIA GPUs** — if `nvidia-smi` is present and working, installs the NVIDIA Container Toolkit and configures Docker to use it
+5. **Detects NVIDIA GPUs** — if `nvidia-smi` is present and working, installs the NVIDIA Container Toolkit and configures Docker/Podman to use it
 6. **Creates `.env` from `.env.example`** — edit `.env` to customise ports and model storage paths before starting

 !!! note "macOS"
-    `install.sh` installs Docker Desktop via Homebrew (`brew install --cask docker`) then exits. Open Docker Desktop, start it, then re-run the script.
+    `install.sh` installs Docker Desktop via Homebrew (`brew install --cask docker`) then exits. Open Docker Desktop, start it, then re-run the script. Ollama can also run natively for Metal GPU-accelerated inference — see the macOS note in Step 4.

 !!! note "GPU requirement"
-    For GPU support, `nvidia-smi` must return output before you run `install.sh`. Install your NVIDIA driver first. The Container Toolkit installation will fail silently if the driver is not present.
+    For GPU support, `nvidia-smi` must return output before you run `install.sh`. Install your NVIDIA driver first.
+
+---
+
+## Step 2a — Podman users: GPU CDI setup
+
+If you prefer rootless Podman over Docker, `install.sh` detects it and `manage.sh`/`make` use it automatically. For GPU profiles to work with Podman, you must generate a CDI spec first:
+
+```bash
+sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
+```
+
+This needs to be done once after driver installation. Without it, GPU profiles will start but containers will not have GPU access. Docker users can skip this step — Docker uses `--gpus all` instead of CDI.

 ---

@ -52,15 +64,21 @@ bash install.sh
 The `.env` file controls ports and volume mount paths. The defaults work for most single-user installs:

 ```bash
-# Default ports
-STREAMLIT_PORT=8501
-OLLAMA_PORT=11434
-VLLM_PORT=8000
-SEARXNG_PORT=8888
-VISION_PORT=8002
+# Main UI port
+VUE_PORT=8506
+
+# Model paths — use full absolute paths, not ~ (tilde does not expand inside containers)
+DOCS_DIR=/home/yourname/Documents/JobSearch
+OLLAMA_MODELS_DIR=/home/yourname/models/ollama
+
+# Inference model defaults
+OLLAMA_DEFAULT_MODEL=llama3.2:3b
+
+# External API keys — only needed for the "remote" profile or BYOK unlock
+ANTHROPIC_API_KEY=
 ```

-Change `STREAMLIT_PORT` if 8501 is taken on your machine.
+Change `VUE_PORT` if 8506 is taken on your machine. See [Docker Profiles](docker-profiles.md) for a full port reference.
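
Because a tilde in `.env` is not expanded inside containers, it can help to check the path entries before starting. The sketch below is one way to do that; the helper names and the list of keys to check are assumptions for illustration, not part of `install.sh` or `preflight.py`.

```python
from pathlib import PurePosixPath

# Keys taken from the example above; extend as needed (assumption).
PATH_KEYS = ("DOCS_DIR", "OLLAMA_MODELS_DIR")


def parse_env(text):
    """Parse simple KEY=VALUE lines, ignoring blanks and comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing inline comment such as "8506   # main UI".
        value = value.split(" #", 1)[0].strip()
        values[key.strip()] = value
    return values


def bad_paths(values, keys=PATH_KEYS):
    """Return {key: value} for path entries that are set but not absolute."""
    problems = {}
    for key in keys:
        value = values.get(key, "")
        if value and not PurePosixPath(value).is_absolute():
            problems[key] = value
    return problems


if __name__ == "__main__":
    with open(".env", encoding="utf-8") as handle:
        for key, value in bad_paths(parse_env(handle.read())).items():
            print(f"{key}={value} is not an absolute path")
```

A value such as `~/models/ollama` is reported because it does not start with `/`; empty values are skipped so optional keys do not raise false alarms.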

 ---

@ -69,21 +87,24 @@ Change `STREAMLIT_PORT` if 8501 is taken on your machine.
 Choose a profile based on your hardware:

 ```bash
-make start                        # remote — no GPU, use API-only LLMs
-make start PROFILE=cpu            # cpu — local models on CPU (slow)
-make start PROFILE=single-gpu     # single-gpu — one NVIDIA GPU
-make start PROFILE=dual-gpu       # dual-gpu — GPU 0 = Ollama, GPU 1 = vLLM
+./manage.sh start                        # cpu — local Ollama on CPU (recommended default)
+./manage.sh start --profile single-gpu   # one NVIDIA GPU
+./manage.sh start --profile dual-gpu     # two NVIDIA GPUs
+./manage.sh start --profile remote       # no local LLM — use cloud API keys only
 ```

-`make start` runs `preflight.py` first, which checks for port conflicts and writes GPU/RAM recommendations back to `.env`. Then it calls `docker compose --profile <PROFILE> up -d`.
+`manage.sh start` runs `preflight.py` first, which checks for port conflicts and writes GPU/RAM recommendations to `.env`. Then it calls `docker compose` (or `podman compose`) with the right compose file overlay for your hardware.
+
+!!! tip "macOS with native Ollama"
+    If you installed Ollama natively via Homebrew for Metal GPU inference, start with `--profile cpu`. The container API on port 8506 connects to your host's Ollama at `localhost:11434` automatically.

 ---

 ## Step 5 — Open the UI

-Navigate to **http://localhost:8501** (or whatever `STREAMLIT_PORT` you set).
+Navigate to **http://localhost:8506** (or whatever `VUE_PORT` you set).

-The first-run wizard launches automatically. See [First-Run Wizard](first-run-wizard.md) for a step-by-step guide through all seven steps.
+The first-run wizard launches automatically. See [First-Run Wizard](first-run-wizard.md) for a step-by-step guide.

 ---

@ -96,7 +117,7 @@ The first-run wizard launches automatically. See [First-Run Wizard](first-run-wi
 | Fedora 39/40 | Yes | |
 | RHEL / Rocky / AlmaLinux | Yes | |
 | Arch Linux / Manjaro | Yes | |
-| macOS (Apple Silicon) | Yes | Docker Desktop required; no GPU support |
+| macOS (Apple Silicon) | Yes | Docker Desktop required; GPU via native Ollama (Metal) |
 | macOS (Intel) | Yes | Docker Desktop required; no GPU support |
 | Windows | No | Use WSL2 with Ubuntu |

@ -107,20 +128,23 @@ The first-run wizard launches automatically. See [First-Run Wizard](first-run-wi
 Only NVIDIA GPUs are supported. AMD ROCm is not currently supported.

 Requirements:
+
 - NVIDIA driver installed and `nvidia-smi` working before running `install.sh`
 - CUDA 12.x recommended (CUDA 11.x may work but is untested)
 - Minimum 8 GB VRAM for `single-gpu` profile with default models
-- For `dual-gpu`: GPU 0 is assigned to Ollama, GPU 1 to vLLM
+- **Podman users:** CDI spec required — see Step 2a above

-If your GPU has less than 10 GB VRAM, `preflight.py` will calculate a `CPU_OFFLOAD_GB` value and write it to `.env`. The vLLM container picks this up via `--cpu-offload-gb` to overflow KV cache to system RAM.
+For `dual-gpu`, both cards must be NVIDIA. GPU 0 handles Ollama (cover letters, general tasks) and GPU 1 handles the research workload. The exact behaviour is controlled by `DUAL_GPU_MODE` — see [Docker Profiles](docker-profiles.md#dual-gpu-modes).
+
+If your GPU has less than 10 GB VRAM, `preflight.py` calculates a `CPU_OFFLOAD_GB` value and writes it to `.env`. The vLLM container picks this up via `--cpu-offload-gb` to overflow KV cache to system RAM.

 ---

 ## Stopping Peregrine

 ```bash
-make stop       # stop all containers
-make restart    # stop then start again (runs preflight first)
+./manage.sh stop       # stop all containers
+./manage.sh restart    # stop then start again (runs preflight first)
 ```

 ---
@ -128,7 +152,7 @@ make restart    # stop then start again (runs preflight first)
 ## Reinstalling / Clean State

 ```bash
-make clean      # removes containers, images, and data volumes (destructive)
+./manage.sh clean      # removes containers, images, and data volumes (destructive)
 ```

 You will be prompted to type `yes` to confirm.
--- a/docs/user-guide/daily-workflow.md
+++ b/docs/user-guide/daily-workflow.md
@ -0,0 +1,142 @@
+# Daily Workflow
+
+This page describes how Peregrine fits into a typical active job search. The core loop is short: find jobs, triage them, generate and send applications, track what happens next.
+
+---
+
+## The Core Loop
+
+```
+Run Discovery → Review Jobs → Apply Workspace → Track in Interviews
+```
+
+Each stage feeds the next. You can run the full loop in under ten minutes on a good day, or spend longer editing cover letters and doing interview prep when you need to.
+
+---
+
+## Starting Your Day
+
+### 1. Run Discovery
+
+Open the **Home** page and click **Run Discovery**. Peregrine queries all your configured job boards simultaneously and stores results in the local database.
+
+- Discovery runs one search profile at a time. Each profile produces results per board, then moves to the next.
+- A summary at the end shows how many new jobs were found vs. already known.
+- Jobs you have already seen (by URL) are skipped automatically.
+
+If some jobs came back with short descriptions, click **Fill Missing Descriptions** to enrich them in the background while you work.
+
+See [Job Discovery](job-discovery.md) for search profile configuration and board details.
+
+---
+
+### 2. Review the Queue
+
+Navigate to **Job Review**. New jobs arrive with status `pending` and appear in the review queue.
+
+For each job you can:
+- **Approve** — sends it into the application pipeline
+- **Reject** — archives it out of the queue
+
+Sort by **Match Score** (high to low) to see the best keyword matches first. The match score compares the job description against your resume keywords — a rough signal, not a hard filter.
+
+Jobs with incoming email leads (a recruiter contacted you about this role) sort to the top automatically.
+
+See [Job Review](job-review.md) for sorting, keyword gaps, and bulk actions.
+
+---
+
+### 3. Write and Send Applications
+
+Navigate to **Apply Workspace**. All approved jobs appear here.
+
+For each job:
+1. Click **Generate Cover Letter** — runs as a background task using your resume and career summary.
+2. Read and edit the result. The generator uses your mission alignment notes when it detects company fit.
+3. Click **Export PDF** to save a formatted PDF to your documents directory.
+4. Apply externally (via the company site or board).
+5. Click **Mark Applied** to move the job into the Interviews kanban.
+
+See [Apply Workspace](apply-workspace.md) for cover letter configuration, PDF formatting, and ATS optimization.
+
+---
+
+### 4. Track Interviews
+
+The **Interviews** page is a kanban board. Jobs move through stages as your search progresses:
+
+```
+applied → phone_screen → interviewing → offer → hired
+```
+
+When a job moves to **phone_screen**, Peregrine automatically kicks off a company research brief in the background — a one-page summary of the company, recent news, leadership, and accessibility signals.
+
+Use **Interview Prep** to review talking points, practice Q&A, and get live reference cards during calls.
+
+See [Interviews](interviews.md) for stage transitions, research briefs, and prep tools.
+
+---
+
+## Managing Your Resume
+
+Peregrine has two resume views that work together:
+
+### Resume Library (`/resumes`)
+
+An archive of every resume version — uploaded originals, AI-optimised variants, and auto-backups. The starred entry is your **active default**.
+
+- **Import** a PDF, DOCX, ODT, or plain text file to add a version to the library.
+- **★ Set as Default** marks the entry as the active resume used for cover letter generation and keyword matching.
+- **⇩ Apply to profile** pushes a library entry into the structured Resume Profile (see below), and links it so future profile edits sync back automatically.
+
+### Resume Profile (`Settings → Resume Profile`)
+
+A structured editor for personal details, work experience, education, and skills. This is the data the cover letter generator reads directly.
+
+- When content was applied from the library, the view shows a sync status and date.
+- Saving the Resume Profile automatically updates the linked library entry — keeping them in sync without manual effort.
+- You can replace the current profile by uploading a new file directly from this view.
+
+**Recommended flow:** upload to the library → set as default → "Apply to profile" → edit in Resume Profile as needed. Your library stays current automatically.
+
+---
+
+## Keeping Search Preferences Fresh
+
+Go to **Settings → Search Prefs** to update what Peregrine searches for.
+
+Key fields:
+
+| Field | What it does |
+|-------|-------------|
+| Job Titles | The roles searched across all boards |
+| Locations | Geographic scope (leave blank for unrestricted) |
+| Remote only | Filter to remote positions only |
+| Exclude Keywords | Drop any job title containing these words before it enters the database |
+| Job Boards | Enable or disable specific sources |
+| Blocklists | Companies, industries, or locations to always skip |
+
+Click **Suggest** next to any field to get AI-generated suggestions based on your resume profile.
+
+Changes take effect on the next discovery run — no restart needed.
+
+---
+
+## Weekly Habits
+
+**Clean up the queue** — reject stale pending jobs at least once a week so the queue stays scannable.
+
+**Update your search prefs** — if you are getting too many mismatches, add more terms to Exclude Keywords. If the queue is thin, broaden Locations or add boards.
+
+**Check Interviews** — move any stalled jobs to the right stage so the kanban reflects reality. The research brief appears in Interview Prep once a job reaches `phone_screen`.
+
+**Tune your resume keywords** — go to **Settings → Skills** if you want to add or reweight keywords used for match scoring.
+
+---
+
+## Tips
+
+- **Match score is a triage signal, not a gate.** A job that scores 40 might still be a perfect cultural fit whose posting uses different terminology. Read the description.
+- **Cover letters improve with context.** The richer your career summary and mission alignment notes (Settings → My Profile), the more specific and accurate the generated letters.
+- **Company research auto-runs.** You do not need to request it manually — it starts the moment a job hits `phone_screen`.
+- **Everything is local.** Your database, resume, and application history live in `data/staging.db` and `data/config/`. Back them up like any other important file.
--- a/docs/user-guide/settings.md
+++ b/docs/user-guide/settings.md
@ -1,6 +1,8 @@
 # Settings

-The Settings page is accessible from the sidebar. It contains all configuration for Peregrine, organised into tabs.
+Access Settings from the sidebar. The page has a navigation panel on the left (desktop) or a chip bar at the top (mobile). Each section is described below.
+
+For an overview of how settings fit into your daily use, see [Daily Workflow](daily-workflow.md).

 ---

@ -10,143 +12,177 @@ Personal information used in cover letters, research briefs, and interview prep.

 | Field | Description |
 |-------|-------------|
-| Name | Your full name |
-| Email | Contact email address |
+| Full name | Your name as it appears in generated documents |
+| Email | Contact email |
 | Phone | Contact phone number |
-| LinkedIn | LinkedIn profile URL |
-| Career summary | 2–4 sentence professional summary |
-| NDA companies | Companies you cannot mention in research briefs (previous employers under NDA) |
-| Docs directory | Where PDFs and exported documents are saved (default: `~/Documents/JobSearch`) |
+| LinkedIn URL | Used in cover letter headers |
+| Career summary | 2–4 sentences that anchor all LLM-generated content |

 ### Mission Preferences

-Optional notes about industries you genuinely care about. When the cover letter generator detects alignment with one of these industries, it injects your note into paragraph 3 of the cover letter.
+Optional notes about industries you genuinely care about. When the cover letter generator detects alignment with one of these industries, it injects your note into the generated letter.

-| Field | Tag | Example |
-|-------|-----|---------|
-| Music industry note | `music` | "I've played in bands for 15 years and care deeply about how artists get paid" |
-| Animal welfare note | `animal_welfare` | "I volunteer at my local shelter every weekend" |
-| Education note | `education` | "I tutored underserved kids and care deeply about literacy" |
+| Field | Tag |
+|-------|-----|
+| Music industry note | `music` |
+| Animal welfare note | `animal_welfare` |
+| Education note | `education` |

 Leave a field blank to use a generic default when alignment is detected.

 ### Research Brief Preferences

-Controls optional sections in company research briefs. Both are for personal decision-making only and are never included in applications.
+Controls optional sections in company research briefs. Both are for personal decision-making only and never appear in applications.

-| Setting | Section added |
-|---------|--------------|
-| Candidate accessibility focus | Disability inclusion and accessibility signals (ADA, ERGs, WCAG) |
-| Candidate LGBTQIA+ focus | LGBTQIA+ inclusion signals (ERGs, non-discrimination policies, culture) |
-
---
-
-## Search
-
-Manage search profiles. Equivalent to editing `config/search_profiles.yaml` directly, but with a form UI.
-
-- Add, edit, and delete profiles
-- Configure titles, locations, boards, custom boards, exclude keywords, and mission tags
-- Changes are saved to `config/search_profiles.yaml`
-
---
-
-## LLM Backends
-
-Configure which LLM backends Peregrine uses and in what order.
-
-| Setting | Description |
-|---------|-------------|
-| Enabled toggle | Whether a backend is considered in the fallback chain |
-| Base URL | API endpoint (for `openai_compat` backends) |
-| Model | Model name or `__auto__` (vLLM auto-detects the loaded model) |
-| API key | API key if required |
-| Test button | Sends a short ping to verify the backend is reachable |
-
-### Fallback chains
-
-Three independent fallback chains are configured:
-
-| Chain | Used for |
-|-------|---------|
-| `fallback_order` | Cover letter generation and general tasks |
-| `research_fallback_order` | Company research briefs |
-| `vision_fallback_order` | Survey screenshot analysis |
-
---
-
-## Notion
-
-Configure Notion integration credentials. Requires:
-- Notion integration token (from [notion.so/my-integrations](https://www.notion.so/my-integrations))
-- Database ID (from the Notion database URL)
-
-The field map controls which Notion properties correspond to which Peregrine fields. Edit `config/notion.yaml` directly for advanced field mapping.
-
---
-
-## Services
-
-Connection settings for local services:
-
-| Service | Default host:port |
-|---------|-----------------|
-| Ollama | localhost:11434 |
-| vLLM | localhost:8000 |
-| SearXNG | localhost:8888 |
-
-Each service has SSL and SSL-verify toggles for reverse-proxy setups.
+| Setting | Section added to brief |
+|---------|----------------------|
+| Accessibility focus | Disability inclusion signals (ADA, ERGs, WCAG) |
+| LGBTQIA+ focus | Inclusion signals (ERGs, non-discrimination policies) |

 ---

 ## Resume Profile

-Edit your parsed resume data (work experience, education, skills, certifications). This is the same data extracted during the first-run wizard Resume step.
+A structured editor for your work experience, education, skills, and personal details. This is the primary data source for cover letter generation.

-Changes here affect all future cover letter generations.
+### Resume vs. Library
+
+The Resume Profile is backed by a structured YAML file (`plain_text_resume.yaml`). The **Resume Library** (`/resumes`, accessible from the sidebar) is a versioned archive of full resume texts. They stay in sync automatically when you use the "Apply to profile" flow — see [Daily Workflow — Managing Your Resume](daily-workflow.md#managing-your-resume).
+
+### Uploading a resume
+
+If no profile exists yet, you can:
+
+- **Upload & Parse** — upload a PDF, DOCX, or ODT. Peregrine extracts structured data automatically.
+- **Fill in Manually** — start from a blank form.
+- **Run Setup Wizard** — re-enter the first-run wizard (self-hosted only).
+
+### Editing the profile
+
+When a resume exists, the full form is shown. Sections:
+
+- **Career Summary** — used in every cover letter and research brief
+- **Personal Information** — name, email, phone, LinkedIn; synced from My Profile
+- **Work Experience** — title, company, period, location, industry, responsibilities, skills
+- **Education** — institution, degree, field, dates
+- **Skills, Domains, Keywords** — tags used for keyword matching; click **Suggest** for AI recommendations
+- **Certifications and Achievements** — optional; included in cover letter context
+
+Click **Save** to write changes. If a default library entry is linked, it updates automatically.

 ---

-## Email
+## Search Prefs

-Configure IMAP email sync. See [Email Sync](email-sync.md) for full setup instructions.
+Manage what Peregrine searches for across all job boards. Changes take effect on the next discovery run — no restart needed.
+
+| Field | Description |
+|-------|-------------|
+| Remote preference | Remote only, on-site only, or both |
+| Job Titles | Roles searched on every board |
+| Locations | Geographic scope; leave blank for unrestricted |
+| Exclude Keywords | Drop any job title containing these words before it enters the database |
+| Job Boards | Enable or disable specific sources; boards marked "coming soon" are tracked in the backlog |
+| Custom Board URLs | Additional job board URLs to include |
+| Blocklists | Companies, industries, or locations to always skip |
+
+Click **Suggest** next to Job Titles, Locations, or Exclude Keywords to get AI-generated suggestions based on your resume.

 ---

-## Skills
+## Connections

-Manage your `config/resume_keywords.yaml` — the list of skills and keywords used for match scoring.
+API credentials and authentication for external services.

-Add or remove keywords. Higher-weighted keywords count more toward the match score.
+| Service | What it enables |
+|---------|----------------|
+| Notion | Sync approved/applied jobs to a Notion database |
+| Airtable | Alternative sync target |
+| Google Drive | Document export |
+| Slack / Discord | Status notifications |
+| Google Calendar / Apple Calendar | Interview scheduling (Paid) |
+
+See [Integrations](integrations.md) for per-service setup instructions.

 ---

-## Integrations
+## System

-Connection cards for all 13 integrations. See [Integrations](integrations.md) for per-service details.
+*Not available in cloud mode.*
+
+LLM backend configuration and service connection settings.
+
+### LLM Backends
+
+| Setting | Description |
+|---------|-------------|
+| Enabled toggle | Whether a backend is considered in the fallback chain |
+| Base URL | API endpoint for OpenAI-compatible backends |
+| Model | Model name or `__auto__` (vLLM auto-detects the loaded model) |
+| API key | Required for hosted APIs |
+| Test button | Sends a ping to verify the backend is reachable |
+
+Three independent fallback chains:
+
+| Chain | Used for |
+|-------|---------|
+| Cover letter chain | Cover letter generation and general tasks |
+| Research chain | Company research briefs |
+| Vision chain | Survey screenshot analysis |
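
A fallback chain of this kind can be pictured as "try each enabled backend in order and return the first success". The sketch below illustrates that idea only; the `Backend` class, the `call` signature, and `run_chain` are hypothetical names and do not describe Peregrine's actual router.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Backend:
    name: str
    enabled: bool
    # Hypothetical interface: takes a prompt, returns text, raises on failure.
    call: Callable[[str], str]


def run_chain(chain: Sequence[Backend], prompt: str) -> tuple[str, str]:
    """Return (backend_name, output) from the first enabled backend that succeeds."""
    errors = []
    for backend in chain:
        if not backend.enabled:
            continue  # disabled backends are never tried
        try:
            return backend.name, backend.call(prompt)
        except Exception as exc:  # any failure falls through to the next backend
            errors.append(f"{backend.name}: {exc}")
    if errors:
        raise RuntimeError("all backends failed: " + "; ".join(errors))
    raise RuntimeError("no enabled backends in chain")
```

Keeping the three chains as separate ordered lists means a slow or offline research backend does not block cover letter generation, which is presumably why they are configured independently.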
+
+### Service Hosts and Ports
+
+Connection settings for Ollama, vLLM, and SearXNG. Each service has an SSL toggle and SSL-verify toggle for reverse-proxy setups.

 ---

 ## Fine-Tune

-**Tier: Premium**
+*Tier: Premium only.*

 Tools for fine-tuning a cover letter model on your personal writing style.

-- Export cover letter training data as JSONL
-- Configure training parameters (rank, epochs, learning rate)
-- Start a fine-tuning run (requires `ogma` conda environment with Unsloth)
-- Register the output model with Ollama
+1. **Export Training Data** — produces a JSONL file from your saved cover letters
+2. **Configure training** — rank, epochs, learning rate
+3. **Start fine-tune** — runs via the `ogma` conda environment with Unsloth
+4. **Register model** — adds the output to Ollama as `alex-cover-writer:latest`
+
+---
+
+## License
+
+View your current license key, tier, and entitlements. Paste a new key here if you are upgrading or replacing a key.
+
+---
+
+## Data
+
+*Not available in cloud mode.*
+
+Export or delete your local data.
+
+| Action | What it does |
+|--------|-------------|
+| Export | Downloads `staging.db` and config files as a zip |
+| Purge pending jobs | Deletes all jobs with status `pending` |
+| Purge rejected jobs | Deletes all jobs with status `rejected` |
+| Factory reset | Removes all data and config; returns to first-run wizard |
+
+---
+
+## Privacy
+
+Controls for data collection and diagnostic logging. All collection is opt-in.

 ---

 ## Developer

-Developer and debugging tools.
+Developer and debugging tools. Only visible when dev mode is enabled or a `dev_tier_override` is set.

 | Option | Description |
 |--------|-------------|
-| Reset wizard | Sets `wizard_complete: false` and `wizard_step: 0`; resumes at step 1 on next page load |
-| Dev tier override | Set `dev_tier_override` to `paid` or `premium` to test tier-gated features locally |
-| Clear stuck tasks | Manually sets any `running` or `queued` background tasks to `failed` (also runs on app startup) |
-| View raw config | Shows the current `config/user.yaml` contents |
+| Reset wizard | Sets `wizard_complete: false`; wizard restarts on next page load |
+| Dev tier override | Set tier to `paid` or `premium` to test tier-gated features locally |
+| Clear stuck tasks | Manually fails any `running` or `queued` background tasks |
+| View raw config | Shows current `user.yaml` contents |
--- a/environment.yml
+++ b/environment.yml
@ -23,8 +23,8 @@ dependencies:
    - undetected-chromedriver
    - webdriver-manager
    - beautifulsoup4
-    - requests
-    - curl_cffi           # Chrome TLS fingerprint — bypasses Cloudflare on The Ladders
+    - requests>=2.33.0        # CVE-2026-25645
+    - curl_cffi>=0.15.0       # CVE-2026-33752
    - fake-useragent      # company scraper rotation

    # ── LLM / AI backends ─────────────────────────────────────────────────────
@ -55,13 +55,16 @@ dependencies:
    - google-auth>=2.0

    # ── Document handling ─────────────────────────────────────────────────────
-    - pypdf
+    - pypdf>=6.12.0           # 12 CVEs in 6.7.x (CVE-2026-27628 through CVE-2026-48156)
    - pdfminer-six
    - pyyaml>=6.0
-    - python-dotenv
+    - python-dotenv>=1.2.2    # CVE-2026-28684

    # ── Auth / licensing ──────────────────────────────────────────────────────
-    - PyJWT>=2.8
+    - PyJWT>=2.13.0           # 2.11 has sig bypass CVEs (PYSEC-2026-120/175-179); used for cloud session routing
+
+    # ── Rate limiting ─────────────────────────────────────────────────────────
+    - slowapi>=0.1.9           # per-user rate limiting on LLM endpoints

    # ── Utilities ─────────────────────────────────────────────────────────────
    - sqlalchemy
@ -71,6 +74,18 @@ dependencies:
    - tenacity
    - httpx

+    # ── Security pins (transitive deps with known CVEs) ───────────────────────
+    - starlette>=1.0.1        # PYSEC-2026-161 (FastAPI foundation)
+    - python-multipart>=0.0.27  # CVE-2026-40347/42561 file upload parsing
+    - aiohttp>=3.14.0         # 12 CVEs (CVE-2026-34513 through CVE-2026-34993)
+    - tornado>=6.5.5          # CVE-2026-35536
+    - cryptography>=46.0.7    # PYSEC-2026-35/36
+    - langsmith>=0.8.0        # CVE-2026-41182/45134
+    - gitpython>=3.1.50       # CVE-2026-42215/42284/44244
+    - lxml>=6.1.0             # PYSEC-2026-87 (XXE)
+    - idna>=3.15              # CVE-2026-45409
+    - markdownify>=0.14.1     # CVE-2025-46656
+
    # ── Testing ───────────────────────────────────────────────────────────────
    - pytest>=9.0
    - pytest-cov
--- a/mkdocs.yml
+++ b/mkdocs.yml
@ -52,6 +52,7 @@ nav:
    - First-Run Wizard: getting-started/first-run-wizard.md
    - Docker Profiles: getting-started/docker-profiles.md
  - User Guide:
+    - Daily Workflow: user-guide/daily-workflow.md
    - Job Discovery: user-guide/job-discovery.md
    - Job Review: user-guide/job-review.md
    - Apply Workspace: user-guide/apply-workspace.md
--- a/pyproject.toml
+++ b/pyproject.toml
@ -6,9 +6,8 @@ exclude = ["app/"]
 [tool.ruff.lint.per-file-ignores]
 # dev-api.py / dev_api.py (symlink): E702 semicolons in compact Pydantic model
 # definitions — intentional style for dense data models with many simple fields.
-# E402: mid-file module-level imports are intentional in dev-api.py for test patchability.
-"dev-api.py" = ["E702", "E402"]
-"dev_api.py" = ["E702", "E402"]
+"dev-api.py" = ["E702"]
+"dev_api.py" = ["E702"]

 # finetune_local.py: E402 ML libs (torch, datasets, trl) are imported after
 # runtime CUDA / Unsloth availability checks — conditional import pattern.
--- a/requirements.txt
+++ b/requirements.txt
@ -91,6 +91,7 @@ mkdocs-material>=9.5
 # ── Vue SPA API backend ──────────────────────────────────────────────────
 fastapi>=0.100.0
 uvicorn[standard]>=0.20.0
+slowapi>=0.1.9
 PyJWT>=2.8.0
 cryptography>=40.0.0
 python-multipart>=0.0.6
--- a/resume_matcher/apps/backend/app/cloud_session.py
+++ b/resume_matcher/apps/backend/app/cloud_session.py
@ -10,23 +10,15 @@ Usage — add to main.py once:
    from app.cloud_session import session_middleware_dep
    app = FastAPI(..., dependencies=[Depends(session_middleware_dep)])

-From that point, any route (and every service/llm function it calls)
-has access to the current user context via llm.get_request_*() helpers.
-
-Writing model resolution order (first match wins):
-  1. USER_WRITING_MODELS env var  — JSON dict mapping Directus UUID → model name
-     e.g. USER_WRITING_MODELS={"5b99ca9f-...": "meghan-letter-writer:latest"}
-     Use this for Monday; no Heimdall changes required.
-  2. session.meta["custom_writing_model"]  — returned by Heimdall resolve endpoint
-     once Heimdall is updated to expose user_preferences fields.
+Writing model is resolved from Heimdall's resolve response (user_preferences
+JSON column, projected as custom_writing_model in the response).  Assign models
+via the admin UI at /account/admin/model-assignments.
 """
 from __future__ import annotations

-import json
 import logging
-import os

-from fastapi import Depends, Request, Response
+from fastapi import Request, Response

 from circuitforge_core.cloud_session import CloudSessionFactory, CloudUser, detect_byok

@ -34,21 +26,6 @@ log = logging.getLogger(__name__)

 __all__ = ["CloudUser", "get_session", "require_tier", "session_middleware_dep"]

-# JSON dict mapping Directus user UUID → custom writing model name.
-# Used until Heimdall's resolve endpoint exposes user_preferences.
-def _load_user_writing_models() -> dict[str, str]:
-    raw = os.environ.get("USER_WRITING_MODELS", "").strip()
-    if not raw:
-        return {}
-    try:
-        return json.loads(raw)
-    except json.JSONDecodeError:
-        log.warning("USER_WRITING_MODELS is not valid JSON — ignoring")
-        return {}
-
-_USER_WRITING_MODELS: dict[str, str] = _load_user_writing_models()
-
-
 _factory = CloudSessionFactory(
    product="peregrine",
    byok_detector=detect_byok,
@ -81,9 +58,4 @@ def session_middleware_dep(request: Request, response: Response) -> None:

    set_request_user_id(user_id)
    set_request_tier(session.tier)
-    # Resolution order: env-var map (Monday path) → Heimdall meta (future path)
-    writing_model = (
-        _USER_WRITING_MODELS.get(session.user_id)
-        or session.meta.get("custom_writing_model")
-    )
-    set_request_writing_model(writing_model)
+    set_request_writing_model(session.meta.get("custom_writing_model") or None)
--- a/resume_matcher/apps/backend/app/llm.py
+++ b/resume_matcher/apps/backend/app/llm.py
@ -152,6 +152,62 @@ async def _allocate_orch_async(
                logging.debug("cf-orch release failed (non-fatal): %s", exc)


+@asynccontextmanager
+async def _allocate_by_task(
+    coordinator_url: str,
+    product: str,
+    task: str,
+    ttl_s: float,
+    caller: str,
+):
+    """Allocate via the task-model assignment layer (POST /api/inference/task).
+
+    Resolves product+task → model_id → service+node automatically.
+    Does not fall back itself: on 404 (no assignment) or any other non-2xx
+    response it raises RuntimeError so the caller can use model_candidates routing.
+    """
+    async with httpx.AsyncClient(timeout=120.0) as client:
+        payload: dict[str, Any] = {
+            "product": product,
+            "task": task,
+            "payload": {"ttl_s": ttl_s, "caller": caller},
+        }
+        uid = get_request_user_id()
+        if uid:
+            payload["payload"]["user_id"] = uid
+        resp = await client.post(
+            f"{coordinator_url.rstrip('/')}/api/inference/task",
+            json=payload,
+        )
+        if resp.status_code == 404:
+            raise RuntimeError(
+                f"No task assignment for product={product!r} task={task!r}; "
+                "falling back to model_candidates routing"
+            )
+        if not resp.is_success:
+            raise RuntimeError(
+                f"cf-orch task allocation failed for {product}/{task}: "
+                f"HTTP {resp.status_code} — {resp.text[:200]}"
+            )
+        data = resp.json()
+        service = data.get("service_type", "vllm")
+        alloc = _OrchAllocation(
+            allocation_id=data["allocation_id"],
+            url=data["url"],
+            service=service,
+        )
+        try:
+            yield alloc
+        finally:
+            try:
+                await client.delete(
+                    f"{coordinator_url.rstrip('/')}/api/services/{service}/allocations/{alloc.allocation_id}",
+                    timeout=10.0,
+                )
+            except Exception as exc:
+                logging.debug("cf-orch task release failed (non-fatal): %s", exc)
+
+
 def _normalize_api_base(provider: str, api_base: str | None) -> str | None:
    """Normalize api_base for LiteLLM provider-specific expectations.

@ -497,11 +553,41 @@ async def complete(
    config: LLMConfig | None = None,
    max_tokens: int = 4096,
    temperature: float = 0.7,
+    task_name: str | None = None,
 ) -> str:
-    """Make a completion request to the LLM."""
+    """Make a completion request to the LLM.
+
+    When task_name is provided and CF_ORCH_URL is set, routing is resolved via
+    the task-model assignment layer (POST /api/inference/task) instead of using
+    hardcoded model_candidates.  Falls back to model_candidates routing if the
+    assignment is missing, then to the default config if cf-orch is unavailable.
+    """
    if config is None:
        cf_orch_url = os.environ.get("CF_ORCH_URL", "").strip()
        if cf_orch_url:
+            # Task-routing path: preferred when a task name is known.
+            if task_name:
+                try:
+                    async with _allocate_by_task(
+                        cf_orch_url,
+                        product="peregrine",
+                        task=task_name,
+                        ttl_s=300.0,
+                        caller="peregrine-resume-matcher",
+                    ) as alloc:
+                        orch_config = LLMConfig(
+                            provider="openai",
+                            model="__auto__",
+                            api_key="any",
+                            api_base=alloc.url.rstrip("/") + "/v1",
+                        )
+                        return await complete(prompt, system_prompt, orch_config, max_tokens, temperature)
+                except (RuntimeError, httpx.HTTPError) as exc:
+                    logging.warning(
+                        "cf-orch task routing failed for %r, falling back to model_candidates: %s",
+                        task_name, exc,
+                    )
+            # Model-candidates path: legacy routing or task fallback.
            try:
                # Premium/ultra users get their personal fine-tuned writing model as the
                # first candidate; the base model is the fallback so cf-orch can
--- a/scripts/db.py
+++ b/scripts/db.py
@ -121,6 +121,17 @@ CREATE TABLE IF NOT EXISTS survey_responses (
 );
 """

+CREATE_RESUME_CORRECTIONS = """
+CREATE TABLE IF NOT EXISTS resume_optimizer_corrections (
+    id            INTEGER PRIMARY KEY AUTOINCREMENT,
+    job_id        INTEGER NOT NULL REFERENCES jobs(id),
+    section       TEXT    NOT NULL,
+    proposed_json TEXT    NOT NULL,
+    accepted_json TEXT    NOT NULL,
+    created_at    TEXT    DEFAULT (datetime('now'))
+);
+"""
+
 CREATE_DIGEST_QUEUE = """
 CREATE TABLE IF NOT EXISTS digest_queue (
    id             INTEGER PRIMARY KEY,
@ -205,9 +216,10 @@ def _migrate_db(db_path: Path) -> None:
        conn.execute("ALTER TABLE background_tasks ADD COLUMN params TEXT")
    except sqlite3.OperationalError:
        pass  # column already exists
-    # Ensure references tables exist (CREATE IF NOT EXISTS is idempotent)
+    # Ensure tables that can't be added via ALTER TABLE exist (all idempotent).
    conn.execute(CREATE_REFERENCES)
    conn.execute(CREATE_JOB_REFERENCES)
+    conn.execute(CREATE_RESUME_CORRECTIONS)
    conn.commit()
    conn.close()

@ -223,6 +235,7 @@ def init_db(db_path: Path = DEFAULT_DB) -> None:
    conn.execute(CREATE_DIGEST_QUEUE)
    conn.execute(CREATE_REFERENCES)
    conn.execute(CREATE_JOB_REFERENCES)
+    conn.execute(CREATE_RESUME_CORRECTIONS)
    conn.commit()
    conn.close()
    _migrate_db(db_path)
@ -1241,3 +1254,76 @@ def set_training_exclusion(db_path: Path, job_id: int, excluded: bool) -> None:
        conn.commit()
    finally:
        conn.close()
+
+
+# ── Resume optimizer corrections ──────────────────────────────────────────────
+
+def save_resume_correction(
+    db_path: Path,
+    job_id: int,
+    section: str,
+    proposed: object,
+    accepted: object,
+) -> None:
+    """Persist a (proposed, accepted) correction pair from the resume review UI.
+
+    Called when a user edits an LLM-proposed value and accepts it. The pair
+    becomes a supervised fine-tuning (SFT) candidate routed through Avocet.
+
+    Args:
+        section: 'summary' or 'experience:<title>|<company>'
+        proposed: Original LLM output (string for summary, list for bullets).
+        accepted: User-edited value (same type as proposed).
+    """
+    import json as _json
+    conn = sqlite3.connect(db_path)
+    try:
+        conn.execute(
+            """INSERT INTO resume_optimizer_corrections
+               (job_id, section, proposed_json, accepted_json)
+               VALUES (?, ?, ?, ?)""",
+            (job_id, section, _json.dumps(proposed), _json.dumps(accepted)),
+        )
+        conn.commit()
+    finally:
+        conn.close()
+
+
+def get_resume_corrections(
+    db_path: Path,
+    limit: int = 200,
+    job_id: int | None = None,
+) -> list[dict]:
+    """Return saved resume corrections, newest first, for Avocet export.
+
+    Args:
+        limit: Maximum rows to return.
+        job_id: If set, filter to corrections for a specific job.
+    """
+    import json as _json
+    conn = sqlite3.connect(db_path)
+    conn.row_factory = sqlite3.Row
+    try:
+        if job_id is not None:
+            rows = conn.execute(
+                "SELECT * FROM resume_optimizer_corrections WHERE job_id=? ORDER BY created_at DESC LIMIT ?",
+                (job_id, limit),
+            ).fetchall()
+        else:
+            rows = conn.execute(
+                "SELECT * FROM resume_optimizer_corrections ORDER BY created_at DESC LIMIT ?",
+                (limit,),
+            ).fetchall()
+    finally:
+        conn.close()
+    return [
+        {
+            "id": r["id"],
+            "job_id": r["job_id"],
+            "section": r["section"],
+            "proposed": _json.loads(r["proposed_json"]),
+            "accepted": _json.loads(r["accepted_json"]),
+            "created_at": r["created_at"],
+        }
+        for r in rows
+    ]
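Reviewer note: the correction pair is stored as two JSON text columns, so a string (summary) and a list (bullets) both survive the round trip. A minimal sketch of that round trip, assuming an in-memory SQLite database and leaving out the `jobs` foreign key; the sample section name and bullet text are made up for illustration and are not taken from the change:

```python
import json
import sqlite3

# In-memory database standing in for the real DB file (assumption for the sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE resume_optimizer_corrections (
           id            INTEGER PRIMARY KEY AUTOINCREMENT,
           job_id        INTEGER NOT NULL,
           section       TEXT    NOT NULL,
           proposed_json TEXT    NOT NULL,
           accepted_json TEXT    NOT NULL,
           created_at    TEXT    DEFAULT (datetime('now'))
       )"""
)

# Hypothetical correction: the user edited the first bullet before accepting.
proposed = ["Led team", "Shipped product"]
accepted = ["Led a team of 5", "Shipped product"]

conn.execute(
    """INSERT INTO resume_optimizer_corrections
       (job_id, section, proposed_json, accepted_json)
       VALUES (?, ?, ?, ?)""",
    (1, "experience:Engineer|Acme", json.dumps(proposed), json.dumps(accepted)),
)
conn.commit()

row = conn.execute(
    "SELECT proposed_json, accepted_json FROM resume_optimizer_corrections"
).fetchone()
restored = (json.loads(row[0]), json.loads(row[1]))
conn.close()

print(restored == (proposed, accepted))
```

Storing JSON text keeps one table for both value shapes, at the cost of not being able to query inside the values with plain SQL.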
--- a/scripts/feedback_api.py
+++ b/scripts/feedback_api.py
@ -163,7 +163,8 @@ def _ensure_labels(

 def create_forgejo_issue(title: str, body: str, labels: list[str]) -> dict:
    """Create a Forgejo issue. Returns {"number": int, "url": str}."""
-    token = os.environ.get("FORGEJO_API_TOKEN", "")
+    # Use the bot token when set; fall back to the main API token for dev/self-hosted.
+    token = os.environ.get("FORGEJO_BOT_TOKEN") or os.environ.get("FORGEJO_API_TOKEN", "")
    repo = os.environ.get("FORGEJO_REPO", "pyr0ball/peregrine")
    base = os.environ.get("FORGEJO_API_URL", "https://git.opensourcesolarpunk.com/api/v1")
    headers = {"Authorization": f"token {token}", "Content-Type": "application/json"}
@ -183,7 +184,7 @@ def upload_attachment(
    issue_number: int, image_bytes: bytes, filename: str = "screenshot.png"
 ) -> str:
    """Upload a screenshot to an existing Forgejo issue. Returns attachment URL."""
-    token = os.environ.get("FORGEJO_API_TOKEN", "")
+    token = os.environ.get("FORGEJO_BOT_TOKEN") or os.environ.get("FORGEJO_API_TOKEN", "")
    repo = os.environ.get("FORGEJO_REPO", "pyr0ball/peregrine")
    base = os.environ.get("FORGEJO_API_URL", "https://git.opensourcesolarpunk.com/api/v1")
    headers = {"Authorization": f"token {token}"}
--- a/scripts/finetune_local.py
+++ b/scripts/finetune_local.py
@ -73,7 +73,7 @@ if not LETTERS_JSONL.exists():
    sys.exit(f"ERROR: Dataset not found at {LETTERS_JSONL}\n"
             "Run: make prepare-training  (or: python scripts/prepare_training_data.py)")

-records = [json.loads(l) for l in LETTERS_JSONL.read_text().splitlines() if l.strip()]
+records = [json.loads(line) for line in LETTERS_JSONL.read_text().splitlines() if line.strip()]
 print(f"Loaded {len(records)} training examples.")

 # Convert to chat format expected by SFTTrainer
--- a/scripts/rate_limit.py
+++ b/scripts/rate_limit.py
@ -0,0 +1,32 @@
+"""Per-user rate limiting for Peregrine LLM generation endpoints."""
+from pathlib import Path
+
+from slowapi import Limiter
+from slowapi.errors import RateLimitExceeded
+from slowapi.util import get_remote_address
+from starlette.requests import Request
+from starlette.responses import JSONResponse
+
+
+def _rate_key(request: Request) -> str:
+    """Cloud mode: user_id from DB path. Local mode: client IP. Demo: unique key (no rate limit)."""
+    from dev_api import IS_DEMO, _CLOUD_MODE, _request_db  # lazy import avoids circular
+    if IS_DEMO:
+        return f"demo-{id(request)}"  # unique per request — effectively no rate limiting
+    db_path = _request_db.get()
+    if _CLOUD_MODE and db_path:
+        return Path(db_path).parts[-3]  # user_id segment
+    return get_remote_address(request)
+
+
+limiter = Limiter(key_func=_rate_key)
+
+
+def rate_limit_exceeded_handler(request: Request, exc: RateLimitExceeded) -> JSONResponse:
+    """Return 429 with Retry-After header."""
+    retry_after = getattr(exc, "retry_after", 60)
+    return JSONResponse(
+        status_code=429,
+        content={"error": "rate_limit_exceeded", "retry_after": retry_after},
+        headers={"Retry-After": str(retry_after)},
+    )
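Reviewer note: in cloud mode the rate key is the third-from-last component of the per-request DB path. The sketch below shows that extraction under the layout the tests use (`<root>/<user_id>/peregrine/staging.db`). `user_id_from_db_path` and the sample path are hypothetical names for illustration, and the short-path guard is an addition of this sketch; the code in the diff indexes `parts[-3]` directly.

```python
from __future__ import annotations

from pathlib import PurePosixPath


def user_id_from_db_path(db_path: str) -> str | None:
    """Return the third-from-last path component, or None if the path is too short.

    Hypothetical helper; assumes <root>/<user_id>/peregrine/staging.db.
    """
    parts = PurePosixPath(db_path).parts
    return parts[-3] if len(parts) >= 3 else None


print(user_id_from_db_path("/data/abc-user-123/peregrine/staging.db"))
```

`PurePosixPath` is used here only so the result does not depend on the host OS. Without a guard, a path with fewer than three components would raise `IndexError` inside the key function.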
--- a/scripts/resume_parser.py
+++ b/scripts/resume_parser.py
@ -19,6 +19,14 @@ from docx import Document

 log = logging.getLogger(__name__)

+# Browser print artifact patterns — lines injected when a PDF is printed from a browser
+# (print header "MM/DD/YY, H:MM AM/PM <title>" and print footer "file:///... N/N")
+_BROWSER_ARTIFACT_RE = re.compile(
+    r"^file:///"                                                      # file:// URL footer
+    r"|^\d{1,2}/\d{1,2}/\d{2,4},\s+\d{1,2}:\d{2}\s+[AP]M\b",       # MM/DD/YY, H:MM AM/PM header
+    re.I,
+)
+
 # ── Section header detection ──────────────────────────────────────────────────

 _SECTION_NAMES = {
@ -27,6 +35,8 @@ _SECTION_NAMES = {
    "education":  re.compile(r"^(education|academic|qualifications|degrees?|educational background|academic background)\s*:?\s*$", re.I),
    "skills":     re.compile(r"^(skills?|technical skills?|core competencies|competencies|expertise|areas? of expertise|key skills?|proficiencies|tools? & technologies)\s*:?\s*$", re.I),
    "achievements": re.compile(r"^(achievements?|accomplishments?|awards?|honors?|certifications?|publications?|volunteer)\s*:?\s*$", re.I),
+    "projects":     re.compile(r"^(projects?|independent development|independent projects?|side projects?|personal projects?|open.?source|portfolio)\s*:?\s*$", re.I),
+    "references": re.compile(r"^references?\s*:?\s*$", re.I),
 }

 # Degrees — used to detect education lines
@ -163,6 +173,8 @@ def _split_sections(text: str) -> dict[str, list[str]]:
        stripped = line.strip()
        if not stripped:
            continue
+        if _BROWSER_ARTIFACT_RE.match(stripped):
+            continue
        matched = False
        for section, pattern in _SECTION_NAMES.items():
            # Match if the line IS a section header (short + matches pattern)
@ -232,10 +244,14 @@ def _parse_experience(lines: list[str]) -> list[dict]:
      (A) Title | Company          (B) Title | Company | Dates
          Dates                        • bullet
          • bullet
+      (C) Title\tDates             (tab-separated, common in DOCX exports)
+          Company | Location
+          • bullet
    """
    entries: list[dict] = []
    current: dict | None = None
    prev_line = ""
+    seen_bullets = False  # True once we've appended the first bullet to current

    for line in lines:
        date_match = _DATE_RANGE_RE.search(line)
@ -243,12 +259,13 @@ def _parse_experience(lines: list[str]) -> list[dict]:
            if current:
                entries.append(current)
            # Title/company extraction — three layouts:
-            #  (A) Title on prev_line, "Company | Location | Dates" on date line
+            #  (A) Title on prev_line (not a bullet), "Company | Location | Dates" on date line
            #  (B) "Title | Company" on prev_line, dates on date line (same_line empty)
            #  (C) "Title | Company | Dates" all on one line
            same_line = _DATE_RANGE_RE.sub("", line)
            # Remove residual punctuation-only fragments like "()" left after date removal
            same_line = re.sub(r"[()[\]{}\s]+$", "", same_line).strip(" –—|-•")
+            # Only use prev_line as title if it isn't bullet text (cleared after bullets)
            if prev_line and same_line.strip():
                # Layout A: title = prev_line, company = first segment of same_line
                title   = prev_line.strip()
@ -268,8 +285,19 @@ def _parse_experience(lines: list[str]) -> list[dict]:
                "bullets":    [],
            }
            prev_line = ""
+            seen_bullets = False
        elif current is not None:
            is_bullet = bool(re.match(r"^[•\-–—*◦▪▸►]\s*", line))
+
+            # Layout C: company/location on the line immediately after the date line,
+            # before any bullets. Short non-date line = company, not a next-job header.
+            if (not is_bullet and not seen_bullets and not current["company"]
+                    and not _DATE_RE.search(line) and len(line.strip()) < 80):
+                co_part = re.split(r"\s{2,}|[|,]\s*", line.strip(), maxsplit=1)[0]
+                current["company"] = co_part.strip()
+                prev_line = ""
+                continue
+
            looks_like_header = (
                not is_bullet
                and " | " in line
@ -282,7 +310,10 @@ def _parse_experience(lines: list[str]) -> list[dict]:
                clean = re.sub(r"^[•\-–—*◦▪▸►]\s*", "", line).strip()
                if clean:
                    current["bullets"].append(clean)
-                prev_line = line
+                    seen_bullets = True
+                # Clear prev_line after non-header content so the next date match
+                # doesn't mistake a bullet as a job title (Layout A false-positive).
+                prev_line = ""
        else:
            prev_line = line

@ -294,39 +325,77 @@ def _parse_experience(lines: list[str]) -> list[dict]:

 # ── Education ─────────────────────────────────────────────────────────────────

+_INSTITUTION_RE = re.compile(r"\b(university|college|institute|school|academy)\b", re.I)
+
+
 def _parse_education(lines: list[str]) -> list[dict]:
+    """Parse education entries.
+
+    Primary path: degree keyword detected (B.S., Master, etc.)
+    Fallback path: year range detected without a degree keyword — handles resumes
+    with courses, programmes, or non-degree study (e.g. "San Jose State University  2005-2006").
+    """
    entries: list[dict] = []
    current: dict | None = None
    prev_line = ""

    for line in lines:
-        if _DEGREE_RE.search(line):
+        has_degree = bool(_DEGREE_RE.search(line))
+        date_range = _DATE_RANGE_RE.search(line)
+        has_year   = bool(re.search(r"\b(19|20)\d{2}\b", line))
+
+        if has_degree or (has_year and date_range):
            if current:
                entries.append(current)
-            current = {
-                "institution":      "",
-                "degree":           "",
-                "field":            "",
-                "graduation_year":  "",
-            }
+            current = {"institution": "", "degree": "", "field": "", "graduation_year": ""}
+
            year_m = re.search(r"\b(19|20)\d{2}\b", line)
            if year_m:
                current["graduation_year"] = year_m.group(0)
-            degree_m = _DEGREE_RE.search(line)
-            if degree_m:
-                current["degree"] = degree_m.group(0).upper()
-            remainder = _DEGREE_RE.sub("", _DATE_RE.sub("", line))
-            remainder = re.sub(r"\b(19|20)\d{2}\b", "", remainder)
-            current["field"] = remainder.strip(" ,–—|•.")
-            # Layout A: institution was on the line before the degree line
-            if prev_line and not _DEGREE_RE.search(prev_line):
-                current["institution"] = prev_line.strip(" ,–—|•")
-        elif current is not None and not current["institution"]:
-            # Layout B: institution follows the degree line
-            clean = line.strip(" ,–—|•")
+
+            if has_degree:
+                degree_m = _DEGREE_RE.search(line)
+                if degree_m:
+                    current["degree"] = degree_m.group(0).upper()
+                remainder = _DEGREE_RE.sub("", _DATE_RE.sub("", line))
+                remainder = re.sub(r"\b(19|20)\d{2}\b", "", remainder)
+                current["field"] = remainder.strip(" ,–—|•.")
+                if prev_line and not _DEGREE_RE.search(prev_line) and not _DATE_RE.search(prev_line):
+                    current["institution"] = prev_line.strip(" ,–—|•")
+            else:
+                # Fallback: year-range line without a degree keyword.
+                # Two layouts:
+                #   (A) PDF: "Graphic Design, 2005–2006" with institution on prev_line
+                #   (B) DOCX: "San Jose State University\t2005-2006" — institution on same line
+                same = _DATE_RANGE_RE.sub("", line)
+                same = re.sub(r"\b(19|20)\d{2}\b", "", same).strip(" ,–—|•\t")
+                prev_clean = prev_line.strip(" ,–—|•") if prev_line else ""
+
+                if same and _INSTITUTION_RE.search(prev_clean):
+                    # Layout A: institution on prev_line (e.g. "San Jose State University")
+                    current["institution"] = prev_clean
+                    current["field"] = same
+                elif same:
+                    # Layout B: institution embedded on same line as year
+                    current["institution"] = same
+                elif prev_clean:
+                    current["institution"] = prev_clean
+
+            prev_line = ""  # consumed; prevent leaking into the next entry
+
+        elif current is not None:
+            clean = line.strip(" ,–—|•\t")
            if clean:
-                current["institution"] = clean
-        prev_line = line.strip()
+                if not current["institution"]:
+                    current["institution"] = clean
+                elif not current["field"]:
+                    current["field"] = clean
+                    prev_line = ""  # field consumed — don't seed the next entry
+                    continue
+            prev_line = line.strip()
+
+        else:
+            prev_line = line.strip()

    if current:
        entries.append(current)
@ -336,13 +405,39 @@ def _parse_education(lines: list[str]) -> list[dict]:

 # ── Skills ────────────────────────────────────────────────────────────────────

+def _split_skill_tokens(line: str) -> list[str]:
+    """Split a skills line on delimiters, but not on commas inside parentheses.
+
+    Splits on |, •, ·, tab first (always separators), then on comma only when
+    paren depth is zero — so "CRM Ticketing (Jira, Salesforce)" stays intact.
+    """
+    tokens: list[str] = []
+    for part in re.split(r"[|•·\t]+", line):
+        depth, buf = 0, ""
+        for ch in part:
+            if ch == "(":
+                depth += 1
+                buf += ch
+            elif ch == ")":
+                depth = max(depth - 1, 0)  # ignore an unmatched ")"
+                buf += ch
+            elif ch == "," and depth == 0:
+                tokens.append(buf)
+                buf = ""
+            else:
+                buf += ch
+        tokens.append(buf)
+    return tokens
+
+
 def _parse_skills(lines: list[str]) -> list[str]:
    skills: list[str] = []
    for line in lines:
-        # Split on common delimiters
-        for item in re.split(r"[,|•·/]+", line):
-            clean = item.strip(" -–—*◦▪▸►()")
-            if 1 < len(clean) <= 50:
+        for item in _split_skill_tokens(line):
+            # Strip only bullet/dash markers and whitespace, NOT parentheses —
+            # many skills contain parens, e.g. "C++ (Arduino / Embedded)"
+            clean = item.strip(" -–—*◦▪▸►")
+            if 1 < len(clean) <= 60:
                skills.append(clean)
    return skills

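Reviewer note: the skills change rests on splitting at commas only when they sit outside parentheses. A self-contained sketch of that technique follows; `split_top_level_commas` is a hypothetical name, and unlike the helper in the diff it trims whitespace, drops empty tokens, and clamps the depth at zero so a stray `)` does not disable later splits.

```python
from __future__ import annotations


def split_top_level_commas(text: str) -> list[str]:
    """Split on commas that are not inside parentheses (illustrative helper)."""
    tokens: list[str] = []
    buf: list[str] = []
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth = max(depth - 1, 0)  # tolerate a stray ")"
        if ch == "," and depth == 0:
            # Top-level comma: close the current token.
            tokens.append("".join(buf).strip())
            buf = []
        else:
            buf.append(ch)
    tokens.append("".join(buf).strip())
    return [t for t in tokens if t]


print(split_top_level_commas("Python, CRM Ticketing (Jira, Salesforce), SQL"))
```

A character scan is used instead of a regex because tracking nesting depth is awkward to express in a single pattern.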
--- a/tests/conftest.py
+++ b/tests/conftest.py
@ -0,0 +1,18 @@
+"""Shared pytest fixtures for the Peregrine test suite."""
+import pytest
+
+
+@pytest.fixture(autouse=True)
+def reset_rate_limiter():
+    """Reset slowapi state before each test.
+
+    Each importlib.reload(dev_api) re-applies @limiter.limit() decorators,
+    accumulating stale registrations in _route_limits on the shared limiter
+    singleton. One real request then triggers N limit-checks (N = reload count),
+    exhausting per-hour budgets prematurely. Clearing both _storage and
+    _route_limits before each test gives each test a clean slate.
+    """
+    from scripts.rate_limit import limiter
+    limiter._storage.reset()
+    limiter._route_limits.clear()
+    yield
--- a/tests/test_dev_api_prep.py
+++ b/tests/test_dev_api_prep.py
@ -5,11 +5,12 @@ from fastapi.testclient import TestClient


 @pytest.fixture
-def client():
-    import sys
-    sys.path.insert(0, "/Library/Development/CircuitForge/peregrine/.worktrees/feature-vue-spa")
-    from dev_api import app
-    return TestClient(app)
+def client(monkeypatch):
+    import importlib
+    monkeypatch.delenv("DEMO_MODE", raising=False)
+    import dev_api
+    importlib.reload(dev_api)
+    return TestClient(dev_api.app)


 # ── /api/jobs/{id}/research ─────────────────────────────────────────────────
--- a/tests/test_rate_limiting.py
+++ b/tests/test_rate_limiting.py
@ -0,0 +1,283 @@
+"""Tests for per-user rate limiting on LLM generation endpoints.
+
+Covers:
+- _rate_key() in demo mode returns unique per-request key (no rate limiting)
+- _rate_key() in cloud mode returns user_id segment from DB path
+- _rate_key() in local mode falls back to client IP address
+- rate_limit_exceeded_handler() returns 429 with Retry-After header
+- Integration: hitting rate limit on a decorated endpoint returns 429
+"""
+import json
+import sqlite3
+from pathlib import Path
+from unittest.mock import MagicMock, patch
+
+import pytest
+from fastapi.testclient import TestClient
+from limits import parse as _limits_parse
+from slowapi.errors import RateLimitExceeded
+from slowapi.wrappers import Limit as _LimitWrapper
+from starlette.requests import Request
+
+from scripts.rate_limit import _rate_key, rate_limit_exceeded_handler
+
+
+# ── Helpers ───────────────────────────────────────────────────────────────────
+
+def _make_request(client_ip: str = "1.2.3.4") -> MagicMock:
+    """Return a minimal mock Request with a client IP."""
+    req = MagicMock(spec=Request)
+    req.client = MagicMock()
+    req.client.host = client_ip
+    req.headers = {}
+    req.scope = {"type": "http"}
+    return req
+
+
+def _make_rate_limit_exceeded(spec: str = "20/hour") -> RateLimitExceeded:
+    """Construct a valid RateLimitExceeded (slowapi 0.1.9+ requires a Limit wrapper)."""
+    limit_item = _limits_parse(spec)
+    wrapper = _LimitWrapper(
+        limit=limit_item,
+        key_func=lambda r: "test",
+        scope=None,
+        per_method=False,
+        methods=None,
+        error_message=None,
+        exempt_when=None,
+        cost=1,
+        override_defaults=False,
+    )
+    return RateLimitExceeded(wrapper)
+
+
+# ── Fixtures ──────────────────────────────────────────────────────────────────
+
+@pytest.fixture()
+def tmp_db(tmp_path):
+    """Create a minimal staging.db in tmp_path and return its string path."""
+    db_path = tmp_path / "staging.db"
+    con = sqlite3.connect(str(db_path))
+    con.executescript("""
+        CREATE TABLE IF NOT EXISTS jobs (
+            id INTEGER PRIMARY KEY,
+            title TEXT, company TEXT, url TEXT, location TEXT,
+            is_remote INTEGER DEFAULT 0, salary TEXT,
+            match_score REAL, keyword_gaps TEXT, status TEXT,
+            interview_date TEXT, rejection_stage TEXT,
+            applied_at TEXT, phone_screen_at TEXT, interviewing_at TEXT,
+            offer_at TEXT, hired_at TEXT, survey_at TEXT
+        );
+        CREATE TABLE IF NOT EXISTS background_tasks (
+            id INTEGER PRIMARY KEY,
+            task_type TEXT,
+            job_id INTEGER,
+            status TEXT DEFAULT 'queued',
+            stage TEXT,
+            error TEXT,
+            params TEXT,
+            finished_at TEXT
+        );
+    """)
+    con.close()
+    return str(db_path)
+
+
+@pytest.fixture()
+def client(tmp_db, monkeypatch):
+    """TestClient wired to a fresh isolated DB."""
+    monkeypatch.setenv("STAGING_DB", tmp_db)
+    import dev_api
+    monkeypatch.setattr(dev_api, "DB_PATH", tmp_db)
+    monkeypatch.setattr(
+        dev_api,
+        "_request_db",
+        type("CV", (), {"get": lambda self: tmp_db, "set": lambda *a: None})(),
+    )
+    return TestClient(dev_api.app)
+
+
+# ── _rate_key(): demo mode ────────────────────────────────────────────────────
+
+class TestRateKeyDemoMode:
+    def test_returns_unique_key_per_request(self):
+        """In demo mode each request gets a unique key so no limiting occurs."""
+        req1 = _make_request()
+        req2 = _make_request()
+        with patch("dev_api.IS_DEMO", True), patch("dev_api._CLOUD_MODE", False):
+            key1 = _rate_key(req1)
+            key2 = _rate_key(req2)
+        assert key1.startswith("demo-")
+        assert key2.startswith("demo-")
+        assert key1 != key2  # unique per request object
+
+    def test_key_does_not_use_client_ip(self):
+        """Demo key must not equal the client IP."""
+        req = _make_request(client_ip="9.9.9.9")
+        with patch("dev_api.IS_DEMO", True), patch("dev_api._CLOUD_MODE", False):
+            key = _rate_key(req)
+        assert "9.9.9.9" not in key
+
+
+# ── _rate_key(): cloud mode ───────────────────────────────────────────────────
+
+class TestRateKeyCloudMode:
+    def test_returns_user_id_from_db_path(self, tmp_path):
+        """Cloud mode extracts user_id (3rd-from-end path segment)."""
+        cloud_db = str(tmp_path / "abc-user-123" / "peregrine" / "staging.db")
+        req = _make_request()
+        with (
+            patch("dev_api.IS_DEMO", False),
+            patch("dev_api._CLOUD_MODE", True),
+            patch("dev_api._request_db") as mock_cv,
+        ):
+            mock_cv.get.return_value = cloud_db
+            key = _rate_key(req)
+        assert key == "abc-user-123"
+
+    def test_falls_back_to_ip_when_db_path_is_none(self):
+        """Cloud mode without a DB path (unauthenticated) falls back to IP."""
+        req = _make_request(client_ip="10.0.0.1")
+        with (
+            patch("dev_api.IS_DEMO", False),
+            patch("dev_api._CLOUD_MODE", True),
+            patch("dev_api._request_db") as mock_cv,
+        ):
+            mock_cv.get.return_value = None
+            key = _rate_key(req)
+        assert key == "10.0.0.1"
+
+
+# ── _rate_key(): local mode ───────────────────────────────────────────────────
+
+class TestRateKeyLocalMode:
+    def test_returns_client_ip(self):
+        """Local (non-cloud, non-demo) mode uses the remote client IP."""
+        req = _make_request(client_ip="192.168.1.50")
+        with patch("dev_api.IS_DEMO", False), patch("dev_api._CLOUD_MODE", False):
+            key = _rate_key(req)
+        assert key == "192.168.1.50"
+
+    def test_different_ips_produce_different_keys(self):
+        """Two distinct client IPs produce distinct rate limit keys."""
+        req_a = _make_request(client_ip="10.0.0.1")
+        req_b = _make_request(client_ip="10.0.0.2")
+        with patch("dev_api.IS_DEMO", False), patch("dev_api._CLOUD_MODE", False):
+            key_a = _rate_key(req_a)
+            key_b = _rate_key(req_b)
+        assert key_a != key_b
+
+
+# ── rate_limit_exceeded_handler() ─────────────────────────────────────────────
+
+class TestRateLimitExceededHandler:
+    def test_returns_429_status(self):
+        """Handler always returns HTTP 429."""
+        req = _make_request()
+        exc = _make_rate_limit_exceeded("20/hour")
+        response = rate_limit_exceeded_handler(req, exc)
+        assert response.status_code == 429
+
+    def test_body_has_error_field(self):
+        """Response body includes error: rate_limit_exceeded."""
+        req = _make_request()
+        exc = _make_rate_limit_exceeded("20/hour")
+        response = rate_limit_exceeded_handler(req, exc)
+        body = json.loads(response.body)
+        assert body["error"] == "rate_limit_exceeded"
+
+    def test_body_has_retry_after_field(self):
+        """Response body includes retry_after value."""
+        req = _make_request()
+        exc = _make_rate_limit_exceeded("20/hour")
+        response = rate_limit_exceeded_handler(req, exc)
+        body = json.loads(response.body)
+        assert "retry_after" in body
+
+    def test_retry_after_header_present(self):
+        """Retry-After HTTP header is set on the response."""
+        req = _make_request()
+        exc = _make_rate_limit_exceeded("20/hour")
+        response = rate_limit_exceeded_handler(req, exc)
+        assert "Retry-After" in response.headers
+
+    def test_retry_after_header_matches_body(self):
+        """Retry-After header value matches the retry_after field in the body."""
+        req = _make_request()
+        exc = _make_rate_limit_exceeded("20/hour")
+        response = rate_limit_exceeded_handler(req, exc)
+        body = json.loads(response.body)
+        assert response.headers["Retry-After"] == str(body["retry_after"])
+
+
+# ── Integration: 429 on rate-limited endpoints ────────────────────────────────
+
+def _patch_limiter_to_raise(exc: RateLimitExceeded):
+    """Context manager: make the slowapi limiter fire for any request."""
+    return patch(
+        "slowapi.extension.Limiter._check_request_limit",
+        side_effect=exc,
+    )
+
+
+class TestRateLimitIntegration:
+    """Verify that when the limiter fires, the app returns 429 via the exception handler."""
+
+    def test_cover_letter_generate_returns_429_on_limit(self, client):
+        """When the rate limiter triggers, the cover letter endpoint returns 429."""
+        exc = _make_rate_limit_exceeded("20/hour")
+        with _patch_limiter_to_raise(exc):
+            resp = client.post("/api/jobs/1/cover_letter/generate")
+        assert resp.status_code == 429
+
+    def test_research_generate_returns_429_on_limit(self, client):
+        """When the rate limiter triggers, the research endpoint returns 429."""
+        exc = _make_rate_limit_exceeded("10/hour")
+        with _patch_limiter_to_raise(exc):
+            resp = client.post("/api/jobs/1/research/generate")
+        assert resp.status_code == 429
+
+    def test_qa_suggest_returns_429_on_limit(self, client):
+        """When the rate limiter triggers, the QA suggest endpoint returns 429."""
+        exc = _make_rate_limit_exceeded("60/hour")
+        with _patch_limiter_to_raise(exc):
+            resp = client.post(
+                "/api/jobs/1/qa/suggest",
+                json={"question": "Why do you want this job?", "items": []},
+            )
+        assert resp.status_code == 429
+
+    def test_survey_analyze_returns_429_on_limit(self, client):
+        """When the rate limiter triggers, the survey analyze endpoint returns 429."""
+        exc = _make_rate_limit_exceeded("30/hour")
+        with _patch_limiter_to_raise(exc):
+            resp = client.post(
+                "/api/jobs/1/survey/analyze",
+                json={"text": "Q: ...", "mode": "quick"},
+            )
+        assert resp.status_code == 429
+
+    def test_cover_letter_generate_succeeds_when_not_limited(self, client):
+        """Cover letter generate endpoint works normally when not rate-limited."""
+        with patch("scripts.task_runner.submit_task", return_value=(1, True)):
+            resp = client.post("/api/jobs/1/cover_letter/generate")
+        # 200 = task queued; 403 = demo/cloud guard; 404/422 = DB/payload issue.
+        # Any of these (not 429, not 5xx) means the limiter did NOT block the request.
+        assert resp.status_code in (200, 403, 404, 422)
+
+    def test_429_response_body_has_error_key(self, client):
+        """429 responses from rate-limited endpoints include the error key."""
+        exc = _make_rate_limit_exceeded("20/hour")
+        with _patch_limiter_to_raise(exc):
+            resp = client.post("/api/jobs/1/cover_letter/generate")
+        assert resp.status_code == 429
+        body = resp.json()
+        assert body.get("error") == "rate_limit_exceeded"
+
+    def test_429_response_has_retry_after_header(self, client):
+        """429 responses include a Retry-After header."""
+        exc = _make_rate_limit_exceeded("20/hour")
+        with _patch_limiter_to_raise(exc):
+            resp = client.post("/api/jobs/1/cover_letter/generate")
+        assert resp.status_code == 429
+        assert "retry-after" in resp.headers  # header lookup is case-insensitive
--- a/tests/test_wizard_ai.py
+++ b/tests/test_wizard_ai.py
@ -0,0 +1,361 @@
+"""Tests for AI interview wizard endpoints (POST /api/wizard/ai/*)."""
+import json
+import sys
+import yaml
+import pytest
+from pathlib import Path
+from unittest.mock import patch, MagicMock
+
+# ── Path bootstrap ────────────────────────────────────────────────────────────
+_REPO = Path(__file__).parent.parent
+if str(_REPO) not in sys.path:
+    sys.path.insert(0, str(_REPO))
+
+
+@pytest.fixture(scope="module")
+def client():
+    from dev_api import app
+    from fastapi.testclient import TestClient
+    return TestClient(app)
+
+
+# ── Helpers ───────────────────────────────────────────────────────────────────
+
+def _write_user_yaml(path: Path, data: dict | None = None) -> None:
+    path.parent.mkdir(parents=True, exist_ok=True)
+    payload = data if data is not None else {}
+    path.write_text(yaml.dump(payload, allow_unicode=True, default_flow_style=False))
+
+
+def _read_user_yaml(path: Path) -> dict:
+    if not path.exists():
+        return {}
+    return yaml.safe_load(path.read_text()) or {}
+
+
+# ── GET /api/config/app — byokUnlocked field ──────────────────────────────────
+
+class TestAppConfigByokField:
+    def test_byok_unlocked_false_when_no_llm_configured(self, client, tmp_path):
+        yaml_path = tmp_path / "config" / "user.yaml"
+        _write_user_yaml(yaml_path, {"wizard_complete": True})
+        with patch("dev_api._user_yaml_path", return_value=str(yaml_path)):
+            with patch("app.wizard.tiers.has_configured_llm", return_value=False):
+                r = client.get("/api/config/app")
+        assert r.status_code == 200
+        assert r.json()["byokUnlocked"] is False
+
+    def test_byok_unlocked_true_when_llm_configured(self, client, tmp_path):
+        yaml_path = tmp_path / "config" / "user.yaml"
+        _write_user_yaml(yaml_path, {"wizard_complete": True})
+        with patch("dev_api._user_yaml_path", return_value=str(yaml_path)):
+            with patch("app.wizard.tiers.has_configured_llm", return_value=True):
+                r = client.get("/api/config/app")
+        assert r.status_code == 200
+        assert r.json()["byokUnlocked"] is True
+
+
+# ── POST /api/wizard/ai/interview — tier gate ─────────────────────────────────
+
+class TestWizardAIInterviewTierGate:
+    def test_returns_402_when_tier_blocked(self, client):
+        """Free tier with no BYOK: expect 402."""
+        with patch("dev_api._get_effective_tier", return_value="free"):
+            with patch("app.wizard.tiers.has_configured_llm", return_value=False):
+                r = client.post(
+                    "/api/wizard/ai/interview",
+                    json={"history": [{"role": "user", "content": "Hello"}]},
+                )
+        assert r.status_code == 402
+        assert r.json()["detail"]["error"] == "tier_required"
+
+    def test_returns_402_for_free_tier_without_byok(self, client):
+        """Explicit check that free tier without LLM configured is gated."""
+        with patch("dev_api._get_effective_tier", return_value="free"):
+            with patch("app.wizard.tiers.has_configured_llm", return_value=False):
+                r = client.post(
+                    "/api/wizard/ai/interview",
+                    json={"history": [], "profile_so_far": {}},
+                )
+        assert r.status_code == 402
+
+    def test_free_tier_with_byok_is_allowed(self, client):
+        """Free tier with BYOK configured: tier gate passes (mocked LLM response)."""
+        llm_reply = json.dumps({
+            "reply": "Hello! What's your name?",
+            "extracted_fields": {},
+            "complete": False,
+        })
+        with patch("dev_api._get_effective_tier", return_value="free"):
+            with patch("app.wizard.tiers.has_configured_llm", return_value=True):
+                with patch("scripts.llm_router.LLMRouter") as mock_cls:
+                    mock_cls.return_value.complete.return_value = llm_reply
+                    r = client.post(
+                        "/api/wizard/ai/interview",
+                        json={"history": [], "profile_so_far": {}},
+                    )
+        assert r.status_code == 200
+
+
+# ── POST /api/wizard/ai/interview — LLM mocked responses ─────────────────────
+
+class TestWizardAIInterviewLLM:
+    def _paid_byok_patches(self):
+        """Context managers for paid tier + BYOK."""
+        return (
+            patch("dev_api._get_effective_tier", return_value="paid"),
+            patch("app.wizard.tiers.has_configured_llm", return_value=True),
+        )
+
+    def test_returns_valid_reply_structure(self, client):
+        llm_reply = json.dumps({
+            "reply": "Great to meet you! What's your preferred contact email?",
+            "extracted_fields": {"name": "Alex Rivera"},
+            "complete": False,
+        })
+        with patch("dev_api._get_effective_tier", return_value="paid"):
+            with patch("app.wizard.tiers.has_configured_llm", return_value=True):
+                with patch("scripts.llm_router.LLMRouter") as mock_cls:
+                    mock_cls.return_value.complete.return_value = llm_reply
+                    r = client.post(
+                        "/api/wizard/ai/interview",
+                        json={
+                            "history": [
+                                {"role": "user", "content": "My name is Alex Rivera"},
+                            ],
+                        },
+                    )
+        assert r.status_code == 200
+        body = r.json()
+        assert body["reply"] == "Great to meet you! What's your preferred contact email?"
+        assert body["extracted_fields"] == {"name": "Alex Rivera"}
+        assert body["complete"] is False
+
+    def test_returns_complete_true_when_llm_signals_done(self, client):
+        llm_reply = json.dumps({
+            "reply": "You're all set! Your profile is complete.",
+            "extracted_fields": {
+                "name": "Alex",
+                "email": "alex@example.com",
+                "career_summary": "Backend engineer with 5 years experience.",
+            },
+            "complete": True,
+        })
+        with patch("dev_api._get_effective_tier", return_value="paid"):
+            with patch("app.wizard.tiers.has_configured_llm", return_value=True):
+                with patch("scripts.llm_router.LLMRouter") as mock_cls:
+                    mock_cls.return_value.complete.return_value = llm_reply
+                    r = client.post(
+                        "/api/wizard/ai/interview",
+                        json={
+                            "history": [
+                                {"role": "user", "content": "I'm done"},
+                            ],
+                        },
+                    )
+        assert r.status_code == 200
+        body = r.json()
+        assert body["complete"] is True
+        assert "name" in body["extracted_fields"]
+
+    def test_fallback_when_llm_returns_non_json(self, client):
+        """If LLM returns non-JSON, the endpoint still returns 200 with raw reply."""
+        with patch("dev_api._get_effective_tier", return_value="paid"):
+            with patch("app.wizard.tiers.has_configured_llm", return_value=True):
+                with patch("scripts.llm_router.LLMRouter") as mock_cls:
+                    mock_cls.return_value.complete.return_value = "Hello, what is your name?"
+                    r = client.post(
+                        "/api/wizard/ai/interview",
+                        json={"history": []},
+                    )
+        assert r.status_code == 200
+        body = r.json()
+        assert body["reply"] == "Hello, what is your name?"
+        assert body["extracted_fields"] == {}
+        assert body["complete"] is False
+
+    def test_history_passed_to_llm(self, client):
+        """Verify the history turns are included in the prompt sent to the LLM."""
+        llm_reply = json.dumps({"reply": "OK", "extracted_fields": {}, "complete": False})
+        captured_calls = []
+        with patch("dev_api._get_effective_tier", return_value="paid"):
+            with patch("app.wizard.tiers.has_configured_llm", return_value=True):
+                with patch("scripts.llm_router.LLMRouter") as mock_cls:
+                    mock_cls.return_value.complete.side_effect = (
+                        lambda prompt, system=None: (captured_calls.append(prompt) or llm_reply)
+                    )
+                    client.post(
+                        "/api/wizard/ai/interview",
+                        json={
+                            "history": [
+                                {"role": "user", "content": "I am Alex"},
+                                {"role": "assistant", "content": "Nice to meet you Alex!"},
+                                {"role": "user", "content": "My email is alex@test.com"},
+                            ],
+                        },
+                    )
+        assert len(captured_calls) == 1
+        prompt = captured_calls[0]
+        assert "I am Alex" in prompt
+        assert "alex@test.com" in prompt
+
+    def test_profile_so_far_injected_into_prompt(self, client):
+        """profile_so_far fields must appear in the prompt sent to the LLM."""
+        llm_reply = json.dumps({"reply": "Got it!", "extracted_fields": {}, "complete": False})
+        captured_calls = []
+        with patch("dev_api._get_effective_tier", return_value="paid"):
+            with patch("app.wizard.tiers.has_configured_llm", return_value=True):
+                with patch("scripts.llm_router.LLMRouter") as mock_cls:
+                    mock_cls.return_value.complete.side_effect = (
+                        lambda prompt, system=None: (captured_calls.append(prompt) or llm_reply)
+                    )
+                    client.post(
+                        "/api/wizard/ai/interview",
+                        json={
+                            "history": [
+                                {"role": "user", "content": "I am Alex"},
+                            ],
+                            "profile_so_far": {
+                                "name": "Alex Rivera",
+                                "email": "alex@example.com",
+                            },
+                        },
+                    )
+        assert len(captured_calls) == 1
+        prompt = captured_calls[0]
+        assert "Alex Rivera" in prompt
+        assert "alex@example.com" in prompt
+
+    def test_llm_error_returns_503(self, client):
+        """If LLM raises, the endpoint returns 503."""
+        with patch("dev_api._get_effective_tier", return_value="paid"):
+            with patch("app.wizard.tiers.has_configured_llm", return_value=True):
+                with patch("scripts.llm_router.LLMRouter") as mock_cls:
+                    mock_cls.return_value.complete.side_effect = RuntimeError("no backends")
+                    r = client.post(
+                        "/api/wizard/ai/interview",
+                        json={"history": [{"role": "user", "content": "hi"}]},
+                    )
+        assert r.status_code == 503
+
+
+# ── POST /api/wizard/ai/finalize ──────────────────────────────────────────────
+
+class TestWizardAIFinalize:
+    def test_merges_allowed_fields_into_user_yaml(self, client, tmp_path):
+        yaml_path = tmp_path / "config" / "user.yaml"
+        _write_user_yaml(yaml_path, {"tier": "paid", "wizard_complete": True})
+        with patch("dev_api._user_yaml_path", return_value=str(yaml_path)):
+            r = client.post(
+                "/api/wizard/ai/finalize",
+                json={
+                    "profile": {
+                        "name": "Jordan Lee",
+                        "email": "jordan@example.com",
+                        "career_summary": "Full-stack developer with 8 years experience.",
+                        "candidate_voice": "warm and conversational",
+                    }
+                },
+            )
+        assert r.status_code == 200
+        body = r.json()
+        assert body["saved"] is True
+        assert set(body["fields"]) == {"name", "email", "career_summary", "candidate_voice"}
+
+        saved = _read_user_yaml(yaml_path)
+        assert saved["name"] == "Jordan Lee"
+        assert saved["email"] == "jordan@example.com"
+        assert saved["career_summary"] == "Full-stack developer with 8 years experience."
+        assert saved["candidate_voice"] == "warm and conversational"
+
+    def test_does_not_clobber_existing_non_wizard_keys(self, client, tmp_path):
+        """Keys like tier, wizard_complete must not be overwritten by finalize."""
+        yaml_path = tmp_path / "config" / "user.yaml"
+        _write_user_yaml(yaml_path, {
+            "tier": "premium",
+            "wizard_complete": True,
+            "inference_profile": "single-gpu",
+        })
+        with patch("dev_api._user_yaml_path", return_value=str(yaml_path)):
+            r = client.post(
+                "/api/wizard/ai/finalize",
+                json={
+                    "profile": {
+                        "name": "Sam Park",
+                        "tier": "free",          # attempt to downgrade — must be blocked
+                        "wizard_complete": False,  # attempt to reset — must be blocked
+                    }
+                },
+            )
+        assert r.status_code == 200
+        saved = _read_user_yaml(yaml_path)
+        # Non-wizard keys are preserved
+        assert saved["tier"] == "premium"
+        assert saved["wizard_complete"] is True
+        assert saved["inference_profile"] == "single-gpu"
+        # Allowed wizard key is written
+        assert saved["name"] == "Sam Park"
+
+    def test_unknown_keys_are_silently_ignored(self, client, tmp_path):
+        yaml_path = tmp_path / "config" / "user.yaml"
+        _write_user_yaml(yaml_path, {})
+        with patch("dev_api._user_yaml_path", return_value=str(yaml_path)):
+            r = client.post(
+                "/api/wizard/ai/finalize",
+                json={
+                    "profile": {
+                        "email": "test@example.com",
+                        "injected_field": "should be ignored",
+                        "admin": True,
+                    }
+                },
+            )
+        assert r.status_code == 200
+        saved = _read_user_yaml(yaml_path)
+        assert saved["email"] == "test@example.com"
+        assert "injected_field" not in saved
+        assert "admin" not in saved
+
+    def test_all_allowed_fields_are_written(self, client, tmp_path):
+        """All allowed wizard fields are accepted when provided."""
+        yaml_path = tmp_path / "config" / "user.yaml"
+        _write_user_yaml(yaml_path, {})
+        full_profile = {
+            "name": "Casey Morgan",
+            "email": "casey@example.com",
+            "career_summary": "Designer turned product manager.",
+            "candidate_voice": "professional and direct",
+            "mission_preferences": ["education", "social_impact"],
+            "candidate_accessibility_focus": True,
+            "candidate_lgbtq_focus": True,
+            "linkedin": "https://linkedin.com/in/casey",
+        }
+        with patch("dev_api._user_yaml_path", return_value=str(yaml_path)):
+            r = client.post("/api/wizard/ai/finalize", json={"profile": full_profile})
+        assert r.status_code == 200
+        saved = _read_user_yaml(yaml_path)
+        for key, value in full_profile.items():
+            assert saved[key] == value, f"Expected {key}={value!r}, got {saved.get(key)!r}"
+
+    def test_empty_profile_returns_saved_true(self, client, tmp_path):
+        yaml_path = tmp_path / "config" / "user.yaml"
+        _write_user_yaml(yaml_path, {"name": "Existing"})
+        with patch("dev_api._user_yaml_path", return_value=str(yaml_path)):
+            r = client.post("/api/wizard/ai/finalize", json={"profile": {}})
+        assert r.status_code == 200
+        assert r.json()["saved"] is True
+        assert r.json()["fields"] == []
+        # Existing data is preserved
+        assert _read_user_yaml(yaml_path)["name"] == "Existing"
+
+    def test_mission_preferences_list_written_correctly(self, client, tmp_path):
+        yaml_path = tmp_path / "config" / "user.yaml"
+        _write_user_yaml(yaml_path, {})
+        with patch("dev_api._user_yaml_path", return_value=str(yaml_path)):
+            r = client.post(
+                "/api/wizard/ai/finalize",
+                json={"profile": {"mission_preferences": ["music", "animal_welfare"]}},
+            )
+        assert r.status_code == 200
+        saved = _read_user_yaml(yaml_path)
+        assert saved["mission_preferences"] == ["music", "animal_welfare"]
--- a/tools/label_tool.py
+++ b/tools/label_tool.py
@ -625,7 +625,7 @@ with tab_stats:
        st.markdown(f"**{len(labeled)} labeled emails total**")

        # Show known labels first, then any custom labels
-        all_display_labels = list(LABELS) + [l for l in counts if l not in LABELS]
+        all_display_labels = list(LABELS) + [lbl for lbl in counts if lbl not in LABELS]
        max_count = max(counts.values()) if counts else 1
        for lbl in all_display_labels:
            if lbl not in counts:
--- a/web/package-lock.json
+++ b/web/package-lock.json
@ -346,9 +346,9 @@
      }
    },
    "node_modules/@esbuild/aix-ppc64": {
-      "version": "0.27.4",
-      "resolved": "https://registry.npmjs.org/@esbuild/aix-ppc64/-/aix-ppc64-0.27.4.tgz",
-      "integrity": "sha512-cQPwL2mp2nSmHHJlCyoXgHGhbEPMrEEU5xhkcy3Hs/O7nGZqEpZ2sUtLaL9MORLtDfRvVl2/3PAuEkYZH0Ty8Q==",
+      "version": "0.27.7",
+      "resolved": "https://registry.npmjs.org/@esbuild/aix-ppc64/-/aix-ppc64-0.27.7.tgz",
+      "integrity": "sha512-EKX3Qwmhz1eMdEJokhALr0YiD0lhQNwDqkPYyPhiSwKrh7/4KRjQc04sZ8db+5DVVnZ1LmbNDI1uAMPEUBnQPg==",
      "cpu": [
        "ppc64"
      ],
@ -363,9 +363,9 @@
      }
    },
    "node_modules/@esbuild/android-arm": {
-      "version": "0.27.4",
-      "resolved": "https://registry.npmjs.org/@esbuild/android-arm/-/android-arm-0.27.4.tgz",
-      "integrity": "sha512-X9bUgvxiC8CHAGKYufLIHGXPJWnr0OCdR0anD2e21vdvgCI8lIfqFbnoeOz7lBjdrAGUhqLZLcQo6MLhTO2DKQ==",
+      "version": "0.27.7",
+      "resolved": "https://registry.npmjs.org/@esbuild/android-arm/-/android-arm-0.27.7.tgz",
+      "integrity": "sha512-jbPXvB4Yj2yBV7HUfE2KHe4GJX51QplCN1pGbYjvsyCZbQmies29EoJbkEc+vYuU5o45AfQn37vZlyXy4YJ8RQ==",
      "cpu": [
        "arm"
      ],
@ -380,9 +380,9 @@
      }
    },
    "node_modules/@esbuild/android-arm64": {
-      "version": "0.27.4",
-      "resolved": "https://registry.npmjs.org/@esbuild/android-arm64/-/android-arm64-0.27.4.tgz",
-      "integrity": "sha512-gdLscB7v75wRfu7QSm/zg6Rx29VLdy9eTr2t44sfTW7CxwAtQghZ4ZnqHk3/ogz7xao0QAgrkradbBzcqFPasw==",
+      "version": "0.27.7",
+      "resolved": "https://registry.npmjs.org/@esbuild/android-arm64/-/android-arm64-0.27.7.tgz",
+      "integrity": "sha512-62dPZHpIXzvChfvfLJow3q5dDtiNMkwiRzPylSCfriLvZeq0a1bWChrGx/BbUbPwOrsWKMn8idSllklzBy+dgQ==",
      "cpu": [
        "arm64"
      ],
@ -397,9 +397,9 @@
      }
    },
    "node_modules/@esbuild/android-x64": {
-      "version": "0.27.4",
-      "resolved": "https://registry.npmjs.org/@esbuild/android-x64/-/android-x64-0.27.4.tgz",
-      "integrity": "sha512-PzPFnBNVF292sfpfhiyiXCGSn9HZg5BcAz+ivBuSsl6Rk4ga1oEXAamhOXRFyMcjwr2DVtm40G65N3GLeH1Lvw==",
+      "version": "0.27.7",
+      "resolved": "https://registry.npmjs.org/@esbuild/android-x64/-/android-x64-0.27.7.tgz",
+      "integrity": "sha512-x5VpMODneVDb70PYV2VQOmIUUiBtY3D3mPBG8NxVk5CogneYhkR7MmM3yR/uMdITLrC1ml/NV1rj4bMJuy9MCg==",
      "cpu": [
        "x64"
      ],
@ -414,9 +414,9 @@
      }
    },
    "node_modules/@esbuild/darwin-arm64": {
-      "version": "0.27.4",
-      "resolved": "https://registry.npmjs.org/@esbuild/darwin-arm64/-/darwin-arm64-0.27.4.tgz",
-      "integrity": "sha512-b7xaGIwdJlht8ZFCvMkpDN6uiSmnxxK56N2GDTMYPr2/gzvfdQN8rTfBsvVKmIVY/X7EM+/hJKEIbbHs9oA4tQ==",
+      "version": "0.27.7",
+      "resolved": "https://registry.npmjs.org/@esbuild/darwin-arm64/-/darwin-arm64-0.27.7.tgz",
+      "integrity": "sha512-5lckdqeuBPlKUwvoCXIgI2D9/ABmPq3Rdp7IfL70393YgaASt7tbju3Ac+ePVi3KDH6N2RqePfHnXkaDtY9fkw==",
      "cpu": [
        "arm64"
      ],
@ -431,9 +431,9 @@
      }
    },
    "node_modules/@esbuild/darwin-x64": {
-      "version": "0.27.4",
-      "resolved": "https://registry.npmjs.org/@esbuild/darwin-x64/-/darwin-x64-0.27.4.tgz",
-      "integrity": "sha512-sR+OiKLwd15nmCdqpXMnuJ9W2kpy0KigzqScqHI3Hqwr7IXxBp3Yva+yJwoqh7rE8V77tdoheRYataNKL4QrPw==",
+      "version": "0.27.7",
+      "resolved": "https://registry.npmjs.org/@esbuild/darwin-x64/-/darwin-x64-0.27.7.tgz",
+      "integrity": "sha512-rYnXrKcXuT7Z+WL5K980jVFdvVKhCHhUwid+dDYQpH+qu+TefcomiMAJpIiC2EM3Rjtq0sO3StMV/+3w3MyyqQ==",
      "cpu": [
        "x64"
      ],
@ -448,9 +448,9 @@
      }
    },
    "node_modules/@esbuild/freebsd-arm64": {
-      "version": "0.27.4",
-      "resolved": "https://registry.npmjs.org/@esbuild/freebsd-arm64/-/freebsd-arm64-0.27.4.tgz",
-      "integrity": "sha512-jnfpKe+p79tCnm4GVav68A7tUFeKQwQyLgESwEAUzyxk/TJr4QdGog9sqWNcUbr/bZt/O/HXouspuQDd9JxFSw==",
+      "version": "0.27.7",
+      "resolved": "https://registry.npmjs.org/@esbuild/freebsd-arm64/-/freebsd-arm64-0.27.7.tgz",
+      "integrity": "sha512-B48PqeCsEgOtzME2GbNM2roU29AMTuOIN91dsMO30t+Ydis3z/3Ngoj5hhnsOSSwNzS+6JppqWsuhTp6E82l2w==",
      "cpu": [
        "arm64"
      ],
@ -465,9 +465,9 @@
      }
    },
    "node_modules/@esbuild/freebsd-x64": {
-      "version": "0.27.4",
-      "resolved": "https://registry.npmjs.org/@esbuild/freebsd-x64/-/freebsd-x64-0.27.4.tgz",
-      "integrity": "sha512-2kb4ceA/CpfUrIcTUl1wrP/9ad9Atrp5J94Lq69w7UwOMolPIGrfLSvAKJp0RTvkPPyn6CIWrNy13kyLikZRZQ==",
+      "version": "0.27.7",
+      "resolved": "https://registry.npmjs.org/@esbuild/freebsd-x64/-/freebsd-x64-0.27.7.tgz",
+      "integrity": "sha512-jOBDK5XEjA4m5IJK3bpAQF9/Lelu/Z9ZcdhTRLf4cajlB+8VEhFFRjWgfy3M1O4rO2GQ/b2dLwCUGpiF/eATNQ==",
      "cpu": [
        "x64"
      ],
@ -482,9 +482,9 @@
      }
    },
    "node_modules/@esbuild/linux-arm": {
-      "version": "0.27.4",
-      "resolved": "https://registry.npmjs.org/@esbuild/linux-arm/-/linux-arm-0.27.4.tgz",
-      "integrity": "sha512-aBYgcIxX/wd5n2ys0yESGeYMGF+pv6g0DhZr3G1ZG4jMfruU9Tl1i2Z+Wnj9/KjGz1lTLCcorqE2viePZqj4Eg==",
+      "version": "0.27.7",
+      "resolved": "https://registry.npmjs.org/@esbuild/linux-arm/-/linux-arm-0.27.7.tgz",
+      "integrity": "sha512-RkT/YXYBTSULo3+af8Ib0ykH8u2MBh57o7q/DAs3lTJlyVQkgQvlrPTnjIzzRPQyavxtPtfg0EopvDyIt0j1rA==",
      "cpu": [
        "arm"
      ],
@ -499,9 +499,9 @@
      }
    },
    "node_modules/@esbuild/linux-arm64": {
-      "version": "0.27.4",
-      "resolved": "https://registry.npmjs.org/@esbuild/linux-arm64/-/linux-arm64-0.27.4.tgz",
-      "integrity": "sha512-7nQOttdzVGth1iz57kxg9uCz57dxQLHWxopL6mYuYthohPKEK0vU0C3O21CcBK6KDlkYVcnDXY099HcCDXd9dA==",
+      "version": "0.27.7",
+      "resolved": "https://registry.npmjs.org/@esbuild/linux-arm64/-/linux-arm64-0.27.7.tgz",
+      "integrity": "sha512-RZPHBoxXuNnPQO9rvjh5jdkRmVizktkT7TCDkDmQ0W2SwHInKCAV95GRuvdSvA7w4VMwfCjUiPwDi0ZO6Nfe9A==",
      "cpu": [
        "arm64"
      ],
@ -516,9 +516,9 @@
      }
    },
    "node_modules/@esbuild/linux-ia32": {
-      "version": "0.27.4",
-      "resolved": "https://registry.npmjs.org/@esbuild/linux-ia32/-/linux-ia32-0.27.4.tgz",
-      "integrity": "sha512-oPtixtAIzgvzYcKBQM/qZ3R+9TEUd1aNJQu0HhGyqtx6oS7qTpvjheIWBbes4+qu1bNlo2V4cbkISr8q6gRBFA==",
+      "version": "0.27.7",
+      "resolved": "https://registry.npmjs.org/@esbuild/linux-ia32/-/linux-ia32-0.27.7.tgz",
+      "integrity": "sha512-GA48aKNkyQDbd3KtkplYWT102C5sn/EZTY4XROkxONgruHPU72l+gW+FfF8tf2cFjeHaRbWpOYa/uRBz/Xq1Pg==",
      "cpu": [
        "ia32"
      ],
@ -533,9 +533,9 @@
      }
    },
    "node_modules/@esbuild/linux-loong64": {
-      "version": "0.27.4",
-      "resolved": "https://registry.npmjs.org/@esbuild/linux-loong64/-/linux-loong64-0.27.4.tgz",
-      "integrity": "sha512-8mL/vh8qeCoRcFH2nM8wm5uJP+ZcVYGGayMavi8GmRJjuI3g1v6Z7Ni0JJKAJW+m0EtUuARb6Lmp4hMjzCBWzA==",
+      "version": "0.27.7",
+      "resolved": "https://registry.npmjs.org/@esbuild/linux-loong64/-/linux-loong64-0.27.7.tgz",
+      "integrity": "sha512-a4POruNM2oWsD4WKvBSEKGIiWQF8fZOAsycHOt6JBpZ+JN2n2JH9WAv56SOyu9X5IqAjqSIPTaJkqN8F7XOQ5Q==",
      "cpu": [
        "loong64"
      ],
@ -550,9 +550,9 @@
      }
    },
    "node_modules/@esbuild/linux-mips64el": {
-      "version": "0.27.4",
-      "resolved": "https://registry.npmjs.org/@esbuild/linux-mips64el/-/linux-mips64el-0.27.4.tgz",
-      "integrity": "sha512-1RdrWFFiiLIW7LQq9Q2NES+HiD4NyT8Itj9AUeCl0IVCA459WnPhREKgwrpaIfTOe+/2rdntisegiPWn/r/aAw==",
+      "version": "0.27.7",
+      "resolved": "https://registry.npmjs.org/@esbuild/linux-mips64el/-/linux-mips64el-0.27.7.tgz",
+      "integrity": "sha512-KabT5I6StirGfIz0FMgl1I+R1H73Gp0ofL9A3nG3i/cYFJzKHhouBV5VWK1CSgKvVaG4q1RNpCTR2LuTVB3fIw==",
      "cpu": [
        "mips64el"
      ],
@ -567,9 +567,9 @@
      }
    },
    "node_modules/@esbuild/linux-ppc64": {
-      "version": "0.27.4",
-      "resolved": "https://registry.npmjs.org/@esbuild/linux-ppc64/-/linux-ppc64-0.27.4.tgz",
-      "integrity": "sha512-tLCwNG47l3sd9lpfyx9LAGEGItCUeRCWeAx6x2Jmbav65nAwoPXfewtAdtbtit/pJFLUWOhpv0FpS6GQAmPrHA==",
+      "version": "0.27.7",
+      "resolved": "https://registry.npmjs.org/@esbuild/linux-ppc64/-/linux-ppc64-0.27.7.tgz",
+      "integrity": "sha512-gRsL4x6wsGHGRqhtI+ifpN/vpOFTQtnbsupUF5R5YTAg+y/lKelYR1hXbnBdzDjGbMYjVJLJTd2OFmMewAgwlQ==",
      "cpu": [
        "ppc64"
      ],
@ -584,9 +584,9 @@
      }
    },
    "node_modules/@esbuild/linux-riscv64": {
-      "version": "0.27.4",
-      "resolved": "https://registry.npmjs.org/@esbuild/linux-riscv64/-/linux-riscv64-0.27.4.tgz",
-      "integrity": "sha512-BnASypppbUWyqjd1KIpU4AUBiIhVr6YlHx/cnPgqEkNoVOhHg+YiSVxM1RLfiy4t9cAulbRGTNCKOcqHrEQLIw==",
+      "version": "0.27.7",
+      "resolved": "https://registry.npmjs.org/@esbuild/linux-riscv64/-/linux-riscv64-0.27.7.tgz",
+      "integrity": "sha512-hL25LbxO1QOngGzu2U5xeXtxXcW+/GvMN3ejANqXkxZ/opySAZMrc+9LY/WyjAan41unrR3YrmtTsUpwT66InQ==",
      "cpu": [
        "riscv64"
      ],
@ -601,9 +601,9 @@
      }
    },
    "node_modules/@esbuild/linux-s390x": {
-      "version": "0.27.4",
-      "resolved": "https://registry.npmjs.org/@esbuild/linux-s390x/-/linux-s390x-0.27.4.tgz",
-      "integrity": "sha512-+eUqgb/Z7vxVLezG8bVB9SfBie89gMueS+I0xYh2tJdw3vqA/0ImZJ2ROeWwVJN59ihBeZ7Tu92dF/5dy5FttA==",
+      "version": "0.27.7",
+      "resolved": "https://registry.npmjs.org/@esbuild/linux-s390x/-/linux-s390x-0.27.7.tgz",
+      "integrity": "sha512-2k8go8Ycu1Kb46vEelhu1vqEP+UeRVj2zY1pSuPdgvbd5ykAw82Lrro28vXUrRmzEsUV0NzCf54yARIK8r0fdw==",
      "cpu": [
        "s390x"
      ],
@ -618,9 +618,9 @@
      }
    },
    "node_modules/@esbuild/linux-x64": {
-      "version": "0.27.4",
-      "resolved": "https://registry.npmjs.org/@esbuild/linux-x64/-/linux-x64-0.27.4.tgz",
-      "integrity": "sha512-S5qOXrKV8BQEzJPVxAwnryi2+Iq5pB40gTEIT69BQONqR7JH1EPIcQ/Uiv9mCnn05jff9umq/5nqzxlqTOg9NA==",
+      "version": "0.27.7",
+      "resolved": "https://registry.npmjs.org/@esbuild/linux-x64/-/linux-x64-0.27.7.tgz",
+      "integrity": "sha512-hzznmADPt+OmsYzw1EE33ccA+HPdIqiCRq7cQeL1Jlq2gb1+OyWBkMCrYGBJ+sxVzve2ZJEVeePbLM2iEIZSxA==",
      "cpu": [
        "x64"
      ],
@ -635,9 +635,9 @@
      }
    },
    "node_modules/@esbuild/netbsd-arm64": {
-      "version": "0.27.4",
-      "resolved": "https://registry.npmjs.org/@esbuild/netbsd-arm64/-/netbsd-arm64-0.27.4.tgz",
-      "integrity": "sha512-xHT8X4sb0GS8qTqiwzHqpY00C95DPAq7nAwX35Ie/s+LO9830hrMd3oX0ZMKLvy7vsonee73x0lmcdOVXFzd6Q==",
+      "version": "0.27.7",
+      "resolved": "https://registry.npmjs.org/@esbuild/netbsd-arm64/-/netbsd-arm64-0.27.7.tgz",
+      "integrity": "sha512-b6pqtrQdigZBwZxAn1UpazEisvwaIDvdbMbmrly7cDTMFnw/+3lVxxCTGOrkPVnsYIosJJXAsILG9XcQS+Yu6w==",
      "cpu": [
        "arm64"
      ],
@ -652,9 +652,9 @@
      }
    },
    "node_modules/@esbuild/netbsd-x64": {
-      "version": "0.27.4",
-      "resolved": "https://registry.npmjs.org/@esbuild/netbsd-x64/-/netbsd-x64-0.27.4.tgz",
-      "integrity": "sha512-RugOvOdXfdyi5Tyv40kgQnI0byv66BFgAqjdgtAKqHoZTbTF2QqfQrFwa7cHEORJf6X2ht+l9ABLMP0dnKYsgg==",
+      "version": "0.27.7",
+      "resolved": "https://registry.npmjs.org/@esbuild/netbsd-x64/-/netbsd-x64-0.27.7.tgz",
+      "integrity": "sha512-OfatkLojr6U+WN5EDYuoQhtM+1xco+/6FSzJJnuWiUw5eVcicbyK3dq5EeV/QHT1uy6GoDhGbFpprUiHUYggrw==",
      "cpu": [
        "x64"
      ],
@ -669,9 +669,9 @@
      }
    },
    "node_modules/@esbuild/openbsd-arm64": {
-      "version": "0.27.4",
-      "resolved": "https://registry.npmjs.org/@esbuild/openbsd-arm64/-/openbsd-arm64-0.27.4.tgz",
-      "integrity": "sha512-2MyL3IAaTX+1/qP0O1SwskwcwCoOI4kV2IBX1xYnDDqthmq5ArrW94qSIKCAuRraMgPOmG0RDTA74mzYNQA9ow==",
+      "version": "0.27.7",
+      "resolved": "https://registry.npmjs.org/@esbuild/openbsd-arm64/-/openbsd-arm64-0.27.7.tgz",
+      "integrity": "sha512-AFuojMQTxAz75Fo8idVcqoQWEHIXFRbOc1TrVcFSgCZtQfSdc1RXgB3tjOn/krRHENUB4j00bfGjyl2mJrU37A==",
      "cpu": [
        "arm64"
      ],
@ -686,9 +686,9 @@
      }
    },
    "node_modules/@esbuild/openbsd-x64": {
-      "version": "0.27.4",
-      "resolved": "https://registry.npmjs.org/@esbuild/openbsd-x64/-/openbsd-x64-0.27.4.tgz",
-      "integrity": "sha512-u8fg/jQ5aQDfsnIV6+KwLOf1CmJnfu1ShpwqdwC0uA7ZPwFws55Ngc12vBdeUdnuWoQYx/SOQLGDcdlfXhYmXQ==",
+      "version": "0.27.7",
+      "resolved": "https://registry.npmjs.org/@esbuild/openbsd-x64/-/openbsd-x64-0.27.7.tgz",
+      "integrity": "sha512-+A1NJmfM8WNDv5CLVQYJ5PshuRm/4cI6WMZRg1by1GwPIQPCTs1GLEUHwiiQGT5zDdyLiRM/l1G0Pv54gvtKIg==",
      "cpu": [
        "x64"
      ],
@ -703,9 +703,9 @@
      }
    },
    "node_modules/@esbuild/openharmony-arm64": {
-      "version": "0.27.4",
-      "resolved": "https://registry.npmjs.org/@esbuild/openharmony-arm64/-/openharmony-arm64-0.27.4.tgz",
-      "integrity": "sha512-JkTZrl6VbyO8lDQO3yv26nNr2RM2yZzNrNHEsj9bm6dOwwu9OYN28CjzZkH57bh4w0I2F7IodpQvUAEd1mbWXg==",
+      "version": "0.27.7",
+      "resolved": "https://registry.npmjs.org/@esbuild/openharmony-arm64/-/openharmony-arm64-0.27.7.tgz",
+      "integrity": "sha512-+KrvYb/C8zA9CU/g0sR6w2RBw7IGc5J2BPnc3dYc5VJxHCSF1yNMxTV5LQ7GuKteQXZtspjFbiuW5/dOj7H4Yw==",
      "cpu": [
        "arm64"
      ],
@ -720,9 +720,9 @@
      }
    },
    "node_modules/@esbuild/sunos-x64": {
-      "version": "0.27.4",
-      "resolved": "https://registry.npmjs.org/@esbuild/sunos-x64/-/sunos-x64-0.27.4.tgz",
-      "integrity": "sha512-/gOzgaewZJfeJTlsWhvUEmUG4tWEY2Spp5M20INYRg2ZKl9QPO3QEEgPeRtLjEWSW8FilRNacPOg8R1uaYkA6g==",
+      "version": "0.27.7",
+      "resolved": "https://registry.npmjs.org/@esbuild/sunos-x64/-/sunos-x64-0.27.7.tgz",
+      "integrity": "sha512-ikktIhFBzQNt/QDyOL580ti9+5mL/YZeUPKU2ivGtGjdTYoqz6jObj6nOMfhASpS4GU4Q/Clh1QtxWAvcYKamA==",
      "cpu": [
        "x64"
      ],
@ -737,9 +737,9 @@
      }
    },
    "node_modules/@esbuild/win32-arm64": {
-      "version": "0.27.4",
-      "resolved": "https://registry.npmjs.org/@esbuild/win32-arm64/-/win32-arm64-0.27.4.tgz",
-      "integrity": "sha512-Z9SExBg2y32smoDQdf1HRwHRt6vAHLXcxD2uGgO/v2jK7Y718Ix4ndsbNMU/+1Qiem9OiOdaqitioZwxivhXYg==",
+      "version": "0.27.7",
+      "resolved": "https://registry.npmjs.org/@esbuild/win32-arm64/-/win32-arm64-0.27.7.tgz",
+      "integrity": "sha512-7yRhbHvPqSpRUV7Q20VuDwbjW5kIMwTHpptuUzV+AA46kiPze5Z7qgt6CLCK3pWFrHeNfDd1VKgyP4O+ng17CA==",
      "cpu": [
        "arm64"
      ],
@ -754,9 +754,9 @@
      }
    },
    "node_modules/@esbuild/win32-ia32": {
-      "version": "0.27.4",
-      "resolved": "https://registry.npmjs.org/@esbuild/win32-ia32/-/win32-ia32-0.27.4.tgz",
-      "integrity": "sha512-DAyGLS0Jz5G5iixEbMHi5KdiApqHBWMGzTtMiJ72ZOLhbu/bzxgAe8Ue8CTS3n3HbIUHQz/L51yMdGMeoxXNJw==",
+      "version": "0.27.7",
+      "resolved": "https://registry.npmjs.org/@esbuild/win32-ia32/-/win32-ia32-0.27.7.tgz",
+      "integrity": "sha512-SmwKXe6VHIyZYbBLJrhOoCJRB/Z1tckzmgTLfFYOfpMAx63BJEaL9ExI8x7v0oAO3Zh6D/Oi1gVxEYr5oUCFhw==",
      "cpu": [
        "ia32"
      ],
@ -771,9 +771,9 @@
      }
    },
    "node_modules/@esbuild/win32-x64": {
-      "version": "0.27.4",
-      "resolved": "https://registry.npmjs.org/@esbuild/win32-x64/-/win32-x64-0.27.4.tgz",
-      "integrity": "sha512-+knoa0BDoeXgkNvvV1vvbZX4+hizelrkwmGJBdT17t8FNPwG2lKemmuMZlmaNQ3ws3DKKCxpb4zRZEIp3UxFCg==",
+      "version": "0.27.7",
+      "resolved": "https://registry.npmjs.org/@esbuild/win32-x64/-/win32-x64-0.27.7.tgz",
+      "integrity": "sha512-56hiAJPhwQ1R4i+21FVF7V8kSD5zZTdHcVuRFMW0hn753vVfQN8xlx4uOPT4xoGH0Z/oVATuR82AiqSTDIpaHg==",
      "cpu": [
        "x64"
      ],
@ -2728,9 +2728,9 @@
      }
    },
    "node_modules/brace-expansion": {
-      "version": "2.0.2",
-      "resolved": "https://registry.npmjs.org/brace-expansion/-/brace-expansion-2.0.2.tgz",
-      "integrity": "sha512-Jt0vHyM+jmUBqojB7E1NIYadt0vI0Qxjxd2TErW94wDz+E2LAm5vKMXXwg6ZZBTHPuUlDgQHKXvjGBdfcF1ZDQ==",
+      "version": "2.1.1",
+      "resolved": "https://registry.npmjs.org/brace-expansion/-/brace-expansion-2.1.1.tgz",
+      "integrity": "sha512-WR1cURNjuvBLMZBMbqM0UoE+WAfdUcEV1ccD8PVBVOI+Z3ND4+SZbN8RsfT2bMuG1qwz5RFvPukSZm5fF2D5eA==",
      "dev": true,
      "license": "MIT",
      "dependencies": {
@ -2949,9 +2949,9 @@
      "license": "MIT"
    },
    "node_modules/defu": {
-      "version": "6.1.4",
-      "resolved": "https://registry.npmjs.org/defu/-/defu-6.1.4.tgz",
-      "integrity": "sha512-mEQCMmwJu317oSz8CwdIOdwf3xMif1ttiM8LTufzc3g6kR+9Pe236twL8j3IYT1F7GfRgGcW6MWxzZjLIkuHIg==",
+      "version": "6.1.7",
+      "resolved": "https://registry.npmjs.org/defu/-/defu-6.1.7.tgz",
+      "integrity": "sha512-7z22QmUWiQ/2d0KkdYmANbRUVABpZ9SNYyH5vx6PZ+nE5bcC0l7uFvEfHlyld/HcGBFTL536ClDt3DEcSlEJAQ==",
      "dev": true,
      "license": "MIT"
    },
@ -3032,9 +3032,9 @@
      "license": "MIT"
    },
    "node_modules/esbuild": {
-      "version": "0.27.4",
-      "resolved": "https://registry.npmjs.org/esbuild/-/esbuild-0.27.4.tgz",
-      "integrity": "sha512-Rq4vbHnYkK5fws5NF7MYTU68FPRE1ajX7heQ/8QXXWqNgqqJ/GkmmyxIzUnf2Sr/bakf8l54716CcMGHYhMrrQ==",
+      "version": "0.27.7",
+      "resolved": "https://registry.npmjs.org/esbuild/-/esbuild-0.27.7.tgz",
+      "integrity": "sha512-IxpibTjyVnmrIQo5aqNpCgoACA/dTKLTlhMHihVHhdkxKyPO1uBBthumT0rdHmcsk9uMonIWS0m4FljWzILh3w==",
      "dev": true,
      "hasInstallScript": true,
      "license": "MIT",
@ -3045,32 +3045,32 @@
        "node": ">=18"
      },
      "optionalDependencies": {
-        "@esbuild/aix-ppc64": "0.27.4",
-        "@esbuild/android-arm": "0.27.4",
-        "@esbuild/android-arm64": "0.27.4",
-        "@esbuild/android-x64": "0.27.4",
-        "@esbuild/darwin-arm64": "0.27.4",
-        "@esbuild/darwin-x64": "0.27.4",
-        "@esbuild/freebsd-arm64": "0.27.4",
-        "@esbuild/freebsd-x64": "0.27.4",
-        "@esbuild/linux-arm": "0.27.4",
-        "@esbuild/linux-arm64": "0.27.4",
-        "@esbuild/linux-ia32": "0.27.4",
-        "@esbuild/linux-loong64": "0.27.4",
-        "@esbuild/linux-mips64el": "0.27.4",
-        "@esbuild/linux-ppc64": "0.27.4",
-        "@esbuild/linux-riscv64": "0.27.4",
-        "@esbuild/linux-s390x": "0.27.4",
-        "@esbuild/linux-x64": "0.27.4",
-        "@esbuild/netbsd-arm64": "0.27.4",
-        "@esbuild/netbsd-x64": "0.27.4",
-        "@esbuild/openbsd-arm64": "0.27.4",
-        "@esbuild/openbsd-x64": "0.27.4",
-        "@esbuild/openharmony-arm64": "0.27.4",
-        "@esbuild/sunos-x64": "0.27.4",
-        "@esbuild/win32-arm64": "0.27.4",
-        "@esbuild/win32-ia32": "0.27.4",
-        "@esbuild/win32-x64": "0.27.4"
+        "@esbuild/aix-ppc64": "0.27.7",
+        "@esbuild/android-arm": "0.27.7",
+        "@esbuild/android-arm64": "0.27.7",
+        "@esbuild/android-x64": "0.27.7",
+        "@esbuild/darwin-arm64": "0.27.7",
+        "@esbuild/darwin-x64": "0.27.7",
+        "@esbuild/freebsd-arm64": "0.27.7",
+        "@esbuild/freebsd-x64": "0.27.7",
+        "@esbuild/linux-arm": "0.27.7",
+        "@esbuild/linux-arm64": "0.27.7",
+        "@esbuild/linux-ia32": "0.27.7",
+        "@esbuild/linux-loong64": "0.27.7",
+        "@esbuild/linux-mips64el": "0.27.7",
+        "@esbuild/linux-ppc64": "0.27.7",
+        "@esbuild/linux-riscv64": "0.27.7",
+        "@esbuild/linux-s390x": "0.27.7",
+        "@esbuild/linux-x64": "0.27.7",
+        "@esbuild/netbsd-arm64": "0.27.7",
+        "@esbuild/netbsd-x64": "0.27.7",
+        "@esbuild/openbsd-arm64": "0.27.7",
+        "@esbuild/openbsd-x64": "0.27.7",
+        "@esbuild/openharmony-arm64": "0.27.7",
+        "@esbuild/sunos-x64": "0.27.7",
+        "@esbuild/win32-arm64": "0.27.7",
+        "@esbuild/win32-ia32": "0.27.7",
+        "@esbuild/win32-x64": "0.27.7"
      }
    },
    "node_modules/estree-walker": {
@ -3325,14 +3325,11 @@
      }
    },
    "node_modules/js-cookie": {
-      "version": "3.0.5",
-      "resolved": "https://registry.npmjs.org/js-cookie/-/js-cookie-3.0.5.tgz",
-      "integrity": "sha512-cEiJEAEoIbWfCZYKWhVwFuvPX1gETRYPw6LlaTKoxD3s2AkXzkCjnp6h0V77ozyqj0jakteJ4YqDJT830+lVGw==",
+      "version": "3.0.8",
+      "resolved": "https://registry.npmjs.org/js-cookie/-/js-cookie-3.0.8.tgz",
+      "integrity": "sha512-yeJd4aNAdYZQjaon2bpD/Gb0B/omw7HQOsynXXcOiWVCacbBcPlgn8S/d1X6blFSaHao7ozqtW7NZW19xpCtIw==",
      "dev": true,
-      "license": "MIT",
-      "engines": {
-        "node": ">=14"
-      }
+      "license": "MIT"
    },
    "node_modules/jsdom": {
      "version": "28.1.0",
@ -3500,9 +3497,9 @@
      }
    },
    "node_modules/marked": {
-      "version": "18.0.0",
-      "resolved": "https://registry.npmjs.org/marked/-/marked-18.0.0.tgz",
-      "integrity": "sha512-2e7Qiv/HJSXj8rDEpgTvGKsP8yYtI9xXHKDnrftrmnrJPaFNM7VRb2YCzWaX4BP1iCJ/XPduzDJZMFoqTCcIMA==",
+      "version": "18.0.5",
+      "resolved": "https://registry.npmjs.org/marked/-/marked-18.0.5.tgz",
+      "integrity": "sha512-S6GcvALHg6K4ohtu4E7x0a1AqhAjp6cV8KhLSyN9qVapnzJkusVBxZRcIU9AeYsbe6P1hKDusSbEOzGyyuce6w==",
      "license": "MIT",
      "bin": {
        "marked": "bin/marked.js"
@ -3586,9 +3583,9 @@
      "license": "MIT"
    },
    "node_modules/nanoid": {
-      "version": "3.3.11",
-      "resolved": "https://registry.npmjs.org/nanoid/-/nanoid-3.3.11.tgz",
-      "integrity": "sha512-N8SpfPUnUp1bK+PMYW8qSWdl9U+wwNWI4QKxOYDy9JAro3WMX7p2OeVRF9v+347pnakNevPmiHhNmZ2HbFA76w==",
+      "version": "3.3.12",
+      "resolved": "https://registry.npmjs.org/nanoid/-/nanoid-3.3.12.tgz",
+      "integrity": "sha512-ZB9RH/39qpq5Vu6Y+NmUaFhQR6pp+M2Xt76XBnEwDaGcVAqhlvxrl3B2bKS5D3NH3QR76v3aSrKaF/Kiy7lEtQ==",
      "funding": [
        {
          "type": "github",
@ -3787,9 +3784,9 @@
      "license": "ISC"
    },
    "node_modules/picomatch": {
-      "version": "4.0.3",
-      "resolved": "https://registry.npmjs.org/picomatch/-/picomatch-4.0.3.tgz",
-      "integrity": "sha512-5gTmgEY/sqK6gFXLIsQNH19lWb4ebPDLA4SdLP7dsWkIXHWlG66oPuVvXSGFPppYZz8ZDZq0dYYrbHfBCVUb1Q==",
+      "version": "4.0.4",
+      "resolved": "https://registry.npmjs.org/picomatch/-/picomatch-4.0.4.tgz",
+      "integrity": "sha512-QP88BAKvMam/3NxH6vj2o21R6MjxZUAd6nlwAS/pnGvN9IVLocLHxGYIzFhg6fUQ+5th6P4dv4eW9jX3DSIj7A==",
      "license": "MIT",
      "engines": {
        "node": ">=12"
@ -3831,9 +3828,9 @@
      }
    },
    "node_modules/postcss": {
-      "version": "8.5.8",
-      "resolved": "https://registry.npmjs.org/postcss/-/postcss-8.5.8.tgz",
-      "integrity": "sha512-OW/rX8O/jXnm82Ey1k44pObPtdblfiuWnrd8X7GJ7emImCOstunGbXUpp7HdBrFQX6rJzn3sPT397Wp5aCwCHg==",
+      "version": "8.5.15",
+      "resolved": "https://registry.npmjs.org/postcss/-/postcss-8.5.15.tgz",
+      "integrity": "sha512-FfR8sjd4em2T6fb3I2MwAJU7HWVMr9zba+enmQeeWFfCbm+UOC/0X4DS8XtpUTMwWMGbjKYP7xjfNekzyGmB3A==",
      "funding": [
        {
          "type": "opencollective",
@ -3850,7 +3847,7 @@
      ],
      "license": "MIT",
      "dependencies": {
-        "nanoid": "^3.3.11",
+        "nanoid": "^3.3.12",
        "picocolors": "^1.1.1",
        "source-map-js": "^1.2.1"
      },
@ -4484,9 +4481,9 @@
      }
    },
    "node_modules/vite": {
-      "version": "7.3.1",
-      "resolved": "https://registry.npmjs.org/vite/-/vite-7.3.1.tgz",
-      "integrity": "sha512-w+N7Hifpc3gRjZ63vYBXA56dvvRlNWRczTdmCBBa+CotUzAPf5b7YMdMR/8CQoeYE5LX3W4wj6RYTgonm1b9DA==",
+      "version": "7.3.5",
+      "resolved": "https://registry.npmjs.org/vite/-/vite-7.3.5.tgz",
+      "integrity": "sha512-KuOaNhcnGFN2zIPGA7wRmzF+lJA1sea7rHq17aiJ++9lzY1WWG6Jpwqwe1KNbRVPIqHmr8GLYx7jbrQcN/7/ww==",
      "dev": true,
      "license": "MIT",
      "dependencies": {
@ -4987,9 +4984,9 @@
      "license": "MIT"
    },
    "node_modules/yaml": {
-      "version": "2.8.2",
-      "resolved": "https://registry.npmjs.org/yaml/-/yaml-2.8.2.tgz",
-      "integrity": "sha512-mplynKqc1C2hTVYxd0PU2xQAc22TI1vShAYGksCCfxbn/dFwnHTNi1bvYsBTkhdUNtGIf5xNOg938rrSSYvS9A==",
+      "version": "2.9.0",
+      "resolved": "https://registry.npmjs.org/yaml/-/yaml-2.9.0.tgz",
+      "integrity": "sha512-2AvhNX3mb8zd6Zy7INTtSpl1F15HW6Wnqj0srWlkKLcpYl/gMIMJiyuGq2KeI2YFxUPjdlB+3Lc10seMLtL4cA==",
      "license": "ISC",
      "bin": {
        "yaml": "bin.mjs"
--- a/web/src/composables/useDocsUrl.ts
+++ b/web/src/composables/useDocsUrl.ts
@ -0,0 +1,5 @@
+const DOCS_BASE = 'https://docs.circuitforge.tech/peregrine'
+
+export function useDocsUrl(path: string): string {
+  return `${DOCS_BASE}/${path.replace(/^\/+/, '')}`  // strip leading slashes so a path like '/user-guide/' cannot produce '//'
+}
--- a/web/src/router/index.ts
+++ b/web/src/router/index.ts
@ -37,6 +37,8 @@ export const router = createRouter({
        { path: 'developer',   component: () => import('../views/settings/DeveloperView.vue') },
      ],
    },
+    // AI profile wizard: post-setup settings entry point (intentionally blocked by the wizard gate during onboarding)
+    { path: '/wizard/ai-profile', component: () => import('../views/wizard/WizardAIView.vue') },
    // Onboarding wizard — full-page layout, no AppNav
    {
      path: '/setup',
--- a/web/src/stores/appConfig.ts
+++ b/web/src/stores/appConfig.ts
@ -2,7 +2,7 @@ import { ref } from 'vue'
 import { defineStore } from 'pinia'
 import { useApiFetch } from '../composables/useApi'

-export type Tier = 'free' | 'paid' | 'premium' | 'ultra'
+export type Tier = 'free' | 'paid' | 'premium'
 export type InferenceProfile = 'remote' | 'cpu' | 'single-gpu' | 'dual-gpu'

 export const useAppConfigStore = defineStore('appConfig', () => {
@ -13,6 +13,7 @@ export const useAppConfigStore = defineStore('appConfig', () => {
  const inferenceProfile = ref<InferenceProfile>('cpu')
  const isDemo = ref(false)
  const wizardComplete = ref(true)  // optimistic default — guard corrects on load
+  const byokUnlocked = ref(false)
  const loaded = ref(false)
  const devTierOverride = ref(localStorage.getItem('dev_tier_override') ?? '')

@ -20,7 +21,7 @@ export const useAppConfigStore = defineStore('appConfig', () => {
    const { data } = await useApiFetch<{
      isCloud: boolean; isDemo: boolean; isDevMode: boolean; tier: Tier
      contractedClient: boolean; inferenceProfile: InferenceProfile
-      wizardComplete: boolean
+      wizardComplete: boolean; byokUnlocked: boolean
    }>('/api/config/app')
    if (!data) return
    isCloud.value = data.isCloud
@ -30,6 +31,7 @@ export const useAppConfigStore = defineStore('appConfig', () => {
    contractedClient.value = data.contractedClient
    inferenceProfile.value = data.inferenceProfile
    wizardComplete.value = data.wizardComplete ?? true
+    byokUnlocked.value = data.byokUnlocked ?? false
    loaded.value = true
  }

@ -43,5 +45,5 @@ export const useAppConfigStore = defineStore('appConfig', () => {
    }
  }

-  return { isCloud, isDemo, isDevMode, wizardComplete, tier, contractedClient, inferenceProfile, loaded, load, devTierOverride, setDevTierOverride }
+  return { isCloud, isDemo, isDevMode, wizardComplete, byokUnlocked, tier, contractedClient, inferenceProfile, loaded, load, devTierOverride, setDevTierOverride }
 })
--- a/web/src/stores/settings/sync.ts
+++ b/web/src/stores/settings/sync.ts
@ -0,0 +1,57 @@
+import { defineStore } from 'pinia'
+import { ref } from 'vue'
+import { useApiFetch } from '../../composables/useApi'
+
+export const SYNC_DATA_CLASSES = [
+  { key: 'peregrine:dismissed', label: 'Dismissed job IDs',        description: 'Hides jobs you have already reviewed across devices.' },
+  { key: 'peregrine:drafts',    label: 'Cover letter drafts',      description: 'Saves in-progress drafts so you can continue on another device.' },
+] as const
+
+export type SyncDataClass = typeof SYNC_DATA_CLASSES[number]['key']
+
+export interface SyncPref {
+  data_class: string
+  enabled:    boolean
+}
+
+export const useSyncStore = defineStore('sync', () => {
+  const prefs   = ref<Record<string, boolean>>({})
+  const loading = ref(false)
+  const saving  = ref<string | null>(null)
+  const wiping  = ref(false)
+  const error   = ref<string | null>(null)
+
+  async function loadPrefs() {
+    loading.value = true
+    error.value   = null
+    const { data, error: err } = await useApiFetch<SyncPref[]>('/sync/prefs')
+    loading.value = false
+    if (err) { error.value = 'Failed to load sync preferences.'; return }
+    const map: Record<string, boolean> = {}
+    for (const p of data ?? []) map[p.data_class] = p.enabled
+    prefs.value = map
+  }
+
+  async function setPref(dataClass: string, enabled: boolean) {
+    saving.value  = dataClass
+    error.value   = null
+    const { error: err } = await useApiFetch('/sync/prefs', {
+      method: 'PATCH',
+      body: JSON.stringify({ data_class: dataClass, enabled }),
+    })
+    saving.value = null
+    if (err) { error.value = `Failed to update sync preference for ${dataClass}.`; return }
+    prefs.value = { ...prefs.value, [dataClass]: enabled }
+  }
+
+  async function wipeAll() {
+    wiping.value = true
+    error.value  = null
+    const { error: err } = await useApiFetch('/sync/all', { method: 'DELETE' })
+    wiping.value = false
+    if (err) { error.value = 'Failed to delete sync data.'; return }
+    prefs.value = {}
+  }
+
+  return { prefs, loading, saving, wiping, error, loadPrefs, setPref, wipeAll }
+})
--- a/web/src/stores/wizard.ts
+++ b/web/src/stores/wizard.ts
@ -2,7 +2,7 @@ import { ref, computed } from 'vue'
 import { defineStore } from 'pinia'
 import { useApiFetch } from '../composables/useApi'

-export type WizardProfile = 'remote' | 'cpu' | 'single-gpu' | 'dual-gpu'
+export type WizardProfile = 'remote' | 'cpu' | 'single-gpu' | 'dual-gpu' | 'cf-orch'
 export type WizardTier = 'free' | 'paid' | 'premium'

 export interface WorkExperience {
@ -36,6 +36,7 @@ export interface WizardInferenceData {
  anthropicKey: string
  openaiUrl: string
  openaiKey: string
+  orchUrl: string
  ollamaHost: string
  ollamaPort: number
  services: Record<string, string | number>
@ -90,7 +91,8 @@ export const useWizardStore = defineStore('wizard', () => {
    anthropicKey: '',
    openaiUrl: '',
    openaiKey: '',
-    ollamaHost: 'localhost',
+    orchUrl: '',
+    ollamaHost: '',
    ollamaPort: 11434,
    services: {},
    confirmed: false,
@ -127,6 +129,7 @@ export const useWizardStore = defineStore('wizard', () => {
        wizard_step: number
        saved_data: {
          inference_profile?: string
+          cf_orch_url?: string
          tier?: string
          name?: string
          email?: string
@ -143,6 +146,8 @@ export const useWizardStore = defineStore('wizard', () => {

      if (saved.inference_profile)
        hardware.value.selectedProfile = saved.inference_profile as WizardProfile
+      if (saved.cf_orch_url)
+        inference.value.orchUrl = saved.cf_orch_url
      if (saved.tier)
        tier.value = saved.tier as WizardTier
      if (saved.name) identity.value.name = saved.name
@ -222,6 +227,7 @@ export const useWizardStore = defineStore('wizard', () => {
      anthropic_key: inference.value.anthropicKey,
      openai_url: inference.value.openaiUrl,
      openai_key: inference.value.openaiKey,
+      orch_url: inference.value.orchUrl,
      ollama_host: inference.value.ollamaHost,
      ollama_port: inference.value.ollamaPort,
    }
--- a/web/src/stores/wizard/tests/aiInterview.test.ts
+++ b/web/src/stores/wizard/tests/aiInterview.test.ts
@ -0,0 +1,217 @@
+import { describe, it, expect, vi, beforeEach } from 'vitest'
+import { setActivePinia, createPinia } from 'pinia'
+import { useAiInterviewStore } from '../aiInterview'
+
+vi.mock('../../../composables/useApi', () => ({ useApiFetch: vi.fn() }))
+import { useApiFetch } from '../../../composables/useApi'
+const mockFetch = vi.mocked(useApiFetch)
+
+const LS_KEY = 'peregrine:wizard-draft'
+
+describe('useAiInterviewStore', () => {
+  beforeEach(() => {
+    setActivePinia(createPinia())
+    vi.clearAllMocks()
+    localStorage.clear()
+  })
+
+  // ── restore() ──────────────────────────────────────────────────────────────
+
+  it('restore() loads messages, fields, and complete from localStorage', () => {
+    const draft = {
+      messages: [{ role: 'assistant', content: 'Hello!' }],
+      fields: { name: 'Alice' },
+      complete: true,
+    }
+    localStorage.setItem(LS_KEY, JSON.stringify(draft))
+
+    const store = useAiInterviewStore()
+    store.restore()
+
+    expect(store.messages).toEqual(draft.messages)
+    expect(store.fields).toEqual(draft.fields)
+    expect(store.complete).toBe(true)
+  })
+
+  it('restore() is a no-op when localStorage is empty', () => {
+    const store = useAiInterviewStore()
+    store.restore()
+    expect(store.messages).toEqual([])
+    expect(store.fields).toEqual({})
+    expect(store.complete).toBe(false)
+  })
+
+  it('restore() ignores corrupted localStorage data without throwing', () => {
+    localStorage.setItem(LS_KEY, '{not valid json}}}')
+    const store = useAiInterviewStore()
+    expect(() => store.restore()).not.toThrow()
+    expect(store.messages).toEqual([])
+  })
+
+  // ── send() ─────────────────────────────────────────────────────────────────
+
+  it('send() appends user message and assistant reply on success', async () => {
+    mockFetch.mockResolvedValue({
+      data: { reply: 'Nice to meet you!', extracted_fields: {}, complete: false },
+      error: null,
+    })
+
+    const store = useAiInterviewStore()
+    await store.send('Hello')
+
+    expect(store.messages).toEqual([
+      { role: 'user', content: 'Hello' },
+      { role: 'assistant', content: 'Nice to meet you!' },
+    ])
+    expect(store.complete).toBe(false)
+    expect(store.error).toBeNull()
+  })
+
+  it('send() does not add a user bubble for empty string (intro trigger)', async () => {
+    mockFetch.mockResolvedValue({
+      data: { reply: 'Welcome!', extracted_fields: {}, complete: false },
+      error: null,
+    })
+
+    const store = useAiInterviewStore()
+    await store.send('')
+
+    expect(store.messages).toEqual([
+      { role: 'assistant', content: 'Welcome!' },
+    ])
+  })
+
+  it('send() merges extracted_fields into existing fields', async () => {
+    mockFetch.mockResolvedValueOnce({
+      data: { reply: 'Got it.', extracted_fields: { name: 'Alice' }, complete: false },
+      error: null,
+    })
+    mockFetch.mockResolvedValueOnce({
+      data: { reply: 'Thanks.', extracted_fields: { title: 'Engineer' }, complete: false },
+      error: null,
+    })
+
+    const store = useAiInterviewStore()
+    await store.send('My name is Alice')
+    await store.send('I am an engineer')
+
+    expect(store.fields).toEqual({ name: 'Alice', title: 'Engineer' })
+  })
+
+  it('send() sets complete flag when backend signals done', async () => {
+    mockFetch.mockResolvedValue({
+      data: { reply: 'All done!', extracted_fields: { name: 'Alice' }, complete: true },
+      error: null,
+    })
+
+    const store = useAiInterviewStore()
+    await store.send('done')
+
+    expect(store.complete).toBe(true)
+  })
+
+  it('send() sets error and resets loading on API failure', async () => {
+    mockFetch.mockResolvedValue({ data: null, error: { kind: 'network', message: 'fail' } })
+
+    const store = useAiInterviewStore()
+    await store.send('Hello')
+
+    expect(store.error).toBe('Could not reach the assistant. Please try again.')
+    expect(store.loading).toBe(false)
+  })
+
+  it('send() persists draft to localStorage on success', async () => {
+    mockFetch.mockResolvedValue({
+      data: { reply: 'Hi!', extracted_fields: { name: 'Bob' }, complete: false },
+      error: null,
+    })
+
+    const store = useAiInterviewStore()
+    await store.send('Hello')
+
+    const stored = JSON.parse(localStorage.getItem(LS_KEY) ?? '{}')
+    expect(stored.fields).toEqual({ name: 'Bob' })
+  })
+
+  // ── finalize() ─────────────────────────────────────────────────────────────
+
+  it('finalize() calls the finalize API and clears localStorage on success', async () => {
+    localStorage.setItem(LS_KEY, JSON.stringify({ messages: [], fields: { name: 'Alice' }, complete: true }))
+    mockFetch.mockResolvedValue({ data: {}, error: null })
+
+    const store = useAiInterviewStore()
+    const ok = await store.finalize()
+
+    expect(ok).toBe(true)
+    expect(localStorage.getItem(LS_KEY)).toBeNull()
+    expect(store.saving).toBe(false)
+  })
+
+  it('finalize() returns false and sets error on API failure', async () => {
+    mockFetch.mockResolvedValue({ data: null, error: { kind: 'network', message: 'fail' } })
+
+    const store = useAiInterviewStore()
+    const ok = await store.finalize()
+
+    expect(ok).toBe(false)
+    expect(store.error).toBe('Failed to save profile. Please try again.')
+  })
+
+  // ── skip() ─────────────────────────────────────────────────────────────────
+
+  it('skip() sends the skip signal to the backend', async () => {
+    mockFetch.mockResolvedValue({
+      data: { reply: 'No problem, moving on.', extracted_fields: {}, complete: false },
+      error: null,
+    })
+
+    const store = useAiInterviewStore()
+    await store.skip()
+
+    expect(mockFetch).toHaveBeenCalledWith(
+      '/api/wizard/ai/interview',
+      expect.objectContaining({ method: 'POST' }),
+    )
+    const body = JSON.parse((mockFetch.mock.calls[0][1] as { body: string }).body)
+    expect(body.history[0]).toEqual({ role: 'user', content: 'skip' })
+  })
+
+  // ── keepChatting() ─────────────────────────────────────────────────────────
+
+  it('keepChatting() clears the complete flag without resetting messages', async () => {
+    mockFetch.mockResolvedValue({
+      data: { reply: 'All done!', extracted_fields: { name: 'Alice' }, complete: true },
+      error: null,
+    })
+
+    const store = useAiInterviewStore()
+    await store.send('done')
+    expect(store.complete).toBe(true)
+
+    store.keepChatting()
+
+    expect(store.complete).toBe(false)
+    expect(store.messages.length).toBeGreaterThan(0)
+    expect(store.fields).toEqual({ name: 'Alice' })
+  })
+
+  // ── startOver() ────────────────────────────────────────────────────────────
+
+  it('startOver() resets all state and clears localStorage', async () => {
+    mockFetch.mockResolvedValue({
+      data: { reply: 'Hi!', extracted_fields: { name: 'Alice' }, complete: true },
+      error: null,
+    })
+
+    const store = useAiInterviewStore()
+    await store.send('test')  // populates state and localStorage
+
+    store.startOver()
+
+    expect(store.messages).toEqual([])
+    expect(store.fields).toEqual({})
+    expect(store.complete).toBe(false)
+    expect(store.error).toBeNull()
+    expect(localStorage.getItem(LS_KEY)).toBeNull()
+  })
+})
--- a/web/src/stores/wizard/aiInterview.ts
+++ b/web/src/stores/wizard/aiInterview.ts
@ -0,0 +1,113 @@
+import { defineStore } from 'pinia'
+import { ref } from 'vue'
+import { useApiFetch } from '../../composables/useApi'
+
+const LS_KEY     = 'peregrine:wizard-draft'
+const SKIP_SIGNAL = 'skip'
+
+export interface ChatMessage {
+  role: 'user' | 'assistant'
+  content: string
+}
+
+export const useAiInterviewStore = defineStore('aiInterview', () => {
+  const messages    = ref<ChatMessage[]>([])
+  const fields      = ref<Record<string, unknown>>({})
+  const complete    = ref(false)
+  const loading     = ref(false)
+  const saving      = ref(false)
+  const error       = ref<string | null>(null)
+
+  function _persist() {
+    localStorage.setItem(LS_KEY, JSON.stringify({
+      messages: messages.value,
+      fields: fields.value,
+      complete: complete.value,
+    }))
+  }
+
+  function restore() {
+    try {
+      const raw = localStorage.getItem(LS_KEY)
+      if (!raw) return
+      const d = JSON.parse(raw) as { messages?: ChatMessage[]; fields?: Record<string, unknown>; complete?: boolean }
+      messages.value = d.messages  ?? []
+      fields.value   = d.fields    ?? {}
+      complete.value = d.complete  ?? false
+    } catch { /* ignore corrupted draft */ }
+  }
+
+  async function send(userText: string) {
+    if (loading.value) return
+    if (userText !== '') {
+      messages.value = [...messages.value, { role: 'user', content: userText }]
+      _persist()
+    }
+    loading.value = true
+    error.value   = null
+    const { data, error: err } = await useApiFetch<{
+      reply: string
+      extracted_fields: Record<string, unknown>
+      complete: boolean
+    }>('/api/wizard/ai/interview', {
+      method: 'POST',
+      body: JSON.stringify({ history: messages.value, profile_so_far: fields.value }),
+    })
+    loading.value = false
+    if (err || !data) {
+      if (err?.kind === 'http' && err.status === 402) {
+        error.value = 'AI profile assistant requires a Paid plan or a BYOK API key.'
+      } else if (err?.kind === 'http' && err.status === 503) {
+        try {
+          const body = JSON.parse(err.detail) as { detail?: { error?: string } }
+          error.value = body.detail?.error === 'llm_error'
+            ? 'No LLM backend configured — add an API key in Settings → System first.'
+            : 'Could not reach the assistant. Please try again.'
+        } catch {
+          error.value = 'Could not reach the assistant. Please try again.'
+        }
+      } else {
+        error.value = 'Could not reach the assistant. Please try again.'
+      }
+      return
+    }
+    messages.value = [...messages.value, { role: 'assistant', content: data.reply }]
+    fields.value   = { ...fields.value, ...data.extracted_fields }
+    complete.value = data.complete
+    _persist()
+  }
+
+  async function finalize(): Promise<boolean> {
+    saving.value = true
+    error.value  = null
+    const { error: err } = await useApiFetch('/api/wizard/ai/finalize', {
+      method: 'POST',
+      body: JSON.stringify({ profile: fields.value }),
+    })
+    saving.value = false
+    if (err) {
+      error.value = 'Failed to save profile. Please try again.'
+      return false
+    }
+    localStorage.removeItem(LS_KEY)
+    return true
+  }
+
+  function skip() {
+    return send(SKIP_SIGNAL)
+  }
+
+  function keepChatting() {
+    complete.value = false
+  }
+
+  function startOver() {
+    messages.value = []
+    fields.value   = {}
+    complete.value = false
+    error.value    = null
+    localStorage.removeItem(LS_KEY)
+  }
+
+  return { messages, fields, complete, loading, saving, error, restore, send, skip, finalize, keepChatting, startOver }
+})
--- a/web/src/views/HomeView.vue
+++ b/web/src/views/HomeView.vue
@ -12,7 +12,10 @@
          {{ greeting }}
          <span v-if="isMidnight" aria-label="Late night session">🌙</span>
        </h1>
-        <p class="home__subtitle">Discover → Review → Apply</p>
+        <p class="home__subtitle">
+          Discover → Review → Apply
+          <a href="https://docs.circuitforge.tech/peregrine/user-guide/daily-workflow/" target="_blank" rel="noopener" class="home__docs-link" aria-label="Daily Workflow documentation">Daily Workflow guide ↗</a>
+        </p>
      </div>
    </header>

@ -600,7 +603,22 @@ onUnmounted(() => {
  font-size: var(--text-sm);
  text-transform: uppercase;
  letter-spacing: 0.05em;
+  display: flex;
+  align-items: center;
+  gap: var(--space-3);
+  flex-wrap: wrap;
 }
+.home__docs-link {
+  font-size: 0.7rem;
+  text-transform: none;
+  letter-spacing: 0;
+  color: var(--color-text-muted);
+  text-decoration: none;
+  border: 1px solid var(--color-border);
+  border-radius: var(--radius-full);
+  padding: 1px 7px;
+}
+.home__docs-link:hover { color: var(--color-primary); border-color: var(--color-primary); }

 .home__metrics {
  display: grid;
--- a/web/src/views/JobReviewView.vue
+++ b/web/src/views/JobReviewView.vue
@ -9,6 +9,7 @@
    <header class="review__header">
      <div class="review__title-row">
        <h1 class="review__title">Review Jobs</h1>
+        <a href="https://docs.circuitforge.tech/peregrine/user-guide/job-review/" target="_blank" rel="noopener" class="review__docs-link" aria-label="Job Review documentation">? Docs</a>
        <button class="help-btn" :aria-expanded="showHelp" @click="showHelp = !showHelp">
          <span aria-hidden="true">?</span>
          <span class="sr-only">Keyboard shortcuts</span>
@ -429,6 +430,17 @@ onUnmounted(() => {
  flex: 1;
 }

+.review__docs-link {
+  font-size: 0.75rem;
+  color: var(--color-text-muted);
+  border: 1px solid var(--color-border);
+  border-radius: var(--radius-full);
+  padding: 2px 8px;
+  text-decoration: none;
+  white-space: nowrap;
+  flex-shrink: 0;
+}
+.review__docs-link:hover { color: var(--color-primary); border-color: var(--color-primary); }
 .help-btn {
  width: 32px;
  height: 32px;
--- a/web/src/views/ResumesView.vue
+++ b/web/src/views/ResumesView.vue
@ -2,6 +2,7 @@
  <div class="rv">
    <div class="rv__header">
      <h1 class="rv__title">Resume Library</h1>
+      <a href="https://docs.circuitforge.tech/peregrine/user-guide/daily-workflow/#managing-your-resume" target="_blank" rel="noopener" class="rv__help-link" aria-label="Resume Library documentation">? Help</a>
      <label class="btn-generate rv__import-btn">
        <span aria-hidden="true">📥</span> Import
        <input type="file" accept=".txt,.pdf,.docx,.odt,.yaml,.yml"
@ -314,7 +315,10 @@ onBeforeRouteLeave(() => {
 <style scoped>
 .rv { display: flex; flex-direction: column; gap: var(--space-4, 1rem); padding: var(--space-5, 1.25rem); height: 100%; }

-.rv__header { display: flex; align-items: center; justify-content: space-between; }
+.rv__header { display: flex; align-items: center; gap: var(--space-3); }
+.rv__header .btn-generate { margin-left: auto; }
+.rv__help-link { font-size: 0.75rem; color: var(--color-text-muted); border: 1px solid var(--color-border); border-radius: var(--radius-full); padding: 2px 8px; text-decoration: none; white-space: nowrap; flex-shrink: 0; }
+.rv__help-link:hover { color: var(--color-primary); border-color: var(--color-primary); }
 .rv__title { font-size: var(--font-xl, 1.25rem); font-weight: 700; margin: 0; }
 .rv__file-input { display: none; }
 .rv__import-btn { cursor: pointer; }
--- a/web/src/views/settings/DataView.vue
+++ b/web/src/views/settings/DataView.vue
@ -1,7 +1,9 @@
 <script setup lang="ts">
-import { ref } from 'vue'
+import { ref, computed, onMounted } from 'vue'
 import { storeToRefs } from 'pinia'
 import { useDataStore } from '../../stores/settings/data'
+import { useSyncStore, SYNC_DATA_CLASSES } from '../../stores/settings/sync'
+import { useAppConfigStore } from '../../stores/appConfig'

 const store = useDataStore()
 const { backupPath, backupFileCount, backupSizeBytes, creatingBackup, backupError } = storeToRefs(store)
@ -9,6 +11,13 @@ const includeDb = ref(false)
 const showRestoreConfirm = ref(false)
 const restoreFile = ref<File | null>(null)

+const sync  = useSyncStore()
+const config = useAppConfigStore()
+
+const canSync = computed(() => config.isCloud && ['paid', 'premium'].includes(config.tier))
+
+onMounted(() => { if (config.isCloud) sync.loadPrefs() })
+
 function formatBytes(b: number) {
  if (b < 1024) return `${b} B`
  if (b < 1024 * 1024) return `${(b / 1024).toFixed(1)} KB`
@ -77,5 +86,71 @@ function formatBytes(b: number) {
        </div>
      </Teleport>
    </section>
+
+    <!-- Cross-device sync — cloud accounts only -->
+    <section v-if="config.isCloud" class="form-section">
+      <h3>Cross-device Sync <span class="tier-badge">Paid</span></h3>
+      <p class="section-note">
+        Sync selected data to your cloud account so it follows you across devices.
+        Each category is opt-in — nothing is uploaded until you enable it.
+      </p>
+
+      <div v-if="sync.loading" class="sync-loading">Loading sync preferences…</div>
+
+      <template v-else-if="canSync">
+        <div v-for="dc in SYNC_DATA_CLASSES" :key="dc.key" class="sync-row">
+          <label class="sync-toggle-label">
+            <input
+              type="checkbox"
+              :checked="sync.prefs[dc.key] ?? false"
+              :disabled="sync.saving === dc.key"
+              @change="sync.setPref(dc.key, ($event.target as HTMLInputElement).checked)"
+            />
+            <span class="sync-label-text">
+              <strong>{{ dc.label }}</strong>
+              <span class="sync-label-desc">{{ dc.description }}</span>
+            </span>
+          </label>
+        </div>
+        <p v-if="sync.error" class="error-msg">{{ sync.error }}</p>
+      </template>
+
+      <p v-else class="tier-gate-note">
+        Cross-device sync is available on the Paid and Premium plans.
+      </p>
+
+      <!-- Delete all — tier-free, always shown to cloud users -->
+      <div class="form-actions sync-delete-row">
+        <button
+          class="btn-danger"
+          :disabled="sync.wiping"
+          @click="sync.wipeAll()"
+        >
+          {{ sync.wiping ? 'Deleting…' : 'Delete all sync data' }}
+        </button>
+        <span class="section-note">Removes all uploaded sync data immediately. Preferences are also reset.</span>
+      </div>
+    </section>
  </div>
 </template>
+
+<style scoped>
+.tier-badge {
+  font-size: 0.7rem;
+  font-weight: 600;
+  padding: 0.15em 0.5em;
+  border-radius: 4px;
+  background: var(--color-accent, #6c63ff);
+  color: #fff;
+  vertical-align: middle;
+  margin-left: 0.4em;
+}
+.sync-loading { color: var(--color-text-muted); font-size: 0.9rem; margin: 0.5rem 0; }
+.sync-row { margin: 0.75rem 0; }
+.sync-toggle-label { display: flex; align-items: flex-start; gap: 0.6rem; cursor: pointer; }
+.sync-label-text { display: flex; flex-direction: column; gap: 0.1rem; }
+.sync-label-desc { font-size: 0.8rem; color: var(--color-text-muted); }
+.sync-delete-row { display: flex; align-items: center; gap: 1rem; flex-wrap: wrap; margin-top: 1rem; }
+.sync-delete-row .section-note { margin: 0; }
+.tier-gate-note { font-size: 0.85rem; color: var(--color-text-muted); margin: 0.5rem 0; }
+</style>
--- a/web/src/views/settings/MyProfileView.vue
+++ b/web/src/views/settings/MyProfileView.vue
@ -5,6 +5,26 @@
      <p class="subtitle">Your identity and preferences used for cover letters, research, and interview prep.</p>
    </header>

+    <!-- ── AI wizard entry point ──────────────────────────── -->
+    <div class="wizard-cta" :class="hasWizardAccess ? 'wizard-cta--unlocked' : 'wizard-cta--locked'">
+      <div class="wizard-cta__body">
+        <span class="wizard-cta__icon" aria-hidden="true">✦</span>
+        <div>
+          <p class="wizard-cta__heading">Set up your profile with AI</p>
+          <p class="wizard-cta__desc">
+            <template v-if="hasWizardAccess">Answer a few questions and the assistant fills in your profile automatically.</template>
+            <template v-else>Upgrade to Paid, or bring your own LLM key, to use the AI profile assistant.</template>
+          </p>
+        </div>
+      </div>
+      <RouterLink v-if="hasWizardAccess" to="/wizard/ai-profile" class="btn-wizard">
+        Start AI setup
+      </RouterLink>
+      <RouterLink v-else to="/settings/license" class="btn-wizard btn-wizard--upgrade">
+        Upgrade
+      </RouterLink>
+    </div>
+
    <div v-if="store.loading" class="loading-state">Loading profile…</div>

    <template v-else>
@ -204,7 +224,8 @@
 </template>

 <script setup lang="ts">
-import { ref, onMounted } from 'vue'
+import { ref, onMounted, computed } from 'vue'
+import { RouterLink } from 'vue-router'
 import { storeToRefs } from 'pinia'
 import { useProfileStore } from '../../stores/settings/profile'
 import { useAppConfigStore } from '../../stores/appConfig'
@ -214,6 +235,8 @@ const store = useProfileStore()
 const { loadError } = storeToRefs(store)
 const config = useAppConfigStore()

+const hasWizardAccess = computed(() => config.tier !== 'free' || config.byokUnlocked)
+
 const newNdaCompany = ref('')
 const generatingSummary = ref(false)
 const generatingMissions = ref(false)
@ -290,7 +313,106 @@ async function generateVoice() {
 }

 .page-header {
+  margin-bottom: var(--space-4);
+}
+
+/* ── AI wizard callout ─────────────────────────────── */
+.wizard-cta {
+  display: flex;
+  align-items: center;
+  justify-content: space-between;
+  gap: var(--space-4);
+  flex-wrap: wrap;
+  padding: var(--space-4) var(--space-5);
+  border-radius: var(--radius-md);
  margin-bottom: var(--space-6);
+  border: 1px solid var(--color-border-light);
+}
+
+.wizard-cta--unlocked {
+  background: color-mix(in srgb, var(--color-primary) 6%, var(--color-surface));
+  border-color: color-mix(in srgb, var(--color-primary) 25%, transparent);
+}
+
+.wizard-cta--locked {
+  background: var(--color-surface-raised);
+}
+
+.wizard-cta__body {
+  display: flex;
+  align-items: flex-start;
+  gap: var(--space-3);
+  flex: 1;
+}
+
+.wizard-cta__icon {
+  font-size: 1.25rem;
+  color: var(--color-primary);
+  flex-shrink: 0;
+  margin-top: 2px;
+}
+
+.wizard-cta--locked .wizard-cta__icon {
+  color: var(--color-text-muted);
+}
+
+.wizard-cta__heading {
+  margin: 0 0 var(--space-1);
+  font-size: 0.95rem;
+  font-weight: 600;
+  color: var(--color-text);
+}
+
+.wizard-cta__desc {
+  margin: 0;
+  font-size: 0.85rem;
+  color: var(--color-text-muted);
+  line-height: 1.5;
+}
+
+.btn-wizard {
+  display: inline-flex;
+  align-items: center;
+  padding: var(--space-2) var(--space-5);
+  background: var(--color-primary);
+  color: var(--color-text-inverse);
+  border: none;
+  border-radius: var(--radius-md);
+  font-family: var(--font-body);
+  font-size: 0.875rem;
+  font-weight: 600;
+  text-decoration: none;
+  white-space: nowrap;
+  min-height: 40px;
+  transition: background var(--transition);
+  flex-shrink: 0;
+}
+
+.btn-wizard:hover {
+  background: var(--color-primary-hover);
+}
+
+.btn-wizard--upgrade {
+  background: none;
+  color: var(--color-text-muted);
+  border: 1px solid var(--color-border);
+}
+
+.btn-wizard--upgrade:hover {
+  background: var(--color-surface-raised);
+  color: var(--color-text);
+}
+
+@media (max-width: 600px) {
+  .wizard-cta {
+    flex-direction: column;
+    align-items: flex-start;
+  }
+
+  .btn-wizard {
+    width: 100%;
+    justify-content: center;
+  }
 }

 .page-header h2 {
--- a/web/src/views/settings/ResumeProfileView.vue
+++ b/web/src/views/settings/ResumeProfileView.vue
@ -1,6 +1,9 @@
 <template>
  <div class="resume-profile">
-    <h2>Resume Profile</h2>
+    <div class="page-header">
+      <h2>Resume Profile</h2>
+      <a href="https://docs.circuitforge.tech/peregrine/user-guide/settings/#resume-profile" target="_blank" rel="noopener" class="help-link" aria-label="Resume Profile documentation">? Help</a>
+    </div>

    <!-- Load error banner -->
    <div v-if="loadError" class="error-banner">
@ -401,6 +404,10 @@ async function handleUpload() {

 <style scoped>
 .resume-profile { max-width: 720px; margin: 0 auto; padding: var(--space-4); }
+.page-header { display: flex; align-items: center; gap: var(--space-3); margin-bottom: var(--space-6); }
+.page-header h2 { margin-bottom: 0; }
+.help-link { font-size: 0.75rem; color: var(--color-text-muted); border: 1px solid var(--color-border); border-radius: var(--radius-full); padding: 2px 8px; text-decoration: none; white-space: nowrap; flex-shrink: 0; }
+.help-link:hover { color: var(--color-primary); border-color: var(--color-primary); }
 h2 { font-size: 1.4rem; font-weight: 600; margin-bottom: var(--space-6); }
 h3 { font-size: 1rem; font-weight: 600; margin-bottom: var(--space-3); }
 .form-section { margin-bottom: var(--space-8); padding-bottom: var(--space-6); border-bottom: 1px solid var(--color-border); }
--- a/web/src/views/settings/SearchPrefsView.vue
+++ b/web/src/views/settings/SearchPrefsView.vue
@ -1,6 +1,9 @@
 <template>
  <div class="search-prefs">
-    <h2>Search Preferences</h2>
+    <div class="page-header">
+      <h2>Search Preferences</h2>
+      <a :href="docsUrl" target="_blank" rel="noopener" class="help-link" aria-label="Search Preferences documentation">? Help</a>
+    </div>
    <p v-if="store.loadError" class="error-banner">{{ store.loadError }}</p>

    <!-- Remote Preference -->
@ -154,8 +157,10 @@
 <script setup lang="ts">
 import { ref, onMounted } from 'vue'
 import { useSearchStore } from '../../stores/settings/search'
+import { useDocsUrl } from '../../composables/useDocsUrl'

 const store = useSearchStore()
+const docsUrl = useDocsUrl('user-guide/settings/#search-prefs')

 const remoteOptions = [
  { value: 'remote' as const, label: 'Remote only' },
@ -186,6 +191,10 @@ onMounted(() => store.load())

 <style scoped>
 .search-prefs { max-width: 720px; margin: 0 auto; padding: var(--space-4); }
+.page-header { display: flex; align-items: center; gap: var(--space-3); margin-bottom: var(--space-6); }
+.page-header h2 { margin-bottom: 0; }
+.help-link { font-size: 0.75rem; color: var(--color-text-muted); border: 1px solid var(--color-border); border-radius: var(--radius-full); padding: 2px 8px; text-decoration: none; white-space: nowrap; flex-shrink: 0; }
+.help-link:hover { color: var(--color-primary); border-color: var(--color-primary); }
 h2 { font-size: 1.4rem; font-weight: 600; margin-bottom: var(--space-6); }
 h3 { font-size: 1rem; font-weight: 600; margin-bottom: var(--space-3); }
 .form-section { margin-bottom: var(--space-8); padding-bottom: var(--space-6); border-bottom: 1px solid var(--color-border); }
--- a/web/src/views/settings/SystemSettingsView.vue
+++ b/web/src/views/settings/SystemSettingsView.vue
@ -136,6 +136,29 @@
      <p v-if="store.deployError" class="error-msg">{{ store.deployError }}</p>
    </section>

+    <!-- Orchard coordinator -->
+    <section class="form-section">
+      <h3>Orchard Coordinator</h3>
+      <p class="section-note">
+        The Orchard is CircuitForge's distributed GPU cluster. Requires a Paid license or higher.
+        Leave blank to disable Orchard routing.
+      </p>
+      <div class="field-row">
+        <label>Coordinator URL</label>
+        <input
+          v-model="orchUrl"
+          type="url"
+          placeholder="https://orch.circuitforge.tech"
+          class="field-input-wide"
+        />
+        <button @click="saveOrchUrl" :disabled="orchSaving" class="btn-save-inline">
+          {{ orchSaving ? 'Saving…' : 'Save' }}
+        </button>
+      </div>
+      <p v-if="orchError" class="error">{{ orchError }}</p>
+      <p v-if="orchSaved" class="success">Saved.</p>
+    </section>
+
    <!-- BYOK Modal -->
    <Teleport to="body">
      <div v-if="store.byokPending.length > 0" class="modal-overlay" @click.self="store.cancelByok()">
@ -250,12 +273,39 @@ async function saveCoverLetterModel() {
  setTimeout(() => { clmSaved.value = false }, 3000)
 }

+// ── Orchard coordinator URL ───────────────────────────────────────────────────
+const orchUrl    = ref('')
+const orchSaving = ref(false)
+const orchError  = ref<string | null>(null)
+const orchSaved  = ref(false)
+
+async function loadOrchUrl() {
+  const { data } = await useApiFetch<{ orch_url: string }>('/api/settings/system/orch-url')
+  if (data) orchUrl.value = data.orch_url ?? ''
+}
+
+async function saveOrchUrl() {
+  orchSaving.value = true
+  orchError.value  = null
+  orchSaved.value  = false
+  const { error } = await useApiFetch('/api/settings/system/orch-url', {
+    method: 'POST',
+    headers: { 'Content-Type': 'application/json' },
+    body: JSON.stringify({ orch_url: orchUrl.value }),
+  })
+  orchSaving.value = false
+  if (error) { orchError.value = 'Failed to save.'; return }
+  orchSaved.value = true
+  setTimeout(() => { orchSaved.value = false }, 3000)
+}
+
 onMounted(async () => {
  await store.loadLlm()
  const tasks = [
    store.loadServices(),
    store.loadFilePaths(),
    store.loadDeployConfig(),
+    loadOrchUrl(),
  ]
  if (config.isCloud && tierOrder.indexOf(tier.value) >= tierOrder.indexOf('paid')) {
    tasks.push(loadCoverLetterModel())
@ -328,6 +378,7 @@ h3 { font-size: 1rem; font-weight: 600; margin-bottom: var(--space-3); }
 .field-row { display: flex; flex-direction: column; gap: 4px; margin-bottom: 14px; }
 .field-row label { font-size: 0.82rem; color: var(--color-text-muted); }
 .field-row input { background: var(--color-surface-alt); border: 1px solid var(--color-border); border-radius: 6px; color: var(--color-text); padding: 7px 10px; font-size: 0.88rem; }
+.field-input-wide { width: 100%; max-width: 400px; }
 .field-hint { font-size: 0.72rem; color: var(--color-text-muted); margin-top: 3px; }
 .btn-secondary { padding: 9px 18px; background: transparent; border: 1px solid var(--color-border); border-radius: 7px; color: var(--color-text-muted); cursor: pointer; font-size: 0.88rem; }
 .btn-danger {
--- a/web/src/views/wizard/WizardAIView.vue
+++ b/web/src/views/wizard/WizardAIView.vue
@ -0,0 +1,598 @@
+<script setup lang="ts">
+import { ref, computed, nextTick, onMounted, watch } from 'vue'
+import { useRouter } from 'vue-router'
+import { useAiInterviewStore } from '../../stores/wizard/aiInterview'
+import { useAppConfigStore } from '../../stores/appConfig'
+import { RouterLink } from 'vue-router'
+
+const router = useRouter()
+const store  = useAiInterviewStore()
+const config = useAppConfigStore()
+
+const hasAccess = computed(() => config.tier !== 'free' || config.byokUnlocked)
+
+const inputText     = ref('')
+const messageList   = ref<HTMLElement | null>(null)
+
+const TOTAL_FIELDS = 8
+
+const progressPct = computed(() =>
+  Math.min(100, (Object.keys(store.fields).length / TOTAL_FIELDS) * 100)
+)
+
+const TONE_CHIPS = [
+  'Professional and direct',
+  'Warm and conversational',
+  'Concise and clear',
+  'Enthusiastic and personable',
+]
+
+const lastAssistantMsg = computed(() => {
+  const msgs = store.messages
+  for (let i = msgs.length - 1; i >= 0; i--) {
+    if (msgs[i].role === 'assistant') return msgs[i].content
+  }
+  return ''
+})
+
+const showToneChips = computed(() => {
+  if (store.messages.length === 0) return false
+  const lower = lastAssistantMsg.value.toLowerCase()
+  return lower.includes('writing') || lower.includes('voice') || lower.includes('cover letter')
+})
+
+async function scrollToBottom() {
+  await nextTick()
+  if (messageList.value) {
+    messageList.value.scrollTop = messageList.value.scrollHeight
+  }
+}
+
+watch(() => store.messages.length, () => scrollToBottom())
+
+async function handleSend() {
+  const text = inputText.value.trim()
+  if (!text || store.loading) return
+  inputText.value = ''
+  await store.send(text)
+}
+
+function handleKeydown(e: KeyboardEvent) {
+  if (e.key === 'Enter' && !e.shiftKey) {
+    e.preventDefault()
+    handleSend()
+  }
+}
+
+function applyToneChip(chip: string) {
+  inputText.value = chip
+}
+
+async function handleSave() {
+  const ok = await store.finalize()
+  if (ok) router.push('/settings/my-profile')
+}
+
+onMounted(async () => {
+  if (!config.loaded) await config.load()
+  store.restore()
+  if (hasAccess.value && store.messages.length === 0) {
+    await store.send('')
+  }
+  scrollToBottom()
+})
+</script>
+
+<template>
+  <div class="ai-view">
+    <!-- Tier gate -->
+    <div v-if="!hasAccess" class="ai-locked">
+      <div class="ai-locked__icon" aria-hidden="true">🔒</div>
+      <h2 class="ai-locked__heading">AI Profile Assistant</h2>
+      <p class="ai-locked__body">
+        The AI profile assistant is available on the Paid plan, or for free when you bring your own LLM.
+        You can
+        <RouterLink to="/settings/my-profile" class="ai-locked__link">set up your profile manually</RouterLink>
+        instead.
+      </p>
+    </div>
+
+    <!-- Chat UI -->
+    <div v-else class="ai-chat">
+      <header class="ai-chat__header">
+        <h1 class="ai-chat__title">Set up your profile with AI</h1>
+        <p class="ai-chat__subtitle">I'll ask you a few questions. You can skip anything.</p>
+
+        <!-- Progress bar -->
+        <div class="ai-progress" role="progressbar"
+          :aria-valuenow="Object.keys(store.fields).length"
+          :aria-valuemax="TOTAL_FIELDS"
+          aria-label="Profile fields completed">
+          <div class="ai-progress__bar" :style="{ width: progressPct + '%' }"></div>
+        </div>
+        <p class="ai-progress__label">
+          {{ Object.keys(store.fields).length }} of {{ TOTAL_FIELDS }} fields captured
+        </p>
+      </header>
+
+      <!-- Message list -->
+      <div class="ai-messages" ref="messageList">
+        <div
+          v-for="(msg, idx) in store.messages"
+          :key="idx"
+          class="ai-bubble"
+          :class="msg.role === 'user' ? 'ai-bubble--user' : 'ai-bubble--assistant'"
+        >
+          <span class="ai-bubble__text">{{ msg.content }}</span>
+        </div>
+        <div v-if="store.loading" class="ai-bubble ai-bubble--assistant ai-bubble--typing">
+          <span class="ai-typing-dots" aria-label="Thinking">
+            <span></span><span></span><span></span>
+          </span>
+        </div>
+      </div>
+
+      <!-- Completion panel -->
+      <div v-if="store.complete" class="ai-complete">
+        <p class="ai-complete__msg">Your profile is ready to save.</p>
+        <div class="ai-complete__actions">
+          <button
+            class="btn-primary"
+            :disabled="store.saving"
+            @click="handleSave"
+          >
+            {{ store.saving ? 'Saving…' : 'Save Profile' }}
+          </button>
+          <button
+            class="btn-ghost"
+            :disabled="store.loading || store.saving"
+            @click="store.keepChatting()"
+          >
+            Keep chatting
+          </button>
+        </div>
+      </div>
+
+      <!-- Input area -->
+      <div class="ai-input-area">
+        <!-- Tone chips -->
+        <div v-if="showToneChips" class="ai-tone-chips" role="group" aria-label="Writing tone suggestions">
+          <button
+            v-for="chip in TONE_CHIPS"
+            :key="chip"
+            class="ai-tone-chip"
+            @click="applyToneChip(chip)"
+          >{{ chip }}</button>
+        </div>
+
+        <div class="ai-input-row">
+          <textarea
+            v-model="inputText"
+            class="ai-input"
+            placeholder="Type your answer…"
+            rows="2"
+            :disabled="store.loading || store.saving"
+            @keydown="handleKeydown"
+            aria-label="Chat input"
+          ></textarea>
+          <div class="ai-input-btns">
+            <button
+              class="btn-primary ai-send-btn"
+              :disabled="store.loading || store.saving || !inputText.trim()"
+              @click="handleSend"
+            >
+              Send
+            </button>
+            <button
+              class="btn-ghost ai-skip-btn"
+              :disabled="store.loading || store.saving || store.complete"
+              @click="store.skip()"
+            >
+              Skip
+            </button>
+          </div>
+        </div>
+
+        <p v-if="store.error" class="ai-error" role="alert">{{ store.error }}</p>
+
+        <div v-if="store.messages.length > 0" class="ai-startover-row">
+          <button class="btn-startover" @click="store.startOver()">Start over</button>
+        </div>
+      </div>
+    </div>
+  </div>
+</template>
+
+<style scoped>
+/* ── Page container ────────────────────────────────── */
+.ai-view {
+  min-height: 100vh;
+  background: var(--color-surface);
+  display: flex;
+  justify-content: center;
+  padding: var(--space-8) var(--space-4);
+}
+
+/* ── Locked state ──────────────────────────────────── */
+.ai-locked {
+  max-width: 480px;
+  width: 100%;
+  margin: auto;
+  text-align: center;
+  display: flex;
+  flex-direction: column;
+  align-items: center;
+  gap: var(--space-4);
+}
+
+.ai-locked__icon {
+  font-size: 3rem;
+}
+
+.ai-locked__heading {
+  font-family: var(--font-display);
+  font-size: 1.5rem;
+  font-weight: 700;
+  color: var(--color-text);
+  margin: 0;
+}
+
+.ai-locked__body {
+  font-size: 0.95rem;
+  color: var(--color-text-muted);
+  line-height: 1.6;
+  margin: 0;
+}
+
+.ai-locked__link {
+  color: var(--color-accent);
+  text-decoration: underline;
+  text-underline-offset: 3px;
+}
+
+/* ── Chat container ────────────────────────────────── */
+.ai-chat {
+  width: 100%;
+  max-width: 680px;
+  display: flex;
+  flex-direction: column;
+  gap: var(--space-4);
+}
+
+/* ── Header ────────────────────────────────────────── */
+.ai-chat__header {
+  display: flex;
+  flex-direction: column;
+  gap: var(--space-2);
+}
+
+.ai-chat__title {
+  font-family: var(--font-display);
+  font-size: 1.375rem;
+  font-weight: 700;
+  color: var(--color-primary);
+  margin: 0;
+}
+
+.ai-chat__subtitle {
+  font-size: 0.9rem;
+  color: var(--color-text-muted);
+  margin: 0;
+}
+
+/* ── Progress bar ──────────────────────────────────── */
+.ai-progress {
+  height: 6px;
+  background: var(--color-border-light);
+  border-radius: var(--radius-full);
+  overflow: hidden;
+}
+
+.ai-progress__bar {
+  height: 100%;
+  background: var(--color-primary);
+  border-radius: var(--radius-full);
+  transition: width 0.4s ease;
+}
+
+.ai-progress__label {
+  font-size: 0.78rem;
+  color: var(--color-text-muted);
+  margin: 0;
+}
+
+/* ── Message list ──────────────────────────────────── */
+.ai-messages {
+  flex: 1;
+  min-height: 320px;
+  max-height: 480px;
+  overflow-y: auto;
+  display: flex;
+  flex-direction: column;
+  gap: var(--space-3);
+  padding: var(--space-4);
+  background: var(--color-surface-raised);
+  border: 1px solid var(--color-border-light);
+  border-radius: var(--radius-lg);
+  scroll-behavior: smooth;
+}
+
+/* ── Chat bubbles ──────────────────────────────────── */
+.ai-bubble {
+  display: flex;
+  max-width: 80%;
+}
+
+.ai-bubble--user {
+  align-self: flex-end;
+}
+
+.ai-bubble--assistant {
+  align-self: flex-start;
+}
+
+.ai-bubble__text {
+  display: block;
+  padding: var(--space-3) var(--space-4);
+  border-radius: var(--radius-md);
+  font-size: 0.9rem;
+  line-height: 1.55;
+  white-space: pre-wrap;
+}
+
+.ai-bubble--user .ai-bubble__text {
+  background: var(--color-primary);
+  color: var(--color-text-inverse);
+  border-bottom-right-radius: var(--radius-sm);
+}
+
+.ai-bubble--assistant .ai-bubble__text {
+  background: var(--color-surface-alt);
+  color: var(--color-text);
+  border: 1px solid var(--color-border-light);
+  border-bottom-left-radius: var(--radius-sm);
+}
+
+/* ── Typing indicator ──────────────────────────────── */
+.ai-bubble--typing .ai-typing-dots {
+  padding: var(--space-3) var(--space-4);
+}
+
+.ai-typing-dots {
+  display: inline-flex;
+  gap: 4px;
+  align-items: center;
+}
+
+.ai-typing-dots span {
+  display: inline-block;
+  width: 6px;
+  height: 6px;
+  border-radius: 50%;
+  background: var(--color-text-muted);
+  animation: typing-bounce 1.2s infinite ease-in-out;
+}
+
+.ai-typing-dots span:nth-child(2) { animation-delay: 0.2s; }
+.ai-typing-dots span:nth-child(3) { animation-delay: 0.4s; }
+
+@keyframes typing-bounce {
+  0%, 80%, 100% { transform: translateY(0); opacity: 0.4; }
+  40%            { transform: translateY(-4px); opacity: 1; }
+}
+
+@media (prefers-reduced-motion: reduce) {
+  .ai-typing-dots span { animation: none; opacity: 0.7; }
+}
+
+/* ── Completion panel ──────────────────────────────── */
+.ai-complete {
+  background: color-mix(in srgb, var(--color-success) 10%, transparent);
+  border: 1px solid color-mix(in srgb, var(--color-success) 35%, transparent);
+  border-radius: var(--radius-md);
+  padding: var(--space-4);
+  display: flex;
+  align-items: center;
+  gap: var(--space-4);
+  flex-wrap: wrap;
+}
+
+.ai-complete__msg {
+  flex: 1;
+  margin: 0;
+  font-size: 0.95rem;
+  font-weight: 600;
+  color: var(--color-success);
+}
+
+.ai-complete__actions {
+  display: flex;
+  gap: var(--space-3);
+  flex-wrap: wrap;
+}
+
+/* ── Input area ────────────────────────────────────── */
+.ai-input-area {
+  display: flex;
+  flex-direction: column;
+  gap: var(--space-2);
+}
+
+.ai-input-row {
+  display: flex;
+  gap: var(--space-3);
+  align-items: flex-end;
+}
+
+.ai-input {
+  flex: 1;
+  padding: var(--space-3);
+  border: 1px solid var(--color-border);
+  border-radius: var(--radius-md);
+  background: var(--color-surface-raised);
+  color: var(--color-text);
+  font-family: var(--font-body);
+  font-size: 0.9rem;
+  line-height: 1.5;
+  resize: none;
+  transition: border-color var(--transition);
+}
+
+.ai-input:focus {
+  outline: none;
+  border-color: var(--color-primary);
+  box-shadow: 0 0 0 3px color-mix(in srgb, var(--color-primary) 15%, transparent);
+}
+
+.ai-input:disabled {
+  opacity: 0.5;
+  cursor: not-allowed;
+}
+
+.ai-input-btns {
+  display: flex;
+  flex-direction: column;
+  gap: var(--space-2);
+}
+
+.ai-send-btn,
+.ai-skip-btn {
+  white-space: nowrap;
+  min-width: 72px;
+}
+
+/* ── Tone chips ────────────────────────────────────── */
+.ai-tone-chips {
+  display: flex;
+  flex-wrap: wrap;
+  gap: var(--space-2);
+}
+
+.ai-tone-chip {
+  padding: var(--space-1) var(--space-3);
+  border: 1px solid color-mix(in srgb, var(--color-accent) 40%, transparent);
+  border-radius: var(--radius-full);
+  background: var(--color-accent-light);
+  color: var(--color-accent);
+  font-family: var(--font-body);
+  font-size: 0.82rem;
+  font-weight: 500;
+  cursor: pointer;
+  transition: background var(--transition), border-color var(--transition);
+}
+
+.ai-tone-chip:hover {
+  background: color-mix(in srgb, var(--color-accent) 15%, transparent);
+  border-color: var(--color-accent);
+}
+
+/* ── Error ─────────────────────────────────────────── */
+.ai-error {
+  font-size: 0.875rem;
+  color: var(--color-error);
+  margin: 0;
+  padding: var(--space-2) var(--space-3);
+  background: color-mix(in srgb, var(--color-error) 8%, transparent);
+  border: 1px solid color-mix(in srgb, var(--color-error) 25%, transparent);
+  border-radius: var(--radius-md);
+}
+
+/* ── Start over ────────────────────────────────────── */
+.ai-startover-row {
+  display: flex;
+  justify-content: flex-end;
+}
+
+.btn-startover {
+  background: none;
+  border: none;
+  font-family: var(--font-body);
+  font-size: 0.8rem;
+  color: var(--color-text-muted);
+  cursor: pointer;
+  padding: var(--space-1) var(--space-2);
+  border-radius: var(--radius-sm);
+  transition: color var(--transition);
+  text-decoration: underline;
+  text-underline-offset: 2px;
+}
+
+.btn-startover:hover {
+  color: var(--color-error);
+}
+
+/* ── Button styles (local defs matching wizard.css) ── */
+.btn-primary {
+  padding: var(--space-2) var(--space-6);
+  background: var(--color-primary);
+  color: var(--color-text-inverse);
+  border: none;
+  border-radius: var(--radius-md);
+  font-family: var(--font-body);
+  font-size: 0.9rem;
+  font-weight: 600;
+  cursor: pointer;
+  transition: background var(--transition), opacity var(--transition);
+  min-height: 44px;
+}
+
+.btn-primary:hover:not(:disabled) { background: var(--color-primary-hover); }
+
+.btn-primary:disabled {
+  opacity: 0.5;
+  cursor: not-allowed;
+}
+
+.btn-ghost {
+  padding: var(--space-2) var(--space-4);
+  background: none;
+  color: var(--color-text-muted);
+  border: 1px solid var(--color-border);
+  border-radius: var(--radius-md);
+  font-family: var(--font-body);
+  font-size: 0.9rem;
+  cursor: pointer;
+  transition: color var(--transition), border-color var(--transition);
+  min-height: 44px;
+}
+
+.btn-ghost:hover:not(:disabled) {
+  color: var(--color-text);
+  border-color: var(--color-border);
+}
+
+.btn-ghost:disabled {
+  opacity: 0.4;
+  cursor: not-allowed;
+}
+
+/* ── Mobile ────────────────────────────────────────── */
+@media (max-width: 600px) {
+  .ai-view {
+    padding: var(--space-4) var(--space-3);
+  }
+
+  .ai-messages {
+    min-height: 240px;
+    max-height: 360px;
+  }
+
+  .ai-input-row {
+    flex-direction: column;
+    align-items: stretch;
+  }
+
+  .ai-input-btns {
+    flex-direction: row;
+  }
+
+  .ai-bubble {
+    max-width: 92%;
+  }
+
+  .ai-complete {
+    flex-direction: column;
+    align-items: flex-start;
+  }
+}
+</style>
--- a/web/src/views/wizard/WizardHardwareStep.vue
+++ b/web/src/views/wizard/WizardHardwareStep.vue
@ -13,20 +13,39 @@
        {{ wizard.hardware.gpus.join(', ') }}
      </div>
      <div v-else class="step__info">
-        No local NVIDIA GPUs detected. "Remote", "CPU", or "cf-orch" mode recommended.
+        No local NVIDIA GPUs detected. CPU or Orchard mode recommended.
      </div>

+      <!-- Service status -->
+      <div class="hw-services">
+        <div class="hw-svc" :class="ollamaRunning ? 'hw-svc--up' : 'hw-svc--down'">
+          <span class="hw-svc__dot" aria-hidden="true" />
+          <span class="hw-svc__name">Ollama</span>
+          <span class="hw-svc__status">{{ ollamaRunning ? 'running' : 'not detected' }}</span>
+        </div>
+        <div class="hw-svc" :class="searxngRunning ? 'hw-svc--up' : 'hw-svc--down'">
+          <span class="hw-svc__dot" aria-hidden="true" />
+          <span class="hw-svc__name">SearXNG</span>
+          <span class="hw-svc__status">{{ searxngRunning ? 'running' : 'not detected' }}</span>
+        </div>
+      </div>
+
+      <p v-if="!ollamaRunning" class="step__field-hint">
+        Ollama not running — start it on the host before continuing, or choose Remote or Orchard mode.
+        See <strong>Settings → Services</strong> after setup to manage services.
+      </p>
+
      <div class="step__field">
        <label class="step__label" for="hw-profile">Inference profile</label>
        <select id="hw-profile" v-model="selectedProfile" class="step__select">
-          <option value="remote">Remote — use cloud API keys</option>
          <option value="cpu">CPU — local Ollama, no GPU</option>
          <option value="single-gpu">Single GPU — local Ollama + one GPU</option>
          <option value="dual-gpu">Dual GPU — local Ollama + two GPUs</option>
          <option value="cf-orch">
-            cf-orch — CircuitForge GPU cluster
+            Orchard — CircuitForge GPU cluster
            {{ orchAvailable ? `(${orchGpus.length} GPU(s) available)` : '(configure endpoint below)' }}
          </option>
+          <option value="remote">Remote — use cloud API keys</option>
        </select>
      </div>

@ -49,7 +68,7 @@
        </div>

        <div class="step__field">
-          <label class="step__label" for="orch-url">cf-orch coordinator URL</label>
+          <label class="step__label" for="orch-url">Orchard coordinator URL</label>
          <input
            id="orch-url"
            v-model="orchUrl"
@ -58,14 +77,14 @@
            placeholder="http://10.1.10.71:7700"
          />
          <p class="step__field-hint">
-            The coordinator serves public inference endpoints for paid+ users.
+            The Orchard coordinator serves public inference endpoints for Paid+ users.
            Leave blank to use the default cluster URL from Settings.
          </p>
        </div>

        <div class="step__tier-note">
          <span aria-hidden="true">🔒</span>
-          cf-orch inference requires a <strong>Paid</strong> license or higher.
+          Orchard inference requires a <strong>Paid</strong> license or higher.
          You can select this profile now; it will activate once your license is verified.
        </div>
      </template>
@ -74,8 +93,8 @@
        v-else-if="selectedProfile !== 'remote' && !wizard.hardware.gpus.length"
        class="step__warning"
      >
-        ⚠️ No local GPUs detected — a GPU profile may not work. Choose CPU, Remote,
-        or cf-orch if you have access to the cluster.
+        ⚠️ No local GPUs detected — a GPU profile may not work. Choose CPU
+        or Orchard if you have access to the cluster.
      </div>
    </template>

@ -107,6 +126,10 @@ const orchAvailable = ref(false)
 const orchGpus = ref<Array<{ node: string; name: string; vram_total_mb: number; vram_free_mb: number }>>([])
 const orchUrl = ref('')

+// local service probe results
+const ollamaRunning = ref(false)
+const searxngRunning = ref(false)
+
 onMounted(async () => {
  detecting.value = true
  const { data } = await useApiFetch<{
@ -115,6 +138,8 @@ onMounted(async () => {
    profiles: string[]
    cf_orch_available: boolean
    cf_orch_gpus: Array<{ node: string; name: string; vram_total_mb: number; vram_free_mb: number }>
+    ollama_running: boolean
+    searxng_running: boolean
  }>('/api/wizard/hardware')
  detecting.value = false
  if (!data) return
@ -128,6 +153,8 @@ onMounted(async () => {

  orchAvailable.value = data.cf_orch_available ?? false
  orchGpus.value = data.cf_orch_gpus ?? []
+  ollamaRunning.value = data.ollama_running ?? false
+  searxngRunning.value = data.searxng_running ?? false
 })

 async function next() {
@ -140,3 +167,40 @@ async function next() {
  if (ok) router.push('/setup/tier')
 }
 </script>
+
+<style scoped>
+.hw-services {
+  display: flex;
+  gap: var(--space-4);
+  margin: var(--space-3) 0 var(--space-2);
+  flex-wrap: wrap;
+}
+
+.hw-svc {
+  display: flex;
+  align-items: center;
+  gap: var(--space-2);
+  padding: var(--space-2) var(--space-3);
+  border-radius: var(--radius-full);
+  font-size: 0.8rem;
+  font-weight: 500;
+  border: 1px solid var(--color-border-light);
+  background: var(--color-surface-alt);
+}
+
+.hw-svc--up  { border-color: color-mix(in srgb, var(--color-success) 40%, transparent); }
+.hw-svc--down { opacity: 0.65; }
+
+.hw-svc__dot {
+  width: 8px;
+  height: 8px;
+  border-radius: var(--radius-full);
+  flex-shrink: 0;
+}
+
+.hw-svc--up   .hw-svc__dot { background: var(--color-success); }
+.hw-svc--down .hw-svc__dot { background: var(--color-text-muted); }
+
+.hw-svc__name  { color: var(--color-text); }
+.hw-svc__status { color: var(--color-text-muted); }
+</style>
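
Reviewer note: the `?? false` defaults added in `onMounted` mean an older backend that omits `ollama_running` or `searxng_running` renders as "not detected" rather than failing. A minimal sketch of that mapping as a pure function follows; `HardwareProbe`, `ServiceStatus`, and `toServiceStatuses` are hypothetical names used only for illustration and do not appear in the change itself.

```typescript
// Hypothetical shape: only the two probe fields added in this diff are modeled.
interface HardwareProbe {
  ollama_running?: boolean
  searxng_running?: boolean
}

// Hypothetical view model for one status pill.
interface ServiceStatus {
  name: string
  running: boolean
  label: 'running' | 'not detected'
}

function toServiceStatuses(data: HardwareProbe): ServiceStatus[] {
  const probes: Array<[string, boolean | undefined]> = [
    ['Ollama', data.ollama_running],
    ['SearXNG', data.searxng_running],
  ]
  return probes.map(([name, flag]) => {
    // A missing field is treated the same as "not running".
    const running = flag ?? false
    return { name, running, label: running ? 'running' : 'not detected' }
  })
}
```

Keeping the mapping separate from the component would make the "field missing" case testable without mounting the view, if that is wanted.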
--- a/web/src/views/wizard/WizardIdentityStep.vue
+++ b/web/src/views/wizard/WizardIdentityStep.vue
@ -1,6 +1,6 @@
 <template>
  <div class="step">
-    <h2 class="step__heading">Step 4 — Your Identity</h2>
+    <h2 class="step__heading">Step 5 — Your Identity</h2>
    <p class="step__caption">
      Used in cover letters, research briefs, and interview prep. You can update
      this any time in Settings → My Profile.
--- a/web/src/views/wizard/WizardInferenceStep.vue
+++ b/web/src/views/wizard/WizardInferenceStep.vue
@ -1,6 +1,6 @@
 <template>
  <div class="step">
-    <h2 class="step__heading">Step 5 — Inference & API Keys</h2>
+    <h2 class="step__heading">Step 6 — Inference & API Keys</h2>
    <p class="step__caption">
      Configure how Peregrine generates AI content. You can adjust this any time
      in Settings → System.
@ -36,7 +36,35 @@
      </div>
    </template>

-    <!-- Local mode -->
+    <!-- Orchard mode -->
+    <template v-else-if="isCfOrch">
+      <div class="step__info">
+        Orchard mode: Peregrine routes AI generation through the CircuitForge GPU cluster.
+      </div>
+
+      <div class="step__field">
+        <label class="step__label" for="inf-orch-url">Orchard coordinator URL</label>
+        <input id="inf-orch-url" v-model="form.orchUrl" type="url"
+               class="step__input" placeholder="https://orch.circuitforge.tech" />
+      </div>
+
+      <div v-if="isPaid" class="step__check-row">
+        <label class="step__checkbox-label">
+          <input
+            type="checkbox"
+            class="step__checkbox"
+            :checked="form.orchUrl === MANAGED_ORCH_URL"
+            @change="onUseManagedOrchard"
+          />
+          <span>Use CircuitForge managed Orchard</span>
+        </label>
+        <span class="step__check-hint">
+          Auto-fills your Paid+ cluster endpoint ({{ MANAGED_ORCH_URL }})
+        </span>
+      </div>
+    </template>
+
+    <!-- Local mode (CPU / single-gpu / dual-gpu) -->
    <template v-else>
      <div class="step__info">
        Local mode ({{ wizard.hardware.selectedProfile }}): Peregrine uses
@ -81,12 +109,19 @@
 import { reactive, ref, computed } from 'vue'
 import { useRouter } from 'vue-router'
 import { useWizardStore } from '../../stores/wizard'
+import { useAppConfigStore } from '../../stores/appConfig'
 import './wizard.css'

 const wizard = useWizardStore()
+const config = useAppConfigStore()
 const router = useRouter()

+const MANAGED_ORCH_URL = 'https://orch.circuitforge.tech'
+
 const isRemote = computed(() => wizard.hardware.selectedProfile === 'remote')
+const isCfOrch = computed(() => wizard.hardware.selectedProfile === 'cf-orch')
+const isPaid = computed(() => config.tier !== 'free')
+
 const showAdvanced = ref(false)
 const testing = ref(false)
 const testResult = ref<{ ok: boolean; message: string } | null>(null)
@ -95,19 +130,42 @@ const form = reactive({
  anthropicKey: wizard.inference.anthropicKey,
  openaiUrl: wizard.inference.openaiUrl,
  openaiKey: wizard.inference.openaiKey,
+  orchUrl: wizard.inference.orchUrl,
 })

+const savedSvcs = wizard.inference.services as Record<string, string | number>
 const services = reactive([
-  { key: 'ollama', label: 'Ollama', host: 'ollama', port: 11434 },
-  { key: 'searxng', label: 'SearXNG', host: 'searxng', port: 8080 },
+  {
+    key: 'ollama',
+    label: 'Ollama',
+    host: (savedSvcs['ollama_host'] as string) || wizard.inference.ollamaHost || 'localhost',
+    port: (savedSvcs['ollama_port'] as number) || wizard.inference.ollamaPort || 11434,
+  },
+  {
+    key: 'searxng',
+    label: 'SearXNG',
+    host: (savedSvcs['searxng_host'] as string) || 'searxng',
+    port: (savedSvcs['searxng_port'] as number) || 8080,
+  },
 ])

+function onUseManagedOrchard(e: Event) {
+  const checked = (e.target as HTMLInputElement).checked
+  form.orchUrl = checked ? MANAGED_ORCH_URL : ''
+}
+
 async function runTest() {
  testing.value = true
  testResult.value = null
  wizard.inference.anthropicKey = form.anthropicKey
  wizard.inference.openaiUrl = form.openaiUrl
  wizard.inference.openaiKey = form.openaiKey
+  wizard.inference.orchUrl = form.orchUrl
+  const ollamaSvc = services.find(s => s.key === 'ollama')
+  if (ollamaSvc) {
+    wizard.inference.ollamaHost = ollamaSvc.host
+    wizard.inference.ollamaPort = ollamaSvc.port
+  }
  testResult.value = await wizard.testInference()
  testing.value = false
 }
@ -115,10 +173,10 @@ async function runTest() {
 function back() { router.push('/setup/identity') }

 async function next() {
-  // Sync form back to store
  wizard.inference.anthropicKey = form.anthropicKey
  wizard.inference.openaiUrl = form.openaiUrl
  wizard.inference.openaiKey = form.openaiKey
+  wizard.inference.orchUrl = form.orchUrl

  const svcMap: Record<string, string | number> = {}
  services.forEach(s => {
@ -131,6 +189,7 @@ async function next() {
    anthropic_key: form.anthropicKey,
    openai_url: form.openaiUrl,
    openai_key: form.openaiKey,
+    orch_url: form.orchUrl,
    services: svcMap,
  })
  if (ok) router.push('/setup/search')
@ -166,4 +225,33 @@ async function next() {
 .svc-port {
  text-align: right;
 }
+
+.step__check-row {
+  display: flex;
+  flex-direction: column;
+  gap: var(--space-1);
+  margin-bottom: var(--space-4);
+}
+
+.step__checkbox-label {
+  display: flex;
+  align-items: center;
+  gap: var(--space-2);
+  cursor: pointer;
+  font-size: 0.9rem;
+  color: var(--color-text);
+}
+
+.step__checkbox {
+  width: 1rem;
+  height: 1rem;
+  accent-color: var(--color-primary);
+  flex-shrink: 0;
+}
+
+.step__check-hint {
+  font-size: 0.8rem;
+  color: var(--color-text-muted);
+  padding-left: calc(1rem + var(--space-2));
+}
 </style>
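
Reviewer note: the new `services` initialiser resolves each host and port through a chain of `||` fallbacks (saved wizard services, then the store value, then a hard-coded default). The sketch below restates that chain as a standalone function under stated assumptions: `resolveEndpoint`, `SavedServices`, and `Endpoint` are hypothetical names, and the `<key>_host` / `<key>_port` naming is taken from the keys read in the diff.

```typescript
type SavedServices = Record<string, string | number>

interface Endpoint {
  host: string
  port: number
}

// Hypothetical helper: first usable value wins, in the order
// saved wizard state -> store value -> default.
function resolveEndpoint(
  key: string,
  saved: SavedServices,
  storeValue: Partial<Endpoint>,
  defaults: Endpoint,
): Endpoint {
  const savedHost = saved[`${key}_host`]
  const savedPort = saved[`${key}_port`]
  return {
    // `||` (not `??`) means an empty string or a port of 0 also falls through.
    host: (typeof savedHost === 'string' && savedHost) || storeValue.host || defaults.host,
    port: (typeof savedPort === 'number' && savedPort) || storeValue.port || defaults.port,
  }
}
```

Unlike the `as string` / `as number` casts in the diff, the `typeof` checks here would ignore a saved port that was persisted as a string; whether that stricter behaviour is desirable depends on how the backend serialises the services map.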
--- a/web/src/views/wizard/WizardIntegrationsStep.vue
+++ b/web/src/views/wizard/WizardIntegrationsStep.vue
@ -1,6 +1,6 @@
 <template>
  <div class="step">
-    <h2 class="step__heading">Step 7 — Integrations</h2>
+    <h2 class="step__heading">Step 8 — Integrations</h2>
    <p class="step__caption">
      Optional. Connect external tools to supercharge your workflow.
      You can configure these any time in Settings → System.
@ -54,6 +54,7 @@ const wizard = useWizardStore()
 const config = useAppConfigStore()
 const router = useRouter()

+
 const isPaid = computed(() =>
  wizard.tier === 'paid' || wizard.tier === 'premium',
 )
@ -87,7 +88,12 @@ async function finish() {
  // Save integration selections (step 7) then mark wizard complete
  await wizard.saveStep(8, { integrations: [...checkedIds.value] })
  const ok = await wizard.complete()
-  if (ok) router.replace('/')
+  if (ok) {
+    // Update store before navigating so the router guard sees wizard as complete
+    // without waiting for a full config.load() round-trip.
+    config.wizardComplete = true
+    router.replace('/')
+  }
 }
 </script>

--- a/web/src/views/wizard/WizardResumeStep.vue
+++ b/web/src/views/wizard/WizardResumeStep.vue
@ -2,7 +2,7 @@
  <div class="step">
    <h2 class="step__heading">Step 3 — Your Resume</h2>
    <p class="step__caption">
-      Upload a resume to auto-populate your profile, or build it manually.
+      Upload a resume to auto-populate your profile, build it manually, or let an AI guide you.
    </p>

    <!-- Tabs -->
@ -13,14 +13,31 @@
        class="resume-tab"
        :class="{ 'resume-tab--active': tab === 'upload' }"
        @click="tab = 'upload'"
-      >Upload File</button>
+      >
+        <span class="resume-tab__icon" aria-hidden="true">📄</span>
+        <span class="resume-tab__label">Upload File</span>
+      </button>
      <button
        role="tab"
        :aria-selected="tab === 'manual'"
        class="resume-tab"
        :class="{ 'resume-tab--active': tab === 'manual' }"
        @click="tab = 'manual'"
-      >Build Manually</button>
+      >
+        <span class="resume-tab__icon" aria-hidden="true">✏️</span>
+        <span class="resume-tab__label">Build Manually</span>
+      </button>
+      <button
+        role="tab"
+        :aria-selected="tab === 'ai'"
+        class="resume-tab resume-tab--ai"
+        :class="{ 'resume-tab--active': tab === 'ai' }"
+        @click="tab = 'ai'"
+      >
+        <span class="resume-tab__icon" aria-hidden="true">{{ hasAiAccess ? '✨' : '🔒' }}</span>
+        <span class="resume-tab__label">AI Assistant</span>
+        <span v-if="!hasAiAccess" class="resume-tab__badge">Paid</span>
+      </button>
    </div>

    <!-- Upload tab -->
@ -106,6 +123,34 @@
      </button>
    </div>

+    <!-- AI assistant tab -->
+    <div v-if="tab === 'ai'" class="resume-ai">
+      <div v-if="!hasAiAccess" class="ai-gate">
+        <p class="ai-gate__icon" aria-hidden="true">🔒</p>
+        <p class="ai-gate__heading">AI Assistant requires a Paid plan</p>
+        <p class="ai-gate__body">
+          Upgrade to Paid, or bring your own LLM key in
+          <strong>Settings → LLM Backends</strong> to unlock the AI profile assistant for free.
+        </p>
+        <p class="ai-gate__body">
+          In the meantime, use <button class="ai-gate__link" @click="tab = 'upload'">Upload File</button>
+          or <button class="ai-gate__link" @click="tab = 'manual'">Build Manually</button>.
+        </p>
+      </div>
+      <div v-else class="ai-embed">
+        <p class="ai-embed__intro">
+          The AI assistant will ask you a few questions to build your profile.
+          Your answers are saved locally — nothing is sent anywhere without your approval.
+        </p>
+        <a href="/wizard/ai-profile" class="btn-primary ai-embed__cta">
+          Open AI Assistant →
+        </a>
+        <p class="ai-embed__note">
+          Opens in a focused view. Come back here to continue the wizard once you're done.
+        </p>
+      </div>
+    </div>
+
    <div v-if="validationError" class="step__warning" style="margin-top: var(--space-4)">
      {{ validationError }}
    </div>
@ -120,17 +165,21 @@
 </template>

 <script setup lang="ts">
-import { ref } from 'vue'
+import { ref, computed } from 'vue'
 import { useRouter } from 'vue-router'
 import { useWizardStore } from '../../stores/wizard'
 import type { WorkExperience } from '../../stores/wizard'
 import { useApiFetch } from '../../composables/useApi'
+import { useAppConfigStore } from '../../stores/appConfig'
 import './wizard.css'

 const wizard = useWizardStore()
 const router = useRouter()
+const config = useAppConfigStore()

-const tab = ref<'upload' | 'manual'>(
+const hasAiAccess = computed(() => config.tier !== 'free' || config.byokUnlocked)
+
+const tab = ref<'upload' | 'manual' | 'ai'>(
  wizard.resume.experience.length > 0 ? 'manual' : 'upload',
 )
 const dragging = ref(false)
@ -223,30 +272,69 @@ async function next() {
 <style scoped>
 .resume-tabs {
  display: flex;
-  gap: 0;
-  border-bottom: 2px solid var(--color-border-light);
+  gap: var(--space-2);
+  border-bottom: 2px solid var(--color-border);
  margin-bottom: var(--space-6);
 }

 .resume-tab {
+  display: flex;
+  align-items: center;
+  gap: var(--space-2);
  padding: var(--space-2) var(--space-5);
-  background: none;
-  border: none;
+  background: var(--color-surface-alt);
+  border: 1.5px solid var(--color-border);
  border-bottom: 2px solid transparent;
+  border-radius: var(--radius-md) var(--radius-md) 0 0;
  margin-bottom: -2px;
  cursor: pointer;
  font-family: var(--font-body);
-  font-size: 0.9rem;
+  font-size: 0.875rem;
  color: var(--color-text-muted);
-  transition: color var(--transition), border-color var(--transition);
+  transition: color var(--transition), background var(--transition), border-color var(--transition);
+}
+
+.resume-tab:hover:not(.resume-tab--active) {
+  background: var(--color-surface-raised);
+  color: var(--color-text);
+  border-color: var(--color-border);
 }

 .resume-tab--active {
+  background: var(--color-surface);
  color: var(--color-primary);
-  border-bottom-color: var(--color-primary);
+  border-color: var(--color-border);
+  border-bottom-color: var(--color-surface);
  font-weight: 600;
 }

+.resume-tab--ai.resume-tab--active {
+  color: var(--color-accent);
+  border-bottom-color: var(--color-surface);
+}
+
+.resume-tab__icon {
+  font-size: 1rem;
+  line-height: 1;
+}
+
+.resume-tab__label {
+  /* explicit — keeps tab text from being an accessibility mystery */
+}
+
+.resume-tab__badge {
+  font-size: 0.7rem;
+  font-weight: 700;
+  padding: 1px var(--space-2);
+  background: color-mix(in srgb, var(--color-accent) 15%, transparent);
+  color: var(--color-accent);
+  border: 1px solid color-mix(in srgb, var(--color-accent) 30%, transparent);
+  border-radius: var(--radius-full);
+  margin-left: var(--space-1);
+  text-transform: uppercase;
+  letter-spacing: 0.04em;
+}
+
 .upload-zone {
  display: flex;
  flex-direction: column;
@ -310,4 +398,86 @@ async function next() {
  grid-template-columns: 1fr 1fr;
  gap: var(--space-4);
 }
+
+/* ── AI tab panels ──────────────────────────────────── */
+.resume-ai {
+  min-height: 200px;
+  display: flex;
+  flex-direction: column;
+  justify-content: center;
+}
+
+.ai-gate {
+  text-align: center;
+  padding: var(--space-8) var(--space-4);
+  display: flex;
+  flex-direction: column;
+  align-items: center;
+  gap: var(--space-3);
+  background: var(--color-surface-alt);
+  border: 1px dashed var(--color-border);
+  border-radius: var(--radius-lg);
+}
+
+.ai-gate__icon {
+  font-size: 2rem;
+  margin: 0;
+}
+
+.ai-gate__heading {
+  font-size: 1rem;
+  font-weight: 600;
+  color: var(--color-text);
+  margin: 0;
+}
+
+.ai-gate__body {
+  font-size: 0.875rem;
+  color: var(--color-text-muted);
+  line-height: 1.55;
+  margin: 0;
+  max-width: 380px;
+}
+
+.ai-gate__link {
+  background: none;
+  border: none;
+  padding: 0;
+  font-family: var(--font-body);
+  font-size: inherit;
+  color: var(--color-primary);
+  cursor: pointer;
+  text-decoration: underline;
+  text-underline-offset: 2px;
+}
+
+.ai-embed {
+  display: flex;
+  flex-direction: column;
+  align-items: flex-start;
+  gap: var(--space-4);
+  padding: var(--space-6);
+  background: color-mix(in srgb, var(--color-accent) 6%, var(--color-surface));
+  border: 1px solid color-mix(in srgb, var(--color-accent) 25%, transparent);
+  border-radius: var(--radius-lg);
+}
+
+.ai-embed__intro {
+  font-size: 0.9rem;
+  color: var(--color-text);
+  line-height: 1.6;
+  margin: 0;
+}
+
+.ai-embed__cta {
+  text-decoration: none;
+  display: inline-flex;
+  align-items: center;
+}
+
+.ai-embed__note {
+  font-size: 0.8rem;
+  color: var(--color-text-muted);
+  margin: 0;
+}
 </style>
--- a/web/src/views/wizard/WizardSearchStep.vue
+++ b/web/src/views/wizard/WizardSearchStep.vue
@ -1,6 +1,6 @@
 <template>
  <div class="step">
-    <h2 class="step__heading">Step 6 — Search Preferences</h2>
+    <h2 class="step__heading">Step 7 — Search Preferences</h2>
    <p class="step__caption">
      Tell Peregrine what roles and markets to watch. You can add more profiles
      in Settings → Search later.
Author	SHA1	Message	Date
pyr0ball	e225346d23	ci: retrigger after Docker network pool fix	2026-06-26 20:41:18 -07:00
pyr0ball	bfb6de0dfe	ci: add freeze/** branches to CI trigger freeze/rc branches were not covered by the push trigger, leaving RC-stage work untested. Adds 'freeze/**' alongside existing patterns.	2026-06-26 19:24:40 -07:00
pyr0ball	82c26074d8	fix: search prefs wizard data loss, resume sync link, docs + GUI help links Bug fixes (filed as #125–#128): - Wizard step 7 read data.titles instead of data.search.titles — user-entered job titles and locations were silently dropped on every wizard run (#125) - GET /api/settings/search returned "titles" key but store expected "job_titles" — Settings → Search Prefs always showed empty even when data existed (#126) - remote_only preference not persisted during wizard setup (#127) - apply-to-profile didn't set default_resume_id in user.yaml, so future Resume Profile saves never synced back to the library entry (#128) Also: - Wizard step headings corrected (off-by-one after Training step was inserted) - Ollama host in wizard inference step now reads from saved wizard state - Resume upload during wizard now creates a library entry and sets it as default Docs: - New: docs/user-guide/daily-workflow.md — end-to-end daily usage guide - Updated: docs/user-guide/settings.md — rewritten for Vue SPA (was Streamlit) - mkdocs.yml nav: Daily Workflow added as first User Guide entry GUI help links: - web/src/composables/useDocsUrl.ts — shared docs base URL composable - Home: "Daily Workflow guide ↗" link in subtitle - Job Review: "? Docs" link in title row - Resume Library: "? Help" link in header - Settings → Resume Profile: "? Help" link in page header - Settings → Search Prefs: "? Help" link in page header	2026-06-15 16:52:56 -07:00
pyr0ball	f799aff4e0	fix: CPU as default inference profile, remote last in list - Reorder PROFILES in step_hardware.py, _WIZARD_PROFILES in dev-api.py, and <option> elements in WizardHardwareStep.vue: cpu → single-gpu → dual-gpu → cf-orch → remote - _suggest_profile() now defaults to "cpu" instead of "remote" when no local GPUs detected - Update no-GPU hint text to remove "Remote" from suggested options - Add nvidia GPU device reservation to compose.wizard-test.yml so the wizard test instance can run nvidia-smi and detect host GPUs - Switch wizard-test compose to use ghcr.io/circuitforgellc/peregrine:latest (same image as main compose, avoids stale peregrine-api tag drift)	2026-06-15 09:11:14 -07:00
pyr0ball	7e361aa6d1	chore: release Dockerfile and GHCR publish workflow for RC1 - Replace stale Streamlit Dockerfile with self-contained release build (uvicorn/FastAPI; Streamlit removed in #104) - cf-orch BSL client installed via BuildKit secret in release CI; community builds skip it gracefully and fall back to local backends - compose.yml api build now uses single-repo context (context: .) so self-hosters can build without sibling repo setup - Add image: tags to api + web services in compose.yml and compose.demo.yml so docker compose pull works for pre-built images - Enable Docker push in release.yml: api + web to GHCR on v* tags (was disabled pending BSL registry policy — cf-agents#3 resolved) - cloud image (compose.cloud.yml / Dockerfile.cfcore) unchanged: never published, built on Heimdall with sibling repos available - .dockerignore: add plain_text_resume.yaml and adzuna.yaml	2026-06-14 20:03:40 -07:00
pyr0ball	80041d1dd9	feat: wire cf-orch allocate flow for LLM routing - Fix cf_text base_url (was port 8006/cf-musicgen, corrected to 8008/cf-text) - Add cf_orch blocks to cf_text, ollama, ollama_research, vllm_research backends - Fix ollama_research base_url to host.docker.internal:11435 (was Docker service name) - Promote cf_text to top of research_fallback_order - Add cf_text backend to llm.cloud.yaml with cf_orch block - Wire _RL_WIZARD rate limit to wizard_ai_interview endpoint (closes TODO from #122) Closes: #122	2026-06-14 15:21:53 -07:00
pyr0ball	b3435a8bd8	fix: add slowapi to requirements.txt for Docker image slowapi was only in environment.yml (conda env) but missing from requirements.txt, causing ModuleNotFoundError in the Docker container.	2026-06-14 14:13:12 -07:00
pyr0ball	e85fb9bba3	test: fix rate limiter cross-test contamination Each importlib.reload(dev_api) re-applies @limiter.limit() decorators to the shared slowapi Limiter singleton, accumulating stale registrations in _route_limits. One real HTTP request then triggered N limit-checks (N = reload count), exhausting per-hour budgets prematurely. Fix: conftest.py autouse fixture resets both _storage and _route_limits before each test, giving a clean slate regardless of prior reloads. Also updates test_dev_api_prep.py client fixture to use monkeypatch to clear DEMO_MODE + importlib.reload to get a fresh IS_DEMO module state (prevents 403 bleed from test_demo_guard.py tests running first). All 842 tests passing.	2026-06-14 14:00:31 -07:00
pyr0ball	88b6943527	merge: feat/122-rate-limiting into freeze/rc-1 Per-user LLM rate limiting via slowapi: cloud-aware key function, 4 endpoint limits, demo bypass, SSRF and path traversal already in fix/ci-ruff-lint merge. Closes: #122	2026-06-14 12:41:18 -07:00
pyr0ball	71e8eeb090	merge: feat/77-ai-wizard into freeze/rc-1 AI profile wizard full implementation: backend interview endpoints, BYOK tier flag, chat UI, store actions (skip/keepChatting), settings CTA, quality review fixes. Closes: #77	2026-06-14 12:16:49 -07:00
pyr0ball	6db1fe1546	merge: fix/ci-ruff-lint into freeze/rc-1 CI lint fixes, CVE security mitigations, sync status UI (#120), bugbot Forgejo token fallback (#118), npm audit, mnemo compose stub.	2026-06-14 12:16:40 -07:00
pyr0ball	b13abb1118	feat(settings): sync status UI (#120) + bugbot Forgejo token fallback (#118) Issue #120 — sync status panel in DataView: - Add SyncStore (web/src/stores/settings/sync.ts) to track last-sync timestamp, in-progress state, and error message for profile/preferences - Extend DataView with a sync status section: last synced time, refresh button, error display, and per-section progress indicators Issue #118 — bugbot Forgejo token fallback: - scripts/feedback_api.py: try FORGEJO_BOT_TOKEN first, then fall back to FORGEJO_TOKEN so ops can provision a dedicated cf-bugbot account without breaking existing single-token installs Add FORGEJO_BOT_TOKEN and LLM_RATE_* env var documentation to .env.example Closes: #120 Closes: #118	2026-06-14 12:16:16 -07:00
pyr0ball	3cdd14c345	fix(security): CVE mitigations — path traversal, SSRF, dep upgrades, npm audit Path traversal (cloud middleware): - Add _VALID_USER_ID_RE UUID regex; reject non-UUID user_id before constructing db path from CLOUD_DATA_ROOT / user_id / ... - Non-UUID values log a warning and fall through to unauthenticated path SSRF (test_email IMAP endpoint): - Add _is_ssrf_host() using ipaddress + socket.gethostbyname() - Checks resolved IP against RFC-1918, loopback, and link-local ranges - Fails closed on DNS resolution errors (returns True = blocked) Dependency security pins in environment.yml (transitive CVEs): - starlette>=1.0.1 (PYSEC-2026-161), python-multipart>=0.0.27 (CVE-2026-40347), aiohttp>=3.14.0, tornado>=6.5.5, cryptography>=46.0.7, langsmith>=0.8.0, gitpython>=3.1.50, lxml>=6.1.0, idna>=3.15, markdownify>=0.14.1 - Direct dep upgrades: requests>=2.33.0, pypdf>=6.12.0, python-dotenv>=1.2.2, PyJWT>=2.13.0, curl_cffi>=0.15.0 npm audit (web/package-lock.json): - Resolved 7 of 9 CVEs; 2 remaining esbuild CVEs require vite 8 upgrade (tracked as issue #123 — breaking change, deferred)	2026-06-14 12:16:00 -07:00
pyr0ball	ad27467026	chore(infra): add mnemo service stub to compose.yml Pre-existing local development addition — mnemo vector memory service placeholder for future integration work.	2026-06-14 12:15:16 -07:00
pyr0ball	d801650db1	feat(api): per-user LLM rate limiting via slowapi Add scripts/rate_limit.py with cloud-aware key function: - In cloud mode, extracts user_id from _request_db ContextVar path (part[-3]) so each cloud user has their own rate limit bucket - In demo mode, returns unique per-request key to disable limiting entirely (_demo_guard handles write-blocking; rate limiting would block the demo UX) - Falls back to client IP for local/self-hosted installs Wire limiter to 4 endpoints with conservative per-user limits: - POST /generate/cover-letter: 20/hour - POST /research/run: 10/hour - POST /qa/suggest: 60/hour - POST /survey/analyze: 30/hour Add _demo_guard() to generate_research and suggest_qa_answer (was missing). Fix pre-existing silent except in suggest_qa_answer: was bare except pass, now logs warning with exc_info. Add _RL_WIZARD placeholder constant with TODO to wire to wizard/ai/interview after feat/77 merges (declared but intentionally not applied yet to avoid false sense of security — comment makes the gap explicit). 18 tests covering cloud user isolation, demo bypass, IP fallback, all 4 endpoints returning 429 on excess, retry_after header, and demo guard. Closes: #122	2026-06-14 12:14:21 -07:00
pyr0ball	eebfc84a80	fix(wizard): quality review fixes — store encapsulation + skip action + settings CTA - Add keepChatting() action to aiInterview store; replace direct store.complete = false mutation in WizardAIView template with store.keepChatting() - Add skip() action wrapping SKIP_SIGNAL constant; replace magic string store.send('skip') with store.skip() - Fix skip button disabled condition to include || store.complete (was always enabled when wizard was complete, allowing spurious skip after finalize) - Add _persist() call after user bubble append in send() so localStorage draft is written before the async fetch — prevents stale draft on browser refresh during slow LLM call - Fix @click="store.startOver" → @click="store.startOver()" (missing parentheses) - Add 2 tests: skip() sends SKIP_SIGNAL, keepChatting() clears complete without reset - Remove 'ultra' from Tier type in appConfig.ts (violates no-ultra-tier policy) - Add MyProfileView wizard callout banner with tier-aware unlock/upgrade CTAs - Add clarifying comment on wizard route guard in router/index.ts Closes: #77	2026-06-14 12:13:58 -07:00
pyr0ball	cecf85de02	feat(wizard): AI interview store, WizardAIView chat UI, byokUnlocked in appConfig	2026-06-13 20:10:38 -07:00
pyr0ball	e9943908c6	fix(wizard): 503 on LLM error, sanitize history content, typed HistoryMessage model	2026-06-13 20:04:14 -07:00
pyr0ball	6d1edff1b9	fix(wizard): inject profile_so_far context into AI interview LLM prompt	2026-06-13 19:59:58 -07:00
pyr0ball	6327a4cdd9	feat(wizard): backend AI interview endpoints + BYOK tier flag	2026-06-13 19:57:00 -07:00
pyr0ball	3048d8e2f4	docs: add LLM development disclosure to README Humans own design, architecture, code review, testing, and verification. LLMs are part of our development workflow. Links to circuitforge.tech/positions for our full position.	2026-05-28 08:20:16 -07:00
pyr0ball	02d79e6727	fix(ci): install ruff before lint step ruff is not in requirements.txt (dev-only tool) so the CI runner couldn't find it. Install explicitly in the workflow.	2026-05-21 12:03:46 -07:00
pyr0ball	e4c5744d87	fix(ci): restore TaskSpec re-export in task_scheduler.py ruff --fix removed the TaskSpec import as unused within the module, but it is part of the public API — tests import it from scripts.task_scheduler rather than reaching into circuitforge_core directly. Add # noqa: F401 to protect intentional re-exports from future auto-fix.	2026-05-21 11:51:40 -07:00
pyr0ball	46bae7db1c	fix(ci): rename GITHUB_MIRROR_TOKEN secret to GH_MIRROR_TOKEN Forgejo reserves the GITHUB_* prefix for secret names — creating a secret called GITHUB_MIRROR_TOKEN returns 'invalid secret name'. Also rename the GITHUB_TOKEN step env var to GH_MIRROR_PAT to avoid collision with the built-in Forgejo Actions context variable.	2026-05-21 11:41:11 -07:00
pyr0ball	e87c707dd9	chore(lint): ruff auto-fix unused imports in tests/ Removes unused imports flagged by ruff F401 across 47 test files. Auto-fix only — imports verified unused by static analysis.	2026-05-20 23:07:52 -07:00
pyr0ball	7dcdf551fc	chore(lint): ruff auto-fix unused imports in scripts/ and scrapers/ Removes unused imports flagged by ruff F401 across 12 scripts. All removals are safe — ruff only auto-fixes imports that are verifiably unused.	2026-05-20 23:07:26 -07:00
pyr0ball	544a6aeeb3	fix(ci): add ruff config, clean lint in dev-api.py + scripts - Add pyproject.toml with ruff per-file-ignores: - Exclude deprecated app/ Streamlit dir entirely - Suppress E702 in dev-api.py (intentional compact Pydantic models) - Suppress E402 in finetune_local.py (conditional ML imports after CUDA check) - Suppress F841/E741/E702 in tests/ (mock-patch capture pattern) - Remove unused db_path_obj assignment in dev-api.py:760 - Add # noqa: E402 to documented mid-file imports in dev-api.py - Rename ambiguous l variable to line/lbl in finetune_local.py + label_tool.py	2026-05-20 23:06:49 -07:00