docs: add cloud architecture + cloud-deployment.md

architecture.md: updated Docker Compose table (3 compose files), database
layer (Postgres platform + SQLite-per-user), cloud session middleware,
telemetry system, and cloud design decisions.

cloud-deployment.md (new): full operational runbook — env vars, data root
layout, GDPR deletion, platform DB queries, telemetry, backup/restore,
Caddy routing, demo instance, and onboarding a new app to the cloud.
pyr0ball 2026-03-09 23:02:29 -07:00
parent a893ba6527
commit ba295cb010
3 changed files with 400 additions and 79 deletions


@@ -6,87 +6,179 @@ This page describes Peregrine's system structure, layer boundaries, and key desi
## System Overview
### Pipeline
```mermaid
flowchart LR
sources["JobSpy\nCustom Boards"]
discover["discover.py"]
db[("staging.db\nSQLite")]
match["match.py\nScoring"]
review["Job Review\nApprove / Reject"]
apply["Apply Workspace\nCover letter + PDF"]
kanban["Interviews\nphone_screen → hired"]
sync["sync.py"]
notion["Notion DB"]
sources --> discover --> db --> match --> review --> apply --> kanban
db --> sync --> notion
```
### Docker Compose Services
Three compose files serve different deployment contexts:
| File | Project name | Port | Purpose |
|------|-------------|------|---------|
| `compose.yml` | `peregrine` | 8502 | Local self-hosted install (default) |
| `compose.demo.yml` | `peregrine-demo` | 8504 | Public demo at `demo.circuitforge.tech/peregrine`; `DEMO_MODE=true`, no LLM |
| `compose.cloud.yml` | `peregrine-cloud` | 8505 | Cloud managed instance at `menagerie.circuitforge.tech/peregrine`; `CLOUD_MODE=true`, per-user data |
```mermaid
flowchart TB
subgraph local["compose.yml (local)"]
app_l["**app** :8502\nStreamlit UI"]
ollama_l["**ollama**\nLocal LLM"]
vllm_l["**vllm**\nvLLM"]
vision_l["**vision**\nMoondream2"]
searxng_l["**searxng**\nWeb Search"]
db_l[("staging.db\nSQLite")]
end
subgraph cloud["compose.cloud.yml (cloud)"]
app_c["**app** :8505\nStreamlit UI\nCLOUD_MODE=true"]
searxng_c["**searxng**\nWeb Search"]
db_c[("menagerie-data/\n<user-id>/staging.db\nSQLCipher")]
pg[("Postgres\nplatform DB\n:5433")]
end
```
Solid lines = always connected. Dashed lines = optional/profile-dependent backends.
### Streamlit App Layer
```mermaid
flowchart TD
entry["app/app.py\nEntry point · navigation · sidebar task badge"]
setup["0_Setup.py\nFirst-run wizard\n⚠ Gates everything"]
review["1_Job_Review.py\nApprove / reject queue"]
settings["2_Settings.py\nAll user configuration"]
apply["4_Apply.py\nCover letter gen + PDF export"]
interviews["5_Interviews.py\nKanban: phone_screen → hired"]
prep["6_Interview_Prep.py\nResearch brief + practice Q&A"]
survey["7_Survey.py\nCulture-fit survey assistant"]
wizard["app/wizard/\nstep_hardware.py … step_integrations.py\ntiers.py — feature gate definitions"]
entry --> setup
entry --> review
entry --> settings
entry --> apply
entry --> interviews
entry --> prep
entry --> survey
setup <-.->|wizard steps| wizard
```
### Scripts Layer
Framework-independent — no Streamlit imports. Can be called from CLI, FastAPI, or background threads.
| Script | Purpose |
|--------|---------|
| `discover.py` | JobSpy + custom board orchestration |
| `match.py` | Resume keyword scoring |
| `db.py` | All SQLite helpers (single source of truth) |
| `llm_router.py` | LLM fallback chain |
| `generate_cover_letter.py` | Cover letter generation |
| `company_research.py` | Pre-interview research brief |
| `task_runner.py` | Background daemon thread executor |
| `imap_sync.py` | IMAP email fetch + classify |
| `sync.py` | Push to external integrations |
| `user_profile.py` | `UserProfile` wrapper for `user.yaml` |
| `preflight.py` | Port + resource check |
| `custom_boards/` | Per-board scrapers |
| `integrations/` | Per-service integration drivers |
| `vision_service/` | FastAPI Moondream2 inference server |
### Config Layer
Plain YAML files. Gitignored files contain secrets; `.example` files are committed as templates.
| File | Purpose |
|------|---------|
| `config/user.yaml` | Personal data + wizard state |
| `config/llm.yaml` | LLM backends + fallback chains |
| `config/search_profiles.yaml` | Job search configuration |
| `config/resume_keywords.yaml` | Scoring keywords |
| `config/blocklist.yaml` | Excluded companies/domains |
| `config/email.yaml` | IMAP credentials |
| `config/integrations/` | Per-integration credentials |
### Database Layer
**Local mode** — `staging.db`: SQLite, single file, gitignored.
**Cloud mode** — Hybrid:
- **Postgres (platform layer):** account data, subscriptions, telemetry consent. Shared across all users.
- **SQLite-per-user (content layer):** each user's job data in an isolated, SQLCipher-encrypted file at `/devl/menagerie-data/<user-id>/peregrine/staging.db`. Schema is identical to local — the app sees no difference.
#### Local SQLite tables
| Table | Purpose |
|-------|---------|
| `jobs` | Core pipeline — all job data |
| `job_contacts` | Email thread log per job |
| `company_research` | LLM-generated research briefs |
| `background_tasks` | Async task queue state |
| `survey_responses` | Culture-fit survey Q&A pairs |
#### Postgres platform tables (cloud only)
| Table | Purpose |
|-------|---------|
| `subscriptions` | User tier, license JWT, product |
| `usage_events` | Anonymous usage telemetry (consent-gated) |
| `telemetry_consent` | Per-user telemetry preferences + hard kill switch |
| `support_access_grants` | Time-limited support session grants |
---
### Cloud Session Middleware
`app/cloud_session.py` handles multi-tenant routing transparently:
```
Request → Caddy injects X-CF-Session header (from Directus session cookie)
        → resolve_session() validates JWT, derives db_path + db_key
        → all DB calls use get_db_path() instead of DEFAULT_DB
```
Key functions:
| Function | Purpose |
|----------|---------|
| `resolve_session(app)` | Called at top of every page — no-op in local mode |
| `get_db_path()` | Returns per-user `db_path` (cloud) or `DEFAULT_DB` (local) |
| `derive_db_key(user_id)` | `HMAC(SERVER_SECRET, user_id)` — deterministic per-user SQLCipher key |
The app code never branches on `CLOUD_MODE` except at the entry points (`resolve_session` and `get_db_path`). Everything downstream is transparent.
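A minimal sketch of that branch point (illustrative only: the real `get_db_path()` takes no arguments and pulls the user id from the resolved session; `DEFAULT_DB` and the env-var handling here are assumptions):

```python
import os
from pathlib import Path

# Assumed constant; in the real code this lives in the scripts layer (db.py)
DEFAULT_DB = Path("staging.db")

def get_db_path(user_id: str) -> Path:
    """Per-user DB path in cloud mode, the shared local file otherwise."""
    if os.environ.get("CLOUD_MODE", "false").lower() != "true":
        return DEFAULT_DB  # local mode: single gitignored SQLite file
    data_root = Path(os.environ.get("CLOUD_DATA_ROOT", "/devl/menagerie-data"))
    return data_root / user_id / "peregrine" / "staging.db"
```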
### Telemetry (cloud only)
`app/telemetry.py` is the **only** path to the `usage_events` table. No feature may write there directly.
```python
from app.telemetry import log_usage_event
log_usage_event(user_id, "peregrine", "cover_letter_generated", {"words": 350})
```
- Complete no-op when `CLOUD_MODE=false`
- Checks `telemetry_consent.all_disabled` first — if set, nothing is written, no exceptions
- Swallows all exceptions so telemetry never crashes the app
---
## Layer Boundaries
@@ -129,7 +221,18 @@ submit_task(db_path, task_type="cover_letter", job_id=42)
submit_task(db_path, task_type="company_research", job_id=42)
```
Tasks are recorded in the `background_tasks` table with the following state machine:
```mermaid
stateDiagram-v2
[*] --> queued : submit_task()
queued --> running : daemon picks up
running --> completed
running --> failed
queued --> failed : server restart clears stuck tasks
completed --> [*]
failed --> [*]
```
**Dedup rule:** Only one `queued` or `running` task per `(task_type, job_id)` pair is allowed at a time. Submitting a duplicate is a silent no-op.
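A sketch of that check, assuming the column names shown in the tables above (the real `submit_task` takes a `db_path` rather than a connection, so the name here is hypothetical):

```python
import sqlite3

def submit_task_deduped(conn: sqlite3.Connection, task_type: str, job_id: int) -> bool:
    """Queue a task unless a queued/running one exists for (task_type, job_id)."""
    dup = conn.execute(
        "SELECT 1 FROM background_tasks "
        "WHERE task_type = ? AND job_id = ? AND status IN ('queued', 'running')",
        (task_type, job_id),
    ).fetchone()
    if dup is not None:
        return False  # duplicate: silent no-op per the dedup rule
    conn.execute(
        "INSERT INTO background_tasks (task_type, job_id, status) "
        "VALUES (?, ?, 'queued')",
        (task_type, job_id),
    )
    conn.commit()
    return True
```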
@@ -166,3 +269,18 @@ The scripts layer was deliberately kept free of Streamlit imports. This means th
### Vision service is a separate process
Moondream2 requires `torch` and `transformers`, which are incompatible with the lightweight main conda environment. The vision service runs as a separate FastAPI process in a separate conda environment (`job-seeker-vision`), keeping the main env free of GPU dependencies.
### Cloud mode is a transparent layer, not a fork
`CLOUD_MODE=true` activates two entry points (`resolve_session`, `get_db_path`) and the telemetry middleware. Every other line of app code is unchanged. There is no cloud branch, no conditional imports, no schema divergence. The local-first architecture is preserved end-to-end; the cloud layer sits on top of it.
### SQLite-per-user instead of shared Postgres
Each cloud user gets their own encrypted SQLite file. This means:
- No SQL migrations when the schema changes — new users get the latest schema, existing users keep their file as-is
- Zero risk of cross-user data leakage at the DB layer
- GDPR deletion is `rm -rf /devl/menagerie-data/<user-id>/` — auditable and complete
- The app can be tested locally with `CLOUD_MODE=false` without any Postgres dependency
The Postgres platform DB holds only account metadata (subscriptions, consent, telemetry) — never job search content.


@@ -0,0 +1,198 @@
# Cloud Deployment
This page covers operating the Peregrine cloud managed instance at `menagerie.circuitforge.tech/peregrine`.
---
## Architecture Overview
```
Browser → Caddy (bastion) → host:8505 → peregrine-cloud container
  ├── cloud_session.py                                      (session routing)
  ├── /devl/menagerie-data/<user-id>/peregrine/staging.db   (SQLCipher)
  └── Postgres :5433                                        (platform DB)
```
Caddy injects the Directus session cookie as `X-CF-Session`. `cloud_session.py` validates the JWT, derives the per-user db path and SQLCipher key, and injects both into `st.session_state`. All downstream DB calls are transparent — the app never knows it's multi-tenant.
---
## Compose File
```bash
# Start
docker compose -f compose.cloud.yml --project-name peregrine-cloud --env-file .env up -d
# Stop
docker compose -f compose.cloud.yml --project-name peregrine-cloud down
# Logs
docker compose -f compose.cloud.yml --project-name peregrine-cloud logs app -f
# Rebuild after code changes
docker compose -f compose.cloud.yml --project-name peregrine-cloud build app
docker compose -f compose.cloud.yml --project-name peregrine-cloud up -d
```
---
## Required Environment Variables
These must be present in `.env` (gitignored) before starting the cloud stack:
| Variable | Description | Where to find |
|----------|-------------|---------------|
| `CLOUD_MODE` | Must be `true` | Hardcoded in compose.cloud.yml |
| `CLOUD_DATA_ROOT` | Host path for per-user data trees | `/devl/menagerie-data` |
| `DIRECTUS_JWT_SECRET` | Directus signing secret — validates session JWTs | `website/.env`, key `DIRECTUS_SECRET` |
| `CF_SERVER_SECRET` | Server secret for SQLCipher key derivation | Generate: `openssl rand -base64 32 \| tr -d '/=+' \| cut -c1-32` |
| `PLATFORM_DB_URL` | Postgres connection string for platform DB | `postgresql://cf_platform:<pass>@host.docker.internal:5433/circuitforge_platform` |
!!! warning "SECRET ROTATION"
`CF_SERVER_SECRET` is used to derive all per-user SQLCipher keys via `HMAC(secret, user_id)`. Rotating this secret renders all existing user databases unreadable. Do not rotate it without a migration plan.
---
## Data Root
User data lives at `/devl/menagerie-data/` on the host, bind-mounted into the container:
```
/devl/menagerie-data/
<directus-user-uuid>/
peregrine/
staging.db ← SQLCipher-encrypted (AES-256)
config/ ← llm.yaml, server.yaml, user.yaml, etc.
data/ ← documents, exports, attachments
```
The directory is created automatically on first login. The SQLCipher key for each user is derived deterministically: `HMAC-SHA256(CF_SERVER_SECRET, user_id)`.
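The derivation is small enough to show in full; a hex digest is assumed here, the real code may encode the raw MAC differently:

```python
import hashlib
import hmac
import os

def derive_db_key(user_id: str) -> str:
    """Deterministic per-user SQLCipher key: HMAC-SHA256(CF_SERVER_SECRET, user_id)."""
    secret = os.environ["CF_SERVER_SECRET"].encode()
    return hmac.new(secret, user_id.encode(), hashlib.sha256).hexdigest()
```

Because the key is derived, nothing per-user needs to be stored; it is also why rotating `CF_SERVER_SECRET` renders every existing database unreadable.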
### GDPR / Data deletion
To fully delete a user's data:
```bash
# Remove all content data
rm -rf /devl/menagerie-data/<user-id>/
# Remove platform DB rows (cascades)
docker exec cf-platform-db psql -U cf_platform -d circuitforge_platform \
-c "DELETE FROM subscriptions WHERE user_id = '<user-id>';"
```
---
## Platform Database
The Postgres platform DB runs as `cf-platform-db` in the website compose stack (port 5433 on host).
```bash
# Connect (opens an interactive psql shell)
docker exec -it cf-platform-db psql -U cf_platform -d circuitforge_platform
```
The remaining commands run inside that `psql` shell, not in bash:
```sql
-- Check tables
\dt
-- View telemetry consent for a user
SELECT * FROM telemetry_consent WHERE user_id = '<uuid>';
-- View recent usage events
SELECT user_id, event_type, occurred_at FROM usage_events
ORDER BY occurred_at DESC LIMIT 20;
```
The schema is initialised on container start from `platform-db/init.sql` in the website repo.
---
## Telemetry
`app/telemetry.py` is the **only** entry point to `usage_events`. Never write to that table directly.
```python
from app.telemetry import log_usage_event
# Fires in cloud mode only; no-op locally
log_usage_event(user_id, "peregrine", "cover_letter_generated", {"words": 350})
```
Events are blocked if:
1. `telemetry_consent.all_disabled = true` (hard kill switch, overrides all)
2. `telemetry_consent.usage_events_enabled = false`
The user controls both from Settings → 🔒 Privacy.
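The precedence can be written as a small predicate (column names follow the platform table; the real check queries Postgres):

```python
def telemetry_allowed(consent: dict) -> bool:
    """Hard kill switch first, then the per-category flag."""
    if consent.get("all_disabled"):
        return False  # overrides everything
    return bool(consent.get("usage_events_enabled"))
```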
---
## Backup / Restore (Cloud Mode)
The Settings → 💾 Data tab handles backup/restore transparently. In cloud mode:
- **Export:** the SQLCipher-encrypted DB is decrypted before zipping — the downloaded `.zip` is a portable plain SQLite archive, compatible with any local Docker install.
- **Import:** a plain SQLite backup is re-encrypted with the user's key on restore.
The user's `base_dir` in cloud mode is `get_db_path().parent` (`/devl/menagerie-data/<user-id>/peregrine/`), not the app root.
---
## Routing (Caddy)
`menagerie.circuitforge.tech` in `/devl/caddy-proxy/Caddyfile`:
```caddy
menagerie.circuitforge.tech {
    encode gzip zstd

    handle /peregrine* {
        reverse_proxy http://host.docker.internal:8505 {
            header_up X-CF-Session {header.Cookie}
        }
    }

    handle {
        respond "This app is not yet available in the managed cloud — check back soon." 503
    }

    log {
        output file /data/logs/menagerie.circuitforge.tech.log
        format json
    }
}
```
`header_up X-CF-Session {header.Cookie}` passes the full cookie header so `cloud_session.py` can extract the Directus session token.
!!! note "Caddy inode gotcha"
After editing the Caddyfile, run `docker restart caddy-proxy` — not `caddy reload`. The Edit tool creates a new inode; Docker bind mounts pin to the original inode and `caddy reload` re-reads the stale one.
---
## Demo Instance
The public demo at `demo.circuitforge.tech/peregrine` runs separately:
```bash
# Start demo
docker compose -f compose.demo.yml --project-name peregrine-demo up -d
# Rebuild after code changes
docker compose -f compose.demo.yml --project-name peregrine-demo build app
docker compose -f compose.demo.yml --project-name peregrine-demo up -d
```
`DEMO_MODE=true` blocks all LLM inference calls at `llm_router.py`. Discovery, job enrichment, and the UI work normally. Demo data lives in `demo/config/` and `demo/data/` — isolated from personal data.
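The gate presumably sits at the top of the router's inference path; a sketch with a stub standing in for the real fallback chain:

```python
import os

def route_llm_call(prompt: str) -> str:
    """Refuse inference in demo mode before any backend is tried."""
    if os.environ.get("DEMO_MODE", "false").lower() == "true":
        raise RuntimeError("LLM inference is disabled in demo mode")
    return _first_available_backend(prompt)  # stand-in for the fallback chain

def _first_available_backend(prompt: str) -> str:
    return f"(response to: {prompt})"  # stub backend for illustration
```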
---
## Adding a New App to the Cloud
To onboard a new menagerie app (e.g. `falcon`) to the cloud:
1. Add `resolve_session("falcon")` at the top of each page (calls `cloud_session.py` with the app slug)
2. Replace `DEFAULT_DB` references with `get_db_path()`
3. Add `app/telemetry.py` import and `log_usage_event()` calls at key action points
4. Create `compose.cloud.yml` following the Peregrine pattern (port, `CLOUD_MODE=true`, data mount)
5. Add a Caddy `handle /falcon*` block in `menagerie.circuitforge.tech`, routing to the new port
6. `cloud_session.py` automatically creates `<data_root>/<user-id>/falcon/` on first login


@@ -1,9 +1,9 @@
site_name: Peregrine
site_description: AI-powered job search pipeline
site_author: Circuit Forge LLC
site_url: https://docs.circuitforge.tech/peregrine
repo_url: https://git.opensourcesolarpunk.com/pyr0ball/peregrine
repo_name: pyr0ball/peregrine
theme:
  name: material
@@ -32,7 +32,11 @@ theme:
markdown_extensions:
- admonition
- pymdownx.details
- pymdownx.superfences:
    custom_fences:
      - name: mermaid
        class: mermaid
        format: !!python/name:pymdownx.superfences.fence_code_format
- pymdownx.highlight:
    anchor_linenums: true
- pymdownx.tabbed:
@@ -58,6 +62,7 @@ nav:
- Developer Guide:
  - Contributing: developer-guide/contributing.md
  - Architecture: developer-guide/architecture.md
  - Cloud Deployment: developer-guide/cloud-deployment.md
  - Adding a Scraper: developer-guide/adding-scrapers.md
  - Adding an Integration: developer-guide/adding-integrations.md
  - Testing: developer-guide/testing.md