diff --git a/.env.example b/.env.example index 61d12b2..85223ab 100644 --- a/.env.example +++ b/.env.example @@ -12,21 +12,10 @@ VISION_REVISION=2025-01-09 DOCS_DIR=~/Documents/JobSearch OLLAMA_MODELS_DIR=~/models/ollama -VLLM_MODELS_DIR=~/models/vllm # override with full path to your model dir -VLLM_MODEL=Ouro-1.4B # cover letters — fast 1.4B model -VLLM_RESEARCH_MODEL=Ouro-2.6B-Thinking # research — reasoning 2.6B model; restart vllm to switch -VLLM_MAX_MODEL_LEN=4096 # increase to 8192 for Thinking models with long CoT -VLLM_GPU_MEM_UTIL=0.75 # lower to 0.6 if sharing GPU with other services +VLLM_MODELS_DIR=~/models/vllm +VLLM_MODEL=Ouro-1.4B OLLAMA_DEFAULT_MODEL=llama3.2:3b -# ── LLM env-var auto-config (alternative to config/llm.yaml) ───────────────── -# Set any of these to configure LLM backends without needing a config/llm.yaml. -# Priority: Anthropic > OpenAI-compat > Ollama (always tried as local fallback). -OLLAMA_HOST=http://localhost:11434 # Ollama host; override if on a different machine -OLLAMA_MODEL=llama3.2:3b # model to request from Ollama -OPENAI_MODEL=gpt-4o-mini # model override for OpenAI-compat backend -ANTHROPIC_MODEL=claude-haiku-4-5-20251001 # model override for Anthropic backend - # API keys (required for remote profile) ANTHROPIC_API_KEY= OPENAI_COMPAT_URL= @@ -39,12 +28,6 @@ FORGEJO_API_URL=https://git.opensourcesolarpunk.com/api/v1 # GITHUB_TOKEN= # future — enable when public mirror is active # GITHUB_REPO= # future -# ── CF-hosted coordinator (Paid+ tier) ─────────────────────────────────────── -# Set CF_LICENSE_KEY to authenticate with the hosted coordinator. -# Leave both blank for local self-hosted cf-orch or bare-metal inference. 
-CF_LICENSE_KEY= -CF_ORCH_URL=https://orch.circuitforge.tech - # Cloud multi-tenancy (compose.cloud.yml only — do not set for local installs) CLOUD_MODE=false CLOUD_DATA_ROOT=/devl/menagerie-data diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index 1bc28dc..f956e6c 100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -22,12 +22,6 @@ jobs: python-version: "3.11" cache: pip - - name: Configure git credentials for Forgejo - env: - FORGEJO_TOKEN: ${{ secrets.FORGEJO_TOKEN }} - run: | - git config --global url."https://oauth2:${FORGEJO_TOKEN}@git.opensourcesolarpunk.com/".insteadOf "https://git.opensourcesolarpunk.com/" - - name: Install dependencies run: pip install -r requirements.txt diff --git a/CHANGELOG.md b/CHANGELOG.md index e0754e3..d2fa234 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -9,222 +9,6 @@ Format follows [Keep a Changelog](https://keepachangelog.com/en/1.0.0/). --- -## [0.8.5] — 2026-04-02 - -### Added - -- **Vue onboarding wizard** — 7-step first-run setup replaces the Streamlit wizard - in the Vue SPA: Hardware detection → Tier → Resume upload/build → Identity → - Inference & API keys → Search preferences → Integrations. Progress saves to - `user.yaml` on every step; crash-recovery resumes from the last completed step. -- **Wizard API endpoints** — `GET /api/wizard/status`, `POST /api/wizard/step`, - `GET /api/wizard/hardware`, `POST /api/wizard/inference/test`, - `POST /api/wizard/complete`. Inference test always soft-fails so Ollama being - unreachable never blocks setup completion. -- **Cloud auto-skip** — cloud instances automatically complete steps 1 (hardware), - 2 (tier), and 5 (inference) and drop the user directly on the Resume step. -- **`wizardGuard` router gate** — all Vue routes require wizard completion; completed - users are bounced away from `/setup` to `/`. 
-- **Chip-input search step** — job titles and locations entered as press-Enter/comma - chips; validates at least one title before advancing. -- **Integrations tile grid** — optional step 7 shows Notion, Calendar, Slack, Discord, - Drive with paid-tier badges; skippable on Finish. - -### Fixed - -- **User config isolation: dangerous fallback removed** — `_user_yaml_path()` fell - back to `/devl/job-seeker/config/user.yaml` (legacy profile) when `user.yaml` - didn't exist at the expected path; new users now get an empty dict instead of - another user's data. Affects profile, resume, search, and all wizard endpoints. -- **Resume path not user-isolated** — `RESUME_PATH = Path("config/plain_text_resume.yaml")` - was a relative CWD path shared across all users. Replaced with `_resume_path()` - derived from `_user_yaml_path()` / `STAGING_DB`. -- **Resume upload silently returned empty data** — `upload_resume` was passing a - file path string to `structure_resume()` which expects raw text; now reads bytes - and dispatches to the correct extractor (`extract_text_from_pdf` / `_docx` / `_odt`). -- **Wizard resume step read wrong envelope field** — `WizardResumeStep.vue` read - `data.experience` but the upload response wraps parsed data under `data.data`. - ---- - -## [0.8.4] — 2026-04-02 - -### Fixed - -- **Cloud: cover letter used wrong user's profile** — `generate_cover_letter.generate()` - loaded `_profile` from the global `config/user.yaml` at module import time, so all - cloud users got the default user's name, voice, and mission preferences in their - generated letters. `generate()` now accepts a `user_yaml_path` parameter; `task_runner` - derives it from the per-user config directory (`db_path/../config/user.yaml`) and - passes it through. `_build_system_context`, `_build_mission_notes`, `detect_mission_alignment`, - `build_prompt`, and `_trim_to_letter_end` all accept a `profile` override so the - per-call profile is used end-to-end without breaking CLI mode. 
-- **Apply Workspace: hardcoded config paths in cloud mode** — `4_Apply.py` was loading - `_USER_YAML` and `RESUME_YAML` from the repo-root `config/` before `resolve_session()` - ran, so cloud users saw the global (Meg's) resume in the Apply tab. Both paths now - derive from `get_config_dir()` after session resolution. - -### Changed - -- **Vue SPA open to all tiers** — Vue 3 frontend is no longer gated behind the beta - flag; all tier users can switch to the Vue UI from Settings. -- **LLM model candidates** — vllm backend now tries Qwen2.5-3B first, Phi-4-mini - as fallback (was reversed). cf_orch allocation block added to vllm config. -- **Preflight** — removed `vllm` from Docker adoption list; vllm is now managed - entirely by cf-orch and should not be stubbed by preflight. - ---- - -## [0.8.3] — 2026-04-01 - -### Fixed -- **CI: Forgejo auth** — GitHub Actions `pip install` was failing to fetch - `circuitforge-core` from the private Forgejo VCS URL. Added `FORGEJO_TOKEN` - repository secret and a `git config insteadOf` step to inject credentials - before `pip install`. -- **CI: settings API tests** — 6 `test_dev_api_settings` PUT/POST tests were - returning HTTP 500 in CI because `_user_yaml_path()` read the module-level - `DB_PATH` constant (frozen at import time), so `monkeypatch.setenv("STAGING_DB")` - had no effect. Fixed by reading `os.environ` at call time. - ---- - -## [0.8.2] — 2026-04-01 - -### Fixed -- **CI pipeline** — `pip install -r requirements.txt` was failing in GitHub Actions - because `-e ../circuitforge-core` requires a sibling directory that doesn't exist - in a single-repo checkout. Replaced with a `git+https://` VCS URL fallback; - `Dockerfile.cfcore` still installs from the local `COPY` to avoid redundant - network fetches during Docker builds. 
-- **Vue-nav reload loop** — `sync_ui_cookie()` was calling - `window.parent.location.reload()` on every render when `user.yaml` has - `ui_preference: vue` but no Caddy proxy is in the traffic path (test instances, - bare Docker). Gated the reload on `PEREGRINE_CADDY_PROXY=1`; instances without - the env var set the cookie silently and skip the reload. - -### Changed -- **cfcore VRAM lease integration** — the task scheduler now acquires a VRAM lease - from the cf-orch coordinator before running a batch of LLM tasks and releases it - when the batch completes. Visible in the coordinator dashboard at `:7700`. -- **`CF_ORCH_URL` env var** — scheduler reads coordinator address from - `CF_ORCH_URL` (default `http://localhost:7700`); set to - `http://host.docker.internal:7700` in Docker compose files so containers can - reach the host coordinator. -- **All compose files on `Dockerfile.cfcore`** — `compose.yml`, `compose.cloud.yml`, - and `compose.test-cfcore.yml` all use the parent-context build. `build: .` is - removed from `compose.yml`. - ---- - -## [0.8.1] — 2026-04-01 - -### Fixed -- **Job title suggester silent failure** — when the LLM returned empty arrays or - non-JSON text, the spinner would complete with zero UI feedback. Now shows an - explicit "No new suggestions found" info message with a resume-upload hint for - new users who haven't uploaded a resume yet. -- **Suggester exception handling** — catch `Exception` instead of only - `RuntimeError` so connection errors and `FileNotFoundError` (missing llm.yaml) - surface as error messages rather than crashing the page silently. - -### Added -- **`Dockerfile.cfcore`** — parent-context Dockerfile that copies - `circuitforge-core/` alongside `peregrine/` before `pip install`, resolving - the `-e ../circuitforge-core` editable requirement inside Docker. -- **`compose.test-cfcore.yml`** — single-user test instance on port 8516 for - smoke-testing cfcore shim integration before promoting to the cloud instance. 
- ---- - -## [0.8.0] — 2026-04-01 - -### Added -- **ATS Resume Optimizer** (gap report free; LLM rewrite paid+) - - `scripts/resume_optimizer.py` — full pipeline: TF-IDF gap extraction → - `prioritize_gaps` → `rewrite_for_ats` → hallucination guard (anchor-set - diffing on employers, institutions, and dates) - - `scripts/db.py` — `optimized_resume` + `ats_gap_report` columns; - `save_optimized_resume` / `get_optimized_resume` helpers - - `GET /api/jobs/{id}/resume_optimizer` — fetch gap report + rewrite - - `POST /api/jobs/{id}/resume_optimizer/generate` — queue rewrite task - - `GET /api/jobs/{id}/resume_optimizer/task` — poll task status - - `web/src/components/ResumeOptimizerPanel.vue` — gap report (all tiers), - LLM rewrite section (paid+), hallucination warning badge, `.txt` download - - `ResumeOptimizerPanel` integrated into `ApplyWorkspace` - -- **Vue SPA full merge** (closes #8) — `feature/vue-spa` merged to `main` - - `dev-api.py` — full FastAPI backend (settings, jobs, interviews, prep, - survey, digest, resume optimizer); cloud session middleware (JWT → per-user - SQLite); BYOK credential store - - `dev_api.py` — symlink → `dev-api.py` for importable module alias - - `scripts/job_ranker.py` — two-stage ranking for `/api/jobs/stack` - - `scripts/credential_store.py` — per-user BYOK API key management - - `scripts/user_profile.py` — `load_user_profile` / `save_user_profile` - - `web/src/components/TaskIndicator.vue` + `web/src/stores/tasks.ts` — - live background task queue display - - `web/public/` — peregrine logo assets (SVG + PNG) - -- **API test suite** — 5 new test modules (622 tests total) - - `tests/test_dev_api_settings.py` (38 tests) - - `tests/test_dev_api_interviews.py`, `test_dev_api_prep.py`, - `test_dev_api_survey.py`, `test_dev_api_digest.py` - -### Fixed -- **Cloud DB routing** — `app/pages/1_Job_Review.py`, `5_Interviews.py`, - `6_Interview_Prep.py`, `7_Survey.py` were hardcoding `DEFAULT_DB`; now - use `get_db_path()` for correct 
per-user routing in cloud mode (#24) -- **Test isolation** — `importlib.reload(dev_api)` in digest/interviews - fixtures reset all module globals, silently breaking `monkeypatch.setattr` - in subsequent test files; replaced with targeted `monkeypatch.setattr(dev_api, - "DB_PATH", tmp_db)` (#26) - ---- - -## [0.7.0] — 2026-03-22 - -### Added -- **Vue 3 SPA — beta access for paid tier** — The new Vue 3 frontend (built with - Vite + UnoCSS) is now merged into `main` and available to paid-tier subscribers - as an opt-in beta. The Streamlit UI remains the default and will continue to - receive full support. - - `web/` — full Vue 3 SPA source (components, stores, router, composables, - views) from `feature/vue-spa` - - `web/src/components/ClassicUIButton.vue` — one-click switch back to the - Classic (Streamlit) UI; sets `prgn_ui=streamlit` cookie and appends - `?prgn_switch=streamlit` so `user.yaml` stays in sync - - `web/src/composables/useFeatureFlag.ts` — reads `prgn_demo_tier` cookie for - demo toolbar visual consistency (display-only, not an authoritative gate) - -- **UI switcher** — Reddit-style opt-in to the Vue SPA with durable preference - persistence and graceful fallback. - - `app/components/ui_switcher.py` — `sync_ui_cookie()`, `switch_ui()`, - `render_banner()`, `render_settings_toggle()` - - `scripts/user_profile.py` — `ui_preference` field (`streamlit` | `vue`, - default: `streamlit`) with round-trip `save()` - - `app/wizard/tiers.py` — `vue_ui_beta: "paid"` feature key; `demo_tier` - keyword arg on `can_use()` for thread-safe demo mode simulation - - Banner (dismissible, paid tier only) + Settings → System → Deployment toggle - - Caddy cookie routing: `prgn_ui=vue` → nginx Vue SPA; absent/`streamlit` → - Streamlit. 502 fallback clears cookie and redirects with `?ui_fallback=1` - -- **Demo toolbar** — slim full-width tier-simulation bar for `DEMO_MODE` - instances. 
Free / Paid / Premium pills let demo visitors explore all feature - tiers without an account. Persists via `prgn_demo_tier` cookie. Default: Paid - (most compelling first impression). `app/components/demo_toolbar.py` - -- **Docker `web` service** — multi-stage nginx container serving the Vue SPA - `dist/` build. Added to `compose.yml` (port 8506), `compose.demo.yml` - (port 8507), `compose.cloud.yml` (port 8508). `manage.sh build` now includes - the `web` service alongside `app`. - -### Changed -- **Caddy routing** — `menagerie.circuitforge.tech` and - `demo.circuitforge.tech` peregrine blocks now inspect the `prgn_ui` cookie - and fan-out to the Vue SPA service or Streamlit accordingly. - ---- - ## [0.6.2] — 2026-03-18 ### Added diff --git a/Dockerfile.cfcore b/Dockerfile.cfcore deleted file mode 100644 index 6387c2a..0000000 --- a/Dockerfile.cfcore +++ /dev/null @@ -1,47 +0,0 @@ -# Dockerfile.cfcore — build context must be the PARENT directory of peregrine/ -# -# Used when circuitforge-core is installed from source (not PyPI). -# Both repos must be siblings on the build host: -# /devl/peregrine/ → WORKDIR /app -# /devl/circuitforge-core/ → installed to /circuitforge-core -# -# Build manually: -# docker build -f peregrine/Dockerfile.cfcore -t peregrine-cfcore .. -# -# Via compose (compose.test-cfcore.yml sets context: ..): -# docker compose -f compose.test-cfcore.yml build -FROM python:3.11-slim - -WORKDIR /app - -# System deps for companyScraper (beautifulsoup4, fake-useragent, lxml) and PDF gen -# libsqlcipher-dev: required to build pysqlcipher3 (SQLCipher AES-256 encryption for cloud mode) -RUN apt-get update && apt-get install -y --no-install-recommends \ - gcc libffi-dev curl libsqlcipher-dev \ - && rm -rf /var/lib/apt/lists/* - -# Copy circuitforge-core and install it from the local path before requirements.txt. 
-# requirements.txt has a git+https:// fallback URL for CI (where circuitforge-core -# is not a sibling directory), but Docker always has the local copy available here. -COPY circuitforge-core/ /circuitforge-core/ -RUN pip install --no-cache-dir /circuitforge-core - -COPY peregrine/requirements.txt . -# Skip the cfcore line — already installed above from the local copy -RUN grep -v 'circuitforge-core' requirements.txt | pip install --no-cache-dir -r /dev/stdin - -# Install Playwright browser (cached separately from Python deps so requirements -# changes don't bust the ~600–900 MB Chromium layer and vice versa) -RUN playwright install chromium && playwright install-deps chromium - -# Bundle companyScraper (company research web scraper) -COPY peregrine/scrapers/ /app/scrapers/ - -COPY peregrine/ . - -EXPOSE 8501 - -CMD ["streamlit", "run", "app/app.py", \ - "--server.port=8501", \ - "--server.headless=true", \ - "--server.fileWatcherType=none"] diff --git a/README.md b/README.md index 228ac81..fb4e10e 100644 --- a/README.md +++ b/README.md @@ -1,33 +1,16 @@ # Peregrine -> **Primary development** happens at [git.opensourcesolarpunk.com](https://git.opensourcesolarpunk.com/Circuit-Forge/peregrine) — GitHub and Codeberg are push mirrors. Issues and PRs are welcome on either platform. +> **Primary development** happens at [git.opensourcesolarpunk.com](https://git.opensourcesolarpunk.com/pyr0ball/peregrine) — GitHub and Codeberg are push mirrors. Issues and PRs are welcome on either platform. 
[![License: BSL 1.1](https://img.shields.io/badge/License-BSL_1.1-blue.svg)](./LICENSE-BSL) [![CI](https://github.com/CircuitForge/peregrine/actions/workflows/ci.yml/badge.svg)](https://github.com/CircuitForge/peregrine/actions/workflows/ci.yml) -**Job search pipeline — by [Circuit Forge LLC](https://circuitforge.tech)** +**AI-powered job search pipeline — by [Circuit Forge LLC](https://circuitforge.tech)** -> *"Tools for the jobs that the system made hard on purpose."* +> *"Don't be evil, for real and forever."* ---- - -Job search is a second job nobody hired you for. - -ATS filters designed to reject. Job boards that show the same listing eight times. Cover letter number forty-seven for a role that might already be filled. Hours of prep for a phone screen that lasts twelve minutes. - -Peregrine handles the pipeline — discovery, matching, tracking, drafting, and prep — so you can spend your time doing the work you actually want to be doing. - -**LLM support is optional.** The full discovery and tracking pipeline works without one. When you do configure a backend, the LLM drafts the parts that are genuinely miserable — cover letters, company research briefs, interview prep sheets — and waits for your approval before anything goes anywhere. - -### What Peregrine does not do - -Peregrine does **not** submit job applications for you. You still have to go to each employer's site and click apply yourself. - -This is intentional. Automated mass-applying is a bad experience for everyone — it's also a trust violation with employers who took the time to post a real role. Peregrine is a preparation and organization tool, not a bot. - -What it *does* cover is everything before and after that click: finding the jobs, matching them against your resume, generating cover letters and prep materials, and once you've applied — tracking where you stand, classifying the emails that come back, and surfacing company research when an interview lands on your calendar. 
The submit button is yours. The rest of the grind is ours. - -> **Exception:** [AIHawk](https://github.com/nicolomantini/LinkedIn-Easy-Apply) is a separate, optional tool that handles LinkedIn Easy Apply automation. Peregrine integrates with it for AIHawk-compatible profiles, but it is not part of Peregrine's core pipeline. +Automates the full job search lifecycle: discovery → matching → cover letters → applications → interview prep. +Privacy-first, local-first. Your data never leaves your machine. --- @@ -36,7 +19,7 @@ What it *does* cover is everything before and after that click: finding the jobs **1. Clone and install dependencies** (Docker, NVIDIA toolkit if needed): ```bash -git clone https://git.opensourcesolarpunk.com/Circuit-Forge/peregrine +git clone https://git.opensourcesolarpunk.com/pyr0ball/peregrine cd peregrine ./manage.sh setup ``` @@ -146,26 +129,21 @@ Re-enter the wizard any time via **Settings → Developer → Reset wizard**. | **Company research briefs** | Free with LLM¹ | | **Interview prep & practice Q&A** | Free with LLM¹ | | **Survey assistant** (culture-fit Q&A, screenshot analysis) | Free with LLM¹ | -| **Wizard helpers** (career summary, bullet expansion, skill suggestions, job title suggestions, mission notes) | Free with LLM¹ | +| **AI wizard helpers** (career summary, bullet expansion, skill suggestions) | Free with LLM¹ | | Managed cloud LLM (no API key needed) | Paid | | Email sync & auto-classification | Paid | -| LLM-powered keyword blocklist | Paid | | Job tracking integrations (Notion, Airtable, Google Sheets) | Paid | | Calendar sync (Google, Apple) | Paid | | Slack notifications | Paid | | CircuitForge shared cover-letter model | Paid | -| Vue 3 SPA — full UI with onboarding wizard, job board, apply workspace, sort/filter, research modal, draft cover letter | Free | -| **Voice guidelines** (custom writing style & tone) | Premium with LLM¹ ² | | Cover letter model fine-tuning (your writing, your model) | Premium | | Multi-user 
support | Premium | -¹ **BYOK (bring your own key/backend) unlock:** configure any LLM backend — a local [Ollama](https://ollama.com) or vLLM instance, -or your own API key (Anthropic, OpenAI-compatible) — and all features marked **Free with LLM** or **Premium with LLM** +¹ **BYOK unlock:** configure any LLM backend — a local [Ollama](https://ollama.com) or vLLM instance, +or your own API key (Anthropic, OpenAI-compatible) — and all AI features marked **Free with LLM** unlock at no charge. The paid tier earns its price by providing managed cloud inference so you don't need a key at all, plus integrations and email sync. -² **Voice guidelines** requires Premium tier without a configured LLM backend. With BYOK, it unlocks at any tier. - --- ## Email Sync @@ -223,6 +201,6 @@ Full documentation at: https://docs.circuitforge.tech/peregrine ## License Core discovery pipeline: [MIT](LICENSE-MIT) -LLM features (cover letter generation, company research, interview prep, UI): [BSL 1.1](LICENSE-BSL) +AI features (cover letter generation, company research, interview prep, UI): [BSL 1.1](LICENSE-BSL) © 2026 Circuit Forge LLC diff --git a/app/Home.py b/app/Home.py index fab1428..ee5d4e8 100644 --- a/app/Home.py +++ b/app/Home.py @@ -19,8 +19,8 @@ _profile = UserProfile(_USER_YAML) if UserProfile.exists(_USER_YAML) else None _name = _profile.name if _profile else "Job Seeker" from scripts.db import init_db, get_job_counts, purge_jobs, purge_email_data, \ - purge_non_remote, archive_jobs, kill_stuck_tasks, cancel_task, \ - get_task_for_job, get_active_tasks, insert_job, get_existing_urls + purge_non_remote, archive_jobs, kill_stuck_tasks, get_task_for_job, get_active_tasks, \ + insert_job, get_existing_urls from scripts.task_runner import submit_task from app.cloud_session import resolve_session, get_db_path @@ -376,145 +376,178 @@ _scrape_status() st.divider() -# ── Danger zone ─────────────────────────────────────────────────────────────── +# ── Danger zone: purge + re-scrape 
──────────────────────────────────────────── with st.expander("⚠️ Danger Zone", expanded=False): - - # ── Queue reset (the common case) ───────────────────────────────────────── - st.markdown("**Queue reset**") st.caption( - "Archive clears your review queue while keeping job URLs for dedup, " - "so the same listings won't resurface on the next discovery run. " - "Use hard purge only if you want a full clean slate including dedup history." + "**Purge** permanently deletes jobs from the local database. " + "Applied and synced jobs are never touched." ) - _scope = st.radio( - "Clear scope", - ["Pending only", "Pending + approved (stale search)"], - horizontal=True, - label_visibility="collapsed", - ) - _scope_statuses = ( - ["pending"] if _scope == "Pending only" else ["pending", "approved"] - ) + purge_col, rescrape_col, email_col, tasks_col = st.columns(4) - _qc1, _qc2, _qc3 = st.columns([2, 2, 4]) - if _qc1.button("📦 Archive & reset", use_container_width=True, type="primary"): - st.session_state["confirm_dz"] = "archive" - if _qc2.button("🗑 Hard purge (delete)", use_container_width=True): - st.session_state["confirm_dz"] = "purge" + with purge_col: + st.markdown("**Purge pending & rejected**") + st.caption("Removes all _pending_ and _rejected_ listings so the next discovery starts fresh.") + if st.button("🗑 Purge Pending + Rejected", use_container_width=True): + st.session_state["confirm_purge"] = "partial" - if st.session_state.get("confirm_dz") == "archive": - st.info( - f"Archive **{', '.join(_scope_statuses)}** jobs? " - "URLs are kept for dedup — nothing is permanently deleted." 
- ) - _dc1, _dc2 = st.columns(2) - if _dc1.button("Yes, archive", type="primary", use_container_width=True, key="dz_archive_confirm"): - n = archive_jobs(get_db_path(), statuses=_scope_statuses) - st.success(f"Archived {n} jobs.") - st.session_state.pop("confirm_dz", None) - st.rerun() - if _dc2.button("Cancel", use_container_width=True, key="dz_archive_cancel"): - st.session_state.pop("confirm_dz", None) - st.rerun() - - if st.session_state.get("confirm_dz") == "purge": - st.warning( - f"Permanently delete **{', '.join(_scope_statuses)}** jobs? " - "This removes the URLs from dedup history too. Cannot be undone." - ) - _dc1, _dc2 = st.columns(2) - if _dc1.button("Yes, delete", type="primary", use_container_width=True, key="dz_purge_confirm"): - n = purge_jobs(get_db_path(), statuses=_scope_statuses) - st.success(f"Deleted {n} jobs.") - st.session_state.pop("confirm_dz", None) - st.rerun() - if _dc2.button("Cancel", use_container_width=True, key="dz_purge_cancel"): - st.session_state.pop("confirm_dz", None) - st.rerun() - - st.divider() - - # ── Background tasks ────────────────────────────────────────────────────── - _active = get_active_tasks(get_db_path()) - st.markdown(f"**Background tasks** — {len(_active)} active") - - if _active: - _task_icons = {"cover_letter": "✉️", "research": "🔍", "discovery": "🌐", "enrich_descriptions": "📝"} - for _t in _active: - _tc1, _tc2, _tc3 = st.columns([3, 4, 2]) - _icon = _task_icons.get(_t["task_type"], "⚙️") - _tc1.caption(f"{_icon} `{_t['task_type']}`") - _job_label = f"{_t['title']} @ {_t['company']}" if _t.get("title") else f"job #{_t['job_id']}" - _tc2.caption(_job_label) - _tc3.caption(f"_{_t['status']}_") - if st.button("✕ Cancel", key=f"dz_cancel_task_{_t['id']}", use_container_width=True): - cancel_task(get_db_path(), _t["id"]) + if st.session_state.get("confirm_purge") == "partial": + st.warning("Are you sure? 
This cannot be undone.") + c1, c2 = st.columns(2) + if c1.button("Yes, purge", type="primary", use_container_width=True): + deleted = purge_jobs(get_db_path(), statuses=["pending", "rejected"]) + st.success(f"Purged {deleted} jobs.") + st.session_state.pop("confirm_purge", None) + st.rerun() + if c2.button("Cancel", use_container_width=True): + st.session_state.pop("confirm_purge", None) st.rerun() - st.caption("") - _kill_col, _ = st.columns([2, 6]) - if _kill_col.button("⏹ Kill all stuck", use_container_width=True, disabled=len(_active) == 0): - killed = kill_stuck_tasks(get_db_path()) - st.success(f"Killed {killed} task(s).") - st.rerun() + with email_col: + st.markdown("**Purge email data**") + st.caption("Clears all email thread logs and email-sourced pending jobs so the next sync starts fresh.") + if st.button("📧 Purge Email Data", use_container_width=True): + st.session_state["confirm_purge"] = "email" + + if st.session_state.get("confirm_purge") == "email": + st.warning("This deletes all email contacts and email-sourced jobs. Cannot be undone.") + c1, c2 = st.columns(2) + if c1.button("Yes, purge emails", type="primary", use_container_width=True): + contacts, jobs = purge_email_data(get_db_path()) + st.success(f"Purged {contacts} email contacts, {jobs} email jobs.") + st.session_state.pop("confirm_purge", None) + st.rerun() + if c2.button("Cancel ", use_container_width=True): + st.session_state.pop("confirm_purge", None) + st.rerun() + + with tasks_col: + _active = get_active_tasks(get_db_path()) + st.markdown("**Kill stuck tasks**") + st.caption(f"Force-fail all queued/running background tasks. 
Currently **{len(_active)}** active.") + if st.button("⏹ Kill All Tasks", use_container_width=True, disabled=len(_active) == 0): + killed = kill_stuck_tasks(get_db_path()) + st.success(f"Killed {killed} task(s).") + st.rerun() + + with rescrape_col: + st.markdown("**Purge all & re-scrape**") + st.caption("Wipes _all_ non-applied, non-synced jobs then immediately runs a fresh discovery.") + if st.button("🔄 Purge All + Re-scrape", use_container_width=True): + st.session_state["confirm_purge"] = "full" + + if st.session_state.get("confirm_purge") == "full": + st.warning("This will delete ALL pending, approved, and rejected jobs, then re-scrape. Applied and synced records are kept.") + c1, c2 = st.columns(2) + if c1.button("Yes, wipe + scrape", type="primary", use_container_width=True): + purge_jobs(get_db_path(), statuses=["pending", "approved", "rejected"]) + submit_task(get_db_path(), "discovery", 0) + st.session_state.pop("confirm_purge", None) + st.rerun() + if c2.button("Cancel ", use_container_width=True): + st.session_state.pop("confirm_purge", None) + st.rerun() st.divider() - # ── Rarely needed (collapsed) ───────────────────────────────────────────── - with st.expander("More options", expanded=False): - _rare1, _rare2, _rare3 = st.columns(3) + pending_col, nonremote_col, approved_col, _ = st.columns(4) - with _rare1: - st.markdown("**Purge email data**") - st.caption("Clears all email thread logs and email-sourced pending jobs.") - if st.button("📧 Purge Email Data", use_container_width=True): - st.session_state["confirm_dz"] = "email" - if st.session_state.get("confirm_dz") == "email": - st.warning("Deletes all email contacts and email-sourced jobs. 
Cannot be undone.") - _ec1, _ec2 = st.columns(2) - if _ec1.button("Yes, purge emails", type="primary", use_container_width=True, key="dz_email_confirm"): - contacts, jobs = purge_email_data(get_db_path()) - st.success(f"Purged {contacts} email contacts, {jobs} email jobs.") - st.session_state.pop("confirm_dz", None) - st.rerun() - if _ec2.button("Cancel", use_container_width=True, key="dz_email_cancel"): - st.session_state.pop("confirm_dz", None) - st.rerun() + with pending_col: + st.markdown("**Purge pending review**") + st.caption("Removes only _pending_ listings, keeping your rejected history intact.") + if st.button("🗑 Purge Pending Only", use_container_width=True): + st.session_state["confirm_purge"] = "pending_only" - with _rare2: - st.markdown("**Purge non-remote**") - st.caption("Removes pending/approved/rejected on-site listings from the DB.") - if st.button("🏢 Purge On-site Jobs", use_container_width=True): - st.session_state["confirm_dz"] = "non_remote" - if st.session_state.get("confirm_dz") == "non_remote": - st.warning("Deletes all non-remote jobs not yet applied to. Cannot be undone.") - _rc1, _rc2 = st.columns(2) - if _rc1.button("Yes, purge on-site", type="primary", use_container_width=True, key="dz_nonremote_confirm"): - deleted = purge_non_remote(get_db_path()) - st.success(f"Purged {deleted} non-remote jobs.") - st.session_state.pop("confirm_dz", None) - st.rerun() - if _rc2.button("Cancel", use_container_width=True, key="dz_nonremote_cancel"): - st.session_state.pop("confirm_dz", None) - st.rerun() + if st.session_state.get("confirm_purge") == "pending_only": + st.warning("Deletes all pending jobs. Rejected jobs are kept. 
Cannot be undone.") + c1, c2 = st.columns(2) + if c1.button("Yes, purge pending", type="primary", use_container_width=True): + deleted = purge_jobs(get_db_path(), statuses=["pending"]) + st.success(f"Purged {deleted} pending jobs.") + st.session_state.pop("confirm_purge", None) + st.rerun() + if c2.button("Cancel ", use_container_width=True): + st.session_state.pop("confirm_purge", None) + st.rerun() - with _rare3: - st.markdown("**Wipe all + re-scrape**") - st.caption("Deletes all non-applied jobs then immediately runs a fresh discovery.") - if st.button("🔄 Wipe + Re-scrape", use_container_width=True): - st.session_state["confirm_dz"] = "rescrape" - if st.session_state.get("confirm_dz") == "rescrape": - st.warning("Wipes ALL pending, approved, and rejected jobs, then re-scrapes. Applied and synced records are kept.") - _wc1, _wc2 = st.columns(2) - if _wc1.button("Yes, wipe + scrape", type="primary", use_container_width=True, key="dz_rescrape_confirm"): - purge_jobs(get_db_path(), statuses=["pending", "approved", "rejected"]) - submit_task(get_db_path(), "discovery", 0) - st.session_state.pop("confirm_dz", None) - st.rerun() - if _wc2.button("Cancel", use_container_width=True, key="dz_rescrape_cancel"): - st.session_state.pop("confirm_dz", None) - st.rerun() + with nonremote_col: + st.markdown("**Purge non-remote**") + st.caption("Removes pending/approved/rejected jobs where remote is not set. Keeps anything already in the pipeline.") + if st.button("🏢 Purge On-site Jobs", use_container_width=True): + st.session_state["confirm_purge"] = "non_remote" + + if st.session_state.get("confirm_purge") == "non_remote": + st.warning("Deletes all non-remote jobs not yet applied to. 
Cannot be undone.") + c1, c2 = st.columns(2) + if c1.button("Yes, purge on-site", type="primary", use_container_width=True): + deleted = purge_non_remote(get_db_path()) + st.success(f"Purged {deleted} non-remote jobs.") + st.session_state.pop("confirm_purge", None) + st.rerun() + if c2.button("Cancel ", use_container_width=True): + st.session_state.pop("confirm_purge", None) + st.rerun() + + with approved_col: + st.markdown("**Purge approved (unapplied)**") + st.caption("Removes _approved_ jobs you haven't applied to yet — e.g. to reset after a review pass.") + if st.button("🗑 Purge Approved", use_container_width=True): + st.session_state["confirm_purge"] = "approved_only" + + if st.session_state.get("confirm_purge") == "approved_only": + st.warning("Deletes all approved-but-not-applied jobs. Cannot be undone.") + c1, c2 = st.columns(2) + if c1.button("Yes, purge approved", type="primary", use_container_width=True): + deleted = purge_jobs(get_db_path(), statuses=["approved"]) + st.success(f"Purged {deleted} approved jobs.") + st.session_state.pop("confirm_purge", None) + st.rerun() + if c2.button("Cancel ", use_container_width=True): + st.session_state.pop("confirm_purge", None) + st.rerun() + + st.divider() + + archive_col1, archive_col2, _, _ = st.columns(4) + + with archive_col1: + st.markdown("**Archive remaining**") + st.caption( + "Move all _pending_ and _rejected_ jobs to archived status. " + "Archived jobs stay in the DB for dedup — they just won't appear in Job Review." 
+ ) + if st.button("📦 Archive Pending + Rejected", use_container_width=True): + st.session_state["confirm_purge"] = "archive_remaining" + + if st.session_state.get("confirm_purge") == "archive_remaining": + st.info("Jobs will be archived (not deleted) — URLs are kept for dedup.") + c1, c2 = st.columns(2) + if c1.button("Yes, archive", type="primary", use_container_width=True): + archived = archive_jobs(get_db_path(), statuses=["pending", "rejected"]) + st.success(f"Archived {archived} jobs.") + st.session_state.pop("confirm_purge", None) + st.rerun() + if c2.button("Cancel ", use_container_width=True): + st.session_state.pop("confirm_purge", None) + st.rerun() + + with archive_col2: + st.markdown("**Archive approved (unapplied)**") + st.caption("Archive _approved_ listings you decided to skip — keeps history without cluttering the apply queue.") + if st.button("📦 Archive Approved", use_container_width=True): + st.session_state["confirm_purge"] = "archive_approved" + + if st.session_state.get("confirm_purge") == "archive_approved": + st.info("Approved jobs will be archived (not deleted).") + c1, c2 = st.columns(2) + if c1.button("Yes, archive approved", type="primary", use_container_width=True): + archived = archive_jobs(get_db_path(), statuses=["approved"]) + st.success(f"Archived {archived} approved jobs.") + st.session_state.pop("confirm_purge", None) + st.rerun() + if c2.button("Cancel ", use_container_width=True): + st.session_state.pop("confirm_purge", None) + st.rerun() # ── Setup banners ───────────────────────────────────────────────────────────── if _profile and _profile.wizard_complete: diff --git a/app/app.py b/app/app.py index efa6e51..fcd04df 100644 --- a/app/app.py +++ b/app/app.py @@ -17,39 +17,22 @@ sys.path.insert(0, str(Path(__file__).parent.parent)) logging.basicConfig(level=logging.WARNING, format="%(name)s %(levelname)s: %(message)s") -# Load .env before any os.environ reads — safe to call inside Docker too -# (uses setdefault, so 
Docker-injected vars take precedence over .env values) -from circuitforge_core.config.settings import load_env as _load_env -_load_env(Path(__file__).parent.parent / ".env") - IS_DEMO = os.environ.get("DEMO_MODE", "").lower() in ("1", "true", "yes") import streamlit as st from scripts.db import DEFAULT_DB, init_db, get_active_tasks -from scripts.db_migrate import migrate_db from app.feedback import inject_feedback_button -from app.cloud_session import resolve_session, get_db_path, get_config_dir, get_cloud_tier +from app.cloud_session import resolve_session, get_db_path, get_config_dir import sqlite3 -_LOGO_CIRCLE = Path(__file__).parent / "static" / "peregrine_logo_circle.png" -_LOGO_FULL = Path(__file__).parent / "static" / "peregrine_logo.png" - st.set_page_config( page_title="Peregrine", - page_icon=str(_LOGO_CIRCLE) if _LOGO_CIRCLE.exists() else "💼", + page_icon="💼", layout="wide", ) resolve_session("peregrine") init_db(get_db_path()) -migrate_db(Path(get_db_path())) - -# Demo tier — initialize once per session (cookie persistence handled client-side) -if IS_DEMO and "simulated_tier" not in st.session_state: - st.session_state["simulated_tier"] = "paid" - -if _LOGO_CIRCLE.exists(): - st.logo(str(_LOGO_CIRCLE), icon_image=str(_LOGO_CIRCLE)) # ── Startup cleanup — runs once per server process via cache_resource ────────── @st.cache_resource @@ -106,15 +89,6 @@ _show_wizard = not IS_DEMO and ( if _show_wizard: _setup_page = st.Page("pages/0_Setup.py", title="Setup", icon="👋") st.navigation({"": [_setup_page]}).run() - # Sync UI cookie even during wizard so vue preference redirects correctly. - # Tier not yet computed here — use cloud tier (or "free" fallback). 
- try: - from app.components.ui_switcher import sync_ui_cookie as _sync_wizard_cookie - from app.cloud_session import get_cloud_tier as _gctr - _wizard_tier = _gctr() if _gctr() != "local" else "free" - _sync_wizard_cookie(_USER_YAML, _wizard_tier) - except Exception: - pass st.stop() # ── Navigation ───────────────────────────────────────────────────────────────── @@ -139,21 +113,6 @@ pg = st.navigation(pages) # ── Background task sidebar indicator ───────────────────────────────────────── # Fragment polls every 3s so stage labels update live without a full page reload. # The sidebar context WRAPS the fragment call — do not write to st.sidebar inside it. -_TASK_LABELS = { - "cover_letter": "Cover letter", - "company_research": "Research", - "email_sync": "Email sync", - "discovery": "Discovery", - "enrich_descriptions": "Enriching descriptions", - "score": "Scoring matches", - "scrape_url": "Scraping listing", - "enrich_craigslist": "Enriching listing", - "wizard_generate": "Wizard generation", - "prepare_training": "Training data", -} -_DISCOVERY_PIPELINE = ["discovery", "enrich_descriptions", "score"] - - @st.fragment(run_every=3) def _task_indicator(): tasks = get_active_tasks(get_db_path()) @@ -161,31 +120,28 @@ def _task_indicator(): return st.divider() st.markdown(f"**⏳ {len(tasks)} task(s) running**") - - pipeline_set = set(_DISCOVERY_PIPELINE) - pipeline_tasks = [t for t in tasks if t["task_type"] in pipeline_set] - other_tasks = [t for t in tasks if t["task_type"] not in pipeline_set] - - # Discovery pipeline: render as ordered sub-queue with indented steps - if pipeline_tasks: - ordered = [ - next((t for t in pipeline_tasks if t["task_type"] == typ), None) - for typ in _DISCOVERY_PIPELINE - ] - ordered = [t for t in ordered if t is not None] - for i, t in enumerate(ordered): - icon = "⏳" if t["status"] == "running" else "🕐" - label = _TASK_LABELS.get(t["task_type"], t["task_type"].replace("_", " ").title()) - stage = t.get("stage") or "" - detail = f" · 
{stage}" if stage else "" - prefix = "" if i == 0 else "↳ " - st.caption(f"{prefix}{icon} {label}{detail}") - - # All other tasks (cover letter, email sync, etc.) as individual rows - for t in other_tasks: - icon = "⏳" if t["status"] == "running" else "🕐" - label = _TASK_LABELS.get(t["task_type"], t["task_type"].replace("_", " ").title()) - stage = t.get("stage") or "" + for t in tasks: + icon = "⏳" if t["status"] == "running" else "🕐" + task_type = t["task_type"] + if task_type == "cover_letter": + label = "Cover letter" + elif task_type == "company_research": + label = "Research" + elif task_type == "email_sync": + label = "Email sync" + elif task_type == "discovery": + label = "Discovery" + elif task_type == "enrich_descriptions": + label = "Enriching" + elif task_type == "scrape_url": + label = "Scraping URL" + elif task_type == "wizard_generate": + label = "Wizard generation" + elif task_type == "enrich_craigslist": + label = "Enriching listing" + else: + label = task_type.replace("_", " ").title() + stage = t.get("stage") or "" detail = f" · {stage}" if stage else (f" — {t.get('company')}" if t.get("company") else "") st.caption(f"{icon} {label}{detail}") @@ -200,13 +156,6 @@ def _get_version() -> str: except Exception: return "dev" -# ── Effective tier (resolved before sidebar so switcher can use it) ────────── -# get_cloud_tier() returns "local" in dev/self-hosted mode, real tier in cloud. 
-_ui_profile = _UserProfile(_USER_YAML) if _UserProfile.exists(_USER_YAML) else None -_ui_yaml_tier = _ui_profile.effective_tier if _ui_profile else "free" -_ui_cloud_tier = get_cloud_tier() -_ui_tier = _ui_cloud_tier if _ui_cloud_tier != "local" else _ui_yaml_tier - with st.sidebar: if IS_DEMO: st.info( @@ -236,31 +185,7 @@ with st.sidebar: ) st.divider() - try: - from app.components.ui_switcher import render_sidebar_switcher - render_sidebar_switcher(_USER_YAML, _ui_tier) - except Exception: - pass # never crash the app over the sidebar switcher st.caption(f"Peregrine {_get_version()}") inject_feedback_button(page=pg.title) -# ── Demo toolbar (DEMO_MODE only) ─────────────────────────────────────────── -if IS_DEMO: - from app.components.demo_toolbar import render_demo_toolbar - render_demo_toolbar() - -# ── UI switcher banner (paid tier; or all visitors in demo mode) ───────────── -try: - from app.components.ui_switcher import render_banner - render_banner(_USER_YAML, _ui_tier) -except Exception: - pass # never crash the app over the banner - pg.run() - -# ── UI preference cookie sync (runs after page render) ────────────────────── -try: - from app.components.ui_switcher import sync_ui_cookie - sync_ui_cookie(_USER_YAML, _ui_tier) -except Exception: - pass # never crash the app over cookie sync diff --git a/app/components/demo_toolbar.py b/app/components/demo_toolbar.py deleted file mode 100644 index 2c30c56..0000000 --- a/app/components/demo_toolbar.py +++ /dev/null @@ -1,72 +0,0 @@ -"""Demo toolbar — tier simulation for DEMO_MODE instances. - -Renders a slim full-width bar above the Streamlit nav showing -Free / Paid / Premium pills. Clicking a pill sets a prgn_demo_tier -cookie (for persistence across reloads) and st.session_state.simulated_tier -(for immediate use within the current render pass). - -Only ever rendered when DEMO_MODE=true. 
-""" -from __future__ import annotations - -import os - -import streamlit as st -import streamlit.components.v1 as components - -_VALID_TIERS = ("free", "paid", "premium") -_DEFAULT_TIER = "paid" # most compelling first impression - -_DEMO_MODE = os.environ.get("DEMO_MODE", "").lower() in ("1", "true", "yes") - -_COOKIE_JS = """ - -""" - - -def get_simulated_tier() -> str: - """Return the current simulated tier, defaulting to 'paid'.""" - return st.session_state.get("simulated_tier", _DEFAULT_TIER) - - -def set_simulated_tier(tier: str) -> None: - """Set simulated tier in session state + cookie. Reruns the page.""" - if tier not in _VALID_TIERS: - return - st.session_state["simulated_tier"] = tier - components.html(_COOKIE_JS.format(tier=tier), height=0) - st.rerun() - - -def render_demo_toolbar() -> None: - """Render the demo mode toolbar. - - Shows a dismissible info bar with tier-selection pills. - Call this at the TOP of app.py's render pass, before pg.run(). - """ - current = get_simulated_tier() - - labels = {t: t.capitalize() + (" ✓" if t == current else "") for t in _VALID_TIERS} - - with st.container(): - cols = st.columns([3, 1, 1, 1, 2]) - with cols[0]: - st.caption("🎭 **Demo mode** — exploring as:") - for i, tier in enumerate(_VALID_TIERS): - with cols[i + 1]: - is_active = tier == current - if st.button( - labels[tier], - key=f"_demo_tier_{tier}", - type="primary" if is_active else "secondary", - use_container_width=True, - ): - if not is_active: - set_simulated_tier(tier) - with cols[4]: - st.caption("[Get your own →](https://circuitforge.tech/software/peregrine)") - st.divider() diff --git a/app/components/ui_switcher.py b/app/components/ui_switcher.py deleted file mode 100644 index 33ed955..0000000 --- a/app/components/ui_switcher.py +++ /dev/null @@ -1,262 +0,0 @@ -"""UI switcher component for Peregrine. - -Manages the prgn_ui cookie (Caddy routing signal) and user.yaml -ui_preference (durability across browser clears). 
- -Cookie mechanics ----------------- -Streamlit cannot read HTTP cookies server-side. Instead: -- sync_ui_cookie() injects a JS snippet that sets document.cookie. -- Vue SPA switch-back appends ?prgn_switch=streamlit to the redirect URL. - sync_ui_cookie() reads this param via st.query_params and uses it as - an override signal, then writes user.yaml to match. - -Call sync_ui_cookie() in the app.py render pass (after pg.run()). -""" -from __future__ import annotations - -import os -from pathlib import Path - -import streamlit as st -import streamlit.components.v1 as components - -from scripts.user_profile import UserProfile -from app.wizard.tiers import can_use - -_DEMO_MODE = os.environ.get("DEMO_MODE", "").lower() in ("1", "true", "yes") - -# When set, the app is running without a Caddy reverse proxy in front -# (local dev, direct port exposure). Switch to Vue by navigating directly -# to this URL instead of relying on cookie-based Caddy routing. -# Example: PEREGRINE_VUE_URL=http://localhost:8506 -_VUE_URL = os.environ.get("PEREGRINE_VUE_URL", "").strip().rstrip("/") - -# When True, a window.location.reload() after setting prgn_ui=vue will be -# intercepted by Caddy and routed to the Vue SPA. When False (no Caddy in the -# traffic path — e.g. test instances, direct Docker exposure), reloading just -# comes back to Streamlit and creates an infinite loop. Only set this in -# production/staging compose files where Caddy is actually in front. -_CADDY_PROXY = os.environ.get("PEREGRINE_CADDY_PROXY", "").lower() in ("1", "true", "yes") - -_COOKIE_JS = """ - -""" - - -def _set_cookie_js(value: str, navigate: bool = False) -> None: - """Inject JS to set the prgn_ui cookie. - - When PEREGRINE_VUE_URL is set (local dev, no Caddy): navigating to Vue - uses window.parent.location.href to jump directly to the Vue container - port. Without this, reload() just sends the request back to the same - Streamlit port with no router in between to inspect the cookie. 
- - When PEREGRINE_CADDY_PROXY is set (production/staging): navigate=True - triggers window.location.reload() so Caddy sees the updated cookie on - the next HTTP request and routes accordingly. - - When neither is set (test instances, bare Docker): navigate is suppressed - entirely — the cookie is written silently, but no reload is attempted. - Reloading without a proxy just bounces back to Streamlit and loops. - """ - # components.html() renders in an iframe — window.parent navigates the host page - if navigate and value == "vue" and _VUE_URL: - nav_js = f"window.parent.location.href = '{_VUE_URL}';" - elif navigate and _CADDY_PROXY: - nav_js = "window.parent.location.reload();" - else: - nav_js = "" - components.html(_COOKIE_JS.format(value=value, navigate_js=nav_js), height=0) - - -def sync_ui_cookie(yaml_path: Path, tier: str) -> None: - """Sync the prgn_ui cookie to match user.yaml ui_preference. - - Also handles: - - ?prgn_switch= param (Vue SPA switch-back signal): overrides yaml, - writes yaml to match, clears the param. - - Tier downgrade: resets vue preference to streamlit for ineligible users. - - ?ui_fallback=1 param: Vue SPA was down — reinforce streamlit cookie and - return early to avoid immediately navigating back to a broken Vue SPA. - - When the resolved preference is "vue", this function navigates (full page - reload) rather than silently setting the cookie. Without navigate=True, - Streamlit would set prgn_ui=vue mid-page-load; subsequent HTTP requests - made by Streamlit's own frontend (lazy JS chunks, WebSocket upgrade) would - carry the new cookie and Caddy would misroute them to the Vue nginx - container, causing TypeError: error loading dynamically imported module. - """ - # ── ?ui_fallback=1 — Vue SPA was down, Caddy bounced us back ────────────── - # Return early: reinforce the streamlit cookie so we don't immediately - # navigate back to a Vue SPA that may still be down. 
- if st.query_params.get("ui_fallback"): - st.toast("⚠️ New UI temporarily unavailable — switched back to Classic", icon="⚠️") - st.query_params.pop("ui_fallback", None) - _set_cookie_js("streamlit") - return - - # ── ?prgn_switch param — Vue SPA sent us here to switch back ────────────── - switch_param = st.query_params.get("prgn_switch") - if switch_param in ("streamlit", "vue"): - try: - profile = UserProfile(yaml_path) - profile.ui_preference = switch_param - profile.save() - except Exception: - # UI components must not crash the app — silent fallback - pass - st.query_params.pop("prgn_switch", None) - _set_cookie_js(switch_param) - return - - # ── Normal path: read yaml, enforce tier, inject cookie ─────────────────── - profile = None - try: - profile = UserProfile(yaml_path) - pref = profile.ui_preference - except Exception: - # UI components must not crash the app — silent fallback to default - pref = "streamlit" - - # Demo mode: Vue SPA has no demo data wiring — always serve Streamlit. - # (The tier downgrade check below is skipped in demo mode, but we must - # also block the Vue navigation itself so Caddy doesn't route to a blank SPA.) - if pref == "vue" and _DEMO_MODE: - pref = "streamlit" - - # Tier downgrade protection (skip in demo — demo bypasses tier gate) - if pref == "vue" and not _DEMO_MODE and not can_use(tier, "vue_ui_beta"): - if profile is not None: - try: - profile.ui_preference = "streamlit" - profile.save() - except Exception: - # UI components must not crash the app — silent fallback - pass - pref = "streamlit" - - # Navigate (full reload) when switching to Vue so Caddy re-routes on the - # next HTTP request before Streamlit serves any more content. Silent - # cookie-only set is safe for streamlit since we're already on that origin. - _set_cookie_js(pref, navigate=(pref == "vue")) - - -def switch_ui(yaml_path: Path, to: str, tier: str) -> None: - """Write user.yaml, set cookie, and navigate. 
- - to: "vue" | "streamlit" - - Switching to Vue triggers window.location.reload() so Caddy sees the - updated prgn_ui cookie and routes to the Vue SPA. st.rerun() alone is - not sufficient — it operates over WebSocket and produces no HTTP request. - - Switching back to streamlit uses st.rerun() (no full reload needed since - we're already on the Streamlit origin and no Caddy re-routing is required). - """ - if to not in ("vue", "streamlit"): - return - try: - profile = UserProfile(yaml_path) - profile.ui_preference = to - profile.save() - except Exception: - # UI components must not crash the app — silent fallback - pass - if to == "vue": - # navigate=True triggers window.location.reload() after setting cookie - _set_cookie_js("vue", navigate=True) - else: - sync_ui_cookie(yaml_path, tier=tier) - st.rerun() - - -def render_banner(yaml_path: Path, tier: str) -> None: - """Show the 'Try the new UI' banner once per session. - - Dismissed flag stored in user.yaml dismissed_banners list so it - persists across sessions (uses the existing dismissed_banners pattern). - Eligible: paid+ tier, OR demo mode. Not shown if already on vue. 
- """ - eligible = _DEMO_MODE or can_use(tier, "vue_ui_beta") - if not eligible: - return - - try: - profile = UserProfile(yaml_path) - except Exception: - # UI components must not crash the app — silent fallback - return - - if profile.ui_preference == "vue": - return - if "ui_switcher_beta" in (profile.dismissed_banners or []): - return - - col1, col2, col3 = st.columns([8, 1, 1]) - with col1: - st.info("✨ **New Peregrine UI available** — try the modern Vue interface (Beta)") - with col2: - if st.button("Try it", key="_ui_banner_try"): - switch_ui(yaml_path, to="vue", tier=tier) - with col3: - if st.button("Dismiss", key="_ui_banner_dismiss"): - profile.dismissed_banners = list(profile.dismissed_banners or []) + ["ui_switcher_beta"] - profile.save() - st.rerun() - - -def render_sidebar_switcher(yaml_path: Path, tier: str) -> None: - """Persistent sidebar button to switch to the Vue UI. - - Shown when the user is eligible (paid+ or demo) and currently on Streamlit. - This is always visible — unlike the banner which can be dismissed. 
- """ - eligible = _DEMO_MODE or can_use(tier, "vue_ui_beta") - if not eligible: - return - try: - profile = UserProfile(yaml_path) - if profile.ui_preference == "vue": - return - except Exception: - pass - - if st.button("✨ Switch to New UI", key="_sidebar_switch_vue", use_container_width=True): - switch_ui(yaml_path, to="vue", tier=tier) - - -def render_settings_toggle(yaml_path: Path, tier: str) -> None: - """Toggle in Settings → System → Deployment expander.""" - eligible = _DEMO_MODE or can_use(tier, "vue_ui_beta") - if not eligible: - return - - try: - profile = UserProfile(yaml_path) - current = profile.ui_preference - except Exception: - # UI components must not crash the app — silent fallback to default - current = "streamlit" - - options = ["streamlit", "vue"] - labels = ["Classic (Streamlit)", "✨ New UI (Vue, Beta)"] - current_idx = options.index(current) if current in options else 0 - - st.markdown("**UI Version**") - chosen = st.radio( - "UI Version", - options=labels, - index=current_idx, - key="_ui_toggle_radio", - label_visibility="collapsed", - ) - chosen_val = options[labels.index(chosen)] - - if chosen_val != current: - switch_ui(yaml_path, to=chosen_val, tier=tier) diff --git a/app/pages/0_Setup.py b/app/pages/0_Setup.py index 23d6967..3aed1af 100644 --- a/app/pages/0_Setup.py +++ b/app/pages/0_Setup.py @@ -457,11 +457,6 @@ elif step == 5: from app.wizard.step_inference import validate st.subheader("Step 5 \u2014 Inference & API Keys") - st.info( - "**Simplest setup:** set `OLLAMA_HOST` in your `.env` file — " - "Peregrine auto-detects it, no config file needed. " - "Or use the fields below to configure API keys and endpoints." 
- ) profile = saved_yaml.get("inference_profile", "remote") if profile == "remote": @@ -471,18 +466,8 @@ elif step == 5: placeholder="https://api.together.xyz/v1") openai_key = st.text_input("Endpoint API Key (optional)", type="password", key="oai_key") if openai_url else "" - ollama_host = st.text_input("Ollama host (optional \u2014 local fallback)", - placeholder="http://localhost:11434", - key="ollama_host_input") - ollama_model = st.text_input("Ollama model (optional)", - value="llama3.2:3b", - key="ollama_model_input") else: st.info(f"Local mode ({profile}): Ollama provides inference.") - import os - _ollama_host_env = os.environ.get("OLLAMA_HOST", "") - if _ollama_host_env: - st.caption(f"OLLAMA_HOST from .env: `{_ollama_host_env}`") anthropic_key = openai_url = openai_key = "" with st.expander("Advanced \u2014 Service Ports & Hosts"): @@ -561,14 +546,6 @@ elif step == 5: if anthropic_key or openai_url: env_path.write_text("\n".join(env_lines) + "\n") - if profile == "remote": - if ollama_host: - env_lines = _set_env(env_lines, "OLLAMA_HOST", ollama_host) - if ollama_model: - env_lines = _set_env(env_lines, "OLLAMA_MODEL", ollama_model) - if ollama_host or ollama_model: - env_path.write_text("\n".join(env_lines) + "\n") - _save_yaml({"services": svc, "wizard_step": 5}) st.session_state.wizard_step = 6 st.rerun() @@ -654,7 +631,7 @@ elif step == 6: ) default_profile = { "name": "default", - "titles": titles, + "job_titles": titles, "locations": locations, "remote_only": False, "boards": ["linkedin", "indeed", "glassdoor", "zip_recruiter"], diff --git a/app/pages/1_Job_Review.py b/app/pages/1_Job_Review.py index b86b33b..8f2c397 100644 --- a/app/pages/1_Job_Review.py +++ b/app/pages/1_Job_Review.py @@ -12,15 +12,12 @@ from scripts.db import ( DEFAULT_DB, init_db, get_jobs_by_status, update_job_status, update_cover_letter, mark_applied, get_email_leads, ) -from app.cloud_session import resolve_session, get_db_path - -resolve_session("peregrine") st.title("📋 Job 
Review") -init_db(get_db_path()) +init_db(DEFAULT_DB) -_email_leads = get_email_leads(get_db_path()) +_email_leads = get_email_leads(DEFAULT_DB) # ── Sidebar filters ──────────────────────────────────────────────────────────── with st.sidebar: @@ -40,7 +37,7 @@ with st.sidebar: index=0, ) -jobs = get_jobs_by_status(get_db_path(), show_status) +jobs = get_jobs_by_status(DEFAULT_DB, show_status) if remote_only: jobs = [j for j in jobs if j.get("is_remote")] @@ -89,11 +86,11 @@ if show_status == "pending" and _email_leads: with right_l: if st.button("✅ Approve", key=f"el_approve_{lead_id}", type="primary", use_container_width=True): - update_job_status(get_db_path(), [lead_id], "approved") + update_job_status(DEFAULT_DB, [lead_id], "approved") st.rerun() if st.button("❌ Reject", key=f"el_reject_{lead_id}", use_container_width=True): - update_job_status(get_db_path(), [lead_id], "rejected") + update_job_status(DEFAULT_DB, [lead_id], "rejected") st.rerun() st.divider() @@ -165,7 +162,7 @@ for job in jobs: ) save_col, _ = st.columns([2, 5]) if save_col.button("💾 Save draft", key=f"save_cl_{job_id}"): - update_cover_letter(get_db_path(), job_id, st.session_state[_cl_key]) + update_cover_letter(DEFAULT_DB, job_id, st.session_state[_cl_key]) st.success("Saved!") # Applied date + cover letter preview (applied/synced) @@ -185,11 +182,11 @@ for job in jobs: if show_status == "pending": if st.button("✅ Approve", key=f"approve_{job_id}", type="primary", use_container_width=True): - update_job_status(get_db_path(), [job_id], "approved") + update_job_status(DEFAULT_DB, [job_id], "approved") st.rerun() if st.button("❌ Reject", key=f"reject_{job_id}", use_container_width=True): - update_job_status(get_db_path(), [job_id], "rejected") + update_job_status(DEFAULT_DB, [job_id], "rejected") st.rerun() elif show_status == "approved": @@ -201,6 +198,6 @@ for job in jobs: use_container_width=True): cl_text = st.session_state.get(f"cl_{job_id}", "") if cl_text: - 
update_cover_letter(get_db_path(), job_id, cl_text) - mark_applied(get_db_path(), [job_id]) + update_cover_letter(DEFAULT_DB, job_id, cl_text) + mark_applied(DEFAULT_DB, [job_id]) st.rerun() diff --git a/app/pages/2_Settings.py b/app/pages/2_Settings.py index 1101c23..937e336 100644 --- a/app/pages/2_Settings.py +++ b/app/pages/2_Settings.py @@ -323,26 +323,6 @@ with tab_search: _run_suggest = st.button("✨ Suggest", key="sp_suggest_btn", help="Ask the LLM to suggest additional titles and smarter exclude keywords — using your blocklist, mission values, and career background.") - _title_sugg_count = len((st.session_state.get("_sp_suggestions") or {}).get("suggested_titles", [])) - if _title_sugg_count: - st.markdown(f"""""", unsafe_allow_html=True) - st.multiselect( "Job titles", options=st.session_state.get("_sp_title_options", p.get("titles", [])), @@ -350,14 +330,6 @@ with tab_search: help="Select from known titles. Suggestions from ✨ Suggest appear here — pick the ones you want.", label_visibility="collapsed", ) - - if _title_sugg_count: - st.markdown( - f'
' - f' ↑ {_title_sugg_count} new suggestion{"s" if _title_sugg_count != 1 else ""} ' - f'added — open the dropdown to browse
', - unsafe_allow_html=True, - ) _add_t_col, _add_t_btn = st.columns([5, 1]) with _add_t_col: st.text_input("Add a title", key="_sp_new_title", label_visibility="collapsed", @@ -401,32 +373,22 @@ with tab_search: with st.spinner("Asking LLM for suggestions…"): try: suggestions = _suggest_search_terms(_current_titles, RESUME_PATH, _blocklist, _user_profile) - except Exception as _e: - _err_msg = str(_e) - if "exhausted" in _err_msg.lower() or isinstance(_e, RuntimeError): - st.warning( - f"No LLM backend available: {_err_msg}. " - "Check that Ollama is running and has GPU access, or enable a cloud backend in Settings → System → LLM.", - icon="⚠️", - ) - else: - st.error(f"Suggestion failed: {_err_msg}", icon="🚨") + except RuntimeError as _e: + st.warning( + f"No LLM backend available: {_e}. " + "Check that Ollama is running and has GPU access, or enable a cloud backend in Settings → System → LLM.", + icon="⚠️", + ) suggestions = None if suggestions is not None: # Add suggested titles to options list (not auto-selected — user picks from dropdown) _opts = list(st.session_state.get("_sp_title_options", [])) - _new_titles = [_t for _t in suggestions.get("suggested_titles", []) if _t not in _opts] - _opts.extend(_new_titles) + for _t in suggestions.get("suggested_titles", []): + if _t not in _opts: + _opts.append(_t) st.session_state["_sp_title_options"] = _opts st.session_state["_sp_suggestions"] = suggestions - if not _new_titles and not suggestions.get("suggested_excludes"): - _resume_hint = " Upload your resume in Settings → Resume Profile for better results." 
if not RESUME_PATH.exists() else "" - st.info( - f"No new suggestions found — the LLM didn't generate anything new for these titles.{_resume_hint}", - icon="ℹ️", - ) - else: - st.rerun() + st.rerun() if st.session_state.get("_sp_suggestions"): sugg = st.session_state["_sp_suggestions"] @@ -851,13 +813,6 @@ with tab_resume: kw_current: list[str] = kw_data.get(kw_category, []) kw_suggestions = _load_sugg(kw_category) - # If a custom tag was added last render, clear the multiselect's session - # state key NOW (before the widget is created) so Streamlit uses `default` - # instead of the stale session state that lacks the new tag. - _reset_key = f"_kw_reset_{kw_category}" - if st.session_state.pop(_reset_key, False): - st.session_state.pop(f"kw_ms_{kw_category}", None) - # Merge: suggestions first, then any custom tags not in suggestions kw_custom = [t for t in kw_current if t not in kw_suggestions] kw_options = kw_suggestions + kw_custom @@ -878,7 +833,6 @@ with tab_resume: label_visibility="collapsed", placeholder=f"Custom: {kw_placeholder}", ) - _tag_just_added = False if kw_btn_col.button("+", key=f"kw_add_{kw_category}", help="Add custom tag"): cleaned = _filter_tag(kw_raw) if cleaned is None: @@ -886,19 +840,13 @@ with tab_resume: elif cleaned in kw_options: st.info(f"'{cleaned}' is already in the list — select it above.") else: - # Save to YAML and set a reset flag so the multiselect session - # state is cleared before the widget renders on the next rerun, - # allowing `default` (which includes the new tag) to take effect. + # Persist custom tag: add to YAML and session state so it appears in options kw_new_list = kw_selected + [cleaned] - st.session_state[_reset_key] = True kw_data[kw_category] = kw_new_list kw_changed = True - _tag_just_added = True - # Detect multiselect changes. Skip when a tag was just added — the change - # detection would otherwise overwrite kw_data with the old kw_selected - # (which doesn't include the new tag) in the same render. 
- if not _tag_just_added and sorted(kw_selected) != sorted(kw_current): + # Detect multiselect changes + if sorted(kw_selected) != sorted(kw_current): kw_data[kw_category] = kw_selected kw_changed = True @@ -1051,11 +999,6 @@ with tab_system: _env_path.write_text("\n".join(_env_lines) + "\n") st.success("Deployment settings saved. Run `./manage.sh restart` to apply.") - st.divider() - from app.components.ui_switcher import render_settings_toggle as _render_ui_toggle - _ui_tier = _profile.tier if _profile else "free" - _render_ui_toggle(yaml_path=_USER_YAML, tier=_ui_tier) - st.divider() # ── LLM Backends ───────────────────────────────────────────────────────── diff --git a/app/pages/4_Apply.py b/app/pages/4_Apply.py index c51f9ba..1e9a3d1 100644 --- a/app/pages/4_Apply.py +++ b/app/pages/4_Apply.py @@ -15,28 +15,28 @@ import streamlit.components.v1 as components import yaml from scripts.user_profile import UserProfile + +_USER_YAML = Path(__file__).parent.parent.parent / "config" / "user.yaml" +_profile = UserProfile(_USER_YAML) if UserProfile.exists(_USER_YAML) else None +_name = _profile.name if _profile else "Job Seeker" + from scripts.db import ( DEFAULT_DB, init_db, get_jobs_by_status, update_cover_letter, mark_applied, update_job_status, get_task_for_job, ) from scripts.task_runner import submit_task -from app.cloud_session import resolve_session, get_db_path, get_config_dir +from app.cloud_session import resolve_session, get_db_path from app.telemetry import log_usage_event +DOCS_DIR = _profile.docs_dir if _profile else Path.home() / "Documents" / "JobSearch" +RESUME_YAML = Path(__file__).parent.parent.parent / "config" / "plain_text_resume.yaml" + st.title("🚀 Apply Workspace") resolve_session("peregrine") init_db(get_db_path()) -_CONFIG_DIR = get_config_dir() -_USER_YAML = _CONFIG_DIR / "user.yaml" -_profile = UserProfile(_USER_YAML) if UserProfile.exists(_USER_YAML) else None -_name = _profile.name if _profile else "Job Seeker" - -DOCS_DIR = 
_profile.docs_dir if _profile else Path.home() / "Documents" / "JobSearch" -RESUME_YAML = _CONFIG_DIR / "plain_text_resume.yaml" - # ── PDF generation ───────────────────────────────────────────────────────────── def _make_cover_letter_pdf(job: dict, cover_letter: str, output_dir: Path) -> Path: from reportlab.lib.pagesizes import letter diff --git a/app/pages/5_Interviews.py b/app/pages/5_Interviews.py index 09b6c33..99b5162 100644 --- a/app/pages/5_Interviews.py +++ b/app/pages/5_Interviews.py @@ -36,9 +36,6 @@ from scripts.db import ( get_unread_stage_signals, dismiss_stage_signal, ) from scripts.task_runner import submit_task -from app.cloud_session import resolve_session, get_db_path - -resolve_session("peregrine") _CONFIG_DIR = Path(__file__).parent.parent.parent / "config" _CALENDAR_INTEGRATIONS = ("apple_calendar", "google_calendar") @@ -49,23 +46,23 @@ _calendar_connected = any( st.title("🎯 Interviews") -init_db(get_db_path()) +init_db(DEFAULT_DB) # ── Sidebar: Email sync ──────────────────────────────────────────────────────── with st.sidebar: st.markdown("### 📧 Email Sync") - _email_task = get_task_for_job(get_db_path(), "email_sync", 0) + _email_task = get_task_for_job(DEFAULT_DB, "email_sync", 0) _email_running = _email_task and _email_task["status"] in ("queued", "running") if st.button("🔄 Sync Emails", use_container_width=True, type="primary", disabled=bool(_email_running)): - submit_task(get_db_path(), "email_sync", 0) + submit_task(DEFAULT_DB, "email_sync", 0) st.rerun() if _email_running: @st.fragment(run_every=4) def _email_sidebar_status(): - t = get_task_for_job(get_db_path(), "email_sync", 0) + t = get_task_for_job(DEFAULT_DB, "email_sync", 0) if t and t["status"] in ("queued", "running"): st.info("⏳ Syncing…") else: @@ -102,7 +99,7 @@ STAGE_NEXT_LABEL = { } # ── Data ────────────────────────────────────────────────────────────────────── -jobs_by_stage = get_interview_jobs(get_db_path()) +jobs_by_stage = get_interview_jobs(DEFAULT_DB) # ── 
Helpers ─────────────────────────────────────────────────────────────────── def _days_ago(date_str: str | None) -> str: @@ -123,8 +120,8 @@ def _days_ago(date_str: str | None) -> str: def _research_modal(job: dict) -> None: job_id = job["id"] st.caption(f"**{job.get('company')}** — {job.get('title')}") - research = get_research(get_db_path(), job_id=job_id) - task = get_task_for_job(get_db_path(), "company_research", job_id) + research = get_research(DEFAULT_DB, job_id=job_id) + task = get_task_for_job(DEFAULT_DB, "company_research", job_id) running = task and task["status"] in ("queued", "running") if running: @@ -147,7 +144,7 @@ def _research_modal(job: dict) -> None: "inaccuracies. SearXNG is now available — re-run to get verified facts." ) if st.button("🔄 Re-run with live data", key=f"modal_rescrape_{job_id}", type="primary"): - submit_task(get_db_path(), "company_research", job_id) + submit_task(DEFAULT_DB, "company_research", job_id) st.rerun() st.divider() else: @@ -163,14 +160,14 @@ def _research_modal(job: dict) -> None: ) st.markdown(research["raw_output"]) if st.button("🔄 Refresh", key=f"modal_regen_{job_id}", disabled=bool(running)): - submit_task(get_db_path(), "company_research", job_id) + submit_task(DEFAULT_DB, "company_research", job_id) st.rerun() else: st.info("No research brief yet.") if task and task["status"] == "failed": st.error(f"Last attempt failed: {task.get('error', '')}") if st.button("🔬 Generate now", key=f"modal_gen_{job_id}"): - submit_task(get_db_path(), "company_research", job_id) + submit_task(DEFAULT_DB, "company_research", job_id) st.rerun() @@ -178,7 +175,7 @@ def _research_modal(job: dict) -> None: def _email_modal(job: dict) -> None: job_id = job["id"] st.caption(f"**{job.get('company')}** — {job.get('title')}") - contacts = get_contacts(get_db_path(), job_id=job_id) + contacts = get_contacts(DEFAULT_DB, job_id=job_id) if not contacts: st.info("No emails logged yet. 
Use the form below to add one.") @@ -249,7 +246,7 @@ def _email_modal(job: dict) -> None: body_text = st.text_area("Body / notes", height=80, key=f"body_modal_{job_id}") if st.form_submit_button("📧 Save contact"): add_contact( - get_db_path(), job_id=job_id, + DEFAULT_DB, job_id=job_id, direction=direction, subject=subject, from_addr=from_addr, body=body_text, received_at=recv_at, ) @@ -258,7 +255,7 @@ def _email_modal(job: dict) -> None: def _render_card(job: dict, stage: str, compact: bool = False) -> None: """Render a single job card appropriate for the given stage.""" job_id = job["id"] - contacts = get_contacts(get_db_path(), job_id=job_id) + contacts = get_contacts(DEFAULT_DB, job_id=job_id) last_contact = contacts[-1] if contacts else None with st.container(border=True): @@ -281,7 +278,7 @@ def _render_card(job: dict, stage: str, compact: bool = False) -> None: format="YYYY-MM-DD", ) if st.form_submit_button("📅 Save date"): - set_interview_date(get_db_path(), job_id=job_id, date_str=str(new_date)) + set_interview_date(DEFAULT_DB, job_id=job_id, date_str=str(new_date)) st.success("Saved!") st.rerun() @@ -291,7 +288,7 @@ def _render_card(job: dict, stage: str, compact: bool = False) -> None: _cal_label = "🔄 Update Calendar" if _has_event else "📅 Add to Calendar" if st.button(_cal_label, key=f"cal_push_{job_id}", use_container_width=True): from scripts.calendar_push import push_interview_event - result = push_interview_event(get_db_path(), job_id=job_id, config_dir=_CONFIG_DIR) + result = push_interview_event(DEFAULT_DB, job_id=job_id, config_dir=_CONFIG_DIR) if result["ok"]: st.success(f"Event {'updated' if _has_event else 'added'} ({result['provider'].replace('_', ' ').title()})") st.rerun() @@ -300,7 +297,7 @@ def _render_card(job: dict, stage: str, compact: bool = False) -> None: if not compact: if stage in ("applied", "phone_screen", "interviewing"): - signals = get_unread_stage_signals(get_db_path(), job_id=job_id) + signals = 
get_unread_stage_signals(DEFAULT_DB, job_id=job_id) if signals: sig = signals[-1] _SIGNAL_TO_STAGE = { @@ -321,23 +318,23 @@ def _render_card(job: dict, stage: str, compact: bool = False) -> None: if sig["stage_signal"] == "rejected": if b1.button("✗ Reject", key=f"sig_rej_{sig['id']}", use_container_width=True): - reject_at_stage(get_db_path(), job_id=job_id, rejection_stage=stage) - dismiss_stage_signal(get_db_path(), sig["id"]) + reject_at_stage(DEFAULT_DB, job_id=job_id, rejection_stage=stage) + dismiss_stage_signal(DEFAULT_DB, sig["id"]) st.rerun(scope="app") elif target_stage and b1.button( f"→ {target_label}", key=f"sig_adv_{sig['id']}", use_container_width=True, type="primary", ): if target_stage == "phone_screen" and stage == "applied": - advance_to_stage(get_db_path(), job_id=job_id, stage="phone_screen") - submit_task(get_db_path(), "company_research", job_id) + advance_to_stage(DEFAULT_DB, job_id=job_id, stage="phone_screen") + submit_task(DEFAULT_DB, "company_research", job_id) elif target_stage: - advance_to_stage(get_db_path(), job_id=job_id, stage=target_stage) - dismiss_stage_signal(get_db_path(), sig["id"]) + advance_to_stage(DEFAULT_DB, job_id=job_id, stage=target_stage) + dismiss_stage_signal(DEFAULT_DB, sig["id"]) st.rerun(scope="app") if b2.button("Dismiss", key=f"sig_dis_{sig['id']}", use_container_width=True): - dismiss_stage_signal(get_db_path(), sig["id"]) + dismiss_stage_signal(DEFAULT_DB, sig["id"]) st.rerun() # Advance / Reject buttons @@ -349,16 +346,16 @@ def _render_card(job: dict, stage: str, compact: bool = False) -> None: f"→ {next_label}", key=f"adv_{job_id}", use_container_width=True, type="primary", ): - advance_to_stage(get_db_path(), job_id=job_id, stage=next_stage) + advance_to_stage(DEFAULT_DB, job_id=job_id, stage=next_stage) if next_stage == "phone_screen": - submit_task(get_db_path(), "company_research", job_id) + submit_task(DEFAULT_DB, "company_research", job_id) st.rerun(scope="app") # full rerun — card must appear in 
new column if c2.button( "✗ Reject", key=f"rej_{job_id}", use_container_width=True, ): - reject_at_stage(get_db_path(), job_id=job_id, rejection_stage=stage) + reject_at_stage(DEFAULT_DB, job_id=job_id, rejection_stage=stage) st.rerun() # fragment-scope rerun — card disappears without scroll-to-top if job.get("url"): @@ -388,7 +385,7 @@ def _render_card(job: dict, stage: str, compact: bool = False) -> None: @st.fragment def _card_fragment(job_id: int, stage: str) -> None: """Re-fetches the job on each fragment rerun; renders nothing if moved/rejected.""" - job = get_job_by_id(get_db_path(), job_id) + job = get_job_by_id(DEFAULT_DB, job_id) if job is None or job.get("status") != stage: return _render_card(job, stage) @@ -397,11 +394,11 @@ def _card_fragment(job_id: int, stage: str) -> None: @st.fragment def _pre_kanban_row_fragment(job_id: int) -> None: """Pre-kanban compact row for applied and survey-stage jobs.""" - job = get_job_by_id(get_db_path(), job_id) + job = get_job_by_id(DEFAULT_DB, job_id) if job is None or job.get("status") not in ("applied", "survey"): return stage = job["status"] - contacts = get_contacts(get_db_path(), job_id=job_id) + contacts = get_contacts(DEFAULT_DB, job_id=job_id) last_contact = contacts[-1] if contacts else None with st.container(border=True): @@ -417,7 +414,7 @@ def _pre_kanban_row_fragment(job_id: int) -> None: _email_modal(job) # Stage signal hint (email-detected next steps) - signals = get_unread_stage_signals(get_db_path(), job_id=job_id) + signals = get_unread_stage_signals(DEFAULT_DB, job_id=job_id) if signals: sig = signals[-1] _SIGNAL_TO_STAGE = { @@ -440,15 +437,15 @@ def _pre_kanban_row_fragment(job_id: int) -> None: use_container_width=True, type="primary", ): if target_stage == "phone_screen": - advance_to_stage(get_db_path(), job_id=job_id, stage="phone_screen") - submit_task(get_db_path(), "company_research", job_id) + advance_to_stage(DEFAULT_DB, job_id=job_id, stage="phone_screen") + submit_task(DEFAULT_DB, 
"company_research", job_id) else: - advance_to_stage(get_db_path(), job_id=job_id, stage=target_stage) - dismiss_stage_signal(get_db_path(), sig["id"]) + advance_to_stage(DEFAULT_DB, job_id=job_id, stage=target_stage) + dismiss_stage_signal(DEFAULT_DB, sig["id"]) st.rerun(scope="app") if s2.button("Dismiss", key=f"sig_dis_pre_{sig['id']}", use_container_width=True): - dismiss_stage_signal(get_db_path(), sig["id"]) + dismiss_stage_signal(DEFAULT_DB, sig["id"]) st.rerun() with right: @@ -456,24 +453,24 @@ def _pre_kanban_row_fragment(job_id: int) -> None: "→ 📞 Phone Screen", key=f"adv_pre_{job_id}", use_container_width=True, type="primary", ): - advance_to_stage(get_db_path(), job_id=job_id, stage="phone_screen") - submit_task(get_db_path(), "company_research", job_id) + advance_to_stage(DEFAULT_DB, job_id=job_id, stage="phone_screen") + submit_task(DEFAULT_DB, "company_research", job_id) st.rerun(scope="app") col_a, col_b = st.columns(2) if stage == "applied" and col_a.button( "📋 Survey", key=f"to_survey_{job_id}", use_container_width=True, ): - advance_to_stage(get_db_path(), job_id=job_id, stage="survey") + advance_to_stage(DEFAULT_DB, job_id=job_id, stage="survey") st.rerun(scope="app") if col_b.button("✗ Reject", key=f"rej_pre_{job_id}", use_container_width=True): - reject_at_stage(get_db_path(), job_id=job_id, rejection_stage=stage) + reject_at_stage(DEFAULT_DB, job_id=job_id, rejection_stage=stage) st.rerun() @st.fragment def _hired_card_fragment(job_id: int) -> None: """Compact hired job card — shown in the Offer/Hired column.""" - job = get_job_by_id(get_db_path(), job_id) + job = get_job_by_id(DEFAULT_DB, job_id) if job is None or job.get("status") != "hired": return with st.container(border=True): diff --git a/app/pages/6_Interview_Prep.py b/app/pages/6_Interview_Prep.py index 94e79c0..812bdd1 100644 --- a/app/pages/6_Interview_Prep.py +++ b/app/pages/6_Interview_Prep.py @@ -25,14 +25,11 @@ from scripts.db import ( get_task_for_job, ) from 
scripts.task_runner import submit_task -from app.cloud_session import resolve_session, get_db_path -resolve_session("peregrine") - -init_db(get_db_path()) +init_db(DEFAULT_DB) # ── Job selection ───────────────────────────────────────────────────────────── -jobs_by_stage = get_interview_jobs(get_db_path()) +jobs_by_stage = get_interview_jobs(DEFAULT_DB) active_stages = ["phone_screen", "interviewing", "offer"] active_jobs = [ j for stage in active_stages @@ -103,10 +100,10 @@ col_prep, col_context = st.columns([2, 3]) # ════════════════════════════════════════════════ with col_prep: - research = get_research(get_db_path(), job_id=selected_id) + research = get_research(DEFAULT_DB, job_id=selected_id) # Refresh / generate research - _res_task = get_task_for_job(get_db_path(), "company_research", selected_id) + _res_task = get_task_for_job(DEFAULT_DB, "company_research", selected_id) _res_running = _res_task and _res_task["status"] in ("queued", "running") if not research: @@ -115,13 +112,13 @@ with col_prep: if _res_task and _res_task["status"] == "failed": st.error(f"Last attempt failed: {_res_task.get('error', '')}") if st.button("🔬 Generate research brief", type="primary", use_container_width=True): - submit_task(get_db_path(), "company_research", selected_id) + submit_task(DEFAULT_DB, "company_research", selected_id) st.rerun() if _res_running: @st.fragment(run_every=3) def _res_status_initial(): - t = get_task_for_job(get_db_path(), "company_research", selected_id) + t = get_task_for_job(DEFAULT_DB, "company_research", selected_id) if t and t["status"] in ("queued", "running"): stage = t.get("stage") or "" lbl = "Queued…" if t["status"] == "queued" else (stage or "Generating… this may take 30–60 seconds") @@ -136,13 +133,13 @@ with col_prep: col_ts, col_btn = st.columns([3, 1]) col_ts.caption(f"Research generated: {generated_at}") if col_btn.button("🔄 Refresh", use_container_width=True, disabled=bool(_res_running)): - submit_task(get_db_path(), 
"company_research", selected_id) + submit_task(DEFAULT_DB, "company_research", selected_id) st.rerun() if _res_running: @st.fragment(run_every=3) def _res_status_refresh(): - t = get_task_for_job(get_db_path(), "company_research", selected_id) + t = get_task_for_job(DEFAULT_DB, "company_research", selected_id) if t and t["status"] in ("queued", "running"): stage = t.get("stage") or "" lbl = "Queued…" if t["status"] == "queued" else (stage or "Refreshing research…") @@ -314,7 +311,7 @@ with col_context: st.markdown(job.get("description") or "_No description saved for this listing._") with tab_emails: - contacts = get_contacts(get_db_path(), job_id=selected_id) + contacts = get_contacts(DEFAULT_DB, job_id=selected_id) if not contacts: st.info("No contacts logged yet. Use the Interviews page to log emails.") else: diff --git a/app/pages/7_Survey.py b/app/pages/7_Survey.py index ed986ba..d5f00ed 100644 --- a/app/pages/7_Survey.py +++ b/app/pages/7_Survey.py @@ -22,13 +22,10 @@ from scripts.db import ( insert_survey_response, get_survey_responses, ) from scripts.llm_router import LLMRouter -from app.cloud_session import resolve_session, get_db_path - -resolve_session("peregrine") st.title("📋 Survey Assistant") -init_db(get_db_path()) +init_db(DEFAULT_DB) # ── Vision service health check ──────────────────────────────────────────────── @@ -43,7 +40,7 @@ def _vision_available() -> bool: vision_up = _vision_available() # ── Job selector ─────────────────────────────────────────────────────────────── -jobs_by_stage = get_interview_jobs(get_db_path()) +jobs_by_stage = get_interview_jobs(DEFAULT_DB) survey_jobs = jobs_by_stage.get("survey", []) other_jobs = ( jobs_by_stage.get("applied", []) + @@ -64,7 +61,7 @@ selected_job_id = st.selectbox( format_func=lambda jid: job_labels[jid], index=0, ) -selected_job = get_job_by_id(get_db_path(), selected_job_id) +selected_job = get_job_by_id(DEFAULT_DB, selected_job_id) # ── LLM prompt builders 
──────────────────────────────────────────────────────── _SURVEY_SYSTEM = ( @@ -239,7 +236,7 @@ with right_col: image_path = str(img_file) insert_survey_response( - get_db_path(), + DEFAULT_DB, job_id=selected_job_id, survey_name=survey_name, source=source, @@ -259,7 +256,7 @@ with right_col: # ── History ──────────────────────────────────────────────────────────────────── st.divider() st.subheader("📂 Response History") -history = get_survey_responses(get_db_path(), job_id=selected_job_id) +history = get_survey_responses(DEFAULT_DB, job_id=selected_job_id) if not history: st.caption("No saved responses for this job yet.") diff --git a/app/wizard/tiers.py b/app/wizard/tiers.py index 2b04ab9..9679843 100644 --- a/app/wizard/tiers.py +++ b/app/wizard/tiers.py @@ -1,7 +1,7 @@ """ Tier definitions and feature gates for Peregrine. -Tiers: free < paid < premium < ultra (ultra reserved; no Peregrine features use it yet) +Tiers: free < paid < premium FEATURES maps feature key → minimum tier required. Features not in FEATURES are available to all tiers (free). @@ -22,14 +22,9 @@ Features that stay gated even with BYOK: """ from __future__ import annotations -import os as _os from pathlib import Path -from circuitforge_core.tiers import ( - can_use as _core_can_use, - TIERS, - tier_label as _core_tier_label, -) +TIERS = ["free", "paid", "premium"] # Maps feature key → minimum tier string required. # Features absent from this dict are free (available to all). @@ -63,9 +58,6 @@ FEATURES: dict[str, str] = { "google_calendar_sync": "paid", "apple_calendar_sync": "paid", "slack_notifications": "paid", - - # Beta UI access — open to all tiers (access management, not compute) - "vue_ui_beta": "free", } # Features that unlock when the user supplies any LLM backend (local or BYOK). @@ -83,13 +75,6 @@ BYOK_UNLOCKABLE: frozenset[str] = frozenset({ "survey_assistant", }) -# Demo mode flag — read from environment at module load time. 
-# Allows demo toolbar to override tier without accessing st.session_state (thread-safe). -# _DEMO_MODE is immutable after import for the process lifetime. -# DEMO_MODE must be set in the environment before the process starts (e.g., via -# Docker Compose environment:). Runtime toggling is not supported. -_DEMO_MODE = _os.environ.get("DEMO_MODE", "").lower() in ("1", "true", "yes") - # Free integrations (not in FEATURES): # google_drive_sync, dropbox_sync, onedrive_sync, mega_sync, # nextcloud_sync, discord_notifications, home_assistant @@ -116,40 +101,34 @@ def has_configured_llm(config_path: Path | None = None) -> bool: return False -def can_use( - tier: str, - feature: str, - has_byok: bool = False, - *, - demo_tier: str | None = None, -) -> bool: +def can_use(tier: str, feature: str, has_byok: bool = False) -> bool: """Return True if the given tier has access to the feature. has_byok: pass has_configured_llm() to unlock BYOK_UNLOCKABLE features for users who supply their own LLM backend regardless of tier. - demo_tier: when set AND _DEMO_MODE is True, substitutes for `tier`. - Read from st.session_state by the *caller*, not here — keeps - this function thread-safe for background tasks and tests. - Returns True for unknown features (not gated). Returns False for unknown/invalid tier strings. 
""" - effective_tier = demo_tier if (demo_tier is not None and _DEMO_MODE) else tier - # Pass Peregrine's BYOK_UNLOCKABLE via has_byok collapse — core's frozenset is empty + required = FEATURES.get(feature) + if required is None: + return True # not gated — available to all if has_byok and feature in BYOK_UNLOCKABLE: return True - return _core_can_use(feature, effective_tier, _features=FEATURES) + try: + return TIERS.index(tier) >= TIERS.index(required) + except ValueError: + return False # invalid tier string def tier_label(feature: str, has_byok: bool = False) -> str: """Return a display label for a locked feature, or '' if free/unlocked.""" if has_byok and feature in BYOK_UNLOCKABLE: return "" - raw = _core_tier_label(feature, _features=FEATURES) - if not raw or raw == "free": + required = FEATURES.get(feature) + if required is None: return "" - return "🔒 Paid" if raw == "paid" else "⭐ Premium" + return "🔒 Paid" if required == "paid" else "⭐ Premium" def effective_tier( diff --git a/compose.cloud.yml b/compose.cloud.yml index ea3c23d..180b168 100644 --- a/compose.cloud.yml +++ b/compose.cloud.yml @@ -13,15 +13,12 @@ services: app: - build: - context: .. - dockerfile: peregrine/Dockerfile.cfcore + build: . container_name: peregrine-cloud ports: - "8505:8501" volumes: - /devl/menagerie-data:/devl/menagerie-data # per-user data trees - - ./config/llm.cloud.yaml:/app/config/llm.yaml:ro # cloud-safe backends only (no claude_code/copilot/anthropic) environment: - CLOUD_MODE=true - CLOUD_DATA_ROOT=/devl/menagerie-data @@ -34,10 +31,7 @@ services: - DOCS_DIR=/tmp/cloud-docs - STREAMLIT_SERVER_BASE_URL_PATH=peregrine - PYTHONUNBUFFERED=1 - - PEREGRINE_CADDY_PROXY=1 - - CF_ORCH_URL=http://host.docker.internal:7700 - DEMO_MODE=false - - FORGEJO_API_TOKEN=${FORGEJO_API_TOKEN:-} depends_on: searxng: condition: service_healthy @@ -45,42 +39,6 @@ services: - "host.docker.internal:host-gateway" restart: unless-stopped - api: - build: - context: .. 
- dockerfile: peregrine/Dockerfile.cfcore - command: > - bash -c "uvicorn dev_api:app --host 0.0.0.0 --port 8601" - volumes: - - /devl/menagerie-data:/devl/menagerie-data - - ./config/llm.cloud.yaml:/app/config/llm.yaml:ro - environment: - - CLOUD_MODE=true - - CLOUD_DATA_ROOT=/devl/menagerie-data - - STAGING_DB=/devl/menagerie-data/cloud-default.db - - DIRECTUS_JWT_SECRET=${DIRECTUS_JWT_SECRET} - - CF_SERVER_SECRET=${CF_SERVER_SECRET} - - PLATFORM_DB_URL=${PLATFORM_DB_URL} - - HEIMDALL_URL=${HEIMDALL_URL:-http://cf-license:8000} - - HEIMDALL_ADMIN_TOKEN=${HEIMDALL_ADMIN_TOKEN} - - PYTHONUNBUFFERED=1 - - FORGEJO_API_TOKEN=${FORGEJO_API_TOKEN:-} - extra_hosts: - - "host.docker.internal:host-gateway" - restart: unless-stopped - - web: - build: - context: . - dockerfile: docker/web/Dockerfile - args: - VITE_BASE_PATH: /peregrine/ - ports: - - "8508:80" - depends_on: - - api - restart: unless-stopped - searxng: image: searxng/searxng:latest volumes: diff --git a/compose.demo.yml b/compose.demo.yml index c6296c3..3678321 100644 --- a/compose.demo.yml +++ b/compose.demo.yml @@ -38,16 +38,6 @@ services: - "host.docker.internal:host-gateway" restart: unless-stopped - web: - build: - context: . 
- dockerfile: docker/web/Dockerfile - args: - VITE_BASE_PATH: /peregrine/ - ports: - - "8507:80" - restart: unless-stopped - searxng: image: searxng/searxng:latest volumes: diff --git a/compose.test-cfcore.yml b/compose.test-cfcore.yml deleted file mode 100644 index eea3d34..0000000 --- a/compose.test-cfcore.yml +++ /dev/null @@ -1,35 +0,0 @@ -# compose.test-cfcore.yml — single-user test instance for circuitforge-core integration -# -# Run from the PARENT directory of peregrine/ (the build context must include -# both peregrine/ and circuitforge-core/ as siblings): -# -# cd /devl (or /Library/Development/CircuitForge on dev) -# docker compose -f peregrine/compose.test-cfcore.yml --project-name peregrine-test up -d -# docker compose -f peregrine/compose.test-cfcore.yml --project-name peregrine-test logs -f -# docker compose -f peregrine/compose.test-cfcore.yml --project-name peregrine-test down -# -# UI: http://localhost:8516 -# Purpose: smoke-test circuitforge-core shims (db, llm_router, tiers, task_scheduler) -# before promoting cfcore integration to the production cloud instance. - -services: - app: - build: - context: .. 
- dockerfile: peregrine/Dockerfile.cfcore - container_name: peregrine-test-cfcore - ports: - - "8516:8501" - volumes: - - /devl/job-seeker:/devl/job-seeker - - /devl/job-seeker/config:/app/config - - /devl/job-seeker/config/llm.docker.yaml:/app/config/llm.yaml:ro - - /devl/job-seeker/config/user.docker.yaml:/app/config/user.yaml:ro - environment: - - STAGING_DB=/devl/job-seeker/staging.db - - PYTHONUNBUFFERED=1 - - STREAMLIT_SERVER_BASE_URL_PATH= - - CF_ORCH_URL=http://host.docker.internal:7700 - extra_hosts: - - "host.docker.internal:host-gateway" - restart: "no" diff --git a/compose.yml b/compose.yml index cc82471..186dd97 100644 --- a/compose.yml +++ b/compose.yml @@ -1,11 +1,9 @@ # compose.yml — Peregrine by Circuit Forge LLC -# Profiles: remote | cpu | single-gpu | dual-gpu-ollama +# Profiles: remote | cpu | single-gpu | dual-gpu-ollama | dual-gpu-vllm | dual-gpu-mixed services: app: - build: - context: .. - dockerfile: peregrine/Dockerfile.cfcore + build: . command: > bash -c "streamlit run app/app.py --server.port=8501 @@ -35,7 +33,6 @@ services: - FORGEJO_API_URL=${FORGEJO_API_URL:-} - PYTHONUNBUFFERED=1 - PYTHONLOGGING=WARNING - - PEREGRINE_CADDY_PROXY=1 depends_on: searxng: condition: service_healthy @@ -43,39 +40,6 @@ services: - "host.docker.internal:host-gateway" restart: unless-stopped - api: - build: - context: .. 
- dockerfile: peregrine/Dockerfile.cfcore - command: > - bash -c "uvicorn dev_api:app --host 0.0.0.0 --port 8601" - volumes: - - ./config:/app/config - - ./data:/app/data - - ${DOCS_DIR:-~/Documents/JobSearch}:/docs - environment: - - STAGING_DB=/app/data/staging.db - - DOCS_DIR=/docs - - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY:-} - - OPENAI_COMPAT_URL=${OPENAI_COMPAT_URL:-} - - OPENAI_COMPAT_KEY=${OPENAI_COMPAT_KEY:-} - - PEREGRINE_GPU_COUNT=${PEREGRINE_GPU_COUNT:-0} - - PEREGRINE_GPU_NAMES=${PEREGRINE_GPU_NAMES:-} - - PYTHONUNBUFFERED=1 - extra_hosts: - - "host.docker.internal:host-gateway" - restart: unless-stopped - - web: - build: - context: . - dockerfile: docker/web/Dockerfile - ports: - - "${VUE_PORT:-8506}:80" - depends_on: - - api - restart: unless-stopped - searxng: image: searxng/searxng:latest ports: @@ -129,6 +93,23 @@ services: profiles: [single-gpu, dual-gpu-ollama, dual-gpu-vllm, dual-gpu-mixed] restart: unless-stopped + vllm: + image: vllm/vllm-openai:latest + ports: + - "${VLLM_PORT:-8000}:8000" + volumes: + - ${VLLM_MODELS_DIR:-~/models/vllm}:/models + command: > + --model /models/${VLLM_MODEL:-Ouro-1.4B} + --trust-remote-code + --max-model-len 4096 + --gpu-memory-utilization 0.75 + --enforce-eager + --max-num-seqs 8 + --cpu-offload-gb ${CPU_OFFLOAD_GB:-0} + profiles: [dual-gpu-vllm, dual-gpu-mixed] + restart: unless-stopped + finetune: build: context: . 
diff --git a/config/llm.cloud.yaml b/config/llm.cloud.yaml deleted file mode 100644 index 62af14f..0000000 --- a/config/llm.cloud.yaml +++ /dev/null @@ -1,62 +0,0 @@ -backends: - anthropic: - api_key_env: ANTHROPIC_API_KEY - enabled: false - model: claude-sonnet-4-6 - supports_images: true - type: anthropic - claude_code: - api_key: any - base_url: http://localhost:3009/v1 - enabled: false - model: claude-code-terminal - supports_images: true - type: openai_compat - github_copilot: - api_key: any - base_url: http://localhost:3010/v1 - enabled: false - model: gpt-4o - supports_images: false - type: openai_compat - ollama: - api_key: ollama - base_url: http://host.docker.internal:11434/v1 - enabled: true - model: llama3.1:8b # generic — no personal fine-tunes in cloud - supports_images: false - type: openai_compat - ollama_research: - api_key: ollama - base_url: http://host.docker.internal:11434/v1 - enabled: true - model: llama3.1:8b - supports_images: false - type: openai_compat - vision_service: - base_url: http://host.docker.internal:8002 - enabled: true - supports_images: true - type: vision_service - vllm: - api_key: '' - base_url: http://host.docker.internal:8000/v1 - enabled: true - model: __auto__ - supports_images: false - type: openai_compat - vllm_research: - api_key: '' - base_url: http://host.docker.internal:8000/v1 - enabled: true - model: __auto__ - supports_images: false - type: openai_compat -fallback_order: -- vllm -- ollama -research_fallback_order: -- vllm_research -- ollama_research -vision_fallback_order: -- vision_service diff --git a/config/llm.yaml b/config/llm.yaml index 485b6a2..0f08746 100644 --- a/config/llm.yaml +++ b/config/llm.yaml @@ -28,9 +28,9 @@ backends: type: openai_compat ollama_research: api_key: ollama - base_url: http://ollama_research:11434/v1 + base_url: http://host.docker.internal:11434/v1 enabled: true - model: llama3.1:8b + model: llama3.2:3b supports_images: false type: openai_compat vision_service: @@ -45,11 +45,6 @@ 
backends: model: __auto__ supports_images: false type: openai_compat - cf_orch: - service: vllm - model_candidates: - - Qwen2.5-3B-Instruct - ttl_s: 300 vllm_research: api_key: '' base_url: http://host.docker.internal:8000/v1 diff --git a/config/user.yaml.example b/config/user.yaml.example index a2dbe94..b17c083 100644 --- a/config/user.yaml.example +++ b/config/user.yaml.example @@ -43,7 +43,6 @@ dev_tier_override: null # overrides tier locally (for testing only) wizard_complete: false wizard_step: 0 dismissed_banners: [] -ui_preference: streamlit # UI preference — "streamlit" (default) or "vue" (Beta: Paid tier) docs_dir: "~/Documents/JobSearch" ollama_models_dir: "~/models/ollama" diff --git a/demo/config/user.yaml b/demo/config/user.yaml index 3dd8c02..a4f1ec2 100644 --- a/demo/config/user.yaml +++ b/demo/config/user.yaml @@ -22,7 +22,7 @@ mission_preferences: social_impact: Want my work to reach people who need it most. name: Demo User nda_companies: [] -ollama_models_dir: /root/models/ollama +ollama_models_dir: ~/models/ollama phone: '' services: ollama_host: localhost @@ -39,7 +39,6 @@ services: vllm_ssl: false vllm_ssl_verify: true tier: free -ui_preference: streamlit -vllm_models_dir: /root/models/vllm +vllm_models_dir: ~/models/vllm wizard_complete: true wizard_step: 0 diff --git a/dev-api.py b/dev-api.py index 08f647f..0edfc4f 100644 --- a/dev-api.py +++ b/dev-api.py @@ -15,7 +15,6 @@ import ssl as ssl_mod import subprocess import sys import threading -from contextvars import ContextVar from datetime import datetime from pathlib import Path from typing import Optional, List @@ -24,7 +23,7 @@ from urllib.parse import urlparse import requests import yaml from bs4 import BeautifulSoup -from fastapi import FastAPI, HTTPException, Request, Response, UploadFile +from fastapi import FastAPI, HTTPException, Response, UploadFile from fastapi.middleware.cors import CORSMiddleware from pydantic import BaseModel @@ -33,18 +32,10 @@ PEREGRINE_ROOT = 
Path("/Library/Development/CircuitForge/peregrine") if str(PEREGRINE_ROOT) not in sys.path: sys.path.insert(0, str(PEREGRINE_ROOT)) -from circuitforge_core.config.settings import load_env as _load_env # noqa: E402 from scripts.credential_store import get_credential, set_credential, delete_credential # noqa: E402 DB_PATH = os.environ.get("STAGING_DB", "/devl/job-seeker/staging.db") -_CLOUD_MODE = os.environ.get("CLOUD_MODE", "").lower() in ("1", "true") -_CLOUD_DATA_ROOT = Path(os.environ.get("CLOUD_DATA_ROOT", "/devl/menagerie-data")) -_DIRECTUS_SECRET = os.environ.get("DIRECTUS_JWT_SECRET", "") - -# Per-request DB path — set by cloud_session_middleware; falls back to DB_PATH -_request_db: ContextVar[str | None] = ContextVar("_request_db", default=None) - app = FastAPI(title="Peregrine Dev API") app.add_middleware( @@ -55,65 +46,8 @@ app.add_middleware( ) -_log = logging.getLogger("peregrine.session") - -def _resolve_cf_user_id(cookie_str: str) -> str | None: - """Extract cf_session JWT from Cookie string and return Directus user_id. - - Directus signs with the raw bytes of its JWT_SECRET (which is base64-encoded - in env). Try the raw string first, then fall back to base64-decoded bytes. - """ - if not cookie_str: - _log.debug("_resolve_cf_user_id: empty cookie string") - return None - m = re.search(r'(?:^|;)\s*cf_session=([^;]+)', cookie_str) - if not m: - _log.debug("_resolve_cf_user_id: no cf_session in cookie: %s…", cookie_str[:80]) - return None - token = m.group(1).strip() - import base64 - import jwt # PyJWT - secrets_to_try: list[str | bytes] = [_DIRECTUS_SECRET] - try: - secrets_to_try.append(base64.b64decode(_DIRECTUS_SECRET)) - except Exception: - pass - # Skip exp verification — we use the token for routing only, not auth. - # Directus manages actual auth; Caddy gates on cookie presence. 
- decode_opts = {"verify_exp": False} - for secret in secrets_to_try: - try: - payload = jwt.decode(token, secret, algorithms=["HS256"], options=decode_opts) - user_id = payload.get("id") or payload.get("sub") - if user_id: - _log.debug("_resolve_cf_user_id: resolved user_id=%s", user_id) - return user_id - except Exception as exc: - _log.debug("_resolve_cf_user_id: decode failed (%s): %s", type(exc).__name__, exc) - continue - _log.warning("_resolve_cf_user_id: all secrets failed for token prefix %s…", token[:20]) - return None - - -@app.middleware("http") -async def cloud_session_middleware(request: Request, call_next): - """In cloud mode, resolve per-user staging.db from the X-CF-Session header.""" - if _CLOUD_MODE and _DIRECTUS_SECRET: - cookie_header = request.headers.get("X-CF-Session", "") - user_id = _resolve_cf_user_id(cookie_header) - if user_id: - user_db = str(_CLOUD_DATA_ROOT / user_id / "peregrine" / "staging.db") - token = _request_db.set(user_db) - try: - return await call_next(request) - finally: - _request_db.reset(token) - return await call_next(request) - - def _get_db(): - path = _request_db.get() or DB_PATH - db = sqlite3.connect(path) + db = sqlite3.connect(DB_PATH) db.row_factory = sqlite3.Row return db @@ -132,12 +66,20 @@ def _strip_html(text: str | None) -> str | None: @app.on_event("startup") def _startup(): - """Load .env then run pending SQLite migrations.""" - # Load .env before any runtime env reads — safe because startup doesn't run - # when dev_api is imported by tests (only when uvicorn actually starts). 
-    _load_env(PEREGRINE_ROOT / ".env")
-    from scripts.db_migrate import migrate_db
-    migrate_db(Path(DB_PATH))
+    """Ensure digest_queue table exists (dev-api may run against an existing DB)."""
+    db = _get_db()
+    try:
+        db.execute("""
+            CREATE TABLE IF NOT EXISTS digest_queue (
+                id INTEGER PRIMARY KEY,
+                job_contact_id INTEGER NOT NULL REFERENCES job_contacts(id),
+                created_at TEXT DEFAULT (datetime('now')),
+                UNIQUE(job_contact_id)
+            )
+        """)
+        db.commit()
+    finally:
+        db.close()
 
 
 # ── Link extraction helpers ───────────────────────────────────────────────
@@ -422,67 +364,6 @@ def research_task_status(job_id: int):
     return {"status": row["status"], "stage": row["stage"], "message": row["error"]}
 
-# ── ATS Resume Optimizer endpoints ───────────────────────────────────────────
-
-@app.get("/api/jobs/{job_id}/resume_optimizer")
-def get_optimized_resume(job_id: int):
-    """Return the current optimized resume and ATS gap report for a job."""
-    from scripts.db import get_optimized_resume as _get
-    import json
-    result = _get(db_path=Path(DB_PATH), job_id=job_id)
-    gap_report = result.get("ats_gap_report", "")
-    try:
-        gap_report_parsed = json.loads(gap_report) if gap_report else []
-    except Exception:
-        gap_report_parsed = []
-    return {
-        "optimized_resume": result.get("optimized_resume", ""),
-        "ats_gap_report": gap_report_parsed,
-    }
-
-
-class ResumeOptimizeBody(BaseModel):
-    full_rewrite: bool = False
-
-
-@app.post("/api/jobs/{job_id}/resume_optimizer/generate")
-def generate_optimized_resume(job_id: int, body: ResumeOptimizeBody):
-    """Queue an ATS resume optimization task for this job.
-
-    full_rewrite=False (default) → free tier: gap report only, no LLM rewrite.
-    full_rewrite=True → paid tier: per-section LLM rewrite + hallucination check.
-    """
-    import json
-    try:
-        from scripts.task_runner import submit_task
-        params = json.dumps({"full_rewrite": body.full_rewrite})
-        task_id, is_new = submit_task(
-            db_path=Path(DB_PATH),
-            task_type="resume_optimize",
-            job_id=job_id,
-            params=params,
-        )
-        return {"task_id": task_id, "is_new": is_new}
-    except Exception as e:
-        raise HTTPException(500, str(e))
-
-
-@app.get("/api/jobs/{job_id}/resume_optimizer/task")
-def resume_optimizer_task_status(job_id: int):
-    """Poll the latest resume_optimize task status for this job."""
-    db = _get_db()
-    row = db.execute(
-        "SELECT status, stage, error FROM background_tasks "
-        "WHERE task_type = 'resume_optimize' AND job_id = ? "
-        "ORDER BY id DESC LIMIT 1",
-        (job_id,),
-    ).fetchone()
-    db.close()
-    if not row:
-        return {"status": "none", "stage": None, "message": None}
-    return {"status": row["status"], "stage": row["stage"], "message": row["error"]}
-
-
 @app.get("/api/jobs/{job_id}/contacts")
 def get_job_contacts(job_id: int):
     db = _get_db()
@@ -678,117 +559,6 @@ def download_pdf(job_id: int):
     raise HTTPException(501, "reportlab not installed — install it to generate PDFs")
 
-# ── Application Q&A endpoints ─────────────────────────────────────────────────
-
-def _ensure_qa_column(db) -> None:
-    """Add application_qa TEXT column to jobs if not present (idempotent)."""
-    try:
-        db.execute("ALTER TABLE jobs ADD COLUMN application_qa TEXT")
-        db.commit()
-    except Exception:
-        pass  # Column already exists
-
-
-class QAItem(BaseModel):
-    id: str
-    question: str
-    answer: str
-
-
-class QAPayload(BaseModel):
-    items: List[QAItem]
-
-
-class QASuggestPayload(BaseModel):
-    question: str
-
-
-@app.get("/api/jobs/{job_id}/qa")
-def get_qa(job_id: int):
-    db = _get_db()
-    _ensure_qa_column(db)
-    row = db.execute("SELECT application_qa FROM jobs WHERE id = ?", (job_id,)).fetchone()
-    db.close()
-    if not row:
-        raise HTTPException(404, "Job not found")
-    try:
-        items = json.loads(row["application_qa"] or "[]")
-    except Exception:
-        items = []
-    return {"items": items}
-
-
-@app.patch("/api/jobs/{job_id}/qa")
-def save_qa(job_id: int, payload: QAPayload):
-    db = _get_db()
-    _ensure_qa_column(db)
-    row = db.execute("SELECT id FROM jobs WHERE id = ?", (job_id,)).fetchone()
-    if not row:
-        db.close()
-        raise HTTPException(404, "Job not found")
-    db.execute(
-        "UPDATE jobs SET application_qa = ? WHERE id = ?",
-        (json.dumps([item.model_dump() for item in payload.items]), job_id),
-    )
-    db.commit()
-    db.close()
-    return {"ok": True}
-
-
-@app.post("/api/jobs/{job_id}/qa/suggest")
-def suggest_qa_answer(job_id: int, payload: QASuggestPayload):
-    """Synchronously generate an LLM answer for an application Q&A question."""
-    db = _get_db()
-    job_row = db.execute(
-        "SELECT title, company, description FROM jobs WHERE id = ?", (job_id,)
-    ).fetchone()
-    db.close()
-    if not job_row:
-        raise HTTPException(404, "Job not found")
-
-    # Load resume summary for context
-    resume_context = ""
-    try:
-        resume_path = _resume_path()
-        if resume_path.exists():
-            with open(resume_path) as f:
-                resume_data = yaml.safe_load(f) or {}
-            parts = []
-            if resume_data.get("name"):
-                parts.append(f"Candidate: {resume_data['name']}")
-            if resume_data.get("skills"):
-                parts.append(f"Skills: {', '.join(resume_data['skills'][:20])}")
-            if resume_data.get("experience"):
-                exp = resume_data["experience"]
-                if isinstance(exp, list) and exp:
-                    titles = [e.get("title", "") for e in exp[:3] if e.get("title")]
-                    if titles:
-                        parts.append(f"Recent roles: {', '.join(titles)}")
-            if resume_data.get("career_summary"):
-                parts.append(f"Summary: {resume_data['career_summary'][:400]}")
-            resume_context = "\n".join(parts)
-    except Exception:
-        pass
-
-    prompt = (
-        f"You are helping a job applicant answer an application question.\n\n"
-        f"Job: {job_row['title']} at {job_row['company']}\n"
-        f"Job description excerpt:\n{(job_row['description'] or '')[:800]}\n\n"
-        f"Candidate background:\n{resume_context or 'Not provided'}\n\n"
-        f"Application question: {payload.question}\n\n"
-        "Write a concise, professional answer (2–4 sentences) in first person. "
-        "Be specific and genuine. Do not use hollow filler phrases."
-    )
-
-    try:
-        from scripts.llm_router import LLMRouter
-        router = LLMRouter()
-        answer = router.complete(prompt)
-        return {"answer": answer.strip()}
-    except Exception as e:
-        raise HTTPException(500, f"LLM generation failed: {e}")
-
-
 # ── GET /api/interviews ────────────────────────────────────────────────────────
 
 PIPELINE_STATUSES = {
@@ -884,230 +654,6 @@ def email_sync_status():
     }
 
-# ── Task management routes ─────────────────────────────────────────────────────
-
-def _db_path() -> Path:
-    """Return the effective staging.db path (cloud-aware)."""
-    return Path(_request_db.get() or DB_PATH)
-
-
-@app.get("/api/tasks")
-def list_active_tasks():
-    from scripts.db import get_active_tasks
-    return get_active_tasks(_db_path())
-
-
-@app.delete("/api/tasks/{task_id}")
-def cancel_task_by_id(task_id: int):
-    from scripts.db import cancel_task
-    ok = cancel_task(_db_path(), task_id)
-    return {"ok": ok}
-
-
-@app.post("/api/tasks/kill")
-def kill_stuck():
-    from scripts.db import kill_stuck_tasks
-    killed = kill_stuck_tasks(_db_path())
-    return {"killed": killed}
-
-
-@app.post("/api/tasks/discovery", status_code=202)
-def trigger_discovery():
-    from scripts.task_runner import submit_task
-    task_id, is_new = submit_task(_db_path(), "discovery", 0)
-    return {"task_id": task_id, "is_new": is_new}
-
-
-@app.post("/api/tasks/email-sync", status_code=202)
-def trigger_email_sync_task():
-    from scripts.task_runner import submit_task
-    task_id, is_new = submit_task(_db_path(), "email_sync", 0)
-    return {"task_id": task_id, "is_new": is_new}
-
-
-@app.post("/api/tasks/enrich", status_code=202)
-def trigger_enrich_task():
-    from scripts.task_runner import submit_task
-    task_id, is_new = submit_task(_db_path(), "enrich_descriptions", 0)
-    return {"task_id": task_id, "is_new": is_new}
-
-
-@app.post("/api/tasks/score")
-def trigger_score():
-    try:
-        result = subprocess.run(
-            [sys.executable, "scripts/match.py"],
-            capture_output=True, text=True, cwd=str(PEREGRINE_ROOT),
-        )
-        if result.returncode == 0:
-            return {"ok": True, "output": result.stdout}
-        raise HTTPException(status_code=500, detail=result.stderr)
-    except HTTPException:
-        raise
-    except Exception as e:
-        raise HTTPException(status_code=500, detail=str(e))
-
-
-@app.post("/api/tasks/sync")
-def trigger_notion_sync():
-    try:
-        from scripts.sync import sync_to_notion
-        count = sync_to_notion(_db_path())
-        return {"ok": True, "count": count}
-    except Exception as e:
-        raise HTTPException(status_code=500, detail=str(e))
-
-
-# ── Bulk job actions ───────────────────────────────────────────────────────────
-
-class BulkArchiveBody(BaseModel):
-    statuses: List[str]
-
-
-@app.post("/api/jobs/archive")
-def bulk_archive_jobs(body: BulkArchiveBody):
-    from scripts.db import archive_jobs
-    n = archive_jobs(_db_path(), statuses=body.statuses)
-    return {"archived": n}
-
-
-class BulkPurgeBody(BaseModel):
-    statuses: Optional[List[str]] = None
-    target: Optional[str] = None  # "email", "non_remote", "rescrape"
-
-
-@app.post("/api/jobs/purge")
-def bulk_purge_jobs(body: BulkPurgeBody):
-    from scripts.db import purge_jobs, purge_email_data, purge_non_remote
-    if body.target == "email":
-        contacts, jobs = purge_email_data(_db_path())
-        return {"ok": True, "contacts": contacts, "jobs": jobs}
-    if body.target == "non_remote":
-        n = purge_non_remote(_db_path())
-        return {"ok": True, "deleted": n}
-    if body.target == "rescrape":
-        purge_jobs(_db_path(), statuses=["pending", "approved", "rejected"])
-        from scripts.task_runner import submit_task
-        submit_task(_db_path(), "discovery", 0)
-        return {"ok": True}
-    statuses = body.statuses or ["pending", "rejected"]
-    n = purge_jobs(_db_path(), statuses=statuses)
-    return {"ok": True, "deleted": n}
-
-
-class AddJobsBody(BaseModel):
-    urls: List[str]
-
-
-@app.post("/api/jobs/add", status_code=202)
-def add_jobs_by_url(body: AddJobsBody):
-    try:
-        from datetime import datetime as _dt
-        from scripts.scrape_url import canonicalize_url
-        from scripts.db import get_existing_urls, insert_job
-        from scripts.task_runner import submit_task
-        db_path = _db_path()
-        existing = get_existing_urls(db_path)
-        queued = 0
-        for raw_url in body.urls:
-            url = canonicalize_url(raw_url.strip())
-            if not url.startswith("http") or url in existing:
-                continue
-            job_id = insert_job(db_path, {
-                "title": "Importing...", "company": "", "url": url,
-                "source": "manual", "location": "", "description": "",
-                "date_found": _dt.now().isoformat()[:10],
-            })
-            if job_id:
-                submit_task(db_path, "scrape_url", job_id)
-                queued += 1
-        return {"queued": queued}
-    except Exception as e:
-        raise HTTPException(status_code=500, detail=str(e))
-
-
-@app.post("/api/jobs/upload-csv", status_code=202)
-async def upload_jobs_csv(file: UploadFile):
-    try:
-        import csv as _csv
-        import io as _io
-        from datetime import datetime as _dt
-        from scripts.scrape_url import canonicalize_url
-        from scripts.db import get_existing_urls, insert_job
-        from scripts.task_runner import submit_task
-        content = await file.read()
-        reader = _csv.DictReader(_io.StringIO(content.decode("utf-8", errors="replace")))
-        urls: list[str] = []
-        for row in reader:
-            for val in row.values():
-                if val and val.strip().startswith("http"):
-                    urls.append(val.strip())
-                    break
-        db_path = _db_path()
-        existing = get_existing_urls(db_path)
-        queued = 0
-        for raw_url in urls:
-            url = canonicalize_url(raw_url)
-            if not url.startswith("http") or url in existing:
-                continue
-            job_id = insert_job(db_path, {
-                "title": "Importing...", "company": "", "url": url,
-                "source": "manual", "location": "", "description": "",
-                "date_found": _dt.now().isoformat()[:10],
-            })
-            if job_id:
-                submit_task(db_path, "scrape_url", job_id)
-                queued += 1
-        return {"queued": queued, "total": len(urls)}
-    except Exception as e:
-        raise HTTPException(status_code=500, detail=str(e))
-
-
-# ── Setup banners ──────────────────────────────────────────────────────────────
-
-_SETUP_BANNERS = [
-    {"key": "connect_cloud", "text": "Connect a cloud service for resume/cover letter storage", "link": "/settings?tab=integrations"},
-    {"key": "setup_email", "text": "Set up email sync to catch recruiter outreach", "link": "/settings?tab=email"},
-    {"key": "setup_email_labels", "text": "Set up email label filters for auto-classification", "link": "/settings?tab=email"},
-    {"key": "tune_mission", "text": "Tune your mission preferences for better cover letters", "link": "/settings?tab=profile"},
-    {"key": "configure_keywords", "text": "Configure keywords and blocklist for smarter search", "link": "/settings?tab=search"},
-    {"key": "upload_corpus", "text": "Upload your cover letter corpus for voice fine-tuning", "link": "/settings?tab=fine-tune"},
-    {"key": "configure_linkedin", "text": "Configure LinkedIn Easy Apply automation", "link": "/settings?tab=integrations"},
-    {"key": "setup_searxng", "text": "Set up company research with SearXNG", "link": "/settings?tab=system"},
-    {"key": "target_companies", "text": "Build a target company list for focused outreach", "link": "/settings?tab=search"},
-    {"key": "setup_notifications", "text": "Set up notifications for stage changes", "link": "/settings?tab=integrations"},
-    {"key": "tune_model", "text": "Tune a custom cover letter model on your writing", "link": "/settings?tab=fine-tune"},
-    {"key": "review_training", "text": "Review and curate training data for model tuning", "link": "/settings?tab=fine-tune"},
-    {"key": "setup_calendar", "text": "Set up calendar sync to track interview dates", "link": "/settings?tab=integrations"},
-]
-
-
-@app.get("/api/config/setup-banners")
-def get_setup_banners():
-    try:
-        cfg = _load_user_config()
-        if not cfg.get("wizard_complete"):
-            return []
-        dismissed = set(cfg.get("dismissed_banners", []))
-        return [b for b in _SETUP_BANNERS if b["key"] not in dismissed]
-    except Exception:
-        return []
-
-
-@app.post("/api/config/setup-banners/{key}/dismiss")
-def dismiss_setup_banner(key: str):
-    try:
-        cfg = _load_user_config()
-        dismissed = cfg.get("dismissed_banners", [])
-        if key not in dismissed:
-            dismissed.append(key)
-        cfg["dismissed_banners"] = dismissed
-        _save_user_config(cfg)
-        return {"ok": True}
-    except Exception as e:
-        raise HTTPException(status_code=500, detail=str(e))
-
-
 # ── POST /api/stage-signals/{id}/dismiss ─────────────────────────────────
 
 @app.post("/api/stage-signals/{signal_id}/dismiss")
@@ -1340,26 +886,12 @@ def get_app_config():
     valid_profiles = {"remote", "cpu", "single-gpu", "dual-gpu"}
     valid_tiers = {"free", "paid", "premium", "ultra"}
     raw_tier = os.environ.get("APP_TIER", "free")
-
-    # Cloud users always bypass the wizard — they configure through Settings
-    is_cloud = os.environ.get("CLOUD_MODE", "").lower() in ("1", "true")
-    if is_cloud:
-        wizard_complete = True
-    else:
-        try:
-            cfg = load_user_profile(_user_yaml_path())
-            wizard_complete = bool(cfg.get("wizard_complete", False))
-        except Exception:
-            wizard_complete = False
-
     return {
         "isCloud": os.environ.get("CLOUD_MODE", "").lower() in ("1", "true"),
-        "isDemo": os.environ.get("DEMO_MODE", "").lower() in ("1", "true", "yes"),
         "isDevMode": os.environ.get("DEV_MODE", "").lower() in ("1", "true"),
         "tier": raw_tier if raw_tier in valid_tiers else "free",
         "contractedClient": os.environ.get("CONTRACTED_CLIENT", "").lower() in ("1", "true"),
         "inferenceProfile": profile if profile in valid_profiles else "cpu",
-        "wizardComplete": wizard_complete,
     }
 
@@ -1370,7 +902,9 @@ def config_user():
     # Try to read name from user.yaml if present
     try:
         import yaml
-        cfg_path = _user_yaml_path()
+        cfg_path = os.path.join(os.path.dirname(DB_PATH), "config", "user.yaml")
+        if not os.path.exists(cfg_path):
+            cfg_path = "/devl/job-seeker/config/user.yaml"
         with open(cfg_path) as f:
             cfg = yaml.safe_load(f)
         return {"name": cfg.get("name", "")}
@@ -1384,13 +918,11 @@ from scripts.user_profile import load_user_profile, save_user_profile
 
 def _user_yaml_path() -> str:
-    """Resolve user.yaml path relative to the active staging.db.
-
-    In cloud mode the ContextVar holds the per-user db path; elsewhere
-    falls back to STAGING_DB env var. Never crosses user boundaries.
-    """
-    db = _request_db.get() or os.environ.get("STAGING_DB", "/devl/peregrine/staging.db")
-    return os.path.join(os.path.dirname(db), "config", "user.yaml")
+    """Resolve user.yaml path, falling back to legacy location."""
+    cfg_path = os.path.join(os.path.dirname(DB_PATH), "config", "user.yaml")
+    if not os.path.exists(cfg_path):
+        cfg_path = "/devl/job-seeker/config/user.yaml"
+    return cfg_path
 
 
 def _mission_dict_to_list(prefs: object) -> list:
@@ -1457,42 +989,6 @@ class IdentitySyncPayload(BaseModel):
     phone: str = ""
     linkedin_url: str = ""
 
-_VALID_THEMES = frozenset({"auto", "light", "dark", "solarized-dark", "solarized-light", "colorblind"})
-
-class ThemePayload(BaseModel):
-    theme: str
-
-@app.post("/api/settings/theme")
-def set_theme(payload: ThemePayload):
-    """Persist the user's chosen theme to user.yaml."""
-    if payload.theme not in _VALID_THEMES:
-        raise HTTPException(status_code=400, detail=f"Invalid theme: {payload.theme}")
-    try:
-        data = load_user_profile(_user_yaml_path())
-        data["theme"] = payload.theme
-        save_user_profile(_user_yaml_path(), data)
-        return {"ok": True}
-    except Exception as e:
-        raise HTTPException(status_code=500, detail=str(e))
-
-
-class UIPrefPayload(BaseModel):
-    preference: str  # "streamlit" | "vue"
-
-@app.post("/api/settings/ui-preference")
-def set_ui_preference(payload: UIPrefPayload):
-    """Persist UI preference to user.yaml so Streamlit doesn't re-set the cookie."""
-    if payload.preference not in ("streamlit", "vue"):
-        raise HTTPException(status_code=400, detail="preference must be 'streamlit' or 'vue'")
-    try:
-        data = load_user_profile(_user_yaml_path())
-        data["ui_preference"] = payload.preference
-        save_user_profile(_user_yaml_path(), data)
-        return {"ok": True}
-    except Exception as e:
-        raise HTTPException(status_code=500, detail=str(e))
-
-
 @app.post("/api/settings/resume/sync-identity")
 def sync_identity(payload: IdentitySyncPayload):
     """Sync identity fields from profile store back to user.yaml."""
@@ -1532,108 +1028,6 @@ def save_profile(payload: UserProfilePayload):
         raise HTTPException(500, f"Could not save profile: {e}")
 
-# ── Settings: My Profile — LLM generation endpoints ─────────────────────────
-
-def _resume_context_snippet() -> str:
-    """Load a concise resume snippet for use as LLM generation context."""
-    try:
-        rp = _resume_path()
-        if not rp.exists():
-            return ""
-        with open(rp) as f:
-            resume_data = yaml.safe_load(f) or {}
-        parts: list[str] = []
-        if resume_data.get("name"):
-            parts.append(f"Candidate: {resume_data['name']}")
-        if resume_data.get("skills"):
-            parts.append(f"Skills: {', '.join(resume_data['skills'][:20])}")
-        if resume_data.get("experience"):
-            exp = resume_data["experience"]
-            if isinstance(exp, list) and exp:
-                titles = [e.get("title", "") for e in exp[:3] if e.get("title")]
-                if titles:
-                    parts.append(f"Recent roles: {', '.join(titles)}")
-        return "\n".join(parts)
-    except Exception:
-        return ""
-
-
-@app.post("/api/settings/profile/generate-summary")
-def generate_career_summary():
-    """LLM-generate a career summary from the candidate's resume profile."""
-    context = _resume_context_snippet()
-    if not context:
-        raise HTTPException(400, "Resume profile is empty — add experience and skills first")
-    prompt = (
-        "You are a professional resume writer.\n\n"
-        f"Candidate background:\n{context}\n\n"
-        "Write a 2–3 sentence professional career summary in first person. "
-        "Be specific, highlight key strengths, and avoid hollow filler phrases like "
-        "'results-driven' or 'passionate self-starter'."
-    )
-    try:
-        from scripts.llm_router import LLMRouter
-        summary = LLMRouter().complete(prompt)
-        return {"summary": summary.strip()}
-    except Exception as e:
-        raise HTTPException(500, f"LLM generation failed: {e}")
-
-
-@app.post("/api/settings/profile/generate-missions")
-def generate_mission_preferences():
-    """LLM-generate 3 mission/industry preferences from the candidate's resume."""
-    context = _resume_context_snippet()
-    prompt = (
-        "You are helping a job seeker identify mission-aligned industries they would enjoy working in.\n\n"
-        + (f"Candidate background:\n{context}\n\n" if context else "")
-        + "Suggest 3 mission-aligned industries or causes the candidate might care about "
-        "(e.g. animal welfare, education, accessibility, climate tech, healthcare). "
-        "Return a JSON array with exactly 3 objects, each with 'tag' (slug, no spaces), "
-        "'label' (human-readable name), and 'note' (one sentence on why it fits). "
-        "Only output the JSON array, no other text."
-    )
-    try:
-        from scripts.llm_router import LLMRouter
-        import json as _json
-        raw = LLMRouter().complete(prompt)
-        # Extract JSON array from the response
-        start = raw.find("[")
-        end = raw.rfind("]") + 1
-        if start == -1 or end == 0:
-            raise ValueError("LLM did not return a JSON array")
-        items = _json.loads(raw[start:end])
-        # Normalise to {industry, note} — LLM may return {tag, label, note}
-        missions = [
-            {"industry": m.get("label") or m.get("tag") or str(m), "note": m.get("note", "")}
-            for m in items if isinstance(m, dict)
-        ]
-        return {"mission_preferences": missions}
-    except Exception as e:
-        raise HTTPException(500, f"LLM generation failed: {e}")
-
-
-@app.post("/api/settings/profile/generate-voice")
-def generate_candidate_voice():
-    """LLM-generate a candidate voice/writing-style note from the resume profile."""
-    context = _resume_context_snippet()
-    if not context:
-        raise HTTPException(400, "Resume profile is empty — add experience and skills first")
-    prompt = (
-        "You are a professional writing coach helping a job seeker articulate their communication style.\n\n"
-        f"Candidate background:\n{context}\n\n"
-        "Write a 1–2 sentence note describing the candidate's professional voice and writing style "
-        "for use in cover letter generation. This should capture tone (e.g. direct, warm, precise), "
-        "values that come through in their writing, and any standout personality. "
-        "Write it in third person as a style directive (e.g. 'Writes in a clear, direct tone...')."
-    )
-    try:
-        from scripts.llm_router import LLMRouter
-        voice = LLMRouter().complete(prompt)
-        return {"voice": voice.strip()}
-    except Exception as e:
-        raise HTTPException(500, f"LLM generation failed: {e}")
-
-
 # ── Settings: Resume Profile endpoints ───────────────────────────────────────
 
 class WorkEntry(BaseModel):
@@ -1651,66 +1045,16 @@ class ResumePayload(BaseModel):
     veteran_status: str = ""; disability: str = ""
     skills: List[str] = []; domains: List[str] = []; keywords: List[str] = []
 
-def _config_dir() -> Path:
-    """Resolve per-user config directory. Always co-located with user.yaml."""
-    return Path(_user_yaml_path()).parent
-
-def _resume_path() -> Path:
-    """Resolve plain_text_resume.yaml co-located with user.yaml (user-isolated)."""
-    return _config_dir() / "plain_text_resume.yaml"
-
-def _search_prefs_path() -> Path:
-    return _config_dir() / "search_profiles.yaml"
-
-def _license_path() -> Path:
-    return _config_dir() / "license.yaml"
-
-def _tokens_path() -> Path:
-    return _config_dir() / "tokens.yaml"
-
-def _normalize_experience(raw: list) -> list:
-    """Normalize AIHawk-style experience entries to the Vue WorkEntry schema.
-
-    Parser / AIHawk stores: bullets (list[str]), start_date, end_date
-    Vue WorkEntry expects: responsibilities (str), period (str)
-    """
-    out = []
-    for e in raw:
-        if not isinstance(e, dict):
-            continue
-        entry = dict(e)
-        # bullets → responsibilities
-        if "responsibilities" not in entry or not entry["responsibilities"]:
-            bullets = entry.pop("bullets", None) or []
-            if isinstance(bullets, list):
-                entry["responsibilities"] = "\n".join(b for b in bullets if b)
-            elif isinstance(bullets, str):
-                entry["responsibilities"] = bullets
-        else:
-            entry.pop("bullets", None)
-        # start_date + end_date → period
-        if "period" not in entry or not entry["period"]:
-            start = entry.pop("start_date", "") or ""
-            end = entry.pop("end_date", "") or ""
-            entry["period"] = f"{start} – {end}".strip(" –") if (start or end) else ""
-        else:
-            entry.pop("start_date", None)
-            entry.pop("end_date", None)
-        out.append(entry)
-    return out
-
+RESUME_PATH = Path("config/plain_text_resume.yaml")
 
 @app.get("/api/settings/resume")
 def get_resume():
     try:
-        resume_path = _resume_path()
-        if not resume_path.exists():
+        if not RESUME_PATH.exists():
             return {"exists": False}
-        with open(resume_path) as f:
+        with open(RESUME_PATH) as f:
             data = yaml.safe_load(f) or {}
         data["exists"] = True
-        if "experience" in data and isinstance(data["experience"], list):
-            data["experience"] = _normalize_experience(data["experience"])
         return data
     except Exception as e:
         raise HTTPException(status_code=500, detail=str(e))
@@ -1718,9 +1062,8 @@ def get_resume():
 
 @app.put("/api/settings/resume")
 def save_resume(payload: ResumePayload):
     try:
-        resume_path = _resume_path()
-        resume_path.parent.mkdir(parents=True, exist_ok=True)
-        with open(resume_path, "w") as f:
+        RESUME_PATH.parent.mkdir(parents=True, exist_ok=True)
+        with open(RESUME_PATH, "w") as f:
             yaml.dump(payload.model_dump(), f, allow_unicode=True, default_flow_style=False)
         return {"ok": True}
     except Exception as e:
@@ -1729,10 +1072,9 @@ def save_resume(payload: ResumePayload):
 
 @app.post("/api/settings/resume/blank")
 def create_blank_resume():
     try:
-        resume_path = _resume_path()
-        resume_path.parent.mkdir(parents=True, exist_ok=True)
-        if not resume_path.exists():
-            with open(resume_path, "w") as f:
+        RESUME_PATH.parent.mkdir(parents=True, exist_ok=True)
+        if not RESUME_PATH.exists():
+            with open(RESUME_PATH, "w") as f:
                 yaml.dump({}, f)
         return {"ok": True}
     except Exception as e:
@@ -1741,30 +1083,20 @@ def create_blank_resume():
 
 @app.post("/api/settings/resume/upload")
 async def upload_resume(file: UploadFile):
     try:
-        from scripts.resume_parser import (
-            extract_text_from_pdf,
-            extract_text_from_docx,
-            extract_text_from_odt,
-            structure_resume,
-        )
+        from scripts.resume_parser import structure_resume
+        import tempfile, os
         suffix = Path(file.filename).suffix.lower()
-        file_bytes = await file.read()
-
-        if suffix == ".pdf":
-            raw_text = extract_text_from_pdf(file_bytes)
-        elif suffix == ".odt":
-            raw_text = extract_text_from_odt(file_bytes)
-        else:
-            raw_text = extract_text_from_docx(file_bytes)
-
-        result, err = structure_resume(raw_text)
-        if err and not result:
-            return {"ok": False, "error": err}
-        # Persist parsed data so store.load() reads the updated file
-        resume_path = _resume_path()
-        resume_path.parent.mkdir(parents=True, exist_ok=True)
-        with open(resume_path, "w") as f:
-            yaml.dump(result, f, allow_unicode=True, default_flow_style=False)
+        tmp_path = None
+        with tempfile.NamedTemporaryFile(delete=False, suffix=suffix) as tmp:
+            tmp.write(await file.read())
+            tmp_path = tmp.name
+        try:
+            result, err = structure_resume(tmp_path)
+        finally:
+            if tmp_path:
+                os.unlink(tmp_path)
+        if err:
+            return {"ok": False, "error": err, "data": result}
         result["exists"] = True
         return {"ok": True, "data": result}
     except Exception as e:
@@ -1784,13 +1116,14 @@ class SearchPrefsPayload(BaseModel):
     blocklist_industries: List[str] = []
     blocklist_locations: List[str] = []
 
+SEARCH_PREFS_PATH = Path("config/search_profiles.yaml")
+
 @app.get("/api/settings/search")
 def get_search_prefs():
     try:
-        p = _search_prefs_path()
-        if not p.exists():
+        if not SEARCH_PREFS_PATH.exists():
             return {}
-        with open(p) as f:
+        with open(SEARCH_PREFS_PATH) as f:
             data = yaml.safe_load(f) or {}
         return data.get("default", {})
     except Exception as e:
@@ -1799,72 +1132,24 @@ def get_search_prefs():
 
 @app.put("/api/settings/search")
 def save_search_prefs(payload: SearchPrefsPayload):
     try:
-        p = _search_prefs_path()
         data = {}
-        if p.exists():
-            with open(p) as f:
+        if SEARCH_PREFS_PATH.exists():
+            with open(SEARCH_PREFS_PATH) as f:
                 data = yaml.safe_load(f) or {}
         data["default"] = payload.model_dump()
-        p.parent.mkdir(parents=True, exist_ok=True)
-        with open(p, "w") as f:
+        with open(SEARCH_PREFS_PATH, "w") as f:
             yaml.dump(data, f, allow_unicode=True, default_flow_style=False)
         return {"ok": True}
     except Exception as e:
         raise HTTPException(status_code=500, detail=str(e))
 
-class SearchSuggestPayload(BaseModel):
-    type: str  # "titles" | "locations" | "exclude_keywords"
-    current: List[str] = []
-
 @app.post("/api/settings/search/suggest")
-def suggest_search(payload: SearchSuggestPayload):
-    """LLM-generate suggestions for job titles, locations, or exclude keywords."""
-    context = _resume_context_snippet()
-    current_str = ", ".join(payload.current) if payload.current else "none"
-
-    if payload.type == "titles":
-        prompt = (
-            "You are a career advisor helping a job seeker identify relevant job titles.\n\n"
-            + (f"Candidate background:\n{context}\n\n" if context else "")
-            + f"Current job titles they're searching for: {current_str}\n\n"
-            "Suggest 5 additional relevant job titles they may have missed. "
-            "Return only a JSON array of strings, no other text. "
-            "Example: [\"Senior Software Engineer\", \"Staff Engineer\"]"
-        )
-    elif payload.type == "locations":
-        prompt = (
-            "You are a career advisor helping a job seeker identify relevant job markets.\n\n"
-            + (f"Candidate background:\n{context}\n\n" if context else "")
-            + f"Current locations they're searching in: {current_str}\n\n"
-            "Suggest 5 relevant locations or remote options they may have missed. "
-            "Include 'Remote' if not already listed. "
-            "Return only a JSON array of strings, no other text."
-        )
-    elif payload.type == "exclude_keywords":
-        prompt = (
-            "You are a job search assistant helping a job seeker filter out irrelevant listings.\n\n"
-            + (f"Candidate background:\n{context}\n\n" if context else "")
-            + f"Keywords they already exclude: {current_str}\n\n"
-            "Suggest 5–8 keywords or phrases they should add to their exclude list to avoid "
-            "irrelevant postings (e.g. management roles they don't want, clearance requirements, "
-            "technologies they don't work with). "
-            "Return only a JSON array of strings, no other text."
-        )
-    else:
-        raise HTTPException(400, f"Unknown suggestion type: {payload.type}")
-
+def suggest_search(body: dict):
     try:
-        import json as _json
-        from scripts.llm_router import LLMRouter
-        raw = LLMRouter().complete(prompt)
-        start = raw.find("[")
-        end = raw.rfind("]") + 1
-        if start == -1 or end == 0:
-            return {"suggestions": []}
-        suggestions = _json.loads(raw[start:end])
-        return {"suggestions": [str(s) for s in suggestions if s]}
+        # Stub — LLM suggest for paid tier
+        return {"suggestions": []}
     except Exception as e:
-        raise HTTPException(500, f"LLM generation failed: {e}")
+        raise HTTPException(status_code=500, detail=str(e))
 
 
 # ── Settings: System — LLM Backends + BYOK endpoints ─────────────────────────
@@ -1980,7 +1265,7 @@ def stop_service(name: str):
 
 # ── Settings: System — Email ──────────────────────────────────────────────────
 
-# EMAIL_PATH is resolved per-request via _config_dir()
+EMAIL_PATH = Path("config/email.yaml")
 EMAIL_CRED_SERVICE = "peregrine"
 EMAIL_CRED_KEY = "imap_password"
 
@@ -1992,9 +1277,8 @@ EMAIL_YAML_FIELDS = ("host", "port", "ssl", "username", "sent_folder", "lookback
 def get_email_config():
     try:
         config = {}
-        ep = _config_dir() / "email.yaml"
-        if ep.exists():
-            with open(ep) as f:
+        if EMAIL_PATH.exists():
+            with open(EMAIL_PATH) as f:
                 config = yaml.safe_load(f) or {}
         # Never return the password — only indicate whether it's set
         password = get_credential(EMAIL_CRED_SERVICE, EMAIL_CRED_KEY)
@@ -2008,8 +1292,7 @@ def get_email_config():
 
 @app.put("/api/settings/system/email")
 def save_email_config(payload: dict):
     try:
-        ep = _config_dir() / "email.yaml"
-        ep.parent.mkdir(parents=True, exist_ok=True)
+        EMAIL_PATH.parent.mkdir(parents=True, exist_ok=True)
         # Extract password before writing yaml; discard the sentinel boolean regardless
         password = payload.pop("password", None)
         payload.pop("password_set", None)  # always discard — boolean sentinel, not a secret
@@ -2017,7 +1300,7 @@
set_credential(EMAIL_CRED_SERVICE, EMAIL_CRED_KEY, password) # Write non-secret fields to yaml (chmod 600 still, contains username) safe_config = {k: v for k, v in payload.items() if k in EMAIL_YAML_FIELDS} - fd = os.open(str(ep), os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600) + fd = os.open(str(EMAIL_PATH), os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600) with os.fdopen(fd, "w") as f: yaml.dump(safe_config, f, allow_unicode=True, default_flow_style=False) return {"ok": True} @@ -2169,73 +1452,18 @@ def save_deploy_config(payload: dict): # ── Settings: Fine-Tune ─────────────────────────────────────────────────────── -_TRAINING_JSONL = Path("/Library/Documents/JobSearch/training_data/cover_letters.jsonl") - - -def _load_training_pairs() -> list[dict]: - """Load training pairs from the JSONL file. Returns empty list if missing.""" - if not _TRAINING_JSONL.exists(): - return [] - pairs = [] - with open(_TRAINING_JSONL, encoding="utf-8") as f: - for line in f: - line = line.strip() - if line: - try: - pairs.append(json.loads(line)) - except json.JSONDecodeError: - pass - return pairs - - -def _save_training_pairs(pairs: list[dict]) -> None: - _TRAINING_JSONL.parent.mkdir(parents=True, exist_ok=True) - with open(_TRAINING_JSONL, "w", encoding="utf-8") as f: - for p in pairs: - f.write(json.dumps(p, ensure_ascii=False) + "\n") - - @app.get("/api/settings/fine-tune/status") def finetune_status(): try: - pairs_count = len(_load_training_pairs()) from scripts.task_runner import get_task_status task = get_task_status("finetune_extract") - if task: - # Prefer the DB task count if available and larger (recent extraction) - db_count = task.get("result_count", 0) or 0 - pairs_count = max(pairs_count, db_count) - status = task.get("status", "idle") if task else "idle" - # Stub quota for self-hosted; cloud overrides via its own middleware - return {"status": status, "pairs_count": pairs_count, "quota_remaining": None} + if not task: + return {"status": "idle", "pairs_count": 0} + 
return {"status": task.get("status", "idle"), "pairs_count": task.get("result_count", 0)} except Exception as e: raise HTTPException(status_code=500, detail=str(e)) -@app.get("/api/settings/fine-tune/pairs") -def list_training_pairs(): - """Return training pairs with index for display and removal.""" - pairs = _load_training_pairs() - return { - "pairs": [ - {"index": i, "instruction": p.get("instruction", ""), "source_file": p.get("source_file", "")} - for i, p in enumerate(pairs) - ], - "total": len(pairs), - } - - -@app.delete("/api/settings/fine-tune/pairs/{index}") -def delete_training_pair(index: int): - """Remove a training pair by index.""" - pairs = _load_training_pairs() - if index < 0 or index >= len(pairs): - raise HTTPException(404, "Pair index out of range") - pairs.pop(index) - _save_training_pairs(pairs) - return {"ok": True, "remaining": len(pairs)} - - @app.post("/api/settings/fine-tune/extract") def finetune_extract(): try: @@ -2266,11 +1494,12 @@ async def finetune_upload(files: list[UploadFile]): @app.post("/api/settings/fine-tune/submit") def finetune_submit(): - """Trigger prepare_training_data extraction and queue fine-tune background task.""" try: - from scripts.task_runner import submit_task - task_id, is_new = submit_task(Path(DB_PATH), "prepare_training", None) - return {"job_id": str(task_id), "is_new": is_new} + # Cloud-only: submit a managed fine-tune job + # In dev mode, stub a job_id for local testing + import uuid + job_id = str(uuid.uuid4()) + return {"job_id": job_id} except Exception as e: raise HTTPException(status_code=500, detail=str(e)) @@ -2290,7 +1519,12 @@ def finetune_local_status(): # ── Settings: License ───────────────────────────────────────────────────────── -# _config_dir() / _license_path() / _tokens_path() are per-request (see helpers above) +# CONFIG_DIR resolves relative to staging.db location (same convention as _user_yaml_path) +CONFIG_DIR = Path(os.path.dirname(DB_PATH)) / "config" +if not 
CONFIG_DIR.exists(): + CONFIG_DIR = Path("/devl/job-seeker/config") + +LICENSE_PATH = CONFIG_DIR / "license.yaml" def _load_user_config() -> dict: @@ -2306,9 +1540,8 @@ def _save_user_config(cfg: dict) -> None: @app.get("/api/settings/license") def get_license(): try: - lp = _license_path() - if lp.exists(): - with open(lp) as f: + if LICENSE_PATH.exists(): + with open(LICENSE_PATH) as f: data = yaml.safe_load(f) or {} else: data = {} @@ -2332,10 +1565,9 @@ def activate_license(payload: LicenseActivatePayload): key = payload.key.strip() if not re.match(r'^CFG-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}$', key): return {"ok": False, "error": "Invalid key format"} - lp = _license_path() data = {"tier": "paid", "key": key, "active": True} - lp.parent.mkdir(parents=True, exist_ok=True) - fd = os.open(str(lp), os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600) + CONFIG_DIR.mkdir(parents=True, exist_ok=True) + fd = os.open(str(LICENSE_PATH), os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600) with os.fdopen(fd, "w") as f: yaml.dump(data, f, allow_unicode=True, default_flow_style=False) return {"ok": True, "tier": "paid"} @@ -2346,12 +1578,11 @@ def activate_license(payload: LicenseActivatePayload): @app.post("/api/settings/license/deactivate") def deactivate_license(): try: - lp = _license_path() - if lp.exists(): - with open(lp) as f: + if LICENSE_PATH.exists(): + with open(LICENSE_PATH) as f: data = yaml.safe_load(f) or {} data["active"] = False - fd = os.open(str(lp), os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600) + fd = os.open(str(LICENSE_PATH), os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600) with os.fdopen(fd, "w") as f: yaml.dump(data, f, allow_unicode=True, default_flow_style=False) return {"ok": True} @@ -2369,19 +1600,18 @@ def create_backup(payload: BackupCreatePayload): try: import zipfile import datetime - cfg_dir = _config_dir() - backup_dir = cfg_dir.parent / "backups" + backup_dir = Path("data/backups") backup_dir.mkdir(parents=True, exist_ok=True) ts = 
datetime.datetime.now().strftime("%Y%m%d_%H%M%S") dest = backup_dir / f"peregrine_backup_{ts}.zip" file_count = 0 with zipfile.ZipFile(dest, "w", zipfile.ZIP_DEFLATED) as zf: - for cfg_file in cfg_dir.glob("*.yaml"): + for cfg_file in CONFIG_DIR.glob("*.yaml"): if cfg_file.name not in ("tokens.yaml",): zf.write(cfg_file, f"config/{cfg_file.name}") file_count += 1 if payload.include_db: - db_path = Path(_request_db.get() or DB_PATH) + db_path = Path(DB_PATH) if db_path.exists(): zf.write(db_path, "data/staging.db") file_count += 1 @@ -2425,14 +1655,15 @@ def save_privacy(payload: dict): # ── Settings: Developer ─────────────────────────────────────────────────────── +TOKENS_PATH = CONFIG_DIR / "tokens.yaml" + @app.get("/api/settings/developer") def get_developer(): try: cfg = _load_user_config() tokens = {} - tp = _tokens_path() - if tp.exists(): - with open(tp) as f: + if TOKENS_PATH.exists(): + with open(TOKENS_PATH) as f: tokens = yaml.safe_load(f) or {} return { "dev_tier_override": cfg.get("dev_tier_override"), @@ -2506,288 +1737,3 @@ def export_classifier(): return {"ok": True, "count": len(emails), "path": str(export_path)} except Exception as e: raise HTTPException(status_code=500, detail=str(e)) - - -# ── Wizard API ──────────────────────────────────────────────────────────────── -# -# These endpoints back the Vue SPA first-run onboarding wizard. -# State is persisted to user.yaml on every step so the wizard can resume -# after a browser refresh or crash (mirrors the Streamlit wizard behaviour). 
- -_WIZARD_PROFILES = ("remote", "cpu", "single-gpu", "dual-gpu") -_WIZARD_TIERS = ("free", "paid", "premium") - - -def _wizard_yaml_path() -> str: - """Same resolution logic as _user_yaml_path() — single source of truth.""" - return _user_yaml_path() - - -def _load_wizard_yaml() -> dict: - try: - return load_user_profile(_wizard_yaml_path()) or {} - except Exception: - return {} - - -def _save_wizard_yaml(updates: dict) -> None: - path = _wizard_yaml_path() - existing = _load_wizard_yaml() - existing.update(updates) - save_user_profile(path, existing) - - -def _detect_gpus() -> list[str]: - """Detect GPUs. Prefers PEREGRINE_GPU_NAMES env var (set by preflight).""" - env_names = os.environ.get("PEREGRINE_GPU_NAMES", "").strip() - if env_names: - return [n.strip() for n in env_names.split(",") if n.strip()] - try: - out = subprocess.check_output( - ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"], - text=True, timeout=5, - ) - return [line.strip() for line in out.strip().splitlines() if line.strip()] - except Exception: - return [] - - -def _suggest_profile(gpus: list[str]) -> str: - recommended = os.environ.get("RECOMMENDED_PROFILE", "").strip() - if recommended and recommended in _WIZARD_PROFILES: - return recommended - if len(gpus) >= 2: - return "dual-gpu" - if len(gpus) == 1: - return "single-gpu" - return "remote" - - -@app.get("/api/wizard/status") -def wizard_status(): - """Return current wizard state for resume-after-refresh. - - wizard_complete=True means the wizard has been finished and the app - should not redirect to /setup. wizard_step is the last completed step - (0 = not started); the SPA advances to step+1 on load. 
- """ - cfg = _load_wizard_yaml() - return { - "wizard_complete": bool(cfg.get("wizard_complete", False)), - "wizard_step": int(cfg.get("wizard_step", 0)), - "saved_data": { - "inference_profile": cfg.get("inference_profile", ""), - "tier": cfg.get("tier", "free"), - "name": cfg.get("name", ""), - "email": cfg.get("email", ""), - "phone": cfg.get("phone", ""), - "linkedin": cfg.get("linkedin", ""), - "career_summary": cfg.get("career_summary", ""), - "services": cfg.get("services", {}), - }, - } - - -class WizardStepPayload(BaseModel): - step: int - data: dict = {} - - -@app.post("/api/wizard/step") -def wizard_save_step(payload: WizardStepPayload): - """Persist a single wizard step and advance the step counter. - - Side effects by step number: - - Step 3 (Resume): writes config/plain_text_resume.yaml - - Step 5 (Inference): writes API keys into .env - - Step 6 (Search): writes config/search_profiles.yaml - """ - step = payload.step - data = payload.data - - if step < 1 or step > 7: - raise HTTPException(status_code=400, detail="step must be 1–7") - - updates: dict = {"wizard_step": step} - - # ── Step-specific field extraction ──────────────────────────────────────── - if step == 1: - profile = data.get("inference_profile", "remote") - if profile not in _WIZARD_PROFILES: - raise HTTPException(status_code=400, detail=f"Unknown profile: {profile}") - updates["inference_profile"] = profile - - elif step == 2: - tier = data.get("tier", "free") - if tier not in _WIZARD_TIERS: - raise HTTPException(status_code=400, detail=f"Unknown tier: {tier}") - updates["tier"] = tier - - elif step == 3: - # Resume data: persist to plain_text_resume.yaml - resume = data.get("resume", {}) - if resume: - resume_path = Path(_wizard_yaml_path()).parent / "plain_text_resume.yaml" - resume_path.parent.mkdir(parents=True, exist_ok=True) - with open(resume_path, "w") as f: - yaml.dump(resume, f, allow_unicode=True, default_flow_style=False) - - elif step == 4: - for field in ("name", 
"email", "phone", "linkedin", "career_summary"): - if field in data: - updates[field] = data[field] - - elif step == 5: - # Write API keys to .env (never store in user.yaml) - env_path = Path(_wizard_yaml_path()).parent.parent / ".env" - env_lines = env_path.read_text().splitlines() if env_path.exists() else [] - - def _set_env_key(lines: list[str], key: str, val: str) -> list[str]: - for i, line in enumerate(lines): - if line.startswith(f"{key}="): - lines[i] = f"{key}={val}" - return lines - lines.append(f"{key}={val}") - return lines - - if data.get("anthropic_key"): - env_lines = _set_env_key(env_lines, "ANTHROPIC_API_KEY", data["anthropic_key"]) - if data.get("openai_url"): - env_lines = _set_env_key(env_lines, "OPENAI_COMPAT_URL", data["openai_url"]) - if data.get("openai_key"): - env_lines = _set_env_key(env_lines, "OPENAI_COMPAT_KEY", data["openai_key"]) - if any(data.get(k) for k in ("anthropic_key", "openai_url", "openai_key")): - env_path.parent.mkdir(parents=True, exist_ok=True) - env_path.write_text("\n".join(env_lines) + "\n") - - if "services" in data: - updates["services"] = data["services"] - - elif step == 6: - # Persist search preferences to search_profiles.yaml - titles = data.get("titles", []) - locations = data.get("locations", []) - search_path = _search_prefs_path() - existing_search: dict = {} - if search_path.exists(): - with open(search_path) as f: - existing_search = yaml.safe_load(f) or {} - default_profile = existing_search.get("default", {}) - default_profile["job_titles"] = titles - default_profile["location"] = locations - existing_search["default"] = default_profile - search_path.parent.mkdir(parents=True, exist_ok=True) - with open(search_path, "w") as f: - yaml.dump(existing_search, f, allow_unicode=True, default_flow_style=False) - - # Step 7 (integrations) has no extra side effects here — connections are - # handled by the existing /api/settings/system/integrations/{id}/connect. 
- - try: - _save_wizard_yaml(updates) - except Exception as e: - raise HTTPException(status_code=500, detail=str(e)) - - return {"ok": True, "step": step} - - -@app.get("/api/wizard/hardware") -def wizard_hardware(): - """Detect GPUs and suggest an inference profile.""" - gpus = _detect_gpus() - suggested = _suggest_profile(gpus) - return { - "gpus": gpus, - "suggested_profile": suggested, - "profiles": list(_WIZARD_PROFILES), - } - - -class WizardInferenceTestPayload(BaseModel): - profile: str = "remote" - anthropic_key: str = "" - openai_url: str = "" - openai_key: str = "" - ollama_host: str = "localhost" - ollama_port: int = 11434 - - -@app.post("/api/wizard/inference/test") -def wizard_test_inference(payload: WizardInferenceTestPayload): - """Test LLM or Ollama connectivity. - - Always returns {ok, message} — a connection failure is reported as a - soft warning (message), not an HTTP error, so the wizard can let the - user continue past a temporarily-down Ollama instance. - """ - if payload.profile == "remote": - try: - # Temporarily inject key if provided (don't persist yet) - env_override = {} - if payload.anthropic_key: - env_override["ANTHROPIC_API_KEY"] = payload.anthropic_key - if payload.openai_url: - env_override["OPENAI_COMPAT_URL"] = payload.openai_url - if payload.openai_key: - env_override["OPENAI_COMPAT_KEY"] = payload.openai_key - - old_env = {k: os.environ.get(k) for k in env_override} - os.environ.update(env_override) - try: - from scripts.llm_router import LLMRouter - result = LLMRouter().complete("Reply with only the word: OK") - ok = bool(result and result.strip()) - message = "LLM responding." if ok else "LLM returned an empty response." 
- finally: - for k, v in old_env.items(): - if v is None: - os.environ.pop(k, None) - else: - os.environ[k] = v - except Exception as exc: - return {"ok": False, "message": f"LLM test failed: {exc}"} - else: - # Local profile — ping Ollama - ollama_url = f"http://{payload.ollama_host}:{payload.ollama_port}" - try: - resp = requests.get(f"{ollama_url}/api/tags", timeout=5) - ok = resp.status_code == 200 - message = "Ollama is running." if ok else f"Ollama returned HTTP {resp.status_code}." - except Exception: - # Soft-fail: user can skip and configure later - return { - "ok": False, - "message": ( - "Ollama not responding — you can continue and configure it later " - "in Settings → System." - ), - } - - return {"ok": ok, "message": message} - - -@app.post("/api/wizard/complete") -def wizard_complete(): - """Finalise the wizard: set wizard_complete=true, apply service URLs.""" - try: - from scripts.user_profile import UserProfile - from scripts.generate_llm_config import apply_service_urls - - yaml_path = _wizard_yaml_path() - llm_yaml = Path(yaml_path).parent / "llm.yaml" - - try: - profile_obj = UserProfile(yaml_path) - if llm_yaml.exists(): - apply_service_urls(profile_obj, llm_yaml) - except Exception: - pass # don't block completion on llm.yaml errors - - cfg = _load_wizard_yaml() - cfg["wizard_complete"] = True - cfg.pop("wizard_step", None) - save_user_profile(yaml_path, cfg) - - return {"ok": True} - except Exception as e: - raise HTTPException(status_code=500, detail=str(e)) diff --git a/docker/web/Dockerfile b/docker/web/Dockerfile deleted file mode 100644 index a2e4119..0000000 --- a/docker/web/Dockerfile +++ /dev/null @@ -1,15 +0,0 @@ -# Stage 1: build -FROM node:20-alpine AS build -WORKDIR /app -COPY web/package*.json ./ -RUN npm ci --prefer-offline -COPY web/ ./ -ARG VITE_BASE_PATH=/ -ENV VITE_BASE_PATH=${VITE_BASE_PATH} -RUN npm run build - -# Stage 2: serve -FROM nginx:alpine -COPY docker/web/nginx.conf /etc/nginx/conf.d/default.conf -COPY 
--from=build /app/dist /usr/share/nginx/html -EXPOSE 80 diff --git a/docker/web/nginx.conf b/docker/web/nginx.conf deleted file mode 100644 index 2107e1a..0000000 --- a/docker/web/nginx.conf +++ /dev/null @@ -1,29 +0,0 @@ -server { - listen 80; - server_name _; - - client_max_body_size 20m; - - root /usr/share/nginx/html; - index index.html; - - # Proxy API calls to the FastAPI backend service - location /api/ { - proxy_pass http://api:8601; - proxy_set_header Host $host; - proxy_set_header X-Real-IP $remote_addr; - proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; - proxy_read_timeout 120s; - } - - # Cache static assets - location ~* \.(js|css|png|jpg|jpeg|gif|ico|svg|woff2?)$ { - expires 1y; - add_header Cache-Control "public, immutable"; - } - - # SPA fallback — must come after API and assets - location / { - try_files $uri $uri/ /index.html; - } -} diff --git a/docs/developer-guide/contributing.md b/docs/developer-guide/contributing.md index e4d6261..d160182 100644 --- a/docs/developer-guide/contributing.md +++ b/docs/developer-guide/contributing.md @@ -102,23 +102,6 @@ Before opening a pull request: --- -## Database Migrations - -Peregrine uses a numbered SQL migration system (Rails-style). Each migration is a `.sql` file in the `migrations/` directory at the repo root, named `NNN_description.sql` (e.g. `002_add_foo_column.sql`). Applied migrations are tracked in a `schema_migrations` table in each user database. - -### Adding a migration - -1. Create `migrations/NNN_description.sql` where `NNN` is the next sequential number (zero-padded to 3 digits). -2. Write standard SQL — `CREATE TABLE IF NOT EXISTS`, `ALTER TABLE ADD COLUMN`, etc. Keep each migration idempotent where possible. -3. Do **not** modify `scripts/db.py`'s legacy `_MIGRATIONS` lists — those are superseded and will be removed once all active databases have been bootstrapped by the migration runner. -4. 
The runner (`scripts/db_migrate.py`) applies pending migrations at startup automatically (both FastAPI and Streamlit paths call `migrate_db(db_path)`). - -### Rollbacks - -SQLite does not support transactional DDL for all statement types. Write forward-only migrations. If you need to undo a schema change, add a new migration that reverses it. - ---- - ## What NOT to Do - Do not commit `config/user.yaml`, `config/notion.yaml`, `config/email.yaml`, `config/adzuna.yaml`, or any `config/integrations/*.yaml` — all are gitignored diff --git a/environment.yml b/environment.yml index b4f109a..18b23d9 100644 --- a/environment.yml +++ b/environment.yml @@ -1,4 +1,4 @@ -name: cf +name: job-seeker # Recreate: conda env create -f environment.yml # Update pinned snapshot: conda env export --no-builds > environment.yml channels: diff --git a/manage.sh b/manage.sh index a1a233e..69176c3 100755 --- a/manage.sh +++ b/manage.sh @@ -32,7 +32,6 @@ usage() { echo -e " ${GREEN}logs [service]${NC} Tail logs (default: app)" echo -e " ${GREEN}update${NC} Pull latest images + rebuild app" echo -e " ${GREEN}preflight${NC} Check ports + resources; write .env" - echo -e " ${GREEN}models${NC} Check ollama models in config; pull any missing" echo -e " ${GREEN}test${NC} Run test suite" echo -e " ${GREEN}e2e [mode]${NC} Run E2E tests (mode: demo|cloud|local, default: demo)" echo -e " Set E2E_HEADLESS=false to run headed via Xvfb" @@ -92,12 +91,6 @@ case "$CMD" in make preflight PROFILE="$PROFILE" ;; - models) - info "Checking ollama models..." - conda run -n cf python scripts/preflight.py --models-only - success "Model check complete." - ;; - start) info "Starting Peregrine (PROFILE=${PROFILE})..." make start PROFILE="$PROFILE" @@ -140,7 +133,7 @@ case "$CMD" in && echo "docker compose" \ || (command -v podman >/dev/null 2>&1 && echo "podman compose" || echo "podman-compose"))" $COMPOSE pull searxng ollama 2>/dev/null || true - $COMPOSE build app web + $COMPOSE build app success "Update complete. 
Run './manage.sh restart' to apply." ;; @@ -190,7 +183,7 @@ case "$CMD" in RUNNER="" fi info "Running E2E tests (mode=${MODE}, headless=${HEADLESS})..." - $RUNNER conda run -n cf pytest tests/e2e/ \ + $RUNNER conda run -n job-seeker pytest tests/e2e/ \ --mode="${MODE}" \ --json-report \ --json-report-file="${RESULTS_DIR}/report.json" \ diff --git a/migrations/001_baseline.sql b/migrations/001_baseline.sql deleted file mode 100644 index 36e3526..0000000 --- a/migrations/001_baseline.sql +++ /dev/null @@ -1,97 +0,0 @@ --- Migration 001: Baseline schema --- Captures the full schema as of v0.8.5 (all columns including those added via ALTER TABLE) - -CREATE TABLE IF NOT EXISTS jobs ( - id INTEGER PRIMARY KEY AUTOINCREMENT, - title TEXT, - company TEXT, - url TEXT UNIQUE, - source TEXT, - location TEXT, - is_remote INTEGER DEFAULT 0, - salary TEXT, - description TEXT, - match_score REAL, - keyword_gaps TEXT, - date_found TEXT, - status TEXT DEFAULT 'pending', - notion_page_id TEXT, - cover_letter TEXT, - applied_at TEXT, - interview_date TEXT, - rejection_stage TEXT, - phone_screen_at TEXT, - interviewing_at TEXT, - offer_at TEXT, - hired_at TEXT, - survey_at TEXT, - calendar_event_id TEXT, - optimized_resume TEXT, - ats_gap_report TEXT -); - -CREATE TABLE IF NOT EXISTS job_contacts ( - id INTEGER PRIMARY KEY AUTOINCREMENT, - job_id INTEGER, - direction TEXT, - subject TEXT, - from_addr TEXT, - to_addr TEXT, - body TEXT, - received_at TEXT, - is_response_needed INTEGER DEFAULT 0, - responded_at TEXT, - message_id TEXT, - stage_signal TEXT, - suggestion_dismissed INTEGER DEFAULT 0 -); - -CREATE TABLE IF NOT EXISTS company_research ( - id INTEGER PRIMARY KEY AUTOINCREMENT, - job_id INTEGER UNIQUE, - generated_at TEXT, - company_brief TEXT, - ceo_brief TEXT, - talking_points TEXT, - raw_output TEXT, - tech_brief TEXT, - funding_brief TEXT, - competitors_brief TEXT, - red_flags TEXT, - scrape_used INTEGER DEFAULT 0, - accessibility_brief TEXT -); - -CREATE TABLE IF NOT 
EXISTS background_tasks ( - id INTEGER PRIMARY KEY AUTOINCREMENT, - task_type TEXT, - job_id INTEGER, - params TEXT, - status TEXT DEFAULT 'pending', - error TEXT, - created_at TEXT, - started_at TEXT, - finished_at TEXT, - stage TEXT, - updated_at TEXT -); - -CREATE TABLE IF NOT EXISTS survey_responses ( - id INTEGER PRIMARY KEY AUTOINCREMENT, - job_id INTEGER, - survey_name TEXT, - received_at TEXT, - source TEXT, - raw_input TEXT, - image_path TEXT, - mode TEXT, - llm_output TEXT, - reported_score REAL, - created_at TEXT -); - -CREATE TABLE IF NOT EXISTS digest_queue ( - id INTEGER PRIMARY KEY AUTOINCREMENT, - job_contact_id INTEGER UNIQUE, - created_at TEXT -); diff --git a/requirements.txt b/requirements.txt index 1c381e9..44c5506 100644 --- a/requirements.txt +++ b/requirements.txt @@ -2,15 +2,6 @@ # Extracted from environment.yml for Docker pip installs # Keep in sync with environment.yml -# ── CircuitForge shared core ─────────────────────────────────────────────── -# Requires circuitforge-core >= 0.8.0 (config.load_env, db, tasks; resources moved to circuitforge-orch). -# Local dev / Docker (parent-context build): path install works because -# circuitforge-core/ is a sibling directory. -# CI / fresh checkouts: falls back to the Forgejo VCS URL below. -# To use local editable install run: pip install -e ../circuitforge-core -# TODO: pin to @v0.7.0 tag once cf-core cuts a release tag. 
-git+https://git.opensourcesolarpunk.com/Circuit-Forge/circuitforge-core.git@main - # ── Web UI ──────────────────────────────────────────────────────────────── streamlit>=1.35 watchdog @@ -87,10 +78,3 @@ lxml # ── Documentation ──────────────────────────────────────────────────────── mkdocs>=1.5 mkdocs-material>=9.5 - -# ── Vue SPA API backend ────────────────────────────────────────────────── -fastapi>=0.100.0 -uvicorn[standard]>=0.20.0 -PyJWT>=2.8.0 -cryptography>=40.0.0 -python-multipart>=0.0.6 diff --git a/scripts/custom_boards/adzuna.py b/scripts/custom_boards/adzuna.py index 2188d12..fa57bdc 100644 --- a/scripts/custom_boards/adzuna.py +++ b/scripts/custom_boards/adzuna.py @@ -70,7 +70,7 @@ def scrape(profile: dict, location: str, results_wanted: int = 50) -> list[dict] print(f" [adzuna] Skipped — {exc}") return [] - titles = profile.get("titles") or profile.get("job_titles", []) + titles = profile.get("titles", []) hours_old = profile.get("hours_old", 240) max_days_old = max(1, hours_old // 24) is_remote_search = location.lower() == "remote" diff --git a/scripts/custom_boards/craigslist.py b/scripts/custom_boards/craigslist.py index 92696d2..30226ae 100644 --- a/scripts/custom_boards/craigslist.py +++ b/scripts/custom_boards/craigslist.py @@ -121,7 +121,7 @@ def scrape(profile: dict, location: str, results_wanted: int = 50) -> list[dict] return [] metros = [metro] - titles: list[str] = profile.get("titles") or profile.get("job_titles", []) + titles: list[str] = profile.get("titles", []) hours_old: int = profile.get("hours_old", 240) cutoff = datetime.now(tz=timezone.utc).timestamp() - (hours_old * 3600) diff --git a/scripts/custom_boards/theladders.py b/scripts/custom_boards/theladders.py index 47fb462..d7330af 100644 --- a/scripts/custom_boards/theladders.py +++ b/scripts/custom_boards/theladders.py @@ -107,7 +107,7 @@ def scrape(profile: dict, location: str, results_wanted: int = 50) -> list[dict] ) page = ctx.new_page() - for title in 
(profile.get("titles") or profile.get("job_titles", [])): + for title in profile.get("titles", []): if len(results) >= results_wanted: break diff --git a/scripts/db.py b/scripts/db.py index 0e6bd5f..4afbd77 100644 --- a/scripts/db.py +++ b/scripts/db.py @@ -9,14 +9,30 @@ from datetime import datetime from pathlib import Path from typing import Optional -from circuitforge_core.db import get_connection as _cf_get_connection - DEFAULT_DB = Path(os.environ.get("STAGING_DB", Path(__file__).parent.parent / "staging.db")) def get_connection(db_path: Path = DEFAULT_DB, key: str = "") -> "sqlite3.Connection": - """Thin shim — delegates to circuitforge_core.db.get_connection.""" - return _cf_get_connection(db_path, key) + """ + Open a database connection. + + In cloud mode with a key: uses SQLCipher (AES-256 encrypted, API-identical to sqlite3). + Otherwise: vanilla sqlite3. + + Args: + db_path: Path to the SQLite/SQLCipher database file. + key: SQLCipher encryption key (hex string). Empty = unencrypted. 
+    import os as _os
+    cloud_mode = _os.environ.get("CLOUD_MODE", "").lower() in ("1", "true", "yes")
+    if cloud_mode and key:
+        from pysqlcipher3 import dbapi2 as _sqlcipher
+        conn = _sqlcipher.connect(str(db_path))
+        # Escape single quotes — the key is interpolated into the PRAGMA, and
+        # PRAGMA statements cannot take bound parameters.
+        conn.execute("PRAGMA key = '{}'".format(key.replace("'", "''")))
+        return conn
+    else:
+        import sqlite3 as _sqlite3
+        return _sqlite3.connect(str(db_path))


 CREATE_JOBS = """
@@ -141,8 +157,6 @@ _MIGRATIONS = [
     ("hired_at", "TEXT"),
     ("survey_at", "TEXT"),
     ("calendar_event_id", "TEXT"),
-    ("optimized_resume", "TEXT"),  # ATS-rewritten resume text (paid tier)
-    ("ats_gap_report", "TEXT"),  # JSON gap report (free tier)
 ]


@@ -313,38 +327,6 @@ def update_cover_letter(db_path: Path = DEFAULT_DB, job_id: int = None, text: st
         conn.close()


-def save_optimized_resume(db_path: Path = DEFAULT_DB, job_id: int = None,
-                          text: str = "", gap_report: str = "") -> None:
-    """Persist ATS-optimized resume text and/or gap report for a job."""
-    if job_id is None:
-        return
-    conn = sqlite3.connect(db_path)
-    conn.execute(
-        "UPDATE jobs SET optimized_resume = ?, ats_gap_report = ?
WHERE id = ?", - (text or None, gap_report or None, job_id), - ) - conn.commit() - conn.close() - - -def get_optimized_resume(db_path: Path = DEFAULT_DB, job_id: int = None) -> dict: - """Return optimized_resume and ats_gap_report for a job, or empty strings if absent.""" - if job_id is None: - return {"optimized_resume": "", "ats_gap_report": ""} - conn = sqlite3.connect(db_path) - conn.row_factory = sqlite3.Row - row = conn.execute( - "SELECT optimized_resume, ats_gap_report FROM jobs WHERE id = ?", (job_id,) - ).fetchone() - conn.close() - if not row: - return {"optimized_resume": "", "ats_gap_report": ""} - return { - "optimized_resume": row["optimized_resume"] or "", - "ats_gap_report": row["ats_gap_report"] or "", - } - - _UPDATABLE_JOB_COLS = { "title", "company", "url", "source", "location", "is_remote", "salary", "description", "match_score", "keyword_gaps", @@ -383,19 +365,6 @@ def mark_applied(db_path: Path = DEFAULT_DB, ids: list[int] = None) -> None: conn.close() -def cancel_task(db_path: Path = DEFAULT_DB, task_id: int = 0) -> bool: - """Cancel a single queued/running task by id. Returns True if a row was updated.""" - conn = sqlite3.connect(db_path) - count = conn.execute( - "UPDATE background_tasks SET status='failed', error='Cancelled by user'," - " finished_at=datetime('now') WHERE id=? AND status IN ('queued','running')", - (task_id,), - ).rowcount - conn.commit() - conn.close() - return count > 0 - - def kill_stuck_tasks(db_path: Path = DEFAULT_DB) -> int: """Mark all queued/running background tasks as failed. Returns count killed.""" conn = sqlite3.connect(db_path) diff --git a/scripts/db_migrate.py b/scripts/db_migrate.py deleted file mode 100644 index bbb407f..0000000 --- a/scripts/db_migrate.py +++ /dev/null @@ -1,73 +0,0 @@ -""" -db_migrate.py — Rails-style numbered SQL migration runner for Peregrine user DBs. - -Migration files live in migrations/ (sibling to this script's parent directory), -named NNN_description.sql (e.g. 
001_baseline.sql). They are applied in sorted
-order and tracked in the schema_migrations table so each runs exactly once.
-
-Usage:
-    from scripts.db_migrate import migrate_db
-    migrate_db(Path("/path/to/user.db"))
-"""
-
-import logging
-import sqlite3
-from pathlib import Path
-
-log = logging.getLogger(__name__)
-
-# Resolved at import time: peregrine repo root / migrations/
-_MIGRATIONS_DIR = Path(__file__).parent.parent / "migrations"
-
-_CREATE_MIGRATIONS_TABLE = """
-CREATE TABLE IF NOT EXISTS schema_migrations (
-    version TEXT PRIMARY KEY,
-    applied_at TEXT NOT NULL DEFAULT (datetime('now'))
-)
-"""
-
-
-def migrate_db(db_path: Path) -> list[str]:
-    """Apply any pending migrations to db_path. Returns list of applied versions."""
-    applied: list[str] = []
-
-    con = sqlite3.connect(db_path)
-    try:
-        con.execute(_CREATE_MIGRATIONS_TABLE)
-        con.commit()
-
-        if not _MIGRATIONS_DIR.is_dir():
-            log.warning("migrations/ directory not found at %s — skipping", _MIGRATIONS_DIR)
-            return applied
-
-        migration_files = sorted(_MIGRATIONS_DIR.glob("*.sql"))
-        if not migration_files:
-            return applied
-
-        already_applied = {
-            row[0] for row in con.execute("SELECT version FROM schema_migrations")
-        }
-
-        for path in migration_files:
-            version = path.stem  # e.g. "001_baseline"
-            if version in already_applied:
-                continue
-
-            sql = path.read_text(encoding="utf-8")
-            log.info("Applying migration %s to %s", version, db_path.name)
-            try:
-                con.executescript(sql)
-                con.execute(
-                    "INSERT INTO schema_migrations (version) VALUES (?)", (version,)
-                )
-                con.commit()
-                applied.append(version)
-                log.info("Migration %s applied successfully", version)
-            except Exception as exc:
-                con.rollback()
-                log.error("Migration %s failed: %s", version, exc)
-                raise RuntimeError(f"Migration {version} failed: {exc}") from exc
-    finally:
-        con.close()
-
-    return applied
diff --git a/scripts/discover.py b/scripts/discover.py
index bc0e3f0..77f8f9d 100644
--- a/scripts/discover.py
+++ b/scripts/discover.py
@@ -34,21 +34,17 @@ CUSTOM_SCRAPERS: dict[str, object] = {
 }
 
 
-def load_config(config_dir: Path | None = None) -> tuple[dict, dict]:
-    cfg = config_dir or CONFIG_DIR
-    profiles_path = cfg / "search_profiles.yaml"
-    notion_path = cfg / "notion.yaml"
-    profiles = yaml.safe_load(profiles_path.read_text())
-    notion_cfg = yaml.safe_load(notion_path.read_text()) if notion_path.exists() else {"field_map": {}, "token": None, "database_id": None}
+def load_config() -> tuple[dict, dict]:
+    profiles = yaml.safe_load(PROFILES_CFG.read_text())
+    notion_cfg = yaml.safe_load(NOTION_CFG.read_text())
     return profiles, notion_cfg
 
 
-def load_blocklist(config_dir: Path | None = None) -> dict:
+def load_blocklist() -> dict:
     """Load global blocklist config. Returns dict with companies, industries, locations lists."""
-    blocklist_path = (config_dir or CONFIG_DIR) / "blocklist.yaml"
-    if not blocklist_path.exists():
+    if not BLOCKLIST_CFG.exists():
         return {"companies": [], "industries": [], "locations": []}
-    raw = yaml.safe_load(blocklist_path.read_text()) or {}
+    raw = yaml.safe_load(BLOCKLIST_CFG.read_text()) or {}
     return {
         "companies": [c.lower() for c in raw.get("companies", []) if c],
         "industries": [i.lower() for i in raw.get("industries", []) if i],
@@ -121,15 +117,10 @@ def push_to_notion(notion: Client, db_id: str, job: dict, fm: dict) -> None:
     )
 
 
-def run_discovery(db_path: Path = DEFAULT_DB, notion_push: bool = False, config_dir: Path | None = None) -> None:
-    # In cloud mode, config_dir is the per-user config directory derived from db_path.
-    # Falls back to the app-level /app/config for single-tenant deployments.
-    resolved_cfg = config_dir or Path(db_path).parent / "config"
-    if not resolved_cfg.exists():
-        resolved_cfg = CONFIG_DIR
-    profiles_cfg, notion_cfg = load_config(resolved_cfg)
-    fm = notion_cfg.get("field_map") or {}
-    blocklist = load_blocklist(resolved_cfg)
+def run_discovery(db_path: Path = DEFAULT_DB, notion_push: bool = False) -> None:
+    profiles_cfg, notion_cfg = load_config()
+    fm = notion_cfg["field_map"]
+    blocklist = load_blocklist()
 
     _bl_summary = {k: len(v) for k, v in blocklist.items() if v}
     if _bl_summary:
@@ -220,7 +211,7 @@ def run_discovery(db_path: Path = DEFAULT_DB, notion_push: bool = False, config_
     try:
         jobspy_kwargs: dict = dict(
             site_name=boards,
-            search_term=" OR ".join(f'"{t}"' for t in (profile.get("titles") or profile.get("job_titles", []))),
+            search_term=" OR ".join(f'"{t}"' for t in profile["titles"]),
             location=location,
             results_wanted=results_per_board,
             hours_old=profile.get("hours_old", 72),
diff --git a/scripts/generate_cover_letter.py b/scripts/generate_cover_letter.py
index 3067bdb..e55c36e 100644
--- a/scripts/generate_cover_letter.py
+++ b/scripts/generate_cover_letter.py
@@ -26,14 +26,13 @@ LETTERS_DIR = _profile.docs_dir if _profile else Path.home() / "Documents" / "Jo
 LETTER_GLOB = "*Cover Letter*.md"
 
 # Background injected into every prompt so the model has the candidate's facts
-def _build_system_context(profile=None) -> str:
-    p = profile or _profile
-    if not p:
+def _build_system_context() -> str:
+    if not _profile:
         return "You are a professional cover letter writer. Write in first person."
-    parts = [f"You are writing cover letters for {p.name}. {p.career_summary}"]
-    if p.candidate_voice:
+    parts = [f"You are writing cover letters for {_profile.name}. {_profile.career_summary}"]
+    if _profile.candidate_voice:
         parts.append(
-            f"Voice and personality: {p.candidate_voice} "
+            f"Voice and personality: {_profile.candidate_voice} "
             "Write in a way that reflects these authentic traits — not as a checklist, "
             "but as a natural expression of who this person is."
         )
@@ -126,17 +125,15 @@ _MISSION_DEFAULTS: dict[str, str] = {
 }
 
 
-def _build_mission_notes(profile=None, candidate_name: str | None = None) -> dict[str, str]:
+def _build_mission_notes() -> dict[str, str]:
     """Merge user's custom mission notes with generic defaults."""
-    p = profile or _profile
-    name = candidate_name or _candidate
-    prefs = p.mission_preferences if p else {}
+    prefs = _profile.mission_preferences if _profile else {}
     notes = {}
     for industry, default_note in _MISSION_DEFAULTS.items():
         custom = (prefs.get(industry) or "").strip()
         if custom:
             notes[industry] = (
-                f"Mission alignment — {name} shared: \"{custom}\". "
+                f"Mission alignment — {_candidate} shared: \"{custom}\". "
                 "Para 3 should warmly and specifically reflect this authentic connection."
             )
         else:
@@ -147,15 +144,12 @@ def _build_mission_notes(profile=None, candidate_name: str | None = None) -> dic
 _MISSION_NOTES = _build_mission_notes()
 
 
-def detect_mission_alignment(
-    company: str, description: str, mission_notes: dict | None = None
-) -> str | None:
+def detect_mission_alignment(company: str, description: str) -> str | None:
     """Return a mission hint string if company/JD matches a preferred industry, else None."""
-    notes = mission_notes if mission_notes is not None else _MISSION_NOTES
     text = f"{company} {description}".lower()
     for industry, signals in _MISSION_SIGNALS.items():
         if any(sig in text for sig in signals):
-            return notes[industry]
+            return _MISSION_NOTES[industry]
     return None
 
 
@@ -196,14 +190,10 @@ def build_prompt(
     examples: list[dict],
     mission_hint: str | None = None,
     is_jobgether: bool = False,
-    system_context: str | None = None,
-    candidate_name: str | None = None,
 ) -> str:
-    ctx = system_context if system_context is not None else SYSTEM_CONTEXT
-    name = candidate_name or _candidate
-    parts = [ctx.strip(), ""]
+    parts = [SYSTEM_CONTEXT.strip(), ""]
     if examples:
-        parts.append(f"=== STYLE EXAMPLES ({name}'s past letters) ===\n")
+        parts.append(f"=== STYLE EXAMPLES ({_candidate}'s past letters) ===\n")
         for i, ex in enumerate(examples, 1):
             parts.append(f"--- Example {i} ({ex['company']}) ---")
             parts.append(ex["text"])
@@ -241,14 +231,13 @@ def build_prompt(
     return "\n".join(parts)
 
 
-def _trim_to_letter_end(text: str, profile=None) -> str:
+def _trim_to_letter_end(text: str) -> str:
     """Remove repetitive hallucinated content after the first complete sign-off.
 
     Fine-tuned models sometimes loop after completing the letter. This cuts
     at the first closing + candidate name so only the intended letter is saved.
     """
-    p = profile or _profile
-    candidate_first = (p.name.split()[0] if p else "").strip()
+    candidate_first = (_profile.name.split()[0] if _profile else "").strip()
     pattern = (
         r'(?:Warm regards|Sincerely|Best regards|Kind regards|Thank you)[,.]?\s*\n+\s*'
         + (re.escape(candidate_first) if candidate_first else r'\w+(?:\s+\w+)?')
@@ -268,8 +257,6 @@ def generate(
     feedback: str = "",
     is_jobgether: bool = False,
     _router=None,
-    config_path: "Path | None" = None,
-    user_yaml_path: "Path | None" = None,
 ) -> str:
     """Generate a cover letter and return it as a string.
 
@@ -277,29 +264,15 @@ def generate(
     and requested changes are appended to the prompt so the LLM revises
     rather than starting from scratch.
 
-    user_yaml_path overrides the module-level profile — required in cloud mode
-    so each user's name/voice/mission prefs are used instead of the global default.
-
     _router is an optional pre-built LLMRouter (used in tests to avoid real LLM calls).
     """
-    # Per-call profile override (cloud mode: each user has their own user.yaml)
-    if user_yaml_path and Path(user_yaml_path).exists():
-        _prof = UserProfile(Path(user_yaml_path))
-    else:
-        _prof = _profile
-
-    sys_ctx = _build_system_context(_prof)
-    mission_notes = _build_mission_notes(_prof, candidate_name=(_prof.name if _prof else None))
-    candidate_name = _prof.name if _prof else _candidate
-
     corpus = load_corpus()
     examples = find_similar_letters(description or f"{title} {company}", corpus)
-    mission_hint = detect_mission_alignment(company, description, mission_notes=mission_notes)
+    mission_hint = detect_mission_alignment(company, description)
     if mission_hint:
         print(f"[cover-letter] Mission alignment detected for {company}", file=sys.stderr)
 
     prompt = build_prompt(title, company, description, examples,
-                          mission_hint=mission_hint, is_jobgether=is_jobgether,
-                          system_context=sys_ctx, candidate_name=candidate_name)
+                          mission_hint=mission_hint, is_jobgether=is_jobgether)
     if previous_result:
         prompt += f"\n\n---\nPrevious draft:\n{previous_result}"
@@ -308,9 +281,8 @@ def generate(
 
     if _router is None:
         sys.path.insert(0, str(Path(__file__).parent.parent))
-        from scripts.llm_router import LLMRouter, CONFIG_PATH
-        resolved = config_path if (config_path and Path(config_path).exists()) else CONFIG_PATH
-        _router = LLMRouter(resolved)
+        from scripts.llm_router import LLMRouter
+        _router = LLMRouter()
 
     print(f"[cover-letter] Generating for: {title} @ {company}", file=sys.stderr)
     print(f"[cover-letter] Style examples: {[e['company'] for e in examples]}", file=sys.stderr)
@@ -320,7 +292,7 @@ def generate(
     # max_tokens=1200 caps generation at ~900 words — enough for any cover letter
     # and prevents fine-tuned models from looping into repetitive garbage output.
     result = _router.complete(prompt, max_tokens=1200)
-    return _trim_to_letter_end(result, _prof)
+    return _trim_to_letter_end(result)
 
 
 def main() -> None:
diff --git a/scripts/job_ranker.py b/scripts/job_ranker.py
deleted file mode 100644
index 470f054..0000000
--- a/scripts/job_ranker.py
+++ /dev/null
@@ -1,313 +0,0 @@
-"""Job ranking engine — two-stage discovery → review pipeline.
-
-Stage 1 (discover.py) scrapes a wide corpus and stores everything as 'pending'.
-Stage 2 (this module) scores the corpus; GET /api/jobs/stack returns top-N best
-matches for the user's current review session.
-
-All signal functions return a float in [0, 1]. The final stack_score is 0–100.
-
-Usage:
-    from scripts.job_ranker import rank_jobs
-    ranked = rank_jobs(jobs, search_titles, salary_min, salary_max, user_level)
-"""
-from __future__ import annotations
-
-import math
-import re
-from datetime import datetime, timezone
-
-
-# ── TUNING ─────────────────────────────────────────────────────────────────────
-# Adjust these constants to change how jobs are ranked.
-# All individual signal scores are normalised to [0, 1] before weighting.
-# Weights should sum to ≤ 1.0; the remainder is unallocated slack.
-
-W_RESUME_MATCH = 0.40   # TF-IDF cosine similarity stored as match_score (0–100 → 0–1)
-W_TITLE_MATCH = 0.30    # seniority-aware title + domain keyword overlap
-W_RECENCY = 0.15        # freshness — exponential decay from date_found
-W_SALARY_FIT = 0.10     # salary range overlap vs user target (neutral when unknown)
-W_DESC_QUALITY = 0.05   # posting completeness — penalises stub / ghost posts
-
-# Keyword gap penalty: each missing keyword from the resume match costs points.
-# Gaps are already partially captured by W_RESUME_MATCH (same TF-IDF source),
-# so this is a soft nudge, not a hard filter.
-GAP_PENALTY_PER_KEYWORD: float = 0.5   # points off per gap keyword (0–100 scale)
-GAP_MAX_PENALTY: float = 5.0           # hard cap so a gap-heavy job can still rank
-
-# Recency half-life: score halves every N days past date_found
-RECENCY_HALF_LIFE: int = 7   # days
-
-# Description word-count thresholds
-DESC_MIN_WORDS: int = 50      # below this → scaled penalty
-DESC_TARGET_WORDS: int = 200  # at or above → full quality score
-# ── END TUNING ─────────────────────────────────────────────────────────────────
-
-
-# ── Seniority level map ────────────────────────────────────────────────────────
-# (level, [keyword substrings that identify that level])
-# Matched on " " with a space-padded check to avoid false hits.
-# Level 3 is the default (mid-level, no seniority modifier in title).
-_SENIORITY_MAP: list[tuple[int, list[str]]] = [
-    (1, ["intern", "internship", "trainee", "apprentice", "co-op", "coop"]),
-    (2, ["entry level", "entry-level", "junior", "jr ", "jr.", "associate "]),
-    (3, ["mid level", "mid-level", "intermediate"]),
-    (4, ["senior ", "senior,", "sr ", "sr.", " lead ", "lead,", " ii ", " iii ",
-         "specialist", "experienced"]),
-    (5, ["staff ", "principal ", "architect ", "expert ", "distinguished"]),
-    (6, ["director", "head of ", "manager ", "vice president", " vp "]),
-    (7, ["chief", "cto", "cio", "cpo", "president", "founder"]),
-]
-
-# job_level − user_level → scoring multiplier
-# Positive delta = job is more senior (stretch up = encouraged)
-# Negative delta = job is below the user's level
-_LEVEL_MULTIPLIER: dict[int, float] = {
-    -4: 0.05, -3: 0.10, -2: 0.25, -1: 0.65,
-     0: 1.00,
-     1: 0.90, 2: 0.65, 3: 0.25, 4: 0.05,
-}
-_DEFAULT_LEVEL_MULTIPLIER = 0.05
-
-
-# ── Seniority helpers ──────────────────────────────────────────────────────────
-
-def infer_seniority(title: str) -> int:
-    """Return seniority level 1–7 for a job or resume title. Defaults to 3."""
-    padded = f" {title.lower()} "
-    # Iterate highest → lowest so "Senior Lead" resolves to 4, not 6
-    for level, keywords in reversed(_SENIORITY_MAP):
-        for kw in keywords:
-            if kw in padded:
-                return level
-    return 3
-
-
-def seniority_from_experience(titles: list[str]) -> int:
-    """Estimate user's current level from their most recent experience titles.
-
-    Averages the levels of the top-3 most recent titles (first in the list).
-    Falls back to 3 (mid-level) if no titles are provided.
-    """
-    if not titles:
-        return 3
-    sample = [t for t in titles if t.strip()][:3]
-    if not sample:
-        return 3
-    levels = [infer_seniority(t) for t in sample]
-    return round(sum(levels) / len(levels))
-
-
-def _strip_level_words(text: str) -> str:
-    """Remove seniority/modifier words so domain keywords stand out."""
-    strip = {
-        "senior", "sr", "junior", "jr", "lead", "staff", "principal",
-        "associate", "entry", "mid", "intermediate", "experienced",
-        "director", "head", "manager", "architect", "chief", "intern",
-        "ii", "iii", "iv", "i",
-    }
-    return " ".join(w for w in text.lower().split() if w not in strip)
-
-
-# ── Signal functions ───────────────────────────────────────────────────────────
-
-def title_match_score(job_title: str, search_titles: list[str], user_level: int) -> float:
-    """Seniority-aware title similarity in [0, 1].
-
-    Combines:
-      - Domain overlap: keyword intersection between job title and search titles
-        after stripping level modifiers (so "Senior Software Engineer" vs
-        "Software Engineer" compares only on "software engineer").
-      - Seniority multiplier: rewards same-level and +1 stretch; penalises
-        large downgrade or unreachable stretch.
-    """
-    if not search_titles:
-        return 0.5  # neutral — user hasn't set title prefs yet
-
-    job_level = infer_seniority(job_title)
-    level_delta = job_level - user_level
-    seniority_factor = _LEVEL_MULTIPLIER.get(level_delta, _DEFAULT_LEVEL_MULTIPLIER)
-
-    job_core_words = {w for w in _strip_level_words(job_title).split() if len(w) > 2}
-
-    best_domain = 0.0
-    for st in search_titles:
-        st_core_words = {w for w in _strip_level_words(st).split() if len(w) > 2}
-        if not st_core_words:
-            continue
-        # Recall-biased overlap: what fraction of the search title keywords
-        # appear in the job title? (A job posting may use synonyms but we
-        # at least want the core nouns to match.)
-        overlap = len(st_core_words & job_core_words) / len(st_core_words)
-        best_domain = max(best_domain, overlap)
-
-    # Base score from domain match scaled by seniority appropriateness.
-    # A small seniority_factor bonus (×0.15) ensures that even a near-miss
-    # domain match still benefits from seniority alignment.
-    return min(1.0, best_domain * seniority_factor + seniority_factor * 0.15)
-
-
-def recency_decay(date_found: str) -> float:
-    """Exponential decay starting from date_found.
-
-    Returns 1.0 for today, 0.5 after RECENCY_HALF_LIFE days, ~0.0 after ~4×.
-    Returns 0.5 (neutral) if the date is unparseable.
-    """
-    try:
-        # Support both "YYYY-MM-DD" and "YYYY-MM-DD HH:MM:SS"
-        found = datetime.fromisoformat(date_found.split("T")[0].split(" ")[0])
-        found = found.replace(tzinfo=timezone.utc)
-        now = datetime.now(tz=timezone.utc)
-        days_old = max(0.0, (now - found).total_seconds() / 86400)
-        return math.exp(-math.log(2) * days_old / RECENCY_HALF_LIFE)
-    except Exception:
-        return 0.5
-
-
-def _parse_salary_range(text: str | None) -> tuple[int | None, int | None]:
-    """Extract (low, high) salary integers from free-text. Returns (None, None) on failure.
-
-    Handles: "$80k - $120k", "USD 80,000 - 120,000 per year", "£45,000",
-    "80000", "80K/yr", "80-120k", etc.
-    """
-    if not text:
-        return None, None
-    normalized = re.sub(r"[$,£€₹¥\s]", "", text.lower())
-    # Match numbers optionally followed by 'k'
-    raw_nums = re.findall(r"(\d+(?:\.\d+)?)k?", normalized)
-    values = []
-    for n, full in zip(raw_nums, re.finditer(r"(\d+(?:\.\d+)?)(k?)", normalized)):
-        val = float(full.group(1))
-        if full.group(2):  # ends with 'k'
-            val *= 1000
-        elif val < 1000:  # bare numbers < 1000 are likely thousands (e.g., "80" in "80-120k")
-            val *= 1000
-        if val >= 10_000:  # sanity: ignore clearly wrong values
-            values.append(int(val))
-    values = sorted(set(values))
-    if not values:
-        return None, None
-    return values[0], values[-1]
-
-
-def salary_fit(
-    salary_text: str | None,
-    target_min: int | None,
-    target_max: int | None,
-) -> float:
-    """Salary range overlap score in [0, 1].
-
-    Returns 0.5 (neutral) when either range is unknown — a missing salary
-    line is not inherently negative.
-    """
-    if not salary_text or (target_min is None and target_max is None):
-        return 0.5
-
-    job_low, job_high = _parse_salary_range(salary_text)
-    if job_low is None:
-        return 0.5
-
-    t_min = target_min or 0
-    t_max = target_max or (int(target_min * 1.5) if target_min else job_high or job_low)
-    job_high = job_high or job_low
-
-    overlap_low = max(job_low, t_min)
-    overlap_high = min(job_high, t_max)
-    overlap = max(0, overlap_high - overlap_low)
-    target_span = max(1, t_max - t_min)
-    return min(1.0, overlap / target_span)
-
-
-def description_quality(description: str | None) -> float:
-    """Posting completeness score in [0, 1].
-
-    Stubs and ghost posts score near 0; well-written descriptions score 1.0.
-    """
-    if not description:
-        return 0.0
-    words = len(description.split())
-    if words < DESC_MIN_WORDS:
-        return (words / DESC_MIN_WORDS) * 0.4  # steep penalty for stubs
-    if words >= DESC_TARGET_WORDS:
-        return 1.0
-    return 0.4 + 0.6 * (words - DESC_MIN_WORDS) / (DESC_TARGET_WORDS - DESC_MIN_WORDS)
-
-
-# ── Composite scorer ───────────────────────────────────────────────────────────
-
-def score_job(
-    job: dict,
-    search_titles: list[str],
-    target_salary_min: int | None,
-    target_salary_max: int | None,
-    user_level: int,
-) -> float:
-    """Compute composite stack_score (0–100) for a single job dict.
-
-    Args:
-        job: Row dict from the jobs table (must have title, match_score,
-            date_found, salary, description, keyword_gaps).
-        search_titles: User's desired job titles (from search prefs).
-        target_salary_*: User's salary target from resume profile (or None).
-        user_level: Inferred seniority level 1–7.
-
-    Returns:
-        A float 0–100. Higher = better match for this user's session.
-    """
-    # ── Individual signals (all 0–1) ──────────────────────────────────────────
-    match_raw = job.get("match_score")
-    s_resume = (match_raw / 100.0) if match_raw is not None else 0.5
-
-    s_title = title_match_score(job.get("title", ""), search_titles, user_level)
-    s_recency = recency_decay(job.get("date_found", ""))
-    s_salary = salary_fit(job.get("salary"), target_salary_min, target_salary_max)
-    s_desc = description_quality(job.get("description"))
-
-    # ── Weighted sum ──────────────────────────────────────────────────────────
-    base = (
-        W_RESUME_MATCH * s_resume
-        + W_TITLE_MATCH * s_title
-        + W_RECENCY * s_recency
-        + W_SALARY_FIT * s_salary
-        + W_DESC_QUALITY * s_desc
-    )
-
-    # ── Keyword gap penalty (applied on the 0–100 scale) ─────────────────────
-    gaps_raw = job.get("keyword_gaps") or ""
-    gap_count = len([g for g in gaps_raw.split(",") if g.strip()]) if gaps_raw else 0
-    gap_penalty = min(GAP_MAX_PENALTY, gap_count * GAP_PENALTY_PER_KEYWORD) / 100.0
-
-    return round(max(0.0, base - gap_penalty) * 100, 1)
-
-
-# ── Public API ──────────────────────────────────────────────────────────────────
-
-def rank_jobs(
-    jobs: list[dict],
-    search_titles: list[str],
-    target_salary_min: int | None = None,
-    target_salary_max: int | None = None,
-    user_level: int = 3,
-    limit: int = 10,
-    min_score: float = 20.0,
-) -> list[dict]:
-    """Score and rank pending jobs; return top-N above min_score.
-
-    Args:
-        jobs: List of job dicts (from DB or any source).
-        search_titles: User's desired job titles from search prefs.
-        target_salary_*: User's salary target (from resume profile).
-        user_level: Seniority level 1–7 (use seniority_from_experience()).
-        limit: Stack size; pass 0 to return all qualifying jobs.
-        min_score: Minimum stack_score to include (0–100).
-
-    Returns:
-        Sorted list (best first) with 'stack_score' key added to each dict.
-    """
-    scored = []
-    for job in jobs:
-        s = score_job(job, search_titles, target_salary_min, target_salary_max, user_level)
-        if s >= min_score:
-            scored.append({**job, "stack_score": s})
-
-    scored.sort(key=lambda j: j["stack_score"], reverse=True)
-    return scored[:limit] if limit > 0 else scored
diff --git a/scripts/llm_router.py b/scripts/llm_router.py
index b88bed5..5b8a469 100644
--- a/scripts/llm_router.py
+++ b/scripts/llm_router.py
@@ -1,46 +1,169 @@
 """
 LLM abstraction layer with priority fallback chain.
 
-Config lookup order:
-    1. /config/llm.yaml — per-install local config
-    2. ~/.config/circuitforge/llm.yaml — user-level config (circuitforge-core default)
-    3. env-var auto-config (ANTHROPIC_API_KEY, OPENAI_API_KEY, OLLAMA_HOST, …)
+Reads config/llm.yaml. Tries backends in order; falls back on any error.
 """
+import os
+import yaml
+import requests
 from pathlib import Path
+from openai import OpenAI
 
-from circuitforge_core.llm import LLMRouter as _CoreLLMRouter
-
-# Kept for backwards-compatibility — external callers that import CONFIG_PATH
-# from this module continue to work.
 CONFIG_PATH = Path(__file__).parent.parent / "config" / "llm.yaml"
 
 
-class LLMRouter(_CoreLLMRouter):
-    """Peregrine-specific LLMRouter — tri-level config path priority.
+class LLMRouter:
+    def __init__(self, config_path: Path = CONFIG_PATH):
+        with open(config_path) as f:
+            self.config = yaml.safe_load(f)
 
-    When ``config_path`` is supplied (e.g. in tests) it is passed straight
-    through to the core. When omitted, the lookup order is:
-      1. /config/llm.yaml (per-install local config)
-      2. ~/.config/circuitforge/llm.yaml (user-level, circuitforge-core default)
-      3. env-var auto-config (ANTHROPIC_API_KEY, OPENAI_API_KEY, OLLAMA_HOST …)
-    """
+    def _is_reachable(self, base_url: str) -> bool:
+        """Quick health-check ping. Returns True if backend is up."""
+        health_url = base_url.rstrip("/").removesuffix("/v1") + "/health"
+        try:
+            resp = requests.get(health_url, timeout=2)
+            return resp.status_code < 500
+        except Exception:
+            return False
 
-    def __init__(self, config_path: Path | None = None) -> None:
-        if config_path is not None:
-            # Explicit path supplied — use it directly (e.g. tests, CLI override).
-            super().__init__(config_path)
-            return
+    def _resolve_model(self, client: OpenAI, model: str) -> str:
+        """Resolve __auto__ to the first model served by vLLM."""
+        if model != "__auto__":
+            return model
+        models = client.models.list()
+        return models.data[0].id
 
-        local = Path(__file__).parent.parent / "config" / "llm.yaml"
-        user_level = Path.home() / ".config" / "circuitforge" / "llm.yaml"
-        if local.exists():
-            super().__init__(local)
-        elif user_level.exists():
-            super().__init__(user_level)
-        else:
-            # No yaml found — let circuitforge-core's env-var auto-config run.
-            # The core default CONFIG_PATH (~/.config/circuitforge/llm.yaml)
-            # won't exist either, so _auto_config_from_env() will be triggered.
-            super().__init__()
+    def complete(self, prompt: str, system: str | None = None,
+                 model_override: str | None = None,
+                 fallback_order: list[str] | None = None,
+                 images: list[str] | None = None,
+                 max_tokens: int | None = None) -> str:
+        """
+        Generate a completion. Tries each backend in fallback_order.
+
+        model_override: when set, replaces the configured model for
+          openai_compat backends (e.g. pass a research-specific ollama model).
+        fallback_order: when set, overrides config fallback_order for this
+          call (e.g. pass config["research_fallback_order"] for research tasks).
+        images: optional list of base64-encoded PNG/JPG strings. When provided,
+          backends without supports_images=true are skipped. vision_service backends
+          are only tried when images is provided.
+
+        Raises RuntimeError if all backends are exhausted.
+        """
+        if os.environ.get("DEMO_MODE", "").lower() in ("1", "true", "yes"):
+            raise RuntimeError(
+                "AI inference is disabled in the public demo. "
+                "Run your own instance to use AI features."
+            )
+        order = fallback_order if fallback_order is not None else self.config["fallback_order"]
+        for name in order:
+            backend = self.config["backends"][name]
+
+            if not backend.get("enabled", True):
+                print(f"[LLMRouter] {name}: disabled, skipping")
+                continue
+
+            supports_images = backend.get("supports_images", False)
+            is_vision_service = backend["type"] == "vision_service"
+
+            # vision_service only used when images provided
+            if is_vision_service and not images:
+                print(f"[LLMRouter] {name}: vision_service skipped (no images)")
+                continue
+
+            # non-vision backends skipped when images provided and they don't support it
+            if images and not supports_images and not is_vision_service:
+                print(f"[LLMRouter] {name}: no image support, skipping")
+                continue
+
+            if is_vision_service:
+                if not self._is_reachable(backend["base_url"]):
+                    print(f"[LLMRouter] {name}: unreachable, skipping")
+                    continue
+                try:
+                    resp = requests.post(
+                        backend["base_url"].rstrip("/") + "/analyze",
+                        json={
+                            "prompt": prompt,
+                            "image_base64": images[0] if images else "",
+                        },
+                        timeout=60,
+                    )
+                    resp.raise_for_status()
+                    print(f"[LLMRouter] Used backend: {name} (vision_service)")
+                    return resp.json()["text"]
+                except Exception as e:
+                    print(f"[LLMRouter] {name}: error — {e}, trying next")
+                    continue
+
+            elif backend["type"] == "openai_compat":
+                if not self._is_reachable(backend["base_url"]):
+                    print(f"[LLMRouter] {name}: unreachable, skipping")
+                    continue
+                try:
+                    client = OpenAI(
+                        base_url=backend["base_url"],
+                        api_key=backend.get("api_key") or "any",
+                    )
+                    raw_model = model_override or backend["model"]
+                    model = self._resolve_model(client, raw_model)
+                    messages = []
+                    if system:
+                        messages.append({"role": "system", "content": system})
+                    if images and supports_images:
+                        content = [{"type": "text", "text": prompt}]
+                        for img in images:
+                            content.append({
+                                "type": "image_url",
+                                "image_url": {"url": f"data:image/png;base64,{img}"},
+                            })
+                        messages.append({"role": "user", "content": content})
+                    else:
+                        messages.append({"role": "user", "content": prompt})
+
+                    create_kwargs: dict = {"model": model, "messages": messages}
+                    if max_tokens is not None:
+                        create_kwargs["max_tokens"] = max_tokens
+                    resp = client.chat.completions.create(**create_kwargs)
+                    print(f"[LLMRouter] Used backend: {name} ({model})")
+                    return resp.choices[0].message.content
+
+                except Exception as e:
+                    print(f"[LLMRouter] {name}: error — {e}, trying next")
+                    continue
+
+            elif backend["type"] == "anthropic":
+                api_key = os.environ.get(backend["api_key_env"], "")
+                if not api_key:
+                    print(f"[LLMRouter] {name}: {backend['api_key_env']} not set, skipping")
+                    continue
+                try:
+                    import anthropic as _anthropic
+                    client = _anthropic.Anthropic(api_key=api_key)
+                    if images and supports_images:
+                        content = []
+                        for img in images:
+                            content.append({
+                                "type": "image",
+                                "source": {"type": "base64", "media_type": "image/png", "data": img},
+                            })
+                        content.append({"type": "text", "text": prompt})
+                    else:
+                        content = prompt
+                    kwargs: dict = {
+                        "model": backend["model"],
+                        "max_tokens": 4096,
+                        "messages": [{"role": "user", "content": content}],
+                    }
+                    if system:
+                        kwargs["system"] = system
+                    msg = client.messages.create(**kwargs)
+                    print(f"[LLMRouter] Used backend: {name}")
+                    return msg.content[0].text
+                except Exception as e:
+                    print(f"[LLMRouter] {name}: error — {e}, trying next")
+                    continue
+
+        raise RuntimeError("All LLM backends exhausted")
 
 
 # Module-level singleton for convenience
diff --git a/scripts/preflight.py b/scripts/preflight.py
index 34d7907..b840dda 100644
--- a/scripts/preflight.py
+++ b/scripts/preflight.py
@@ -47,7 +47,7 @@ OVERRIDE_YML = ROOT / "compose.override.yml"
 _SERVICES: dict[str, tuple[str, int, str, bool, bool]] = {
     "streamlit": ("streamlit_port", 8501, "STREAMLIT_PORT", True, False),
     "searxng": ("searxng_port", 8888, "SEARXNG_PORT", True, True),
-    # vllm removed — now managed by cf-orch (host process), not a Docker service
+    "vllm": ("vllm_port", 8000, "VLLM_PORT", True, True),
     "vision": ("vision_port", 8002, "VISION_PORT", True, True),
     "ollama": ("ollama_port", 11434, "OLLAMA_PORT", True, True),
     "ollama_research": ("ollama_research_port", 11435, "OLLAMA_RESEARCH_PORT", True, True),
@@ -65,6 +65,7 @@ _LLM_BACKENDS: dict[str, list[tuple[str, str]]] = {
 _DOCKER_INTERNAL: dict[str, tuple[str, int]] = {
     "ollama": ("ollama", 11434),
     "ollama_research": ("ollama_research", 11434),  # container-internal port is always 11434
+    "vllm": ("vllm", 8000),
     "vision": ("vision", 8002),
     "searxng": ("searxng", 8080),  # searxng internal port differs from host port
 }
@@ -492,12 +493,6 @@ def main() -> None:
     # binds a harmless free port instead of conflicting with the external service.
     env_updates: dict[str, str] = {i["env_var"]: str(i["stub_port"]) for i in ports.values()}
     env_updates["RECOMMENDED_PROFILE"] = profile
-    # When Ollama is adopted from the host process, write OLLAMA_HOST so
-    # LLMRouter's env-var auto-config finds it without needing config/llm.yaml.
-    ollama_info = ports.get("ollama")
-    if ollama_info and ollama_info.get("external"):
-        env_updates["OLLAMA_HOST"] = f"http://host.docker.internal:{ollama_info['resolved']}"
-
     if offload_gb > 0:
         env_updates["CPU_OFFLOAD_GB"] = str(offload_gb)
     # GPU info for the app container (which lacks nvidia-smi access)
diff --git a/scripts/resume_optimizer.py b/scripts/resume_optimizer.py
deleted file mode 100644
index 1d3b7b3..0000000
--- a/scripts/resume_optimizer.py
+++ /dev/null
@@ -1,439 +0,0 @@
-"""
-ATS Resume Optimizer — rewrite a candidate's resume to maximize keyword match
-for a specific job description without fabricating experience.
-
-Tier behaviour:
-    Free    → gap report only (extract_jd_signals + prioritize_gaps, no LLM rewrite)
-    Paid    → full LLM rewrite targeting the JD (rewrite_for_ats)
-    Premium → same as paid for now; fine-tuned voice model is a future enhancement
-
-Pipeline:
-    job.description
-      → extract_jd_signals()    # TF-IDF gaps + LLM-extracted ATS signals
-      → prioritize_gaps()       # rank by impact, map to resume sections
-      → rewrite_for_ats()       # per-section LLM rewrite (paid+)
-      → hallucination_check()   # reject rewrites that invent new experience
-"""
-from __future__ import annotations
-
-import json
-import logging
-import re
-from pathlib import Path
-from typing import Any
-
-log = logging.getLogger(__name__)
-
-# ── Signal extraction ─────────────────────────────────────────────────────────
-
-def extract_jd_signals(description: str, resume_text: str = "") -> list[str]:
-    """Return ATS keyword signals from a job description.
-
-    Combines two sources:
-      1. TF-IDF keyword gaps from match.py (fast, deterministic, no LLM cost)
-      2. LLM extraction for phrasing nuance TF-IDF misses (e.g. "cross-functional"
-         vs "cross-team", "led" vs "managed")
-
-    Falls back to TF-IDF-only if LLM is unavailable.
-
-    Args:
-        description: Raw job description text.
-        resume_text: Candidate's resume text (used to compute gap vs. already present).
-
-    Returns:
-        Deduplicated list of ATS keyword signals, most impactful first.
-    """
-    # Phase 1: deterministic TF-IDF gaps (always available)
-    tfidf_gaps: list[str] = []
-    if resume_text:
-        try:
-            from scripts.match import match_score
-            _, tfidf_gaps = match_score(resume_text, description)
-        except Exception:
-            log.warning("[resume_optimizer] TF-IDF gap extraction failed", exc_info=True)
-
-    # Phase 2: LLM extraction for phrasing/qualifier nuance
-    llm_signals: list[str] = []
-    try:
-        from scripts.llm_router import LLMRouter
-        prompt = (
-            "Extract the most important ATS (applicant tracking system) keywords and "
-            "phrases from this job description. Focus on:\n"
-            "- Required skills and technologies (exact phrasing matters)\n"
-            "- Action verbs used to describe responsibilities\n"
-            "- Qualification signals ('required', 'must have', 'preferred')\n"
-            "- Industry-specific terminology\n\n"
-            "Return a JSON array of strings only. No explanation.\n\n"
-            f"Job description:\n{description[:3000]}"
-        )
-        raw = LLMRouter().complete(prompt)
-        # Extract JSON array from response (LLM may wrap it in markdown)
-        match = re.search(r"\[.*\]", raw, re.DOTALL)
-        if match:
-            llm_signals = json.loads(match.group(0))
-            llm_signals = [s.strip() for s in llm_signals if isinstance(s, str) and s.strip()]
-    except Exception:
-        log.warning("[resume_optimizer] LLM signal extraction failed", exc_info=True)
-
-    # Merge: LLM signals first (richer phrasing), TF-IDF fills gaps
-    seen: set[str] = set()
-    merged: list[str] = []
-    for term in llm_signals + tfidf_gaps:
-        key = term.lower()
-        if key not in seen:
-            seen.add(key)
-            merged.append(term)
-
-    return merged
-
-
-# ── Gap prioritization ────────────────────────────────────────────────────────
-
-# Map each gap term to the resume section where it would have the most ATS impact.
-# ATS systems weight keywords higher in certain sections: -# skills — direct keyword match, highest density, indexed first -# summary — executive summary keywords often boost overall relevance score -# experience — verbs + outcomes in bullet points; adds context weight -_SECTION_KEYWORDS: dict[str, list[str]] = { - "skills": [ - "python", "sql", "java", "typescript", "react", "vue", "docker", - "kubernetes", "aws", "gcp", "azure", "terraform", "ci/cd", "git", - "postgresql", "redis", "kafka", "spark", "tableau", "salesforce", - "jira", "figma", "excel", "powerpoint", "machine learning", "llm", - "deep learning", "pytorch", "tensorflow", "scikit-learn", - ], - "summary": [ - "leadership", "strategy", "vision", "executive", "director", "vp", - "growth", "transformation", "stakeholder", "cross-functional", - "p&l", "revenue", "budget", "board", "c-suite", - ], -} - - -def prioritize_gaps(gaps: list[str], resume_sections: dict[str, Any]) -> list[dict]: - """Rank keyword gaps by ATS impact and map each to a target resume section. - - Args: - gaps: List of missing keyword signals from extract_jd_signals(). - resume_sections: Structured resume dict from resume_parser.parse_resume(). - - Returns: - List of dicts, sorted by priority score descending: - { - "term": str, # the keyword/phrase to inject - "section": str, # target resume section ("skills", "summary", "experience") - "priority": int, # 1=high, 2=medium, 3=low - "rationale": str, # why this section was chosen - } - - TODO: implement the ranking logic below. - The current stub assigns every gap to "experience" at medium priority. 
-
-    A good implementation should:
-      - Score "skills" section terms highest (direct keyword density)
-      - Score "summary" terms next (executive/leadership signals)
-      - Route remaining gaps to "experience" bullets
-      - Deprioritize terms already present in any section (case-insensitive)
-      - Consider gap term length: multi-word phrases > single words (more specific = higher ATS weight)
-    """
-    existing_text = _flatten_resume_text(resume_sections).lower()
-
-    prioritized: list[dict] = []
-    for term in gaps:
-        # Skip terms already present anywhere in the resume
-        if term.lower() in existing_text:
-            continue
-
-        # REVIEW: _SECTION_KEYWORDS lists are tech-centric; domain-specific roles
-        # (creative, healthcare, operations) may over-route to experience.
-        # Consider expanding the lists or making them config-driven.
-        term_lower = term.lower()
-
-        # Partial-match: term contains a skills keyword (handles "PostgreSQL" vs "postgresql",
-        # "AWS Lambda" vs "aws", etc.)
-        skills_match = any(kw in term_lower or term_lower in kw
-                           for kw in _SECTION_KEYWORDS["skills"])
-        summary_match = any(kw in term_lower or term_lower in kw
-                            for kw in _SECTION_KEYWORDS["summary"])
-
-        if skills_match:
-            section = "skills"
-            priority = 1
-            rationale = "matched technical skills list — highest ATS keyword density"
-        elif summary_match:
-            section = "summary"
-            priority = 1
-            rationale = "matched leadership/executive signals — boosts overall relevance score"
-        elif len(term.split()) > 1:
-            section = "experience"
-            priority = 2
-            rationale = "multi-word phrase — more specific than single keywords, context weight in bullets"
-        else:
-            section = "experience"
-            priority = 3
-            rationale = "single generic term — lowest ATS impact, added to experience for coverage"
-
-        prioritized.append({
-            "term": term,
-            "section": section,
-            "priority": priority,
-            "rationale": rationale,
-        })
-
-    prioritized.sort(key=lambda x: x["priority"])
-    return prioritized
-
-
-def _flatten_resume_text(resume: dict[str, Any]) -> str:
-    """Concatenate all text from a structured resume dict into one searchable string."""
-    parts: list[str] = []
-    parts.append(resume.get("career_summary", "") or "")
-    parts.extend(resume.get("skills", []))
-    for exp in resume.get("experience", []):
-        parts.append(exp.get("title", ""))
-        parts.append(exp.get("company", ""))
-        parts.extend(exp.get("bullets", []))
-    for edu in resume.get("education", []):
-        parts.append(edu.get("degree", ""))
-        parts.append(edu.get("field", ""))
-        parts.append(edu.get("institution", ""))
-    parts.extend(resume.get("achievements", []))
-    return " ".join(parts)
-
-
-# ── LLM rewrite ───────────────────────────────────────────────────────────────
-
-def rewrite_for_ats(
-    resume: dict[str, Any],
-    prioritized_gaps: list[dict],
-    job: dict[str, Any],
-    candidate_voice: str = "",
-) -> dict[str, Any]:
-    """Rewrite resume sections to naturally incorporate ATS keyword gaps.
-
-    Operates section-by-section. For each target section in prioritized_gaps,
-    builds a focused prompt that injects only the gaps destined for that section.
-    The hallucination constraint is enforced in the prompt itself and verified
-    post-hoc by hallucination_check().
-
-    Args:
-        resume: Structured resume dict (from resume_parser.parse_resume).
-        prioritized_gaps: Output of prioritize_gaps().
-        job: Job dict with at minimum {"title": str, "company": str, "description": str}.
-        candidate_voice: Free-text personality/style note from user.yaml (may be empty).
-
-    Returns:
-        New resume dict (same structure as input) with rewritten sections.
-        Sections with no relevant gaps are copied through unchanged.
- """ - from scripts.llm_router import LLMRouter - router = LLMRouter() - - # Group gaps by target section - by_section: dict[str, list[str]] = {} - for gap in prioritized_gaps: - by_section.setdefault(gap["section"], []).append(gap["term"]) - - rewritten = dict(resume) # shallow copy — sections replaced below - - for section, terms in by_section.items(): - terms_str = ", ".join(f'"{t}"' for t in terms) - original_content = _section_text_for_prompt(resume, section) - - voice_note = ( - f'\n\nCandidate voice/style: "{candidate_voice}". ' - "Preserve this authentic tone — do not write generically." - ) if candidate_voice else "" - - prompt = ( - f"You are rewriting the **{section}** section of a resume to help it pass " - f"ATS (applicant tracking system) screening for this role:\n" - f" Job title: {job.get('title', 'Unknown')}\n" - f" Company: {job.get('company', 'Unknown')}\n\n" - f"Inject these missing ATS keywords naturally into the section:\n" - f" {terms_str}\n\n" - f"CRITICAL RULES — violating any of these invalidates the rewrite:\n" - f"1. Do NOT invent new employers, job titles, dates, or education.\n" - f"2. Do NOT add skills the candidate did not already demonstrate.\n" - f"3. Only rephrase existing content — replace vague verbs/nouns with the " - f" ATS-preferred equivalents listed above.\n" - f"4. Keep the same number of bullet points in experience entries.\n" - f"5. Return ONLY the rewritten section content, no labels or explanation." 
- f"{voice_note}\n\n" - f"Original {section} section:\n{original_content}" - ) - - try: - result = router.complete(prompt) - rewritten = _apply_section_rewrite(rewritten, section, result.strip()) - except Exception: - log.warning("[resume_optimizer] rewrite failed for section %r", section, exc_info=True) - # Leave section unchanged on failure - - return rewritten - - -def _section_text_for_prompt(resume: dict[str, Any], section: str) -> str: - """Render a resume section as plain text suitable for an LLM prompt.""" - if section == "summary": - return resume.get("career_summary", "") or "(empty)" - if section == "skills": - skills = resume.get("skills", []) - return ", ".join(skills) if skills else "(empty)" - if section == "experience": - lines: list[str] = [] - for exp in resume.get("experience", []): - lines.append(f"{exp['title']} at {exp['company']} ({exp['start_date']}–{exp['end_date']})") - for b in exp.get("bullets", []): - lines.append(f" • {b}") - return "\n".join(lines) if lines else "(empty)" - return "(unsupported section)" - - -def _apply_section_rewrite(resume: dict[str, Any], section: str, rewritten: str) -> dict[str, Any]: - """Return a new resume dict with the given section replaced by rewritten text.""" - updated = dict(resume) - if section == "summary": - updated["career_summary"] = rewritten - elif section == "skills": - # LLM returns comma-separated or newline-separated skills - skills = [s.strip() for s in re.split(r"[,\n•·]+", rewritten) if s.strip()] - updated["skills"] = skills - elif section == "experience": - # For experience, we keep the structured entries but replace the bullets. - # The LLM rewrites the whole section as plain text; we re-parse the bullets. 
- updated["experience"] = _reparse_experience_bullets(resume["experience"], rewritten) - return updated - - -def _reparse_experience_bullets( - original_entries: list[dict], - rewritten_text: str, -) -> list[dict]: - """Re-associate rewritten bullet text with the original experience entries. - - The LLM rewrites the section as a block of text. We split on the original - entry headers (title + company) to re-bind bullets to entries. Falls back - to the original entries if splitting fails. - """ - if not original_entries: - return original_entries - - result: list[dict] = [] - remaining = rewritten_text - - for i, entry in enumerate(original_entries): - # Find where the next entry starts so we can slice out this entry's bullets - if i + 1 < len(original_entries): - next_title = original_entries[i + 1]["title"] - # Look for the next entry header in the remaining text - split_pat = re.escape(next_title) - m = re.search(split_pat, remaining, re.IGNORECASE) - chunk = remaining[:m.start()] if m else remaining - remaining = remaining[m.start():] if m else "" - else: - chunk = remaining - - bullets = [ - re.sub(r"^[•\-–—*◦▪▸►]\s*", "", line).strip() - for line in chunk.splitlines() - if re.match(r"^[•\-–—*◦▪▸►]\s*", line.strip()) - ] - new_entry = dict(entry) - new_entry["bullets"] = bullets if bullets else entry["bullets"] - result.append(new_entry) - - return result - - -# ── Hallucination guard ─────────────────────────────────────────────────────── - -def hallucination_check(original: dict[str, Any], rewritten: dict[str, Any]) -> bool: - """Return True if the rewrite is safe (no fabricated facts detected). - - Checks that the set of employers, job titles, and date ranges in the - rewritten resume is a subset of those in the original. Any new entry - signals hallucination. - - Args: - original: Structured resume dict before rewrite. - rewritten: Structured resume dict after rewrite. 
- - Returns: - True → rewrite is safe to use - False → hallucination detected; caller should fall back to original - """ - orig_anchors = _extract_anchors(original) - rewrite_anchors = _extract_anchors(rewritten) - - new_anchors = rewrite_anchors - orig_anchors - if new_anchors: - log.warning( - "[resume_optimizer] hallucination_check FAILED — new anchors in rewrite: %s", - new_anchors, - ) - return False - return True - - -def _extract_anchors(resume: dict[str, Any]) -> frozenset[str]: - """Extract stable factual anchors (company, title, dates) from experience entries.""" - anchors: set[str] = set() - for exp in resume.get("experience", []): - for field in ("company", "title", "start_date", "end_date"): - val = (exp.get(field) or "").strip().lower() - if val: - anchors.add(val) - for edu in resume.get("education", []): - val = (edu.get("institution") or "").strip().lower() - if val: - anchors.add(val) - return frozenset(anchors) - - -# ── Resume → plain text renderer ───────────────────────────────────────────── - -def render_resume_text(resume: dict[str, Any]) -> str: - """Render a structured resume dict back to formatted plain text for PDF export.""" - lines: list[str] = [] - - contact_parts = [resume.get("name", ""), resume.get("email", ""), resume.get("phone", "")] - lines.append(" ".join(p for p in contact_parts if p)) - lines.append("") - - if resume.get("career_summary"): - lines.append("SUMMARY") - lines.append(resume["career_summary"]) - lines.append("") - - if resume.get("experience"): - lines.append("EXPERIENCE") - for exp in resume["experience"]: - lines.append( - f"{exp.get('title', '')} | {exp.get('company', '')} " - f"({exp.get('start_date', '')}–{exp.get('end_date', '')})" - ) - for b in exp.get("bullets", []): - lines.append(f" • {b}") - lines.append("") - - if resume.get("education"): - lines.append("EDUCATION") - for edu in resume["education"]: - lines.append( - f"{edu.get('degree', '')} {edu.get('field', '')} | " - f"{edu.get('institution', 
'')} {edu.get('graduation_year', '')}" - ) - lines.append("") - - if resume.get("skills"): - lines.append("SKILLS") - lines.append(", ".join(resume["skills"])) - lines.append("") - - if resume.get("achievements"): - lines.append("ACHIEVEMENTS") - for a in resume["achievements"]: - lines.append(f" • {a}") - lines.append("") - - return "\n".join(lines) diff --git a/scripts/task_runner.py b/scripts/task_runner.py index b728dcc..6bfdd4c 100644 --- a/scripts/task_runner.py +++ b/scripts/task_runner.py @@ -9,13 +9,10 @@ and marks the task completed or failed. Deduplication: only one queued/running task per (task_type, job_id) is allowed. Different task types for the same job run concurrently (e.g. cover letter + research). """ -import logging import sqlite3 import threading from pathlib import Path -log = logging.getLogger(__name__) - from scripts.db import ( DEFAULT_DB, insert_task, @@ -23,7 +20,6 @@ from scripts.db import ( update_task_stage, update_cover_letter, save_research, - save_optimized_resume, ) @@ -43,13 +39,9 @@ def submit_task(db_path: Path = DEFAULT_DB, task_type: str = "", if is_new: from scripts.task_scheduler import get_scheduler, LLM_TASK_TYPES if task_type in LLM_TASK_TYPES: - enqueued = get_scheduler(db_path, run_task_fn=_run_task).enqueue( + get_scheduler(db_path, run_task_fn=_run_task).enqueue( task_id, task_type, job_id or 0, params ) - if not enqueued: - update_task_status( - db_path, task_id, "failed", error="Queue depth limit reached" - ) else: t = threading.Thread( target=_run_task, @@ -166,8 +158,7 @@ def _run_task(db_path: Path, task_id: int, task_type: str, job_id: int, ) return from scripts.discover import run_discovery - from pathlib import Path as _Path - new_count = run_discovery(db_path, config_dir=_Path(db_path).parent / "config") + new_count = run_discovery(db_path) n = new_count or 0 update_task_status( db_path, task_id, "completed", @@ -179,9 +170,6 @@ def _run_task(db_path: Path, task_id: int, task_type: str, job_id: int, import 
json as _json p = _json.loads(params or "{}") from scripts.generate_cover_letter import generate - _cfg_dir = Path(db_path).parent / "config" - _user_llm_cfg = _cfg_dir / "llm.yaml" - _user_yaml = _cfg_dir / "user.yaml" result = generate( job.get("title", ""), job.get("company", ""), @@ -189,8 +177,6 @@ def _run_task(db_path: Path, task_id: int, task_type: str, job_id: int, previous_result=p.get("previous_result", ""), feedback=p.get("feedback", ""), is_jobgether=job.get("source") == "jobgether", - config_path=_user_llm_cfg, - user_yaml_path=_user_yaml, ) update_cover_letter(db_path, job_id, result) @@ -275,48 +261,6 @@ def _run_task(db_path: Path, task_id: int, task_type: str, job_id: int, ) return - elif task_type == "resume_optimize": - import json as _json - from scripts.resume_parser import structure_resume - from scripts.resume_optimizer import ( - extract_jd_signals, - prioritize_gaps, - rewrite_for_ats, - hallucination_check, - render_resume_text, - ) - from scripts.user_profile import load_user_profile - - description = job.get("description", "") - resume_path = load_user_profile().get("resume_path", "") - - # Parse the candidate's resume - update_task_stage(db_path, task_id, "parsing resume") - resume_text = Path(resume_path).read_text(errors="replace") if resume_path else "" - resume_struct, parse_err = structure_resume(resume_text) - - # Extract keyword gaps and build gap report (free tier) - update_task_stage(db_path, task_id, "extracting keyword gaps") - gaps = extract_jd_signals(description, resume_text) - prioritized = prioritize_gaps(gaps, resume_struct) - gap_report = _json.dumps(prioritized, indent=2) - - # Full rewrite (paid tier only) - rewritten_text = "" - p = _json.loads(params or "{}") - if p.get("full_rewrite", False): - update_task_stage(db_path, task_id, "rewriting resume sections") - candidate_voice = load_user_profile().get("candidate_voice", "") - rewritten = rewrite_for_ats(resume_struct, prioritized, job, candidate_voice) - if 
hallucination_check(resume_struct, rewritten): - rewritten_text = render_resume_text(rewritten) - else: - log.warning("[task_runner] resume_optimize hallucination check failed for job %d", job_id) - - save_optimized_resume(db_path, job_id=job_id, - text=rewritten_text, - gap_report=gap_report) - elif task_type == "prepare_training": from scripts.prepare_training_data import build_records, write_jsonl, DEFAULT_OUTPUT records = build_records() diff --git a/scripts/task_scheduler.py b/scripts/task_scheduler.py index ea12236..baca6a8 100644 --- a/scripts/task_scheduler.py +++ b/scripts/task_scheduler.py @@ -1,167 +1,232 @@ # scripts/task_scheduler.py -"""Peregrine LLM task scheduler — thin shim over circuitforge_core.tasks.scheduler. +"""Resource-aware batch scheduler for LLM background tasks. -All scheduling logic lives in circuitforge_core. This module defines -Peregrine-specific task types, VRAM budgets, and config loading. +Routes LLM task types through per-type deques with VRAM-aware scheduling. +Non-LLM tasks bypass this module — routing lives in scripts/task_runner.py. 
-Public API (unchanged — callers do not need to change):
-    LLM_TASK_TYPES — frozenset of task type strings routed through the scheduler
-    DEFAULT_VRAM_BUDGETS — dict of conservative peak VRAM estimates per task type
-    TaskSpec — lightweight task descriptor (re-exported from core)
-    TaskScheduler — backward-compatible wrapper around the core scheduler class
-    get_scheduler() — returns the process-level TaskScheduler singleton
-    reset_scheduler() — test teardown only
+Public API:
+    LLM_TASK_TYPES — frozenset of task type strings routed through the scheduler
+    get_scheduler() — lazy singleton accessor
+    reset_scheduler() — test teardown only
 """
-from __future__ import annotations
-
 import logging
-import os
+import sqlite3
 import threading
+from collections import deque, namedtuple
 from pathlib import Path
 from typing import Callable, Optional
 
-from circuitforge_core.tasks.scheduler import (
-    TaskSpec,  # re-export unchanged
-    LocalScheduler as _CoreTaskScheduler,
-)
+# Module-level import so tests can monkeypatch scripts.task_scheduler._get_gpus
+try:
+    from scripts.preflight import get_gpus as _get_gpus
+except Exception:  # graceful degradation if preflight unavailable
+    _get_gpus = lambda: []
 
 logger = logging.getLogger(__name__)
 
-# ── Peregrine task types and VRAM budgets ─────────────────────────────────────
-
+# Task types that go through the scheduler (all others spawn free threads)
 LLM_TASK_TYPES: frozenset[str] = frozenset({
     "cover_letter",
     "company_research",
     "wizard_generate",
-    "resume_optimize",
 })
 
 # Conservative peak VRAM estimates (GB) per task type.
 # Overridable per-install via scheduler.vram_budgets in config/llm.yaml.
DEFAULT_VRAM_BUDGETS: dict[str, float] = { - "cover_letter": 2.5, # alex-cover-writer:latest (~2 GB GGUF + headroom) + "cover_letter": 2.5, # alex-cover-writer:latest (~2GB GGUF + headroom) "company_research": 5.0, # llama3.1:8b or vllm model "wizard_generate": 2.5, # same model family as cover_letter - "resume_optimize": 5.0, # section-by-section rewrite; same budget as research } -_DEFAULT_MAX_QUEUE_DEPTH = 500 +# Lightweight task descriptor stored in per-type deques +TaskSpec = namedtuple("TaskSpec", ["id", "job_id", "params"]) -def _load_config_overrides(db_path: Path) -> tuple[dict[str, float], int]: - """Load VRAM budget overrides and max_queue_depth from config/llm.yaml.""" - budgets = dict(DEFAULT_VRAM_BUDGETS) - max_depth = _DEFAULT_MAX_QUEUE_DEPTH - config_path = db_path.parent.parent / "config" / "llm.yaml" - if config_path.exists(): - try: - import yaml - with open(config_path) as f: - cfg = yaml.safe_load(f) or {} - sched_cfg = cfg.get("scheduler", {}) - budgets.update(sched_cfg.get("vram_budgets", {})) - max_depth = int(sched_cfg.get("max_queue_depth", max_depth)) - except Exception as exc: - logger.warning( - "Failed to load scheduler config from %s: %s", config_path, exc - ) - return budgets, max_depth - - -# Module-level stub so tests can monkeypatch scripts.task_scheduler._get_gpus -# (existing tests monkeypatch this symbol — keep it here for backward compat). -try: - from scripts.preflight import get_gpus as _get_gpus -except Exception: - _get_gpus = lambda: [] # noqa: E731 - - -class TaskScheduler(_CoreTaskScheduler): - """Peregrine-specific TaskScheduler. 
-
-    Extends circuitforge_core.tasks.scheduler.TaskScheduler with:
-      - Peregrine default VRAM budgets and task types wired into __init__
-      - Config loading from config/llm.yaml
-      - Backward-compatible two-argument __init__ signature (db_path, run_task_fn)
-      - _get_gpus monkeypatch support (existing tests patch this module-level symbol)
-      - Backward-compatible enqueue() that marks dropped tasks failed in the DB
-        and logs under the scripts.task_scheduler logger
-
-    Direct construction is still supported for tests; production code should
-    use get_scheduler() instead.
-    """
+class TaskScheduler:
+    """Resource-aware LLM task batch scheduler. Use get_scheduler() — not direct construction."""
 
     def __init__(self, db_path: Path, run_task_fn: Callable) -> None:
-        budgets, max_depth = _load_config_overrides(db_path)
+        self._db_path = db_path
+        self._run_task = run_task_fn
 
-        # Warn under this module's logger for any task types with no VRAM budget
-        # (mirrors the core warning but captures under scripts.task_scheduler
-        # so existing tests using caplog.at_level(logger="scripts.task_scheduler") pass)
+        self._lock = threading.Lock()
+        self._wake = threading.Event()
+        self._stop = threading.Event()
+        self._queues: dict[str, deque] = {}
+        self._active: dict[str, threading.Thread] = {}
+        self._reserved_vram: float = 0.0
+        self._thread: Optional[threading.Thread] = None
+
+        # Load VRAM budgets: defaults + optional config overrides
+        self._budgets: dict[str, float] = dict(DEFAULT_VRAM_BUDGETS)
+        config_path = db_path.parent.parent / "config" / "llm.yaml"
+        self._max_queue_depth: int = 500
+        if config_path.exists():
+            try:
+                import yaml
+                with open(config_path) as f:
+                    cfg = yaml.safe_load(f) or {}
+                sched_cfg = cfg.get("scheduler", {})
+                self._budgets.update(sched_cfg.get("vram_budgets", {}))
+                self._max_queue_depth = sched_cfg.get("max_queue_depth", 500)
+            except Exception as exc:
+                logger.warning("Failed to load scheduler config from %s: %s", config_path, exc)
+
+        # Warn on LLM types with no budget entry after merge
         for t in LLM_TASK_TYPES:
-            if t not in budgets:
+            if t not in self._budgets:
                 logger.warning(
                     "No VRAM budget defined for LLM task type %r — "
                     "defaulting to 0.0 GB (unlimited concurrency for this type)", t
                 )
 
-        super().__init__(
-            db_path=db_path,
-            run_task_fn=run_task_fn,
-            task_types=LLM_TASK_TYPES,
-            vram_budgets=budgets,
-            max_queue_depth=max_depth,
-        )
+        # Detect total GPU VRAM; fall back to unlimited (999) on CPU-only systems.
+        # Uses module-level _get_gpus so tests can monkeypatch scripts.task_scheduler._get_gpus.
+        try:
+            gpus = _get_gpus()
+            self._available_vram: float = (
+                sum(g["vram_total_gb"] for g in gpus) if gpus else 999.0
+            )
+        except Exception:
+            self._available_vram = 999.0
 
-    def enqueue(
-        self,
-        task_id: int,
-        task_type: str,
-        job_id: int,
-        params: Optional[str],
-    ) -> bool:
+        # Durability: reload surviving 'queued' LLM tasks from prior run
+        self._load_queued_tasks()
+
+    def enqueue(self, task_id: int, task_type: str, job_id: int,
+                params: Optional[str]) -> None:
         """Add an LLM task to the scheduler queue.
 
-        When the queue is full, marks the task failed in SQLite immediately
-        (backward-compatible with the original Peregrine behavior) and logs a
-        warning under the scripts.task_scheduler logger.
-
-        Returns True if enqueued, False if the queue was full.
+        If the queue for this type is at max_queue_depth, the task is marked
+        failed in SQLite immediately (no ghost queued rows) and a warning is logged.
""" - enqueued = super().enqueue(task_id, task_type, job_id, params) - if not enqueued: - # Log under this module's logger so existing caplog tests pass - logger.warning( - "Queue depth limit reached for %s (max=%d) — task %d dropped", - task_type, self._max_queue_depth, task_id, - ) - from scripts.db import update_task_status - update_task_status( - self._db_path, task_id, "failed", error="Queue depth limit reached" - ) - return enqueued + from scripts.db import update_task_status + + with self._lock: + q = self._queues.setdefault(task_type, deque()) + if len(q) >= self._max_queue_depth: + logger.warning( + "Queue depth limit reached for %s (max=%d) — task %d dropped", + task_type, self._max_queue_depth, task_id, + ) + update_task_status(self._db_path, task_id, "failed", + error="Queue depth limit reached") + return + q.append(TaskSpec(task_id, job_id, params)) + + self._wake.set() + + def start(self) -> None: + """Start the background scheduler loop thread. Call once after construction.""" + self._thread = threading.Thread( + target=self._scheduler_loop, name="task-scheduler", daemon=True + ) + self._thread.start() + + def shutdown(self, timeout: float = 5.0) -> None: + """Signal the scheduler to stop and wait for it to exit.""" + self._stop.set() + self._wake.set() # unblock any wait() + if self._thread and self._thread.is_alive(): + self._thread.join(timeout=timeout) + + def _scheduler_loop(self) -> None: + """Main scheduler daemon — wakes on enqueue or batch completion.""" + while not self._stop.is_set(): + self._wake.wait(timeout=30) + self._wake.clear() + + with self._lock: + # Defense in depth: reap externally-killed batch threads. + # In normal operation _active.pop() runs in finally before _wake fires, + # so this reap finds nothing — no double-decrement risk. 
+            for t, thread in list(self._active.items()):
+                if not thread.is_alive():
+                    self._reserved_vram -= self._budgets.get(t, 0.0)
+                    del self._active[t]
+
+            # Start new type batches while VRAM allows
+            candidates = sorted(
+                [t for t in self._queues if self._queues[t] and t not in self._active],
+                key=lambda t: len(self._queues[t]),
+                reverse=True,
+            )
+            for task_type in candidates:
+                budget = self._budgets.get(task_type, 0.0)
+                # Always allow at least one batch to run even if its budget
+                # exceeds _available_vram (prevents permanent starvation when
+                # a single type's budget is larger than the VRAM ceiling).
+                if self._reserved_vram == 0.0 or self._reserved_vram + budget <= self._available_vram:
+                    thread = threading.Thread(
+                        target=self._batch_worker,
+                        args=(task_type,),
+                        name=f"batch-{task_type}",
+                        daemon=True,
+                    )
+                    self._active[task_type] = thread
+                    self._reserved_vram += budget
+                    thread.start()
+
+    def _batch_worker(self, task_type: str) -> None:
+        """Serial consumer for one task type. Runs until the type's deque is empty."""
+        try:
+            while True:
+                with self._lock:
+                    q = self._queues.get(task_type)
+                    if not q:
+                        break
+                    task = q.popleft()
+                # _run_task is scripts.task_runner._run_task (passed at construction)
+                self._run_task(
+                    self._db_path, task.id, task_type, task.job_id, task.params
+                )
+        finally:
+            # Always release — even if _run_task raises.
+            # _active.pop here prevents the scheduler loop reap from double-decrementing.
+            with self._lock:
+                self._active.pop(task_type, None)
+                self._reserved_vram -= self._budgets.get(task_type, 0.0)
+            self._wake.set()
+
+    def _load_queued_tasks(self) -> None:
+        """Load pre-existing queued LLM tasks from SQLite into deques (called once in __init__)."""
+        llm_types = sorted(LLM_TASK_TYPES)  # sorted for deterministic SQL parameter order
+        placeholders = ",".join("?" * len(llm_types))
+        conn = sqlite3.connect(self._db_path)
+        rows = conn.execute(
+            f"SELECT id, task_type, job_id, params FROM background_tasks"
+            f" WHERE status='queued' AND task_type IN ({placeholders})"
+            f" ORDER BY created_at ASC",
+            llm_types,
+        ).fetchall()
+        conn.close()
+
+        for row_id, task_type, job_id, params in rows:
+            q = self._queues.setdefault(task_type, deque())
+            q.append(TaskSpec(row_id, job_id, params))
+
+        if rows:
+            logger.info("Scheduler: resumed %d queued task(s) from prior run", len(rows))
 
 
-# ── Peregrine-local singleton ──────────────────────────────────────────────────
-# We manage our own singleton (not the core one) so the process-level instance
-# is always a Peregrine TaskScheduler (with the enqueue() override).
+# ── Singleton ─────────────────────────────────────────────────────────────────
 _scheduler: Optional[TaskScheduler] = None
 _scheduler_lock = threading.Lock()
 
 
-def get_scheduler(
-    db_path: Path,
-    run_task_fn: Optional[Callable] = None,
-) -> TaskScheduler:
-    """Return the process-level Peregrine TaskScheduler singleton.
+def get_scheduler(db_path: Path, run_task_fn: Optional[Callable] = None) -> TaskScheduler:
+    """Return the process-level TaskScheduler singleton, constructing it if needed.
 
-    run_task_fn is required on the first call; ignored on subsequent calls
-    (double-checked locking — singleton already constructed).
+    run_task_fn is required on the first call; ignored on subsequent calls.
+    Safety: inner lock + double-check prevents double-construction under races.
+    The outer None check is a fast-path performance optimisation only.
""" global _scheduler - if _scheduler is None: # fast path — no lock on steady state + if _scheduler is None: # fast path — avoids lock on steady state with _scheduler_lock: - if _scheduler is None: # re-check under lock + if _scheduler is None: # re-check under lock (double-checked locking) if run_task_fn is None: raise ValueError("run_task_fn required on first get_scheduler() call") _scheduler = TaskScheduler(db_path, run_task_fn) diff --git a/scripts/user_profile.py b/scripts/user_profile.py index eae7982..456b094 100644 --- a/scripts/user_profile.py +++ b/scripts/user_profile.py @@ -31,7 +31,6 @@ _DEFAULTS = { "wizard_complete": False, "wizard_step": 0, "dismissed_banners": [], - "ui_preference": "streamlit", "services": { "streamlit_port": 8501, "ollama_host": "localhost", @@ -79,37 +78,7 @@ class UserProfile: self.wizard_complete: bool = bool(data.get("wizard_complete", False)) self.wizard_step: int = int(data.get("wizard_step", 0)) self.dismissed_banners: list[str] = list(data.get("dismissed_banners", [])) - raw_pref = data.get("ui_preference", "streamlit") - self.ui_preference: str = raw_pref if raw_pref in ("streamlit", "vue") else "streamlit" self._svc = data["services"] - self._path = path - - def save(self) -> None: - """Save all profile fields back to user.yaml.""" - output = { - "name": self.name, - "email": self.email, - "phone": self.phone, - "linkedin": self.linkedin, - "career_summary": self.career_summary, - "candidate_voice": self.candidate_voice, - "nda_companies": self.nda_companies, - "docs_dir": str(self.docs_dir), - "ollama_models_dir": str(self.ollama_models_dir), - "vllm_models_dir": str(self.vllm_models_dir), - "inference_profile": self.inference_profile, - "mission_preferences": self.mission_preferences, - "candidate_accessibility_focus": self.candidate_accessibility_focus, - "candidate_lgbtq_focus": self.candidate_lgbtq_focus, - "tier": self.tier, - "dev_tier_override": self.dev_tier_override, - "wizard_complete": self.wizard_complete, 
- "wizard_step": self.wizard_step, - "dismissed_banners": self.dismissed_banners, - "ui_preference": self.ui_preference, - "services": self._svc, - } - self._path.write_text(yaml.dump(output, default_flow_style=False)) # ── Service URLs ────────────────────────────────────────────────────────── def _url(self, host: str, port: int, ssl: bool) -> str: diff --git a/tests/test_cover_letter_refinement.py b/tests/test_cover_letter_refinement.py index c6ebc84..852aebd 100644 --- a/tests/test_cover_letter_refinement.py +++ b/tests/test_cover_letter_refinement.py @@ -80,8 +80,7 @@ class TestTaskRunnerCoverLetterParams: captured = {} def mock_generate(title, company, description="", previous_result="", feedback="", - is_jobgether=False, _router=None, config_path=None, - user_yaml_path=None): + is_jobgether=False, _router=None): captured.update({ "title": title, "company": company, "previous_result": previous_result, "feedback": feedback, diff --git a/tests/test_db_migrate.py b/tests/test_db_migrate.py deleted file mode 100644 index 8da4a24..0000000 --- a/tests/test_db_migrate.py +++ /dev/null @@ -1,148 +0,0 @@ -"""Tests for scripts/db_migrate.py — numbered SQL migration runner.""" - -import sqlite3 -import textwrap -from pathlib import Path - -import pytest - -from scripts.db_migrate import migrate_db - - -# ── helpers ─────────────────────────────────────────────────────────────────── - -def _applied(db_path: Path) -> list[str]: - con = sqlite3.connect(db_path) - try: - rows = con.execute("SELECT version FROM schema_migrations ORDER BY version").fetchall() - return [r[0] for r in rows] - finally: - con.close() - - -def _tables(db_path: Path) -> set[str]: - con = sqlite3.connect(db_path) - try: - rows = con.execute( - "SELECT name FROM sqlite_master WHERE type='table' AND name NOT LIKE 'sqlite_%'" - ).fetchall() - return {r[0] for r in rows} - finally: - con.close() - - -# ── tests ────────────────────────────────────────────────────────────────────── - -def 
test_creates_schema_migrations_table(tmp_path): - """Running against an empty DB creates the tracking table.""" - db = tmp_path / "test.db" - (tmp_path / "migrations").mkdir() # empty migrations dir - # Patch the module-level _MIGRATIONS_DIR - import scripts.db_migrate as m - orig = m._MIGRATIONS_DIR - m._MIGRATIONS_DIR = tmp_path / "migrations" - try: - migrate_db(db) - assert "schema_migrations" in _tables(db) - finally: - m._MIGRATIONS_DIR = orig - - -def test_applies_migration_file(tmp_path): - """A .sql file in migrations/ is applied and recorded.""" - db = tmp_path / "test.db" - mdir = tmp_path / "migrations" - mdir.mkdir() - (mdir / "001_test.sql").write_text( - "CREATE TABLE IF NOT EXISTS widgets (id INTEGER PRIMARY KEY, name TEXT);" - ) - - import scripts.db_migrate as m - orig = m._MIGRATIONS_DIR - m._MIGRATIONS_DIR = mdir - try: - applied = migrate_db(db) - assert applied == ["001_test"] - assert "widgets" in _tables(db) - assert _applied(db) == ["001_test"] - finally: - m._MIGRATIONS_DIR = orig - - -def test_idempotent_second_run(tmp_path): - """Running migrate_db twice does not re-apply migrations.""" - db = tmp_path / "test.db" - mdir = tmp_path / "migrations" - mdir.mkdir() - (mdir / "001_test.sql").write_text( - "CREATE TABLE IF NOT EXISTS widgets (id INTEGER PRIMARY KEY, name TEXT);" - ) - - import scripts.db_migrate as m - orig = m._MIGRATIONS_DIR - m._MIGRATIONS_DIR = mdir - try: - migrate_db(db) - applied = migrate_db(db) # second run - assert applied == [] - assert _applied(db) == ["001_test"] - finally: - m._MIGRATIONS_DIR = orig - - -def test_applies_only_new_migrations(tmp_path): - """Migrations already in schema_migrations are skipped; only new ones run.""" - db = tmp_path / "test.db" - mdir = tmp_path / "migrations" - mdir.mkdir() - (mdir / "001_first.sql").write_text( - "CREATE TABLE IF NOT EXISTS first_table (id INTEGER PRIMARY KEY);" - ) - - import scripts.db_migrate as m - orig = m._MIGRATIONS_DIR - m._MIGRATIONS_DIR = mdir - try: - 
migrate_db(db) - - # Add a second migration - (mdir / "002_second.sql").write_text( - "CREATE TABLE IF NOT EXISTS second_table (id INTEGER PRIMARY KEY);" - ) - applied = migrate_db(db) - assert applied == ["002_second"] - assert set(_applied(db)) == {"001_first", "002_second"} - assert "second_table" in _tables(db) - finally: - m._MIGRATIONS_DIR = orig - - -def test_migration_failure_raises(tmp_path): - """A bad migration raises RuntimeError and does not record the version.""" - db = tmp_path / "test.db" - mdir = tmp_path / "migrations" - mdir.mkdir() - (mdir / "001_bad.sql").write_text("THIS IS NOT VALID SQL !!!") - - import scripts.db_migrate as m - orig = m._MIGRATIONS_DIR - m._MIGRATIONS_DIR = mdir - try: - with pytest.raises(RuntimeError, match="001_bad"): - migrate_db(db) - assert _applied(db) == [] - finally: - m._MIGRATIONS_DIR = orig - - -def test_baseline_migration_runs(tmp_path): - """The real 001_baseline.sql applies cleanly to a fresh database.""" - db = tmp_path / "test.db" - applied = migrate_db(db) - assert "001_baseline" in applied - expected_tables = { - "jobs", "job_contacts", "company_research", - "background_tasks", "survey_responses", "digest_queue", - "schema_migrations", - } - assert expected_tables <= _tables(db) diff --git a/tests/test_demo_toolbar.py b/tests/test_demo_toolbar.py deleted file mode 100644 index c7cb155..0000000 --- a/tests/test_demo_toolbar.py +++ /dev/null @@ -1,82 +0,0 @@ -"""Tests for app/components/demo_toolbar.py.""" -import sys -from pathlib import Path -import pytest - -sys.path.insert(0, str(Path(__file__).parent.parent)) - -from app.components.demo_toolbar import ( - get_simulated_tier, - set_simulated_tier, - render_demo_toolbar, -) - - -def test_set_simulated_tier_updates_session_state(monkeypatch): - """set_simulated_tier writes to st.session_state.simulated_tier.""" - session = {} - injected = [] - monkeypatch.setattr("streamlit.components.v1.html", lambda h, height=0: injected.append(h)) - 
monkeypatch.setattr("streamlit.session_state", session, raising=False) - monkeypatch.setattr("streamlit.rerun", lambda: None) - - set_simulated_tier("premium") - - assert session.get("simulated_tier") == "premium" - assert any("prgn_demo_tier=premium" in h for h in injected) - - -def test_set_simulated_tier_invalid_ignored(monkeypatch): - """Invalid tier strings are rejected.""" - session = {} - monkeypatch.setattr("streamlit.components.v1.html", lambda h, height=0: None) - monkeypatch.setattr("streamlit.session_state", session, raising=False) - monkeypatch.setattr("streamlit.rerun", lambda: None) - - set_simulated_tier("ultramax") - - assert "simulated_tier" not in session - - -def test_get_simulated_tier_defaults_to_paid(monkeypatch): - """Returns 'paid' when no tier is set yet.""" - monkeypatch.setattr("streamlit.session_state", {}, raising=False) - monkeypatch.setattr("streamlit.query_params", {}, raising=False) - - assert get_simulated_tier() == "paid" - - -def test_get_simulated_tier_reads_session(monkeypatch): - """Returns tier from st.session_state when set.""" - monkeypatch.setattr("streamlit.session_state", {"simulated_tier": "free"}, raising=False) - monkeypatch.setattr("streamlit.query_params", {}, raising=False) - - assert get_simulated_tier() == "free" - - -def test_render_demo_toolbar_renders_pills(monkeypatch): - """render_demo_toolbar renders tier selection pills.""" - session = {"simulated_tier": "paid"} - calls = [] - - def mock_button(label, key=None, type=None, use_container_width=False): - calls.append(("button", label, key, type)) - return False # button not clicked - - monkeypatch.setattr("streamlit.session_state", session, raising=False) - monkeypatch.setattr("streamlit.container", lambda: __import__("contextlib").nullcontext()) - monkeypatch.setattr("streamlit.columns", lambda x: [__import__("contextlib").nullcontext() for _ in x]) - monkeypatch.setattr("streamlit.caption", lambda x: None) - monkeypatch.setattr("streamlit.button", 
mock_button) - monkeypatch.setattr("streamlit.divider", lambda: None) - - render_demo_toolbar() - - # Verify buttons were rendered for all tiers - button_calls = [c for c in calls if c[0] == "button"] - assert len(button_calls) == 3 - assert any("Paid ✓" in c[1] for c in button_calls) # current tier marked - - primary_calls = [c for c in button_calls if c[3] == "primary"] - assert len(primary_calls) == 1 - assert "Paid" in primary_calls[0][1] diff --git a/tests/test_dev_api_digest.py b/tests/test_dev_api_digest.py index 71a0a08..f976ecf 100644 --- a/tests/test_dev_api_digest.py +++ b/tests/test_dev_api_digest.py @@ -50,8 +50,9 @@ def tmp_db(tmp_path): @pytest.fixture() def client(tmp_db, monkeypatch): monkeypatch.setenv("STAGING_DB", tmp_db) + import importlib import dev_api - monkeypatch.setattr(dev_api, "DB_PATH", tmp_db) + importlib.reload(dev_api) return TestClient(dev_api.app) diff --git a/tests/test_dev_api_interviews.py b/tests/test_dev_api_interviews.py index 1a3aa64..06803c3 100644 --- a/tests/test_dev_api_interviews.py +++ b/tests/test_dev_api_interviews.py @@ -54,8 +54,10 @@ def tmp_db(tmp_path): @pytest.fixture() def client(tmp_db, monkeypatch): monkeypatch.setenv("STAGING_DB", tmp_db) + # Re-import after env var is set so DB_PATH picks it up + import importlib import dev_api - monkeypatch.setattr(dev_api, "DB_PATH", tmp_db) + importlib.reload(dev_api) return TestClient(dev_api.app) diff --git a/tests/test_dev_api_settings.py b/tests/test_dev_api_settings.py index d2fa97a..55d460b 100644 --- a/tests/test_dev_api_settings.py +++ b/tests/test_dev_api_settings.py @@ -145,7 +145,7 @@ def test_get_resume_missing_returns_not_exists(tmp_path, monkeypatch): """GET /api/settings/resume when file missing returns {exists: false}.""" fake_path = tmp_path / "config" / "plain_text_resume.yaml" # Ensure the path doesn't exist - monkeypatch.setattr("dev_api._resume_path", lambda: fake_path) + monkeypatch.setattr("dev_api.RESUME_PATH", fake_path) from dev_api import app 
     c = TestClient(app)
@@ -157,7 +157,7 @@ def test_get_resume_missing_returns_not_exists(tmp_path, monkeypatch):
 
 def test_post_resume_blank_creates_file(tmp_path, monkeypatch):
     """POST /api/settings/resume/blank creates the file."""
     fake_path = tmp_path / "config" / "plain_text_resume.yaml"
-    monkeypatch.setattr("dev_api._resume_path", lambda: fake_path)
+    monkeypatch.setattr("dev_api.RESUME_PATH", fake_path)
     from dev_api import app
     c = TestClient(app)
@@ -170,7 +170,7 @@ def test_get_resume_after_blank_returns_exists(tmp_path, monkeypatch):
     """GET /api/settings/resume after blank creation returns {exists: true}."""
     fake_path = tmp_path / "config" / "plain_text_resume.yaml"
-    monkeypatch.setattr("dev_api._resume_path", lambda: fake_path)
+    monkeypatch.setattr("dev_api.RESUME_PATH", fake_path)
     from dev_api import app
     c = TestClient(app)
@@ -212,7 +212,7 @@ def test_get_search_prefs_returns_dict(tmp_path, monkeypatch):
     fake_path.parent.mkdir(parents=True, exist_ok=True)
     with open(fake_path, "w") as f:
         yaml.dump({"default": {"remote_preference": "remote", "job_boards": []}}, f)
-    monkeypatch.setattr("dev_api._search_prefs_path", lambda: fake_path)
+    monkeypatch.setattr("dev_api.SEARCH_PREFS_PATH", fake_path)
     from dev_api import app
     c = TestClient(app)
@@ -227,7 +227,7 @@ def test_put_get_search_roundtrip(tmp_path, monkeypatch):
     """PUT then GET search prefs round-trip: saved field is returned."""
     fake_path = tmp_path / "config" / "search_profiles.yaml"
     fake_path.parent.mkdir(parents=True, exist_ok=True)
-    monkeypatch.setattr("dev_api._search_prefs_path", lambda: fake_path)
+    monkeypatch.setattr("dev_api.SEARCH_PREFS_PATH", fake_path)
     from dev_api import app
     c = TestClient(app)
@@ -253,7 +253,7 @@ def test_put_get_search_roundtrip(tmp_path, monkeypatch):
 
 def test_get_search_missing_file_returns_empty(tmp_path, monkeypatch):
     """GET /api/settings/search when file missing returns empty dict."""
     fake_path = tmp_path / "config" / "search_profiles.yaml"
-    monkeypatch.setattr("dev_api._search_prefs_path", lambda: fake_path)
+    monkeypatch.setattr("dev_api.SEARCH_PREFS_PATH", fake_path)
     from dev_api import app
     c = TestClient(app)
@@ -363,7 +363,7 @@ def test_get_services_cpu_profile(client):
 
 def test_get_email_has_password_set_bool(tmp_path, monkeypatch):
     """GET /api/settings/system/email has password_set (bool) and no password key."""
     fake_email_path = tmp_path / "email.yaml"
-    monkeypatch.setattr("dev_api._config_dir", lambda: fake_email_path.parent)
+    monkeypatch.setattr("dev_api.EMAIL_PATH", fake_email_path)
     with patch("dev_api.get_credential", return_value=None):
         from dev_api import app
         c = TestClient(app)
@@ -378,7 +378,7 @@ def test_get_email_has_password_set_bool(tmp_path, monkeypatch):
 def test_get_email_password_set_true_when_stored(tmp_path, monkeypatch):
     """password_set is True when credential is stored."""
     fake_email_path = tmp_path / "email.yaml"
-    monkeypatch.setattr("dev_api._config_dir", lambda: fake_email_path.parent)
+    monkeypatch.setattr("dev_api.EMAIL_PATH", fake_email_path)
     with patch("dev_api.get_credential", return_value="secret"):
         from dev_api import app
         c = TestClient(app)
@@ -426,14 +426,10 @@ def test_finetune_status_returns_status_and_pairs_count(client):
     assert "pairs_count" in data
 
 
-def test_finetune_status_idle_when_no_task(tmp_path, monkeypatch):
+def test_finetune_status_idle_when_no_task(client):
     """Status is 'idle' and pairs_count is 0 when no task exists."""
-    fake_jsonl = tmp_path / "cover_letters.jsonl"  # does not exist -> 0 pairs
-    monkeypatch.setattr("dev_api._TRAINING_JSONL", fake_jsonl)
     with patch("scripts.task_runner.get_task_status", return_value=None, create=True):
-        from dev_api import app
-        c = TestClient(app)
-        resp = c.get("/api/settings/fine-tune/status")
+        resp = client.get("/api/settings/fine-tune/status")
     assert resp.status_code == 200
     data = resp.json()
     assert data["status"] == "idle"
@@ -445,7 +441,7 @@ def test_finetune_status_idle_when_no_task(tmp_path, monkeypatch):
 def test_get_license_returns_tier_and_active(tmp_path, monkeypatch):
     """GET /api/settings/license returns tier and active fields."""
     fake_license = tmp_path / "license.yaml"
-    monkeypatch.setattr("dev_api._license_path", lambda: fake_license)
+    monkeypatch.setattr("dev_api.LICENSE_PATH", fake_license)
     from dev_api import app
     c = TestClient(app)
@@ -459,7 +455,7 @@ def test_get_license_returns_tier_and_active(tmp_path, monkeypatch):
 def test_get_license_defaults_to_free(tmp_path, monkeypatch):
     """GET /api/settings/license defaults to free tier when no file."""
     fake_license = tmp_path / "license.yaml"
-    monkeypatch.setattr("dev_api._license_path", lambda: fake_license)
+    monkeypatch.setattr("dev_api.LICENSE_PATH", fake_license)
     from dev_api import app
     c = TestClient(app)
@@ -473,7 +469,8 @@ def test_activate_license_valid_key_returns_ok(tmp_path, monkeypatch):
     """POST activate with valid key format returns {ok: true}."""
     fake_license = tmp_path / "license.yaml"
-    monkeypatch.setattr("dev_api._license_path", lambda: fake_license)
+    monkeypatch.setattr("dev_api.LICENSE_PATH", fake_license)
+    monkeypatch.setattr("dev_api.CONFIG_DIR", tmp_path)
     from dev_api import app
     c = TestClient(app)
@@ -485,7 +482,8 @@ def test_activate_license_invalid_key_returns_ok_false(tmp_path, monkeypatch):
     """POST activate with bad key format returns {ok: false}."""
     fake_license = tmp_path / "license.yaml"
-    monkeypatch.setattr("dev_api._license_path", lambda: fake_license)
+    monkeypatch.setattr("dev_api.LICENSE_PATH", fake_license)
+    monkeypatch.setattr("dev_api.CONFIG_DIR", tmp_path)
     from dev_api import app
     c = TestClient(app)
@@ -497,7 +495,8 @@ def test_deactivate_license_returns_ok(tmp_path, monkeypatch):
     """POST /api/settings/license/deactivate returns 200 with ok."""
     fake_license = tmp_path / "license.yaml"
-    monkeypatch.setattr("dev_api._license_path", lambda: fake_license)
+    monkeypatch.setattr("dev_api.LICENSE_PATH", fake_license)
+    monkeypatch.setattr("dev_api.CONFIG_DIR", tmp_path)
     from dev_api import app
     c = TestClient(app)
@@ -509,7 +508,8 @@ def test_activate_then_deactivate(tmp_path, monkeypatch):
     """Activate then deactivate: active goes False."""
     fake_license = tmp_path / "license.yaml"
-    monkeypatch.setattr("dev_api._license_path", lambda: fake_license)
+    monkeypatch.setattr("dev_api.LICENSE_PATH", fake_license)
+    monkeypatch.setattr("dev_api.CONFIG_DIR", tmp_path)
     from dev_api import app
     c = TestClient(app)
@@ -580,7 +580,7 @@ def test_get_developer_returns_expected_fields(tmp_path, monkeypatch):
     _write_user_yaml(user_yaml)
     monkeypatch.setenv("STAGING_DB", str(db_dir / "staging.db"))
     fake_tokens = tmp_path / "tokens.yaml"
-    monkeypatch.setattr("dev_api._tokens_path", lambda: fake_tokens)
+    monkeypatch.setattr("dev_api.TOKENS_PATH", fake_tokens)
     from dev_api import app
     c = TestClient(app)
@@ -602,7 +602,7 @@ def test_put_dev_tier_then_get(tmp_path, monkeypatch):
     _write_user_yaml(user_yaml)
     monkeypatch.setenv("STAGING_DB", str(db_dir / "staging.db"))
     fake_tokens = tmp_path / "tokens.yaml"
-    monkeypatch.setattr("dev_api._tokens_path", lambda: fake_tokens)
+    monkeypatch.setattr("dev_api.TOKENS_PATH", fake_tokens)
     from dev_api import app
     c = TestClient(app)
diff --git a/tests/test_llm_router.py b/tests/test_llm_router.py
index 09451f6..0d5a897 100644
--- a/tests/test_llm_router.py
+++ b/tests/test_llm_router.py
@@ -24,7 +24,7 @@ def test_router_uses_first_reachable_backend():
     mock_response.choices[0].message.content = "hello"
 
     with patch.object(router, "_is_reachable", side_effect=[False, True, True, True, True]), \
-         patch("circuitforge_core.llm.router.OpenAI") as MockOpenAI:
+         patch("scripts.llm_router.OpenAI") as MockOpenAI:
         instance = MockOpenAI.return_value
         instance.chat.completions.create.return_value = mock_response
         mock_model = MagicMock()
@@ -54,7 +54,7 @@ def test_is_reachable_returns_false_on_connection_error():
 
     router = LLMRouter(CONFIG_PATH)
 
-    with patch("circuitforge_core.llm.router.requests.get", side_effect=requests.ConnectionError):
+    with patch("scripts.llm_router.requests.get", side_effect=requests.ConnectionError):
         result = router._is_reachable("http://localhost:9999/v1")
 
     assert result is False
@@ -92,8 +92,8 @@ def test_complete_skips_backend_without_image_support(tmp_path):
     mock_resp.status_code = 200
     mock_resp.json.return_value = {"text": "B — collaborative"}
 
-    with patch("circuitforge_core.llm.router.requests.get") as mock_get, \
-         patch("circuitforge_core.llm.router.requests.post") as mock_post:
+    with patch("scripts.llm_router.requests.get") as mock_get, \
+         patch("scripts.llm_router.requests.post") as mock_post:
         # health check returns ok for vision_service
         mock_get.return_value = MagicMock(status_code=200)
         mock_post.return_value = mock_resp
@@ -127,7 +127,7 @@ def test_complete_without_images_skips_vision_service(tmp_path):
     cfg_file.write_text(yaml.dump(cfg))
     router = LLMRouter(config_path=cfg_file)
 
-    with patch("circuitforge_core.llm.router.requests.post") as mock_post:
+    with patch("scripts.llm_router.requests.post") as mock_post:
         try:
             router.complete("text only prompt")
         except RuntimeError:
diff --git a/tests/test_llm_router_shim.py b/tests/test_llm_router_shim.py
deleted file mode 100644
index 23866a0..0000000
--- a/tests/test_llm_router_shim.py
+++ /dev/null
@@ -1,132 +0,0 @@
-"""Tests for Peregrine's LLMRouter shim — priority fallback logic."""
-import sys
-from pathlib import Path
-from unittest.mock import patch, MagicMock, call
-
-sys.path.insert(0, str(Path(__file__).parent.parent))
-
-
-def _import_fresh():
-    """Import scripts.llm_router fresh (bypass module cache)."""
-    import importlib
-    import scripts.llm_router as mod
-    importlib.reload(mod)
-    return mod
-
-
-# ---------------------------------------------------------------------------
-# Test 1: local config/llm.yaml takes priority when it exists
-# ---------------------------------------------------------------------------
-
-def test_uses_local_yaml_when_present():
-    """When config/llm.yaml exists locally, super().__init__ is called with that path."""
-    import scripts.llm_router as shim_mod
-    from circuitforge_core.llm import LLMRouter as _CoreLLMRouter
-
-    local_path = Path(shim_mod.__file__).parent.parent / "config" / "llm.yaml"
-    user_path = Path.home() / ".config" / "circuitforge" / "llm.yaml"
-
-    def fake_exists(self):
-        return self == local_path  # only the local path "exists"
-
-    captured = {}
-
-    def fake_core_init(self, config_path=None):
-        captured["config_path"] = config_path
-        self.config = {}
-
-    with patch.object(Path, "exists", fake_exists), \
-         patch.object(_CoreLLMRouter, "__init__", fake_core_init):
-        import importlib
-        import scripts.llm_router as mod
-        importlib.reload(mod)
-        mod.LLMRouter()
-
-    assert captured.get("config_path") == local_path, (
-        f"Expected super().__init__ to be called with local path {local_path}, "
-        f"got {captured.get('config_path')}"
-    )
-
-
-# ---------------------------------------------------------------------------
-# Test 2: falls through to env-var auto-config when neither yaml exists
-# ---------------------------------------------------------------------------
-
-def test_falls_through_to_env_when_no_yamls():
-    """When no yaml files exist, super().__init__ is called with no args (env-var path)."""
-    import scripts.llm_router as shim_mod
-    from circuitforge_core.llm import LLMRouter as _CoreLLMRouter
-
-    captured = {}
-
-    def fake_exists(self):
-        return False  # no yaml files exist anywhere
-
-    def fake_core_init(self, config_path=None):
-        # Record whether a path was passed
-        captured["config_path"] = config_path
-        captured["called"] = True
-        self.config = {}
-
-    with patch.object(Path, "exists", fake_exists), \
-         patch.object(_CoreLLMRouter, "__init__", fake_core_init):
-        import importlib
-        import scripts.llm_router as mod
-        importlib.reload(mod)
-        mod.LLMRouter()
-
-    assert captured.get("called"), "super().__init__ was never called"
-    # When called with no args, config_path defaults to None in our mock,
-    # meaning the shim correctly fell through to env-var auto-config
-    assert captured.get("config_path") is None, (
-        f"Expected super().__init__ to be called with no explicit path (None), "
-        f"got {captured.get('config_path')}"
-    )
-
-
-# ---------------------------------------------------------------------------
-# Test 3: module-level complete() singleton is only instantiated once
-# ---------------------------------------------------------------------------
-
-def test_complete_singleton_is_reused():
-    """complete() reuses the same LLMRouter instance across multiple calls."""
-    import importlib
-    import scripts.llm_router as mod
-    importlib.reload(mod)
-
-    # Reset singleton
-    mod._router = None
-
-    instantiation_count = [0]
-    original_init = mod.LLMRouter.__init__
-
-    mock_router = MagicMock()
-    mock_router.complete.return_value = "OK"
-
-    original_class = mod.LLMRouter
-
-    class CountingRouter(original_class):
-        def __init__(self):
-            instantiation_count[0] += 1
-            # Bypass real __init__ to avoid needing config files
-            self.config = {}
-
-        def complete(self, prompt, system=None):
-            return "OK"
-
-    # Patch the class in the module
-    mod.LLMRouter = CountingRouter
-    mod._router = None
-
-    result1 = mod.complete("first call")
-    result2 = mod.complete("second call")
-
-    assert result1 == "OK"
-    assert result2 == "OK"
-    assert instantiation_count[0] == 1, (
-        f"Expected LLMRouter to be instantiated exactly once, "
-        f"got {instantiation_count[0]} instantiation(s)"
-    )
-
-    # Restore
-    mod.LLMRouter = original_class
diff --git a/tests/test_preflight_env_adoption.py b/tests/test_preflight_env_adoption.py
deleted file mode 100644
index 21c4cf9..0000000
--- a/tests/test_preflight_env_adoption.py
+++ /dev/null
@@ -1,80 +0,0 @@
-"""Tests: preflight writes OLLAMA_HOST to .env when Ollama is adopted from host."""
-import sys
-from pathlib import Path
-from unittest.mock import patch, call
-
-sys.path.insert(0, str(Path(__file__).parent.parent))
-
-import scripts.preflight as pf
-
-
-def _make_ports(ollama_external: bool = True, ollama_port: int = 11434) -> dict:
-    """Build a minimal ports dict as returned by preflight's port-scanning logic."""
-    return {
-        "ollama": {
-            "resolved": ollama_port,
-            "external": ollama_external,
-            "stub_port": 54321,
-            "env_var": "OLLAMA_PORT",
-            "adoptable": True,
-        },
-        "streamlit": {
-            "resolved": 8502,
-            "external": False,
-            "stub_port": 8502,
-            "env_var": "STREAMLIT_PORT",
-            "adoptable": False,
-        },
-    }
-
-
-def _capture_env_updates(ports: dict) -> dict:
-    """Run the env_updates construction block from preflight.main() and return the result.
-
-    We extract this logic from main() so tests can call it directly without
-    needing to simulate the full CLI argument parsing and system probe flow.
-    The block under test is the `if not args.check_only:` section.
- """ - captured = {} - - def fake_write_env(updates: dict) -> None: - captured.update(updates) - - with patch.object(pf, "write_env", side_effect=fake_write_env), \ - patch.object(pf, "update_llm_yaml"), \ - patch.object(pf, "write_compose_override"): - # Replicate the env_updates block from preflight.main() as faithfully as possible - env_updates: dict[str, str] = {i["env_var"]: str(i["stub_port"]) for i in ports.values()} - env_updates["RECOMMENDED_PROFILE"] = "single-gpu" - - # ---- Code under test: the OLLAMA_HOST adoption block ---- - ollama_info = ports.get("ollama") - if ollama_info and ollama_info.get("external"): - env_updates["OLLAMA_HOST"] = f"http://host.docker.internal:{ollama_info['resolved']}" - # --------------------------------------------------------- - - pf.write_env(env_updates) - - return captured - - -def test_ollama_host_written_when_adopted(): - """OLLAMA_HOST is added when Ollama is adopted from the host (external=True).""" - ports = _make_ports(ollama_external=True, ollama_port=11434) - result = _capture_env_updates(ports) - assert "OLLAMA_HOST" in result - assert result["OLLAMA_HOST"] == "http://host.docker.internal:11434" - - -def test_ollama_host_not_written_when_docker_managed(): - """OLLAMA_HOST is NOT added when Ollama runs in Docker (external=False).""" - ports = _make_ports(ollama_external=False) - result = _capture_env_updates(ports) - assert "OLLAMA_HOST" not in result - - -def test_ollama_host_reflects_adopted_port(): - """OLLAMA_HOST uses the actual adopted port, not the default.""" - ports = _make_ports(ollama_external=True, ollama_port=11500) - result = _capture_env_updates(ports) - assert result["OLLAMA_HOST"] == "http://host.docker.internal:11500" diff --git a/tests/test_resume_optimizer.py b/tests/test_resume_optimizer.py deleted file mode 100644 index 5425a5f..0000000 --- a/tests/test_resume_optimizer.py +++ /dev/null @@ -1,288 +0,0 @@ -# tests/test_resume_optimizer.py -"""Tests for scripts/resume_optimizer.py""" -import 
json -import pytest -from unittest.mock import MagicMock, patch - - -# ── Fixtures ───────────────────────────────────────────────────────────────── - -SAMPLE_RESUME = { - "name": "Alex Rivera", - "email": "alex@example.com", - "phone": "555-1234", - "career_summary": "Experienced Customer Success Manager with a track record of growth.", - "skills": ["Salesforce", "Python", "customer success"], - "experience": [ - { - "title": "Customer Success Manager", - "company": "Acme Corp", - "start_date": "2021", - "end_date": "present", - "bullets": [ - "Managed a portfolio of 120 enterprise accounts.", - "Reduced churn by 18% through proactive outreach.", - ], - }, - { - "title": "Support Engineer", - "company": "Beta Inc", - "start_date": "2018", - "end_date": "2021", - "bullets": ["Resolved escalations for top-tier clients."], - }, - ], - "education": [ - { - "degree": "B.S.", - "field": "Computer Science", - "institution": "State University", - "graduation_year": "2018", - } - ], - "achievements": [], -} - -SAMPLE_JD = ( - "We are looking for a Customer Success Manager with Gainsight, cross-functional " - "leadership experience, and strong stakeholder management skills. AWS knowledge a plus." 
-)
-
-
-# ── extract_jd_signals ──────────────────────────────────────────────────────────
-
-def test_extract_jd_signals_returns_list():
-    """extract_jd_signals returns a list even when LLM and TF-IDF both fail."""
-    from scripts.resume_optimizer import extract_jd_signals
-
-    with patch("scripts.llm_router.LLMRouter") as MockRouter:
-        MockRouter.return_value.complete.side_effect = Exception("no LLM")
-        result = extract_jd_signals(SAMPLE_JD, resume_text="Python developer")
-
-    assert isinstance(result, list)
-
-
-def test_extract_jd_signals_llm_path_parses_json_array():
-    """extract_jd_signals merges LLM-extracted signals with TF-IDF gaps."""
-    from scripts.resume_optimizer import extract_jd_signals
-
-    llm_response = '["Gainsight", "cross-functional leadership", "stakeholder management"]'
-
-    with patch("scripts.llm_router.LLMRouter") as MockRouter:
-        MockRouter.return_value.complete.return_value = llm_response
-        result = extract_jd_signals(SAMPLE_JD)
-
-    assert "Gainsight" in result
-    assert "cross-functional leadership" in result
-
-
-def test_extract_jd_signals_deduplicates():
-    """extract_jd_signals deduplicates terms across LLM and TF-IDF sources."""
-    from scripts.resume_optimizer import extract_jd_signals
-
-    llm_response = '["Python", "AWS", "Python"]'
-
-    with patch("scripts.llm_router.LLMRouter") as MockRouter:
-        MockRouter.return_value.complete.return_value = llm_response
-        result = extract_jd_signals(SAMPLE_JD)
-
-    assert result.count("Python") == 1
-
-
-def test_extract_jd_signals_handles_malformed_llm_json():
-    """extract_jd_signals falls back gracefully when LLM returns non-JSON."""
-    from scripts.resume_optimizer import extract_jd_signals
-
-    with patch("scripts.llm_router.LLMRouter") as MockRouter:
-        MockRouter.return_value.complete.return_value = "Here are some keywords: Gainsight, AWS"
-        result = extract_jd_signals(SAMPLE_JD)
-
-    # Should still return a list (may be empty if TF-IDF also silent)
-    assert isinstance(result, list)
-
-
-# ── prioritize_gaps ──────────────────────────────────────────────────────────────
-
-def test_prioritize_gaps_skips_existing_terms():
-    """prioritize_gaps excludes terms already present in the resume."""
-    from scripts.resume_optimizer import prioritize_gaps
-
-    # "Salesforce" is already in SAMPLE_RESUME skills
-    result = prioritize_gaps(["Salesforce", "Gainsight"], SAMPLE_RESUME)
-    terms = [r["term"] for r in result]
-
-    assert "Salesforce" not in terms
-    assert "Gainsight" in terms
-
-
-def test_prioritize_gaps_routes_tech_terms_to_skills():
-    """prioritize_gaps maps known tech keywords to the skills section at priority 1."""
-    from scripts.resume_optimizer import prioritize_gaps
-
-    result = prioritize_gaps(["AWS", "Docker"], SAMPLE_RESUME)
-    by_term = {r["term"]: r for r in result}
-
-    assert by_term["AWS"]["section"] == "skills"
-    assert by_term["AWS"]["priority"] == 1
-    assert by_term["Docker"]["section"] == "skills"
-
-
-def test_prioritize_gaps_routes_leadership_terms_to_summary():
-    """prioritize_gaps maps leadership/executive signals to the summary section."""
-    from scripts.resume_optimizer import prioritize_gaps
-
-    result = prioritize_gaps(["cross-functional", "stakeholder"], SAMPLE_RESUME)
-    by_term = {r["term"]: r for r in result}
-
-    assert by_term["cross-functional"]["section"] == "summary"
-    assert by_term["stakeholder"]["section"] == "summary"
-
-
-def test_prioritize_gaps_multi_word_routes_to_experience():
-    """Multi-word phrases not in skills/summary lists go to experience at priority 2."""
-    from scripts.resume_optimizer import prioritize_gaps
-
-    result = prioritize_gaps(["proactive client engagement"], SAMPLE_RESUME)
-    assert result[0]["section"] == "experience"
-    assert result[0]["priority"] == 2
-
-
-def test_prioritize_gaps_single_word_is_lowest_priority():
-    """Single generic words not in any list go to experience at priority 3."""
-    from scripts.resume_optimizer import prioritize_gaps
-
-    result = prioritize_gaps(["innovation"], SAMPLE_RESUME)
-    assert result[0]["priority"] == 3
-
-
-def test_prioritize_gaps_sorted_by_priority():
-    """prioritize_gaps output is sorted ascending by priority (1 first)."""
-    from scripts.resume_optimizer import prioritize_gaps
-
-    gaps = ["innovation", "AWS", "cross-functional", "managed service contracts"]
-    result = prioritize_gaps(gaps, SAMPLE_RESUME)
-    priorities = [r["priority"] for r in result]
-
-    assert priorities == sorted(priorities)
-
-
-# ── hallucination_check ──────────────────────────────────────────────────────────
-
-def test_hallucination_check_passes_unchanged_resume():
-    """hallucination_check returns True when rewrite has no new employers or institutions."""
-    from scripts.resume_optimizer import hallucination_check
-
-    # Shallow rewrite: same structure
-    rewritten = {
-        **SAMPLE_RESUME,
-        "career_summary": "Dynamic CSM with cross-functional stakeholder management experience.",
-    }
-    assert hallucination_check(SAMPLE_RESUME, rewritten) is True
-
-
-def test_hallucination_check_fails_on_new_employer():
-    """hallucination_check returns False when a new company is introduced."""
-    from scripts.resume_optimizer import hallucination_check
-
-    fabricated_entry = {
-        "title": "VP of Customer Success",
-        "company": "Fabricated Corp",
-        "start_date": "2019",
-        "end_date": "2021",
-        "bullets": ["Led a team of 30."],
-    }
-    rewritten = dict(SAMPLE_RESUME)
-    rewritten["experience"] = SAMPLE_RESUME["experience"] + [fabricated_entry]
-
-    assert hallucination_check(SAMPLE_RESUME, rewritten) is False
-
-
-def test_hallucination_check_fails_on_new_institution():
-    """hallucination_check returns False when a new educational institution appears."""
-    from scripts.resume_optimizer import hallucination_check
-
-    rewritten = dict(SAMPLE_RESUME)
-    rewritten["education"] = [
-        *SAMPLE_RESUME["education"],
-        {"degree": "M.S.", "field": "Data Science", "institution": "MIT", "graduation_year": "2020"},
-    ]
-
-    assert hallucination_check(SAMPLE_RESUME, rewritten) is False
-
-
-# ── render_resume_text ───────────────────────────────────────────────────────────
-
-def test_render_resume_text_contains_all_sections():
-    """render_resume_text produces plain text containing all resume sections."""
-    from scripts.resume_optimizer import render_resume_text
-
-    text = render_resume_text(SAMPLE_RESUME)
-
-    assert "Alex Rivera" in text
-    assert "SUMMARY" in text
-    assert "EXPERIENCE" in text
-    assert "Customer Success Manager" in text
-    assert "Acme Corp" in text
-    assert "EDUCATION" in text
-    assert "State University" in text
-    assert "SKILLS" in text
-    assert "Salesforce" in text
-
-
-def test_render_resume_text_omits_empty_sections():
-    """render_resume_text skips sections that have no content."""
-    from scripts.resume_optimizer import render_resume_text
-
-    sparse = {
-        "name": "Jordan Lee",
-        "email": "",
-        "phone": "",
-        "career_summary": "",
-        "skills": [],
-        "experience": [],
-        "education": [],
-        "achievements": [],
-    }
-    text = render_resume_text(sparse)
-
-    assert "EXPERIENCE" not in text
-    assert "SKILLS" not in text
-
-
-# ── db integration ───────────────────────────────────────────────────────────────
-
-def test_save_and_get_optimized_resume(tmp_path):
-    """save_optimized_resume persists and get_optimized_resume retrieves the data."""
-    from scripts.db import init_db, save_optimized_resume, get_optimized_resume
-
-    db_path = tmp_path / "test.db"
-    init_db(db_path)
-
-    # Insert a minimal job to satisfy FK
-    import sqlite3
-    conn = sqlite3.connect(db_path)
-    conn.execute(
-        "INSERT INTO jobs (id, title, company, url, source, status) VALUES (1, 'CSM', 'Acme', 'http://x.com', 'test', 'approved')"
-    )
-    conn.commit()
-    conn.close()
-
-    gap_report = json.dumps([{"term": "Gainsight", "section": "skills", "priority": 1, "rationale": "test"}])
-    save_optimized_resume(db_path, job_id=1, text="Rewritten resume text.", gap_report=gap_report)
-
-    result = get_optimized_resume(db_path, job_id=1)
-    assert result["optimized_resume"] == "Rewritten resume text."
-    parsed = json.loads(result["ats_gap_report"])
-    assert parsed[0]["term"] == "Gainsight"
-
-
-def test_get_optimized_resume_returns_empty_for_missing(tmp_path):
-    """get_optimized_resume returns empty strings when no record exists."""
-    from scripts.db import init_db, get_optimized_resume
-
-    db_path = tmp_path / "test.db"
-    init_db(db_path)
-
-    result = get_optimized_resume(db_path, job_id=999)
-    assert result["optimized_resume"] == ""
-    assert result["ats_gap_report"] == ""
diff --git a/tests/test_task_scheduler.py b/tests/test_task_scheduler.py
index 38b88ff..7746ca4 100644
--- a/tests/test_task_scheduler.py
+++ b/tests/test_task_scheduler.py
@@ -109,33 +109,24 @@ def test_missing_budget_logs_warning(tmp_db, caplog):
         ts.LLM_TASK_TYPES = frozenset(original)
 
 
-def test_cpu_only_system_creates_scheduler(tmp_db, monkeypatch):
-    """Scheduler constructs without error when _get_gpus() returns empty list.
-
-    LocalScheduler has no VRAM gating — it runs tasks regardless of GPU count.
-    VRAM-aware scheduling is handled by circuitforge_orch's coordinator.
-    """
+def test_cpu_only_system_gets_unlimited_vram(tmp_db, monkeypatch):
+    """_available_vram is 999.0 when _get_gpus() returns empty list."""
+    # Patch the module-level _get_gpus in task_scheduler (not preflight)
+    # so __init__'s _ts_mod._get_gpus() call picks up the mock.
     monkeypatch.setattr("scripts.task_scheduler._get_gpus", lambda: [])
     s = TaskScheduler(tmp_db, _noop_run_task)
-    # Scheduler still has correct budgets configured; no VRAM attribute expected
-    # Scheduler constructed successfully; budgets contain all LLM task types.
-    # Does not assert exact values -- a sibling test may write a config override
-    # to the shared pytest tmp dir, causing _load_config_overrides to pick it up.
- assert set(s._budgets.keys()) >= LLM_TASK_TYPES + assert s._available_vram == 999.0 -def test_gpu_detection_does_not_affect_local_scheduler(tmp_db, monkeypatch): - """LocalScheduler ignores GPU VRAM — it has no _available_vram attribute. - - VRAM-gated concurrency requires circuitforge_orch (Paid tier). - """ +def test_gpu_vram_summed_across_all_gpus(tmp_db, monkeypatch): + """_available_vram sums vram_total_gb across all detected GPUs.""" fake_gpus = [ {"name": "RTX 3090", "vram_total_gb": 24.0, "vram_free_gb": 20.0}, {"name": "RTX 3090", "vram_total_gb": 24.0, "vram_free_gb": 18.0}, ] monkeypatch.setattr("scripts.task_scheduler._get_gpus", lambda: fake_gpus) s = TaskScheduler(tmp_db, _noop_run_task) - assert not hasattr(s, "_available_vram") + assert s._available_vram == 48.0 def test_enqueue_adds_taskspec_to_deque(tmp_db): @@ -215,37 +206,40 @@ def _make_recording_run_task(log: list, done_event: threading.Event, expected: i return _run -def _start_scheduler(tmp_db, run_task_fn): +def _start_scheduler(tmp_db, run_task_fn, available_vram=999.0): s = TaskScheduler(tmp_db, run_task_fn) + s._available_vram = available_vram s.start() return s # ── Tests ───────────────────────────────────────────────────────────────────── -def test_all_task_types_complete(tmp_db): - """Scheduler runs tasks from multiple types; all complete. - - LocalScheduler runs type batches concurrently (no VRAM gating). - VRAM-gated sequential scheduling requires circuitforge_orch. - """ +def test_deepest_queue_wins_first_slot(tmp_db): + """Type with more queued tasks starts first when VRAM only fits one type.""" log, done = [], threading.Event() + # Build scheduler but DO NOT start it yet — enqueue all tasks first + # so the scheduler sees the full picture on its very first wake. 
run_task_fn = _make_recording_run_task(log, done, 4) s = TaskScheduler(tmp_db, run_task_fn) + s._available_vram = 3.0 # fits cover_letter (2.5) but not company_research (5.0) + # Enqueue cover_letter (3 tasks) and company_research (1 task) before start. + # cover_letter has the deeper queue and must win the first batch slot. for i in range(3): s.enqueue(i + 1, "cover_letter", i + 1, None) s.enqueue(4, "company_research", 4, None) - s.start() + s.start() # scheduler now sees all tasks atomically on its first iteration assert done.wait(timeout=5.0), "timed out — not all 4 tasks completed" s.shutdown() assert len(log) == 4 - cl = [t for _, t in log if t == "cover_letter"] - cr = [t for _, t in log if t == "company_research"] + cl = [i for i, (_, t) in enumerate(log) if t == "cover_letter"] + cr = [i for i, (_, t) in enumerate(log) if t == "company_research"] assert len(cl) == 3 and len(cr) == 1 + assert max(cl) < min(cr), "All cover_letter tasks must finish before company_research starts" def test_fifo_within_type(tmp_db): @@ -262,8 +256,8 @@ def test_fifo_within_type(tmp_db): assert [task_id for task_id, _ in log] == [10, 20, 30] -def test_concurrent_batches_different_types(tmp_db): - """Two type batches run concurrently (LocalScheduler has no VRAM gating).""" +def test_concurrent_batches_when_vram_allows(tmp_db): + """Two type batches start simultaneously when VRAM fits both.""" started = {"cover_letter": threading.Event(), "company_research": threading.Event()} all_done = threading.Event() log = [] @@ -274,7 +268,8 @@ if len(log) >= 2: all_done.set() - s = _start_scheduler(tmp_db, run_task) + # VRAM=10.0 fits both cover_letter (2.5) and company_research (5.0) simultaneously + s = _start_scheduler(tmp_db, run_task, available_vram=10.0) s.enqueue(1, "cover_letter", 1, None) s.enqueue(2, "company_research", 2, None) @@ -312,15 +307,8 @@ def test_new_tasks_picked_up_mid_batch(tmp_db): assert log == [1, 2]
-@pytest.mark.filterwarnings("ignore::pytest.PytestUnhandledThreadExceptionWarning") -def test_worker_crash_does_not_stall_scheduler(tmp_db): - """If _run_task raises, the scheduler continues processing the next task. - - The batch_worker intentionally lets the RuntimeError propagate to the thread - boundary (so LocalScheduler can detect crash vs. normal exit). This produces - a PytestUnhandledThreadExceptionWarning -- suppressed here because it is the - expected behavior under test. - """ +def test_worker_crash_releases_vram(tmp_db): + """If _run_task raises, _reserved_vram returns to 0 and scheduler continues.""" log, done = [], threading.Event() def run_task(db_path, task_id, task_type, job_id, params): @@ -329,15 +317,16 @@ def test_worker_crash_does_not_stall_scheduler(tmp_db): log.append(task_id) done.set() - s = _start_scheduler(tmp_db, run_task) + s = _start_scheduler(tmp_db, run_task, available_vram=3.0) s.enqueue(1, "cover_letter", 1, None) s.enqueue(2, "cover_letter", 2, None) assert done.wait(timeout=5.0), "timed out — task 2 never completed after task 1 crash" s.shutdown() - # Second task still ran despite first crashing + # Second task still ran, VRAM was released assert 2 in log + assert s._reserved_vram == 0.0 def test_get_scheduler_returns_singleton(tmp_db): @@ -481,14 +470,3 @@ def test_llm_tasks_routed_to_scheduler(tmp_db): task_runner.submit_task(tmp_db, "cover_letter", 1) assert "cover_letter" in enqueue_calls - - -def test_shim_exports_unchanged_api(): - """Peregrine shim must re-export LLM_TASK_TYPES, get_scheduler, reset_scheduler.""" - from scripts.task_scheduler import LLM_TASK_TYPES, get_scheduler, reset_scheduler - assert "cover_letter" in LLM_TASK_TYPES - assert "company_research" in LLM_TASK_TYPES - assert "wizard_generate" in LLM_TASK_TYPES - assert "resume_optimize" in LLM_TASK_TYPES - assert callable(get_scheduler) - assert callable(reset_scheduler) diff --git a/tests/test_ui_switcher.py b/tests/test_ui_switcher.py deleted file mode 
100644 index 9c79c83..0000000 --- a/tests/test_ui_switcher.py +++ /dev/null @@ -1,105 +0,0 @@ -"""Tests for app/components/ui_switcher.py. - -Streamlit is not running during tests — mock all st.* calls. -""" -import sys -from pathlib import Path -from unittest.mock import patch -import pytest -import yaml - -sys.path.insert(0, str(Path(__file__).parent.parent)) - - -@pytest.fixture -def profile_yaml(tmp_path): - data = {"name": "Test", "ui_preference": "streamlit", "wizard_complete": True} - p = tmp_path / "user.yaml" - p.write_text(yaml.dump(data)) - return p - - -def test_sync_cookie_injects_vue_js(profile_yaml, monkeypatch): - """When ui_preference is vue, JS sets prgn_ui=vue.""" - import yaml as _yaml - profile_yaml.write_text(_yaml.dump({"name": "T", "ui_preference": "vue"})) - - injected = [] - monkeypatch.setattr("streamlit.components.v1.html", lambda html, height=0: injected.append(html)) - monkeypatch.setattr("streamlit.query_params", {}, raising=False) - - from app.components.ui_switcher import sync_ui_cookie - sync_ui_cookie(profile_yaml, tier="paid") - - assert any("prgn_ui=vue" in s for s in injected) - - -def test_sync_cookie_injects_streamlit_js(profile_yaml, monkeypatch): - """When ui_preference is streamlit, JS sets prgn_ui=streamlit.""" - injected = [] - monkeypatch.setattr("streamlit.components.v1.html", lambda html, height=0: injected.append(html)) - monkeypatch.setattr("streamlit.query_params", {}, raising=False) - - from app.components.ui_switcher import sync_ui_cookie - sync_ui_cookie(profile_yaml, tier="paid") - - assert any("prgn_ui=streamlit" in s for s in injected) - - -def test_sync_cookie_prgn_switch_param_overrides_yaml(profile_yaml, monkeypatch): - """?prgn_switch=streamlit in query params resets ui_preference to streamlit.""" - import yaml as _yaml - profile_yaml.write_text(_yaml.dump({"name": "T", "ui_preference": "vue"})) - - injected = [] - monkeypatch.setattr("streamlit.components.v1.html", lambda html, height=0: 
injected.append(html)) - monkeypatch.setattr("streamlit.query_params", {"prgn_switch": "streamlit"}, raising=False) - - with patch('app.components.ui_switcher._DEMO_MODE', False): - from app.components.ui_switcher import sync_ui_cookie - sync_ui_cookie(profile_yaml, tier="paid") - - # user.yaml should now say streamlit - saved = _yaml.safe_load(profile_yaml.read_text()) - assert saved["ui_preference"] == "streamlit" - # JS should set cookie to streamlit - assert any("prgn_ui=streamlit" in s for s in injected) - - -def test_sync_cookie_free_tier_keeps_vue(profile_yaml, monkeypatch): - """Free-tier user with vue preference keeps vue (vue_ui_beta is free tier). - - Previously this test verified a downgrade to streamlit. Vue SPA was opened - to free tier in issue #20 — the downgrade path no longer triggers. - """ - import yaml as _yaml - profile_yaml.write_text(_yaml.dump({"name": "T", "ui_preference": "vue"})) - - injected = [] - monkeypatch.setattr("streamlit.components.v1.html", lambda html, height=0: injected.append(html)) - monkeypatch.setattr("streamlit.query_params", {}, raising=False) - - with patch('app.components.ui_switcher._DEMO_MODE', False): - from app.components.ui_switcher import sync_ui_cookie - sync_ui_cookie(profile_yaml, tier="free") - - saved = _yaml.safe_load(profile_yaml.read_text()) - assert saved["ui_preference"] == "vue" - assert any("prgn_ui=vue" in s for s in injected) - - -def test_switch_ui_writes_yaml_and_calls_sync(profile_yaml, monkeypatch): - """switch_ui(to='vue') writes user.yaml and calls sync.""" - import yaml as _yaml - synced = [] - monkeypatch.setattr("streamlit.components.v1.html", lambda html, height=0: synced.append(html)) - monkeypatch.setattr("streamlit.query_params", {}, raising=False) - monkeypatch.setattr("streamlit.rerun", lambda: None) - - with patch('app.components.ui_switcher._DEMO_MODE', False): - from app.components.ui_switcher import switch_ui - switch_ui(profile_yaml, to="vue", tier="paid") - - saved = 
_yaml.safe_load(profile_yaml.read_text()) - assert saved["ui_preference"] == "vue" - assert any("prgn_ui=vue" in s for s in synced) diff --git a/tests/test_user_profile.py b/tests/test_user_profile.py index 84c1d72..88c4c88 100644 --- a/tests/test_user_profile.py +++ b/tests/test_user_profile.py @@ -106,34 +106,3 @@ def test_effective_tier_no_override(tmp_path): p.write_text("name: T\nemail: t@t.com\ncareer_summary: x\ntier: paid\n") u = UserProfile(p) assert u.effective_tier == "paid" - -def test_ui_preference_default(tmp_path): - """Fresh profile defaults to streamlit.""" - p = tmp_path / "user.yaml" - p.write_text("name: Test User\n") - profile = UserProfile(p) - assert profile.ui_preference == "streamlit" - -def test_ui_preference_vue(tmp_path): - """Saved vue preference loads correctly.""" - p = tmp_path / "user.yaml" - p.write_text("name: Test\nui_preference: vue\n") - profile = UserProfile(p) - assert profile.ui_preference == "vue" - -def test_ui_preference_roundtrip(tmp_path): - """Saving ui_preference: vue persists and reloads.""" - p = tmp_path / "user.yaml" - p.write_text("name: Test\n") - profile = UserProfile(p) - profile.ui_preference = "vue" - profile.save() - reloaded = UserProfile(p) - assert reloaded.ui_preference == "vue" - -def test_ui_preference_invalid_falls_back(tmp_path): - """Unknown value falls back to streamlit.""" - p = tmp_path / "user.yaml" - p.write_text("name: Test\nui_preference: newui\n") - profile = UserProfile(p) - assert profile.ui_preference == "streamlit" diff --git a/tests/test_wizard_api.py b/tests/test_wizard_api.py deleted file mode 100644 index 3bf30d2..0000000 --- a/tests/test_wizard_api.py +++ /dev/null @@ -1,368 +0,0 @@ -"""Tests for wizard API endpoints (GET/POST /api/wizard/*).""" -import os -import sys -import yaml -import pytest -from pathlib import Path -from unittest.mock import patch, MagicMock -from fastapi.testclient import TestClient - -# ── Path bootstrap 
──────────────────────────────────────────────────────────── -_REPO = Path(__file__).parent.parent -if str(_REPO) not in sys.path: - sys.path.insert(0, str(_REPO)) - - -@pytest.fixture(scope="module") -def client(): - from dev_api import app - return TestClient(app) - - -# ── Helpers ─────────────────────────────────────────────────────────────────── - -def _write_user_yaml(path: Path, data: dict | None = None) -> None: - path.parent.mkdir(parents=True, exist_ok=True) - payload = data if data is not None else {} - path.write_text(yaml.dump(payload, allow_unicode=True, default_flow_style=False)) - - -def _read_user_yaml(path: Path) -> dict: - if not path.exists(): - return {} - return yaml.safe_load(path.read_text()) or {} - - -# ── GET /api/config/app — wizardComplete + isDemo ───────────────────────────── - -class TestAppConfigWizardFields: - def test_wizard_complete_false_when_missing(self, client, tmp_path): - yaml_path = tmp_path / "config" / "user.yaml" - # user.yaml does not exist yet - with patch("dev_api._user_yaml_path", return_value=str(yaml_path)): - r = client.get("/api/config/app") - assert r.status_code == 200 - assert r.json()["wizardComplete"] is False - - def test_wizard_complete_true_when_set(self, client, tmp_path): - yaml_path = tmp_path / "config" / "user.yaml" - _write_user_yaml(yaml_path, {"wizard_complete": True}) - with patch("dev_api._user_yaml_path", return_value=str(yaml_path)): - r = client.get("/api/config/app") - assert r.json()["wizardComplete"] is True - - def test_is_demo_false_by_default(self, client, tmp_path): - yaml_path = tmp_path / "config" / "user.yaml" - _write_user_yaml(yaml_path, {"wizard_complete": True}) - with patch("dev_api._user_yaml_path", return_value=str(yaml_path)): - with patch.dict(os.environ, {"DEMO_MODE": ""}, clear=False): - r = client.get("/api/config/app") - assert r.json()["isDemo"] is False - - def test_is_demo_true_when_env_set(self, client, tmp_path): - yaml_path = tmp_path / "config" / "user.yaml" - 
_write_user_yaml(yaml_path, {"wizard_complete": True}) - with patch("dev_api._user_yaml_path", return_value=str(yaml_path)): - with patch.dict(os.environ, {"DEMO_MODE": "true"}, clear=False): - r = client.get("/api/config/app") - assert r.json()["isDemo"] is True - - -# ── GET /api/wizard/status ──────────────────────────────────────────────────── - -class TestWizardStatus: - def test_returns_not_complete_when_no_yaml(self, client, tmp_path): - yaml_path = tmp_path / "config" / "user.yaml" - with patch("dev_api._wizard_yaml_path", return_value=str(yaml_path)): - r = client.get("/api/wizard/status") - assert r.status_code == 200 - body = r.json() - assert body["wizard_complete"] is False - assert body["wizard_step"] == 0 - - def test_returns_saved_step(self, client, tmp_path): - yaml_path = tmp_path / "config" / "user.yaml" - _write_user_yaml(yaml_path, {"wizard_step": 3, "name": "Alex"}) - with patch("dev_api._wizard_yaml_path", return_value=str(yaml_path)): - r = client.get("/api/wizard/status") - body = r.json() - assert body["wizard_step"] == 3 - assert body["saved_data"]["name"] == "Alex" - - def test_returns_complete_true(self, client, tmp_path): - yaml_path = tmp_path / "config" / "user.yaml" - _write_user_yaml(yaml_path, {"wizard_complete": True}) - with patch("dev_api._wizard_yaml_path", return_value=str(yaml_path)): - r = client.get("/api/wizard/status") - assert r.json()["wizard_complete"] is True - - -# ── GET /api/wizard/hardware ────────────────────────────────────────────────── - -class TestWizardHardware: - def test_returns_profiles_list(self, client): - r = client.get("/api/wizard/hardware") - assert r.status_code == 200 - body = r.json() - assert set(body["profiles"]) == {"remote", "cpu", "single-gpu", "dual-gpu"} - assert "gpus" in body - assert "suggested_profile" in body - - def test_gpu_from_env_var(self, client): - with patch.dict(os.environ, {"PEREGRINE_GPU_NAMES": "RTX 4090,RTX 3080"}, clear=False): - r = client.get("/api/wizard/hardware") - 
body = r.json() - assert body["gpus"] == ["RTX 4090", "RTX 3080"] - assert body["suggested_profile"] == "dual-gpu" - - def test_single_gpu_suggests_single(self, client): - with patch.dict(os.environ, {"PEREGRINE_GPU_NAMES": "RTX 4090"}, clear=False): - with patch.dict(os.environ, {"RECOMMENDED_PROFILE": ""}, clear=False): - r = client.get("/api/wizard/hardware") - assert r.json()["suggested_profile"] == "single-gpu" - - def test_no_gpus_suggests_remote(self, client): - with patch.dict(os.environ, {"PEREGRINE_GPU_NAMES": ""}, clear=False): - with patch.dict(os.environ, {"RECOMMENDED_PROFILE": ""}, clear=False): - with patch("subprocess.check_output", side_effect=FileNotFoundError): - r = client.get("/api/wizard/hardware") - assert r.json()["suggested_profile"] == "remote" - assert r.json()["gpus"] == [] - - def test_recommended_profile_env_takes_priority(self, client): - with patch.dict(os.environ, - {"PEREGRINE_GPU_NAMES": "RTX 4090", "RECOMMENDED_PROFILE": "cpu"}, - clear=False): - r = client.get("/api/wizard/hardware") - assert r.json()["suggested_profile"] == "cpu" - - -# ── POST /api/wizard/step ───────────────────────────────────────────────────── - -class TestWizardStep: - def test_step1_saves_inference_profile(self, client, tmp_path): - yaml_path = tmp_path / "config" / "user.yaml" - _write_user_yaml(yaml_path, {}) - with patch("dev_api._wizard_yaml_path", return_value=str(yaml_path)): - r = client.post("/api/wizard/step", - json={"step": 1, "data": {"inference_profile": "single-gpu"}}) - assert r.status_code == 200 - saved = _read_user_yaml(yaml_path) - assert saved["inference_profile"] == "single-gpu" - assert saved["wizard_step"] == 1 - - def test_step1_rejects_unknown_profile(self, client, tmp_path): - yaml_path = tmp_path / "config" / "user.yaml" - _write_user_yaml(yaml_path, {}) - with patch("dev_api._wizard_yaml_path", return_value=str(yaml_path)): - r = client.post("/api/wizard/step", - json={"step": 1, "data": {"inference_profile": "turbo-gpu"}}) - 
assert r.status_code == 400 - - def test_step2_saves_tier(self, client, tmp_path): - yaml_path = tmp_path / "config" / "user.yaml" - _write_user_yaml(yaml_path, {}) - with patch("dev_api._wizard_yaml_path", return_value=str(yaml_path)): - r = client.post("/api/wizard/step", - json={"step": 2, "data": {"tier": "paid"}}) - assert r.status_code == 200 - assert _read_user_yaml(yaml_path)["tier"] == "paid" - - def test_step2_rejects_unknown_tier(self, client, tmp_path): - yaml_path = tmp_path / "config" / "user.yaml" - _write_user_yaml(yaml_path, {}) - with patch("dev_api._wizard_yaml_path", return_value=str(yaml_path)): - r = client.post("/api/wizard/step", - json={"step": 2, "data": {"tier": "enterprise"}}) - assert r.status_code == 400 - - def test_step3_writes_resume_yaml(self, client, tmp_path): - yaml_path = tmp_path / "config" / "user.yaml" - _write_user_yaml(yaml_path, {}) - resume = {"experience": [{"title": "Engineer", "company": "Acme"}]} - with patch("dev_api._wizard_yaml_path", return_value=str(yaml_path)): - r = client.post("/api/wizard/step", - json={"step": 3, "data": {"resume": resume}}) - assert r.status_code == 200 - resume_path = yaml_path.parent / "plain_text_resume.yaml" - assert resume_path.exists() - saved_resume = yaml.safe_load(resume_path.read_text()) - assert saved_resume["experience"][0]["title"] == "Engineer" - - def test_step4_saves_identity_fields(self, client, tmp_path): - yaml_path = tmp_path / "config" / "user.yaml" - _write_user_yaml(yaml_path, {}) - identity = { - "name": "Alex Rivera", - "email": "alex@example.com", - "phone": "555-1234", - "linkedin": "https://linkedin.com/in/alex", - "career_summary": "Experienced engineer.", - } - with patch("dev_api._wizard_yaml_path", return_value=str(yaml_path)): - r = client.post("/api/wizard/step", json={"step": 4, "data": identity}) - assert r.status_code == 200 - saved = _read_user_yaml(yaml_path) - assert saved["name"] == "Alex Rivera" - assert saved["career_summary"] == "Experienced 
engineer." - assert saved["wizard_step"] == 4 - - def test_step5_writes_env_keys(self, client, tmp_path): - yaml_path = tmp_path / "config" / "user.yaml" - env_path = tmp_path / ".env" - env_path.write_text("SOME_KEY=existing\n") - _write_user_yaml(yaml_path, {}) - # Patch both _wizard_yaml_path and the Path resolution inside wizard_save_step - with patch("dev_api._wizard_yaml_path", return_value=str(yaml_path)): - with patch("dev_api.Path") as mock_path_cls: - # Only intercept the .env path construction; let other Path() calls pass through - real_path = Path - def path_side_effect(*args): - result = real_path(*args) - return result - mock_path_cls.side_effect = path_side_effect - - # Direct approach: monkeypatch the env path - import dev_api as _dev_api - original_fn = _dev_api.wizard_save_step - - # Simpler: just test via the real endpoint, verify env not written if no key given - r = client.post("/api/wizard/step", - json={"step": 5, "data": {"services": {"ollama_host": "localhost"}}}) - assert r.status_code == 200 - - def test_step6_writes_search_profiles(self, client, tmp_path): - yaml_path = tmp_path / "config" / "user.yaml" - search_path = tmp_path / "config" / "search_profiles.yaml" - _write_user_yaml(yaml_path, {}) - with patch("dev_api._wizard_yaml_path", return_value=str(yaml_path)): - with patch("dev_api._search_prefs_path", return_value=search_path): - r = client.post("/api/wizard/step", - json={"step": 6, "data": { - "titles": ["Software Engineer", "Backend Developer"], - "locations": ["Remote", "Austin, TX"], - }}) - assert r.status_code == 200 - assert search_path.exists() - prefs = yaml.safe_load(search_path.read_text()) - assert prefs["default"]["job_titles"] == ["Software Engineer", "Backend Developer"] - assert "Remote" in prefs["default"]["location"] - - def test_step7_only_advances_counter(self, client, tmp_path): - yaml_path = tmp_path / "config" / "user.yaml" - _write_user_yaml(yaml_path, {}) - with patch("dev_api._wizard_yaml_path", 
return_value=str(yaml_path)): - r = client.post("/api/wizard/step", json={"step": 7, "data": {}}) - assert r.status_code == 200 - assert _read_user_yaml(yaml_path)["wizard_step"] == 7 - - def test_invalid_step_number(self, client, tmp_path): - yaml_path = tmp_path / "config" / "user.yaml" - _write_user_yaml(yaml_path, {}) - with patch("dev_api._wizard_yaml_path", return_value=str(yaml_path)): - r = client.post("/api/wizard/step", json={"step": 99, "data": {}}) - assert r.status_code == 400 - - def test_crash_recovery_round_trip(self, client, tmp_path): - """Save steps 1-4 sequentially, then verify status reflects step 4.""" - yaml_path = tmp_path / "config" / "user.yaml" - _write_user_yaml(yaml_path, {}) - steps = [ - (1, {"inference_profile": "cpu"}), - (2, {"tier": "free"}), - (4, {"name": "Alex", "email": "a@b.com", "career_summary": "Eng."}), - ] - with patch("dev_api._wizard_yaml_path", return_value=str(yaml_path)): - for step, data in steps: - r = client.post("/api/wizard/step", json={"step": step, "data": data}) - assert r.status_code == 200 - - r = client.get("/api/wizard/status") - - body = r.json() - assert body["wizard_step"] == 4 - assert body["saved_data"]["name"] == "Alex" - assert body["saved_data"]["inference_profile"] == "cpu" - - -# ── POST /api/wizard/inference/test ────────────────────────────────────────── - -class TestWizardInferenceTest: - def test_local_profile_ollama_running(self, client): - mock_resp = MagicMock() - mock_resp.status_code = 200 - with patch("dev_api.requests.get", return_value=mock_resp): - r = client.post("/api/wizard/inference/test", - json={"profile": "cpu", "ollama_host": "localhost", - "ollama_port": 11434}) - assert r.status_code == 200 - body = r.json() - assert body["ok"] is True - assert "Ollama" in body["message"] - - def test_local_profile_ollama_down_soft_fail(self, client): - import requests as _req - with patch("dev_api.requests.get", side_effect=_req.exceptions.ConnectionError): - r = 
client.post("/api/wizard/inference/test", - json={"profile": "single-gpu"}) - assert r.status_code == 200 - body = r.json() - assert body["ok"] is False - assert "configure" in body["message"].lower() - - def test_remote_profile_llm_responding(self, client): - # LLMRouter is imported inside wizard_test_inference — patch the source module - with patch("scripts.llm_router.LLMRouter") as mock_cls: - mock_cls.return_value.complete.return_value = "OK" - r = client.post("/api/wizard/inference/test", - json={"profile": "remote", "anthropic_key": "sk-ant-test"}) - assert r.status_code == 200 - assert r.json()["ok"] is True - - def test_remote_profile_llm_error(self, client): - with patch("scripts.llm_router.LLMRouter") as mock_cls: - mock_cls.return_value.complete.side_effect = RuntimeError("no key") - r = client.post("/api/wizard/inference/test", - json={"profile": "remote"}) - assert r.status_code == 200 - body = r.json() - assert body["ok"] is False - assert "failed" in body["message"].lower() - - -# ── POST /api/wizard/complete ───────────────────────────────────────────────── - -class TestWizardComplete: - def test_sets_wizard_complete_true(self, client, tmp_path): - yaml_path = tmp_path / "config" / "user.yaml" - _write_user_yaml(yaml_path, {"wizard_step": 6, "name": "Alex"}) - # apply_service_urls is a local import inside wizard_complete — patch source module - with patch("dev_api._wizard_yaml_path", return_value=str(yaml_path)): - with patch("scripts.generate_llm_config.apply_service_urls", - side_effect=Exception("no llm.yaml")): - r = client.post("/api/wizard/complete") - assert r.status_code == 200 - assert r.json()["ok"] is True - saved = _read_user_yaml(yaml_path) - assert saved["wizard_complete"] is True - assert "wizard_step" not in saved - assert saved["name"] == "Alex" # other fields preserved - - def test_complete_removes_wizard_step(self, client, tmp_path): - yaml_path = tmp_path / "config" / "user.yaml" - _write_user_yaml(yaml_path, {"wizard_step": 7, 
"tier": "paid"}) - with patch("dev_api._wizard_yaml_path", return_value=str(yaml_path)): - with patch("scripts.generate_llm_config.apply_service_urls", return_value=None): - client.post("/api/wizard/complete") - saved = _read_user_yaml(yaml_path) - assert "wizard_step" not in saved - assert saved["tier"] == "paid" - - def test_complete_tolerates_missing_llm_yaml(self, client, tmp_path): - yaml_path = tmp_path / "config" / "user.yaml" - _write_user_yaml(yaml_path, {}) - with patch("dev_api._wizard_yaml_path", return_value=str(yaml_path)): - # llm.yaml doesn't exist → apply_service_urls is never called, no error - r = client.post("/api/wizard/complete") - assert r.status_code == 200 - assert r.json()["ok"] is True diff --git a/tests/test_wizard_tiers.py b/tests/test_wizard_tiers.py index a1252c6..325f0b5 100644 --- a/tests/test_wizard_tiers.py +++ b/tests/test_wizard_tiers.py @@ -1,16 +1,12 @@ import sys from pathlib import Path -from unittest.mock import patch - sys.path.insert(0, str(Path(__file__).parent.parent)) from app.wizard.tiers import can_use, tier_label, TIERS, FEATURES, BYOK_UNLOCKABLE def test_tiers_list(): - # Peregrine uses the core tier list; "ultra" is included but no features require it yet - assert TIERS[:3] == ["free", "paid", "premium"] - assert "ultra" in TIERS + assert TIERS == ["free", "paid", "premium"] def test_can_use_free_feature_always(): @@ -116,42 +112,3 @@ def test_byok_false_preserves_original_gating(): # has_byok=False (default) must not change existing behaviour assert can_use("free", "company_research", has_byok=False) is False assert can_use("paid", "company_research", has_byok=False) is True - - -# ── Vue UI Beta & Demo Tier tests ────────────────────────────────────────────── - -def test_vue_ui_beta_free_tier(): - # Vue SPA is open to all tiers (issue #20 — beta restriction removed) - assert can_use("free", "vue_ui_beta") is True - - -def test_vue_ui_beta_paid_tier(): - assert can_use("paid", "vue_ui_beta") is True - - -def 
test_vue_ui_beta_premium_tier(): - assert can_use("premium", "vue_ui_beta") is True - - -def test_can_use_demo_tier_overrides_real_tier(): - # demo_tier="paid" overrides real tier "free" when DEMO_MODE is active - with patch('app.wizard.tiers._DEMO_MODE', True): - assert can_use("free", "company_research", demo_tier="paid") is True - - -def test_can_use_demo_tier_free_restricts(): - # demo_tier="free" restricts access even if real tier is "paid" - with patch('app.wizard.tiers._DEMO_MODE', True): - assert can_use("paid", "model_fine_tuning", demo_tier="free") is False - - -def test_can_use_demo_tier_none_falls_back_to_real(): - # demo_tier=None means no override regardless of DEMO_MODE - with patch('app.wizard.tiers._DEMO_MODE', True): - assert can_use("paid", "company_research", demo_tier=None) is True - - -def test_can_use_demo_tier_does_not_affect_non_demo(): - # When _DEMO_MODE is False, demo_tier is ignored - with patch('app.wizard.tiers._DEMO_MODE', False): - assert can_use("free", "company_research", demo_tier="paid") is False diff --git a/web/public/peregrine.svg b/web/public/peregrine.svg deleted file mode 100644 index 7653d13..0000000 --- a/web/public/peregrine.svg +++ /dev/null @@ -1,165 +0,0 @@ diff --git a/web/src/App.vue b/web/src/App.vue index 28efa08..7bee901 100644 --- a/web/src/App.vue +++ b/web/src/App.vue @@ -1,9 +1,9 @@ @@ -100,14 +94,4 @@ body { padding-bottom: calc(56px + env(safe-area-inset-bottom)); } } - -/* Wizard: full-bleed, no sidebar offset, no tab-bar clearance */ -.app-root--wizard { - display: block; -} - -.app-main--wizard { - margin-left: 0; - padding-bottom: 0; -} diff --git a/web/src/assets/theme.css
b/web/src/assets/theme.css index 6150a0c..4bf7491 100644 --- a/web/src/assets/theme.css +++ b/web/src/assets/theme.css @@ -73,11 +73,11 @@ } /* ── Accessible Solarpunk — dark (system dark mode) ─ - Activates when OS/browser is in dark mode AND no - explicit theme is selected. Explicit [data-theme="*"] - always wins over the system preference. */ + Activates when OS/browser is in dark mode. + Uses :not([data-theme="hacker"]) so the Konami easter + egg always wins over the system preference. */ @media (prefers-color-scheme: dark) { - :root:not([data-theme]) { + :root:not([data-theme="hacker"]) { /* Brand — lighter greens readable on dark surfaces */ --color-primary: #6ab870; --color-primary-hover: #7ecb84; @@ -161,153 +161,6 @@ --color-accent-glow-lg: rgba(0, 255, 65, 0.6); } -/* ── Explicit light — forces light even on dark-OS ─ */ -[data-theme="light"] { - --color-primary: #2d5a27; - --color-primary-hover: #234820; - --color-primary-light: #e8f2e7; - --color-surface: #eaeff8; - --color-surface-alt: #dde4f0; - --color-surface-raised: #f5f7fc; - --color-border: #a8b8d0; - --color-border-light: #ccd5e6; - --color-text: #1a2338; - --color-text-muted: #4a5c7a; - --color-text-inverse: #eaeff8; - --color-accent: #c4732a; - --color-accent-hover: #a85c1f; - --color-accent-light: #fdf0e4; - --color-success: #3a7a32; - --color-error: #c0392b; - --color-warning: #d4891a; - --color-info: #1e6091; - --shadow-sm: 0 1px 3px rgba(26, 35, 56, 0.08), 0 1px 2px rgba(26, 35, 56, 0.04); - --shadow-md: 0 4px 12px rgba(26, 35, 56, 0.1), 0 2px 4px rgba(26, 35, 56, 0.06); - --shadow-lg: 0 10px 30px rgba(26, 35, 56, 0.12), 0 4px 8px rgba(26, 35, 56, 0.06); -} - -/* ── Explicit dark — forces dark even on light-OS ── */ -[data-theme="dark"] { - --color-primary: #6ab870; - --color-primary-hover: #7ecb84; - --color-primary-light: #162616; - --color-surface: #16202e; - --color-surface-alt: #1e2a3a; - --color-surface-raised: #263547; - --color-border: #2d4060; - --color-border-light: #233352; - 
--color-text: #e4eaf5; - --color-text-muted: #8da0bc; - --color-text-inverse: #16202e; - --color-accent: #e8a84a; - --color-accent-hover: #f5bc60; - --color-accent-light: #2d1e0a; - --color-success: #5eb85e; - --color-error: #e05252; - --color-warning: #e8a84a; - --color-info: #4da6e8; - --shadow-sm: 0 1px 3px rgba(0, 0, 0, 0.3), 0 1px 2px rgba(0, 0, 0, 0.2); - --shadow-md: 0 4px 12px rgba(0, 0, 0, 0.35), 0 2px 4px rgba(0, 0, 0, 0.2); - --shadow-lg: 0 10px 30px rgba(0, 0, 0, 0.4), 0 4px 8px rgba(0, 0, 0, 0.2); -} - -/* ── Solarized Dark ──────────────────────────────── */ -/* Ethan Schoonover's Solarized palette (dark variant) */ -[data-theme="solarized-dark"] { - --color-primary: #2aa198; /* cyan — used as primary brand color */ - --color-primary-hover: #35b8ad; - --color-primary-light: #002b36; - - --color-surface: #002b36; /* base03 */ - --color-surface-alt: #073642; /* base02 */ - --color-surface-raised: #0d4352; - - --color-border: #073642; - --color-border-light: #0a4a5a; - - --color-text: #839496; /* base0 */ - --color-text-muted: #657b83; /* base00 */ - --color-text-inverse: #002b36; - - --color-accent: #b58900; /* yellow */ - --color-accent-hover: #cb9f10; - --color-accent-light: #1a1300; - - --color-success: #859900; /* green */ - --color-error: #dc322f; /* red */ - --color-warning: #b58900; /* yellow */ - --color-info: #268bd2; /* blue */ - - --shadow-sm: 0 1px 3px rgba(0, 0, 0, 0.4), 0 1px 2px rgba(0, 0, 0, 0.3); - --shadow-md: 0 4px 12px rgba(0, 0, 0, 0.45), 0 2px 4px rgba(0, 0, 0, 0.3); - --shadow-lg: 0 10px 30px rgba(0, 0, 0, 0.5), 0 4px 8px rgba(0, 0, 0, 0.3); -} - -/* ── Solarized Light ─────────────────────────────── */ -[data-theme="solarized-light"] { - --color-primary: #2aa198; /* cyan */ - --color-primary-hover: #1e8a82; - --color-primary-light: #eee8d5; - - --color-surface: #fdf6e3; /* base3 */ - --color-surface-alt: #eee8d5; /* base2 */ - --color-surface-raised: #fffdf7; - - --color-border: #d3c9b0; - --color-border-light: #e4dacc; - - 
--color-text: #657b83; /* base00 */ - --color-text-muted: #839496; /* base0 */ - --color-text-inverse: #fdf6e3; - - --color-accent: #b58900; /* yellow */ - --color-accent-hover: #9a7300; - --color-accent-light: #fdf0c0; - - --color-success: #859900; /* green */ - --color-error: #dc322f; /* red */ - --color-warning: #b58900; /* yellow */ - --color-info: #268bd2; /* blue */ - - --shadow-sm: 0 1px 3px rgba(101, 123, 131, 0.12), 0 1px 2px rgba(101, 123, 131, 0.08); - --shadow-md: 0 4px 12px rgba(101, 123, 131, 0.15), 0 2px 4px rgba(101, 123, 131, 0.08); - --shadow-lg: 0 10px 30px rgba(101, 123, 131, 0.18), 0 4px 8px rgba(101, 123, 131, 0.08); -} - -/* ── Colorblind-safe (deuteranopia/protanopia) ────── */ -/* Avoids red/green confusion. Uses blue+orange as the - primary pair; cyan+magenta as semantic differentiators. - Based on Wong (2011) 8-color colorblind-safe palette. */ -[data-theme="colorblind"] { - --color-primary: #0072B2; /* blue — safe primary */ - --color-primary-hover: #005a8e; - --color-primary-light: #e0f0fa; - - --color-surface: #f4f6fb; - --color-surface-alt: #e6eaf4; - --color-surface-raised: #fafbfe; - - --color-border: #b0bcd8; - --color-border-light: #cdd5e8; - - --color-text: #1a2338; - --color-text-muted: #4a5c7a; - --color-text-inverse: #f4f6fb; - - --color-accent: #E69F00; /* orange — safe secondary */ - --color-accent-hover: #c98900; - --color-accent-light: #fdf4dc; - - --color-success: #009E73; /* teal-green — distinct from red/green confusion zone */ - --color-error: #CC0066; /* magenta-red — distinguishable from green */ - --color-warning: #E69F00; /* orange */ - --color-info: #56B4E9; /* sky blue */ - - --shadow-sm: 0 1px 3px rgba(26, 35, 56, 0.08), 0 1px 2px rgba(26, 35, 56, 0.04); - --shadow-md: 0 4px 12px rgba(26, 35, 56, 0.1), 0 2px 4px rgba(26, 35, 56, 0.06); - --shadow-lg: 0 10px 30px rgba(26, 35, 56, 0.12), 0 4px 8px rgba(26, 35, 56, 0.06); -} - /* ── Base resets ─────────────────────────────────── */ *, *::before, *::after { 
box-sizing: border-box; } diff --git a/web/src/components/AppNav.vue b/web/src/components/AppNav.vue index 04b102d..cd4af21 100644 --- a/web/src/components/AppNav.vue +++ b/web/src/components/AppNav.vue @@ -34,31 +34,12 @@ - - - @@ -95,10 +76,7 @@ import { } from '@heroicons/vue/24/outline' import { useDigestStore } from '../stores/digest' -import { useTheme, THEME_OPTIONS, type Theme } from '../composables/useTheme' - const digestStore = useDigestStore() -const { currentTheme, setTheme, restoreTheme } = useTheme() // Logo click easter egg — 9.6: Click the Bird 5× rapidly const logoClickCount = ref(0) @@ -123,25 +101,8 @@ const isHackerMode = computed(() => ) function exitHackerMode() { + delete document.documentElement.dataset.theme localStorage.removeItem('cf-hacker-mode') - restoreTheme() -} - -const _apiBase = import.meta.env.BASE_URL.replace(/\/$/, '') - -async function switchToClassic() { - // Persist preference via API so Streamlit reads streamlit from user.yaml - // and won't re-set the cookie back to vue (avoids the ?prgn_switch rerun cycle) - try { - await fetch(_apiBase + '/api/settings/ui-preference', { - method: 'POST', - headers: { 'Content-Type': 'application/json' }, - body: JSON.stringify({ preference: 'streamlit' }), - }) - } catch { /* non-fatal — cookie below is enough for immediate redirect */ } - document.cookie = 'prgn_ui=streamlit; path=/; SameSite=Lax' - // Navigate to root (no query params) — Caddy routes to Streamlit based on cookie - window.location.href = window.location.origin + '/' } const navLinks = computed(() => [ @@ -311,70 +272,6 @@ const mobileLinks = [ margin: 0; } -.sidebar__classic-btn { - display: flex; - align-items: center; - width: 100%; - padding: var(--space-2) var(--space-3); - margin-top: var(--space-1); - background: none; - border: none; - border-radius: var(--radius-md); - color: var(--color-text-muted); - font-size: var(--text-xs); - font-weight: 500; - cursor: pointer; - opacity: 0.6; - transition: opacity 150ms, 
background 150ms; - white-space: nowrap; -} - -.sidebar__classic-btn:hover { - opacity: 1; - background: var(--color-surface-alt); -} - -/* ── Theme picker ───────────────────────────────────── */ -.sidebar__theme { - padding: var(--space-2) var(--space-3); - border-top: 1px solid var(--color-border-light); - display: flex; - flex-direction: column; - gap: var(--space-1); -} - -.sidebar__theme-label { - font-size: var(--text-xs); - color: var(--color-text-muted); - font-weight: 500; - text-transform: uppercase; - letter-spacing: 0.05em; -} - -.sidebar__theme-select { - width: 100%; - padding: var(--space-2) var(--space-3); - background: var(--color-surface-alt); - border: 1px solid var(--color-border); - border-radius: var(--radius-md); - color: var(--color-text); - font-size: var(--text-sm); - font-family: var(--font-body); - cursor: pointer; - appearance: auto; - transition: border-color 150ms ease, background 150ms ease; -} - -.sidebar__theme-select:hover { - border-color: var(--color-primary); - background: var(--color-surface-raised); -} - -.sidebar__theme-select:focus-visible { - outline: 2px solid var(--color-accent); - outline-offset: 2px; -} - /* ── Mobile tab bar (<1024px) ───────────────────────── */ .app-tabbar { display: none; /* hidden on desktop */ diff --git a/web/src/components/ApplyWorkspace.vue b/web/src/components/ApplyWorkspace.vue index a8e3a8e..c21d6ae 100644 --- a/web/src/components/ApplyWorkspace.vue +++ b/web/src/components/ApplyWorkspace.vue @@ -10,7 +10,7 @@ @@ -191,64 +143,6 @@ ↺ Regenerate - - - - -
- No questions yet — add one below to get LLM-suggested answers.
- {{ item.question }}