Commit graph

81 commits

Author SHA1 Message Date
71480d630a refactor: use _get_db() pattern in get_research_brief, fix HTTPException style
- Replace lazy import + scripts.db.get_research with inline SQL via _get_db(),
  matching the pattern used by research_task_status and get_job_contacts
- Exclude raw_output from SELECT instead of post-fetch pop
- Change HTTPException in generate_research to positional-arg style
- Update test_get_research_found/not_found to patch dev_api._get_db
2026-03-20 18:32:02 -07:00
a29cc7b7d3 feat: add research and contacts endpoints for interview prep 2026-03-20 18:18:39 -07:00
347c171e26 fix: prefer HTML body in imap_sync, strip head/style/script, remove 4000-char truncation
- _parse_message now prefers text/html over text/plain so digest emails
  retain href attribute values needed for link extraction
- Strip <head>, <style>, <script> blocks before storing to remove CSS/JS
  garbage while keeping anchor tags intact
- Remove [:4000] truncation — digest emails need full body for URL regex
- Update test: large body should NOT be truncated (assert len == 10_000)
2026-03-20 13:35:30 -07:00
1b2643675d feat: add queue-jobs and delete digest endpoints 2026-03-20 07:44:19 -07:00
5bb3674fea fix: guard extract_digest_links db.close(), remove domain-in-path false positive, add hint assertion 2026-03-20 07:04:24 -07:00
182ab789df feat: add /extract-links endpoint with URL scoring 2026-03-20 06:59:26 -07:00
a503ecde3b feat: add GET/POST /api/digest-queue endpoints 2026-03-20 02:51:17 -07:00
34494db8d8 feat(signals): strip HTML and normalize whitespace from email bodies 2026-03-19 19:59:59 -07:00
1d943ed8a3 feat(signals): add body/from_addr to signal query; add reclassify endpoint 2026-03-19 19:14:11 -07:00
bc8174271e feat(interviews): add stage signals, email sync, and dismiss endpoints to dev-api 2026-03-19 16:17:22 -07:00
bf1dc39f14 fix(tests): update mock from inner_text() to text_content() in e2e helpers
get_page_errors() was switched to text_content() to capture errors in
CSS-hidden elements (collapsed Streamlit expanders). Two unit test mocks
still stubbed inner_text() — causing CI failures because MagicMock()
returned a non-string from text_content(), breaking the "boom" in message
content assertion.
2026-03-17 20:33:55 -07:00
167fa8d84a fix(e2e): cloud auth via cookie, local port, Playwright WebSocket gotcha
E2E harness fixes to get all three modes (demo/cloud/local) passing:

- conftest.py: use ctx.add_cookies() for cloud auth instead of
  ctx.route() or set_extra_http_headers(). Playwright's route() only
  intercepts HTTP; set_extra_http_headers() explicitly excludes
  WebSocket handshakes. Streamlit reads st.context.headers from the
  WebSocket upgrade, so cookies are the only vehicle that reaches it
  without a reverse proxy.

- cloud_session.py: fall back to Cookie header when X-CF-Session is
  absent — supports direct access (E2E tests, dev without Caddy).
  In production Caddy sets X-CF-Session; in tests the cf_session cookie
  is set on the browser context and arrives in the Cookie header.

- modes/cloud.py: add /peregrine base URL path (STREAMLIT_SERVER_BASE_URL_PATH=peregrine)

- modes/local.py: correct port from 8502 → 8501 and add /peregrine path

All three modes now pass smoke + interaction tests clean.
2026-03-17 20:01:42 -07:00
0758b70306 feat(e2e): add smoke + interaction tests; fix two demo mode errors
- Add tests/e2e/test_smoke.py: page-load error check for all pages
- Add tests/e2e/test_interactions.py: click every interactable, diff
  errors, XFAIL expected demo failures, flag regressions as XPASS
- Fix conftest get_page_errors() to use text_content() instead of
  inner_text() so errors inside collapsed expanders are captured with
  their actual message text (inner_text respects CSS display:none)
- Fix tests/e2e/modes/demo.py base_url to include /peregrine path prefix
  (STREAMLIT_SERVER_BASE_URL_PATH=peregrine set in demo container)

App fixes surfaced by the harness:
- task_runner.py: add DEMO_MODE guard for discovery task — previously
  crashed with FileNotFoundError on search_profiles.yaml before any
  demo guard could fire; now returns friendly error immediately
- 6_Interview_Prep.py: stop auto-triggering LLM session on page load
  in demo mode; show "AI features disabled" info instead, preventing
  a silent st.error() inside the collapsed Practice Q&A expander

Both smoke and interaction tests now pass clean against demo mode.
2026-03-17 07:00:54 -07:00
c746acd89f feat(e2e): add conftest with Streamlit helpers, browser fixtures, console filter 2026-03-16 23:14:24 -07:00
4844c55292 feat(e2e): add BasePage and 7 page objects
BasePage provides navigation, error capture, and interactable discovery
with fnmatch-based expected_failure matching. SettingsPage extends it
with tab-aware discovery. All conftest imports are deferred to method
bodies so the module loads without a live browser fixture.
2026-03-16 23:14:20 -07:00
3be63f4a81 feat(e2e): add mode configs (demo/cloud/local) with Directus JWT auth 2026-03-16 23:07:34 -07:00
4d58d33567 feat(e2e): add ErrorRecord, ModeConfig, diff_errors models with tests 2026-03-16 23:06:02 -07:00
0317b9582a chore(e2e): scaffold E2E harness directory and install deps
Add pytest-playwright and pytest-json-report to requirements.txt; create
tests/e2e/ skeleton (modes/, pages/, results/) with __init__.py files and
.gitkeep; add results subdirs to .gitignore.
2026-03-16 22:58:47 -07:00
37d151725e feat: push interview events to connected calendar integrations (#19)
Implements idempotent calendar push for Apple Calendar (CalDAV) and
Google Calendar from the Interviews kanban.

- db: add calendar_event_id column (migration) + set_calendar_event_id helper
- integrations/apple_calendar: create_event / update_event via caldav + icalendar
- integrations/google_calendar: create_event / update_event via google-api-python-client;
  test() now makes a real API call instead of checking file existence
- scripts/calendar_push: orchestrates push/update, builds event title from stage +
  job title + company, attaches job URL and company brief to description,
  defaults to noon UTC / 1hr duration
- app/pages/5_Interviews: "Add to Calendar" / "Update Calendar" button shown
  when interview date is set and a calendar integration is configured
- environment.yml: pin caldav, icalendar, google-api-python-client, google-auth
- tests/test_calendar_push: 9 tests covering create, update, error handling,
  event timing, idempotency, and missing job/date guards
2026-03-16 21:31:22 -07:00
9c36c578ef feat: add Jobgether recruiter framing to cover letter generation
When source == "jobgether", build_prompt() injects a recruiter context
note directing the LLM to address the Jobgether recruiter using
"Your client [at {company}] will appreciate..." framing rather than
addressing the employer directly. generate() and task_runner both
thread the is_jobgether flag through automatically.
2026-03-15 09:45:51 -07:00
b3893e9ad9 feat: add Jobgether URL detection and scraper to scrape_url.py 2026-03-15 09:45:50 -07:00
ee054408ea feat: filter Jobgether listings via blocklist 2026-03-15 09:45:50 -07:00
690a1ccf93 feat(task_runner): route LLM tasks through scheduler in submit_task()
Replaces the spawn-per-task model for LLM task types with scheduler
routing: cover_letter, company_research, and wizard_generate are now
enqueued via the TaskScheduler singleton for VRAM-aware batching.
Non-LLM tasks (discovery, email_sync, etc.) continue to spawn daemon
threads directly. Adds autouse clean_scheduler fixture to
test_task_runner.py to prevent singleton cross-test contamination.
2026-03-15 04:52:42 -07:00
3e3c6f1fc5 feat(scheduler): add durability — re-queue surviving LLM tasks on startup 2026-03-15 04:24:11 -07:00
9b96c45b63 feat(scheduler): implement thread-safe singleton get_scheduler/reset_scheduler 2026-03-15 04:19:23 -07:00
a53a03d593 feat(scheduler): implement scheduler loop and batch worker with VRAM-aware scheduling 2026-03-15 04:14:56 -07:00
68d257d278 feat(scheduler): implement enqueue() with depth guard and ghost-row cleanup 2026-03-15 04:05:22 -07:00
415e98d401 feat(scheduler): implement TaskScheduler.__init__ with budget loading and VRAM detection 2026-03-15 03:32:11 -07:00
1616858729 refactor(tests): remove unused imports from test_task_scheduler 2026-03-15 03:27:17 -07:00
376e028af5 feat(db): add reset_running_tasks() for durable scheduler restart 2026-03-15 03:22:45 -07:00
00f0eb4807 feat(linkedin): add staging file parser with re-parse support 2026-03-13 10:18:01 -07:00
e937094884 fix(linkedin): improve scraper error handling, current-job date range, add missing tests 2026-03-13 06:02:03 -07:00
f64ecf81e0 feat(linkedin): add scraper (Playwright + export zip) with URL validation 2026-03-13 01:06:39 -07:00
a43e29e50d feat(linkedin): add HTML parser utils with fixture tests 2026-03-13 01:01:05 -07:00
04c4efd3e0 fix(cloud): extract cf_session cookie by name from X-CF-Session header 2026-03-10 09:22:08 -07:00
7a698496f9 feat(cloud): fix backup/restore for cloud mode — SQLCipher encrypt/decrypt
T13: Three fixes:
1. backup.py: _decrypt_db_to_bytes() decrypts SQLCipher DB before archiving
   so the zip is portable to any local Docker install (plain SQLite).
2. backup.py: _encrypt_db_from_bytes() re-encrypts on restore in cloud mode
   so the app can open the restored DB normally.
3. 2_Settings.py: _base_dir uses get_db_path().parent in cloud mode (user's
   per-tenant data dir) instead of the hardcoded app root; db_key wired
   through both create_backup() and restore_backup() calls.

6 new cloud backup tests + 2 unit tests for SQLCipher helpers (pysqlcipher3
mocked — not available in the local conda test env). 419/419 total passing.
2026-03-09 22:41:44 -07:00
0e3abb5e63 feat(cloud): add compose.cloud.yml and telemetry consent middleware
T8: compose.cloud.yml — multi-tenant cloud stack on port 8505, CLOUD_MODE=true,
per-user encrypted data at /devl/menagerie-data, joins caddy-proxy_caddy-internal
network; .env.example extended with five cloud-only env vars.

T10: app/telemetry.py — log_usage_event() is the ONLY entry point to usage_events
table; hard kill switch (all_disabled) checked before any DB write; complete no-op
in local mode; swallows all exceptions so telemetry never crashes the app;
psycopg2-binary added to requirements.txt. Event calls wired into 4_Apply.py at
cover_letter_generated and job_applied. 5 tests, 413/413 total passing.
2026-03-09 22:10:18 -07:00
96715bdeb6 feat(peregrine): add cloud_session middleware + SQLCipher get_connection()
cloud_session.py: no-op in local mode; in cloud mode resolves Directus JWT
from X-CF-Session header to per-user db_path in st.session_state.

get_connection() in scripts/db.py: transparent SQLCipher/sqlite3 switch —
uses encrypted driver when CLOUD_MODE=true and key provided, vanilla sqlite3
otherwise. libsqlcipher-dev added to Dockerfile for Docker builds.

6 new cloud_session tests + 1 new get_connection test — 34/34 db tests pass.
2026-03-09 19:43:42 -07:00
ce760200ed test: anonymize real personal data — use fictional Alex Rivera throughout test suite 2026-03-06 15:35:04 -08:00
f60ac07541 test: add missing base_url edge case + clarify 0.0.0.0 marker intent
Document defensive behavior: openai_compat with no base_url returns True
(cloud) because unknown destination is assumed cloud. Add explanatory
comment to LOCAL_URL_MARKERS for the 0.0.0.0 bind-address case.
2026-03-06 14:43:45 -08:00
47d8317d56 feat: byok_guard — cloud backend detection with full test coverage 2026-03-06 14:40:06 -08:00
ce8d5a4ac0 feat: add suggest_resume_keywords for skills/domains/keywords gap analysis
Replaces NotImplementedError stub with full LLM-backed implementation.
Builds a prompt from the last 3 resume positions plus already-selected
skills/domains/keywords, calls LLMRouter, and returns de-duped suggestions
in all three categories.
2026-03-05 15:00:53 -08:00
b841ac5418 feat: add suggest_search_terms with three-angle exclude analysis
Replaces NotImplementedError stub with a real LLMRouter-backed implementation
that builds a structured prompt covering blocklist alias expansion, values
misalignment, and role-type filtering, then parses the JSON response into
suggested_titles and suggested_excludes lists.

Moves LLMRouter import to module level so tests can patch it at
scripts.suggest_helpers.LLMRouter.
2026-03-05 13:15:25 -08:00
d56c44224f feat: backup/restore script with multi-instance and legacy support
- create_backup() / restore_backup() / list_backup_contents() public API
- --base-dir PATH flag: targets any instance root (default: this repo)
  --base-dir /devl/job-seeker backs up the legacy Conda install
- _DB_CANDIDATES fallback: data/staging.db (Peregrine) or staging.db root (legacy)
- Manifest records source label (dir name), source_path, created_at, files, includes_db
- Added config/resume_keywords.yaml and config/server.yaml to backup lists
- 21 tests covering create, list, restore, legacy DB path, overwrite, roundtrip
2026-03-04 10:52:51 -08:00
582f2422ff fix: lazy-import playwright in screenshot_page, fix SQLite connection leak in collect_listings 2026-03-03 12:45:39 -08:00
260be9e821 feat: feedback_api — screenshot_page with Playwright (graceful fallback) 2026-03-03 12:14:33 -08:00
b77bb754af feat: feedback_api — Forgejo label management + issue filing + attachment upload 2026-03-03 12:09:11 -08:00
1940cfb131 feat: feedback_api — build_issue_body 2026-03-03 12:00:01 -08:00
6764ad4288 feat: feedback_api — collect_logs + collect_listings 2026-03-03 11:56:35 -08:00
7f46d7fadf feat: feedback_api — mask_pii + collect_context 2026-03-03 11:43:35 -08:00