docs: add Jobgether integration implementation plan
This commit is contained in:
parent
952b21377f
commit
fc6ef88a05
1 changed files with 700 additions and 0 deletions
700
docs/superpowers/plans/2026-03-15-jobgether-integration.md
Normal file
700
docs/superpowers/plans/2026-03-15-jobgether-integration.md
Normal file
|
|
@ -0,0 +1,700 @@
|
|||
# Jobgether Integration Implementation Plan
|
||||
|
||||
> **For agentic workers:** REQUIRED: Use superpowers:subagent-driven-development (if subagents available) or superpowers:executing-plans to implement this plan. Steps use checkbox (`- [ ]`) syntax for tracking.
|
||||
|
||||
**Goal:** Filter Jobgether listings out of all other scrapers, add a dedicated Jobgether scraper and URL scraper (Playwright-based), and add recruiter-aware cover letter framing for Jobgether jobs.
|
||||
|
||||
**Architecture:** Blocklist config handles filtering with zero code changes. A new `_scrape_jobgether()` in `scrape_url.py` handles manual URL imports via Playwright with URL slug fallback. A new `scripts/custom_boards/jobgether.py` handles discovery. Cover letter framing is an `is_jobgether` flag threaded from `task_runner.py` → `generate()` → `build_prompt()`.
|
||||
|
||||
**Tech Stack:** Python, Playwright (already installed), SQLite, PyTest, YAML config
|
||||
|
||||
**Spec:** `/Library/Development/CircuitForge/peregrine/docs/superpowers/specs/2026-03-15-jobgether-integration-design.md`
|
||||
|
||||
---
|
||||
|
||||
## Worktree Setup
|
||||
|
||||
- [ ] **Create worktree for this feature**
|
||||
|
||||
```bash
|
||||
cd /Library/Development/CircuitForge/peregrine
|
||||
git worktree add .worktrees/jobgether-integration -b feature/jobgether-integration
|
||||
```
|
||||
|
||||
All implementation work happens in `/Library/Development/CircuitForge/peregrine/.worktrees/jobgether-integration/`.
|
||||
|
||||
---
|
||||
|
||||
## Chunk 1: Blocklist filter + scrape_url.py
|
||||
|
||||
### Task 1: Add Jobgether to blocklist
|
||||
|
||||
**Files:**
|
||||
- Modify: `/Library/Development/CircuitForge/peregrine/config/blocklist.yaml`
|
||||
|
||||
- [ ] **Step 1: Edit blocklist.yaml**
|
||||
|
||||
```yaml
|
||||
companies:
|
||||
- jobgether
|
||||
```
|
||||
|
||||
- [ ] **Step 2: Verify the existing `_is_blocklisted` test passes (or write one)**
|
||||
|
||||
Check `/Library/Development/CircuitForge/peregrine/tests/test_discover.py` for existing blocklist tests. If none cover company matching, add:
|
||||
|
||||
```python
|
||||
def test_is_blocklisted_jobgether():
|
||||
from scripts.discover import _is_blocklisted
|
||||
blocklist = {"companies": ["jobgether"], "industries": [], "locations": []}
|
||||
assert _is_blocklisted({"company": "Jobgether", "location": "", "description": ""}, blocklist)
|
||||
assert _is_blocklisted({"company": "jobgether inc", "location": "", "description": ""}, blocklist)
|
||||
assert not _is_blocklisted({"company": "Acme Corp", "location": "", "description": ""}, blocklist)
|
||||
```
|
||||
|
||||
Run: `conda run -n job-seeker python -m pytest tests/test_discover.py -v -k "blocklist"`
|
||||
Expected: PASS
|
||||
|
||||
- [ ] **Step 3: Commit**
|
||||
|
||||
```bash
|
||||
git add config/blocklist.yaml tests/test_discover.py
|
||||
git commit -m "feat: filter Jobgether listings via blocklist"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Task 2: Add Jobgether detection to scrape_url.py
|
||||
|
||||
**Files:**
|
||||
- Modify: `/Library/Development/CircuitForge/peregrine/scripts/scrape_url.py`
|
||||
- Modify: `/Library/Development/CircuitForge/peregrine/tests/test_scrape_url.py`
|
||||
|
||||
- [ ] **Step 1: Write failing tests**
|
||||
|
||||
In `/Library/Development/CircuitForge/peregrine/tests/test_scrape_url.py`, add:
|
||||
|
||||
```python
|
||||
def test_detect_board_jobgether():
|
||||
from scripts.scrape_url import _detect_board
|
||||
assert _detect_board("https://jobgether.com/offer/69b42d9d24d79271ee0618e8-csm---resware") == "jobgether"
|
||||
assert _detect_board("https://www.jobgether.com/offer/abc-role---company") == "jobgether"
|
||||
|
||||
|
||||
def test_jobgether_slug_company_extraction():
|
||||
from scripts.scrape_url import _company_from_jobgether_url
|
||||
assert _company_from_jobgether_url(
|
||||
"https://jobgether.com/offer/69b42d9d24d79271ee0618e8-customer-success-manager---resware"
|
||||
) == "Resware"
|
||||
assert _company_from_jobgether_url(
|
||||
"https://jobgether.com/offer/abc123-director-of-cs---acme-corp"
|
||||
) == "Acme Corp"
|
||||
assert _company_from_jobgether_url(
|
||||
"https://jobgether.com/offer/abc123-no-separator-here"
|
||||
) == ""
|
||||
|
||||
|
||||
def test_scrape_jobgether_no_playwright(tmp_path):
|
||||
"""When Playwright is unavailable, _scrape_jobgether falls back to URL slug for company."""
|
||||
# Patch playwright.sync_api to None in sys.modules so the local import inside
|
||||
# _scrape_jobgether raises ImportError at call time (local imports run at call time,
|
||||
# not at module load time — so no reload needed).
|
||||
import sys
|
||||
import unittest.mock as mock
|
||||
|
||||
url = "https://jobgether.com/offer/69b42d9d24d79271ee0618e8-customer-success-manager---resware"
|
||||
with mock.patch.dict(sys.modules, {"playwright": None, "playwright.sync_api": None}):
|
||||
from scripts.scrape_url import _scrape_jobgether
|
||||
result = _scrape_jobgether(url)
|
||||
|
||||
assert result.get("company") == "Resware"
|
||||
assert result.get("source") == "jobgether"
|
||||
```
|
||||
|
||||
Run: `conda run -n job-seeker python -m pytest tests/test_scrape_url.py::test_detect_board_jobgether tests/test_scrape_url.py::test_jobgether_slug_company_extraction tests/test_scrape_url.py::test_scrape_jobgether_no_playwright -v`
|
||||
Expected: FAIL (functions not yet defined)
|
||||
|
||||
- [ ] **Step 2: Add `_company_from_jobgether_url()` to scrape_url.py**
|
||||
|
||||
Add after the `_STRIP_PARAMS` block (around line 34):
|
||||
|
||||
```python
|
||||
def _company_from_jobgether_url(url: str) -> str:
|
||||
"""Extract company name from Jobgether offer URL slug.
|
||||
|
||||
Slug format: /offer/{24-hex-hash}-{title-slug}---{company-slug}
|
||||
Triple-dash separator delimits title from company.
|
||||
Returns title-cased company name, or "" if pattern not found.
|
||||
"""
|
||||
m = re.search(r"---([^/?]+)$", urlparse(url).path)
|
||||
if not m:
|
||||
print(f"[scrape_url] Jobgether URL slug: no company separator found in {url}")
|
||||
return ""
|
||||
return m.group(1).replace("-", " ").title()
|
||||
```
|
||||
|
||||
- [ ] **Step 3: Add `"jobgether"` branch to `_detect_board()`**
|
||||
|
||||
In `/Library/Development/CircuitForge/peregrine/scripts/scrape_url.py`, modify `_detect_board()` (add before `return "generic"`):
|
||||
|
||||
```python
|
||||
if "jobgether.com" in url_lower:
|
||||
return "jobgether"
|
||||
```
|
||||
|
||||
- [ ] **Step 4: Add `_scrape_jobgether()` function**
|
||||
|
||||
Add after `_scrape_glassdoor()` (around line 137):
|
||||
|
||||
```python
|
||||
def _scrape_jobgether(url: str) -> dict:
|
||||
"""Scrape a Jobgether offer page using Playwright to bypass 403.
|
||||
|
||||
Falls back to URL slug for company name when Playwright is unavailable.
|
||||
Does not use requests — no raise_for_status().
|
||||
"""
|
||||
try:
|
||||
from playwright.sync_api import sync_playwright
|
||||
except ImportError:
|
||||
company = _company_from_jobgether_url(url)
|
||||
if company:
|
||||
print(f"[scrape_url] Jobgether: Playwright not installed, using slug fallback → {company}")
|
||||
return {"company": company, "source": "jobgether"} if company else {}
|
||||
|
||||
try:
|
||||
with sync_playwright() as p:
|
||||
browser = p.chromium.launch(headless=True)
|
||||
try:
|
||||
ctx = browser.new_context(user_agent=_HEADERS["User-Agent"])
|
||||
page = ctx.new_page()
|
||||
page.goto(url, timeout=30_000)
|
||||
page.wait_for_load_state("networkidle", timeout=20_000)
|
||||
|
||||
result = page.evaluate("""() => {
|
||||
const title = document.querySelector('h1')?.textContent?.trim() || '';
|
||||
const company = document.querySelector('[class*="company"], [class*="employer"], [data-testid*="company"]')
|
||||
?.textContent?.trim() || '';
|
||||
const location = document.querySelector('[class*="location"], [data-testid*="location"]')
|
||||
?.textContent?.trim() || '';
|
||||
const desc = document.querySelector('[class*="description"], [class*="job-desc"], article')
|
||||
?.innerText?.trim() || '';
|
||||
return { title, company, location, description: desc };
|
||||
}""")
|
||||
finally:
|
||||
browser.close()
|
||||
|
||||
# Fall back to slug for company if DOM extraction missed it
|
||||
if not result.get("company"):
|
||||
result["company"] = _company_from_jobgether_url(url)
|
||||
|
||||
result["source"] = "jobgether"
|
||||
return {k: v for k, v in result.items() if v}
|
||||
|
||||
except Exception as exc:
|
||||
print(f"[scrape_url] Jobgether Playwright error for {url}: {exc}")
|
||||
# Last resort: slug fallback
|
||||
company = _company_from_jobgether_url(url)
|
||||
return {"company": company, "source": "jobgether"} if company else {}
|
||||
```
|
||||
|
||||
> ⚠️ **The CSS selectors in the `page.evaluate()` call are placeholders.** Before committing, inspect `https://jobgether.com/offer/` in a browser to find the actual class names for title, company, location, and description. Update the selectors accordingly.
|
||||
|
||||
- [ ] **Step 5: Add dispatch branch in `scrape_job_url()`**
|
||||
|
||||
In the `if board == "linkedin":` dispatch chain (around line 208), add before the `else`:
|
||||
|
||||
```python
|
||||
elif board == "jobgether":
|
||||
fields = _scrape_jobgether(url)
|
||||
```
|
||||
|
||||
- [ ] **Step 6: Run tests to verify they pass**
|
||||
|
||||
Run: `conda run -n job-seeker python -m pytest tests/test_scrape_url.py -v`
|
||||
Expected: All PASS (including pre-existing tests)
|
||||
|
||||
- [ ] **Step 7: Commit**
|
||||
|
||||
```bash
|
||||
git add scripts/scrape_url.py tests/test_scrape_url.py
|
||||
git commit -m "feat: add Jobgether URL detection and scraper to scrape_url.py"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Chunk 2: Jobgether custom board scraper
|
||||
|
||||
> ⚠️ **Pre-condition:** Before writing the scraper, inspect `https://jobgether.com/remote-jobs` live to determine the actual URL/filter param format and DOM card selectors. Use the Playwright MCP browser tool or Chrome devtools. Record: (1) the query param for job title search, (2) the job card CSS selectors for title, company, URL, location, salary.
|
||||
|
||||
### Task 3: Inspect Jobgether search live
|
||||
|
||||
**Files:** None (research step)
|
||||
|
||||
- [ ] **Step 1: Navigate to Jobgether remote jobs and inspect search params**
|
||||
|
||||
Using browser devtools or Playwright network capture, navigate to `https://jobgether.com/remote-jobs`, search for "Customer Success Manager", and capture:
|
||||
- The resulting URL (query params)
|
||||
- Network requests (XHR/fetch) if the page uses API calls
|
||||
- CSS selectors for job card elements
|
||||
|
||||
Record findings here before proceeding.
|
||||
|
||||
- [ ] **Step 2: Test a Playwright page.evaluate() extraction manually**
|
||||
|
||||
```python
|
||||
# Run interactively to validate selectors
|
||||
from playwright.sync_api import sync_playwright
|
||||
with sync_playwright() as p:
|
||||
browser = p.chromium.launch(headless=False) # headless=False to see the page
|
||||
page = browser.new_page()
|
||||
page.goto("https://jobgether.com/remote-jobs")
|
||||
page.wait_for_load_state("networkidle")
|
||||
# Test your selectors here
|
||||
cards = page.query_selector_all("[YOUR_CARD_SELECTOR]")
|
||||
print(len(cards))
|
||||
browser.close()
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Task 4: Write jobgether.py scraper
|
||||
|
||||
**Files:**
|
||||
- Create: `/Library/Development/CircuitForge/peregrine/scripts/custom_boards/jobgether.py`
|
||||
- Modify: `/Library/Development/CircuitForge/peregrine/tests/test_discover.py` (or create `tests/test_jobgether.py`)
|
||||
|
||||
- [ ] **Step 1: Write failing test**
|
||||
|
||||
In `/Library/Development/CircuitForge/peregrine/tests/test_discover.py` (or a new `tests/test_jobgether.py`):
|
||||
|
||||
```python
|
||||
def test_jobgether_scraper_returns_empty_on_missing_playwright(monkeypatch):
|
||||
"""Graceful fallback when Playwright is unavailable."""
|
||||
import scripts.custom_boards.jobgether as jg
|
||||
monkeypatch.setattr("scripts.custom_boards.jobgether.sync_playwright", None)
|
||||
result = jg.scrape({"titles": ["Customer Success Manager"]}, "Remote", results_wanted=5)
|
||||
assert result == []
|
||||
|
||||
|
||||
def test_jobgether_scraper_respects_results_wanted(monkeypatch):
|
||||
"""Scraper caps results at results_wanted."""
|
||||
import scripts.custom_boards.jobgether as jg
|
||||
|
||||
fake_jobs = [
|
||||
{"title": f"CSM {i}", "href": f"/offer/abc{i}-csm---acme", "company": f"Acme {i}",
|
||||
"location": "Remote", "is_remote": True, "salary": ""}
|
||||
for i in range(20)
|
||||
]
|
||||
|
||||
class FakePage:
|
||||
def goto(self, *a, **kw): pass
|
||||
def wait_for_load_state(self, *a, **kw): pass
|
||||
def evaluate(self, _): return fake_jobs
|
||||
|
||||
class FakeCtx:
|
||||
def new_page(self): return FakePage()
|
||||
|
||||
class FakeBrowser:
|
||||
def new_context(self, **kw): return FakeCtx()
|
||||
def close(self): pass
|
||||
|
||||
class FakeChromium:
|
||||
def launch(self, **kw): return FakeBrowser()
|
||||
|
||||
class FakeP:
|
||||
chromium = FakeChromium()
|
||||
def __enter__(self): return self
|
||||
def __exit__(self, *a): pass
|
||||
|
||||
monkeypatch.setattr("scripts.custom_boards.jobgether.sync_playwright", lambda: FakeP())
|
||||
result = jg.scrape({"titles": ["CSM"]}, "Remote", results_wanted=5)
|
||||
assert len(result) <= 5
|
||||
```
|
||||
|
||||
Run: `conda run -n job-seeker python -m pytest tests/ -v -k "jobgether"`
|
||||
Expected: FAIL (module not found)
|
||||
|
||||
- [ ] **Step 2: Create `scripts/custom_boards/jobgether.py`**
|
||||
|
||||
```python
|
||||
"""Jobgether scraper — Playwright-based (requires chromium installed).
|
||||
|
||||
Jobgether (jobgether.com) is a remote-work job aggregator. It blocks plain
|
||||
requests with 403, so we use Playwright to render the page and extract cards.
|
||||
|
||||
Install Playwright: conda run -n job-seeker pip install playwright &&
|
||||
conda run -n job-seeker python -m playwright install chromium
|
||||
|
||||
Returns a list of dicts compatible with scripts.db.insert_job().
|
||||
"""
|
||||
from __future__ import annotations
|
||||
|
||||
import re
|
||||
import time
|
||||
from typing import Any
|
||||
|
||||
_BASE = "https://jobgether.com"
|
||||
_SEARCH_PATH = "/remote-jobs"
|
||||
|
||||
# TODO: Replace with confirmed query param key after live inspection (Task 3)
|
||||
_QUERY_PARAM = "search"
|
||||
|
||||
# Module-level import so tests can monkeypatch scripts.custom_boards.jobgether.sync_playwright
|
||||
try:
|
||||
from playwright.sync_api import sync_playwright
|
||||
except ImportError:
|
||||
sync_playwright = None
|
||||
|
||||
|
||||
def scrape(profile: dict, location: str, results_wanted: int = 50) -> list[dict]:
|
||||
"""
|
||||
Scrape job listings from Jobgether using Playwright.
|
||||
|
||||
Args:
|
||||
profile: Search profile dict (uses 'titles').
|
||||
location: Location string — Jobgether is remote-focused; location used
|
||||
only if the site exposes a location filter.
|
||||
results_wanted: Maximum results to return across all titles.
|
||||
|
||||
Returns:
|
||||
List of job dicts with keys: title, company, url, source, location,
|
||||
is_remote, salary, description.
|
||||
"""
|
||||
if sync_playwright is None:
|
||||
print(
|
||||
" [jobgether] playwright not installed.\n"
|
||||
" Install: conda run -n job-seeker pip install playwright && "
|
||||
"conda run -n job-seeker python -m playwright install chromium"
|
||||
)
|
||||
return []
|
||||
|
||||
results: list[dict] = []
|
||||
seen_urls: set[str] = set()
|
||||
|
||||
with sync_playwright() as p:
|
||||
browser = p.chromium.launch(headless=True)
|
||||
ctx = browser.new_context(
|
||||
user_agent=(
|
||||
"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
|
||||
"(KHTML, like Gecko) Chrome/122.0.0.0 Safari/537.36"
|
||||
)
|
||||
)
|
||||
page = ctx.new_page()
|
||||
|
||||
for title in profile.get("titles", []):
|
||||
if len(results) >= results_wanted:
|
||||
break
|
||||
|
||||
# TODO: Confirm URL param format from live inspection (Task 3)
|
||||
url = f"{_BASE}{_SEARCH_PATH}?{_QUERY_PARAM}={title.replace(' ', '+')}"
|
||||
|
||||
try:
|
||||
page.goto(url, timeout=30_000)
|
||||
page.wait_for_load_state("networkidle", timeout=20_000)
|
||||
except Exception as exc:
|
||||
print(f" [jobgether] Page load error for '{title}': {exc}")
|
||||
continue
|
||||
|
||||
# TODO: Replace JS selector with confirmed card selector from Task 3
|
||||
try:
|
||||
raw_jobs: list[dict[str, Any]] = page.evaluate(_extract_jobs_js())
|
||||
except Exception as exc:
|
||||
print(f" [jobgether] JS extract error for '{title}': {exc}")
|
||||
continue
|
||||
|
||||
if not raw_jobs:
|
||||
print(f" [jobgether] No cards found for '{title}' — selector may need updating")
|
||||
continue
|
||||
|
||||
for job in raw_jobs:
|
||||
href = job.get("href", "")
|
||||
if not href:
|
||||
continue
|
||||
full_url = _BASE + href if href.startswith("/") else href
|
||||
if full_url in seen_urls:
|
||||
continue
|
||||
seen_urls.add(full_url)
|
||||
|
||||
results.append({
|
||||
"title": job.get("title", ""),
|
||||
"company": job.get("company", ""),
|
||||
"url": full_url,
|
||||
"source": "jobgether",
|
||||
"location": job.get("location") or "Remote",
|
||||
"is_remote": True, # Jobgether is remote-focused
|
||||
"salary": job.get("salary") or "",
|
||||
"description": "", # not in card view; scrape_url fills in
|
||||
})
|
||||
|
||||
if len(results) >= results_wanted:
|
||||
break
|
||||
|
||||
time.sleep(1) # polite pacing between titles
|
||||
|
||||
browser.close()
|
||||
|
||||
return results[:results_wanted]
|
||||
|
||||
|
||||
def _extract_jobs_js() -> str:
|
||||
"""JS to run in page context — extracts job data from rendered card elements.
|
||||
|
||||
TODO: Replace selectors with confirmed values from Task 3 live inspection.
|
||||
"""
|
||||
return """() => {
|
||||
// TODO: replace '[class*=job-card]' with confirmed card selector
|
||||
const cards = document.querySelectorAll('[class*="job-card"], [data-testid*="job"]');
|
||||
return Array.from(cards).map(card => {
|
||||
// TODO: replace these selectors with confirmed values
|
||||
const titleEl = card.querySelector('h2, h3, [class*="title"]');
|
||||
const companyEl = card.querySelector('[class*="company"], [class*="employer"]');
|
||||
const linkEl = card.querySelector('a');
|
||||
const salaryEl = card.querySelector('[class*="salary"]');
|
||||
const locationEl = card.querySelector('[class*="location"]');
|
||||
return {
|
||||
title: titleEl ? titleEl.textContent.trim() : null,
|
||||
company: companyEl ? companyEl.textContent.trim() : null,
|
||||
href: linkEl ? linkEl.getAttribute('href') : null,
|
||||
salary: salaryEl ? salaryEl.textContent.trim() : null,
|
||||
location: locationEl ? locationEl.textContent.trim() : null,
|
||||
is_remote: true,
|
||||
};
|
||||
}).filter(j => j.title && j.href);
|
||||
}"""
|
||||
```
|
||||
|
||||
- [ ] **Step 3: Run tests**
|
||||
|
||||
Run: `conda run -n job-seeker python -m pytest tests/ -v -k "jobgether"`
|
||||
Expected: PASS
|
||||
|
||||
- [ ] **Step 4: Commit**
|
||||
|
||||
```bash
|
||||
git add scripts/custom_boards/jobgether.py tests/test_discover.py
|
||||
git commit -m "feat: add Jobgether custom board scraper (selectors pending live inspection)"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Chunk 3: Registration, config, cover letter framing
|
||||
|
||||
### Task 5: Register scraper in discover.py + update search_profiles.yaml
|
||||
|
||||
**Files:**
|
||||
- Modify: `/Library/Development/CircuitForge/peregrine/scripts/discover.py`
|
||||
- Modify: `/Library/Development/CircuitForge/peregrine/config/search_profiles.yaml`
|
||||
- Modify: `/Library/Development/CircuitForge/peregrine/config/search_profiles.yaml.example` (if it exists)
|
||||
|
||||
- [ ] **Step 1: Add import to discover.py import block (lines 20–22)**
|
||||
|
||||
`jobgether.py` absorbs the Playwright `ImportError` internally (module-level `try/except`), so it always imports successfully. Match the existing pattern exactly:
|
||||
|
||||
```python
|
||||
from scripts.custom_boards import jobgether as _jobgether
|
||||
```
|
||||
|
||||
- [ ] **Step 2: Add to CUSTOM_SCRAPERS dict literal (lines 30–34)**
|
||||
|
||||
```python
|
||||
CUSTOM_SCRAPERS: dict[str, object] = {
|
||||
"adzuna": _adzuna.scrape,
|
||||
"theladders": _theladders.scrape,
|
||||
"craigslist": _craigslist.scrape,
|
||||
"jobgether": _jobgether.scrape,
|
||||
}
|
||||
```
|
||||
|
||||
When Playwright is absent, `_jobgether.scrape()` returns `[]` gracefully — no special guard needed in `discover.py`.
|
||||
|
||||
- [ ] **Step 3: Add `jobgether` to remote-eligible profiles in search_profiles.yaml**
|
||||
|
||||
Add `- jobgether` to the `custom_boards` list for every profile that has `Remote` in its `locations`. Based on the current file, that means: `cs_leadership`, `music_industry`, `animal_welfare`, `education`. Do NOT add it to `default` (locations: San Francisco CA only).
|
||||
|
||||
- [ ] **Step 4: Run discover tests**
|
||||
|
||||
Run: `conda run -n job-seeker python -m pytest tests/test_discover.py -v`
|
||||
Expected: All PASS
|
||||
|
||||
- [ ] **Step 5: Commit**
|
||||
|
||||
```bash
|
||||
git add scripts/discover.py config/search_profiles.yaml
|
||||
git commit -m "feat: register Jobgether scraper and add to remote search profiles"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Task 6: Cover letter recruiter framing
|
||||
|
||||
**Files:**
|
||||
- Modify: `/Library/Development/CircuitForge/peregrine/scripts/generate_cover_letter.py`
|
||||
- Modify: `/Library/Development/CircuitForge/peregrine/scripts/task_runner.py`
|
||||
- Modify: `/Library/Development/CircuitForge/peregrine/tests/test_match.py` or add `tests/test_cover_letter.py`
|
||||
|
||||
- [ ] **Step 1: Write failing test**
|
||||
|
||||
Create or add to `/Library/Development/CircuitForge/peregrine/tests/test_cover_letter.py`:
|
||||
|
||||
```python
|
||||
def test_build_prompt_jobgether_framing_unknown_company():
|
||||
from scripts.generate_cover_letter import build_prompt
|
||||
prompt = build_prompt(
|
||||
title="Customer Success Manager",
|
||||
company="Jobgether",
|
||||
description="CSM role at an undisclosed company.",
|
||||
examples=[],
|
||||
is_jobgether=True,
|
||||
)
|
||||
assert "Your client" in prompt
|
||||
assert "recruiter" in prompt.lower() or "jobgether" in prompt.lower()
|
||||
|
||||
|
||||
def test_build_prompt_jobgether_framing_known_company():
|
||||
from scripts.generate_cover_letter import build_prompt
|
||||
prompt = build_prompt(
|
||||
title="Customer Success Manager",
|
||||
company="Resware",
|
||||
description="CSM role at Resware.",
|
||||
examples=[],
|
||||
is_jobgether=True,
|
||||
)
|
||||
assert "Your client at Resware" in prompt
|
||||
|
||||
|
||||
def test_build_prompt_no_jobgether_framing_by_default():
|
||||
from scripts.generate_cover_letter import build_prompt
|
||||
prompt = build_prompt(
|
||||
title="Customer Success Manager",
|
||||
company="Acme Corp",
|
||||
description="CSM role.",
|
||||
examples=[],
|
||||
)
|
||||
assert "Your client" not in prompt
|
||||
```
|
||||
|
||||
Run: `conda run -n job-seeker python -m pytest tests/test_cover_letter.py -v`
|
||||
Expected: FAIL
|
||||
|
||||
- [ ] **Step 2: Add `is_jobgether` to `build_prompt()` in generate_cover_letter.py**
|
||||
|
||||
Modify the `build_prompt()` signature (line 186):
|
||||
|
||||
```python
|
||||
def build_prompt(
|
||||
title: str,
|
||||
company: str,
|
||||
description: str,
|
||||
examples: list[dict],
|
||||
mission_hint: str | None = None,
|
||||
is_jobgether: bool = False,
|
||||
) -> str:
|
||||
```
|
||||
|
||||
Add the recruiter hint block after the `mission_hint` block (after line 203):
|
||||
|
||||
```python
|
||||
if is_jobgether:
|
||||
if company and company.lower() != "jobgether":
|
||||
recruiter_note = (
|
||||
f"🤝 Recruiter context: This listing is posted by Jobgether on behalf of "
|
||||
f"{company}. Address the cover letter to the Jobgether recruiter, not directly "
|
||||
f"to the hiring company. Use framing like 'Your client at {company} will "
|
||||
f"appreciate...' rather than addressing {company} directly. The role "
|
||||
f"requirements are those of the actual employer."
|
||||
)
|
||||
else:
|
||||
recruiter_note = (
|
||||
"🤝 Recruiter context: This listing is posted by Jobgether on behalf of an "
|
||||
"undisclosed employer. Address the cover letter to the Jobgether recruiter. "
|
||||
"Use framing like 'Your client will appreciate...' rather than addressing "
|
||||
"the company directly."
|
||||
)
|
||||
parts.append(f"{recruiter_note}\n")
|
||||
```
|
||||
|
||||
- [ ] **Step 3: Add `is_jobgether` to `generate()` signature**
|
||||
|
||||
Modify `generate()` (line 233):
|
||||
|
||||
```python
|
||||
def generate(
|
||||
title: str,
|
||||
company: str,
|
||||
description: str = "",
|
||||
previous_result: str = "",
|
||||
feedback: str = "",
|
||||
is_jobgether: bool = False,
|
||||
_router=None,
|
||||
) -> str:
|
||||
```
|
||||
|
||||
Pass it through to `build_prompt()` (line 254):
|
||||
|
||||
```python
|
||||
prompt = build_prompt(title, company, description, examples,
|
||||
mission_hint=mission_hint, is_jobgether=is_jobgether)
|
||||
```
|
||||
|
||||
- [ ] **Step 4: Pass `is_jobgether` from task_runner.py**
|
||||
|
||||
In `/Library/Development/CircuitForge/peregrine/scripts/task_runner.py`, modify the `generate()` call inside the `cover_letter` task block (`elif task_type == "cover_letter":` starts at line 152; the `generate()` call is at ~line 156):
|
||||
|
||||
```python
|
||||
elif task_type == "cover_letter":
|
||||
import json as _json
|
||||
p = _json.loads(params or "{}")
|
||||
from scripts.generate_cover_letter import generate
|
||||
result = generate(
|
||||
job.get("title", ""),
|
||||
job.get("company", ""),
|
||||
job.get("description", ""),
|
||||
previous_result=p.get("previous_result", ""),
|
||||
feedback=p.get("feedback", ""),
|
||||
is_jobgether=job.get("source") == "jobgether",
|
||||
)
|
||||
update_cover_letter(db_path, job_id, result)
|
||||
```
|
||||
|
||||
- [ ] **Step 5: Run tests**
|
||||
|
||||
Run: `conda run -n job-seeker python -m pytest tests/test_cover_letter.py -v`
|
||||
Expected: All PASS
|
||||
|
||||
- [ ] **Step 6: Run full test suite**
|
||||
|
||||
Run: `conda run -n job-seeker python -m pytest tests/ -v`
|
||||
Expected: All PASS
|
||||
|
||||
- [ ] **Step 7: Commit**
|
||||
|
||||
```bash
|
||||
git add scripts/generate_cover_letter.py scripts/task_runner.py tests/test_cover_letter.py
|
||||
git commit -m "feat: add Jobgether recruiter framing to cover letter generation"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Final: Merge
|
||||
|
||||
- [ ] **Merge worktree branch to main**
|
||||
|
||||
```bash
|
||||
cd /Library/Development/CircuitForge/peregrine
|
||||
git merge feature/jobgether-integration
|
||||
git worktree remove .worktrees/jobgether-integration
|
||||
```
|
||||
|
||||
- [ ] **Push to remote**
|
||||
|
||||
```bash
|
||||
git push origin main
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Manual verification after merge
|
||||
|
||||
1. Add the stuck Jobgether manual import (job 2286) — delete the old stuck row and re-add the URL via "Add Jobs by URL" in the Home page. Verify the scraper resolves company = "Resware".
|
||||
2. Run a short discovery (`discover.py` with `results_per_board: 5`) and confirm no `company="Jobgether"` rows appear in `staging.db`.
|
||||
3. Generate a cover letter for a Jobgether-sourced job and confirm recruiter framing appears.
|
||||
Loading…
Reference in a new issue