App: Peregrine Company: Circuit Forge LLC Source: github.com/pyr0ball/job-seeker (personal fork, not linked)
Job Seeker Platform — Implementation Plan
For Claude: REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
Goal: Stand up a job discovery pipeline (JobSpy → Notion) with LLM routing, resume matching, and automated LinkedIn application support for Meghan McCann.
Architecture: JobSpy scrapes listings from multiple boards and pushes deduplicated results into a Notion database. A local LLM router with 5-backend fallback chain powers AIHawk's application answer generation. Resume Matcher scores each listing against Meghan's resume and writes keyword gaps back to Notion.
Tech Stack: Python 3.12, conda env job-seeker, python-jobspy, notion-client, openai SDK, anthropic SDK, pyyaml, pandas, Resume-Matcher (cloned), Auto_Jobs_Applier_AIHawk (cloned), pytest, pytest-mock
Priority order: Discovery (Tasks 1–5) must be running before Match or AIHawk setup.
Document storage rule: Resumes and cover letters live in /Library/Documents/JobSearch/ — never committed to this repo.
Task 1: Conda Environment + Project Scaffold
Files:
- Create: /devl/job-seeker/environment.yml
- Create: /devl/job-seeker/.gitignore
- Create: /devl/job-seeker/tests/__init__.py
Step 1: Write environment.yml
# /devl/job-seeker/environment.yml
name: job-seeker
channels:
  - conda-forge
  - defaults
dependencies:
  - python=3.12
  - pip
  - pip:
      - python-jobspy
      - notion-client
      - openai
      - anthropic
      - pyyaml
      - pandas
      - requests
      - pytest
      - pytest-mock
Step 2: Create the conda env
conda env create -f /devl/job-seeker/environment.yml
Expected: env job-seeker created with no errors.
Step 3: Verify the env
conda run -n job-seeker python -c "import jobspy, notion_client, openai, anthropic; print('all good')"
Expected: all good
Step 4: Write .gitignore
# /devl/job-seeker/.gitignore
.env
# contains Notion token (gitignore comments must sit on their own line)
config/notion.yaml
__pycache__/
*.pyc
.pytest_cache/
output/
aihawk/
resume_matcher/
Note: aihawk/ and resume_matcher/ are cloned externally — don't commit them.
Step 5: Create tests directory
mkdir -p /devl/job-seeker/tests
touch /devl/job-seeker/tests/__init__.py
Step 6: Commit
cd /devl/job-seeker
git add environment.yml .gitignore tests/__init__.py
git commit -m "feat: add conda env spec and project scaffold"
Task 2: Config Files
Files:
- Create: config/search_profiles.yaml
- Create: config/llm.yaml
- Create: config/notion.yaml.example (the real notion.yaml is gitignored)
Step 1: Write search_profiles.yaml
# config/search_profiles.yaml
profiles:
  - name: cs_leadership
    titles:
      - "Customer Success Manager"
      - "Director of Customer Success"
      - "VP Customer Success"
      - "Head of Customer Success"
      - "Technical Account Manager"
      - "Revenue Operations Manager"
      - "Customer Experience Lead"
    locations:
      - "Remote"
      - "San Francisco Bay Area, CA"
    boards:
      - linkedin
      - indeed
      - glassdoor
      - zip_recruiter
    results_per_board: 25
    hours_old: 72
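As a rough illustration of how this profile is consumed, the sketch below builds the search string the same way discover.py (Task 5) does: each title is quoted and the results are joined with OR. The helper name build_search_term is hypothetical and exists only for this example.

```python
def build_search_term(titles: list[str]) -> str:
    """Quote each title and join with OR (hypothetical helper for illustration)."""
    return " OR ".join(f'"{t}"' for t in titles)


# A trimmed-down copy of the profile above.
profile = {
    "titles": ["Customer Success Manager", "VP Customer Success"],
    "locations": ["Remote", "San Francisco Bay Area, CA"],
}

term = build_search_term(profile["titles"])
print(term)

# One scrape call is made per location, each covering every board in the profile.
scrape_calls = len(profile["locations"])
print(scrape_calls)
```

Adding a title widens every search at once, while adding a location adds one more scrape call per run.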
Step 2: Write llm.yaml
# config/llm.yaml
fallback_order:
  - claude_code
  - ollama
  - vllm
  - github_copilot
  - anthropic
backends:
  claude_code:
    type: openai_compat
    base_url: http://localhost:3009/v1
    model: claude-code-terminal
    api_key: "any"
  ollama:
    type: openai_compat
    base_url: http://localhost:11434/v1
    model: llama3.2
    api_key: "ollama"
  vllm:
    type: openai_compat
    base_url: http://localhost:8000/v1
    model: __auto__
    api_key: ""
  github_copilot:
    type: openai_compat
    base_url: http://localhost:3010/v1
    model: gpt-4o
    api_key: "any"
  anthropic:
    type: anthropic
    model: claude-sonnet-4-6
    api_key_env: ANTHROPIC_API_KEY
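A typo in fallback_order would only surface at request time, so a small sanity check may be worth running after editing this file. The sketch below is one possible check, not part of the plan's code: validate_llm_config and REQUIRED_KEYS are hypothetical names, and the required keys are inferred from the config shown above.

```python
# Keys each backend type needs, inferred from the llm.yaml layout above (assumption).
REQUIRED_KEYS = {
    "openai_compat": {"base_url", "model"},
    "anthropic": {"model", "api_key_env"},
}


def validate_llm_config(cfg: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means the config looks sane."""
    problems = []
    backends = cfg.get("backends", {})
    for name in cfg.get("fallback_order", []):
        backend = backends.get(name)
        if backend is None:
            problems.append(f"{name}: listed in fallback_order but not defined in backends")
            continue
        required = REQUIRED_KEYS.get(backend.get("type"))
        if required is None:
            problems.append(f"{name}: unknown type {backend.get('type')!r}")
            continue
        missing = required - backend.keys()
        if missing:
            problems.append(f"{name}: missing keys {sorted(missing)}")
    return problems


# Inline stand-in for yaml.safe_load(open("config/llm.yaml")).
sample_cfg = {
    "fallback_order": ["ollama", "anthropic"],
    "backends": {
        "ollama": {"type": "openai_compat", "base_url": "http://localhost:11434/v1", "model": "llama3.2"},
        "anthropic": {"type": "anthropic", "model": "claude-sonnet-4-6", "api_key_env": "ANTHROPIC_API_KEY"},
    },
}
print(validate_llm_config(sample_cfg))
```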
Step 3: Write notion.yaml.example
# config/notion.yaml.example
# Copy to config/notion.yaml and fill in your values.
# notion.yaml is gitignored — never commit it.
token: "secret_XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
database_id: "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
Step 4: Commit
cd /devl/job-seeker
git add config/search_profiles.yaml config/llm.yaml config/notion.yaml.example
git commit -m "feat: add search profiles, LLM config, and Notion config template"
Task 3: Create Notion Database
This task creates the Notion DB that all scripts write to. Do it once manually.
Step 1: Open Notion and create a new database
Create a full-page database called "Meghan's Job Search" in whatever Notion workspace you use for tracking.
Step 2: Add the required properties
Delete the default properties and create exactly these (type matters):
| Property Name | Type |
|---|---|
| Job Title | Title |
| Company | Text |
| Location | Text |
| Remote | Checkbox |
| URL | URL |
| Source | Select |
| Status | Select |
| Match Score | Number |
| Keyword Gaps | Text |
| Salary | Text |
| Date Found | Date |
| Notes | Text |
For the Status select, add these options in order:
New, Reviewing, Applied, Interview, Offer, Rejected
For the Source select, add:
Linkedin, Indeed, Glassdoor, Zip_Recruiter
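The unusual capitalisation of these options follows from how discover.py (Task 5) fills the Source property: it applies str.title() to the site name JobSpy reports. A quick check of what that produces for the boards in search_profiles.yaml:

```python
# Board names exactly as they appear in config/search_profiles.yaml.
boards = ["linkedin", "indeed", "glassdoor", "zip_recruiter"]

# str.title() upper-cases the first letter of each run of letters,
# so the underscore in zip_recruiter starts a new capitalised word.
source_options = [board.title() for board in boards]
print(source_options)
```

If a select option is spelled differently in Notion (for example "LinkedIn"), the API will most likely create a second option rather than reuse the existing one, so matching this spelling keeps the column tidy.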
Step 3: Get the database ID
Open the database as a full page. The URL will look like:
https://www.notion.so/YourWorkspace/XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX?v=...
The 32-character hex string before the ? is the database ID.
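If copying the ID by hand feels error-prone, the extraction can be sketched in a few lines. This is an illustrative helper (database_id_from_url is a hypothetical name), and the example URL uses a made-up ID.

```python
import re
from typing import Optional
from urllib.parse import urlparse


def database_id_from_url(url: str) -> Optional[str]:
    """Return the 32-character hex ID at the end of a Notion URL path, or None."""
    path = urlparse(url).path                      # drops the ?v=... query string
    last_segment = path.rstrip("/").split("/")[-1]
    # Anchor at the end so letters from a title slug cannot be mistaken for the ID.
    match = re.search(r"[0-9a-f]{32}$", last_segment.replace("-", ""), re.IGNORECASE)
    return match.group(0) if match else None


example = "https://www.notion.so/YourWorkspace/0123456789abcdef0123456789abcdef?v=42"
print(database_id_from_url(example))
```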
Step 4: Get your Notion integration token
Go to https://www.notion.so/my-integrations → create integration (or use existing) →
copy the "Internal Integration Token" (starts with secret_; newer tokens start with ntn_).
Connect the integration to your database: open the database → ... menu →
Add connections → select your integration.
Step 5: Write config/notion.yaml
cp /devl/job-seeker/config/notion.yaml.example /devl/job-seeker/config/notion.yaml
# Edit notion.yaml and fill in your token and database_id
Step 6: Verify connection
conda run -n job-seeker python3 -c "
from notion_client import Client
import yaml
cfg = yaml.safe_load(open('/devl/job-seeker/config/notion.yaml'))
n = Client(auth=cfg['token'])
db = n.databases.retrieve(cfg['database_id'])
print('Connected to:', db['title'][0]['plain_text'])
"
Expected: Connected to: Meghan's Job Search
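Because the property types matter, it can help to compare the database schema against the table from Step 2 before any script writes to it. The sketch below assumes the response shape of databases.retrieve, where each entry under "properties" carries a "type" field and the UI's "Text" type is reported as rich_text. EXPECTED_TYPES and schema_problems are hypothetical names, and the fake response stands in for a real API call.

```python
# Property name -> API type, mirroring the table in Step 2 ("Text" is rich_text in the API).
EXPECTED_TYPES = {
    "Job Title": "title",
    "Company": "rich_text",
    "Location": "rich_text",
    "Remote": "checkbox",
    "URL": "url",
    "Source": "select",
    "Status": "select",
    "Match Score": "number",
    "Keyword Gaps": "rich_text",
    "Salary": "rich_text",
    "Date Found": "date",
    "Notes": "rich_text",
}


def schema_problems(db: dict) -> list[str]:
    """Compare a databases.retrieve-style response against EXPECTED_TYPES."""
    props = db.get("properties", {})
    problems = []
    for name, expected in EXPECTED_TYPES.items():
        actual = props.get(name, {}).get("type")
        if actual != expected:
            problems.append(f"{name}: expected {expected}, found {actual}")
    return problems


# Fake response in which Match Score was created as Text by mistake and Notes is missing.
fake_db = {"properties": {name: {"type": t} for name, t in EXPECTED_TYPES.items()}}
fake_db["properties"]["Match Score"] = {"type": "rich_text"}
del fake_db["properties"]["Notes"]
print(schema_problems(fake_db))
```

Against the live database, the db object from the verification snippet above could be passed in place of fake_db.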
Task 4: LLM Router
Files:
- Create: scripts/llm_router.py
- Create: tests/test_llm_router.py
Step 1: Write the failing tests
# tests/test_llm_router.py
import pytest
from unittest.mock import patch, MagicMock
from pathlib import Path
import yaml

# Point tests at the real config
CONFIG_PATH = Path(__file__).parent.parent / "config" / "llm.yaml"


def test_config_loads():
    """Config file is valid YAML with required keys."""
    cfg = yaml.safe_load(CONFIG_PATH.read_text())
    assert "fallback_order" in cfg
    assert "backends" in cfg
    assert len(cfg["fallback_order"]) >= 1


def test_router_uses_first_reachable_backend(tmp_path):
    """Router skips unreachable backends and uses the first that responds."""
    from scripts.llm_router import LLMRouter

    router = LLMRouter(CONFIG_PATH)
    mock_response = MagicMock()
    mock_response.choices[0].message.content = "hello"

    with patch.object(router, "_is_reachable", side_effect=[False, True, True, True, True]), \
         patch("scripts.llm_router.OpenAI") as MockOpenAI:
        instance = MockOpenAI.return_value
        instance.chat.completions.create.return_value = mock_response
        # Also mock models.list for __auto__ case
        mock_model = MagicMock()
        mock_model.id = "test-model"
        instance.models.list.return_value.data = [mock_model]
        result = router.complete("say hello")

    assert result == "hello"


def test_router_raises_when_all_backends_fail():
    """Router raises RuntimeError when every backend is unreachable or errors."""
    from scripts.llm_router import LLMRouter

    router = LLMRouter(CONFIG_PATH)
    with patch.object(router, "_is_reachable", return_value=False):
        with pytest.raises(RuntimeError, match="All LLM backends exhausted"):
            router.complete("say hello")


def test_is_reachable_returns_false_on_connection_error():
    """_is_reachable returns False when the health endpoint is unreachable."""
    from scripts.llm_router import LLMRouter
    import requests

    router = LLMRouter(CONFIG_PATH)
    with patch("scripts.llm_router.requests.get", side_effect=requests.ConnectionError):
        result = router._is_reachable("http://localhost:9999/v1")

    assert result is False
Step 2: Run tests to verify they fail
cd /devl/job-seeker
conda run -n job-seeker pytest tests/test_llm_router.py -v
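Expected: test_config_loads passes (config/llm.yaml exists from Task 2); the other three tests fail with ModuleNotFoundError because scripts.llm_router doesn't exist yet.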
Step 3: Create scripts/__init__.py
mkdir -p /devl/job-seeker/scripts
touch /devl/job-seeker/scripts/__init__.py
Step 4: Write scripts/llm_router.py
# scripts/llm_router.py
"""
LLM abstraction layer with priority fallback chain.
Reads config/llm.yaml. Tries backends in order; falls back on any error.
"""
import os
import yaml
import requests
from pathlib import Path
from openai import OpenAI

CONFIG_PATH = Path(__file__).parent.parent / "config" / "llm.yaml"


class LLMRouter:
    def __init__(self, config_path: Path = CONFIG_PATH):
        with open(config_path) as f:
            self.config = yaml.safe_load(f)

    def _is_reachable(self, base_url: str) -> bool:
        """Quick health-check ping. Returns True if backend is up."""
        health_url = base_url.rstrip("/").removesuffix("/v1") + "/health"
        try:
            resp = requests.get(health_url, timeout=2)
            # Any non-5xx answer (even a 404) means something is listening.
            return resp.status_code < 500
        except Exception:
            return False

    def _resolve_model(self, client: OpenAI, model: str) -> str:
        """Resolve __auto__ to the first model served by vLLM."""
        if model != "__auto__":
            return model
        models = client.models.list()
        return models.data[0].id

    def complete(self, prompt: str, system: str | None = None) -> str:
        """
        Generate a completion. Tries each backend in fallback_order.
        Raises RuntimeError if all backends are exhausted.
        """
        for name in self.config["fallback_order"]:
            backend = self.config["backends"][name]

            if backend["type"] == "openai_compat":
                if not self._is_reachable(backend["base_url"]):
                    print(f"[LLMRouter] {name}: unreachable, skipping")
                    continue
                try:
                    client = OpenAI(
                        base_url=backend["base_url"],
                        # An empty key (as configured for vllm) can yield an invalid
                        # "Bearer " header, so fall back to a placeholder.
                        api_key=backend.get("api_key") or "any",
                    )
                    model = self._resolve_model(client, backend["model"])
                    messages = []
                    if system:
                        messages.append({"role": "system", "content": system})
                    messages.append({"role": "user", "content": prompt})
                    resp = client.chat.completions.create(
                        model=model, messages=messages
                    )
                    print(f"[LLMRouter] Used backend: {name} ({model})")
                    return resp.choices[0].message.content
                except Exception as e:
                    print(f"[LLMRouter] {name}: error — {e}, trying next")
                    continue

            elif backend["type"] == "anthropic":
                api_key = os.environ.get(backend["api_key_env"], "")
                if not api_key:
                    print(f"[LLMRouter] {name}: {backend['api_key_env']} not set, skipping")
                    continue
                try:
                    import anthropic as _anthropic
                    client = _anthropic.Anthropic(api_key=api_key)
                    kwargs: dict = {
                        "model": backend["model"],
                        "max_tokens": 4096,
                        "messages": [{"role": "user", "content": prompt}],
                    }
                    if system:
                        kwargs["system"] = system
                    msg = client.messages.create(**kwargs)
                    print(f"[LLMRouter] Used backend: {name}")
                    return msg.content[0].text
                except Exception as e:
                    print(f"[LLMRouter] {name}: error — {e}, trying next")
                    continue

        raise RuntimeError("All LLM backends exhausted")


# Module-level singleton for convenience
_router: LLMRouter | None = None


def complete(prompt: str, system: str | None = None) -> str:
    global _router
    if _router is None:
        _router = LLMRouter()
    return _router.complete(prompt, system)
Step 5: Run tests to verify they pass
conda run -n job-seeker pytest tests/test_llm_router.py -v
Expected: 4 tests PASS.
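The reachability check is easy to misread, so here is the URL derivation and status rule from _is_reachable pulled out into two standalone functions. The names health_url and counts_as_up are hypothetical and exist only for this illustration.

```python
def health_url(base_url: str) -> str:
    """Turn an OpenAI-compatible base URL into the health endpoint the router pings."""
    return base_url.rstrip("/").removesuffix("/v1") + "/health"


def counts_as_up(status_code: int) -> bool:
    """Any answer below 500 is treated as 'something is listening'."""
    return status_code < 500


print(health_url("http://localhost:11434/v1"))
print(health_url("http://localhost:8000/v1/"))
print(counts_as_up(404), counts_as_up(503))
```

One consequence of the status rule: a server with no /health route still counts as reachable as long as it answers with a 404, so the check says only that the port is alive, not that the model is loaded.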
Step 6: Smoke-test against live Ollama
conda run -n job-seeker python3 -c "
from scripts.llm_router import complete
print(complete('Say: job-seeker LLM router is working'))
"
Expected: A short response from Ollama (or next reachable backend).
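Which backend answers depends on what is running at the time. The fallback behaviour can be sketched independently of any real LLM client: try each backend in order and return the first reply. In the sketch below, first_success, down, and echo are hypothetical stand-ins, not part of llm_router.py.

```python
from typing import Callable


def first_success(backends: list[tuple[str, Callable[[str], str]]], prompt: str) -> tuple[str, str]:
    """Try each (name, call) pair in order; return (name, reply) from the first that works."""
    for name, call in backends:
        try:
            return name, call(prompt)
        except Exception as exc:  # any failure means "try the next backend"
            print(f"{name}: {exc}, trying next")
    raise RuntimeError("All LLM backends exhausted")


def down(prompt: str) -> str:
    """Stub backend that behaves like a stopped local server."""
    raise ConnectionError("connection refused")


def echo(prompt: str) -> str:
    """Stub backend that always answers."""
    return f"echo: {prompt}"


used, reply = first_success([("claude_code", down), ("ollama", echo)], "hi")
print(used, reply)
```

The router in this plan follows the same pattern, with a reachability ping added in front of each OpenAI-compatible backend so that stopped servers are skipped quickly.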
Step 7: Commit
cd /devl/job-seeker
git add scripts/__init__.py scripts/llm_router.py tests/test_llm_router.py
git commit -m "feat: add LLM router with 5-backend fallback chain"
Task 5: Job Discovery (discover.py) — PRIORITY
Files:
- Create: scripts/discover.py
- Create: tests/test_discover.py
Step 1: Write the failing tests
# tests/test_discover.py
import pytest
from unittest.mock import patch, MagicMock, call
import pandas as pd
from pathlib import Path

SAMPLE_JOB = {
    "title": "Customer Success Manager",
    "company": "Acme Corp",
    "location": "Remote",
    "is_remote": True,
    "job_url": "https://linkedin.com/jobs/view/123456",
    "site": "linkedin",
    "salary_source": "$90,000 - $120,000",
}


def make_jobs_df(jobs=None):
    return pd.DataFrame(jobs or [SAMPLE_JOB])


def test_get_existing_urls_returns_set():
    """get_existing_urls returns a set of URL strings from Notion pages."""
    from scripts.discover import get_existing_urls

    mock_notion = MagicMock()
    mock_notion.databases.query.return_value = {
        "results": [
            {"properties": {"URL": {"url": "https://example.com/job/1"}}},
            {"properties": {"URL": {"url": "https://example.com/job/2"}}},
        ],
        "has_more": False,
        "next_cursor": None,
    }
    urls = get_existing_urls(mock_notion, "fake-db-id")
    assert urls == {"https://example.com/job/1", "https://example.com/job/2"}


def test_discover_skips_duplicate_urls():
    """discover does not push a job whose URL is already in Notion."""
    from scripts.discover import run_discovery

    existing = {"https://linkedin.com/jobs/view/123456"}
    with patch("scripts.discover.scrape_jobs", return_value=make_jobs_df()), \
         patch("scripts.discover.get_existing_urls", return_value=existing), \
         patch("scripts.discover.push_to_notion") as mock_push, \
         patch("scripts.discover.Client"):
        run_discovery()

    mock_push.assert_not_called()


def test_discover_pushes_new_jobs():
    """discover pushes jobs whose URLs are not already in Notion."""
    from scripts.discover import run_discovery

    with patch("scripts.discover.scrape_jobs", return_value=make_jobs_df()), \
         patch("scripts.discover.get_existing_urls", return_value=set()), \
         patch("scripts.discover.push_to_notion") as mock_push, \
         patch("scripts.discover.Client"):
        run_discovery()

    # The same listing comes back for every location but is pushed only once.
    assert mock_push.call_count == 1


def test_push_to_notion_sets_status_new():
    """push_to_notion always sets Status to 'New'."""
    from scripts.discover import push_to_notion

    mock_notion = MagicMock()
    push_to_notion(mock_notion, "fake-db-id", SAMPLE_JOB)
    call_kwargs = mock_notion.pages.create.call_args[1]
    status = call_kwargs["properties"]["Status"]["select"]["name"]
    assert status == "New"
Step 2: Run tests to verify they fail
conda run -n job-seeker pytest tests/test_discover.py -v
Expected: ImportError — scripts.discover doesn't exist yet.
Step 3: Write scripts/discover.py
# scripts/discover.py
"""
JobSpy → Notion discovery pipeline.
Scrapes job boards, deduplicates against existing Notion records,
and pushes new listings with Status=New.

Usage:
    conda run -n job-seeker python scripts/discover.py
"""
import yaml
from datetime import datetime
from pathlib import Path

import pandas as pd
from jobspy import scrape_jobs
from notion_client import Client

CONFIG_DIR = Path(__file__).parent.parent / "config"
NOTION_CFG = CONFIG_DIR / "notion.yaml"
PROFILES_CFG = CONFIG_DIR / "search_profiles.yaml"


def load_config() -> tuple[dict, dict]:
    profiles = yaml.safe_load(PROFILES_CFG.read_text())
    notion_cfg = yaml.safe_load(NOTION_CFG.read_text())
    return profiles, notion_cfg


def get_existing_urls(notion: Client, db_id: str) -> set[str]:
    """Return the set of all job URLs already tracked in Notion."""
    existing: set[str] = set()
    has_more = True
    start_cursor = None
    while has_more:
        kwargs: dict = {"database_id": db_id, "page_size": 100}
        if start_cursor:
            kwargs["start_cursor"] = start_cursor
        resp = notion.databases.query(**kwargs)
        for page in resp["results"]:
            url = page["properties"].get("URL", {}).get("url")
            if url:
                existing.add(url)
        has_more = resp.get("has_more", False)
        start_cursor = resp.get("next_cursor")
    return existing


def push_to_notion(notion: Client, db_id: str, job: dict) -> None:
    """Create a new page in the Notion jobs database for a single listing."""
    notion.pages.create(
        parent={"database_id": db_id},
        properties={
            "Job Title": {"title": [{"text": {"content": str(job.get("title", "Unknown"))}}]},
            "Company": {"rich_text": [{"text": {"content": str(job.get("company", ""))}}]},
            "Location": {"rich_text": [{"text": {"content": str(job.get("location", ""))}}]},
            "Remote": {"checkbox": bool(job.get("is_remote", False))},
            "URL": {"url": str(job.get("job_url", ""))},
            "Source": {"select": {"name": str(job.get("site", "unknown")).title()}},
            "Status": {"select": {"name": "New"}},
            "Salary": {"rich_text": [{"text": {"content": str(job.get("salary_source") or "")}}]},
            "Date Found": {"date": {"start": datetime.now().isoformat()[:10]}},
        },
    )


def run_discovery() -> None:
    profiles_cfg, notion_cfg = load_config()
    notion = Client(auth=notion_cfg["token"])
    db_id = notion_cfg["database_id"]

    existing_urls = get_existing_urls(notion, db_id)
    print(f"[discover] {len(existing_urls)} existing listings in Notion")

    new_count = 0
    for profile in profiles_cfg["profiles"]:
        print(f"\n[discover] Profile: {profile['name']}")
        for location in profile["locations"]:
            print(f"  Scraping: {location}")
            jobs: pd.DataFrame = scrape_jobs(
                site_name=profile["boards"],
                search_term=" OR ".join(f'"{t}"' for t in profile["titles"]),
                location=location,
                results_wanted=profile.get("results_per_board", 25),
                hours_old=profile.get("hours_old", 72),
                linkedin_fetch_description=True,
            )
            for _, job in jobs.iterrows():
                url = str(job.get("job_url", ""))
                if not url or url in existing_urls:
                    continue
                push_to_notion(notion, db_id, job.to_dict())
                # Remember the URL so later locations and profiles skip it.
                existing_urls.add(url)
                new_count += 1
                print(f"  + {job.get('title')} @ {job.get('company')}")

    print(f"\n[discover] Done — {new_count} new listings pushed to Notion.")


if __name__ == "__main__":
    run_discovery()
Step 4: Run tests to verify they pass
conda run -n job-seeker pytest tests/test_discover.py -v
Expected: 4 tests PASS.
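The tests above exercise get_existing_urls with a single page of results only. Once the database holds more than 100 rows, the cursor loop matters, so the sketch below walks the same loop against a fake two-page response. The names collect_urls, fake_query, and PAGES are hypothetical, and the response shape (results, has_more, next_cursor) is the one the mocked test already assumes.

```python
def collect_urls(query, page_size: int = 100) -> set[str]:
    """Follow next_cursor until has_more is false, gathering every non-empty URL."""
    urls: set[str] = set()
    cursor = None
    while True:
        kwargs = {"page_size": page_size}
        if cursor:
            kwargs["start_cursor"] = cursor
        resp = query(**kwargs)
        for page in resp["results"]:
            url = page["properties"].get("URL", {}).get("url")
            if url:  # rows with an empty URL property are skipped
                urls.add(url)
        if not resp.get("has_more"):
            return urls
        cursor = resp.get("next_cursor")


# Fake two-page result set keyed by the cursor that requests it.
PAGES = {
    None: {
        "results": [{"properties": {"URL": {"url": "https://example.com/job/1"}}}],
        "has_more": True,
        "next_cursor": "cursor-2",
    },
    "cursor-2": {
        "results": [
            {"properties": {"URL": {"url": "https://example.com/job/2"}}},
            {"properties": {"URL": {"url": None}}},
        ],
        "has_more": False,
        "next_cursor": None,
    },
}


def fake_query(**kwargs):
    """Stand-in for notion.databases.query that serves PAGES by cursor."""
    return PAGES[kwargs.get("start_cursor")]


print(sorted(collect_urls(fake_query)))
```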
Step 5: Run a live discovery (requires notion.yaml to be set up from Task 3)
conda run -n job-seeker python scripts/discover.py
Expected: listings printed and pushed to Notion. Check the Notion DB to confirm rows appear with Status=New.
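One thing to look for when checking the rows: pandas represents missing cells as NaN, and str() of NaN is the literal text "nan". NaN is also truthy, so the `or ""` guard on the Salary field does not catch it. If "nan" shows up in Company or Salary, a small cleaning helper along these lines could be applied before building the Notion properties. The name clean is hypothetical, and this is a sketch rather than part of discover.py.

```python
import math


def clean(value) -> str:
    """Return '' for None or NaN, otherwise the value as a string."""
    if value is None:
        return ""
    if isinstance(value, float) and math.isnan(value):
        return ""
    return str(value)


missing = float("nan")
print(repr(str(missing)))        # what the unguarded str() call produces
print(repr(str(missing or "")))  # the `or ""` guard does not help, NaN is truthy
print(repr(clean(missing)))      # the helper maps it to an empty string
```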
Step 6: Commit
cd /devl/job-seeker
git add scripts/discover.py tests/test_discover.py
git commit -m "feat: add JobSpy discovery pipeline with Notion deduplication"
Task 6: Clone and Configure Resume Matcher
Step 1: Clone Resume Matcher
cd /devl/job-seeker
git clone https://github.com/srbhr/Resume-Matcher.git resume_matcher
Step 2: Install Resume Matcher dependencies into the job-seeker env
conda run -n job-seeker pip install -r /devl/job-seeker/resume_matcher/requirements.txt
If there are conflicts, install only the core matching library:
conda run -n job-seeker pip install sentence-transformers streamlit qdrant-client pypdf2
Step 3: Verify it launches
conda run -n job-seeker streamlit run /devl/job-seeker/resume_matcher/streamlit_app.py --server.port 8501
Expected: Streamlit opens on http://localhost:8501 (port confirmed clear). Stop it with Ctrl+C — we'll run it on-demand.
Step 4: Note the resume path to use
The ATS-clean resume to use with Resume Matcher:
/Library/Documents/JobSearch/Meghan_McCann_Resume_02-19-2025.pdf
Task 7: Resume Match Script (match.py)
Files:
- Create: scripts/match.py
- Create: tests/test_match.py
Step 1: Write the failing tests
# tests/test_match.py
import pytest
from unittest.mock import patch, MagicMock


def test_extract_job_description_from_url():
    """extract_job_description fetches and returns text from a URL."""
    from scripts.match import extract_job_description

    with patch("scripts.match.requests.get") as mock_get:
        mock_get.return_value.text = "<html><body><p>We need a CSM with Salesforce.</p></body></html>"
        mock_get.return_value.raise_for_status = MagicMock()
        result = extract_job_description("https://example.com/job/123")

    assert "CSM" in result
    assert "Salesforce" in result


def test_score_is_between_0_and_100():
    """match_score returns a float in [0, 100]."""
    from scripts.match import match_score

    # Provide minimal inputs that the scorer can handle
    score, gaps = match_score(
        resume_text="Customer Success Manager with Salesforce experience",
        job_text="Looking for a Customer Success Manager who knows Salesforce and Gainsight",
    )
    assert 0 <= score <= 100
    assert isinstance(gaps, list)


def test_write_score_to_notion():
    """write_match_to_notion updates the Notion page with score and gaps."""
    from scripts.match import write_match_to_notion

    mock_notion = MagicMock()
    write_match_to_notion(mock_notion, "page-id-abc", 85.5, ["Gainsight", "Churnzero"])

    mock_notion.pages.update.assert_called_once()
    call_kwargs = mock_notion.pages.update.call_args[1]
    assert call_kwargs["page_id"] == "page-id-abc"
    score_val = call_kwargs["properties"]["Match Score"]["number"]
    assert score_val == 85.5
Step 2: Run tests to verify they fail
conda run -n job-seeker pytest tests/test_match.py -v
Expected: ImportError — scripts.match doesn't exist.
Step 3: Write scripts/match.py
# scripts/match.py
"""
Resume Matcher integration: score a Notion job listing against Meghan's resume.
Writes Match Score and Keyword Gaps back to the Notion page.

Usage:
    conda run -n job-seeker python scripts/match.py <notion-page-url-or-id>
"""
import re
import sys
from pathlib import Path

import requests
import yaml
from bs4 import BeautifulSoup
from notion_client import Client

CONFIG_DIR = Path(__file__).parent.parent / "config"
RESUME_PATH = Path("/Library/Documents/JobSearch/Meghan_McCann_Resume_02-19-2025.pdf")


def load_notion() -> tuple[Client, str]:
    cfg = yaml.safe_load((CONFIG_DIR / "notion.yaml").read_text())
    return Client(auth=cfg["token"]), cfg["database_id"]


def extract_page_id(url_or_id: str) -> str:
    """Extract 32-char Notion page ID from a URL or return as-is."""
    # Look only at the last path segment (query string dropped) and anchor the
    # match at its end, so hex-like letters in the title slug cannot leak in.
    tail = url_or_id.strip().split("?")[0].rstrip("/").split("/")[-1]
    match = re.search(r"[0-9a-f]{32}$", tail.replace("-", ""), re.IGNORECASE)
    if match:
        return match.group(0)
    return url_or_id.strip()


def get_job_url_from_notion(notion: Client, page_id: str) -> str:
    page = notion.pages.retrieve(page_id)
    return page["properties"]["URL"]["url"]


def extract_job_description(url: str) -> str:
    """Fetch a job listing URL and return its visible text."""
    resp = requests.get(url, headers={"User-Agent": "Mozilla/5.0"}, timeout=10)
    resp.raise_for_status()
    soup = BeautifulSoup(resp.text, "html.parser")
    # Drop non-content tags before extracting text.
    for tag in soup(["script", "style", "nav", "header", "footer"]):
        tag.decompose()
    return " ".join(soup.get_text(separator=" ").split())


def read_resume_text() -> str:
    """Extract text from the ATS-clean PDF resume."""
    try:
        import pypdf
        reader = pypdf.PdfReader(str(RESUME_PATH))
        return " ".join(page.extract_text() or "" for page in reader.pages)
    except ImportError:
        # Fall back to the older PyPDF2 package if pypdf is not installed.
        import PyPDF2
        with open(RESUME_PATH, "rb") as f:
            reader = PyPDF2.PdfReader(f)
            return " ".join(p.extract_text() or "" for p in reader.pages)


def match_score(resume_text: str, job_text: str) -> tuple[float, list[str]]:
    """
    Score resume against job description using TF-IDF keyword overlap.
    Returns (score 0-100, list of keywords in job not found in resume).
    """
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity
    import numpy as np

    vectorizer = TfidfVectorizer(stop_words="english", max_features=200)
    tfidf = vectorizer.fit_transform([resume_text, job_text])
    score = float(cosine_similarity(tfidf[0:1], tfidf[1:2])[0][0]) * 100

    # Keyword gap: top-weighted job terms not present in resume (lowercased)
    resume_terms = set(resume_text.lower().split())
    feature_names = vectorizer.get_feature_names_out()
    job_tfidf = tfidf[1].toarray()[0]
    top_indices = np.argsort(job_tfidf)[::-1][:30]
    top_job_terms = [feature_names[i] for i in top_indices if job_tfidf[i] > 0]
    gaps = [t for t in top_job_terms if t not in resume_terms][:10]

    return round(score, 1), gaps


def write_match_to_notion(notion: Client, page_id: str, score: float, gaps: list[str]) -> None:
    notion.pages.update(
        page_id=page_id,
        properties={
            "Match Score": {"number": score},
            "Keyword Gaps": {"rich_text": [{"text": {"content": ", ".join(gaps)}}]},
        },
    )


def run_match(page_url_or_id: str) -> None:
    notion, _ = load_notion()
    page_id = extract_page_id(page_url_or_id)

    print(f"[match] Page ID: {page_id}")
    job_url = get_job_url_from_notion(notion, page_id)
    print(f"[match] Fetching job description from: {job_url}")

    job_text = extract_job_description(job_url)
    resume_text = read_resume_text()

    score, gaps = match_score(resume_text, job_text)
    print(f"[match] Score: {score}/100")
    print(f"[match] Keyword gaps: {', '.join(gaps) or 'none'}")

    write_match_to_notion(notion, page_id, score, gaps)
    print("[match] Written to Notion.")


if __name__ == "__main__":
    if len(sys.argv) < 2:
        print("Usage: python scripts/match.py <notion-page-url-or-id>")
        sys.exit(1)
    run_match(sys.argv[1])
Step 4: Install scikit-learn, beautifulsoup4, and pypdf (needed by match.py)
conda run -n job-seeker pip install scikit-learn beautifulsoup4 pypdf
Step 5: Run tests
conda run -n job-seeker pytest tests/test_match.py -v
Expected: 3 tests PASS.
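A caveat on the keyword gaps: match_score compares TF-IDF feature names, which scikit-learn's default token pattern strips of punctuation, against resume words produced by a plain whitespace split, which keeps punctuation attached. A skill written as "Salesforce," in the resume can therefore be reported as a gap. The sketch below shows the effect and one possible remedy, tokenising the resume with the same pattern. The names tokens and keyword_gaps are hypothetical.

```python
import re

# Same shape as scikit-learn's default token_pattern: words of two or more characters.
TOKEN = re.compile(r"\b\w\w+\b")


def tokens(text: str) -> set[str]:
    """Lower-case the text and split it into punctuation-free word tokens."""
    return set(TOKEN.findall(text.lower()))


def keyword_gaps(resume_text: str, job_terms: list[str]) -> list[str]:
    """Return the job terms that do not appear as tokens in the resume."""
    present = tokens(resume_text)
    return [t for t in job_terms if t not in present]


resume = "Managed renewals in Salesforce, Zendesk."
naive_terms = set(resume.lower().split())

print("salesforce" in naive_terms)   # whitespace split keeps the trailing comma
print(keyword_gaps(resume, ["salesforce", "gainsight"]))
```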
Step 6: Commit
cd /devl/job-seeker
git add scripts/match.py tests/test_match.py
git commit -m "feat: add resume match scoring with Notion write-back"
Task 8: Clone and Configure AIHawk
Step 1: Clone AIHawk
cd /devl/job-seeker
git clone https://github.com/feder-cr/Auto_Jobs_Applier_AIHawk.git aihawk
Step 2: Install AIHawk dependencies
conda run -n job-seeker pip install -r /devl/job-seeker/aihawk/requirements.txt
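Step 3: Install Playwright browsers (only if aihawk/requirements.txt lists playwright; AIHawk releases built on Selenium drive a locally installed Chrome instead, and this step can be skipped)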
conda run -n job-seeker playwright install chromium
Step 4: Create AIHawk personal info config
AIHawk reads personal details from data_folder/plain_text_resume.yaml. Back up the shipped template before editing it:
cp /devl/job-seeker/aihawk/data_folder/plain_text_resume.yaml \
/devl/job-seeker/aihawk/data_folder/plain_text_resume.yaml.bak
Edit /devl/job-seeker/aihawk/data_folder/plain_text_resume.yaml with Meghan's info.
Key fields to fill:
- personal_information: name, email, phone, linkedin, github (leave blank), location
- work_experience: pull from the SVG content already extracted
- education: Texas State University, Mass Communications & PR, 2012-2015
- skills: Zendesk, Intercom, Asana, Jira, etc.
Step 5: Configure AIHawk to use the LLM router
AIHawk's config (aihawk/data_folder/config.yaml) has llm_model_type and llm_model fields.
Set it to use the local OpenAI-compatible endpoint:
# In aihawk/data_folder/config.yaml
llm_model_type: openai
llm_model: claude-code-terminal
openai_api_url: http://localhost:3009/v1 # or whichever backend is running
If 3009 is down, change to http://localhost:11434/v1 (Ollama).
Step 6: Run AIHawk in dry-run mode first
conda run -n job-seeker python /devl/job-seeker/aihawk/main.py --help
Review the flags. Start with a test run before enabling real submissions.
Step 7: Commit the environment update
cd /devl/job-seeker
conda env export -n job-seeker > environment.yml
git add environment.yml
git commit -m "chore: update environment.yml with all installed packages"
Task 9: End-to-End Smoke Test
Step 1: Run full test suite
conda run -n job-seeker pytest tests/ -v
Expected: all tests PASS.
Step 2: Run discovery
conda run -n job-seeker python scripts/discover.py
Expected: new listings appear in Notion with Status=New.
Step 3: Run match on one listing
Copy the URL of a Notion page from the DB and run:
conda run -n job-seeker python scripts/match.py "https://www.notion.so/..."
Expected: Match Score and Keyword Gaps written back to that Notion page.
Step 4: Commit anything left
cd /devl/job-seeker
git status
git add -p # stage only code/config, not secrets
git commit -m "chore: final smoke test cleanup"
Quick Reference
| Command | What it does |
|---|---|
| conda run -n job-seeker python scripts/discover.py | Scrape boards → push new listings to Notion |
| conda run -n job-seeker python scripts/match.py <url> | Score a listing → write back to Notion |
| conda run -n job-seeker streamlit run resume_matcher/streamlit_app.py --server.port 8501 | Open Resume Matcher UI |
| conda run -n job-seeker pytest tests/ -v | Run all tests |
| cd "/Library/Documents/Post Fight Processing" && ./manage.sh start | Start Claude Code pipeline (port 3009) |
| cd "/Library/Documents/Post Fight Processing" && ./manage-copilot.sh start | Start Copilot wrapper (port 3010) |