App: Peregrine Company: Circuit Forge LLC Source: github.com/pyr0ball/job-seeker (personal fork, not linked)
# Research Workflow Redesign

**Date:** 2026-02-22
**Status:** Approved

## Problem

The current `company_research.py` produces shallow output:
- Resume context is a hardcoded 2-sentence blurb, so talking points aren't grounded in Alex's actual experience
- Search coverage is limited to CEO, HQ, LinkedIn, and one generic news query
- Output has only 4 sections, so new data categories (tech stack, funding, culture, competitors) have nowhere to go
- No skills/keyword config exists to drive experience matching against the job description (JD)

## Approach: Query Expansion + Parallel JSON Searches + Single LLM Pass

Run all searches (the sequential companyScraper pass plus the new parallel SearXNG JSON queries), aggregate the results into a structured context block, and pre-select resume experiences by keyword score. A single LLM call then produces all expanded sections.

---

## Design

### 1. Search Pipeline

**Phase 1: companyScraper (unchanged, sequential)**
- CEO name, HQ address, LinkedIn URL

**Phase 1b: Parallel SearXNG JSON queries (new/expanded)**

Up to six queries run concurrently via daemon threads (the CEO press query is included only when a CEO name was found):

| Intent | Query pattern |
|---|---|
| Recent news/press | `"{company}" news 2025 2026` |
| Funding & investors | `"{company}" funding round investors Series valuation` |
| Tech stack | `"{company}" tech stack engineering technology platform` |
| Competitors | `"{company}" competitors alternatives vs market` |
| Culture / Glassdoor | `"{company}" glassdoor culture reviews employees` |
| CEO press (if found) | `"{ceo}" "{company}"` |

Each query returns 3–4 deduplicated snippets (title + content + URL), labeled by type.
Results are best-effort: any failed query is silently skipped.

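A minimal sketch of Phase 1b, assuming a caller-supplied `fetch(query)` function (hypothetical) that performs the SearXNG request and returns a list of dicts with `title`, `content`, and `url` keys. The label names, helper names, and `timeout` default are illustrative and not taken from `company_research.py`.

```python
import threading

# Query patterns copied from the table above; the dict keys are hypothetical labels.
QUERY_PATTERNS = {
    "news": '"{company}" news 2025 2026',
    "funding": '"{company}" funding round investors Series valuation',
    "tech": '"{company}" tech stack engineering technology platform',
    "competitors": '"{company}" competitors alternatives vs market',
    "culture": '"{company}" glassdoor culture reviews employees',
}
CEO_PATTERN = '"{ceo}" "{company}"'


def build_queries(company, ceo=None):
    """Return {label: query}; the CEO query is added only when a CEO was found."""
    queries = {label: p.format(company=company) for label, p in QUERY_PATTERNS.items()}
    if ceo:
        queries["ceo_press"] = CEO_PATTERN.format(ceo=ceo, company=company)
    return queries


def dedupe(snippets, limit=4):
    """Keep the first `limit` snippets that have distinct URLs."""
    seen, kept = set(), []
    for snippet in snippets:
        if snippet["url"] in seen:
            continue
        seen.add(snippet["url"])
        kept.append(snippet)
        if len(kept) == limit:
            break
    return kept


def run_searches(queries, fetch, timeout=15.0):
    """Run each query in its own daemon thread; failed queries are skipped."""
    results, lock = {}, threading.Lock()

    def worker(label, query):
        try:
            snippets = dedupe(fetch(query))
        except Exception:
            return  # best-effort: a failed query contributes nothing
        with lock:
            results[label] = snippets

    threads = [
        threading.Thread(target=worker, args=item, daemon=True)
        for item in queries.items()
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join(timeout)
    with lock:
        return dict(results)  # snapshot; slow threads cannot mutate it later
```
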

---

### 2. Resume Matching

**`config/resume_keywords.yaml`** holds three categories, each managed as tags via the Settings UI:

```yaml
skills:
  - Customer Success
  - Technical Account Management
  - Revenue Operations
  - Salesforce
  - Gainsight
  - data analysis
  - stakeholder management

domains:
  - B2B SaaS
  - enterprise software
  - security / compliance
  - post-sale lifecycle

keywords:
  - QBR
  - churn reduction
  - NRR / ARR
  - onboarding
  - renewal
  - executive sponsorship
  - VOC
```

**Matching logic:**
1. Run a case-insensitive substring check of every configured keyword against the JD text, producing the `matched_keywords` list
2. Score each experience entry by the number of matched keywords that appear in its position title or responsibility bullets
3. Include the top 2 entries by score in the prompt in full detail (position, company, period, all bullets)
4. Condense the remaining entries to one-liners (e.g. "Founder @ M3 Consulting, 2023–present")

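The matching steps above can be sketched as follows. This is an illustration under stated assumptions, not the code in `company_research.py`: it pools all three config categories into one keyword list, and it assumes each experience entry is a dict with `position`, `company`, `period`, and `bullets` keys (a hypothetical shape).

```python
def match_keywords(config, jd_text):
    """Step 1: case-insensitive substring match of every configured term."""
    jd = jd_text.lower()
    terms = [
        term
        for category in ("skills", "domains", "keywords")
        for term in config.get(category, [])
    ]
    return [term for term in terms if term.lower() in jd]


def score_experience(entry, matched):
    """Step 2: count matched keywords found in the title or bullets."""
    haystack = " ".join([entry["position"], *entry["bullets"]]).lower()
    return sum(1 for kw in matched if kw.lower() in haystack)


def select_experiences(entries, matched, top_n=2):
    """Steps 3-4: full detail for the top N, one-liners for the rest."""
    # sorted() is stable, so ties keep their original resume order.
    ranked = sorted(entries, key=lambda e: score_experience(e, matched), reverse=True)
    top, rest = ranked[:top_n], ranked[top_n:]
    one_liners = [f'{e["position"]} @ {e["company"]}, {e["period"]}' for e in rest]
    return top, one_liners
```

One consequence of plain substring matching is that a compound tag such as `NRR / ARR` matches only when the JD contains that exact string, so splitting it into separate tags may match more reliably.
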
**UpGuard NDA rule** (explicit in prompt): refer to UpGuard as "enterprise security vendor" by default; name UpGuard directly only if the role has a strong security/compliance focus.

---

### 3. LLM Context Block Structure

```
## Role Context
{title} at {company}

## Job Description
{JD text, up to 2500 chars}

## Alex's Matched Experience
[Top 2 scored experience entries, full detail]

Also in Alex's background: [remaining entries as one-liners]

## Matched Skills & Keywords
Skills matching this JD: {matched_keywords joined}

## Live Company Data
- CEO: {name}
- HQ: {location}
- LinkedIn: {url}

## News & Press
[snippets]

## Funding & Investors
[snippets]

## Tech Stack
[snippets]

## Competitors
[snippets]

## Culture & Employee Signals
[snippets]
```

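One way to assemble this block is sketched below. The function and parameter names, the search-result labels, and the `unknown` / `(no results)` placeholders are assumptions made for illustration. Only the section order and the 2500-character JD limit come from the template above.

```python
JD_CHAR_LIMIT = 2500  # from the template: "JD text, up to 2500 chars"

# (label, heading) pairs; the labels are hypothetical keys for search results.
SNIPPET_SECTIONS = [
    ("news", "News & Press"),
    ("funding", "Funding & Investors"),
    ("tech", "Tech Stack"),
    ("competitors", "Competitors"),
    ("culture", "Culture & Employee Signals"),
]


def format_entry(entry):
    """Full-detail rendering: position, company, period, then every bullet."""
    header = f'{entry["position"]} @ {entry["company"]}, {entry["period"]}'
    return "\n".join([header, *(f"- {b}" for b in entry["bullets"])])


def format_snippets(snippets):
    """Render snippets as bullets, or a placeholder when a query was skipped."""
    if not snippets:
        return "(no results)"
    return "\n".join(f'- {s["title"]}: {s["content"]} ({s["url"]})' for s in snippets)


def build_context(title, company, jd_text, top_entries, one_liners, matched, live, search):
    """Assemble the prompt context in the section order of the template."""
    experience = "\n\n".join(format_entry(e) for e in top_entries)
    background = "; ".join(one_liners)
    parts = [
        f"## Role Context\n{title} at {company}",
        f"## Job Description\n{jd_text[:JD_CHAR_LIMIT]}",
        f"## Alex's Matched Experience\n{experience}\n\n"
        f"Also in Alex's background: {background}",
        f"## Matched Skills & Keywords\nSkills matching this JD: {', '.join(matched)}",
        "## Live Company Data\n"
        f"- CEO: {live.get('ceo', 'unknown')}\n"
        f"- HQ: {live.get('hq', 'unknown')}\n"
        f"- LinkedIn: {live.get('linkedin', 'unknown')}",
    ]
    for label, heading in SNIPPET_SECTIONS:
        parts.append(f"## {heading}\n{format_snippets(search.get(label))}")
    return "\n\n".join(parts)
```
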

---

### 4. Output Sections (7, up from 4)

| Section header | Purpose |
|---|---|
| `## Company Overview` | What they do, business model, size/stage, market position |
| `## Leadership & Culture` | CEO background, leadership team, philosophy |
| `## Tech Stack & Product` | What they build, relevant technology, product direction |
| `## Funding & Market Position` | Stage, investors, recent rounds, competitor landscape |
| `## Recent Developments` | News, launches, pivots, exec moves |
| `## Red Flags & Watch-outs` | Culture issues, layoffs, exec departures, financial stress |
| `## Talking Points for Alex` | 5 role-matched, resume-grounded, UpGuard-aware talking points ready to speak aloud |

The talking-points prompt instructs the LLM to cite the specific matched experience by name, reference matched skills, apply the UpGuard NDA rule, and frame each point as a ready-to-speak sentence.

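Because every section starts with a fixed level-2 header, the LLM response can be split back into sections with a small parser. The sketch below is one possible approach. `parse_sections` is a hypothetical name, and how each section maps onto a database column is deliberately left out because the design does not specify it.

```python
import re

EXPECTED_SECTIONS = [
    "Company Overview",
    "Leadership & Culture",
    "Tech Stack & Product",
    "Funding & Market Position",
    "Recent Developments",
    "Red Flags & Watch-outs",
    "Talking Points for Alex",
]

# Matches level-2 headers only; "### ..." lines stay part of the section body.
HEADER_RE = re.compile(r"^##\s+(.+?)\s*$")


def parse_sections(raw_output):
    """Split LLM output on known level-2 headers and return {header: body}."""
    sections = {}
    current = None
    for line in raw_output.splitlines():
        match = HEADER_RE.match(line)
        if match:
            name = match.group(1)
            # Text under an unrecognised header is dropped rather than guessed at.
            current = name if name in EXPECTED_SECTIONS else None
            if current:
                sections[current] = []
            continue
        if current:
            sections[current].append(line)
    return {name: "\n".join(body).strip() for name, body in sections.items()}
```

A caller can compare the returned keys against `EXPECTED_SECTIONS` to detect sections the model omitted.
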

---

### 5. DB Schema Changes

Add columns to the `company_research` table:

```sql
ALTER TABLE company_research ADD COLUMN tech_brief TEXT;
ALTER TABLE company_research ADD COLUMN funding_brief TEXT;
ALTER TABLE company_research ADD COLUMN competitors_brief TEXT;
ALTER TABLE company_research ADD COLUMN red_flags TEXT;
```

Existing columns (`company_brief`, `ceo_brief`, `talking_points`, `raw_output`) are unchanged.

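Running the `ALTER TABLE` statements a second time would fail because the columns already exist, so a migration helper may want to check first. The sketch below assumes the database is SQLite accessed through Python's `sqlite3` module, which this document does not state. `migrate` is a hypothetical name.

```python
import sqlite3

NEW_COLUMNS = ["tech_brief", "funding_brief", "competitors_brief", "red_flags"]


def migrate(conn):
    """Add any missing columns; safe to run repeatedly. Returns columns added."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    existing = {row[1] for row in conn.execute("PRAGMA table_info(company_research)")}
    added = []
    for column in NEW_COLUMNS:
        if column not in existing:
            # Column names come from the fixed list above, never from user input.
            conn.execute(f"ALTER TABLE company_research ADD COLUMN {column} TEXT")
            added.append(column)
    conn.commit()
    return added
```
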

---

### 6. Settings UI: Skills & Keywords Tab

New tab in `app/pages/2_Settings.py`:
- One expander or subheader per category (Skills, Domains, Keywords)
- Tag chips rendered with `st.pills`, or with columns of `st.badge`-style buttons that each carry a × remove control
- Inline text input + Add button per category
- Each add/remove saves immediately to `config/resume_keywords.yaml`

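The add/remove behaviour can be kept separate from the Streamlit widgets as two pure functions over the loaded config dict. This is a sketch under assumptions: `add_tag` and `remove_tag` are hypothetical names, the case-insensitive duplicate check is a design choice not stated above, and writing the result back to `config/resume_keywords.yaml` is left to the caller.

```python
CATEGORIES = ("skills", "domains", "keywords")


def add_tag(config, category, tag):
    """Return a new config with `tag` appended, unless blank or a duplicate."""
    tag = tag.strip()
    if category not in CATEGORIES or not tag:
        return dict(config)
    current = list(config.get(category, []))
    # Treat "salesforce" and "Salesforce" as the same tag.
    if tag.lower() not in {t.lower() for t in current}:
        current.append(tag)
    return {**config, category: current}


def remove_tag(config, category, tag):
    """Return a new config with exact matches of `tag` removed."""
    if category not in config:
        return dict(config)
    return {**config, category: [t for t in config[category] if t != tag]}
```

Returning a new dict instead of mutating in place makes it easy for the UI to compare the result with the previous config and save only when something changed.
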

---

### 7. Interview Prep UI Changes

`app/pages/6_Interview_Prep.py` renders the new sections alongside the existing ones:
- Tech Stack & Product (new panel)
- Funding & Market Position (new panel)
- Red Flags & Watch-outs (new panel, visually distinct, e.g. orange/amber)
- Talking Points promoted to the top (most useful during a live call)

---

## Files Affected

| File | Change |
|---|---|
| `scripts/company_research.py` | Parallel search queries, resume matching, expanded prompt + sections |
| `scripts/db.py` | Add 4 new columns to `company_research`; update `save_research` / `get_research` |
| `config/resume_keywords.yaml` | New file |
| `config/resume_keywords.yaml.example` | New committed template |
| `app/pages/2_Settings.py` | New Skills & Keywords tab |
| `app/pages/6_Interview_Prep.py` | Render new sections |
| `tests/test_db.py` | Tests for new columns |
| `tests/test_company_research.py` | New test file for matching logic + section parsing |