peregrine

Author	SHA1	Message	Date
pyr0ball	74455ef10e	feat: add _download_size_mb() pure function for preflight size warning	2026-02-27 00:15:26 -08:00
pyr0ball	56b8af6bc7	feat: add ollama_research to preflight service table and LLM backend map	2026-02-27 00:14:04 -08:00
pyr0ball	d82cd43f2a	feat: ZeroShotAdapter, GLiClassAdapter, RerankerAdapter with full mock test coverage	2026-02-27 00:10:43 -08:00
pyr0ball	1f04f75905	feat: ClassifierAdapter ABC + compute_metrics() with full test coverage	2026-02-27 00:09:45 -08:00
pyr0ball	99f0f5b277	feat: add job-seeker-classifiers conda env for HF classifier benchmark	2026-02-26 23:43:41 -08:00
pyr0ball	93bf6b3c6f	feat: bundled skills suggestion list and content filter utility - config/skills_suggestions.yaml: 168 curated tags across skills (77), domains (40), keywords (51) covering CS/TAM/ops and common tech roles; structured for future community aggregate (paid tier backlog) - scripts/skills_utils.py: filter_tag() rejects blanks, URLs, profanity, overlong strings, disallowed chars, and repeated-char runs; load_suggestions() reads bundled YAML per category	2026-02-26 13:09:32 -08:00
pyr0ball	e982fa7a8b	fix: resume CID glyphs, resume YAML path, PyJWT dep, candidate voice & mission UI - resume_parser: add _clean_cid() to strip (cid:NNN) glyph refs from ATS PDFs; CIDs 127/149/183 become bullets, unknowns are stripped; applied to PDF/DOCX/ODT - resume YAML: canonicalize plain_text_resume.yaml path to config/ across all references (Settings, Apply, Setup, company_research, migrate); was pointing at unmounted aihawk/data_folder/ in Docker - requirements/environment: add PyJWT>=2.8 (was missing; broke Settings page) - user_profile: add candidate_voice field - generate_cover_letter: inject candidate_voice into SYSTEM_CONTEXT; add social_impact mission signal category (nonprofit, community, equity, etc.) - Settings: add Voice & Personality textarea to Identity expander; add Mission & Values expander with editable fields for all 4 mission categories - .gitignore: exclude CLAUDE.md, config/plain_text_resume.yaml, config/user.yaml.working - search_profiles: add default profile	2026-02-26 12:32:28 -08:00
pyr0ball	7ca20eec42	feat: ODT support, two-column PDF column-split extraction, title/company layout detection hardening	2026-02-26 10:33:28 -08:00
pyr0ball	1775c7fa36	fix: harden resume section detection — anchor patterns to full line, expand header synonyms, fix name heuristic for hyphenated/middle-initial names, add parse diagnostics UI	2026-02-26 09:28:31 -08:00
pyr0ball	26563a0990	refactor: replace LLM-based resume parser with section regex parser Primary parse path is now fully deterministic: no LLM, no token limits, no JSON generation. Handles two-column experience headers, institution-before-or-after-degree education layouts, and header bleed prevention via looks_like_header detection. LLM path retained as optional career_summary enhancement only (1500 chars, falls back silently). structure_resume() now returns tuple[dict, str]. Tests updated to match the new API.	2026-02-26 07:34:25 -08:00
pyr0ball	c8d8434371	fix: resume parser — max_tokens, json-repair fallback, logging, PYTHONUNBUFFERED	2026-02-26 00:00:23 -08:00
pyr0ball	70b385f3fd	fix: add /v1 prefix to all license server API paths	2026-02-25 23:35:58 -08:00
pyr0ball	7d5a706202	feat: license.py client — verify_local, effective_tier, activate, refresh, report_usage	2026-02-25 22:53:11 -08:00
pyr0ball	4da5e0a2a4	fix: GPU detection + pdfplumber + pass GPU env vars into app container - preflight.py now writes PEREGRINE_GPU_COUNT and PEREGRINE_GPU_NAMES to .env so the app container gets GPU info without needing nvidia-smi access - compose.yml passes PEREGRINE_GPU_COUNT, PEREGRINE_GPU_NAMES, and RECOMMENDED_PROFILE as env vars to the app service - 0_Setup.py _detect_gpus() reads PEREGRINE_GPU_NAMES env var first; falls back to nvidia-smi (bare / GPU-passthrough environments) - 0_Setup.py _suggest_profile() reads RECOMMENDED_PROFILE env var first - requirements.txt: add pdfplumber (needed for resume PDF parsing)	2026-02-25 21:58:28 -08:00
pyr0ball	1d228b293b	fix: stub-port adoption — stubs bind free ports, app routes to external via host.docker.internal Three inter-related fixes for the service adoption flow: - preflight: stub_port field — adopted services get a free port for their no-op container (avoids binding conflict with external service on real port) while update_llm_yaml still uses the real external port for host.docker.internal URLs - preflight: write_env now uses stub_port (not resolved) for adopted services so SEARXNG_PORT etc point to the stub's harmless port, not the occupied one - preflight: stub containers use sleep infinity + CMD true healthcheck so depends_on: service_healthy is satisfied without holding any real port - Makefile: finetune profile changed from [cpu,single-gpu,dual-gpu] to [finetune] so the pytorch/cuda base image is not built during make start	2026-02-25 21:38:23 -08:00
pyr0ball	7c62935371	fix: ollama docker_owned=True; finetune gets own profile to avoid build on start - preflight: ollama was incorrectly marked docker_owned=False — Docker does define an ollama service, so external detection now correctly disables it via compose.override.yml when host Ollama is already running - compose.yml: finetune moves from [cpu,single-gpu,dual-gpu] profiles to [finetune] profile so it is never built during 'make start' (pytorch/cuda base is 3.7GB+ and unnecessary for the UI) - compose.yml: remove depends_on ollama from finetune — it reaches Ollama via OLLAMA_URL env var which works whether Ollama is Docker or host - Makefile: finetune target uses --profile finetune + compose.gpu.yml overlay	2026-02-25 21:24:33 -08:00
pyr0ball	9c1f894446	feat: smart service adoption in preflight — use external services instead of conflicting preflight.py now detects when a managed service (ollama, vllm, vision, searxng) is already running on its configured port and adopts it rather than reassigning or conflicting: - Generates compose.override.yml disabling Docker containers for adopted services (profiles: [_external_] — a profile never passed via --profile) - Rewrites config/llm.yaml base_url entries to host.docker.internal:<port> so the app container can reach host-side services through Docker's host-gateway mapping - compose.yml: adds extra_hosts host.docker.internal:host-gateway to the app service (required on Linux; no-op on macOS Docker Desktop) - .gitignore: excludes compose.override.yml (auto-generated, host-specific) Only streamlit is non-adoptable and continues to reassign on conflict.	2026-02-25 19:23:02 -08:00
pyr0ball	bcde4c960e	feat: wire fine-tune UI end-to-end + harden setup.sh - setup.sh: replace docker-image-based NVIDIA test with nvidia-ctk validate (faster, no 100MB pull, no daemon required); add check_docker_running() to auto-start the Docker service on Linux or warn on macOS - prepare_training_data.py: also scan training_data/uploads/*.{md,txt} so web-uploaded letters are included in training data - task_runner.py: add prepare_training task type (calls build_records + write_jsonl inline; reports pair count in task result) - Settings fine-tune tab: Step 1 accepts .md/.txt uploads; Step 2 Extract button submits prepare_training background task + shows status; Step 3 shows make finetune command + live Ollama model status poller	2026-02-25 16:31:53 -08:00
pyr0ball	740b0ea45a	feat: containerize fine-tune pipeline (Dockerfile.finetune + make finetune) - Dockerfile.finetune: PyTorch 2.3/CUDA 12.1 base + unsloth + training stack - finetune_local.py: auto-register model via Ollama HTTP API after GGUF export; path-translate between finetune container mount and Ollama's view; update config/llm.yaml automatically; DOCS_DIR env override for Docker - prepare_training_data.py: DOCS_DIR env override so make prepare-training works correctly inside the app container - compose.yml: add finetune service (cpu/single-gpu/dual-gpu profiles); DOCS_DIR=/docs injected into app + finetune containers - compose.podman-gpu.yml: CDI device override for finetune service - Makefile: make prepare-training + make finetune targets	2026-02-25 16:22:48 -08:00
pyr0ball	7fab2a0cd3	feat: cover letter iterative refinement — feedback UI + backend params - generate() accepts previous_result + feedback; appends both to LLM prompt - task_runner cover_letter handler parses params JSON, passes fields through - Apply Workspace: "Refine with Feedback" expander with text area + Regenerate button; only shown when a draft exists; clears feedback after submitting - 8 new tests (TestGenerateRefinement + TestTaskRunnerCoverLetterParams)	2026-02-25 14:44:20 -08:00
pyr0ball	9fdb95e17f	feat: wizard_generate — feedback + previous_result support for iterative refinement	2026-02-25 08:29:56 -08:00
pyr0ball	6156aebd3a	feat: wizard_generate task type — 8 LLM generation sections	2026-02-25 08:25:17 -08:00
pyr0ball	2dd331cd59	feat: 13 integration implementations + config examples Add all 13 integration modules (Notion, Google Drive, Google Sheets, Airtable, Dropbox, OneDrive, MEGA, Nextcloud, Google Calendar, Apple Calendar/CalDAV, Slack, Discord, Home Assistant) with fields(), connect(), and test() implementations. Add config/integrations/*.yaml.example files and gitignore rules for live config files. Add 5 new registry/schema tests bringing total to 193 passing.	2026-02-25 08:18:45 -08:00
pyr0ball	f67eaab7de	feat: integration base class + auto-discovery registry	2026-02-25 08:13:14 -08:00
pyr0ball	c7e4749fc6	feat: resume parser — PDF/DOCX extraction + LLM structuring	2026-02-25 08:04:48 -08:00
pyr0ball	e2b5b26689	feat: wizard fields in UserProfile + params column in background_tasks - Add tier, dev_tier_override, wizard_complete, wizard_step, dismissed_banners fields to UserProfile with defaults and effective_tier property - Add params TEXT column to background_tasks table (CREATE + migration) - Update insert_task() to accept params with params-aware dedup logic - Update submit_task() and _run_task() to thread params through - Add test_wizard_defaults, test_effective_tier_override, test_effective_tier_no_override, and test_insert_task_with_params	2026-02-25 07:27:14 -08:00
pyr0ball	236db81ed3	feat: startup preflight — port collision avoidance + resource checks scripts/preflight.py (stdlib-only, no psutil): - Port probing: owned services auto-reassign to next free port; external services (Ollama) show ✓ reachable / ⚠ not responding - System resources: CPU cores, RAM (total + available), GPU VRAM via nvidia-smi; works on Linux + macOS - Profile recommendation: remote / cpu / single-gpu / dual-gpu - vLLM KV cache offload: calculates CPU_OFFLOAD_GB when VRAM < 10 GB free and RAM headroom > 4 GB (uses up to 25% of available headroom) - Writes resolved values to .env for docker compose; single-service mode (--service streamlit) for scripted port queries - Exit 0 unless an owned port genuinely can't be resolved scripts/manage-ui.sh: - Calls preflight.py --service streamlit before bind; falls back to pure-bash port scan if Python/yaml unavailable compose.yml: - vllm command: adds --cpu-offload-gb ${CPU_OFFLOAD_GB:-0} Makefile: - start / restart depend on preflight target - PYTHON variable for env portability - test target uses PYTHON variable Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-24 20:36:16 -08:00
pyr0ball	3eeccd7a33	feat: migration tool + portable startup scripts scripts/migrate.py: - dry-run by default; --apply writes files; --copy-db migrates staging.db - generates config/user.yaml from source repo's resume + cover letter scripts - copies gitignored configs (notion, email, adzuna, craigslist, search profiles, resume keywords, blocklist, aihawk resume) - merges fine-tuned model name from source llm.yaml into dest llm.yaml scripts/manage-ui.sh: - STREAMLIT_BIN no longer hardcoded; auto-resolves via conda env or PATH; override with STREAMLIT_BIN env var scripts/manage-vllm.sh: - VLLM_BIN and MODEL_DIR now read from env vars with portable defaults Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-24 20:25:54 -08:00
pyr0ball	84eb647348	feat: LGBTQIA+ focus + Phase 2/3 audit fixes LGBTQIA+ inclusion section in research briefs: - user_profile.py: add candidate_lgbtq_focus bool accessor - user.yaml.example: add candidate_lgbtq_focus flag (default false) - company_research.py: gate new LGBTQIA+ section behind flag; section count now dynamic (7 base + 1 per opt-in section, max 9) - 2_Settings.py: add "Research Brief Preferences" expander with checkboxes for both accessibility and LGBTQIA+ focus flags; mission_preferences now round-trips through save (no silent drop) Phase 2 fixes: - manage-vllm.sh: MODEL_DIR and VLLM_BIN now read from env vars (VLLM_MODELS_DIR, VLLM_BIN) with portable defaults - search_profiles.yaml: replace personal CS/TAM/Bay Area profiles with a documented generic starter profile Phase 3 fix: - llm.yaml: rename alex-cover-writer:latest → llama3.2:3b with inline comment for users to substitute their fine-tuned model; fix model-exclusion comment Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-24 20:02:03 -08:00
pyr0ball	40bd297b14	fix: remove hardcoded personal values — Phase 1 audit findings - 3_Resume_Editor.py: replace "Alex's" in docstring and caption - user_profile.py: expose mission_preferences and candidate_accessibility_focus - user.yaml.example: add mission_preferences section + candidate_accessibility_focus flag - generate_cover_letter.py: build _MISSION_NOTES from user profile instead of hardcoded personal passion notes; falls back to generic defaults when not set - company_research.py: gate "Inclusion & Accessibility" section behind candidate_accessibility_focus flag; section count adjusts (7 or 8) accordingly Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-24 19:57:03 -08:00
pyr0ball	9931f811ac	feat: add vision service to compose stack and fine-tune wizard tab to Settings - Add moondream2 vision service to compose.yml (single-gpu + dual-gpu profiles) - Create scripts/vision_service/Dockerfile for the vision container - Add VISION_PORT, VISION_MODEL, VISION_REVISION vars to .env.example - Add Vision Service entry to SERVICES list in Settings (hidden unless gpu profile active) - Add Fine-Tune Wizard tab (Task 10) to Settings with 3-step upload→preview→train flow - Tab is always rendered; shows info message when non-GPU profile is active Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-24 19:37:55 -08:00
pyr0ball	e575c66d53	feat: auto-generate llm.yaml base_url values from user profile services config Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-24 19:10:54 -08:00
pyr0ball	2b5ee80ca8	fix: thread searxng URL through research functions via _SEARXNG_URL constant - Add module-level _SEARXNG_URL derived from UserProfile.searxng_url (or default localhost:8888) - Update all _searxng_running() call sites to pass _SEARXNG_URL explicitly - Replace hardcoded "http://localhost:8888/" in _scrape_company() with _SEARXNG_URL + "/" - Replace hardcoded "http://localhost:8888/search" in _run_search_query() with f"{_SEARXNG_URL}/search" - Guard _profile.name.split() against empty string in finetune_local.py OLLAMA_NAME Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-24 18:52:10 -08:00
pyr0ball	bc94a92681	feat: extract hard-coded personal references from all scripts via UserProfile Replace hard-coded paths (/Library/Documents/JobSearch), names (Alex Rivera), NDA sets (_NDA_COMPANIES), and the scraper path with UserProfile-driven lookups. Update tests to be profile-agnostic (no user.yaml in peregrine config dir). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-24 18:45:39 -08:00
pyr0ball	83ce120666	feat: add UserProfile class with service URL generation and NDA helpers Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-24 18:29:45 -08:00
pyr0ball	f11a38eb0b	chore: seed Peregrine from personal job-seeker (pre-generalization) App: Peregrine Company: Circuit Forge LLC Source: github.com/pyr0ball/job-seeker (personal fork, not linked)	2026-02-24 18:25:39 -08:00

36 commits
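
Commit e982fa7a8b describes _clean_cid() as stripping (cid:NNN) glyph references from ATS PDFs, turning CIDs 127, 149, and 183 into bullets and dropping unknown ones. The following is a minimal sketch of that behavior, not the code in resume_parser. The regex, the "•" bullet character, and the names clean_cid, _CID_RE, and _BULLET_CIDS are assumptions made for illustration.

```python
import re

# Hypothetical reconstruction based only on the commit message;
# the real _clean_cid() in resume_parser may differ.
_CID_RE = re.compile(r"\(cid:(\d+)\)")

# CIDs that the commit message says are converted to bullets.
_BULLET_CIDS = {127, 149, 183}


def clean_cid(text: str) -> str:
    """Replace known bullet CIDs with a bullet and strip all other CID refs."""

    def _substitute(match: re.Match) -> str:
        cid = int(match.group(1))
        # Assumed bullet character; the source does not say which one is used.
        return "•" if cid in _BULLET_CIDS else ""

    return _CID_RE.sub(_substitute, text)
```

Calling clean_cid("(cid:127) Led a team(cid:3)") would yield a line that starts with a bullet and has the unknown reference removed. Text without any (cid:NNN) markers passes through unchanged.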