feat: LLM queue optimizer — resource-aware batch scheduler (closes #2) #15
Reference: Circuit-Forge/peregrine#15
## Summary

- `scripts/task_scheduler.py`: resource-aware `TaskScheduler` singleton that groups LLM tasks by type into per-type deques, schedules batches by VRAM budget (deepest queue wins), and runs each type serially to avoid repeated model context-switching.
- `scripts/task_runner.py`: `submit_task()` routes `cover_letter`, `company_research`, and `wizard_generate` through the scheduler; all other task types continue spawning free threads.
- `scripts/db.py`: `reset_running_tasks()` marks only `running` tasks as failed on restart, leaving `queued` rows intact for the scheduler to resume.
- `app/app.py`: `_startup()` uses `reset_running_tasks()` instead of the old inline SQL that cleared both `queued` and `running` rows.
- `config/llm.yaml.example`: documents the `scheduler.vram_budgets` and `max_queue_depth` config keys.
- `tests/test_task_scheduler.py` (new): 24 tests covering all behaviors.

## Test plan

- Existing `test_generate_calls_llm_router` (issue #12)
- `app/app.py` syntax verified with `py_compile`
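The batching policy described in the summary (per-type deques, VRAM-budget batch sizing, deepest queue wins) can be sketched as below. Class and parameter names (`vram_budgets`, `total_vram`, `next_batch`) are illustrative, not the actual `scripts/task_scheduler.py` API:

```python
from collections import deque

class TaskScheduler:
    """Sketch of a resource-aware batch scheduler.

    Tasks are grouped into one deque per type; each round, the type
    with the deepest queue is chosen and as many of its tasks as fit
    the VRAM budget are batched together. Running one type at a time
    avoids repeated model context-switching.
    """

    def __init__(self, vram_budgets, total_vram):
        self.vram_budgets = vram_budgets          # per-type VRAM cost, e.g. {"cover_letter": 4}
        self.total_vram = total_vram              # total VRAM available for a batch
        self.queues = {t: deque() for t in vram_budgets}

    def enqueue(self, task_type, task):
        self.queues[task_type].append(task)

    def next_batch(self):
        """Return (task_type, batch) for the deepest queue, or (None, [])."""
        candidates = [t for t, q in self.queues.items() if q]
        if not candidates:
            return None, []
        # Deepest queue wins.
        task_type = max(candidates, key=lambda t: len(self.queues[t]))
        # Fit as many same-type tasks as the VRAM budget allows (at least one).
        per_task_cost = self.vram_budgets[task_type]
        capacity = max(1, self.total_vram // per_task_cost)
        queue = self.queues[task_type]
        batch = [queue.popleft() for _ in range(min(capacity, len(queue)))]
        return task_type, batch
```

With an 8 GB budget and a 4 GB per-task cost, `next_batch()` drains the deepest queue two tasks at a time while the other types wait their turn.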
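The routing rule in `submit_task()` can be illustrated with a minimal sketch. The `SCHEDULED_TYPES` constant and the `scheduler`/`spawn_thread` parameters are assumptions for illustration; the real `scripts/task_runner.py` signature may differ:

```python
# Assumed: only these heavy LLM task types go through the scheduler.
SCHEDULED_TYPES = {"cover_letter", "company_research", "wizard_generate"}

def submit_task(task_type, payload, scheduler, spawn_thread):
    """Route heavy LLM task types through the scheduler; all other
    types keep the legacy free-thread path."""
    if task_type in SCHEDULED_TYPES:
        scheduler.enqueue(task_type, payload)
        return "scheduled"
    spawn_thread(task_type, payload)
    return "threaded"
```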
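The restart behavior of `reset_running_tasks()` can be sketched as follows, assuming a SQLite `tasks` table with a `status` column (the actual schema in `scripts/db.py` may differ):

```python
import sqlite3

def reset_running_tasks(conn: sqlite3.Connection) -> int:
    """Mark tasks that were mid-flight when the app stopped as failed.

    Rows still in 'queued' are deliberately left untouched so the
    scheduler can resume them on the next startup. Returns the number
    of rows updated.
    """
    cur = conn.execute(
        "UPDATE tasks SET status = 'failed' WHERE status = 'running'"
    )
    conn.commit()
    return cur.rowcount
```

This is the key difference from the old `_startup()` SQL, which cleared both `queued` and `running` rows and therefore dropped pending work on every restart.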