feat: LLM queue optimizer — resource-aware batch scheduler (closes #2) #13

pyr0ball merged 17 commits from feature/llm-queue-optimizer into main 2026-03-15 05:11:30 -07:00

Author SHA1 Message Date
22091760bd feat: LLM queue optimizer complete — closes #2
Some checks failed
CI / test (pull_request) Failing after 32s
Resource-aware batch scheduler for LLM tasks:
- scripts/task_scheduler.py (new): TaskScheduler singleton with VRAM-aware
  batch scheduling, durability, thread-safe singleton, memory safety
- scripts/task_runner.py: submit_task() routes LLM types through scheduler
- scripts/db.py: reset_running_tasks() for durable restart behavior
- app/app.py: _startup() preserves queued tasks on restart
- config/llm.yaml.example: scheduler VRAM budget config documented
- tests/test_task_scheduler.py (new): 24 tests covering all behaviors

Pre-existing failure: test_generate_calls_llm_router (issue #12, unrelated)
2026-03-15 05:01:24 -07:00
a17ba1e8d8 feat(app): use reset_running_tasks() on startup to preserve queued tasks 2026-03-15 04:57:49 -07:00
1139cd55ec feat(task_runner): route LLM tasks through scheduler in submit_task()
Replaces the spawn-per-task model for LLM task types with scheduler
routing: cover_letter, company_research, and wizard_generate are now
enqueued via the TaskScheduler singleton for VRAM-aware batching.
Non-LLM tasks (discovery, email_sync, etc.) continue to spawn daemon
threads directly. Adds autouse clean_scheduler fixture to
test_task_runner.py to prevent singleton cross-test contamination.
2026-03-15 04:52:42 -07:00
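
The routing split described in commit 1139cd55ec can be sketched roughly as follows. This is a minimal illustration, not the project's code: the `submit_task` signature, the `LLM_TASK_TYPES` name, and the injected `enqueue` and `run` callables are all assumptions made so the example runs on its own. Only the three LLM task type names and the daemon-thread behavior for other types come from the commit message.

```python
import threading

# Task types named in the commit message as LLM-backed.
LLM_TASK_TYPES = frozenset({"cover_letter", "company_research", "wizard_generate"})


def submit_task(task_type, task_id, enqueue, run):
    """Route one task (hypothetical signature).

    LLM types go to the scheduler via `enqueue`; everything else keeps the
    older model of one daemon thread per task. Returns the started Thread
    for non-LLM tasks and None for scheduled ones.
    """
    if task_type in LLM_TASK_TYPES:
        enqueue(task_type, task_id)
        return None
    worker = threading.Thread(target=run, args=(task_type, task_id), daemon=True)
    worker.start()
    return worker
```

Passing the scheduler's enqueue function in as an argument is only one way to keep the runner free of a module-level import of the scheduler; the log says the circular import was eliminated by the routing split but does not say how the import itself is arranged.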
dfd2f0214e feat(scheduler): add durability — re-queue surviving LLM tasks on startup 2026-03-15 04:24:11 -07:00
1d9020c99a feat(scheduler): implement thread-safe singleton get_scheduler/reset_scheduler 2026-03-15 04:19:23 -07:00
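
A lock-guarded singleton of the kind named in commit 1d9020c99a might look like the sketch below. The `get_scheduler` and `reset_scheduler` names come from the commit subject; the placeholder class body, the module-level variable names, and the double-checked pattern are assumptions. The spec revision notes that safety comes from a lock rather than the GIL, which is what the `with` blocks illustrate.

```python
import threading


class TaskScheduler:
    """Placeholder standing in for the real scheduler class."""


_scheduler = None                   # hypothetical module-level slot
_scheduler_lock = threading.Lock()  # guards creation and reset


def get_scheduler():
    """Return the shared instance, creating it at most once."""
    global _scheduler
    if _scheduler is None:              # fast path, no lock taken
        with _scheduler_lock:
            if _scheduler is None:      # re-check while holding the lock
                _scheduler = TaskScheduler()
    return _scheduler


def reset_scheduler():
    """Drop the shared instance, e.g. between tests."""
    global _scheduler
    with _scheduler_lock:
        _scheduler = None
```

A reset hook like this is what makes an autouse test fixture practical: each test can start from a fresh instance instead of inheriting state from the previous one.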
84ce68af46 feat(scheduler): implement scheduler loop and batch worker with VRAM-aware scheduling 2026-03-15 04:14:56 -07:00
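
As an illustration of what VRAM-aware batch scheduling can mean, the sketch below decides which task types may start a batch worker given per-type budgets and a fixed amount of available VRAM. The function name, the data shapes, and the budget figures are hypothetical; the log only establishes that budgets are configured per task type, that tasks are batched by type, and that available VRAM is treated as static.

```python
def pick_startable(queues, active, budgets_gb, available_vram_gb):
    """Return task types whose batch can start now.

    queues: dict mapping task type -> list of pending task ids
    active: set of task types that already have a running batch
    budgets_gb: dict mapping task type -> VRAM budget in GB (assumed values)
    """
    # VRAM already reserved by running batches.
    reserved = sum(budgets_gb[t] for t in active)
    started = []
    for task_type, pending in queues.items():
        if not pending or task_type in active:
            continue  # nothing to do, or this type already has a batch
        need = budgets_gb[task_type]
        if reserved + need <= available_vram_gb:
            started.append(task_type)
            reserved += need
    return started


queues = {"cover_letter": [1, 2], "company_research": [3], "wizard_generate": []}
budgets = {"cover_letter": 2.5, "company_research": 5.0, "wizard_generate": 2.5}

print(pick_startable(queues, set(), budgets, 6.0))             # ['cover_letter']
print(pick_startable(queues, {"cover_letter"}, budgets, 8.0))  # ['company_research']
```

In the first call `company_research` is held back because 2.5 GB plus 5.0 GB exceeds the 6.0 GB available; in the second it fits alongside the batch that is already running.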
605e820fa6 feat(scheduler): implement enqueue() with depth guard and ghost-row cleanup 2026-03-15 04:05:22 -07:00
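
The depth guard and ghost-row cleanup from commit 605e820fa6 can be pictured with the sketch below. It assumes a `tasks` table with `id`, `status`, and `error` columns and a small `MAX_QUEUE_DEPTH`, none of which are confirmed by the log. The behavior it demonstrates is the one stated in the spec revision: a task dropped because the queue is full is marked failed in the database, so no row is left looking queued forever.

```python
import sqlite3
from collections import deque

MAX_QUEUE_DEPTH = 2  # hypothetical limit, kept tiny for the demo


def enqueue(queue, conn, task_id, max_depth=MAX_QUEUE_DEPTH):
    """Queue a task, or mark its DB row failed if the queue is full."""
    if len(queue) >= max_depth:
        # Ghost-row cleanup: the row must not stay 'queued' if we drop it.
        conn.execute(
            "UPDATE tasks SET status = 'failed', error = 'queue full' WHERE id = ?",
            (task_id,),
        )
        conn.commit()
        return False
    queue.append(task_id)
    return True


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tasks (id INTEGER PRIMARY KEY, status TEXT, error TEXT)")
conn.executemany("INSERT INTO tasks (id, status) VALUES (?, 'queued')", [(1,), (2,), (3,)])

queue = deque()
results = [enqueue(queue, conn, task_id) for task_id in (1, 2, 3)]
print(results)  # [True, True, False]
```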
fa780af2f1 refactor(scheduler): use module-level _get_gpus directly in __init__ 2026-03-15 04:01:01 -07:00
cceacdd371 feat(scheduler): implement TaskScheduler.__init__ with budget loading and VRAM detection 2026-03-15 03:32:11 -07:00
0fedf7989e feat(scheduler): add task_scheduler.py skeleton with constants and TaskSpec 2026-03-15 03:28:43 -07:00
b664240340 docs(config): add scheduler VRAM budget config to llm.yaml.example 2026-03-15 03:28:26 -07:00
1f2273f049 refactor(tests): remove unused imports from test_task_scheduler 2026-03-15 03:27:17 -07:00
5ba654e414 feat(db): add reset_running_tasks() for durable scheduler restart 2026-03-15 03:22:45 -07:00
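
One plausible shape for a helper like `reset_running_tasks()` is sketched below. The log does not state what status interrupted rows receive, so marking them `failed` is an assumption, as are the table schema and the error text. What the log does establish is that queued tasks are preserved across a restart, which is why the statement touches only rows in the `running` state.

```python
import sqlite3


def reset_running_tasks(conn):
    """Mark rows interrupted mid-run as failed; leave queued rows alone.

    Assumed semantics: a task still 'running' at startup cannot have a live
    worker, so it is closed out. Returns the number of rows changed.
    """
    cur = conn.execute(
        "UPDATE tasks SET status = 'failed', error = 'interrupted by restart' "
        "WHERE status = 'running'"
    )
    conn.commit()
    return cur.rowcount


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tasks (id INTEGER PRIMARY KEY, status TEXT, error TEXT)")
conn.executemany(
    "INSERT INTO tasks (id, status) VALUES (?, ?)",
    [(1, "queued"), (2, "running"), (3, "completed")],
)
changed = reset_running_tasks(conn)
print(changed)  # 1
```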
07166325dd docs: add LLM queue optimizer implementation plan
11-task TDD plan across 3 reviewed chunks. Covers:
- reset_running_tasks() db helper
- TaskScheduler skeleton + __init__ + enqueue + loop + workers
- Thread-safe singleton, durability, submit_task routing shim
- app.py startup change + full suite verification
2026-03-14 17:11:49 -07:00
7983f3365d docs: revise queue optimizer spec after review
Addresses 16 review findings across two passes:
- Clarify _active.pop/double-decrement non-issue
- Fix app.py change target (inline SQL, not kill_stuck_tasks)
- Scope durability to LLM types only
- Add _budgets to state table with load logic
- Fix singleton safety explanation (lock, not GIL)
- Ghost row fix: mark dropped tasks failed in DB
- Document static _available_vram as known limitation
- Fix test_llm_tasks_batch_by_type description
- Eliminate circular import via routing split in submit_task()
- Add missing budget warning at construction
2026-03-14 16:46:38 -07:00
9fcfe7daa1 docs: add LLM queue optimizer design spec
Resource-aware batch scheduler for LLM tasks. Closes #2.
2026-03-14 16:38:47 -07:00
397b778217 chore: add .worktrees/ to .gitignore
Prevents worktree directories from being tracked.
2026-03-14 16:30:38 -07:00