feat: LLM queue optimizer — resource-aware batch scheduler (closes #2) #13
Reference: Circuit-Forge/peregrine#13
## Summary
- `scripts/task_scheduler.py`: resource-aware `TaskScheduler` singleton that groups LLM tasks by type into per-type deques, schedules batches using a VRAM budget check (deepest queue wins), and runs each type serially to avoid repeated model context-switching. Up to N type batches may run concurrently when VRAM fits.
- `scripts/task_runner.py`: `submit_task()` now routes `cover_letter`, `company_research`, and `wizard_generate` through the scheduler; all other types (`discovery`, `email_sync`, etc.) continue spawning free threads unchanged.
- `scripts/db.py`: `reset_running_tasks()` — on restart, marks only `running` tasks as failed while leaving `queued` tasks intact for the scheduler to resume (durability).
- `app/app.py`: `_startup()` calls `reset_running_tasks()` instead of the old inline SQL that cleared both `queued` and `running` rows.
- `config/llm.yaml.example`: documented the `scheduler.vram_budgets` and `max_queue_depth` config keys.
- `tests/test_task_scheduler.py` (new): 24 tests covering budget loading, VRAM detection, enqueue depth guard, FIFO ordering, concurrent batches, mid-batch pickup, crash recovery, singleton thread-safety, and durability.

## Key design decisions
- When `_reserved_vram == 0`, at least one batch always starts even if its budget exceeds the VRAM ceiling (prevents permanent deadlock on under-resourced systems).
- `queued` rows survive restarts and are re-loaded into deques in `TaskScheduler.__init__`; only `running` rows (whose results are unknown) are reset to `failed`.
- `task_scheduler.py` never imports `task_runner.py`; routing lives in `submit_task()`, with `_run_task` passed in as a parameter.

## Test plan
- `test_generate_calls_llm_router`, tracked in issue #12)
- `tests/test_task_scheduler.py`: 24 new tests, all passing
- `tests/test_task_runner.py`: all passing, no regressions
- `app/app.py` syntax verified with `py_compile`
- `config/llm.yaml.example` YAML valid
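For reviewers, the batch-selection rule from the Summary (deepest queue wins, VRAM budget check, guaranteed progress when nothing is reserved) can be sketched as follows. This is a toy illustration with hypothetical names (`MiniScheduler`, `pick_next_batches`), not the actual `TaskScheduler` API:

```python
from collections import deque

class MiniScheduler:
    """Toy sketch of the batch-selection rule; names are illustrative,
    not the real scripts/task_scheduler.py API."""

    def __init__(self, vram_total_mb, vram_budgets_mb):
        self.vram_total = vram_total_mb
        self.budgets = vram_budgets_mb                  # per-type VRAM budget, MB
        self.queues = {t: deque() for t in vram_budgets_mb}
        self.reserved = 0                               # VRAM claimed by running batches

    def enqueue(self, task_type, task):
        self.queues[task_type].append(task)

    def pick_next_batches(self):
        """Deepest queue wins; start batches while the VRAM budget fits.
        When nothing is running (reserved == 0), the deepest non-empty queue
        starts anyway, so an oversized budget cannot deadlock the scheduler."""
        started = []
        for t in sorted(self.queues, key=lambda t: len(self.queues[t]), reverse=True):
            if not self.queues[t]:
                continue
            fits = self.reserved + self.budgets[t] <= self.vram_total
            if fits or (self.reserved == 0 and not started):
                batch = list(self.queues[t])            # whole deque becomes one batch
                self.queues[t].clear()
                self.reserved += self.budgets[t]
                started.append((t, batch))
        return started
```

With an 8 GB ceiling and 4 GB budgets, two type batches start concurrently and the third waits; on an under-resourced box the deepest queue still starts alone, matching the first design decision above.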
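The restart-durability rule (`queued` rows survive, only `running` rows are failed) can be sketched like this. The table name and schema here are assumptions for illustration, not the actual `scripts/db.py` schema:

```python
import sqlite3

def reset_running_tasks(conn):
    """Sketch of the restart-durability rule. After a crash, the outcome of
    'running' tasks is unknown, so they are marked failed; 'queued' rows are
    left untouched for the scheduler to re-load. Table/column names assumed."""
    conn.execute("UPDATE tasks SET status = 'failed' WHERE status = 'running'")
    conn.commit()

# Illustrative usage against an in-memory database:
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tasks (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany(
    "INSERT INTO tasks (status) VALUES (?)",
    [("running",), ("queued",), ("done",)],
)
reset_running_tasks(conn)
```

The key contrast with the old inline SQL in `_startup()` is the `WHERE` clause: it no longer touches `queued` rows, so work submitted just before a restart is resumed instead of silently dropped.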