sec: add per-user rate limiting on LLM generation endpoints #122
Reference: Circuit-Forge/peregrine#122
Summary
No rate limiting exists on any AI generation endpoint. In cloud mode, a single user can fire unlimited LLM requests, creating a cost-amplification risk.
Affected endpoints
POST /api/jobs/{job_id}/cover_letter/generate
POST /api/jobs/{job_id}/research/generate
POST /api/jobs/{job_id}/qa/suggest
POST /api/wizard/ai/interview
POST /api/jobs/{job_id}/survey/assist
Proposed approach
Add slowapi (pip install slowapi) with per-user rate limits. Example limits: …
Use X-CF-User-ID (set by cloud session middleware) as the rate limit key; fall back to client IP in non-cloud mode.
Acceptance criteria
slowapi added to environment.yml
… Retry-After header
… LLM_RATE_COVER_LETTER, etc.)
… IS_DEMO=true)
Related
Part of the 2026-06-13 CVE scan security review.
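
The key selection from the proposed approach could be sketched roughly as below, assuming a Starlette-style request object (`.headers`, `.client.host`). The helper names `rate_limit_key` and `limit_for`, the `user:`/`ip:` key prefixes, and the `10/minute` default are hypothetical placeholders, not values taken from this issue. Reading `LLM_RATE_COVER_LETTER` from the environment is likewise an assumption about how limits would be configured. The slowapi wiring appears only in comments and has not been run against this codebase.

```python
import os
from types import SimpleNamespace

# Header set by the cloud session middleware (name taken from this issue).
USER_ID_HEADER = "X-CF-User-ID"


def rate_limit_key(request) -> str:
    """Return a per-user key when the user header is present, else a per-IP key.

    Works with any object exposing `.headers.get(...)` and `.client.host`,
    which is the shape of a Starlette/FastAPI Request.
    """
    user_id = request.headers.get(USER_ID_HEADER)
    if user_id:
        # Prefix is a hypothetical convention to keep user and IP keys apart.
        return f"user:{user_id}"
    client = getattr(request, "client", None)
    host = client.host if client else "unknown"
    return f"ip:{host}"


def limit_for(env_var: str, default: str) -> str:
    """Read a limit string such as "10/minute" from the environment.

    The default passed in is a placeholder; the issue's real limits are not
    reproduced here.
    """
    return os.environ.get(env_var, default)


# Possible wiring with slowapi (assumed, shown as comments only):
#
#   from slowapi import Limiter
#   limiter = Limiter(key_func=rate_limit_key)
#
#   @app.post("/api/jobs/{job_id}/cover_letter/generate")
#   @limiter.limit(lambda: limit_for("LLM_RATE_COVER_LETTER", "10/minute"))
#   async def generate_cover_letter(request: Request, job_id: int): ...

if __name__ == "__main__":
    cloud = SimpleNamespace(
        headers={USER_ID_HEADER: "u42"},
        client=SimpleNamespace(host="10.0.0.1"),
    )
    local = SimpleNamespace(headers={}, client=SimpleNamespace(host="10.0.0.1"))
    print(rate_limit_key(cloud))
    print(rate_limit_key(local))
```

Keying on the user header first means that users behind a shared NAT or proxy are not throttled together in cloud mode, while self-hosted installs without the middleware still get a per-IP limit.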