sec: add per-user rate limiting on LLM generation endpoints #122

Open
opened 2026-06-13 21:35:46 -07:00 by pyr0ball · 0 comments
Owner

Summary

No rate limiting exists on any AI generation endpoint. In cloud mode, a single user can fire unlimited LLM requests, creating a cost-amplification risk.

Affected endpoints

  • POST /api/jobs/{job_id}/cover_letter/generate
  • POST /api/jobs/{job_id}/research/generate
  • POST /api/jobs/{job_id}/qa/suggest
  • POST /api/wizard/ai/interview
  • POST /api/jobs/{job_id}/survey/assist

Proposed approach

Add slowapi (pip install slowapi) with per-user rate limits. Example limits:

  • Interview wizard: 60 req/hour per user
  • Cover letter gen: 20 req/hour per user
  • Research: 10 req/hour per user

Use X-CF-User-ID (set by cloud session middleware) as the rate limit key; fall back to client IP in non-cloud mode.
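
The key selection described above could look roughly like the sketch below. It is illustrative only: `rate_limit_key` and the `user:` / `ip:` prefixes are hypothetical names, not existing project code, and it assumes the cloud session middleware overwrites any client-supplied `X-CF-User-ID` so the header can be trusted. The function only needs an object with `.headers` and `.client.host`, which is the shape of a Starlette/FastAPI request, so it should be usable as `Limiter(key_func=rate_limit_key)` in slowapi.

```python
from types import SimpleNamespace
from typing import Any

USER_ID_HEADER = "X-CF-User-ID"  # header name taken from the issue text


def rate_limit_key(request: Any) -> str:
    """Return a per-user key in cloud mode, else fall back to the client IP.

    Hypothetical helper: prefixes keep user IDs and IPs in separate key spaces.
    """
    user_id = request.headers.get(USER_ID_HEADER)
    if user_id:
        return f"user:{user_id}"
    # request.client can be None (e.g. some test clients or proxies).
    client = getattr(request, "client", None)
    host = client.host if client else "unknown"
    return f"ip:{host}"


# Stand-in request objects for illustration (not real Starlette requests).
cloud_req = SimpleNamespace(
    headers={USER_ID_HEADER: "42"}, client=SimpleNamespace(host="203.0.113.7")
)
local_req = SimpleNamespace(headers={}, client=SimpleNamespace(host="127.0.0.1"))

print(rate_limit_key(cloud_req))  # user:42
print(rate_limit_key(local_req))  # ip:127.0.0.1
```

Prefixing the key is a design choice rather than a requirement: it prevents a user ID that happens to look like an IP address from sharing a bucket with an unauthenticated client.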

Acceptance criteria

  • slowapi added to environment.yml
  • Rate limiter wired to the 5 endpoints above
  • 429 response includes Retry-After header
  • Rate limits configurable via env vars (LLM_RATE_COVER_LETTER, etc.)
  • Rate limiting disabled in demo mode (IS_DEMO=true)
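
A minimal sketch of the configuration side of these criteria, using only the standard library. Only `LLM_RATE_COVER_LETTER` and `IS_DEMO` are named in this issue; `LLM_RATE_INTERVIEW`, `LLM_RATE_RESEARCH`, `load_limit`, and `rate_limiting_enabled` are hypothetical names chosen for illustration. Defaults mirror the example limits above, and the accepted format is a deliberately narrow subset (`<count>/<unit>`) of the limit strings slowapi understands.

```python
import os
import re
from typing import Mapping

# Defaults mirror the example limits in this issue.
# Only LLM_RATE_COVER_LETTER is named in the issue; the other keys are assumed.
DEFAULT_LIMITS = {
    "LLM_RATE_INTERVIEW": "60/hour",
    "LLM_RATE_COVER_LETTER": "20/hour",
    "LLM_RATE_RESEARCH": "10/hour",
}

# Accept a narrow "<count>/<unit>" form so typos fail at startup, not at request time.
_LIMIT_RE = re.compile(r"^\d+/(second|minute|hour|day)$")


def load_limit(name: str, env: Mapping[str, str] = os.environ) -> str:
    """Read a limit string from the environment, falling back to the default."""
    value = env.get(name, DEFAULT_LIMITS[name]).strip()
    if not _LIMIT_RE.match(value):
        raise ValueError(f"{name}={value!r} is not of the form '<count>/<unit>'")
    return value


def rate_limiting_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """Rate limiting is on unless demo mode is active (IS_DEMO=true)."""
    return env.get("IS_DEMO", "").strip().lower() != "true"
```

The returned string could then be passed to a decorator such as `@limiter.limit(load_limit("LLM_RATE_COVER_LETTER"))`, and the boolean to the `enabled` argument of slowapi's `Limiter`. Whether the project wires it that way is an open implementation choice.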

Related

Part of the 2026-06-13 CVE scan security review.

Reference: Circuit-Forge/peregrine#122