Premium/ultra users with a custom_writing_model in their session are
routed to that model as the first cf-orch candidate; all other tiers
use the shared Qwen2.5-3B-Instruct base. complete_json() is unchanged
since fine-tuned writing models aren't trained for structured output.
Adds _request_tier and _request_writing_model ContextVars. Resolution
order: USER_WRITING_MODELS env var first (Monday path), then Heimdall
meta (future path via peregrine#110).
- app/cloud_session.py: CloudSessionFactory(product="peregrine") from
cf-core v0.16.0; get_session / require_tier FastAPI dependencies;
session_middleware_dep sets request-scoped user_id ContextVar
- app/llm.py: _request_user_id ContextVar + set/get helpers;
_allocate_orch_async includes user_id in payload when present so
premium users get their custom model path from cf-orch UserModelRegistry
- app/main.py: session_middleware_dep wired as global FastAPI dependency;
runs on every request, so no function signatures need to change
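
The user_id plumbing in the file list above can be sketched with the
standard library alone. This is an illustration under assumptions, not
the code in app/llm.py: the helper names, the payload shape, and
build_allocation_payload are hypothetical, and handle_request stands in
for what session_middleware_dep does per request.

```python
import asyncio
from contextvars import ContextVar
from typing import Any, Dict, List, Optional

# Name taken from the commit text; the helpers below are illustrative.
_request_user_id: ContextVar[Optional[str]] = ContextVar(
    "_request_user_id", default=None
)


def set_request_user_id(user_id: Optional[str]) -> None:
    _request_user_id.set(user_id)


def get_request_user_id() -> Optional[str]:
    return _request_user_id.get()


def build_allocation_payload(task: str) -> Dict[str, Any]:
    """Attach user_id only when one is set (payload shape is assumed)."""
    payload: Dict[str, Any] = {"task": task}
    user_id = get_request_user_id()
    if user_id:
        payload["user_id"] = user_id
    return payload


async def handle_request(user_id: Optional[str]) -> Dict[str, Any]:
    # Stand-in for the per-request dependency: set the id, then call
    # downstream code without passing user_id through any signature.
    set_request_user_id(user_id)
    await asyncio.sleep(0)  # yield so the two requests interleave
    return build_allocation_payload("write")


async def demo() -> List[Dict[str, Any]]:
    # gather() runs each coroutine as its own task with a copied context,
    # so one request's user_id never leaks into the other.
    return await asyncio.gather(handle_request("user-a"), handle_request(None))
```

In FastAPI, a dependency passed as FastAPI(dependencies=[Depends(...)])
runs for every route, which is one way to get the "global dependency"
behaviour without touching handler signatures.
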
Force-added to bypass resume_matcher/ gitignore (CF-specific patch files).