QueryTranslator now supports two backends chosen at startup:
- CF_ORCH_URL set: allocate a slot via POST /api/inference/task
(product=snipe, task=query_translation), call the allocated cf-text
service, then release the slot in a finally block so the VRAM lease is
freed even if the call fails.
- CF_ORCH_URL absent: existing LLMRouter path unchanged (ollama/vllm/api keys).
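
A minimal sketch of the backend selection and the allocate/call/release
flow described above, for illustration only. It is not the actual
QueryTranslator code. The names make_translator, post and fallback, the
release endpoint path, and the service_url / lease_id / text response
fields are all assumptions; only CF_ORCH_URL, POST /api/inference/task
and the product/task values come from this change. The post callable
stands in for an HTTP helper (for example a thin wrapper around
httpx.post(url, json=payload).json()).

```python
from typing import Callable, Mapping

# Hypothetical HTTP helper type: POST a JSON payload, return the decoded JSON reply.
PostFn = Callable[[str, dict], dict]


def make_translator(
    env: Mapping[str, str],
    post: PostFn,
    fallback: Callable[[str], str],
) -> Callable[[str], str]:
    """Pick the backend once, at startup, based on CF_ORCH_URL."""
    orch_url = env.get("CF_ORCH_URL")
    if not orch_url:
        # CF_ORCH_URL absent: keep the existing path (LLMRouter in the real code).
        return fallback

    base = orch_url.rstrip("/")

    def translate(text: str) -> str:
        # Allocate a slot for this task from the orchestrator.
        lease = post(
            f"{base}/api/inference/task",
            {"product": "snipe", "task": "query_translation"},
        )
        try:
            # Call the allocated service (response field names are assumed).
            reply = post(lease["service_url"], {"prompt": text})
            return reply["text"]
        finally:
            # Always release the slot, even if the call above raised.
            # The release path and payload here are assumptions.
            post(f"{base}/api/inference/release", {"lease_id": lease["lease_id"]})

    return translate
```

Passing post and fallback in as callables keeps the sketch independent
of any particular HTTP client and makes the release-on-error behaviour
easy to exercise with a fake.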
Also moves httpx from dev-only to main dependencies (already used by mcp/server.py).