QueryTranslator now supports two backends chosen at startup:

- CF_ORCH_URL set: allocate via POST /api/inference/task (product=snipe, task=query_translation), call the allocated cf-text service, release the slot in a finally block to guarantee the VRAM lease is freed.
- CF_ORCH_URL absent: existing LLMRouter path unchanged (ollama/vllm/api keys).

Also moves httpx from dev-only to main dependencies (already used by mcp/server.py).


| File |
|---|
| __init__.py |
| cloud_session.py |
| ebay_webhook.py |
| main.py |
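
The allocate, call, release flow from the commit message can be sketched roughly as follows. This is a minimal illustration, not the actual QueryTranslator code: the `Transport` callable, the `service_url` and `task_id` response fields, the `/generate` path, and the DELETE release endpoint are all assumptions made for the example. Only the `CF_ORCH_URL` variable, the `POST /api/inference/task` endpoint, and the `product`/`task` values come from the commit message.

```python
from typing import Callable, Mapping, Optional

# Hypothetical transport: (method, url, json_body) -> parsed JSON response.
# In the real code this role would presumably be played by an httpx client.
Transport = Callable[[str, str, Optional[dict]], dict]


def choose_backend(env: Mapping[str, str]) -> str:
    """Pick the backend once at startup, based on CF_ORCH_URL."""
    return "orch" if env.get("CF_ORCH_URL") else "llm_router"


def translate_via_orch(query: str, orch_url: str, send: Transport) -> str:
    """Allocate a slot, call the allocated service, and always release."""
    base = orch_url.rstrip("/")
    # Endpoint and payload values are taken from the commit message.
    lease = send(
        "POST",
        f"{base}/api/inference/task",
        {"product": "snipe", "task": "query_translation"},
    )
    try:
        # Assumed response shape and service path (hypothetical).
        result = send("POST", f"{lease['service_url']}/generate", {"prompt": query})
        return result["text"]
    finally:
        # Runs on success and on error, so the lease is not leaked.
        # The release endpoint shown here is an assumption.
        send("DELETE", f"{base}/api/inference/task/{lease['task_id']}", None)
```

Placing the release in `finally` means a failed or timed-out call to the allocated service still frees the slot. Allocation itself sits outside the `try`, so nothing is released when no lease was obtained.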