chore(llm): swap model_candidates order — Qwen2.5-3B first, Phi-4-mini fallback

Phi-4-mini's cached modeling_phi3.py imports SlidingWindowCache which
was removed in transformers 5.x. Qwen2.5-3B uses built-in qwen2 arch
and works cleanly. Reorder so Qwen is tried first.
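The reorder only helps because the orchestrator walks `model_candidates` in order and falls back when a candidate fails to load. A minimal sketch of that behavior (the `pick_model` helper and `load` callable are hypothetical, not the project's actual loader):

```python
# Hypothetical sketch: try each model_candidates entry in order and
# fall back to the next one when loading raises (e.g. an ImportError
# from cached remote code importing a symbol removed in transformers 5.x).
def pick_model(candidates, load):
    """Return the first candidate name that loads successfully."""
    errors = {}
    for name in candidates:
        try:
            load(name)
            return name
        except Exception as exc:
            errors[name] = exc  # remember why this candidate failed
    raise RuntimeError(f"no usable model among {candidates}: {errors}")
```

With the reordered config, `Qwen2.5-3B-Instruct` is attempted first, so the broken Phi-4-mini cached code is never imported unless Qwen itself fails.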
Author: pyr0ball 2026-04-02 16:36:38 -07:00
parent 11fb3a07b4
commit bc80922d61


@@ -48,8 +48,8 @@ backends:
   cf_orch:
     service: vllm
     model_candidates:
-      - Phi-4-mini-instruct
       - Qwen2.5-3B-Instruct
+      - Phi-4-mini-instruct
     ttl_s: 300
   vllm_research:
     api_key: ''