20s was too tight for the first request after a model swap in Ollama (a cold model load can take 30-60s). 120s matches the coordinator's inference timeout.
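
As a rough illustration of the timeout change described above, the sketch below sends a non-streaming request to a local Ollama server with a 120-second timeout. It is not the code in `llm.py`. The names `LLM_TIMEOUT_SECONDS`, `build_generate_request`, and `generate` are hypothetical, and the URL assumes Ollama's default local port.

```python
import json
import urllib.request

# Hypothetical constant: the note above only says 20s was too short and that
# 120s matches the coordinator's inference timeout.
LLM_TIMEOUT_SECONDS = 120.0

# Assumption: Ollama is listening on its default local address.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's generate endpoint."""
    payload = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode("utf-8")
    return urllib.request.Request(
        OLLAMA_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, timeout: float = LLM_TIMEOUT_SECONDS) -> str:
    """Send the request and return the generated text.

    The timeout is a socket-level timeout. With streaming disabled the server
    sends nothing until generation finishes, so the wait has to cover a cold
    model load plus inference.
    """
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]
```

A call such as `generate("some-model", "hello")` would then wait up to 120 seconds before raising a timeout error, which leaves room for the 30-60s cold load mentioned above.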
| Name |
|---|
| `__init__.py` |
| `diagnose.py` |
| `incidents.py` |
| `llm.py` |
| `models.py` |
| `search.py` |