Migrate LLMRouter model selection to cf-orch task routing via /api/inference/task #7

Closed
opened 2026-05-13 10:02:35 -07:00 by pyr0ball · 0 comments

Background

cf-orch #60 shipped the three-layer task-model assignment system and POST /api/inference/task. Products can now route inference by task name instead of hardcoded model IDs.

Spec: circuitforge-plans/circuitforge-orch/superpowers/specs/2026-05-13-task-model-assignments-design.md

Current state

Pagepiper already supports CF_ORCH_URL as an LLMRouter backend (via get_llm_config() in app/config.py). However, model selection is still explicit via env vars (PAGEPIPER_EMBED_MODEL, etc.) and the chat endpoint calls LLMRouter with a specific model rather than delegating model choice to the assignment layer.

Call sites:

  • app/api/chat.py: _get_llm_router() / _require_llm(), passed to Synthesizer
  • app/services/synthesizer.py: Synthesizer.__init__(llm) (already DI-friendly)
  • scripts/ingest_pdf.py, scripts/ingest_epub.py, scripts/ingest_docx.py: embedding during ingest

What to do

1. Register tasks in assignments.yaml

pagepiper:
  rag_query:
    model_id: ibm-granite--granite-4.1-8b
    description: RAG chat completion (answer from retrieved context)
  embed:
    model_id: <embedding-model-slug>
    description: Chunk embedding for PDF ingest
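
A quick sanity check for the registration step might look like the sketch below. It assumes the YAML has already been parsed into a plain mapping (for example by a YAML loader) and that each task entry carries a non-empty `model_id`, as in the snippet above. The helper name `missing_tasks` and the `REQUIRED_TASKS` tuple are hypothetical and not part of cf-orch or pagepiper.

```python
from typing import Any, List, Mapping, Sequence

# Hypothetical: only rag_query is required by the acceptance criteria;
# the embed task is optional in the first pass.
REQUIRED_TASKS = ("rag_query",)


def missing_tasks(
    assignments: Mapping[str, Any],
    product: str = "pagepiper",
    required: Sequence[str] = REQUIRED_TASKS,
) -> List[str]:
    """Return the required task names that lack a usable entry.

    `assignments` is the already-parsed content of assignments.yaml.
    An entry counts as usable when it is a mapping with a non-empty model_id.
    """
    tasks = assignments.get(product) or {}
    missing = []
    for name in required:
        entry = tasks.get(name)
        if not isinstance(entry, Mapping) or not entry.get("model_id"):
            missing.append(name)
    return missing
```

An empty return value means every required task is registered for the product.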

2. Migrate chat.py to /api/inference/task

When CF_ORCH_URL is set, call POST /api/inference/task with {"product": "pagepiper", "task": "rag_query", ...} instead of routing through LLMRouter with an explicit model. Keep the LLMRouter path as fallback for standalone/Ollama installs.
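
As a rough illustration of the request side, the following sketch builds and sends the task call using only the standard library. Several things here are assumptions: the helper names are hypothetical, the fields elided as `...` in the body are passed through an opaque `extra` mapping because the issue does not name them, and the response is assumed to be JSON (its schema is defined by the cf-orch spec, not here).

```python
import json
import os
import urllib.request
from typing import Any, Mapping, Optional


def task_endpoint(base_url: str) -> str:
    """Join the cf-orch base URL with the task-routing path from the issue."""
    return base_url.rstrip("/") + "/api/inference/task"


def build_task_body(task: str, extra: Optional[Mapping[str, Any]] = None) -> dict:
    """Build the JSON body for a pagepiper task request.

    `extra` stands in for the fields the issue elides; their names are not
    specified there, so the caller supplies them.
    """
    body = dict(extra or {})
    # product and task are set last so `extra` cannot override them.
    body.update({"product": "pagepiper", "task": task})
    return body


def use_task_routing(env: Mapping[str, str] = os.environ) -> bool:
    """Task routing is opt-in: enabled only when CF_ORCH_URL is non-empty."""
    return bool(env.get("CF_ORCH_URL", "").strip())


def post_task(base_url: str, task: str,
              extra: Optional[Mapping[str, Any]] = None,
              timeout: float = 60.0) -> Any:
    """POST the task request and decode the reply (assumed to be JSON)."""
    req = urllib.request.Request(
        task_endpoint(base_url),
        data=json.dumps(build_task_body(task, extra)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

In `chat.py`, a guard such as `if use_task_routing(): ...` would select this path, with the existing LLMRouter code left in the `else` branch.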

3. Embed task (optional in the first pass)

The ingest scripts name nomic-embed-text explicitly. In a later pass, wire them to the embed task so that the embedding model can be swapped through the Avocet Assignments UI without restarting pagepiper.

4. Backwards compatibility

This migration is opt-in: the direct LLMRouter path continues to work for standalone installs without cf-orch.
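
One way to keep both paths testable is to choose the completion backend in a single small function and inject the result, which fits the fact that Synthesizer already takes its LLM as a constructor argument. This is only a sketch: `Completer` and `pick_completer` are hypothetical names, and the real LLMRouter and task-client interfaces may well differ from a plain `str -> str` callable.

```python
from typing import Callable, Mapping

# Hypothetical shape: anything that turns a prompt into a completion string.
Completer = Callable[[str], str]


def pick_completer(
    env: Mapping[str, str],
    via_task: Completer,
    via_router: Completer,
) -> Completer:
    """Choose the backend for chat completions.

    With CF_ORCH_URL set, model choice is delegated to cf-orch task routing.
    Without it, the existing LLMRouter path is used unchanged.
    """
    if env.get("CF_ORCH_URL", "").strip():
        return via_task
    return via_router
```

Tests can then pass two fakes and a plain dict as `env` to exercise each branch without a running cf-orch or Ollama instance.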

Acceptance Criteria

  • assignments.yaml has pagepiper.rag_query entry
  • chat.py routes through /api/inference/task when CF_ORCH_URL is set
  • Fallback to LLMRouter still works for standalone Ollama installs
  • Tests updated

Related

  • cf-orch #60 (task-model assignment layer)
  • app/api/chat.py, app/services/synthesizer.py
  • circuitforge-plans/circuitforge-orch/superpowers/specs/2026-05-13-task-model-assignments-design.md
Reference: Circuit-Forge/pagepiper#7