ollama: register as tracked service in cf-orch (VRAM accounting + adopt-if-running) #16
Reference: Circuit-Forge/circuitforge-core#16
Problem
Ollama is listed in GPU profiles with a max_mb budget, but cf-orch has no visibility into it.
This also blocks self-hoster generalisation: Ollama is the most common local LLM backend, and most self-hosters will already have it running as a system service.
Proposed solution
1. adopt-if-running ProcessSpec mode
Add an adopt: true flag to the managed block. On coordinator startup (or first allocation attempt for that service), the agent:
- probes the service (GET http://localhost:11434/api/tags)
- if it responds, registers a running instance in the ServiceRegistry with the known URL; no process is started
2. health_path field in ProcessSpec
Ollama does not expose GET /health; it uses GET /api/tags. Add an optional health_path field to ProcessSpec so the probe loop uses the correct endpoint.
3. VRAM accounting
Once Ollama is a tracked running instance, the existing VRAM lease machinery accounts for it automatically. The allocator will correctly refuse to schedule competing services on the same GPU when Ollama is loaded.
4. Dashboard visibility
Ollama will appear in the Service Instances table like any other managed service.
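A minimal sketch of how the adopt-if-running mode and the health_path field could fit together. The ProcessSpec class here is a hypothetical stand-in, not cf-orch's real one; the field defaults, the ensure_service helper, the dict-based registry, and the "adopted"/"started" return values are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional
import urllib.error
import urllib.request


@dataclass
class ProcessSpec:
    """Hypothetical stand-in for cf-orch's ProcessSpec; field names are assumptions."""
    name: str
    url: str                      # base URL of the service
    adopt: bool = False           # adopt an already-running instance instead of spawning
    health_path: str = "/health"  # Ollama would override this with "/api/tags"


def http_probe(url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET to `url` answers with HTTP 200.

    Connection failures and HTTP error statuses are treated as unhealthy.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


def ensure_service(
    spec: ProcessSpec,
    registry: Dict[str, dict],
    probe: Callable[[str], bool] = http_probe,
    start: Optional[Callable[[ProcessSpec], None]] = None,
) -> str:
    """Adopt a healthy running instance if allowed, otherwise start one."""
    health_url = spec.url.rstrip("/") + spec.health_path
    if spec.adopt and probe(health_url):
        # Already running: record it with the known URL, spawn nothing.
        registry[spec.name] = {"url": spec.url, "state": "running", "adopted": True}
        return "adopted"
    if start is not None:
        start(spec)  # hypothetical hook into the normal managed-start path
    registry[spec.name] = {"url": spec.url, "state": "starting", "adopted": False}
    return "started"
```

Passing the probe as a parameter keeps the adopt decision testable without a live Ollama instance; for Ollama the spec would carry url="http://localhost:11434" and health_path="/api/tags".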
Acceptance criteria
- An already-running Ollama is adopted and registered as running
- health_path field honoured by probe loop
- GPU profiles include managed: + adopt: true blocks for Ollama
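The VRAM accounting behaviour from the proposal (the allocator refusing competing services once Ollama holds its budget) could be checked with something like the sketch below. It is not the existing cf-orch lease machinery: GpuLedger, try_lease, and the megabyte figures are hypothetical and only illustrate the expected outcome.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class GpuLedger:
    """Hypothetical per-GPU ledger of VRAM leases, in megabytes."""
    total_mb: int
    leases: Dict[str, int] = field(default_factory=dict)

    def free_mb(self) -> int:
        """VRAM not yet claimed by any tracked service."""
        return self.total_mb - sum(self.leases.values())

    def try_lease(self, service: str, max_mb: int) -> bool:
        """Reserve max_mb for a service; refuse if it would not fit."""
        if max_mb > self.free_mb():
            return False
        self.leases[service] = max_mb
        return True


# Illustrative numbers only: an 8 GiB card with Ollama budgeted at 6 GiB.
ledger = GpuLedger(total_mb=8192)
ollama_ok = ledger.try_lease("ollama", 6144)   # adopted instance claims its max_mb
other_ok = ledger.try_lease("other", 4096)     # competing service no longer fits
```

With Ollama tracked, the second lease is refused because only 2048 MB remain; without the Ollama entry the same request would have been granted, which is the oversubscription this issue describes.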