feat: agent watchdog + Ollama adopt-if-running #17
Reference: Circuit-Forge/circuitforge-core#17
Summary
#15: Agent watchdog / coordinator-restart reconnect
- NodeStore: SQLite persistence at ~/.local/share/circuitforge/cf-orch-nodes.db
- AgentSupervisor.restore_from_store(): reloads all known nodes on coordinator startup and marks them offline until the first successful poll
- register() now persists every call to NodeStore
#16: Ollama adopt-if-running + health_path
- ProcessSpec: adopt: bool + health_path: str fields
- ServiceManager.start(): with adopt=True, probes health first and claims the running service without spawning a new process
- ServiceInstance.health_path + upsert_instance(health_path=): the probe loop uses the per-instance path
- Ollama profile block: adopt: true, health_path: /api/tags
Tests
Closes #15, #16
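The persist-then-restore flow described for #15 can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: only the names NodeStore, AgentSupervisor, register(), restore_from_store() and the database path come from the summary. The table schema, the upsert()/all() methods, and the in-memory `nodes` dict are assumptions made for the example.

```python
import sqlite3
from pathlib import Path

# Path taken from the PR summary; the schema below is hypothetical.
DEFAULT_DB = Path.home() / ".local/share/circuitforge/cf-orch-nodes.db"


class NodeStore:
    """Hypothetical sketch: remember known agent nodes across coordinator restarts."""

    def __init__(self, db_path=DEFAULT_DB):
        Path(db_path).parent.mkdir(parents=True, exist_ok=True)
        self._conn = sqlite3.connect(str(db_path))
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS nodes ("
            "node_id TEXT PRIMARY KEY, agent_url TEXT NOT NULL)"
        )
        self._conn.commit()

    def upsert(self, node_id, agent_url):
        # INSERT OR REPLACE keeps exactly one row per node_id.
        self._conn.execute(
            "INSERT OR REPLACE INTO nodes (node_id, agent_url) VALUES (?, ?)",
            (node_id, agent_url),
        )
        self._conn.commit()

    def all(self):
        cur = self._conn.execute(
            "SELECT node_id, agent_url FROM nodes ORDER BY node_id"
        )
        return cur.fetchall()


class AgentSupervisor:
    """Hypothetical sketch of the supervisor side of the flow."""

    def __init__(self, store):
        self._store = store
        self.nodes = {}  # node_id -> {"agent_url": str, "online": bool}

    def register(self, node_id, agent_url):
        # Every register() call is written through to the store.
        self._store.upsert(node_id, agent_url)
        self.nodes[node_id] = {"agent_url": agent_url, "online": True}

    def restore_from_store(self):
        # Restored nodes start offline; a later successful poll would flip them.
        for node_id, agent_url in self._store.all():
            self.nodes[node_id] = {"agent_url": agent_url, "online": False}
        return len(self.nodes)
```

Marking restored nodes offline means the coordinator does not route work to an agent that may have gone away while it was down; the node only becomes eligible again once the poll loop reaches it.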
- ProcessSpec: adopt (bool) and health_path (str, default /health) fields
- ServiceManager: adopt=True probes health_path before spawning; is_running() uses health probe for adopt services rather than proc table + socket check
- _probe_health() helper: urllib GET on localhost:port+path, returns bool
- Agent /services/{service}/start: returns adopted=True when service was already running; coordinator sets state=running immediately (no probe wait)
- ServiceInstance: health_path field (default /health)
- service_registry.upsert_instance(): health_path kwarg
- Probe loop uses inst.health_path instead of hardcoded /health
- coordinator allocate_service: looks up health_path from profile spec via _get_health_path() and stores on ServiceInstance
- All GPU profiles (2/4/6/8/16/24 GB + cpu-16/32): ollama managed block with adopt=true, health_path=/api/tags, port 11434
- 11 new tests
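The adopt-if-running behaviour listed above might look roughly like the sketch below. It is an assumption-laden illustration rather than the PR's implementation: the `port` and `command` fields, the `start()` return shape, the injectable `spawn` parameter, and the proxy bypass are all invented here. Only adopt, health_path, the /health default, and the urllib GET against localhost come from the commit message.

```python
import http.client
import subprocess
import urllib.error
import urllib.request
from dataclasses import dataclass, field


@dataclass
class ProcessSpec:
    # adopt and health_path mirror the fields named in the commit message;
    # port and command are hypothetical stand-ins for the rest of the spec.
    port: int
    command: list = field(default_factory=list)
    adopt: bool = False
    health_path: str = "/health"


# Assumption: bypass any configured HTTP proxy, since the probe targets localhost.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def probe_health(port, path="/health", timeout=2.0):
    """Return True if GET http://localhost:<port><path> answers with a 2xx status."""
    url = f"http://localhost:{port}{path}"
    try:
        with _opener.open(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, 4xx/5xx, or a malformed reply: not healthy.
        return False


def start(spec, spawn=subprocess.Popen):
    """Adopt an already-running service, or spawn a new one."""
    if spec.adopt and probe_health(spec.port, spec.health_path):
        # Something healthy already answers on the port: claim it, spawn nothing.
        return {"adopted": True, "proc": None}
    return {"adopted": False, "proc": spawn(spec.command)}
```

With the values from the profiles, a call such as `start(ProcessSpec(port=11434, command=["ollama", "serve"], adopt=True, health_path="/api/tags"))` would report `adopted` as true when an Ollama server is already listening, and would otherwise fall back to launching one. The `["ollama", "serve"]` command is only an example; the actual command is defined by the profile.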