feat: replace nomic-embed-text retriever with Agent-ModernColBERT for semantic chunk search #8
Background
Pagepiper currently uses nomic-embed-text (bi-encoder) + cosine similarity for chunk retrieval. This works for simple keyword-adjacent queries but loses nuance on complex rulebook questions. A bi-encoder collapses the whole query into a single vector, losing the multi-part reasoning structure.
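For reference, the single-vector scoring described above can be sketched as follows. This is an illustration only, not Pagepiper's actual code: the toy 3-dimensional vectors, the chunk ids, and the `rank_chunks` helper are all hypothetical stand-ins for real nomic-embed-text embeddings and the existing retriever.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_chunks(query_vec, chunk_vecs):
    """Return (chunk_id, score) pairs sorted by descending cosine similarity."""
    scored = [(cid, cosine(query_vec, vec)) for cid, vec in chunk_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy embeddings: one vector per chunk, one vector per query.
chunks = {
    "grapple": [1.0, 0.0, 0.0],
    "prone": [0.0, 1.0, 0.0],
    "mixed": [1.0, 1.0, 0.0],
}
# A two-part question is pooled into a single vector before scoring,
# so each chunk receives exactly one similarity score.
query = [1.0, 1.0, 0.0]
ranking = rank_chunks(query, chunks)
```

Because the query is reduced to one vector before any comparison happens, the scorer cannot tell which part of a multi-part question a given chunk actually answers.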
Proposed upgrade
lightonai/Agent-ModernColBERT is a late-interaction retriever built on ModernBERT. Instead of one vector per chunk, it stores token-level embeddings and computes MaxSim interaction at query time. It is designed specifically for agentic/multi-hop queries.
The model is registered in the cf-orch model registry as agent-moderncolbert (cf-retriever service type, ~800MB VRAM).
What to change
Option A: in-process (simpler)
Load Agent-ModernColBERT directly in app/services/retriever.py using the pylate library (the recommended ColBERT inference library from LightOn). Replace the nomic-embed-text embedding + cosine search step.
Option B: via cf-orch (consistent with fleet model management)
Route retrieval through CFOrchClient.task_allocate("pagepiper", "retrieve") once a cf-retriever service is defined. This keeps model loading and VRAM usage outside the FastAPI process.
Option A is the right first step; Option B is worth revisiting when cf-retriever is a fully managed cf-orch service.
Index storage note
ColBERT stores per-token embeddings (~128 floats/token) rather than one vector per chunk, so the index will be larger than the current embedding store. For typical TTRPG rulebook collections this is acceptable; flag it if storage becomes a concern at scale.
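To make the per-token layout concrete, here is a minimal sketch of MaxSim scoring over token-level vectors, plus a back-of-the-envelope index size estimate. Several things are assumptions for illustration: the 2-dimensional toy vectors, the `maxsim` and `index_bytes` helper names, float32 storage (4 bytes per float, no compression), and the one-million-token collection size. Only the ~128 floats/token figure comes from the note above.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim(query_tokens, chunk_tokens):
    """Late-interaction score: for each query token, take its best match
    among the chunk's token vectors, then sum those maxima."""
    return sum(
        max(dot(q_tok, d_tok) for d_tok in chunk_tokens)
        for q_tok in query_tokens
    )


def index_bytes(total_tokens, dim=128, bytes_per_float=4):
    """Rough uncompressed index size. dim=128 follows the note above;
    bytes_per_float=4 assumes float32 storage."""
    return total_tokens * dim * bytes_per_float


# Toy example: the query has two token vectors, one per "part" of the question.
query_tokens = [[1.0, 0.0], [0.0, 1.0]]
chunk_covers_both = [[1.0, 0.0], [0.0, 1.0]]  # matches both query tokens
chunk_covers_one = [[1.0, 0.0], [1.0, 0.0]]   # matches only the first

score_both = maxsim(query_tokens, chunk_covers_both)
score_one = maxsim(query_tokens, chunk_covers_one)

# Hypothetical collection of one million indexed tokens.
estimated_size = index_bytes(1_000_000)
```

Under these assumptions a chunk that addresses both parts of the query scores higher than one that addresses only a single part, and one million tokens come to roughly 512 MB before any quantisation or compression the index backend might apply.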
Acceptance criteria
- pylate added to dependencies
- app/services/retriever.py uses Agent-ModernColBERT for chunk retrieval
- /chat endpoint behaviour unchanged (retriever is internal)
- Add pagepiper.retrieve to assignments.yaml in cf-orch, pointing at agent-moderncolbert
Related
- agent-moderncolbert (already registered)
- app/services/retriever.py, app/api/chat.py
- lightonai/Agent-ModernColBERT on HuggingFace