Add embedding model and RAG classifier support to benchmark/finetune harness #55
Context
Pagepiper (new CF product) uses nomic-embed-text via Ollama for vector embeddings and a BM25 + sqlite-vec hybrid retrieval pipeline. As we add embedding-based products, Avocet needs to be able to benchmark, label, and finetune embedding and RAG classifiers, not just sequence-to-sequence generation models.
Required additions
Embedding model support
llm.yaml / benchmark config (separate from chat model)
LLMRouter.embed() (cf-core v0.19.0) as the embedding backend in harness runs
Embedding-based classifier support
embedding_similarity: embeds input, computes cosine similarity to class exemplars, assigns label
RAG pipeline evaluation
{ query, relevant_doc_ids, expected_answer } for end-to-end RAG eval
Fine-tuning targets
nomic-embed-text or a similar small embedding model on domain-specific corpora (TTRPG rulebooks, HR docs, etc.)?
{ anchor, positive, negative } triplets (contrastive)
Acceptance criteria
Notes
LLMRouter.embed() is available in cf-core v0.19.0 (just merged)
nomic-embed-text 768-dim via Ollama; good first target for domain fine-tune
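For discussion, here is a rough sketch of what the embedding_similarity classifier type listed under "Embedding-based classifier support" could look like: embed the input, compute cosine similarity against each class's exemplars, and assign the best-scoring label. Everything named here is hypothetical. `toy_embed` is a stand-in letter-frequency embedder so the snippet runs without a model; a real harness run would presumably pass a wrapper around the embedding backend instead, whose exact signature is not shown in this issue. Scoring by nearest exemplar (rather than, say, a class centroid) is also an assumption.

```python
import math
from collections import Counter
from typing import Callable, Dict, List, Sequence, Tuple

Vector = List[float]


def toy_embed(text: str) -> Vector:
    """Stand-in embedder (hypothetical): 26-dim letter-frequency vector."""
    counts = Counter(ch for ch in text.lower() if "a" <= ch <= "z")
    return [float(counts.get(chr(ord("a") + i), 0)) for i in range(26)]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(
    text: str,
    exemplars: Dict[str, List[str]],
    embed: Callable[[str], Vector] = toy_embed,
) -> Tuple[str, float]:
    """Return (label, score) of the exemplar most similar to `text`.

    Assumption: nearest-exemplar scoring; a centroid per class would be
    an equally plausible reading of the issue text.
    """
    if not any(exemplars.values()):
        raise ValueError("at least one exemplar is required")
    query_vec = embed(text)
    best_label, best_score = "", float("-inf")
    for label, examples in exemplars.items():
        for example in examples:
            score = cosine_similarity(query_vec, embed(example))
            if score > best_score:
                best_label, best_score = label, score
    return best_label, best_score


if __name__ == "__main__":
    # Hypothetical exemplars, loosely based on the corpora mentioned above.
    demo_exemplars = {
        "ttrpg": ["roll initiative dice"],
        "hr": ["vacation leave policy"],
    }
    print(classify("roll initiative dice", demo_exemplars))
```

Passing `embed` as a parameter keeps the classifier independent of the backend, so the same function could be benchmarked against different embedding models by swapping the callable.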