Startup vec DB schema validation: detect dimension mismatch and auto-rebuild #3
Problem
When `PAGEPIPER_EMBED_DIMS` changes (e.g. swapping embedding models), the sqlite-vec virtual table has the old dimension baked into its DDL (`float[768]` vs `float[1024]`). The mismatch is not caught at startup; it surfaces as a runtime `sqlite3.OperationalError: Dimension mismatch` only when a user tries to chat or search.
This is especially disruptive for cloud instances where the vec DB is root-owned (created by the container process) and can only be deleted from inside the container.
Fix
At application startup (lifespan), read the actual dimension from the `page_vecs_vecs` virtual table schema and compare it to `PAGEPIPER_EMBED_DIMS`. If the two differ, drop the stale table.
Run this check from `lifespan()` before any ingest or search runs. After dropping, trigger re-ingest for all `ready` documents so vectors are rebuilt against the new schema.
Additional items
- If `PAGEPIPER_EMBED_DIMS` is not a positive int, exit with a clear error message
- … `docker exec` to delete it)
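
The startup check described under Fix, together with the positive-int validation, could look roughly like the sketch below. It is not the snippet from the original issue. It assumes the dimension can be recovered by matching `float[N]` in the `CREATE VIRTUAL TABLE` statement that SQLite stores in `sqlite_master`. The function names (`parse_embed_dims`, `dims_from_ddl`, `stored_dims`, `validate_vec_schema`) are hypothetical.

```python
import re
import sqlite3

VEC_TABLE = "page_vecs_vecs"  # table name as given in this issue
_DIM_RE = re.compile(r"float\[(\d+)\]", re.IGNORECASE)


def parse_embed_dims(raw):
    """Return the configured dimension as a positive int, or raise ValueError."""
    try:
        dims = int(raw)
    except (TypeError, ValueError):
        raise ValueError(
            f"PAGEPIPER_EMBED_DIMS must be a positive integer, got {raw!r}"
        ) from None
    if dims <= 0:
        raise ValueError(
            f"PAGEPIPER_EMBED_DIMS must be a positive integer, got {raw!r}"
        )
    return dims


def dims_from_ddl(ddl):
    """Extract N from the first float[N] in a table's DDL, or None if absent."""
    match = _DIM_RE.search(ddl or "")
    return int(match.group(1)) if match else None


def stored_dims(conn, table=VEC_TABLE):
    """Read the dimension baked into the table's stored DDL (None if no table)."""
    row = conn.execute(
        "SELECT sql FROM sqlite_master WHERE name = ?", (table,)
    ).fetchone()
    return dims_from_ddl(row[0]) if row else None


def validate_vec_schema(conn, raw_dims):
    """Drop the vec table on a dimension mismatch.

    Returns True when the table was dropped, meaning the caller should
    trigger a re-ingest. Returns False when the table is missing or matches.
    """
    expected = parse_embed_dims(raw_dims)
    actual = stored_dims(conn)
    if actual is None or actual == expected:
        return False
    # Assumption: the sqlite-vec extension is already loaded on `conn`;
    # SQLite needs the module present to drop a virtual table.
    conn.execute(f"DROP TABLE IF EXISTS {VEC_TABLE}")
    conn.commit()
    return True
```

In `lifespan()`, this would be called with the raw environment value before any ingest or search runs. A `ValueError` would then be turned into a clear exit message, and a `True` result would trigger the re-ingest of `ready` documents.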