pipeline: multimodal chunked pipeline — cf-docuvision page chunks → cf-text streaming #42
Reference: Circuit-Forge/circuitforge-core#42
Problem
For multi-page documents, the current ingest() → generate() pattern processes the entire document before any text is generated. For a 10-page resume or government form, that is a long wait before the UI shows anything.
Desired behaviour
Design sketch
The pipeline module (currently a staging stub in cf-core) needs a MultimodalPipeline that:
- feeds each StructuredDocument.raw_text chunk into cf-text generate_stream()
- yields (page_idx, token) tuples to the caller for progressive UI rendering
VRAM considerations
offload_between_steps: true flag for nodes below 16 GB.
Consumers
- falcon: government forms (multi-page PDFs)
- peregrine: resume analysis + cover letter generation in one pipeline
- godwit: identity document bundle (multiple document types)
Related
- circuitforge-core#41 (cf-text module, closed)
- Circuit-Forge/cf-docuvision (Dolphin-v2 service, scaffolded)
- circuitforge-core pipeline stub
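Sketch

To make the design sketch concrete, here is one possible shape for MultimodalPipeline in Python. It is a minimal sketch under stated assumptions, not the cf-core implementation. The issue names only MultimodalPipeline, StructuredDocument.raw_text, generate_stream(), the (page_idx, token) tuples, and the offload_between_steps flag. Everything else is hypothetical: the constructor arguments, the ingest_pages callable standing in for cf-docuvision, the offload hook and its "vision" / "text" step names, and the two fake_* helpers used for the demo.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Optional, Tuple


@dataclass
class StructuredDocument:
    """Hypothetical stand-in for one page chunk returned by cf-docuvision."""
    raw_text: str


class MultimodalPipeline:
    """Streams (page_idx, token) tuples page by page instead of waiting for the whole document."""

    def __init__(
        self,
        ingest_pages: Callable[[str], Iterable[StructuredDocument]],  # assumed shape
        generate_stream: Callable[[str], Iterable[str]],  # assumed shape
        offload_between_steps: bool = False,
        offload: Optional[Callable[[str], None]] = None,  # hypothetical hook
    ) -> None:
        self.ingest_pages = ingest_pages
        self.generate_stream = generate_stream
        self.offload_between_steps = offload_between_steps
        self.offload = offload

    def _maybe_offload(self, step: str) -> None:
        # Only call the hook when the flag is set, e.g. on low-VRAM nodes.
        if self.offload_between_steps and self.offload is not None:
            self.offload(step)

    def run(self, path: str) -> Iterator[Tuple[int, str]]:
        # If ingest_pages is a generator, the next page is only requested
        # after the current page's tokens have been consumed.
        for page_idx, page in enumerate(self.ingest_pages(path)):
            self._maybe_offload("vision")
            for token in self.generate_stream(page.raw_text):
                yield page_idx, token
            self._maybe_offload("text")


def fake_ingest_pages(path: str) -> Iterator[StructuredDocument]:
    """Demo stand-in for cf-docuvision: yields two fixed pages."""
    for text in ("page one", "page two"):
        yield StructuredDocument(raw_text=text)


def fake_generate_stream(prompt: str) -> Iterator[str]:
    """Demo stand-in for cf-text generate_stream(): echoes the words of the prompt."""
    yield from prompt.split()


if __name__ == "__main__":
    pipeline = MultimodalPipeline(fake_ingest_pages, fake_generate_stream)
    for page_idx, token in pipeline.run("form.pdf"):
        print(page_idx, token)
```

Because run() is a generator, the caller can render tokens for page 0 while later pages have not been ingested yet, provided the real page source is also lazy. The offload hook is left as a callable so that the policy for nodes below 16 GB stays outside the pipeline itself.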