Fingerprint-based incremental glean: skip unchanged log files on batch re-glean #30
Reference: Circuit-Forge/turnstone#30
Turnstone currently re-gleans every configured log source on each batch interval, even when the files have not changed. For large static archives or slow-rotating logs, this wastes I/O and CPU.
Proposed approach:
manage.sh glean --force

Impact: Makes the 15-minute glean interval viable even for sources with hundreds of MB of static history. Particularly useful for corpus import workflows.
Reference: https://github.com/Lum1104/Understand-Anything (fingerprint-based incremental update pattern)
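
A rough sketch of what a fingerprint check could look like, to make the idea concrete. This is not Turnstone's code: the function names, the JSON state file, and the choice of size plus modification time as the fingerprint are all assumptions made for illustration. The `force` parameter assumes that `--force` is meant to bypass the fingerprint check, which the issue does not spell out.

```python
import json
from pathlib import Path


def fingerprint(path: Path) -> str:
    """Cheap fingerprint from file metadata (hypothetical choice).

    Uses size and nanosecond mtime so the file contents are never read.
    A content hash would be more robust but would cost the I/O we want to avoid.
    """
    st = path.stat()
    return f"{st.st_size}:{st.st_mtime_ns}"


def select_changed(paths, state_file: Path, force: bool = False):
    """Return the paths that need gleaning and record their fingerprints.

    state_file is a hypothetical JSON file mapping path -> last fingerprint.
    With force=True every path is returned regardless of its stored fingerprint.
    """
    try:
        seen = json.loads(state_file.read_text())
    except FileNotFoundError:
        seen = {}  # first run: nothing has been gleaned yet

    changed = []
    for p in map(Path, paths):
        fp = fingerprint(p)
        if force or seen.get(str(p)) != fp:
            changed.append(p)
        seen[str(p)] = fp

    state_file.write_text(json.dumps(seen, indent=2))
    return changed
```

In a real implementation the stored fingerprint should probably be updated only after the glean of that file succeeds, so that a failed run is retried on the next interval; the sketch records it immediately to stay short.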