research: evaluate HRM-Text-1B as fine-tuning base for email classifier #68
Summary
Evaluate sapientinc/HRM-Text-1B as a candidate fine-tuning base model for Avocet's email classifier, in addition to or instead of standard decoder-only 1B models.
Why it is interesting
HRM (Hierarchical Reasoning Model) uses a dual-timescale recurrent architecture: two transformer stacks (H = slow/high-level, L = fast/low-level) iterate over the same embeddings in nested cycles. With H_cycles=2 and L_cycles=3, each forward pass makes 2 × 3 = 6 effective reasoning passes through the input, giving more effective compute depth than a standard 1B decoder-only model at the same parameter count.
This may generalize better than a standard 1B model on small labeled datasets, which is exactly the regime Avocet operates in: limited human-labeled email samples, high label diversity, and a need for robust generalization across user inboxes.
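The nested-cycle arithmetic above can be sketched as a toy schedule. This is only an illustration of how H_cycles and L_cycles multiply, not the model's actual forward pass: the function name `nested_cycle_schedule` is hypothetical, and the ordering (L iterates inside each H cycle, then H updates once) is an assumption about the architecture rather than something confirmed from the HRM-Text-1B code.

```python
def nested_cycle_schedule(h_cycles: int, l_cycles: int) -> list[str]:
    """Return the order of module updates for one (toy) forward pass.

    Assumption: the fast L stack runs l_cycles times inside every
    slow H cycle, and the H stack updates once per outer cycle.
    """
    schedule = []
    for h in range(h_cycles):
        for l in range(l_cycles):
            schedule.append(f"L[{h}.{l}]")  # fast / low-level update
        schedule.append(f"H[{h}]")          # slow / high-level update
    return schedule


schedule = nested_cycle_schedule(h_cycles=2, l_cycles=3)
l_passes = sum(step.startswith("L") for step in schedule)
h_passes = sum(step.startswith("H") for step in schedule)

print(schedule)
print(f"L passes: {l_passes}, H passes: {h_passes}")
```

Under these assumptions the L stack runs 6 times per forward pass while the weights are shared across iterations, which is where the extra effective depth at a fixed parameter count would come from.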
Model facts
GGUF quants: sinimiini/HRM-Text-1B-GGUF
GGUF / llama.cpp status
GGUF quants exist (sinimiini/HRM-Text-1B-GGUF) but require a 556-line patch to llama.cpp across 11 core files (llama-arch.cpp, llama-model.cpp, a new hrm-text.cpp, etc.). The patch is not yet upstreamed. For fine-tuning, use the safetensors path via transformers; GGUF is only relevant post-quantization for inference deployment.
Proposed evaluation
Fine-tuning notes
token_type_ids prefix-marking convention from the model card
<|quad_end|><|object_ref_end|> composite condition tokens enable chain-of-thought; may be worth including in fine-tuning prompts for label reasoning
References
runtime/llama.cpp-hrm_text.patch in GGUF repo