feat: LLM SFT finetuning backend (TRL + PEFT/LoRA) #46
Context
The SFT corrections pipeline (candidates → review → export JSONL) already exists. This closes the loop by adding the training backend so corrections actually train models.
Work
- scripts/finetune_sft.py using TRL SFTTrainer + PEFT LoRA
- {prompt, completion} pairs), base model HF id or local path
- models/ directory, training_info.json (same schema as classifier)
- llm-sft job type in the train job queue (#43)
- r, lora_alpha, target_modules, epochs, batch_size via job config_json
- _best_cuda_device() pattern (highest free VRAM via nvidia-smi)
- environment.yml: trl, peft
Acceptance
- CF_TEXT_4BIT=1 equivalent for training
- classifier_adapters.py FineTunedAdapter or new LoRAAdapter

Shipped in the Apr 19–May 4 sprint. The LLM SFT backend using TRL + PEFT/LoRA is in app/train/train.py.
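
For anyone picking this up later, here is a rough sketch of how the Work items could fit together. It is not the code in app/train/train.py. The names SFTJobConfig, parse_job_config, pick_device, best_cuda_device, load_pairs and train are hypothetical, the default hyperparameters are placeholders, and the training call assumes a recent TRL release that accepts {prompt, completion} rows and a peft_config argument.

```python
import json
import subprocess
from dataclasses import dataclass, field


@dataclass
class SFTJobConfig:
    """Hyperparameters read from a job's config_json (defaults are assumptions)."""
    r: int = 8
    lora_alpha: int = 16
    target_modules: list = field(default_factory=lambda: ["q_proj", "v_proj"])
    epochs: int = 1
    batch_size: int = 1


def parse_job_config(config_json: str) -> SFTJobConfig:
    """Overlay the recognised keys of config_json on top of the defaults."""
    raw = json.loads(config_json) if config_json else {}
    known = {k: v for k, v in raw.items() if k in SFTJobConfig.__dataclass_fields__}
    return SFTJobConfig(**known)


def pick_device(smi_csv: str) -> int:
    """Return the GPU index with the most free memory from 'index, free_mib' lines."""
    best_index, best_free = -1, -1
    for line in smi_csv.strip().splitlines():
        index, free_mib = (part.strip() for part in line.split(","))
        if int(free_mib) > best_free:
            best_index, best_free = int(index), int(free_mib)
    if best_index < 0:
        raise RuntimeError("no GPUs reported by nvidia-smi")
    return best_index


def best_cuda_device() -> int:
    """Query nvidia-smi for free memory per GPU and pick the emptiest one."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=index,memory.free",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return pick_device(out)


def load_pairs(jsonl_path: str) -> list:
    """Read {prompt, completion} records from a JSONL export, skipping blank lines."""
    rows = []
    with open(jsonl_path, encoding="utf-8") as fh:
        for line in fh:
            if line.strip():
                rec = json.loads(line)
                rows.append({"prompt": rec["prompt"], "completion": rec["completion"]})
    return rows


def train(jsonl_path: str, base_model: str, output_dir: str, config_json: str = "{}"):
    """Run LoRA SFT on the exported pairs and save the adapter to output_dir."""
    # Heavy dependencies are imported lazily so the helpers above work without them.
    from datasets import Dataset
    from peft import LoraConfig
    from trl import SFTConfig, SFTTrainer

    cfg = parse_job_config(config_json)
    trainer = SFTTrainer(
        model=base_model,  # HF id or local path
        args=SFTConfig(
            output_dir=output_dir,
            num_train_epochs=cfg.epochs,
            per_device_train_batch_size=cfg.batch_size,
        ),
        train_dataset=Dataset.from_list(load_pairs(jsonl_path)),
        peft_config=LoraConfig(
            r=cfg.r,
            lora_alpha=cfg.lora_alpha,
            target_modules=cfg.target_modules,
            task_type="CAUSAL_LM",
        ),
    )
    trainer.train()
    trainer.save_model(output_dir)
```

A job runner would presumably set CUDA_VISIBLE_DEVICES from best_cuda_device() before calling train(), since the variable has to be in place before CUDA is initialised. Writing training_info.json and the 4-bit loading path are left out because their details are not spelled out in this issue.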