Library for aligning LLMs using human-aware loss functions
This library provides extensible implementations of human-aware loss functions (HALOs) for aligning large language models (LLMs). It targets researchers and engineers who want modular, simple implementations of alignment methods such as DPO, KTO, and PPO, with support for custom dataloaders and loss functions.
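As a concrete example of what a HALO computes, the widely used DPO objective scores a preferred completion against a rejected one relative to a frozen reference model. The following is a minimal PyTorch sketch for illustration only; it is not this repo's API, and all names are hypothetical:

    import torch
    import torch.nn.functional as F

    def dpo_loss(policy_chosen_logps: torch.Tensor,
                 policy_rejected_logps: torch.Tensor,
                 ref_chosen_logps: torch.Tensor,
                 ref_rejected_logps: torch.Tensor,
                 beta: float = 0.1) -> torch.Tensor:
        """Standard DPO loss: -log sigmoid(beta * (chosen margin - rejected margin)).

        Each *_logps tensor holds summed per-sequence log-probabilities,
        shape (batch,); beta controls how far the policy may drift from
        the reference model.
        """
        chosen_margin = policy_chosen_logps - ref_chosen_logps
        rejected_margin = policy_rejected_logps - ref_rejected_logps
        return -F.logsigmoid(beta * (chosen_margin - rejected_margin)).mean()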
How It Works
HALOs offers a modular architecture separating dataloading, training, and sampling. It leverages Hydra for configuration, Accelerate for job launching, and FSDP for distributed training. This design prioritizes extensibility, allowing users to easily implement new alignment losses or data handling strategies by subclassing provided trainer and dataloader classes.
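The extension pattern looks roughly like the following. This is a hand-written sketch of the idea only; the base class, method signature, and loss shown are assumptions, not the repo's actual code:

    import torch
    import torch.nn.functional as F

    # Hypothetical stand-in for the library's paired-preference trainer;
    # in practice you would subclass the class the repo provides.
    class PairedPreferenceTrainer:
        def loss(self, policy_chosen_logps, policy_rejected_logps,
                 ref_chosen_logps, ref_rejected_logps):
            raise NotImplementedError

    class HingeTrainer(PairedPreferenceTrainer):
        """Toy alignment loss: a hinge on the reference-adjusted margin.

        Only loss() is overridden; dataloading, distributed setup, and
        sampling are inherited from the base trainer unchanged.
        """
        def loss(self, policy_chosen_logps, policy_rejected_logps,
                 ref_chosen_logps, ref_rejected_logps, beta=0.1):
            margin = (policy_chosen_logps - ref_chosen_logps) \
                   - (policy_rejected_logps - ref_rejected_logps)
            return torch.relu(1.0 - beta * margin).mean()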
Quick Start & Requirements
Installation is handled by the provided script; from the repository root:

    ./install.sh

Training jobs are then launched through Accelerate, with run configuration managed by Hydra.
Maintenance & Community
The project was originally released alongside the KTO paper and has been significantly revised since. Archangel models trained with an earlier version of the code are available on Hugging Face.
Licensing & Compatibility
The repository does not explicitly state a license in the README.
Limitations & Caveats
The README emphasizes that the pinned package versions are critical and that changing them may break the code. It also notes that intermediate checkpoints saved during LoRA training contain only the LoRA module, not the full model.
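If a standalone model is needed from one of those intermediate checkpoints, the adapter can usually be merged back into its base model with peft. This is a sketch under the assumption that the checkpoint is a standard peft adapter directory; the model name and paths are placeholders:

    from transformers import AutoModelForCausalLM
    from peft import PeftModel

    # Load the same base model the LoRA run started from (placeholder name).
    base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

    # Attach the saved LoRA adapter, then fold its weights into the base weights.
    model = PeftModel.from_pretrained(base, "outputs/checkpoint-1000")  # placeholder path
    merged = model.merge_and_unload()
    merged.save_pretrained("outputs/merged-model")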