HALOs by ContextualAI

Library for aligning LLMs using human-aware loss functions

created 1 year ago
873 stars

Top 42.0% on sourcepulse

View on GitHub
Project Summary

This library provides extensible implementations of human-aware loss functions (HALOs) for aligning Large Language Models (LLMs). It targets researchers and engineers seeking modularity and simplicity in implementing alignment methods like DPO, KTO, and PPO, enabling custom dataloaders and loss functions.

How It Works

HALOs offers a modular architecture separating dataloading, training, and sampling. It leverages Hydra for configuration, Accelerate for job launching, and FSDP for distributed training. This design prioritizes extensibility, allowing users to easily implement new alignment losses or data handling strategies by subclassing provided trainer and dataloader classes.
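This subclassing pattern can be sketched in plain Python. Class and method names below are illustrative only, not the library's actual API: a base trainer owns the training loop, and a new alignment method overrides just the loss.

```python
import math

class PairedPreferenceTrainer:
    """Hypothetical base class: owns the training loop; subclasses supply the loss."""
    def __init__(self, beta=0.1):
        self.beta = beta

    def loss(self, policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp):
        raise NotImplementedError

class DPOTrainer(PairedPreferenceTrainer):
    """Implements the standard DPO objective by overriding only loss()."""
    def loss(self, policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp):
        # DPO: -log sigmoid(beta * (chosen log-ratio - rejected log-ratio))
        logits = self.beta * ((policy_chosen_logp - ref_chosen_logp)
                              - (policy_rejected_logp - ref_rejected_logp))
        return -math.log(1.0 / (1.0 + math.exp(-logits)))

trainer = DPOTrainer(beta=0.1)
# Toy sequence log-probs for one (chosen, rejected) pair.
loss = trainer.loss(-10.0, -12.0, -10.5, -11.0)
```

A new loss like KTO or ORPO would follow the same shape: subclass, override `loss()`, leave the loop untouched.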

Quick Start & Requirements

  • Install dependencies via ./install.sh.
  • Requires Python and PyTorch. Specific package versions are critical.
  • Supports multi-node training and LoRA.
  • FlashAttention can be enabled.
  • Precomputing reference model log probabilities is supported for memory savings.
  • Official quick-start examples and detailed configuration options are available within the repository.
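The precomputation idea above can be sketched in plain Python: run the frozen reference model once over the dataset, cache per-example log probabilities, and look them up during training instead of holding a second model in memory. Function names here are illustrative, not the library's API.

```python
def precompute_reference_logps(dataset, ref_logp_fn):
    """One pass with the frozen reference model; returns {example_id: logp}."""
    return {ex["id"]: ref_logp_fn(ex["prompt"], ex["completion"]) for ex in dataset}

# Toy stand-in for a reference model's sequence log-probability.
def toy_ref_logp(prompt, completion):
    return -0.5 * len(completion.split())

dataset = [
    {"id": 0, "prompt": "Hi", "completion": "hello there"},
    {"id": 1, "prompt": "Hi", "completion": "go away now"},
]
cache = precompute_reference_logps(dataset, toy_ref_logp)
# During training, read cache[example_id] instead of running a live forward pass.
```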

Highlighted Details

  • Supports DPO, KTO, PPO, and ORPO.
  • Features reference logit caching for efficiency.
  • Integrates with AlpacaEval and LMEval for straightforward evaluation.
  • Tested on LLMs from 1B to 30B parameters.
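Since the library originated with the KTO paper, a per-example sketch of the KTO loss helps ground the list above. Here `r` is the policy/reference log-ratio log π(y|x) − log π_ref(y|x), and `z0` is the reference point (a batch-level KL estimate in the paper; taken as a given scalar in this simplified sketch).

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def kto_loss(r, desirable, beta=0.1, z0=0.0, lambda_d=1.0, lambda_u=1.0):
    """Simplified per-example KTO loss: desirable and undesirable
    completions are penalized asymmetrically around the reference point z0."""
    if desirable:
        return lambda_d * (1.0 - sigmoid(beta * (r - z0)))
    return lambda_u * (1.0 - sigmoid(beta * (z0 - r)))

# A desirable completion the policy already up-weights incurs little loss...
low = kto_loss(r=5.0, desirable=True)
# ...while a desirable completion the policy down-weights is penalized more.
high = kto_loss(r=-5.0, desirable=True)
```

Unlike DPO, KTO needs only a binary desirable/undesirable signal per example rather than paired preferences.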

Maintenance & Community

The project was originally released with the KTO paper and has since been significantly revised. Archangel models trained with an earlier version are available on Hugging Face.

Licensing & Compatibility

The repository does not explicitly state a license in the README.

Limitations & Caveats

The README warns that the pinned package versions are critical and that changing them may break the code. It also notes that intermediate checkpoints saved during LoRA training contain only the LoRA module, not the full model.

Health Check

  • Last commit: 2 weeks ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 37 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Ying Sheng (Author of SGLang), and 1 more.

LookaheadDecoding by hao-ai-lab

0.1% · 1k stars
Parallel decoding algorithm for faster LLM inference
created 1 year ago
updated 4 months ago
Starred by Ross Taylor (Cofounder of General Reasoning; Creator of Papers with Code), Daniel Han (Cofounder of Unsloth), and 4 more.

open-instruct by allenai

0.2% · 3k stars
Training codebase for instruction-following language models
created 2 years ago
updated 16 hours ago
Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Ying Sheng (Author of SGLang), and 9 more.

alpaca-lora by tloen

0.0% · 19k stars
LoRA fine-tuning for LLaMA
created 2 years ago
updated 1 year ago