HALOs by ContextualAI

Library for aligning LLMs using human-aware loss functions

Created 1 year ago
886 stars

Top 40.9% on SourcePulse

View on GitHub
Project Summary

This library provides extensible implementations of human-aware loss functions (HALOs) for aligning Large Language Models (LLMs). It targets researchers and engineers who want modular, simple implementations of alignment methods such as DPO, KTO, and PPO, with support for custom dataloaders and loss functions.

How It Works

HALOs uses a modular architecture that separates dataloading, training, and sampling. It relies on Hydra for configuration, Accelerate for launching jobs, and FSDP for distributed training. The design prioritizes extensibility: new alignment losses or data-handling strategies can be implemented by subclassing the provided trainer and dataloader classes.
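As a rough illustration of that subclassing pattern, the sketch below implements the DPO objective by overriding a trainer's loss method. The class and method names here are hypothetical stand-ins, not the repo's actual API.

    import torch
    import torch.nn.functional as F

    # Hypothetical base class standing in for the repo's paired-preference trainer.
    class PairedPreferenceTrainer:
        """Consumes batches of (chosen, rejected) response pairs."""
        def loss(self, policy_chosen_logps, policy_rejected_logps,
                 ref_chosen_logps, ref_rejected_logps):
            raise NotImplementedError

    class MyDPOTrainer(PairedPreferenceTrainer):
        """Example: implement the DPO objective by overriding loss()."""
        def __init__(self, beta=0.1):
            self.beta = beta  # trades off reward margin vs. staying near the reference

        def loss(self, policy_chosen_logps, policy_rejected_logps,
                 ref_chosen_logps, ref_rejected_logps):
            # Log-ratios of the policy against the frozen reference model.
            chosen_logratio = policy_chosen_logps - ref_chosen_logps
            rejected_logratio = policy_rejected_logps - ref_rejected_logps
            # DPO: -log sigmoid(beta * margin between chosen and rejected log-ratios).
            return -F.logsigmoid(
                self.beta * (chosen_logratio - rejected_logratio)
            ).mean()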

Quick Start & Requirements

  • Install dependencies via ./install.sh.
  • Requires Python and PyTorch. Specific package versions are critical.
  • Supports multi-node training and LoRA.
  • FlashAttention can be enabled.
  • Precomputing reference model log probabilities is supported for memory savings (see the sketch after this list).
  • Official quick-start examples and detailed configuration options are available within the repository.
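To make the memory saving concrete, here is a minimal, hypothetical sketch of precomputing reference-model log probabilities in one pass, so the frozen reference model need not stay in memory alongside the policy during training. None of these function or field names come from the repo; an HF-style model output with a .logits attribute and labels masked with -100 are assumed.

    import torch

    @torch.no_grad()
    def precompute_ref_logps(ref_model, dataloader, device="cuda"):
        """Cache per-example sequence log probabilities from the frozen
        reference model so it can be discarded before training starts."""
        ref_model.eval().to(device)
        cache = {}
        for batch in dataloader:
            input_ids = batch["input_ids"].to(device)
            labels = batch["labels"].to(device)
            logits = ref_model(input_ids).logits
            # Shift so that position t predicts token t+1.
            logps = torch.log_softmax(logits[:, :-1], dim=-1)
            shifted = labels[:, 1:]
            mask = shifted != -100  # ignore prompt/padding positions
            gathered = torch.gather(
                logps, 2, shifted.clamp(min=0).unsqueeze(-1)
            ).squeeze(-1)
            seq_logps = (gathered * mask).sum(-1)
            for ex_id, lp in zip(batch["id"], seq_logps.cpu().tolist()):
                cache[ex_id] = lp
        return cache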

Highlighted Details

  • Supports DPO, KTO, PPO, and ORPO (the DPO objective is written out after this list).
  • Caches reference model log probabilities for efficiency.
  • Includes easy evaluation capabilities with AlpacaEval and LMEval.
  • Tested on LLMs from 1B to 30B parameters.
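For reference, the DPO objective, as stated in the original DPO paper (the standard formula, not a transcription of this repo's code), is

    \mathcal{L}_{\mathrm{DPO}}(\pi_\theta; \pi_{\mathrm{ref}}) =
        -\,\mathbb{E}_{(x, y_w, y_l) \sim \mathcal{D}} \left[
            \log \sigma\!\left(
                \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
                - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
            \right)
        \right]

where y_w and y_l are the preferred and dispreferred responses for prompt x, σ is the logistic function, and β controls how far the policy π_θ may drift from the reference π_ref. This is the same quantity computed in the loss() override sketched under How It Works.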

Maintenance & Community

The project was originally released with the KTO paper and has since undergone significant revisions. The Archangel models, trained with an earlier version of the library, are available on Hugging Face.

Licensing & Compatibility

The repository does not explicitly state a license in the README.

Limitations & Caveats

The README stresses that the pinned package versions are critical and that changing them may break the code. It also notes that intermediate checkpoints saved during LoRA training contain only the LoRA module, not the full model.
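To use such an intermediate checkpoint on its own, the LoRA module generally has to be recombined with the base model. A hypothetical sketch using Hugging Face PEFT follows; the repo may handle this differently, and the paths and base model name are illustrative only.

    from transformers import AutoModelForCausalLM
    from peft import PeftModel

    # Hypothetical paths; the repo's checkpoint layout may differ.
    base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
    # An intermediate checkpoint containing only the LoRA adapter weights.
    model = PeftModel.from_pretrained(base, "outputs/checkpoint-1000")
    # Merge the adapter into the base weights to get a standalone model.
    model = model.merge_and_unload()
    model.save_pretrained("outputs/merged-model")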

Health Check

  • Last commit: 2 months ago
  • Responsiveness: 1 week
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 7 stars in the last 30 days

Explore Similar Projects

Starred by Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla, OpenAI; author of CS 231n), Stefan van der Walt (core contributor to the scientific Python ecosystem), and 12 more.

litgpt by Lightning-AI (13k stars, top 0.1%)

LLM SDK for pretraining, finetuning, and deploying 20+ high-performance LLMs. Created 2 years ago; updated 5 days ago. Starred by Junyang Lin (Core Maintainer at Alibaba Qwen), Vincent Weisser (Cofounder of Prime Intellect), and 25 more.

alpaca-lora by tloen (19k stars, top 0.0%)

LoRA fine-tuning for LLaMA. Created 2 years ago; updated 1 year ago.