huggingface/alignment-handbook: Handbook for aligning language models with human/AI preferences
Top 9.3% on SourcePulse
This repository provides a comprehensive toolkit and recipes for aligning large language models (LLMs) with human and AI preferences. It targets ML engineers and researchers seeking to replicate state-of-the-art chatbot alignment techniques like RLHF, DPO, and ORPO, offering robust, reproducible training pipelines.
How It Works
The handbook implements a four-step alignment pipeline: continued pretraining, supervised fine-tuning (SFT) for instruction following, preference alignment using methods like Direct Preference Optimization (DPO) or Odds Ratio Preference Optimization (ORPO), and a combined SFT/ORPO stage. It supports both full model weight training with DeepSpeed ZeRO-3 and parameter-efficient fine-tuning (PEFT) via LoRA/QLoRA. This modular approach allows users to adapt LLMs to new domains, languages, or specific behavioral objectives.
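As a rough illustration of how individual stages are launched, here is a minimal sketch; the script names and recipe paths follow the handbook's published layout but should be verified against the current repository:

```bash
# Full-weight SFT with DeepSpeed ZeRO-3; recipe paths are illustrative
ACCELERATE_LOG_LEVEL=info accelerate launch \
  --config_file recipes/accelerate_configs/deepspeed_zero3.yaml \
  scripts/run_sft.py recipes/zephyr-7b-beta/sft/config_full.yaml

# The preference-alignment stage (DPO) follows the same pattern
ACCELERATE_LOG_LEVEL=info accelerate launch \
  --config_file recipes/accelerate_configs/deepspeed_zero3.yaml \
  scripts/run_dpo.py recipes/zephyr-7b-beta/dpo/config_full.yaml
```

PEFT runs swap in a LoRA/QLoRA recipe config and a lighter accelerate config in place of ZeRO-3.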
Quick Start & Requirements
Create and activate a Python 3.10 environment (e.g. conda create -n handbook python=3.10 && conda activate handbook), install PyTorch v2.1.2 for your hardware, then run pip install . and pip install flash-attn --no-build-isolation. Finally, log in via huggingface-cli login and install Git LFS.
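The same steps as a single sequence (the PyTorch line is a placeholder; choose the build matching your CUDA stack from pytorch.org):

```bash
conda create -n handbook python=3.10 && conda activate handbook
# Install PyTorch v2.1.2 first; pick the hardware-specific build (see pytorch.org)
pip install torch==2.1.2
# Install the handbook's dependencies and Flash Attention 2
pip install .
pip install flash-attn --no-build-isolation
# Authenticate with the Hugging Face Hub and enable Git LFS
huggingface-cli login
git lfs install
```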
Maintenance & Community
The project is actively maintained by Hugging Face, with contributions from prominent researchers. It has a growing ecosystem with releases of new models and recipes. Community engagement channels are available via Hugging Face's platforms.
Licensing & Compatibility
The project is licensed under the Apache-2.0 license, permitting commercial use and integration with closed-source projects.
Limitations & Caveats
Pinning PyTorch to v2.1.2 is critical for reproducibility and requires careful attention to hardware and CUDA compatibility. Installing Flash Attention 2 may require adjusting MAX_JOBS on systems with limited RAM.
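Per Flash Attention's own install notes, capping the number of parallel build jobs works around out-of-memory failures during compilation (the value 4 is illustrative):

```bash
# Cap parallel compilation jobs so the flash-attn build stays within RAM
MAX_JOBS=4 pip install flash-attn --no-build-isolation
```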