orpo by xfactlab

Preference optimization without a reference model

created 1 year ago
461 stars

Top 66.7% on sourcepulse

Project Summary

ORPO (Odds Ratio Preference Optimization), introduced in the paper "ORPO: Monolithic Preference Optimization without Reference Model", is a method for aligning large language models (LLMs) with human preferences, offering an alternative to existing techniques like RLHF. It targets LLM researchers and developers who want to improve instruction following and preference alignment.

How It Works

ORPO fine-tunes the LLM with a single loss: the standard supervised fine-tuning (negative log-likelihood) loss on the preferred response, plus a weighted odds-ratio penalty that raises the odds of generating the preferred response relative to the less preferred one. Because preference learning is folded into fine-tuning, neither a frozen reference model nor a separate reward model is needed, which simplifies the alignment pipeline.
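To make the objective above concrete, here is a minimal sketch in plain Python. It assumes the average per-token log-probabilities of the chosen and rejected responses have already been computed by the model. The function names and the default weight `lam=0.1` are hypothetical choices for this illustration, not the repository's code.

```python
import math


def log_odds(avg_logp: float) -> float:
    """Return log(p / (1 - p)) for p = exp(avg_logp).

    avg_logp must be strictly negative so that p < 1.
    """
    return avg_logp - math.log1p(-math.exp(avg_logp))


def neg_log_sigmoid(z: float) -> float:
    """Numerically stable -log(sigmoid(z)) = log(1 + exp(-z))."""
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))


def orpo_loss(chosen_avg_logp: float, rejected_avg_logp: float, lam: float = 0.1) -> float:
    """SFT loss on the chosen response plus a weighted odds-ratio penalty.

    lam is an assumed weighting hyperparameter for this sketch.
    """
    # Log of the odds ratio between the chosen and rejected responses.
    log_odds_ratio = log_odds(chosen_avg_logp) - log_odds(rejected_avg_logp)
    # Penalty shrinks as the chosen response becomes more likely than the rejected one.
    odds_ratio_term = neg_log_sigmoid(log_odds_ratio)
    # Negative log-likelihood of the chosen response (the SFT term).
    sft_term = -chosen_avg_logp
    return sft_term + lam * odds_ratio_term


if __name__ == "__main__":
    # The loss is lower when the model already prefers the chosen response.
    print(orpo_loss(-0.5, -2.0), orpo_loss(-2.0, -0.5))
```

In a real training loop the same arithmetic would be applied to batched tensors, with gradients flowing through both terms.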

Quick Start & Requirements

  • Install: ORPO is available through its integrations with 🤗 TRL, Axolotl, and LLaMA-Factory. A sample script for ORPOTrainer is in trl/test_orpo_trainer_demo.py.
  • Prerequisites: Requires Python and Hugging Face libraries. Specific hardware requirements (e.g., GPU, VRAM) depend on the model size and training configuration.
  • Resources: Links to Wandb reports for model checkpoints are provided.
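The integrations listed above train on preference pairs. As a rough sketch, assuming the `prompt` / `chosen` / `rejected` record layout commonly used for preference datasets in 🤗 TRL, a record could be sanity-checked as follows before training. The helper `validate_preference_record` and the example text are hypothetical and only illustrate the data shape.

```python
REQUIRED_KEYS = ("prompt", "chosen", "rejected")


def validate_preference_record(record: dict) -> dict:
    """Check one preference pair and return only the required fields.

    Hypothetical helper for illustration; not part of any library.
    """
    missing = [key for key in REQUIRED_KEYS if key not in record]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    for key in REQUIRED_KEYS:
        value = record[key]
        if not isinstance(value, str) or not value.strip():
            raise ValueError(f"{key!r} must be a non-empty string")
    # A pair carries no preference signal if both responses are identical.
    if record["chosen"] == record["rejected"]:
        raise ValueError("chosen and rejected responses must differ")
    return {key: record[key] for key in REQUIRED_KEYS}


# Made-up example record, for illustration only.
example = {
    "prompt": "Summarize ORPO in one sentence.",
    "chosen": "ORPO adds an odds-ratio penalty to the fine-tuning loss.",
    "rejected": "I don't know.",
    "source": "hand-written",  # extra fields are dropped by the helper
}

clean = validate_preference_record(example)
```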

Highlighted Details

  • Mistral-ORPO-β achieved a 14.7% length-controlled win rate on the AlpacaEval Leaderboard.
  • Provides fine-tuned model checkpoints such as kaist-ai/mistral-orpo-capybara-7k, kaist-ai/mistral-orpo-alpha, and kaist-ai/mistral-orpo-beta.
  • Includes performance results on AlpacaEval, MT-Bench, and IFEval benchmarks.

Maintenance & Community

  • Official repository for ORPO.
  • Updates indicate ongoing development and integration efforts.

Licensing & Compatibility

  • The README does not explicitly state the license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

  • Detailed training logs for Mistral-ORPO-Capybara-7k are marked as "TBU" (To Be Updated).
  • Some components may still be subject to change, although activity has slowed: the last commit was 1 year ago, with no pull requests or issues in the past 30 days.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 12 stars in the last 90 days
