Xwin-LM by Xwin-LM

LLM for alignment research, fine-tuning, and open-source contribution

created 1 year ago
1,040 stars

Top 36.8% on sourcepulse

View on GitHub
Project Summary

Xwin-LM provides a suite of powerful, stable, and reproducible large language model (LLM) alignment technologies. It targets researchers and developers seeking to enhance LLM performance through methods like supervised fine-tuning (SFT), reward modeling (RM), and reinforcement learning from human feedback (RLHF). The project's models have achieved state-of-the-art results, notably surpassing GPT-4 on the AlpacaEval benchmark.

How It Works

Xwin-LM builds upon Llama 2 base models, integrating advanced alignment techniques including RLHF. This approach focuses on improving instruction following, reasoning, and conversational abilities. The project emphasizes reproducibility and provides detailed benchmarks demonstrating competitive performance against leading proprietary and open-source models across various tasks, including general conversation, coding, and mathematical reasoning.

Quick Start & Requirements

  • Install/Run: Models are available via Hugging Face Transformers. Inference can be accelerated using vLLM.
  • Prerequisites: Python, Hugging Face Transformers, vLLM (for accelerated inference). Specific hardware requirements depend on model size (7B, 13B, 34B, 70B).
  • Resources: Requires significant GPU memory for larger models.
  • Links: Hugging Face, vLLM
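A minimal sketch of local inference with Hugging Face Transformers, assuming the model id `Xwin-LM/Xwin-LM-7B-V0.2` (check the Hugging Face hub for the current repository names; vLLM can be substituted for faster serving):

```python
# Sketch of local inference with Hugging Face Transformers. The model id
# below is an assumption; confirm exact names on the Hugging Face hub.

def build_prompt(user_message):
    """Single-turn prompt in the Vicuna-compatible format the README describes."""
    system = ("A chat between a curious user and an artificial intelligence assistant. "
              "The assistant gives helpful, detailed, and polite answers to the user's questions.")
    return f"{system} USER: {user_message} ASSISTANT:"

def generate(user_message, model_id="Xwin-LM/Xwin-LM-7B-V0.2"):
    # Heavy imports are kept local so build_prompt() is usable without a GPU.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
    inputs = tokenizer(build_prompt(user_message), return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=256)
    return tokenizer.decode(output[0], skip_special_tokens=True)
```

Larger variants (13B, 70B) follow the same pattern but need proportionally more GPU memory.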

Highlighted Details

  • Achieved TOP-1 on AlpacaEval, surpassing GPT-4.
  • Xwin-Math models set new state-of-the-art on MATH and GSM8K benchmarks.
  • Xwin-Coder models show comparable performance to GPT-3.5-turbo on multiple benchmarks.
  • Supports multi-turn conversations with a Vicuna-compatible prompt format.
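The multi-turn format can be sketched as below; the exact system prompt and the `</s>` turn separator are assumptions based on the Vicuna convention for Llama-2-based chat models:

```python
# Hypothetical helper illustrating a Vicuna-style multi-turn prompt.
SYSTEM = ("A chat between a curious user and an artificial intelligence assistant. "
          "The assistant gives helpful, detailed, and polite answers to the user's questions.")

def format_conversation(turns):
    """turns: list of (user_message, assistant_reply); reply is None for the pending turn."""
    prompt = SYSTEM
    for user, assistant in turns:
        prompt += f" USER: {user} ASSISTANT:"
        if assistant is not None:
            prompt += f" {assistant}</s>"  # EOS token closes each completed assistant turn
    return prompt
```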

Maintenance & Community

The project is actively updated with new model releases and benchmark results. Community engagement and support channels are not explicitly detailed in the README.

Licensing & Compatibility

All models are released under the Llama 2 License, which permits commercial use but has specific usage restrictions.

Limitations & Caveats

The project is described as "pre-release" in its citation, suggesting ongoing development. Inference code is provided, but training code and detailed alignment methodologies are not publicly available.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 6 stars in the last 90 days
