Xwin-LM by Xwin-LM

LLM for alignment research, fine-tuning, and open-source contribution

created 1 year ago
1,040 stars

Top 36.8% on sourcepulse

View on GitHub
Project Summary

Xwin-LM provides a suite of powerful, stable, and reproducible large language model (LLM) alignment technologies. It targets researchers and developers seeking to enhance LLM performance through methods like supervised fine-tuning (SFT), reward modeling (RM), and reinforcement learning from human feedback (RLHF). The project's models have achieved state-of-the-art results, notably surpassing GPT-4 on the AlpacaEval benchmark.

How It Works

Xwin-LM builds upon Llama 2 base models, integrating advanced alignment techniques including RLHF. This approach focuses on improving instruction following, reasoning, and conversational abilities. The project emphasizes reproducibility and provides detailed benchmarks demonstrating competitive performance against leading proprietary and open-source models across various tasks, including general conversation, coding, and mathematical reasoning.

Quick Start & Requirements

  • Install/Run: Models are available via Hugging Face Transformers. Inference can be accelerated using vLLM.
  • Prerequisites: Python, Hugging Face Transformers, vLLM (for accelerated inference). Specific hardware requirements depend on model size (7B, 13B, 34B, 70B).
  • Resources: Requires significant GPU memory for larger models.
  • Links: Hugging Face, vLLM
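A minimal sketch of local inference with Hugging Face Transformers, assuming the model id `Xwin-LM/Xwin-LM-7B-V0.2` (check the Hugging Face hub for the current repository names; vLLM can be substituted for faster serving):

```python
# Sketch of local inference with Hugging Face Transformers. The model id
# below is an assumption; confirm exact names on the Hugging Face hub.

def build_prompt(user_message):
    """Single-turn prompt in the Vicuna-compatible format the README describes."""
    system = ("A chat between a curious user and an artificial intelligence assistant. "
              "The assistant gives helpful, detailed, and polite answers to the user's questions.")
    return f"{system} USER: {user_message} ASSISTANT:"

def generate(user_message, model_id="Xwin-LM/Xwin-LM-7B-V0.2"):
    # Heavy imports are kept local so build_prompt() is usable without a GPU.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
    inputs = tokenizer(build_prompt(user_message), return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=256)
    return tokenizer.decode(output[0], skip_special_tokens=True)
```

Larger variants (13B, 70B) follow the same pattern but need proportionally more GPU memory.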

Highlighted Details

  • Achieved TOP-1 on AlpacaEval, surpassing GPT-4.
  • Xwin-Math models set new state-of-the-art on MATH and GSM8K benchmarks.
  • Xwin-Coder models show comparable performance to GPT-3.5-turbo on multiple benchmarks.
  • Supports multi-turn conversations with a Vicuna-compatible prompt format.
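The multi-turn format can be sketched as below; the exact system prompt and the `</s>` turn separator are assumptions based on the Vicuna convention for Llama-2-based chat models:

```python
# Hypothetical helper illustrating a Vicuna-style multi-turn prompt.
SYSTEM = ("A chat between a curious user and an artificial intelligence assistant. "
          "The assistant gives helpful, detailed, and polite answers to the user's questions.")

def format_conversation(turns):
    """turns: list of (user_message, assistant_reply); reply is None for the pending turn."""
    prompt = SYSTEM
    for user, assistant in turns:
        prompt += f" USER: {user} ASSISTANT:"
        if assistant is not None:
            prompt += f" {assistant}</s>"  # EOS token closes each completed assistant turn
    return prompt
```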

Maintenance & Community

The project is actively updated with new model releases and benchmark results. Community engagement and support channels are not explicitly detailed in the README.

Licensing & Compatibility

All models are released under the Llama 2 License, which permits commercial use but has specific usage restrictions.

Limitations & Caveats

The project is described as "pre-release" in its citation, suggesting ongoing development. Inference code is provided, but training code and detailed alignment methodologies are not publicly available.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 6 stars in the last 90 days
