WizardLM by nlpxucan

LLMs built using Evol-Instruct for complex instruction following

Created 2 years ago

9,474 stars

Top 5.4% on SourcePulse

Project Summary

WizardLM is a suite of large language models (LLMs) designed to excel at following complex instructions, with specialized versions for coding (WizardCoder) and mathematical reasoning (WizardMath). It targets researchers and developers seeking high-performance, instruction-following models that outperform many existing open-source alternatives and compete with leading proprietary models.

How It Works

The core innovation is Evol-Instruct, a method that uses LLMs to automatically generate diverse and complex instructions, progressively increasing difficulty. This approach enhances the model's ability to understand and execute intricate commands, leading to improved performance across various benchmarks.
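The evolution loop described above can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the prompt templates, the `evolve_instructions` function, and the `llm` callable (any function that maps a prompt string to a completion string) are all assumptions introduced here.

```python
import random
from typing import Callable, List, Optional

# Illustrative rewrite prompts (assumed, NOT the project's actual templates).
# The first two make an instruction harder; the third broadens topic coverage.
EVOLVE_PROMPTS = [
    "Rewrite the instruction below so that it adds one extra constraint:\n{instruction}",
    "Rewrite the instruction below so that it needs more reasoning steps:\n{instruction}",
    "Write a new instruction in the same domain as the one below, on a rarer topic:\n{instruction}",
]


def evolve_instructions(
    seeds: List[str],
    llm: Callable[[str], str],
    rounds: int = 3,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Return the seeds plus every successfully evolved instruction."""
    rng = rng or random.Random(0)
    pool = list(seeds)      # everything collected so far
    current = list(seeds)   # the generation being evolved in this round
    for _ in range(rounds):
        next_generation = []
        for instruction in current:
            prompt = rng.choice(EVOLVE_PROMPTS).format(instruction=instruction)
            evolved = llm(prompt).strip()
            if evolved and evolved != instruction:
                # Successful evolution: keep it and evolve it again next round.
                pool.append(evolved)
                next_generation.append(evolved)
            else:
                # Failed evolution (empty or unchanged): retry the old one later.
                next_generation.append(instruction)
        current = next_generation
    return pool


if __name__ == "__main__":
    # Stand-in for a real model: echoes the instruction with a marker appended.
    def fake_llm(prompt: str) -> str:
        return prompt.split("\n", 1)[1] + " (harder)"

    for item in evolve_instructions(["Add two numbers."], fake_llm, rounds=2):
        print(item)
```

Each round feeds the previous round's outputs back in, which is what makes the difficulty increase progressively; the resulting pool would then be used as fine-tuning data.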

Quick Start & Requirements

Models are available from the project's Hugging Face (🤗) repositories.
Requires Python 3.9+.
Specific model requirements (e.g., GPU, VRAM) depend on the model size.
Refer to individual model directories for detailed setup and inference scripts.
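As a rough starting point, a released checkpoint could be loaded with the Hugging Face `transformers` library along these lines. This is a sketch under assumptions: the model ID is supplied by the user on the command line, the Alpaca-style prompt template is an assumption (the inference scripts in each model directory define the real one), and `device_map="auto"` additionally needs the `accelerate` package.

```python
import sys

# Assumed Alpaca-style template; check the model's own inference script
# for the exact prompt format before relying on this.
PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:"
)


def build_prompt(instruction: str) -> str:
    """Wrap a raw instruction in the (assumed) prompt template."""
    return PROMPT_TEMPLATE.format(instruction=instruction)


def generate(model_id: str, instruction: str, max_new_tokens: int = 256) -> str:
    """Load a causal LM from Hugging Face and answer one instruction."""
    # Imported lazily so build_prompt works without the heavy dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype="auto",   # use the dtype stored in the checkpoint
        device_map="auto",    # spread layers across available devices
    )
    inputs = tokenizer(build_prompt(instruction), return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


if __name__ == "__main__":
    # Usage: python run_wizard.py <hugging-face-model-id> "<instruction>"
    print(generate(sys.argv[1], sys.argv[2]))
```

Memory needs grow with model size, so a 7B checkpoint is the more practical choice for a first test on a single GPU.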

Highlighted Details

WizardCoder-33B-V1.1 achieves state-of-the-art results among open-source models on the EvalPlus leaderboard, outperforming ChatGPT 3.5 and Gemini Pro on the HumanEval benchmarks.
WizardMath-7B-V1.1 is a top-performing 7B math LLM, surpassing ChatGPT 3.5 and Gemini Pro on GSM8k.
WizardLM-70B-V1.0 demonstrates strong performance on MT-Bench and AlpacaEval, with competitive GSM8k and HumanEval scores.
Models are fine-tuned from various base models, including Llama and DeepSeek-Coder.

Maintenance & Community

The most recent releases noted are WizardCoder-33B-V1.1 and WizardMath-7B-V1.1.
The community is reachable via Discord.
Researchers are encouraged to report issues and share suggestions.

Licensing & Compatibility

Code License: Apache 2.0.
Data License: CC BY-NC 4.0.
Model weights are subject to the Llama 2 License; older versions carry specific non-commercial licenses.
Use is restricted to academic research and non-commercial purposes.

Limitations & Caveats

The training data is not publicly released, owing to organizational policy and legal review.
Output accuracy is not guaranteed, because model generation is stochastic.
Older model versions (e.g., WizardLM-13B-V1.1, WizardLM-30B-V1.0, WizardLM-7B-V1.0) are explicitly marked as non-commercial.

Health Check

Last commit: 7 months ago
Responsiveness: 1 day

Star History

13 stars in the last 30 days