WizardLM by nlpxucan

LLMs built using Evol-Instruct for complex instruction following

Created 2 years ago
9,458 stars

Top 5.4% on SourcePulse

Project Summary

WizardLM is a suite of large language models (LLMs) designed to excel at following complex instructions, with specialized versions for coding (WizardCoder) and mathematical reasoning (WizardMath). It targets researchers and developers seeking high-performance, instruction-following models that outperform many existing open-source alternatives and compete with leading proprietary models.

How It Works

The core innovation is Evol-Instruct, a method that uses LLMs to automatically generate diverse and complex instructions, progressively increasing difficulty. This approach enhances the model's ability to understand and execute intricate commands, leading to improved performance across various benchmarks.
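The loop described above can be sketched as follows. This is a minimal illustration, not the project's implementation: the rewrite operations in `EVOLVE_OPS`, the prompt layout in `build_evolve_prompt`, and the `llm` callable (any function that maps a prompt string to a completion string) are all assumptions made for the example.

```python
import random
from typing import Callable, List, Optional

# Hypothetical rewrite operations. The wording is a placeholder for
# illustration and is not taken from the WizardLM repository.
EVOLVE_OPS = [
    "Add one more constraint or requirement to the task below.",
    "Replace general concepts in the task below with more specific ones.",
    "Rewrite the task below so that it needs several reasoning steps.",
    "Write a new task in the same domain that is rarer but similarly hard.",
]


def build_evolve_prompt(op: str, instruction: str) -> str:
    """Combine one rewrite operation with the instruction to evolve."""
    return f"{op}\n\nInstruction:\n{instruction}\n\nRewritten instruction:"


def evolve(
    seeds: List[str],
    llm: Callable[[str], str],
    rounds: int = 3,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Return the seeds plus every successfully evolved instruction.

    Each round rewrites the latest version of every instruction, so
    difficulty can grow step by step across rounds.
    """
    rng = rng or random.Random(0)
    pool = list(seeds)     # everything collected so far
    current = list(seeds)  # the generation to evolve next
    for _ in range(rounds):
        next_generation = []
        for instruction in current:
            op = rng.choice(EVOLVE_OPS)
            evolved = llm(build_evolve_prompt(op, instruction)).strip()
            if evolved and evolved != instruction:
                # Keep the rewrite and evolve it further next round.
                pool.append(evolved)
                next_generation.append(evolved)
            else:
                # Failed evolution (empty or unchanged): retry it next round.
                next_generation.append(instruction)
        current = next_generation
    return pool
```

In practice `llm` would wrap a call to a hosted or local model, and the resulting pool would be paired with responses before fine-tuning. The simple "empty or unchanged" filter stands in for whatever quality checks a real pipeline applies.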

Quick Start & Requirements

  • Models are available on Hugging Face (🤗 HF Repo).
  • Requires Python 3.9+.
  • Specific model requirements (e.g., GPU, VRAM) depend on the model size.
  • Refer to individual model directories for detailed setup and inference scripts.
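As a rough starting point, loading one of the checkpoints with the Hugging Face `transformers` library might look like the sketch below. Several things here are assumptions rather than facts from the project: the Alpaca-style prompt template, the half-precision setting, and the helper names `build_prompt` and `generate`. The inference scripts and model card for the specific checkpoint are the authoritative source for the template and settings.

```python
# Assumption: an Alpaca-style prompt template. Verify it against the model
# card of the checkpoint in use; templates differ between models and versions.
PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:"
)


def build_prompt(instruction: str) -> str:
    """Wrap a user instruction in the assumed prompt template."""
    return PROMPT_TEMPLATE.format(instruction=instruction.strip())


def generate(model_id: str, instruction: str, max_new_tokens: int = 256) -> str:
    """Load a causal LM from the Hugging Face Hub and answer one instruction."""
    # Imported lazily so build_prompt works without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.float16,  # half precision to reduce memory use
        device_map="auto",          # needs the `accelerate` package
    )
    inputs = tokenizer(build_prompt(instruction), return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Drop the prompt tokens so only newly generated text is returned.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate(model_id, "Write a function that reverses a string.")` with a model ID copied from the project's Hugging Face page downloads the weights on first use. Larger checkpoints need correspondingly more GPU memory.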

Highlighted Details

  • WizardCoder-33B-V1.1 achieves state-of-the-art performance among open-source models on the EvalPlus leaderboard, outperforming ChatGPT 3.5 and Gemini Pro on the HumanEval benchmarks.
  • WizardMath-7B-V1.1 is a top-performing 7B math LLM, surpassing ChatGPT 3.5 and Gemini Pro on GSM8k.
  • WizardLM-70B-V1.0 demonstrates strong performance on MT-Bench and AlpacaEval, with competitive GSM8k and HumanEval scores.
  • Models are fine-tuned from various base models, including Llama and DeepSeek-Coder.

Maintenance & Community

  • Active development with recent releases of WizardCoder-33B-V1.1 and WizardMath-7B-V1.1.
  • Community engagement via Discord.
  • Researchers are encouraged to provide feedback on issues and suggestions.

Licensing & Compatibility

  • Code License: Apache 2.0.
  • Data License: CC BY-NC 4.0.
  • Model weights are subject to the Llama 2 License, or to specific non-commercial licenses for older versions.
  • Use is strictly limited to academic research and non-commercial purposes.

Limitations & Caveats

  • Data used for training is not publicly released due to organizational policy and legal review.
  • Output accuracy is not guaranteed due to model randomness.
  • Older model versions (e.g., WizardLM-13B-V1.1, WizardLM-30B-V1.0, WizardLM-7B-V1.0) are explicitly marked as non-commercial.

Health Check

  • Last commit: 3 months ago.
  • Responsiveness: 1 day.
  • Pull requests (30d): 0.
  • Issues (30d): 0.
  • Star history: 17 stars in the last 30 days.