Yi-Coder is an open-source series of code language models designed for state-of-the-art coding performance with fewer than 10 billion parameters. It supports 52 programming languages and excels in long-context understanding with a 128K token context length, targeting developers and researchers seeking efficient and powerful code generation capabilities.
How It Works
Yi-Coder models are transformer-based language models trained on a large corpus of code. Their architecture is optimized for handling long sequences, enabling them to understand and generate code within extensive contexts. The models are available in both base and instruction-tuned (chat) versions, offering flexibility for various coding tasks.
Quick Start & Requirements
- Installation: Clone the repository and install requirements with `git clone https://github.com/01-ai/Yi-Coder.git && cd Yi-Coder && pip install -r requirements.txt`.
- Prerequisites: Python >= 3.9; a GPU is recommended for inference.
- Usage: Models can be run via Ollama (`ollama run yi-coder`) or Hugging Face Transformers; a minimal Transformers sketch follows this list.
- Resources: Links to Hugging Face, ModelScope, and wisemodel are provided for model downloads.
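For the Transformers route, the sketch below loads the instruction-tuned model and generates a completion. It is a minimal illustration, not the project's documented workflow: the model ID `01-ai/Yi-Coder-9B-Chat`, the bf16 dtype, and the prompt are assumptions, so adjust them to the checkpoint you actually download.

```python
# Minimal sketch: generating code with a Yi-Coder chat model via Hugging Face Transformers.
# Model ID, dtype, and prompt are illustrative assumptions, not values from the README.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "01-ai/Yi-Coder-9B-Chat"  # assumed Hugging Face model ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # half precision to fit the 9B model on a single GPU
    device_map="auto",            # place layers on available devices automatically
)

# Chat models expect conversations formatted with the tokenizer's chat template.
messages = [
    {"role": "user", "content": "Write a Python function that checks whether a string is a palindrome."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```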
Highlighted Details
- Achieves strong performance on multilingual HumanEval and CodeEditorBench benchmarks, with the 9B chat model scoring an average of 71.8 on multilingual HumanEval and a 75.00% average win rate on CodeEditorBench (Plus).
- Demonstrates competitive results in math programming tasks, with the 9B model achieving an average of 70.3 across several benchmarks.
- Supports fine-tuning and quantization for customized use cases; a hedged quantized-loading sketch follows this list.
- Offers models in 1.5B and 9B parameter sizes, catering to different resource constraints.
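One way to exercise the quantization support mentioned above is 4-bit loading with bitsandbytes through Transformers. This is a hedged sketch under assumed settings, not the project's documented procedure: the model ID and quantization parameters are illustrative.

```python
# Hedged sketch: loading Yi-Coder in 4-bit with bitsandbytes to reduce GPU memory use.
# Model ID and quantization settings are illustrative assumptions, not project defaults.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "01-ai/Yi-Coder-9B-Chat"  # assumed Hugging Face model ID

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # store weights in 4-bit format
    bnb_4bit_quant_type="nf4",              # NF4 quantization type
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16 for speed and stability
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
# From here, generation works the same as with the full-precision model.
```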
Maintenance & Community
- The project is actively developed by 01.AI.
- Community channels include Discord and Twitter.
- A blog provides updates, quick start guides, and detailed results.
Licensing & Compatibility
- Licensed under the Apache 2.0 license.
- Allows for commercial use and derivative works, provided attribution is included.
Limitations & Caveats
- While the 9B model shows strong performance, the 1.5B model's coding capabilities are significantly lower, as indicated by benchmark comparisons.
- The README does not specify hardware requirements for running the models, beyond the general recommendation for GPUs.