Yi by 01-ai

Open-source bilingual LLMs trained from scratch

created 1 year ago
7,834 stars

Top 6.8% on sourcepulse

View on GitHub
Project Summary

The Yi series models are open-source large language models developed by 01.AI and trained from scratch on a 3T-token multilingual corpus. They are designed for strong language understanding, commonsense reasoning, and reading comprehension, targeting researchers, developers, and businesses seeking high-performing bilingual (English/Chinese) LLMs.

How It Works

Yi models are built on the standard Transformer architecture and follow Llama's design, but they are not Llama derivatives: they were trained from scratch. Reusing this well-proven foundation keeps Yi stable and compatible with the broader AI ecosystem. The key differentiators are 01.AI's proprietary training datasets, efficient training pipelines, and robust infrastructure, which together account for Yi's competitive performance against leading LLMs.
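A quick way to check the Llama-compatible architecture claim is to inspect the model config (a minimal sketch; the Hugging Face repo id 01-ai/Yi-6B and the printed "llama" model type are assumptions based on the project's published checkpoints):

    # pip install transformers
    from transformers import AutoConfig

    # The config declares the standard Llama architecture; Yi differs in
    # training data and pipeline, not in model structure.
    cfg = AutoConfig.from_pretrained("01-ai/Yi-6B")
    print(cfg.model_type)  # expected: "llama"
    print(cfg.num_hidden_layers, cfg.hidden_size, cfg.num_attention_heads)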

Quick Start & Requirements

  • Installation: Options include pip (Python 3.10+), Docker, conda-lock, and llama.cpp for quantized models (see the inference sketch after this list).
  • Dependencies: Python 3.10+, PyTorch, Transformers, DeepSpeed (for fine-tuning), CUDA (for GPU acceleration), Docker, git-lfs.
  • Hardware: Varies by model size; e.g., Yi-6B requires ~15GB VRAM, while Yi-34B requires ~72GB VRAM. Quantized versions (4-bit, 8-bit) significantly reduce VRAM requirements.
  • Resources: Yi Cookbook, Hugging Face, ModelScope.
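A minimal inference sketch via the Transformers path, assuming the Hugging Face repo id 01-ai/Yi-6B-Chat and enough VRAM for half-precision weights (~15 GB); this is one plausible way in, not the project's only supported method:

    # pip install torch transformers accelerate
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "01-ai/Yi-6B-Chat"  # smallest chat model; Yi-34B-Chat needs ~72GB VRAM
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype="auto", device_map="auto"
    )

    # Chat models expect the chat template, not raw text.
    messages = [{"role": "user", "content": "What are the Yi models?"}]
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)

    output = model.generate(input_ids, max_new_tokens=128)
    print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))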

Highlighted Details

  • Performance: Yi-34B-Chat ranked second on AlpacaEval (behind GPT-4 Turbo), and the base Yi-34B ranked first among open-source models on the Hugging Face Open LLM Leaderboard and C-Eval.
  • Context Window: Models like Yi-34B-200K support a 200K context window.
  • Bilingual: Trained on a 3T-token multilingual corpus, excelling in both English and Chinese.
  • Quantization: Supports GPTQ and AWQ for reduced VRAM and faster inference (see the loading sketch after this list).
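For the quantization point, a hedged sketch of loading one of the project's prequantized checkpoints (the repo id 01-ai/Yi-6B-Chat-4bits is taken from the project's model listings and is an assumption here; Transformers loads AWQ weights when autoawq is installed, and a CUDA GPU is required for the AWQ kernels):

    # pip install torch transformers autoawq
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # 4-bit AWQ checkpoint: roughly a quarter of the fp16 VRAM budget.
    model_id = "01-ai/Yi-6B-Chat-4bits"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")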

Maintenance & Community

The project is actively maintained by 01.AI. Community engagement is encouraged via Discord and WeChat. Recent updates include the Yi-1.5 series and the Yi Cookbook.

Licensing & Compatibility

The Yi series models are distributed under the Apache 2.0 license, permitting personal, academic, and commercial use. Derivative works require attribution.

Limitations & Caveats

The chat models' increased response diversity, while beneficial for creative tasks, can raise the incidence of hallucination and non-determinism. Lowering generation parameters such as temperature is recommended when more coherent outputs are needed (see the sketch below).
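Continuing from the quick-start sketch above, one way to trade diversity for coherence is to tighten the sampling parameters; the values here are illustrative starting points, not project recommendations:

    # Reuses `model`, `tokenizer`, and `input_ids` from the quick-start sketch.
    output = model.generate(
        input_ids,
        max_new_tokens=256,
        do_sample=True,
        temperature=0.3,        # lower -> more focused, less hallucination-prone
        top_p=0.8,              # restrict sampling to the high-probability mass
        repetition_penalty=1.1, # discourage verbatim loops
    )
    print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))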

Health Check

  • Last commit: 8 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 35 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen (author of AI Engineering and Designing Machine Learning Systems), Jeff Hammerbacher (cofounder of Cloudera), and 2 more.

ChatGLM-6B by zai-org

Bilingual dialogue language model for research

created 2 years ago, updated 1 year ago
41k stars

Top 0.1% on sourcepulse