inclusionAI: Efficient MoE LLMs for advanced reasoning and high-speed generation
Top 100.0% on SourcePulse
Ling-V2 is an open-source family of Mixture-of-Experts (MoE) Large Language Models (LLMs) from InclusionAI, designed to deliver state-of-the-art performance with high computational efficiency. Targeting researchers and developers seeking powerful yet resource-conscious LLMs, Ling-V2 offers significant advantages in complex reasoning and instruction following, achieving performance comparable to much larger dense models with a fraction of activated parameters.
How It Works
Ling-V2 employs a 1/32 activation-ratio MoE architecture, with carefully tuned choices of expert granularity, shared-expert ratio, attention mechanism, and routing strategy (sigmoid routing with an aux-loss-free design). This sparse activation, combined with techniques such as an MTP loss, QK-Norm, and half RoPE, lets models like Ling-mini-2.0 (16B total parameters, 1.4B activated) match the performance of 7–8B dense models. The project also uses FP8 mixed-precision training, with tile/blockwise FP8 scaling, FP8 optimizers, and on-demand transposed weights for aggressive memory optimization and efficient training.
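As a rough sketch of the sparse-activation idea described above, the snippet below shows a hypothetical sigmoid-scored top-k router with a per-expert bias standing in for aux-loss-free load balancing. The class name, expert counts, and bias-update mechanism are illustrative assumptions rather than the Ling-V2 implementation; with 256 experts and 8 active per token, the activation ratio happens to work out to 1/32.

```python
import torch
import torch.nn as nn

class SigmoidTopKRouter(nn.Module):
    """Illustrative sparse-MoE router (hypothetical, not Ling-V2 source code)."""

    def __init__(self, hidden_size: int, num_experts: int = 256, top_k: int = 8):
        super().__init__()
        self.gate = nn.Linear(hidden_size, num_experts, bias=False)
        # Per-expert bias kept outside the gradient path; in an aux-loss-free
        # scheme it would be nudged by a separate load-balancing rule (not shown).
        self.register_buffer("expert_bias", torch.zeros(num_experts))
        self.top_k = top_k

    def forward(self, x: torch.Tensor):
        # x: (num_tokens, hidden_size)
        scores = torch.sigmoid(self.gate(x))                 # independent per-expert scores
        _, topk_idx = (scores + self.expert_bias).topk(self.top_k, dim=-1)
        topk_w = scores.gather(-1, topk_idx)                  # route weights from raw scores
        topk_w = topk_w / topk_w.sum(dim=-1, keepdim=True)    # normalize over chosen experts
        return topk_idx, topk_w                               # which experts, with what weights

# Example: route 4 tokens of width 2048 to 8 of 256 experts each.
router = SigmoidTopKRouter(hidden_size=2048)
idx, w = router(torch.randn(4, 2048))
```

Only the selected experts' FFNs would then run for each token, which is why the compute cost tracks the activated (1.4B) rather than total (16B) parameter count.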
Quick Start & Requirements
Integration is primarily supported via Hugging Face Transformers, with a code snippet provided in the repository; users in mainland China are pointed to ModelScope. Advanced inference is available through vLLM or SGLang, both of which currently require cloning the respective repository and applying the provided patch (bailing_moe_v2.patch). Hardware requirements are not specified beyond the GPUs mentioned in the performance benchmarks and inference-speed examples (e.g., H20, 80 GB GPUs). Users should ensure their Python environment supports these libraries. Links to model downloads (Hugging Face, ModelScope) and to the external vLLM and SGLang projects are available in the repository.
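As a hedged example of the Transformers path, a minimal load-and-generate sketch might look like the following; the model id and generation settings are assumptions for illustration, and the snippet in the repository README should be treated as authoritative.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Model id assumed for illustration; check the repository for the exact Hub name.
model_id = "inclusionAI/Ling-mini-2.0"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",      # use the checkpoint's native precision
    device_map="auto",       # place layers across available GPUs
    trust_remote_code=True,  # MoE checkpoints often ship custom modeling code
)

messages = [{"role": "user", "content": "Summarize mixture-of-experts in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

For vLLM or SGLang serving, the usual workflows of those libraries apply once the provided patch has been installed.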
Maintenance & Community
The project is provided by InclusionAI. Community channels (e.g., Discord, Slack), active contributors, sponsorships, and a public roadmap are not detailed in the provided README.
Licensing & Compatibility
The code repository is licensed under the permissive MIT License, allowing for broad use, including commercial applications and linking with closed-source software.
Limitations & Caveats
Integration with vLLM and SGLang currently requires users to manually apply patches to those libraries, as the changes are not yet merged into their official releases. Multi-Token Prediction (MTP) support is noted as available for base models in SGLang but not yet for chat models. Hardware requirements beyond the GPUs used for the published performance and inference-speed figures are not explicitly detailed.