LLM for research/commercial use (license required for some commercial use cases)
Baichuan 2 is a series of large language models developed by Baichuan Intelligent Technology, offering 7B and 13B parameter versions in both base and chat configurations. These models are trained on 2.6 trillion tokens of high-quality data and aim to provide state-of-the-art performance across various Chinese and English benchmarks, including general knowledge, legal, medical, mathematical, coding, and translation tasks. The models are open for academic research and available for free commercial use under specific conditions, making them accessible to developers and researchers.
How It Works
Baichuan 2 models are transformer-based large language models. The project provides pre-trained weights for both base and chat-tuned versions, with the chat versions further optimized for conversational AI. Notably, the project offers 4-bit quantized versions (NF4) of the chat models, significantly reducing memory footprint while maintaining performance close to the original models. This quantization is achieved using the BitsAndBytes library, supporting both online and offline quantization methods for flexible deployment.
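As a rough illustration of the online quantization path, the sketch below loads a chat checkpoint with a BitsAndBytes NF4 configuration through the transformers library. The repo id, dtype, and device settings are assumptions for illustration, not taken verbatim from the project docs.

```python
# Minimal sketch: online NF4 quantization with BitsAndBytes via transformers.
# Model id and dtype choices are assumptions; verify against the project's README.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "baichuan-inc/Baichuan2-7B-Chat"  # assumed Hugging Face repo id

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # quantize weights to 4 bits at load time
    bnb_4bit_quant_type="nf4",              # NF4 data type, as referenced by the project
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16 for speed/accuracy
)

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)
```

The offline path presumably skips the quantization config entirely and loads the pre-quantized 4-bit chat checkpoints directly.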
Quick Start & Requirements
Install dependencies with pip install -r requirements.txt, or obtain the model weights by direct download from Hugging Face/ModelScope. Fine-tuning additionally requires peft and xFormers. A basic loading and chat example is sketched below.
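The following sketch loads the chat model from Hugging Face and runs a single turn. The repo id is assumed, and the model.chat() helper comes from the model's remote code as described in the project's README; verify both against the repository before relying on them.

```python
# Hedged quick-start sketch: load the chat model and run one conversational turn.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "baichuan-inc/Baichuan2-7B-Chat"  # assumed repo id; a 13B variant also exists

tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=False, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

# The chat() helper is provided by the model's remote code, not by transformers itself.
messages = [{"role": "user", "content": "Explain transformers in one sentence."}]
response = model.chat(tokenizer, messages)
print(response)
```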
Highlighted Details
Maintenance & Community
The project is actively maintained by Baichuan Intelligent Technology. Community support channels are available via WeChat. The project also highlights integrations with Intel, Huawei Ascend, and MindSpore.
Licensing & Compatibility
The code is released under Apache 2.0, while the model weights are governed by the "Baichuan 2 Model Community License Agreement." Commercial use is permitted if daily active users are below 1 million, the entity is not a cloud or software provider, and there is no third-party sub-licensing. A formal application process is required to obtain the commercial license.
Limitations & Caveats
The project disclaims responsibility for any misuse or issues arising from the model's use, including data security or public opinion risks. Users are cautioned against using the model for illegal activities or internet services without proper security review. CPU inference is supported but significantly slower than GPU.