Open-source LLM available in both base (pretrained) and chat-tuned versions
Baichuan-13B is a 13-billion-parameter open-source large language model from Baichuan Intelligent Technology, released in both base and chat-tuned versions. Trained on 1.4 trillion tokens, it performs strongly on Chinese and English benchmarks and supports a 4,096-token context window with ALiBi positional encoding. The model is designed for efficient inference: int8 and int4 quantized versions run on consumer-grade GPUs, and commercial use is permitted upon application.
How It Works
Baichuan-13B uses ALiBi positional encoding, which is computationally cheaper than rotary embeddings and, per the developers, yields a claimed 31.6% inference speedup over LLaMA-13B. The architecture has 40 layers, a hidden size of 5120, and 40 attention heads. Both full-parameter and LoRA fine-tuning are supported, with scripts and configurations provided; sketches of the positional scheme and a LoRA setup follow.
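To make the positional scheme concrete, here is a minimal PyTorch sketch of ALiBi biasing following the standard formulation from Press et al. (2022), not Baichuan's actual implementation. The closed-form slope formula shown applies to power-of-two head counts; Baichuan-13B's 40 heads would use the paper's interpolation scheme.

```python
import torch

def alibi_slopes(num_heads: int) -> torch.Tensor:
    # Geometric per-head slopes 2^(-8/n), 2^(-16/n), ...;
    # closed form valid when num_heads is a power of two.
    start = 2.0 ** (-8.0 / num_heads)
    return torch.tensor([start ** (i + 1) for i in range(num_heads)])

def alibi_bias(num_heads: int, seq_len: int) -> torch.Tensor:
    # (H, L, L) bias added to attention logits before softmax,
    # replacing rotary/absolute position embeddings entirely.
    slopes = alibi_slopes(num_heads)            # (H,)
    pos = torch.arange(seq_len)
    distance = pos[None, :] - pos[:, None]      # j - i; <= 0 for past keys
    return slopes[:, None, None] * distance[None, :, :]

# Demo with a power-of-two head count to keep the slope formula exact.
bias = alibi_bias(num_heads=8, seq_len=6)
print(bias.shape)  # torch.Size([8, 6, 6])
```

Because the bias is a fixed linear penalty on key-query distance, no position embeddings need to be computed or cached, which is the source of the inference-speed advantage over rotary embeddings.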
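For the LoRA path, a hedged setup sketch using the peft library is shown below; the target module name "W_pack" is an assumption about Baichuan's fused QKV projection, and the hyperparameters are illustrative. Check the repo's provided fine-tuning scripts for the exact configuration.

```python
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, TaskType, get_peft_model

model = AutoModelForCausalLM.from_pretrained(
    "baichuan-inc/Baichuan-13B-Base",
    torch_dtype=torch.float16,
    trust_remote_code=True,  # the model ships custom modeling code
)

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                        # low-rank adapter dimension
    lora_alpha=16,              # scaling factor for adapter updates
    lora_dropout=0.05,
    target_modules=["W_pack"],  # assumed name of the fused QKV projection
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trained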
Quick Start & Requirements
pip install -r requirements.txt
streamlit run web_demo.py
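Beyond the web demo, the model can be called directly from Python. A minimal sketch, assuming the baichuan-inc/Baichuan-13B-Chat checkpoint on Hugging Face and the chat/quantize helpers loaded via trust_remote_code; consult the repo README for the authoritative example.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "baichuan-inc/Baichuan-13B-Chat", trust_remote_code=True
)
model = AutoModelForCausalLM.from_pretrained(
    "baichuan-inc/Baichuan-13B-Chat",
    torch_dtype=torch.float16,
    trust_remote_code=True,  # loads Baichuan's custom modeling code
)
model = model.quantize(8).cuda()  # assumed int8 helper for consumer GPUs

messages = [{"role": "user", "content": "Hello, what can you do?"}]
response = model.chat(tokenizer, messages)
print(response)
```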
Maintenance & Community
The project is actively maintained by Baichuan Intelligent Technology; subsequent updates include the release of Baichuan 2. A WeChat group is available for community discussion.
Licensing & Compatibility
The source code is licensed under Apache 2.0. Model usage is governed by the "Baichuan-13B Model Community License Agreement." Commercial use is permitted upon registration and written authorization via opensource@baichuan-inc.com.
Limitations & Caveats
The developers disclaim responsibility for problems arising from use of the model, including data security issues, public-opinion risks, and misuse. Users are urged not to apply the model to illegal activities or deploy it in internet services without the required security review and regulatory filing.