Chinese large language model series trained on domestic hardware
TeleChat2 is a family of large language models developed by the China Telecom Artificial Intelligence Research Institute, with parameter sizes ranging from 3B to 115B. The models are notable for being trained entirely on domestic (Chinese) computing hardware. They target improved performance in general Q&A, coding, and mathematical reasoning, and specific versions support Function Call capabilities.
How It Works
TeleChat2 models use a standard decoder-only architecture with Rotary Embedding for positional encoding, SwiGLU activation functions, and RMSNorm applied as pre-normalization. They employ Grouped Query Attention (GQA) to reduce parameter count and computation. Training follows a curriculum: it starts with high-quality educational and multilingual data, then adds complex reasoning and coding data, and finishes with fine-tuning on high-quality data to improve performance. For the Mixture-of-Experts (MoE) models, additional optimizations target communication efficiency and load balancing within the expert-parallel domain.
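The building blocks named above can be illustrated with a minimal pure-Python sketch. This is not TeleChat2's code: the function names, the head counts, the `eps` and `base` defaults, and the interleaved pairing used for the rotary embedding are all assumptions chosen for illustration, and the learned projection matrices are omitted.

```python
import math

def rms_norm(x, weight, eps=1e-6):
    """RMSNorm: divide x by its root mean square, then apply a learned scale."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [w * v / rms for v, w in zip(x, weight)]

def silu(v):
    """SiLU (swish) activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))

def swiglu(gate, up):
    """SwiGLU gating: SiLU(gate) * up, elementwise (projections omitted)."""
    return [silu(g) * u for g, u in zip(gate, up)]

def rope(x, position, base=10000.0):
    """Rotary embedding: rotate each (even, odd) pair by a position-dependent angle.

    Interleaved pairing is an assumption; some implementations pair the
    first half of the vector with the second half instead.
    """
    d = len(x)
    out = []
    for i in range(0, d, 2):
        theta = position * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        out += [x[i] * c - x[i + 1] * s, x[i] * s + x[i + 1] * c]
    return out

def kv_head_for(query_head, num_query_heads, num_kv_heads):
    """GQA: consecutive query heads share a single key/value head."""
    group_size = num_query_heads // num_kv_heads
    return query_head // group_size

# Hypothetical configuration: 8 query heads sharing 2 key/value heads.
mapping = [kv_head_for(h, 8, 2) for h in range(8)]
```

With these illustrative head counts, each key/value head serves four query heads, so the key/value projections and cache are a quarter of the size they would be with one key/value head per query head. That sharing is the source of the efficiency gain attributed to GQA.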
Quick Start & Requirements
Highlighted Details
Maintenance & Community
The project is actively updated with new model releases, including MoE variants. Links to Hugging Face, ModelScope, and Gitee are provided for access.
Licensing & Compatibility
The models are released under the "TeleChat Model Community License Agreement." Commercial use requires submitting an application to tele_ai@chinatelecom.cn for a specific license grant. The agreement also includes a declaration against using the models for illegal activities, or for internet services that have not undergone security review.
Limitations & Caveats
The license agreement explicitly disclaims responsibility for any issues arising from use of the models, including data security problems, public opinion risks, and misuse. Users are cautioned against deploying the models in internet services without proper security review and regulatory filing (备案).