Chinese LLM for dialogue, long-form generation, and code assistance
TeleChat is a suite of large language models developed by China Telecom AI, offering 1B, 7B, and 12B parameter versions trained on extensive Chinese and English datasets. It aims to provide robust conversational AI capabilities for general Q&A, coding, and mathematical tasks, with specialized versions for long-form content generation and improved multi-turn dialogue.
How It Works
TeleChat models use a standard decoder-only Transformer architecture with Rotary Embedding for positional encoding, SwiGLU activation functions, and RMSNorm for layer normalization. The 12B model decouples the word embedding from the output layer for improved training stability. Training combines scientifically tuned data ratios with curriculum learning, dynamically adjusting dataset weights based on model performance to ensure comprehensive coverage of diverse data types.
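The three architectural components named above can be sketched in a few lines of NumPy. This is an illustrative toy, not TeleChat's actual implementation; shapes, variable names, and the rotary base of 10000 are assumptions.

```python
import numpy as np

def rms_norm(x, weight, eps=1e-6):
    # RMSNorm: scale by reciprocal root-mean-square; unlike LayerNorm,
    # no mean subtraction and no bias term.
    rms = np.sqrt(np.mean(x ** 2, axis=-1, keepdims=True) + eps)
    return x / rms * weight

def swiglu(x, W_gate, W_up, W_down):
    # SwiGLU MLP: silu(x @ W_gate) gates (x @ W_up) before down-projection.
    silu = lambda z: z / (1.0 + np.exp(-z))
    return (silu(x @ W_gate) * (x @ W_up)) @ W_down

def rotary(x, base=10000.0):
    # Rotary embedding: rotate paired dimensions by position-dependent angles,
    # so attention scores depend on relative position.
    seq, dim = x.shape
    half = dim // 2
    freqs = base ** (-np.arange(half) / half)          # per-pair frequencies
    angles = np.arange(seq)[:, None] * freqs[None, :]  # (seq, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=-1)
```

In a real decoder block, `rotary` is applied to the query and key projections inside attention, and `rms_norm` precedes both the attention and the SwiGLU MLP sublayers.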
Quick Start & Requirements
Dependencies include the transformers library, plus deepspeed and auto-gptq. GPU acceleration (NVIDIA recommended) is essential for efficient operation; CUDA 11.x or higher is implied for GPU usage.
Highlighted Details
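A minimal inference sketch using the standard Hugging Face transformers loading path. The checkpoint id `Tele-AI/telechat-7B` and the `trust_remote_code=True` flag are assumptions (custom architectures typically require it); substitute the actual repository id from the project README.

```python
# Assumes: pip install transformers torch (plus a CUDA GPU for practical speed).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Tele-AI/telechat-7B"  # hypothetical id; check the project README
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,  # assumed necessary for the custom architecture
    device_map="auto",
)

inputs = tokenizer("什么是机器学习？", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```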
Maintenance & Community
Licensing & Compatibility
Licensing inquiries: tele_ai@chinatelecom.cn.
Limitations & Caveats
The project disclaims responsibility for any issues arising from model use, including data security, public opinion risks, or misuse. Users must adhere to the stated usage restrictions.