Telechat by Tele-AI

Chinese LLM for dialogue, long-form generation, and code assistance

created 1 year ago
1,843 stars

Top 24.0% on sourcepulse

Project Summary

TeleChat is a suite of large language models developed by China Telecom AI, offering 1B, 7B, and 12B parameter versions trained on extensive Chinese and English datasets. It aims to provide robust conversational AI capabilities for general Q&A, coding, and mathematical tasks, with specialized versions for long-form content generation and improved multi-turn dialogue.

How It Works

TeleChat models use a standard decoder-only architecture with rotary position embeddings (RoPE), SwiGLU activation functions, and RMSNorm for layer normalization. The 12B model decouples the word embedding layer from the output layer, which is reported to improve training stability. Training combines data-ratio learning with curriculum learning: dataset weights are adjusted dynamically according to model performance so that the model learns well across all data types.
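As a rough illustration of two of the components named above, the following pure-Python sketch implements RMSNorm and a SwiGLU feed-forward block on plain lists. It is not TeleChat's code: the function names, the epsilon value, and the bias-free projections are assumptions chosen for the example.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """Scale x by the reciprocal of its root mean square, then by a learned gain."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]


def silu(v):
    """SiLU (Swish) activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))


def matvec(matrix, vec):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def swiglu_ffn(x, w_gate, w_up, w_down):
    """SwiGLU feed-forward: down(silu(gate(x)) * up(x)), without biases."""
    gate = [silu(g) for g in matvec(w_gate, x)]
    up = matvec(w_up, x)
    hidden = [g * u for g, u in zip(gate, up)]
    return matvec(w_down, hidden)


# Toy usage with hypothetical 2-dimensional weights.
x = rms_norm([3.0, 4.0], weight=[1.0, 1.0])
identity = [[1.0, 0.0], [0.0, 1.0]]
y = swiglu_ffn(x, w_gate=identity, w_up=identity, w_down=[[1.0, 1.0]])
print(x, y)
```

Unlike LayerNorm, RMSNorm does not subtract the mean, so it needs only one pass over the squared values. SwiGLU uses three projection matrices instead of the two in a classic feed-forward block.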

Quick Start & Requirements

  • Installation: Models are loaded primarily through the Hugging Face transformers library.
  • Dependencies: Python, PyTorch, transformers, deepspeed, auto-gptq. A GPU (NVIDIA recommended) is essential for efficient operation. CUDA 11.x or higher is implied for GPU usage, though no exact version is stated.
  • Resources: Requires significant GPU memory, especially for the 12B model and longer sequence lengths (e.g., 4096 tokens). Fine-tuning with DeepSpeed ZeRO-3 is supported to reduce per-GPU memory use.
  • Links: Hugging Face, ModelScope, Tutorials
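To get a rough sense of the memory requirement, the weights alone can be estimated as parameter count times bytes per parameter. The sketch below assumes nominal parameter counts (7e9 and 12e9) and ignores activations, the KV cache, and optimizer state, all of which add to the real footprint. `weight_memory_gib` is a hypothetical helper, not part of the project.

```python
def weight_memory_gib(num_params, bits_per_param):
    """Approximate memory for model weights only, in GiB."""
    bytes_total = num_params * bits_per_param / 8
    return bytes_total / 2**30


# Nominal sizes; the real parameter counts may differ slightly.
NOMINAL_PARAMS = {"7B": 7e9, "12B": 12e9}

for name, num_params in NOMINAL_PARAMS.items():
    for bits in (16, 8, 4):
        gib = weight_memory_gib(num_params, bits)
        print(f"{name} at {bits}-bit weights: about {gib:.1f} GiB")
```

Halving the bit width halves the weight memory, which is why lower-precision weights matter most for the larger model.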

Highlighted Details

  • Offers 7B and 12B models with INT8 and INT4 quantization for reduced memory footprint and faster inference.
  • Supports fine-tuning with DeepSpeed, including ZeRO-3 for memory optimization, and integrates FlashAttention-2.
  • Provides a version with an 8K context window that uses NTK-aware scaling and attention scaling to extrapolate to 96K tokens.
  • Demonstrates strong performance in long-form writing tasks like work summaries, plans, and proposals.
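The NTK-aware extrapolation mentioned above is commonly implemented by enlarging the rotary embedding base so that low frequencies are stretched while the highest frequency is left untouched. The sketch below shows that widely used formulation only. Whether TeleChat uses exactly this formula is an assumption, and the head dimension of 128, the base of 10000, and the scale factor of 12 (96K / 8K) are illustrative values. Attention scaling is not shown.

```python
def rope_inv_freq(head_dim, base=10000.0):
    """Standard RoPE inverse frequencies: base^(-2i/d) for i in [0, d/2)."""
    return [base ** (-2.0 * i / head_dim) for i in range(head_dim // 2)]


def ntk_scaled_base(base, scale, head_dim):
    """NTK-aware base adjustment: base * scale^(d / (d - 2))."""
    return base * scale ** (head_dim / (head_dim - 2))


def ntk_inv_freq(head_dim, scale, base=10000.0):
    """Inverse frequencies computed with the NTK-adjusted base."""
    return rope_inv_freq(head_dim, ntk_scaled_base(base, scale, head_dim))


# Illustrative values only (assumed, not taken from the project).
HEAD_DIM = 128
SCALE = 96 / 8  # target context length / trained context length

original = rope_inv_freq(HEAD_DIM)
scaled = ntk_inv_freq(HEAD_DIM, SCALE)

# The first (fastest) frequency is unchanged; the last (slowest) is divided by SCALE.
print(original[0], scaled[0])
print(original[-1] / scaled[-1])
```

The exponent d / (d - 2) is chosen so that the slowest-rotating dimension is interpolated by exactly the scale factor, while fast-rotating dimensions that encode local order are barely affected.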

Maintenance & Community

  • Active development with regular updates (e.g., 12B-V2 release).
  • Community engagement via WeChat.
  • Technical reports and model details are available.

Licensing & Compatibility

  • License: "TeleChat Model Community License Agreement".
  • Commercial Use: Permitted upon application and approval via tele_ai@chinatelecom.cn.
  • Restrictions: Prohibits use for activities harmful to national security or illegal purposes. Requires safety review and filing for internet services.

Limitations & Caveats

The project disclaims responsibility for any issues arising from model use, including data security, public opinion risks, or misuse. Users must adhere to the stated usage restrictions.

Health Check

  • Last commit: 8 months ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 15 stars in the last 90 days
