Tencent-Hunyuan-Large by Tencent-Hunyuan

Large language model for research

created 9 months ago
1,570 stars

Top 27.2% on sourcepulse

Project Summary

Tencent-Hunyuan/Tencent-Hunyuan-Large is an open-source Mixture of Experts (MoE) large language model, specifically the Hunyuan-MoE-A52B variant. It provides a large-scale, high-performance model with 389 billion total parameters, of which 52 billion are active per token, targeting researchers and developers in natural language processing and AI.

How It Works

Hunyuan-Large employs a Mixture of Experts (MoE) architecture that activates 52 billion of its 389 billion parameters per token. Key technical advantages include high-quality synthetic data for richer representations and better generalization; KV-cache compression via Grouped Query Attention (GQA) and Cross-Layer Attention (CLA) to cut memory use and inference overhead (see the sizing sketch below); expert-specific learning-rate scaling for better-optimized sub-model training; and long-context processing (up to 256K tokens for the pre-trained model, 128K for the Instruct model).
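To make the KV-cache claim concrete, here is a back-of-envelope sizing sketch. Every configuration number below (layer count, head count, head dimension, CLA sharing factor) is an illustrative assumption, not Hunyuan-Large's published config:

```python
# Back-of-envelope KV-cache sizing showing why GQA and CLA matter at
# long context. All numbers are illustrative assumptions.
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len,
                   dtype_bytes=2, cla_share=1):
    # 2x for K and V; with Cross-Layer Attention, groups of
    # `cla_share` adjacent layers reuse one KV cache.
    return 2 * (layers // cla_share) * kv_heads * head_dim * seq_len * dtype_bytes

LAYERS, HEADS, HEAD_DIM, SEQ = 64, 64, 128, 256_000  # illustrative only

configs = {
    "MHA (1 KV head per query head)": dict(kv_heads=HEADS),
    "GQA (8 shared KV heads)":        dict(kv_heads=8),
    "GQA + CLA (share across pairs)": dict(kv_heads=8, cla_share=2),
}
for name, kw in configs.items():
    gib = kv_cache_bytes(LAYERS, head_dim=HEAD_DIM, seq_len=SEQ, **kw) / 2**30
    print(f"{name:34s} {gib:7.1f} GiB per 256K-token sequence")
```

Under these assumptions the cache shrinks from 500 GiB (standard multi-head attention) to 62.5 GiB with GQA and 31.2 GiB with GQA plus pairwise CLA sharing, which is what makes 256K-token contexts tractable.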

Quick Start & Requirements

  • Install/Run: The Docker image hunyuaninfer/hunyuan-large:infer-open-source is provided for inference; training scripts are available for use with hf-deepspeed (see the loading sketch after this list).
  • Prerequisites: For training, 32 GPUs (full fine-tuning) or 8 GPUs (LoRA fine-tuning) are recommended. BF16 inference requires 16 H20 GPUs. FP8/INT8 inference is also supported.
  • Resources: Training requires significant GPU resources. Inference performance benchmarks are provided for various configurations.
  • Links: Hugging Face, Official Website, Technical Report.
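
For orientation, a minimal Python loading sketch using Hugging Face transformers. The repo id "tencent/Tencent-Hunyuan-Large" is an assumption (check the Hugging Face link above), and the README's memory sizing still applies:

```python
# Minimal inference sketch with Hugging Face transformers.
# Assumptions: repo id "tencent/Tencent-Hunyuan-Large" and a machine
# meeting the README's BF16 sizing (~16 H20-class GPUs); adjust both.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tencent/Tencent-Hunyuan-Large"  # assumed Hugging Face repo id

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",      # load the shipped BF16 weights
    device_map="auto",       # shard layers across all visible GPUs
    trust_remote_code=True,  # the MoE modeling code ships with the repo
)

messages = [{"role": "user",
             "content": "Explain mixture-of-experts routing briefly."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```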

Highlighted Details

  • Largest open-source Transformer-based MoE model (389B total / 52B active parameters).
  • Achieves state-of-the-art performance on various benchmarks, outperforming models like Llama3.1-405B on MMLU and MATH.
  • Supports long-context processing up to 256K tokens.
  • Offers FP8 quantization for reduced memory and increased throughput (see the arithmetic sketch after this list).
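
Rough weight-memory arithmetic behind the FP8 bullet; the 96 GB per-GPU figure is an assumption about H20-class hardware, not a statement from the README:

```python
# Illustrative weight-memory arithmetic behind the FP8 claim.
# 96 GB per H20 is an assumption about the deployment hardware.
TOTAL_PARAMS = 389e9
GPU_MEM_GB = 96
for fmt, bytes_per_param in [("BF16", 2), ("FP8", 1)]:
    weights_gb = TOTAL_PARAMS * bytes_per_param / 1e9
    print(f"{fmt}: {weights_gb:.0f} GB of weights "
          f"(~{weights_gb / GPU_MEM_GB:.0f} GPUs before KV cache/activations)")
# BF16 -> 778 GB, consistent with the 16-GPU recommendation once KV
# cache and activations are added; FP8 halves the weight footprint,
# freeing memory for larger batches and higher throughput.
```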

Maintenance & Community

  • Open-sourced by Tencent.
  • Community contact via email: hunyuan_opensource@tencent.com.

Licensing & Compatibility

  • The specific license is not explicitly stated in the provided text, but it is an open-source release. Compatibility for commercial use or closed-source linking is not detailed.

Limitations & Caveats

  • The TRT-LLM inference backend is planned for a future release; currently, vLLM is the primary open-sourced inference option (a serving sketch follows this list).
  • Using the provided Docker image in --privileged mode is noted as a security risk.
  • The README mentions potential deviations in loss when resuming training from checkpoints due to non-deterministic algorithms.
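
Since vLLM is named as the open-sourced inference path, here is a hedged offline-serving sketch; the repo id and tensor-parallel degree are assumptions drawn from the BF16 sizing note, not verified settings:

```python
# Hedged vLLM offline-inference sketch. The repo id and
# tensor_parallel_size below are assumptions; consult the repo's vLLM
# instructions for the supported settings.
from vllm import LLM, SamplingParams

llm = LLM(
    model="tencent/Tencent-Hunyuan-Large",  # assumed Hugging Face repo id
    tensor_parallel_size=16,                # matches the 16-GPU BF16 note
    trust_remote_code=True,                 # custom MoE modeling code
)
params = SamplingParams(temperature=0.7, max_tokens=256)
for out in llm.generate(["Summarize the Hunyuan-Large architecture."], params):
    print(out.outputs[0].text)
```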
Health Check

  • Last commit: 7 months ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 1
  • Star History: 77 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen (author of AI Engineering and Designing Machine Learning Systems), Jeff Hammerbacher (cofounder of Cloudera), and 10 more.

  • open-r1 by huggingface — SDK for reproducing DeepSeek-R1. Top 0.2% on sourcepulse, 25k stars; created 6 months ago, updated 3 days ago.