Large language model for research
Tencent-Hunyuan/Tencent-Hunyuan-Large is an open-source Mixture of Experts (MoE) large language model, specifically the Hunyuan-MoE-A52B variant. With 389 billion total parameters and 52 billion active parameters, it offers a large-scale, high-performance model for researchers and developers working in natural language processing and AI.
How It Works
Hunyuan-Large employs a Mixture of Experts (MoE) architecture with 52 billion active parameters. Its key technical advantages include:

- High-quality synthetic data, yielding richer representations and better generalization.
- KV cache compression via Grouped Query Attention (GQA) and Cross-Layer Attention (CLA), which reduce memory usage and inference overhead (see the sketch after this list).
- Expert-specific learning rate scaling for optimized sub-model training.
- Long-context processing: up to 256K tokens for the pre-trained model and 128K for the Instruct model.
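To make the KV cache savings concrete, below is a minimal sketch of the cache-size arithmetic. The layer count, head counts, and CLA sharing factor are illustrative assumptions, not Hunyuan-Large's published configuration.

```python
def kv_cache_bytes(seq_len, n_layers, n_kv_heads, head_dim,
                   cla_share_factor=1, bytes_per_elem=2):
    """Bytes needed to cache keys and values for one sequence.

    cla_share_factor > 1 models Cross-Layer Attention (CLA): groups of
    adjacent layers share a single KV cache, so fewer caches are stored.
    """
    effective_layers = n_layers / cla_share_factor
    # Factor of 2 covers keys and values; fp16/bf16 uses 2 bytes/element.
    return int(2 * seq_len * effective_layers * n_kv_heads * head_dim * bytes_per_elem)


# Illustrative (assumed) configuration -- not Hunyuan-Large's actual one.
seq_len, n_layers, head_dim = 128_000, 64, 128

mha = kv_cache_bytes(seq_len, n_layers, n_kv_heads=64, head_dim=head_dim)
gqa = kv_cache_bytes(seq_len, n_layers, n_kv_heads=8, head_dim=head_dim)
gqa_cla = kv_cache_bytes(seq_len, n_layers, n_kv_heads=8,
                         head_dim=head_dim, cla_share_factor=2)

for name, size in [("MHA", mha), ("GQA", gqa), ("GQA + CLA", gqa_cla)]:
    print(f"{name:10s} {size / 2**30:8.1f} GiB")
```

The comparison shows why the two techniques compose: GQA shrinks the cache by cutting the number of KV heads, and CLA shrinks it further by letting adjacent layers share one cache, which is what makes 128K-256K contexts tractable.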
Quick Start & Requirements
The Docker image hunyuaninfer/hunyuan-large:infer-open-source is provided for inference. Training scripts are available for use with hf-deepspeed.
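For a quick local test, a minimal inference sketch with Hugging Face Transformers follows. The checkpoint id tencent/Tencent-Hunyuan-Large and the trust_remote_code requirement are assumptions, so check the repository README for the officially supported entry point and hardware requirements.

```python
# Minimal inference sketch. The checkpoint id and the need for
# trust_remote_code are assumptions; see the repository README for the
# officially supported entry point.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tencent/Tencent-Hunyuan-Large"  # assumed Hugging Face id

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",  # shard the 389B-parameter checkpoint across GPUs
    trust_remote_code=True,
)

prompt = "Explain Mixture of Experts routing in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Even with only 52 billion active parameters, the full checkpoint must be resident in memory, so multi-GPU sharding (or the provided inference image) is the practical route for serving.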
Highlighted Details
Maintenance & Community
Questions can be directed to hunyuan_opensource@tencent.com.
Licensing & Compatibility
Limitations & Caveats
Running the inference container in --privileged mode is noted as a security risk.