Tencent-Hunyuan-Large by Tencent-Hunyuan

Large language model for research

Created 11 months ago
1,583 stars

Top 26.4% on SourcePulse

View on GitHub
1 Expert Loves This Project
Project Summary

Tencent-Hunyuan/Tencent-Hunyuan-Large is an open-source Mixture of Experts (MoE) large language model, specifically the Hunyuan-MoE-A52B variant. With 389 billion total parameters and 52 billion active parameters, it aims to provide a large-scale, high-performance model for researchers and developers working in natural language processing and AI.

How It Works

Hunyuan-Large employs a Mixture of Experts (MoE) architecture, featuring 52 billion active parameters. Key technical advantages include:

  • High-quality synthetic data for richer representations and stronger generalization.
  • KV cache compression via Grouped Query Attention (GQA) and Cross-Layer Attention (CLA), reducing memory use and inference overhead (see the back-of-the-envelope sketch after this list).
  • Expert-specific learning rate scaling for better-optimized sub-model training.
  • Long-context processing: up to 256K tokens for the pre-trained model and 128K for the Instruct model.
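To make the KV-cache savings concrete, here is a rough back-of-the-envelope calculation. The layer count, head counts, head dimension, and sequence length below are illustrative assumptions, not Hunyuan-Large's published configuration, and CLA is modeled simply as adjacent layer pairs sharing one cache.

```python
# Back-of-the-envelope KV-cache sizing, illustrating why GQA + CLA help.
# All hyperparameters below are illustrative assumptions, not Hunyuan-Large's
# published configuration.

def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, bytes_per_elem=2):
    """Bytes needed to cache keys and values for one sequence (BF16 elements)."""
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_elem  # 2 = K and V

layers, q_heads, head_dim, seq_len = 64, 64, 128, 256_000  # assumed values

mha = kv_cache_bytes(layers, q_heads, head_dim, seq_len)      # every query head keeps its own K/V
gqa = kv_cache_bytes(layers, 8, head_dim, seq_len)            # 8 KV groups shared by 64 query heads
gqa_cla = kv_cache_bytes(layers // 2, 8, head_dim, seq_len)   # CLA modeled as adjacent layers sharing one cache

for name, size in [("MHA", mha), ("GQA", gqa), ("GQA + CLA", gqa_cla)]:
    print(f"{name:10s} {size / 2**30:8.1f} GiB per 256K-token sequence")
```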

Quick Start & Requirements

  • Install/Run: Docker image hunyuaninfer/hunyuan-large:infer-open-source is provided for inference; training scripts are available for use with hf-deepspeed. A loading sketch follows this list.
  • Prerequisites: For training, 32 GPUs (full fine-tuning) or 8 GPUs (LoRA fine-tuning) are recommended. BF16 inference requires 16 H20 GPUs. FP8/INT8 inference is also supported.
  • Resources: Training requires significant GPU resources. Inference performance benchmarks are provided for various configurations.
  • Links: Hugging Face, Official Website, Technical Report.
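For orientation, a minimal loading sketch using Hugging Face transformers is shown below. The model id is an assumption based on the Hugging Face link above (check the model card for the exact repository name), and the snippet illustrates the API shape rather than a single-machine recipe: BF16 inference of the full model is quoted at 16 H20 GPUs.

```python
# Minimal sketch of loading the Instruct model with Hugging Face transformers.
# The model id below is an assumption; consult the model card for the exact
# repository name and hardware requirements.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tencent/Tencent-Hunyuan-Large"  # assumed Hugging Face repo id

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # FP8/INT8 paths are also supported per the README
    device_map="auto",            # shard across available GPUs
    trust_remote_code=True,
)

messages = [{"role": "user", "content": "Summarize the MoE architecture in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```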

Highlighted Details

  • Largest open-source Transformer-based MoE model (389B total, 52B active parameters).
  • Achieves state-of-the-art performance on various benchmarks, outperforming models like Llama3.1-405B on MMLU and MATH.
  • Supports long-context processing up to 256K tokens.
  • Offers FP8 quantization for reduced memory and increased throughput (see the footprint arithmetic below).
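To give a rough sense of why FP8 matters at this scale, the arithmetic below compares weight-only memory footprints at different precisions. It is illustrative only and ignores the KV cache, activations, and runtime overhead.

```python
# Weight-only memory footprint at different precisions (illustrative arithmetic).
total_params = 389e9  # total parameters, per the project summary

for name, bytes_per_param in [("BF16", 2), ("FP8", 1), ("INT8", 1)]:
    gib = total_params * bytes_per_param / 2**30
    print(f"{name:5s} ~{gib:,.0f} GiB of weights")
```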

Maintenance & Community

  • Open-sourced by Tencent.
  • Community contact via email: hunyuan_opensource@tencent.com.

Licensing & Compatibility

  • The specific license is not explicitly stated in the provided text, but it is an open-source release. Compatibility for commercial use or closed-source linking is not detailed.

Limitations & Caveats

  • The TRT-LLM inference backend is planned for future release; currently, vLLM is the primary open-sourced inference option.
  • Using the provided Docker image in --privileged mode is noted as a security risk.
  • The README mentions potential deviations in loss when resuming training from checkpoints due to non-deterministic algorithms.
Health Check

  • Last Commit: 9 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 4 stars in the last 30 days

Explore Similar Projects

Starred by Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI).

dots.llm1 by rednote-hilab
0.2% · 462 stars
MoE model for research
Created 4 months ago · Updated 4 weeks ago
Starred by Jason Knight (Director AI Compilers at NVIDIA; Cofounder of OctoML), Omar Sanseviero (DevRel at Google DeepMind), and 11 more.

mistral.rs by EricLBuehler
0.3% · 6k stars
LLM inference engine for blazing-fast performance
Created 1 year ago · Updated 1 day ago