Tencent-Hunyuan-Large by Tencent-Hunyuan

Large language model for research

Created 11 months ago
1,583 stars

Top 26.4% on SourcePulse

View on GitHub
1 Expert Loves This Project
Project Summary

Tencent-Hunyuan/Tencent-Hunyuan-Large is an open-source Mixture of Experts (MoE) large language model, specifically the Hunyuan-MoE-A52B variant. With 389 billion total parameters and 52 billion active parameters, it aims to provide a large-scale, high-performance model for researchers and developers working in natural language processing and AI.

How It Works

Hunyuan-Large employs a Mixture of Experts (MoE) architecture, featuring 52 billion active parameters. Key technical advantages include:

  • High-quality synthetic data for richer representations and stronger generalization.
  • KV cache compression via Grouped Query Attention (GQA) and Cross-Layer Attention (CLA), reducing memory use and inference overhead (see the back-of-the-envelope sketch after this list).
  • Expert-specific learning rate scaling for better-optimized sub-model training.
  • Long-context processing: up to 256K tokens for the pre-trained model and 128K for the Instruct model.
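To make the KV-cache savings concrete, here is a rough back-of-the-envelope calculation. The layer count, head counts, head dimension, and sequence length below are illustrative assumptions, not Hunyuan-Large's published configuration, and CLA is modeled simply as adjacent layer pairs sharing one cache.

```python
# Back-of-the-envelope KV-cache sizing, illustrating why GQA + CLA help.
# All hyperparameters below are illustrative assumptions, not Hunyuan-Large's
# published configuration.

def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, bytes_per_elem=2):
    """Bytes needed to cache keys and values for one sequence (BF16 elements)."""
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_elem  # 2 = K and V

layers, q_heads, head_dim, seq_len = 64, 64, 128, 256_000  # assumed values

mha = kv_cache_bytes(layers, q_heads, head_dim, seq_len)      # every query head keeps its own K/V
gqa = kv_cache_bytes(layers, 8, head_dim, seq_len)            # 8 KV groups shared by 64 query heads
gqa_cla = kv_cache_bytes(layers // 2, 8, head_dim, seq_len)   # CLA modeled as adjacent layers sharing one cache

for name, size in [("MHA", mha), ("GQA", gqa), ("GQA + CLA", gqa_cla)]:
    print(f"{name:10s} {size / 2**30:8.1f} GiB per 256K-token sequence")
```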

Quick Start & Requirements

  • Install/Run: Docker image hunyuaninfer/hunyuan-large:infer-open-source is provided for inference; training scripts are available for use with hf-deepspeed. A loading sketch follows this list.
  • Prerequisites: For training, 32 GPUs (full fine-tuning) or 8 GPUs (LoRA fine-tuning) are recommended. BF16 inference requires 16 H20 GPUs. FP8/INT8 inference is also supported.
  • Resources: Training requires significant GPU resources. Inference performance benchmarks are provided for various configurations.
  • Links: Hugging Face, Official Website, Technical Report.
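For orientation, a minimal loading sketch using Hugging Face transformers is shown below. The model id is an assumption based on the Hugging Face link above (check the model card for the exact repository name), and the snippet illustrates the API shape rather than a single-machine recipe: BF16 inference of the full model is quoted at 16 H20 GPUs.

```python
# Minimal sketch of loading the Instruct model with Hugging Face transformers.
# The model id below is an assumption; consult the model card for the exact
# repository name and hardware requirements.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tencent/Tencent-Hunyuan-Large"  # assumed Hugging Face repo id

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # FP8/INT8 paths are also supported per the README
    device_map="auto",            # shard across available GPUs
    trust_remote_code=True,
)

messages = [{"role": "user", "content": "Summarize the MoE architecture in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```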

Highlighted Details

  • Largest open-source Transformer-based MoE model (389B total, 52B active parameters).
  • Achieves state-of-the-art performance on various benchmarks, outperforming models like Llama3.1-405B on MMLU and MATH.
  • Supports long-context processing up to 256K tokens.
  • Offers FP8 quantization for reduced memory and increased throughput (see the footprint arithmetic below).
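To give a rough sense of why FP8 matters at this scale, the arithmetic below compares weight-only memory footprints at different precisions. It is illustrative only and ignores the KV cache, activations, and runtime overhead.

```python
# Weight-only memory footprint at different precisions (illustrative arithmetic).
total_params = 389e9  # total parameters, per the project summary

for name, bytes_per_param in [("BF16", 2), ("FP8", 1), ("INT8", 1)]:
    gib = total_params * bytes_per_param / 2**30
    print(f"{name:5s} ~{gib:,.0f} GiB of weights")
```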

Maintenance & Community

  • Open-sourced by Tencent.
  • Community contact via email: hunyuan_opensource@tencent.com.

Licensing & Compatibility

  • The specific license is not explicitly stated in the provided text, but it is an open-source release. Compatibility for commercial use or closed-source linking is not detailed.

Limitations & Caveats

  • The TRT-LLM inference backend is planned for future release; currently, vLLM is the primary open-sourced inference option.
  • Using the provided Docker image in --privileged mode is noted as a security risk.
  • The README mentions potential deviations in loss when resuming training from checkpoints due to non-deterministic algorithms.
Health Check

  • Last Commit: 9 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 4 stars in the last 30 days

Explore Similar Projects

Starred by Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI).

dots.llm1 by rednote-hilab
0.2% · 462 stars
MoE model for research
Created 4 months ago · Updated 4 weeks ago
Starred by Jason Knight (Director AI Compilers at NVIDIA; Cofounder of OctoML), Omar Sanseviero (DevRel at Google DeepMind), and 11 more.

mistral.rs by EricLBuehler
0.3% · 6k stars
LLM inference engine for blazing-fast performance
Created 1 year ago · Updated 1 day ago