This repository provides the official implementations for MiniMax-Text-01 and MiniMax-VL-01, large-scale language and vision-language models. These models are designed for researchers and developers seeking state-of-the-art performance in long-context understanding and multimodal tasks, offering advanced architectures and competitive benchmark results.
How It Works
MiniMax-Text-01 combines a hybrid attention design, mixing Lightning Attention (a linear-attention variant) with standard softmax attention, together with a Mixture-of-Experts (MoE) feed-forward layer, reaching a 1-million-token context length during training and up to 4 million tokens at inference. Training relies on parallelism strategies such as LASP+ and ETP for efficient scaling. MiniMax-VL-01 builds on this backbone by integrating a Vision Transformer (ViT) and a dynamic-resolution mechanism, processing images at resolutions up to 2016x2016 while keeping a 336x336 thumbnail for efficient multimodal understanding.
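To make the hybrid layout concrete, below is a minimal, self-contained PyTorch sketch (not the official implementation): most decoder blocks use a simple linear-attention stand-in for Lightning Attention, periodic blocks use standard softmax attention, and every block ends with a toy top-1 MoE feed-forward. The class names, the 1-in-8 softmax ratio, and all dimensions are illustrative assumptions.

```python
# Illustrative hybrid-attention + MoE decoder sketch (assumptions only, not MiniMax code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class LinearAttention(nn.Module):
    """O(n) kernelized linear attention (non-causal simplification), standing in for Lightning Attention."""
    def __init__(self, d):
        super().__init__()
        self.qkv = nn.Linear(d, 3 * d)
        self.out = nn.Linear(d, d)
    def forward(self, x):
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q, k = F.elu(q) + 1, F.elu(k) + 1                 # positive feature map
        kv = torch.einsum("bnd,bne->bde", k, v)           # accumulate key-value outer products
        z = 1 / (torch.einsum("bnd,bd->bn", q, k.sum(dim=1)) + 1e-6)
        return self.out(torch.einsum("bnd,bde,bn->bne", q, kv, z))

class SoftmaxAttention(nn.Module):
    """Standard multi-head softmax attention."""
    def __init__(self, d, heads=8):
        super().__init__()
        self.mha = nn.MultiheadAttention(d, heads, batch_first=True)
    def forward(self, x):
        return self.mha(x, x, x, need_weights=False)[0]

class MoEFeedForward(nn.Module):
    """Toy top-1 MoE: each token is routed to one small expert MLP."""
    def __init__(self, d, n_experts=4):
        super().__init__()
        self.router = nn.Linear(d, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d, 4 * d), nn.GELU(), nn.Linear(4 * d, d))
            for _ in range(n_experts))
    def forward(self, x):
        idx = self.router(x).argmax(dim=-1)               # top-1 expert index per token
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = idx == e
            if mask.any():
                out[mask] = expert(x[mask])
        return out

class HybridBlock(nn.Module):
    def __init__(self, d, use_softmax):
        super().__init__()
        self.norm1, self.norm2 = nn.LayerNorm(d), nn.LayerNorm(d)
        self.attn = SoftmaxAttention(d) if use_softmax else LinearAttention(d)
        self.ffn = MoEFeedForward(d)
    def forward(self, x):
        x = x + self.attn(self.norm1(x))                  # attention sub-layer with residual
        return x + self.ffn(self.norm2(x))                # MoE feed-forward sub-layer with residual

class HybridDecoder(nn.Module):
    def __init__(self, d=256, n_layers=8, softmax_every=8):
        super().__init__()
        # Assumed pattern: one softmax-attention block per `softmax_every` layers, the rest linear.
        self.blocks = nn.ModuleList(
            HybridBlock(d, use_softmax=((i + 1) % softmax_every == 0))
            for i in range(n_layers))
    def forward(self, x):
        for blk in self.blocks:
            x = blk(x)
        return x

x = torch.randn(2, 128, 256)          # (batch, sequence, hidden)
print(HybridDecoder()(x).shape)       # torch.Size([2, 128, 256])
```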
Quick Start & Requirements
- Installation: model weights are distributed on Hugging Face and loaded through the Transformers library (see the loading sketch after this list).
- Hardware: Requires multiple GPUs (e.g., 8 GPUs for the provided examples).
- Dependencies: PyTorch and Hugging Face Transformers; vLLM is suggested for deployment.
- Resources: Significant GPU memory is needed due to the large parameter counts.
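A minimal loading sketch with Hugging Face Transformers is shown below. The repo id under the MiniMaxAI organization, the dtype, and the device map are assumptions; verify them against the official model card.

```python
# Hedged loading sketch; repo id and settings are assumptions, not official guidance.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "MiniMaxAI/MiniMax-Text-01"  # assumed Hugging Face repo id

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,        # custom architecture code lives in the hub repo
    device_map="auto",             # shard across all visible GPUs (e.g., 8x)
    torch_dtype=torch.bfloat16,
)

inputs = tokenizer("Hello, MiniMax!", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```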
Highlighted Details
- MiniMax-Text-01: 456B total parameters, 45.9B activated per token, 1M context training, 4M inference context.
- MiniMax-VL-01: Integrates a 303M ViT with MiniMax-Text-01, supports dynamic image resolutions up to 2016x2016.
- Strong performance across standard academic benchmarks (MMLU, GSM8K, HumanEval) and long-context evaluations (Needle In A Haystack, LongBench).
- Supports INT8 quantization for reduced memory footprint.
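As a hedged illustration of the INT8 option, the snippet below uses Transformers' QuantoConfig (which requires the optimum-quanto package); whether the official repo recommends this exact quantization path, and which modules it excludes from quantization, should be checked against the model card.

```python
# Hedged INT8 loading sketch via Transformers quantization configs (assumed path).
import torch
from transformers import AutoModelForCausalLM, QuantoConfig

model = AutoModelForCausalLM.from_pretrained(
    "MiniMaxAI/MiniMax-Text-01",                        # assumed repo id
    trust_remote_code=True,
    device_map="auto",
    torch_dtype=torch.bfloat16,
    quantization_config=QuantoConfig(weights="int8"),   # quantize weights to INT8
)
```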
Maintenance & Community
- Official repository from MiniMax.
- Contact: model@minimaxi.com for API and server inquiries.
Licensing & Compatibility
- The specific license is not explicitly stated in the README. The models are distributed via Hugging Face, where each model card carries its own license terms; verify those terms, particularly for commercial use, before deploying.
Limitations & Caveats
- The provided quick-start examples assume a distributed setup across multiple GPUs, indicating significant hardware requirements for effective use.
- The README does not detail specific installation steps beyond Hugging Face model loading, and deployment guidance points to external tools like vLLM.
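For deployment, a hedged sketch of offline inference through vLLM's Python API follows; it assumes a vLLM build that supports this architecture, and the repo id, tensor-parallel degree, and context length are placeholders to adjust per the official deployment guide.

```python
# Hedged vLLM offline-inference sketch; model id and sizes are assumptions.
from vllm import LLM, SamplingParams

llm = LLM(
    model="MiniMaxAI/MiniMax-Text-01",  # assumed Hugging Face repo id
    trust_remote_code=True,             # load custom model code from the hub repo
    tensor_parallel_size=8,             # shard across 8 GPUs
    max_model_len=131072,               # reduce if GPU memory is limited
)

outputs = llm.generate(
    ["Summarize the benefits of linear attention in two sentences."],
    SamplingParams(temperature=0.7, max_tokens=128),
)
print(outputs[0].outputs[0].text)
```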