Open-weight reasoning model with hybrid attention
MiniMax-M1 is an open-weight, large-scale hybrid-attention reasoning model designed for complex tasks requiring extensive context processing and reasoning. It targets researchers and developers building advanced AI agents, offering significant inference-efficiency gains at long sequence lengths and strong performance on benchmarks spanning coding, software engineering, and long-context understanding.
How It Works
MiniMax-M1 combines a Mixture-of-Experts (MoE) architecture with a hybrid attention design that interleaves linear-complexity "lightning attention" blocks with periodic softmax attention blocks. This combination allows test-time compute to scale efficiently: of the model's 456 billion total parameters, only about 45.9 billion are activated per token. The model natively supports a 1 million token context length and consumes significantly fewer FLOPs than comparable models at extended sequence lengths. It was post-trained with large-scale reinforcement learning featuring CISPO, a novel algorithm that clips importance-sampling weights rather than token updates.
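The README summary does not spell out CISPO's update rule, so the following is a schematic sketch of the core idea only: clip the importance-sampling weight itself (instead of the policy update, as PPO does) and stop its gradient, so every token still contributes to the update. The function name, tensor shapes, and the `eps_high` clipping bound are illustrative assumptions, not the official implementation.

```python
import torch

def cispo_loss(logp_new: torch.Tensor,    # log-probs under the current policy, shape [T]
               logp_old: torch.Tensor,    # log-probs under the behavior policy, shape [T]
               advantages: torch.Tensor,  # per-token advantage estimates, shape [T]
               eps_high: float = 2.0) -> torch.Tensor:
    # Importance-sampling ratio between current and behavior policies.
    ratio = torch.exp(logp_new - logp_old.detach())
    # Clip the IS weight (not the update) and detach it, so it acts as a fixed
    # per-token scale; gradients flow only through logp_new, and no token is
    # dropped from the gradient the way PPO-style update clipping can drop it.
    clipped_weight = torch.clamp(ratio, max=1.0 + eps_high).detach()
    # REINFORCE-style objective weighted by the clipped IS weight.
    return -(clipped_weight * advantages * logp_new).mean()
```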
Quick Start & Requirements
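The README summary does not reproduce the project's setup steps. Below is a minimal usage sketch via Hugging Face Transformers, assuming the weights are published under the MiniMaxAI organization (the repo id is an assumption; check the model card) and that sufficient GPU memory is available for a model of this scale.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "MiniMaxAI/MiniMax-M1-80k"  # hypothetical repo id; verify on Hugging Face
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,  # custom hybrid-attention code ships with the repo
    device_map="auto",       # shard across all available GPUs
)

messages = [{"role": "user", "content": "Summarize the CISPO algorithm."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

For production serving, an inference engine with paged attention and tensor parallelism (e.g. vLLM) is generally a better fit than raw Transformers at this parameter count.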
Highlighted Details
Maintenance & Community
The project is associated with MiniMax AI, reachable by email at model@minimax.io. The README does not provide further community or roadmap details.
Licensing & Compatibility
The model is released as open weights. Specific licensing terms for commercial use or closed-source linking are not detailed in the README.
Limitations & Caveats
The README does not specify licensing restrictions for commercial use. The model trails some competitors on certain benchmarks, such as HLE (no tools). The reported SWE-bench results exclude 14 test cases due to infrastructure incompatibility.