Kimi-K2 by MoonshotAI

State-of-the-art MoE language model

Created 6 months ago

9,801 stars

Top 5.1% on SourcePulse

View on GitHub

9 Experts Love This Project

Kris Rasmussen

CTO of Figma

Phil Wang

Prolific Research Paper Implementer

Lianmin Zheng

Coauthor of SGLang, vLLM

Shizhe Diao

Author of LMFlow; Research Scientist at NVIDIA

and 5 more!

Project Summary

Kimi K2 is a series of large language models developed by Moonshot AI, featuring a Mixture-of-Experts (MoE) architecture. It offers both a base model for fine-tuning and an instruct-tuned version optimized for chat and agentic capabilities, targeting researchers and developers building AI applications.

How It Works

Kimi K2 utilizes a 1 trillion total parameter MoE architecture with 32 billion activated parameters, trained using the novel Muon optimizer. This approach allows for efficient scaling and improved performance across various tasks, particularly excelling in agentic intelligence, tool use, and complex reasoning. The model boasts a 128K context length and a 160K vocabulary size.

Quick Start & Requirements

Model checkpoints are available on Huggingface in block-fp8 format. Recommended inference engines include vLLM, SGLang, KTransformers, and TensorRT-LLM. Deployment examples for vLLM and SGLang are provided in the Model Deployment Guide.

Highlighted Details

Achieves state-of-the-art (SOTA) performance on several coding benchmarks, including LiveCodeBench v6 (Pass@1: 53.7) and SWE-bench Verified (Agentless Coding Acc: 51.8, Agentic Coding Acc: 65.8).
Demonstrates strong tool-calling capabilities, with examples provided for integrating custom tools.
Offers an OpenAI/Anthropic-compatible API for easy integration.
Supports a 128K context length.

Maintenance & Community

Contact for questions or concerns is support@moonshot.cn.

Licensing & Compatibility

Released under the Modified MIT License, permitting commercial use and integration with closed-source applications.

Limitations & Caveats

Some evaluation data points were omitted due to prohibitive costs. The README mentions a paper link is "coming soon."

Health Check

Last Commit

2 months ago

Responsiveness

1 day

Pull Requests (30d)