dots.llm1 by rednote-hilab

MoE model for research

Created 4 months ago
462 stars

Top 65.6% on SourcePulse

View on GitHub
Project Summary

The dots.llm1 project provides a large-scale Mixture-of-Experts (MoE) language model, designed for researchers and developers seeking high-performance LLMs trained on quality data without synthetic augmentation. It offers intermediate checkpoints and efficient inference capabilities, aiming to match state-of-the-art performance with a reduced active parameter count.

How It Works

dots.llm1 is a 142B total-parameter MoE model that activates 14B parameters per token. It features an MoE architecture with 128 experts (126 fine-grained routed experts and 2 shared experts) and a top-6 routing mechanism, and incorporates QK-Norm in its attention layers. The training data comes from a meticulously crafted three-stage processing pipeline built exclusively on non-synthetic data. This combination aims for enhanced performance at reduced inference cost.
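The routing scheme can be pictured with a short sketch. The PyTorch snippet below is a minimal, illustrative implementation of fine-grained top-k routing with always-on shared experts, using the counts quoted above (126 routed experts, 2 shared experts, top-6 selection); the hidden sizes, gating function, and expert structure are assumptions for illustration, not dots.llm1's actual implementation.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SketchMoELayer(nn.Module):
        # Illustrative fine-grained MoE block: shared experts process every
        # token, while a router selects the top-k routed experts per token.
        def __init__(self, d_model=1024, d_ff=512, n_routed=126, n_shared=2, top_k=6):
            super().__init__()
            self.top_k = top_k
            self.router = nn.Linear(d_model, n_routed, bias=False)
            def expert():
                return nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(),
                                     nn.Linear(d_ff, d_model))
            self.routed = nn.ModuleList(expert() for _ in range(n_routed))
            self.shared = nn.ModuleList(expert() for _ in range(n_shared))

        def forward(self, x):                          # x: (num_tokens, d_model)
            shared_out = sum(e(x) for e in self.shared)
            gates = F.softmax(self.router(x), dim=-1)  # (num_tokens, n_routed)
            w, idx = gates.topk(self.top_k, dim=-1)    # top-6 experts per token
            w = w / w.sum(dim=-1, keepdim=True)        # renormalize selected gates
            rows = []
            for t in range(x.size(0)):                 # naive per-token loop for clarity
                token_out = shared_out[t]
                for g, e in zip(w[t], idx[t]):
                    token_out = token_out + g * self.routed[int(e)](x[t])
                rows.append(token_out)
            return torch.stack(rows)

    # Only top_k + n_shared expert MLPs run per token, which is what keeps the
    # active parameter count far below the total parameter count.
    layer = SketchMoELayer()
    print(layer(torch.randn(4, 1024)).shape)  # torch.Size([4, 1024])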

Quick Start & Requirements

  • Docker (the container exposes vLLM's OpenAI-compatible API on port 8000; an example request is sketched after this list): docker run --gpus all -v ~/.cache/huggingface:/root/.cache/huggingface -p 8000:8000 --ipc=host rednotehilab/dots1:vllm-openai-v0.9.0.1 --model rednote-hilab/dots.llm1.inst --tensor-parallel-size 8 --trust-remote-code --served-model-name dots1
  • Prerequisites: GPU (tensor-parallel-size 8 recommended for vLLM), Docker, Hugging Face Transformers, vLLM, or SGLang.
  • Resources: Requires significant GPU memory; all 142B parameters must be resident even though only 14B are active per token, hence the recommended tensor parallelism of 8.
  • Links: Hugging Face Collection, Docker Hub, vLLM PR, SGLang PR.
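Because the Docker command above starts vLLM's OpenAI-compatible server, any OpenAI-style client can talk to it. A minimal sketch, assuming the container is running locally on port 8000 and serving the model under the name dots1 as in that command:

    from openai import OpenAI

    # Point the OpenAI client at the local vLLM server started by the Docker
    # command above. vLLM does not check the API key by default, so any
    # placeholder value works unless a key was explicitly configured.
    client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

    response = client.chat.completions.create(
        model="dots1",   # must match --served-model-name from the Docker command
        messages=[{"role": "user",
                   "content": "Summarize Mixture-of-Experts models in two sentences."}],
        max_tokens=128,
        temperature=0.7,
    )
    print(response.choices[0].message.content)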

Highlighted Details

  • 14B activated parameters out of 142B total, with performance reported as comparable to Qwen2.5-72B.
  • Trained on high-quality, non-synthetic data using a three-stage processing pipeline.
  • Supports 32,768 token context length.
  • Includes intermediate training checkpoints for research into LLM learning dynamics; a Transformers loading sketch follows this list.
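As referenced in the last item, the model can also be loaded directly. The sketch below assumes Hugging Face Transformers with trust_remote_code=True (needed while native support is pending, see Limitations & Caveats) and reuses the instruct checkpoint ID from the Docker command; the commented revision argument is hypothetical and only marks where an intermediate checkpoint name from the Hugging Face collection would go.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "rednote-hilab/dots.llm1.inst"   # instruct checkpoint from the Docker example

    # trust_remote_code=True pulls the model's custom code from the Hub while
    # native Transformers integration is still pending an upstream PR.
    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,
        device_map="auto",            # shard the 142B weights across available GPUs
        trust_remote_code=True,
        # revision="<checkpoint-branch>",  # hypothetical: select an intermediate
        #                                  # training checkpoint from the HF collection
    )

    messages = [{"role": "user", "content": "Explain QK-Norm in one paragraph."}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(inputs, max_new_tokens=200)
    print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))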

Maintenance & Community

The dots.llm1 series was released in June 2025, with further details in the accompanying technical report. The listed community channel is WeChat.

Licensing & Compatibility

  • License: MIT.
  • Compatibility: Permissive MIT license allows for commercial use and integration into closed-source projects.

Limitations & Caveats

The project is relatively new; its technical report was released in June 2025. While the model targets state-of-the-art performance, the README does not detail benchmarks or real-world metrics beyond what the report claims, and native Hugging Face Transformers integration is still pending an upstream PR.

Health Check

  • Last Commit: 4 weeks ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 9 stars in the last 30 days

Explore Similar Projects

Starred by Shizhe Diao (Author of LMFlow; Research Scientist at NVIDIA), Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI), and 8 more.

  • EAGLE by SafeAILab (10.6%, 2k stars): Speculative decoding research paper for faster LLM inference. Created 1 year ago; updated 1 week ago. Starred by Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla, OpenAI; author of CS 231n), Tim J. Baek (Founder of Open WebUI), and 7 more.
  • gemma.cpp by google (0.1%, 7k stars): C++ inference engine for Google's Gemma models. Created 1 year ago; updated 1 day ago.