MoE model for research
The dots.llm1 project provides a large-scale Mixture-of-Experts (MoE) language model aimed at researchers and developers who want high-performance LLMs trained on high-quality data without synthetic augmentation. It offers intermediate training checkpoints and efficient inference support, and aims to match state-of-the-art performance with a much smaller active parameter count.
How It Works
dots.llm1 is a 142B total-parameter MoE model that activates 14B parameters per token. Its MoE architecture uses 128 experts per layer (126 fine-grained routed experts plus 2 shared experts) with a top-6 routing mechanism, and the attention layers apply QK-Norm. The model is trained exclusively on non-synthetic data curated through a carefully designed three-stage processing pipeline, an approach aimed at strong performance with high computational efficiency.
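To make the routing concrete, the sketch below wires up a toy top-k MoE layer with shared experts. The expert counts and top-6 selection mirror the figures quoted above; the hidden sizes, module names, and softmax gate are illustrative assumptions, not the model's actual implementation.

```python
# Toy MoE layer: 126 routed experts + 2 shared experts, top-6 routing per token.
# Dimensions, names, and the softmax gate are illustrative assumptions only.
import torch
import torch.nn as nn
import torch.nn.functional as F

def ffn(hidden, inner):
    return nn.Sequential(nn.Linear(hidden, inner), nn.GELU(), nn.Linear(inner, hidden))

class ToyMoELayer(nn.Module):
    def __init__(self, hidden=64, inner=128, n_routed=126, n_shared=2, top_k=6):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(hidden, n_routed, bias=False)   # scores every routed expert
        self.routed = nn.ModuleList(ffn(hidden, inner) for _ in range(n_routed))
        self.shared = nn.ModuleList(ffn(hidden, inner) for _ in range(n_shared))

    def forward(self, x):                                       # x: [tokens, hidden]
        probs = F.softmax(self.router(x), dim=-1)               # routing probabilities
        weights, idx = probs.topk(self.top_k, dim=-1)           # keep the top-6 experts per token
        weights = weights / weights.sum(dim=-1, keepdim=True)   # renormalize the kept weights
        shared_out = sum(e(x) for e in self.shared)             # shared experts see every token
        routed_out = torch.stack([
            sum(weights[t, k] * self.routed[int(idx[t, k])](x[t]) for k in range(self.top_k))
            for t in range(x.size(0))                           # only 6 of 126 routed experts run per token
        ])
        return shared_out + routed_out

layer = ToyMoELayer()
print(layer(torch.randn(4, 64)).shape)                          # torch.Size([4, 64])
```

Because only the selected experts run for each token, the per-token compute scales with the 14B active parameters rather than the 142B total.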
Quick Start & Requirements
docker run --gpus all -v ~/.cache/huggingface:/root/.cache/huggingface -p 8000:8000 --ipc=host rednotehilab/dots1:vllm-openai-v0.9.0.1 --model rednote-hilab/dots.llm1.inst --tensor-parallel-size 8 --trust-remote-code --served-model-name dots1
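The container exposes an OpenAI-compatible endpoint on port 8000, so it can be queried with the standard openai Python client. The snippet below is a minimal sketch assuming the server from the command above is running locally and serving the model under the name dots1.

```python
# Query the vLLM OpenAI-compatible server started by the docker command above.
# Assumes it is reachable at localhost:8000 and serves the model as "dots1".
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")  # vLLM ignores the key
response = client.chat.completions.create(
    model="dots1",
    messages=[{"role": "user", "content": "Briefly explain Mixture-of-Experts routing."}],
    max_tokens=256,
)
print(response.choices[0].message.content)
```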
Highlighted Details
Maintenance & Community
The dots.llm1 series was released in June 2025, with further details available in the accompanying technical report. The listed community channel is WeChat.
Licensing & Compatibility
Limitations & Caveats
The project is relatively new, with its technical report released in June 2025. Although the model aims for state-of-the-art performance, the README does not detail benchmarks or real-world performance figures beyond those in the report. Native Hugging Face Transformers integration is still pending an upstream PR.
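Until that PR lands, a common interim path is to load the checkpoint with trust_remote_code so the repository's own modeling code is used. The sketch below assumes the rednote-hilab/dots.llm1.inst checkpoint ships such custom code and that enough GPU memory is available for the weights; the dtype and device settings are illustrative.

```python
# Interim loading via trust_remote_code while native Transformers support is pending.
# Assumes the Hugging Face checkpoint ships its own modeling code; settings are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "rednote-hilab/dots.llm1.inst"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    torch_dtype="auto",      # use the dtype stored in the checkpoint
    device_map="auto",       # shard across available GPUs
)

inputs = tokenizer("What kind of data was dots.llm1 trained on?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```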