SkyThought provides open-source code, data, and model weights for training the Sky-T1 series of reasoning models. The goal is to let researchers and developers reproduce and build on state-of-the-art reasoning capabilities in mathematics, coding, and science, with an emphasis on cost-effective training.
How It Works
The project uses LLaMA-Factory for training and ships a dedicated skythought library for data curation and evaluation. It emphasizes techniques for curbing "overthinking", shortening reasoning sequences while maintaining accuracy, as demonstrated by the Sky-T1-32B-Flash model. Reinforcement learning is also employed to push model capabilities beyond what standard distillation provides.
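For the training step, LLaMA-Factory runs are typically launched from a YAML recipe. The sketch below is only a hedged illustration of that pattern, not a command taken from this repository: the recipe path and filename are assumptions, and the actual configs live in the repo's training documentation.

```bash
# Hedged sketch of the LLaMA-Factory training entry point.
# "llamafactory-cli train <config>" is LLaMA-Factory's standard launcher;
# the YAML path below is a placeholder, not a file shipped with SkyThought.
llamafactory-cli train recipes/sky_t1_32b_sft.yaml
```

The preference-optimization pass behind Sky-T1-32B-Flash and the RL stages would presumably be driven by their own recipes and scripts; see the repository for the concrete workflows.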
Quick Start & Requirements
- Install via pip:
  pip install skythought
- Install from source: clone the repo, create and activate a virtual environment (Python 3.10+ recommended), then run
  uv pip install -e .
  (a combined walkthrough is sketched after this list)
- Evaluation command:
  skythought evaluate --model NovaSky-AI/Sky-T1-32B-Preview --task aime24
- Supported evaluation tasks include AIME'24, MATH500, GPQADiamond, MMLU, ARC-Challenge, OlympiadBench, AMC'23, TACO, APPS, LiveCodeBench, MMLU Pro, MinervaMath, GSM8K, and AIME'25.
- Official documentation and evaluation guides are available.
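Putting the source-install and evaluation steps together, here is a hedged end-to-end sketch. The GitHub URL, the uv-based environment setup, and the task names other than aime24 are assumptions to verify against the official evaluation guide; only the skythought evaluate invocation for aime24 is taken verbatim from above.

```bash
# Hedged walkthrough (assumptions: repo URL, uv-based environment setup,
# and task identifiers other than aime24). Verify against the official docs.
git clone https://github.com/NovaSky-AI/SkyThought.git
cd SkyThought
uv venv --python 3.10 && source .venv/bin/activate   # Python 3.10+ recommended
uv pip install -e .

# Evaluate the preview model on AIME'24 (command shown in the quick start).
skythought evaluate --model NovaSky-AI/Sky-T1-32B-Preview --task aime24

# Sweep a few more of the supported benchmarks; exact task names may differ
# slightly from these guesses -- check the evaluation guide.
for task in math500 gpqa_diamond gsm8k; do
  skythought evaluate --model NovaSky-AI/Sky-T1-32B-Preview --task "$task"
done
```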
Highlighted Details
- Sky-T1-32B-Preview model weights and training data are fully open-sourced (a download sketch follows this list).
- Achieves competitive performance on reasoning benchmarks such as MATH500 and AIME'24, outperforming comparable models in some cases.
- Offers Sky-T1-7B and Sky-T1-mini, which demonstrate capability gains from reinforcement learning.
- Sky-T1-32B-Flash addresses overthinking and reduces reasoning sequence lengths.
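Because the weights are published under the NovaSky-AI namespace used in the evaluation command above, they can presumably be fetched with the Hugging Face CLI; the exact repository IDs and any access requirements should be confirmed on the hub.

```bash
# Hedged sketch: downloading open-sourced weights with the Hugging Face CLI.
# The repo ID follows the NovaSky-AI naming used elsewhere in this document;
# confirm the exact ID on huggingface.co before downloading.
huggingface-cli download NovaSky-AI/Sky-T1-32B-Preview --local-dir ./Sky-T1-32B-Preview
```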
Maintenance & Community
- Active development with recent releases in early 2025.
- Links to Hugging Face, Twitter, and Discord for community engagement.
- Acknowledgements include Berkeley Sky Computing Lab, Lambda Labs, Anyscale, Databricks, and the Still-2 and Qwen teams.
Licensing & Compatibility
- The repository is fully open-sourced, with code, data, and model weights available.
- Specific license details are not stated in the README; the "fully open-source" framing suggests permissive terms, but confirm the repository's LICENSE file and the individual model cards before reuse or redistribution.
Limitations & Caveats
- While the project aims for cost-effective training (the ~$450 figure reported for Sky-T1-32B-Preview), actual compute costs depend on hardware pricing and availability.
- The README highlights individual models and their features, but it does not give a comprehensive overview of all supported training configurations or of limitations across the full Sky-T1 series.