SkyThought by NovaSky-AI

Training recipes for Sky-T1 family of models

created 6 months ago
3,318 stars

Top 15.0% on sourcepulse

View on GitHub
Project Summary

SkyThought provides open-source code, data, and models for training the Sky-T1 series of reasoning models. It aims to let researchers and developers replicate and build upon state-of-the-art reasoning capabilities in mathematics, coding, and science, with a focus on cost-effective training.

How It Works

The project uses LLaMA-Factory for training and includes a dedicated library for data curation and evaluation. It emphasizes techniques for curbing "overthinking" and shortening reasoning sequences while maintaining accuracy, as demonstrated by the Sky-T1-32B-Flash model. Reinforcement learning is also employed to push model capabilities beyond what standard distillation achieves.

Quick Start & Requirements

  • Install via pip: pip install skythought
  • Install from source: clone the repo, create and activate a virtual environment (Python 3.10+ recommended), then run uv pip install -e .
  • Evaluation command: skythought evaluate --model NovaSky-AI/Sky-T1-32B-Preview --task aime24
  • Supported datasets include AIME'24, MATH500, GPQADiamond, MMLU, ARC-Challenge, OlympiadBench, AMC'23, TACO, APPS, LiveCodeBench, MMLU Pro, MinervaMath, GSM8K, and AIME'25.
  • Official documentation and evaluation guides are available.
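The install and evaluation steps above can be collected into a single shell session. This is a sketch based on the bullets: it assumes Python 3.10+, that uv is available for the source install, and that the model and task names are exactly as listed.

```shell
# Option A: install the released package
pip install skythought

# Option B: install from source (assumes uv is installed)
git clone https://github.com/NovaSky-AI/SkyThought.git
cd SkyThought
python3 -m venv .venv
source .venv/bin/activate
uv pip install -e .

# Run an evaluation on AIME'24 with the preview model
skythought evaluate --model NovaSky-AI/Sky-T1-32B-Preview --task aime24
```

Other task names from the supported-dataset list (e.g. math500, gsm8k) should slot into the --task flag the same way, though exact identifiers may differ; consult the project's evaluation guide.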

Highlighted Details

  • Sky-T1-32B-Preview model and data are fully open-sourced.
  • Achieves competitive performance on reasoning benchmarks like MATH500 and AIME'24, outperforming comparable models in some cases.
  • Offers models like Sky-T1-7B and Sky-T1-mini demonstrating RL enhancements.
  • Sky-T1-32B-Flash addresses overthinking and reduces reasoning sequence lengths.

Maintenance & Community

  • Active development with recent releases in early 2025.
  • Links to Hugging Face, Twitter, and Discord for community engagement.
  • Acknowledgements include Berkeley Sky Computing Lab, Lambda Labs, Anyscale, Databricks, and the Still-2 and Qwen teams.

Licensing & Compatibility

  • The repository is fully open-sourced, with code, data, and model weights available.
  • Specific license details are not explicitly stated in the README, but the emphasis on "fully open-source" suggests permissive licensing for community use and replication.

Limitations & Caveats

  • While aiming for cost-effective training ($450 target), the actual compute costs may vary.
  • The README highlights specific models and their features, but a comprehensive overview of all supported training configurations or potential limitations across the entire Sky-T1 series is not detailed.

Health Check

  • Last commit: 3 weeks ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 1
  • Issues (30d): 0

Star History

  • 109 stars in the last 90 days
