SkyThought by NovaSky-AI

Training recipes for Sky-T1 family of models

created 6 months ago
3,318 stars

Top 15.0% on sourcepulse

View on GitHub
Project Summary

SkyThought provides open-source code, data, and models for training the Sky-T1 series of reasoning models. It aims to let researchers and developers replicate and build upon state-of-the-art reasoning capabilities in mathematics, coding, and science, with a focus on cost-effective training.

How It Works

The project uses LLaMA-Factory for training and includes a dedicated library for data curation and evaluation. It emphasizes techniques for curbing "overthinking" and shortening reasoning sequences while maintaining accuracy, as demonstrated by the Sky-T1-32B-Flash model. Reinforcement learning is also employed to push model capabilities beyond what standard distillation achieves.

Quick Start & Requirements

  • Install via pip: pip install skythought
  • Install from source: clone the repo, create and activate a virtual environment (Python 3.10+ recommended), then run uv pip install -e .
  • Evaluation command: skythought evaluate --model NovaSky-AI/Sky-T1-32B-Preview --task aime24
  • Supported datasets include AIME'24, MATH500, GPQADiamond, MMLU, ARC-Challenge, OlympiadBench, AMC'23, TACO, APPS, LiveCodeBench, MMLU Pro, MinervaMath, GSM8K, and AIME'25.
  • Official documentation and evaluation guides are available.
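The install and evaluation steps above can be collected into a single shell session. This is a sketch based on the bullets: it assumes Python 3.10+, that uv is available for the source install, and that the model and task names are exactly as listed.

```shell
# Option A: install the released package
pip install skythought

# Option B: install from source (assumes uv is installed)
git clone https://github.com/NovaSky-AI/SkyThought.git
cd SkyThought
python3 -m venv .venv
source .venv/bin/activate
uv pip install -e .

# Run an evaluation on AIME'24 with the preview model
skythought evaluate --model NovaSky-AI/Sky-T1-32B-Preview --task aime24
```

Other task names from the supported-dataset list (e.g. math500, gsm8k) should slot into the --task flag the same way, though exact identifiers may differ; consult the project's evaluation guide.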

Highlighted Details

  • Sky-T1-32B-Preview model and data are fully open-sourced.
  • Achieves competitive performance on reasoning benchmarks like MATH500 and AIME'24, outperforming comparable models in some cases.
  • Offers models like Sky-T1-7B and Sky-T1-mini demonstrating RL enhancements.
  • Sky-T1-32B-Flash addresses overthinking and reduces reasoning sequence lengths.

Maintenance & Community

  • Active development with recent releases in early 2025.
  • Links to Hugging Face, Twitter, and Discord for community engagement.
  • Acknowledgements include Berkeley Sky Computing Lab, Lambda Labs, Anyscale, Databricks, and the Still-2 and Qwen teams.

Licensing & Compatibility

  • The repository is fully open-sourced, with code, data, and model weights available.
  • Specific license details are not explicitly stated in the README, but the emphasis on "fully open-source" suggests permissive licensing for community use and replication.

Limitations & Caveats

  • While aiming for cost-effective training ($450 target), the actual compute costs may vary.
  • The README highlights specific models and their features, but a comprehensive overview of all supported training configurations or potential limitations across the entire Sky-T1 series is not detailed.

Health Check

  • Last commit: 3 weeks ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 1
  • Issues (30d): 0

Star History

  • 109 stars in the last 90 days
