Open-O1 by OpenSource-O1

Open-source AI models aiming to match OpenAI O1's capabilities

created 10 months ago
1,356 stars

Top 30.3% on sourcepulse

View on GitHub
Project Summary

Open O1 is an open-source large language model project aiming to replicate the capabilities of proprietary models like OpenAI's O1. It targets developers and researchers seeking advanced, accessible AI alternatives. The project's core innovation lies in its training methodology, which uses curated SFT data for Chain-of-Thought (CoT) activation to enhance reasoning and problem-solving in LLaMA and Qwen models. This approach aims to achieve O1-like performance with potential for test-time scaling.

How It Works

Open O1 utilizes a curated dataset specifically designed to improve Chain-of-Thought (CoT) reasoning. This dataset is used to fine-tune existing open-source models like LLaMA and Qwen. The CoT activation technique aims to imbue these models with enhanced long-range reasoning and complex problem-solving abilities, mimicking the performance characteristics of proprietary models. The project emphasizes community-driven development and aims for future advancements in test-time scaling.
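The CoT-activation idea can be sketched with a minimal example: each SFT record keeps the reasoning trace in the training target so the fine-tuned model learns to "reason aloud" before answering. The field names and formatting below are illustrative assumptions, not the project's actual data schema.

```python
# Hypothetical shape of a CoT-activation SFT record. Field names
# ("instruction", "chain_of_thought", "answer") are assumptions for
# illustration, not the released dataset's real schema.
def build_sft_sample(record):
    """Fold instruction + chain-of-thought + answer into one training pair."""
    prompt = record["instruction"]
    # Keeping the CoT trace in the completion is what teaches the model
    # to emit intermediate reasoning steps before the final answer.
    target = record["chain_of_thought"] + "\n\nAnswer: " + record["answer"]
    return {"prompt": prompt, "completion": target}

sample = build_sft_sample({
    "instruction": "What is 17 * 23?",
    "chain_of_thought": "17 * 23 = 17 * 20 + 17 * 3 = 340 + 51 = 391.",
    "answer": "391",
})
```

The resulting prompt/completion pairs would then be fed to a standard supervised fine-tuning loop over the LLaMA or Qwen base model.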

Quick Start & Requirements

  • Installation: Clone the repository (git clone https://github.com/OpenSource-O1/Open-O1.git) and install dependencies from Deployment/requirements.txt.
  • Execution: Run python Deployment/app.py.
  • Models: Requires downloading OpenO1-Qwen-7B-v0.1 or OpenO1-LLama-8B-v0.1 from Hugging Face.
  • Dependencies: Python, Git. Hardware requirements are not explicitly stated, but a GPU is implied for running the 7B/8B models at practical speed.
  • Resources: Official quick-start and deployment instructions are available within the repository.

Highlighted Details

  • Performance Benchmarks: Claims superior or competitive performance against LLaMA-3.1-8B-Instruct across GSM8K, MATH, MMLU, ARC-C, and BBH benchmarks in zero-shot settings.
  • Training Data: Released SFT data for CoT Activation.
  • Chat Templates: Adopts LLaMA3.1's chat template format.
  • Model Releases: Released v0.1 of OpenO1 models based on LLaMA and Qwen.
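Since the OpenO1 models adopt LLaMA 3.1's chat template, prompts must be wrapped in its special-token layout. In practice you would call `tokenizer.apply_chat_template` on the downloaded model's own tokenizer; the manual version below reproduces the LLaMA 3.1 layout from memory, purely as an illustration of the format.

```python
# Manual sketch of the LLaMA 3.1 chat template the OpenO1 models adopt.
# Prefer tokenizer.apply_chat_template from the model's tokenizer in real use;
# this hand-rolled version only illustrates the special-token layout.
def format_llama31_chat(messages, add_generation_prompt=True):
    text = "<|begin_of_text|>"
    for m in messages:
        # Each turn: role header, blank line, content, end-of-turn token.
        text += (f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n"
                 f"{m['content']}<|eot_id|>")
    if add_generation_prompt:
        # Open an assistant turn so the model generates the reply.
        text += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return text

prompt = format_llama31_chat([
    {"role": "system", "content": "Think step by step before answering."},
    {"role": "user", "content": "How many primes are below 10?"},
])
```

A CoT-oriented system message like the one above is a natural fit here, since the models are trained to emit their reasoning before the final answer.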

Maintenance & Community

  • Community: Active Discord and Slack channels are available for community engagement.
  • Development: Project is non-profit and welcomes community contributions.
  • Roadmap: Future plans include releasing reward models, training infrastructure, a chatbot arena, and reproducing O1 scaling laws.

Licensing & Compatibility

  • License: Not explicitly stated in the README. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The project is in its early stages of development, and while it shows performance improvements, it has not yet fully achieved O1 capabilities. Quantized models on Hugging Face may exhibit performance degradation.

Health Check

  • Last commit: 8 months ago
  • Responsiveness: 1+ week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 5 stars in the last 90 days

Starred by Chip Huyen (author of AI Engineering, Designing Machine Learning Systems) and Shishir Patil (author of BFCL, Gorilla).