awesome-o1 by srush

Bibliography for OpenAI's o1 project

created 9 months ago
1,207 stars

Top 33.1% on sourcepulse

Project Summary

This repository is a bibliography and survey of papers related to OpenAI's "o1" model, which uses a chain-of-thought approach enhanced by reinforcement learning for improved reasoning and problem-solving. It targets researchers and engineers interested in advanced LLM reasoning techniques and provides a curated list of foundational and related works.

How It Works

The "o1" model learns to "think productively" by employing a chain of thought, similar to human reasoning. This process is refined through reinforcement learning, which lets the model improve its reasoning strategies, identify and correct errors, break complex steps into simpler ones, and change approach when the current one is not working. The aim is to strengthen the model's reasoning, with performance improving as both training compute (reinforcement learning) and test-time compute (thinking time) increase.
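The idea that accuracy can improve with more test-time compute can be illustrated with a toy simulation. The sketch below is not how o1 works (the source gives no implementation details); it uses majority voting over several sampled answers, in the spirit of the Self-Consistency line of work. `noisy_solver`, its `p_correct` parameter, and the addition task are hypothetical stand-ins for sampling one chain of thought from a model.

```python
import random
from collections import Counter


def noisy_solver(question, rng, p_correct=0.6):
    """Hypothetical stand-in for one sampled chain of thought.

    Returns the true sum with probability p_correct, otherwise a nearby
    wrong answer. A real system would sample from a language model.
    """
    a, b = question
    truth = a + b
    if rng.random() < p_correct:
        return truth
    return truth + rng.choice([-2, -1, 1, 2])


def majority_vote(question, n_samples, rng, p_correct=0.6):
    """Sample n_samples answers and return the most frequent one."""
    answers = [noisy_solver(question, rng, p_correct) for _ in range(n_samples)]
    return Counter(answers).most_common(1)[0][0]


def accuracy(n_samples, trials=2000, seed=0):
    """Estimate how often the voted answer is correct over random questions."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        question = (rng.randint(0, 99), rng.randint(0, 99))
        hits += majority_vote(question, n_samples, rng) == sum(question)
    return hits / trials


if __name__ == "__main__":
    # More samples per question means more "thinking time" in this toy setup.
    for n in (1, 5, 15):
        print(n, accuracy(n))
```

Because wrong answers are spread over several values while correct answers agree, the vote becomes more reliable as the number of samples grows. This is only one simple way to spend test-time compute; the bibliography also covers search-based approaches.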

Quick Start & Requirements

This repository is a curated list of academic papers and does not involve code execution.

Highlighted Details

  • The bibliography categorizes papers by key concepts such as Self-Consistency, Scratchpad/Chain-of-Thought, Tree-of-Thought, and AlphaGo-like approaches.
  • It includes foundational works in reinforcement learning and search relevant to LLM reasoning, like AlphaZero and Libratus.
  • Papers on self-verification, planning, and scaling laws for LLMs are also featured.
  • The list covers self-improvement methods for LLMs such as STaR, ReST, and Expert Iteration.
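The self-improvement methods named in the list share a rough pattern: sample attempts, keep the ones that reach a correct answer, and train on what was kept. The sketch below is a heavily simplified illustration of that pattern, not an implementation of STaR, ReST, or Expert Iteration as published. The "policy" is just a probability table over three hypothetical strategies (`STRATEGIES`), and "fine-tuning" is an assumed update that blends the policy with the distribution of kept samples.

```python
import random
from collections import Counter

# Hypothetical "reasoning strategies"; only "add" solves the task correctly.
STRATEGIES = {
    "add": lambda a, b: a + b,
    "subtract": lambda a, b: a - b,
    "first_only": lambda a, b: a,
}


def sample_attempt(policy, question, rng):
    """Sample one strategy according to the policy and apply it."""
    names = list(policy)
    weights = [policy[name] for name in names]
    name = rng.choices(names, weights=weights, k=1)[0]
    return name, STRATEGIES[name](*question)


def self_improve(policy, questions, rounds, samples_per_q, rng):
    """Repeat: sample attempts, keep correct ones, shift the policy toward them."""
    for _ in range(rounds):
        kept = Counter()
        for question in questions:
            for _ in range(samples_per_q):
                name, answer = sample_attempt(policy, question, rng)
                if answer == question[0] + question[1]:  # filter on correctness
                    kept[name] += 1
        total = sum(kept.values())
        if total == 0:
            continue  # nothing correct was sampled this round; keep the policy
        # Assumed "fine-tuning" step: move halfway toward the kept distribution.
        policy = {n: 0.5 * policy[n] + 0.5 * kept[n] / total for n in policy}
    return policy


if __name__ == "__main__":
    rng = random.Random(0)
    # The second operand is never zero, so the wrong strategies never match.
    questions = [(rng.randint(0, 9), rng.randint(1, 9)) for _ in range(20)]
    start = {name: 1 / 3 for name in STRATEGIES}
    print(self_improve(start, questions, rounds=5, samples_per_q=4, rng=rng))
```

Starting from a uniform policy, probability mass moves toward the strategy whose outputs pass the correctness filter. Real methods replace the probability table with a language model and the blending step with gradient-based fine-tuning.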

Maintenance & Community

This is a static bibliography. No community or maintenance information is provided.

Licensing & Compatibility

This repository contains links to external academic papers. The licensing of the individual papers varies and is not specified here.

Limitations & Caveats

This is a bibliography and does not contain executable code or model implementations. The "o1" model itself is described conceptually, with no direct access or implementation details provided within this repository.

Health Check
Last commit: 8 months ago
Responsiveness: Inactive
Pull Requests (30d): 0
Issues (30d): 0
Star History
19 stars in the last 90 days
