awesome-o1 by srush

Bibliography for OpenAI's o1 project

Created 1 year ago

1,212 stars

Top 32.0% on SourcePulse

8 Experts Love This Project

Jason Knight

Director AI Compilers at NVIDIA; Cofounder of OctoML

Tim J. Baek

Founder of Open WebUI

Johannes Hagemann

Cofounder of Prime Intellect

Shizhe Diao

Author of LMFlow; Research Scientist at NVIDIA

and 4 more!

Project Summary

This repository is a bibliography and survey of papers related to OpenAI's "o1" model, which uses a chain-of-thought approach enhanced by reinforcement learning for improved reasoning and problem-solving. It targets researchers and engineers interested in advanced LLM reasoning techniques, providing a curated list of foundational and related works.

How It Works

The "o1" model learns to "think productively" by working through a chain of thought before answering, much as a person reasons through a problem step by step. Reinforcement learning refines this process: the model improves its reasoning strategies, identifies and corrects its own errors, breaks complex steps into simpler ones, and switches approach when the current one is not working. Performance is described as improving with both training compute (more reinforcement learning) and test-time compute (more thinking time).

Quick Start & Requirements

This repository is a curated list of academic papers and does not involve code execution.

Highlighted Details

The bibliography categorizes papers by key concepts such as Self-Consistency, Scratchpad/Chain-of-Thought, Tree-of-Thought, and AlphaGo-like approaches.
It includes foundational works in reinforcement learning and search relevant to LLM reasoning, like AlphaZero and Libratus.
Papers on self-verification, planning, and scaling laws for LLMs are also featured.
The list covers methods like STaR, ReST, and Expert Iteration, which are crucial for LLM self-improvement.
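As an illustration of one listed category, Self-Consistency can be sketched as sampling several answers to the same question and keeping the most frequent one. The sketch below is a minimal illustration under stated assumptions, not code from the repository or from any listed paper: `self_consistency` and `make_toy_sampler` are hypothetical names, and the sampler replays fixed strings in place of a real stochastic model call.

```python
from collections import Counter
from typing import Callable, Iterable


def self_consistency(sample_answer: Callable[[str], str],
                     question: str,
                     n_samples: int = 5) -> str:
    """Sample n_samples answers and return the most common one (majority vote)."""
    answers = [sample_answer(question) for _ in range(n_samples)]
    # most_common(1) returns [(answer, count)]; on a tie, the answer
    # seen first among the tied ones is returned.
    return Counter(answers).most_common(1)[0][0]


def make_toy_sampler(outputs: Iterable[str]) -> Callable[[str], str]:
    """Hypothetical stand-in for a stochastic model call: replays fixed outputs in order."""
    remaining = iter(outputs)
    return lambda question: next(remaining)


# Assumed toy data: five sampled final answers, three of which agree.
sampler = make_toy_sampler(["42", "41", "42", "42", "40"])
result = self_consistency(sampler, "What is 6 * 7?", n_samples=5)
print(result)  # prints 42
```

Raising `n_samples` spends more test-time compute per question, which is the simplest form of the thinking-time scaling described under How It Works.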

Maintenance & Community

This is a static bibliography. No community or maintenance information is provided.

Licensing & Compatibility

This repository contains links to external academic papers. The licensing of the individual papers varies and is not specified here.

Limitations & Caveats

This is a bibliography and does not contain executable code or model implementations. The "o1" model itself is described conceptually, with no direct access or implementation details provided within this repository.

Health Check

Last Commit

1 year ago

Responsiveness

Inactive

Pull Requests (30d)

Issues (30d)

Star History

1 star in the last 30 days