discover by test-time-training

Learning to discover at test time

Created 2 months ago
536 stars

Top 59.1% on SourcePulse

View on GitHub
Project Summary

TTT-Discover introduces a novel approach to enhance Large Language Models (LLMs) by performing reinforcement learning (RL) at test time. This allows models to adapt and train on experience specific to the problem at hand, achieving new state-of-the-art results across challenging domains like mathematics, GPU kernel engineering, algorithm design, and biological data processing. It targets researchers and engineers seeking to push LLM capabilities beyond pre-training.

How It Works

The core innovation is applying RL at inference time. Instead of relying solely on pre-trained knowledge, TTT-Discover lets the LLM learn from its own attempts and their outcomes on the specific problem it is solving. This adaptive loop, built on frameworks like Tinker for RL recipes, yields targeted performance gains on novel or complex problems where general pre-training falls short.

Quick Start & Requirements

Installation involves pip install -r requirements/requirements-math.txt, with additional requirements files available for GPU kernels (requirements-gpumode.txt), AtCoder (requirements-ale.txt), and denoising (requirements-denoising.txt). Environment variables TINKER_API_KEY, WANDB_API_KEY, and WANDB_ENTITY must be set. Launching jobs requires SLURM. A sample command is provided: python main_tinker_submitit.py --nodes 4 --partition default --cpus-per-task 100 env=ac1 model_name="openai/gpt-oss-120b" sampler_type=puct_backprop initial_exp_type=random num_epochs=50 wandb_project="my-project" wandb_name="ac1-run-1". Further details are in docs/launching.md and docs/intro.md.
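Put together, a setup session might look like the following. The requirements paths, environment variable names, and launch command come from the repository's documentation; the export values shown are placeholders you must replace with your own credentials.

```shell
# Install the math track's dependencies (swap in requirements-gpumode.txt,
# requirements-ale.txt, or requirements-denoising.txt for other domains).
pip install -r requirements/requirements-math.txt

# Required credentials (placeholder values shown).
export TINKER_API_KEY="your-tinker-key"
export WANDB_API_KEY="your-wandb-key"
export WANDB_ENTITY="your-wandb-entity"

# Launch via SLURM (sample command from the project docs).
python main_tinker_submitit.py --nodes 4 --partition default --cpus-per-task 100 \
    env=ac1 model_name="openai/gpt-oss-120b" sampler_type=puct_backprop \
    initial_exp_type=random num_epochs=50 \
    wandb_project="my-project" wandb_name="ac1-run-1"
```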

Highlighted Details

  • Mathematics: Achieved state-of-the-art Erdős Overlap score of 0.380876, surpassing previous AI bests.
  • Kernel Engineering: Set new benchmarks for GPU kernel TriMul performance on A100 (2198 μs) and H100 (1161 μs) GPUs, outperforming human bests.
  • Algorithm Engineering: Established state-of-the-art on AtCoder AHC39 (Geometry) with 567,062 points.
  • Biology: Demonstrated superior performance in single-cell RNA-seq denoising, achieving 0.71 on PBMC and 0.73 on Tabula benchmarks.

Maintenance & Community

The project is under active development; a refactor and API simplification have been announced. While no dedicated community channels are listed, the acknowledgments credit contributions and inspiration from projects such as Tinker, ALE-Bench, AlphaEvolve, and OpenEvolve.

Licensing & Compatibility

The project is licensed under the MIT License, which permits broad use, including commercial applications, with minimal restrictions.

Limitations & Caveats

The project is currently in a "transition period" with an impending API refactor, suggesting potential instability or breaking changes in the current codebase. Job execution relies on a SLURM cluster environment.

Health Check

  • Last Commit: 1 week ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 2
  • Issues (30d): 4
  • Star History: 44 stars in the last 30 days

Explore Similar Projects

Starred by Junyang Lin (Core Maintainer at Alibaba Qwen), Shizhe Diao (Author of LMFlow; Research Scientist at NVIDIA), and 1 more.

LMaaS-Papers by txsun1997
Curated list of LMaaS research papers. 545 stars. Created 3 years ago; updated 1 year ago.
Starred by Maxime Labonne (Head of Post-Training at Liquid AI), Shizhe Diao (Author of LMFlow; Research Scientist at NVIDIA), and 19 more.

llm-course by mlabonne
LLM course with roadmaps and notebooks. 78k stars. Created 2 years ago; updated 2 months ago.