test-time-training: Learning to discover at test time
Top 65.9% on SourcePulse
Summary
TTT-Discover introduces a novel approach to enhance Large Language Models (LLMs) by performing reinforcement learning (RL) at test time. This allows models to adapt and train on experience specific to the problem at hand, achieving new state-of-the-art results across challenging domains like mathematics, GPU kernel engineering, algorithm design, and biological data processing. It targets researchers and engineers seeking to push LLM capabilities beyond pre-training.
How It Works
The core innovation lies in applying RL during the inference (test) phase. Instead of relying solely on pre-trained knowledge, TTT-Discover lets the LLM learn from its own attempts and their outcomes within a specific task context. This adaptive learning process, built on RL recipes from the Tinker framework, yields targeted performance improvements on novel or complex problems where general pre-training falls short.
Quick Start & Requirements
Install the requirements for your domain, e.g. `pip install -r requirements/requirements-math.txt`; separate requirements files cover GPU kernels (`requirements-gpumode.txt`), AtCoder (`requirements-ale.txt`), and denoising (`requirements-denoising.txt`). Set the environment variables `TINKER_API_KEY`, `WANDB_API_KEY`, and `WANDB_ENTITY`. Jobs are launched through SLURM; a sample command:

```shell
python main_tinker_submitit.py --nodes 4 --partition default --cpus-per-task 100 \
  env=ac1 model_name="openai/gpt-oss-120b" sampler_type=puct_backprop \
  initial_exp_type=random num_epochs=50 \
  wandb_project="my-project" wandb_name="ac1-run-1"
```

Further details are in `docs/launching.md` and `docs/intro.md`.
Maintenance & Community
The project is under active development, with an upcoming refactor and API simplification announced. While specific community channels are not listed, acknowledgments point to contributions and inspirations from projects like Tinker, ALE-Bench, AlphaEvolve, and OpenEvolve.
Licensing & Compatibility
The project is licensed under the MIT License, which permits broad use, including commercial applications, with minimal restrictions.
Limitations & Caveats
The project is in a self-described "transition period" with an impending API refactor, so breaking changes to the current codebase are likely. Job execution also requires a SLURM cluster environment.