Research paper implementation for abstract reasoning via test-time training
Top 83.8% on SourcePulse
This repository provides the official implementation for "The Surprising Effectiveness of Test-Time Training for Abstract Reasoning," focusing on applying Test-Time Training (TTT) to large language models for abstract reasoning tasks. It is intended for researchers and practitioners interested in advancing AI capabilities in complex problem-solving.
How It Works
The project leverages a modified version of the torchtune library for its Test-Time Training pipeline. It fine-tunes large language models (specifically Llama-3 variants) and then applies TTT to adapt these models to specific abstract reasoning tasks during inference. This approach aims to improve performance by allowing the model to learn from the test data distribution without explicit retraining.
Quick Start & Requirements
Setup involves installing the modified torchtune in editable mode, then installing the remaining dependencies via pip install torch torchao --pre --upgrade --index-url https://download.pytorch.org/whl/nightly/cu121 and pip install -r requirements.txt.
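A plausible setup sequence, sketched from the steps above; the local torchtune path is an assumption, and the pip commands are taken from the README:

```shell
# Assumed path to the repository's modified torchtune checkout
pip install -e ./torchtune

# Nightly PyTorch/torchao wheels for CUDA 12.1, as specified in the README
pip install torch torchao --pre --upgrade --index-url https://download.pytorch.org/whl/nightly/cu121

# Remaining project dependencies
pip install -r requirements.txt
```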
Maintenance & Community
The repository is marked as "in progress" with a caution to report errors. No specific community channels or roadmap are explicitly mentioned in the README.
Licensing & Compatibility
The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.
Limitations & Caveats
The repository is explicitly stated to be in progress and should be used with caution. Some functionalities, such as lora_to_output, may not apply to all model versions. Separate vLLM environments are required for different Llama versions due to compatibility issues.
Last updated 8 months ago; currently inactive.