LaCT by a1600012888

Test-Time Training framework for adaptable models

Created 6 months ago · 324 stars · Top 83.8% on SourcePulse

View on GitHub
Project Summary

This repository provides the official code release for the paper "Test-Time Training Done Right," offering a framework and minimal implementations for adapting machine learning models during inference. It targets researchers and practitioners seeking to enhance model robustness and efficiency at test time. The minimal implementations make the paper's test-time training approach easier to understand, modify, and extend.

How It Works

The project centers on the LaCT layer, with minimal implementations provided in minimal_implementations/ as a starting point for understanding and modification. A key technical addition is the recently integrated fused Triton kernels for the TTT layer. These kernels fuse multiple matrix multiplications with their epilogues, a design choice aimed at cutting memory consumption and minimizing global memory reads and writes during training. The result is better computational efficiency and a smaller memory footprint for test-time adaptation.
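To make the idea concrete, below is a minimal, generic sketch of a test-time-training layer: fast weights are updated on each chunk of the input during the forward pass, then used to produce that chunk's output. The objective, update rule, and all names here are illustrative assumptions, not the paper's exact LaCT formulation (for that, see minimal_implementations/).

```python
import torch
import torch.nn.functional as F

def ttt_layer_forward(x, W0, lr=0.1, chunk_size=256):
    """Sketch of a chunk-wise test-time-training layer.

    x:  (seq_len, d) input features.
    W0: (d, d) initial fast-weight matrix (not modified in place).
    The MSE "reconstruction" objective and single gradient step per chunk
    are placeholders for the paper's actual inner-loop update.
    """
    W = W0.clone()
    outputs = []
    for start in range(0, x.shape[0], chunk_size):
        chunk = x[start:start + chunk_size]            # (c, d)
        # Inner loop: one gradient step on the fast weights for this chunk.
        W_inner = W.detach().requires_grad_(True)
        loss = F.mse_loss(chunk @ W_inner, chunk)      # placeholder self-supervised loss
        (grad,) = torch.autograd.grad(loss, W_inner)
        W = (W_inner - lr * grad).detach()
        # Outer pass: apply the freshly updated fast weights to the chunk.
        outputs.append(chunk @ W)
    return torch.cat(outputs, dim=0)

# Usage: 1024 tokens with 64-dim features, identity init for the fast weights.
x = torch.randn(1024, 64)
y = ttt_layer_forward(x, torch.eye(64))
print(y.shape)  # torch.Size([1024, 64])
```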

Quick Start & Requirements

Minimal implementations are available in minimal_implementations/ for understanding and modification. The README snippet does not include direct installation commands or detailed setup instructions. The inclusion of Triton kernels suggests a Python environment with CUDA support is likely required for optimized performance. The README links to the paper, project website, and pre-trained models on HuggingFace for further exploration.
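As a rough starting point, the following sanity check (an assumption based on the mention of Triton kernels, not official setup instructions from the repository) verifies that a CUDA-capable PyTorch build and Triton are importable before trying the optimized code paths.

```python
# Environment sanity check; exact dependency versions are not listed in the
# README snippet, so treat this as an assumption rather than official setup.
import torch

if not torch.cuda.is_available():
    print("No CUDA device found; the fused Triton kernels will likely be unusable.")

try:
    import triton
    print(f"PyTorch {torch.__version__}, Triton {triton.__version__}")
except ImportError:
    print("Triton is not installed; only the plain PyTorch paths would run.")
```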

Highlighted Details

  • Official code release for the paper "Test-Time Training Done Right," enabling reproduction and extension.
  • Provides minimal, accessible implementations of the LaCT layer, facilitating easier understanding and modification.
  • Features recently added fused Triton kernels for the TTT layer, designed to reduce training memory consumption by optimizing memory access patterns (a schematic kernel sketch follows this list).
  • Demonstrates broad applicability with completed releases for language models, novel view synthesis, and video model finetuning.
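The memory saving from epilogue fusion can be illustrated with a generic Triton kernel that applies a SiLU activation to a matmul result inside the same kernel, so the pre-activation matrix never round-trips through global memory. This is a simplified, standalone sketch of the technique, not the repository's actual TTT kernels; all names and block sizes below are illustrative.

```python
import torch
import triton
import triton.language as tl

@triton.jit
def matmul_silu_kernel(a_ptr, b_ptr, c_ptr, M, N, K,
                       stride_am, stride_ak, stride_bk, stride_bn,
                       stride_cm, stride_cn,
                       BLOCK_M: tl.constexpr, BLOCK_N: tl.constexpr, BLOCK_K: tl.constexpr):
    # One program computes a BLOCK_M x BLOCK_N tile of C = silu(A @ B).
    pid_m = tl.program_id(0)
    pid_n = tl.program_id(1)
    offs_m = pid_m * BLOCK_M + tl.arange(0, BLOCK_M)
    offs_n = pid_n * BLOCK_N + tl.arange(0, BLOCK_N)
    offs_k = tl.arange(0, BLOCK_K)
    acc = tl.zeros((BLOCK_M, BLOCK_N), dtype=tl.float32)
    for k in range(0, K, BLOCK_K):
        a = tl.load(a_ptr + offs_m[:, None] * stride_am + (k + offs_k)[None, :] * stride_ak,
                    mask=(offs_m[:, None] < M) & ((k + offs_k)[None, :] < K), other=0.0)
        b = tl.load(b_ptr + (k + offs_k)[:, None] * stride_bk + offs_n[None, :] * stride_bn,
                    mask=((k + offs_k)[:, None] < K) & (offs_n[None, :] < N), other=0.0)
        acc += tl.dot(a, b)
    # Fused epilogue: SiLU is applied in registers before the single store,
    # so the pre-activation tile is never written to global memory.
    acc = acc * tl.sigmoid(acc)
    tl.store(c_ptr + offs_m[:, None] * stride_cm + offs_n[None, :] * stride_cn,
             acc, mask=(offs_m[:, None] < M) & (offs_n[None, :] < N))

def matmul_silu(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    M, K = a.shape
    K2, N = b.shape
    assert K == K2
    c = torch.empty((M, N), device=a.device, dtype=torch.float32)
    grid = (triton.cdiv(M, 64), triton.cdiv(N, 64))
    matmul_silu_kernel[grid](a, b, c, M, N, K,
                             a.stride(0), a.stride(1), b.stride(0), b.stride(1),
                             c.stride(0), c.stride(1),
                             BLOCK_M=64, BLOCK_N=64, BLOCK_K=32)
    return c

# Usage (CUDA required):
# y = matmul_silu(torch.randn(512, 256, device="cuda"), torch.randn(256, 128, device="cuda"))
```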

Maintenance & Community

No specific details regarding maintainers, community channels (e.g., Discord, Slack), or a public roadmap are present in the provided text.

Licensing & Compatibility

The license type is not specified in the provided README snippet, making it difficult to assess suitability for commercial use or closed-source integration.

Limitations & Caveats

The provided README snippet lacks explicit details on project limitations, alpha status, or known bugs. The absence of clear installation instructions, dependency lists, and licensing information is an immediate blocker for rapid adoption and for assessing commercial viability.

Health Check

  • Last Commit: 1 week ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 2
  • Star History: 19 stars in the last 30 days

Explore Similar Projects

Starred by Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI).

dots.llm1 by rednote-hilab
  • 0.2% · 468 stars
  • MoE model for research
  • Created 6 months ago; updated 3 months ago
Starred by Wing Lian (Founder of Axolotl AI) and Stas Bekman (Author of "Machine Learning Engineering Open Book"; Research Engineer at Snowflake).

fms-fsdp by foundation-model-stack
  • 0% · 271 stars
  • Efficiently train foundation models with PyTorch
  • Created 1 year ago; updated 6 days ago
Starred by Shizhe Diao (Author of LMFlow; Research Scientist at NVIDIA), Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI), and 8 more.

EAGLE by SafeAILab
  • 0.9% · 2k stars
  • Speculative decoding research paper for faster LLM inference
  • Created 2 years ago; updated 1 week ago