LaCT by a1600012888

Test-Time Training framework for adaptable models

Created 6 months ago · 324 stars · Top 83.8% on SourcePulse

View on GitHub
Project Summary

This repository provides the official code release for the paper "Test-Time Training Done Right," offering a framework and minimal implementations for adapting machine learning models during inference. It targets researchers and practitioners seeking to enhance model robustness and efficiency at test time. The minimal implementations make the paper's test-time training approach easier to understand, modify, and extend.

How It Works

The project centers on the LaCT layer, with minimal implementations provided in minimal_implementations/ as a starting point for understanding and modification. A key technical addition is the recently integrated fused Triton kernels for the TTT layer. These kernels fuse multiple matrix multiplications with their epilogues, a design choice aimed at cutting memory consumption and minimizing global memory reads and writes during training. The result is better computational efficiency and a smaller memory footprint for test-time adaptation.
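To make the idea concrete, below is a minimal, generic sketch of a test-time-training layer: fast weights are updated on each chunk of the input during the forward pass, then used to produce that chunk's output. The objective, update rule, and all names here are illustrative assumptions, not the paper's exact LaCT formulation (for that, see minimal_implementations/).

```python
import torch
import torch.nn.functional as F

def ttt_layer_forward(x, W0, lr=0.1, chunk_size=256):
    """Sketch of a chunk-wise test-time-training layer.

    x:  (seq_len, d) input features.
    W0: (d, d) initial fast-weight matrix (not modified in place).
    The MSE "reconstruction" objective and single gradient step per chunk
    are placeholders for the paper's actual inner-loop update.
    """
    W = W0.clone()
    outputs = []
    for start in range(0, x.shape[0], chunk_size):
        chunk = x[start:start + chunk_size]            # (c, d)
        # Inner loop: one gradient step on the fast weights for this chunk.
        W_inner = W.detach().requires_grad_(True)
        loss = F.mse_loss(chunk @ W_inner, chunk)      # placeholder self-supervised loss
        (grad,) = torch.autograd.grad(loss, W_inner)
        W = (W_inner - lr * grad).detach()
        # Outer pass: apply the freshly updated fast weights to the chunk.
        outputs.append(chunk @ W)
    return torch.cat(outputs, dim=0)

# Usage: 1024 tokens with 64-dim features, identity init for the fast weights.
x = torch.randn(1024, 64)
y = ttt_layer_forward(x, torch.eye(64))
print(y.shape)  # torch.Size([1024, 64])
```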

Quick Start & Requirements

Minimal implementations are available in minimal_implementations/ for understanding and modification. The README snippet does not include direct installation commands or detailed setup instructions. The inclusion of Triton kernels suggests a Python environment with CUDA support is likely required for optimized performance. The README links to the paper, project website, and pre-trained models on HuggingFace for further exploration.
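As a rough starting point, the following sanity check (an assumption based on the mention of Triton kernels, not official setup instructions from the repository) verifies that a CUDA-capable PyTorch build and Triton are importable before trying the optimized code paths.

```python
# Environment sanity check; exact dependency versions are not listed in the
# README snippet, so treat this as an assumption rather than official setup.
import torch

if not torch.cuda.is_available():
    print("No CUDA device found; the fused Triton kernels will likely be unusable.")

try:
    import triton
    print(f"PyTorch {torch.__version__}, Triton {triton.__version__}")
except ImportError:
    print("Triton is not installed; only the plain PyTorch paths would run.")
```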

Highlighted Details

  • Official code release for the paper "Test-Time Training Done Right," enabling reproduction and extension.
  • Provides minimal, accessible implementations of the LaCT layer, facilitating easier understanding and modification.
  • Features recently added fused Triton kernels for the TTT layer, designed to reduce training memory consumption by optimizing memory access patterns (a schematic kernel sketch follows this list).
  • Demonstrates broad applicability with completed releases for language models, novel view synthesis, and video model finetuning.
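The memory saving from epilogue fusion can be illustrated with a generic Triton kernel that applies a SiLU activation to a matmul result inside the same kernel, so the pre-activation matrix never round-trips through global memory. This is a simplified, standalone sketch of the technique, not the repository's actual TTT kernels; all names and block sizes below are illustrative.

```python
import torch
import triton
import triton.language as tl

@triton.jit
def matmul_silu_kernel(a_ptr, b_ptr, c_ptr, M, N, K,
                       stride_am, stride_ak, stride_bk, stride_bn,
                       stride_cm, stride_cn,
                       BLOCK_M: tl.constexpr, BLOCK_N: tl.constexpr, BLOCK_K: tl.constexpr):
    # One program computes a BLOCK_M x BLOCK_N tile of C = silu(A @ B).
    pid_m = tl.program_id(0)
    pid_n = tl.program_id(1)
    offs_m = pid_m * BLOCK_M + tl.arange(0, BLOCK_M)
    offs_n = pid_n * BLOCK_N + tl.arange(0, BLOCK_N)
    offs_k = tl.arange(0, BLOCK_K)
    acc = tl.zeros((BLOCK_M, BLOCK_N), dtype=tl.float32)
    for k in range(0, K, BLOCK_K):
        a = tl.load(a_ptr + offs_m[:, None] * stride_am + (k + offs_k)[None, :] * stride_ak,
                    mask=(offs_m[:, None] < M) & ((k + offs_k)[None, :] < K), other=0.0)
        b = tl.load(b_ptr + (k + offs_k)[:, None] * stride_bk + offs_n[None, :] * stride_bn,
                    mask=((k + offs_k)[:, None] < K) & (offs_n[None, :] < N), other=0.0)
        acc += tl.dot(a, b)
    # Fused epilogue: SiLU is applied in registers before the single store,
    # so the pre-activation tile is never written to global memory.
    acc = acc * tl.sigmoid(acc)
    tl.store(c_ptr + offs_m[:, None] * stride_cm + offs_n[None, :] * stride_cn,
             acc, mask=(offs_m[:, None] < M) & (offs_n[None, :] < N))

def matmul_silu(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    M, K = a.shape
    K2, N = b.shape
    assert K == K2
    c = torch.empty((M, N), device=a.device, dtype=torch.float32)
    grid = (triton.cdiv(M, 64), triton.cdiv(N, 64))
    matmul_silu_kernel[grid](a, b, c, M, N, K,
                             a.stride(0), a.stride(1), b.stride(0), b.stride(1),
                             c.stride(0), c.stride(1),
                             BLOCK_M=64, BLOCK_N=64, BLOCK_K=32)
    return c

# Usage (CUDA required):
# y = matmul_silu(torch.randn(512, 256, device="cuda"), torch.randn(256, 128, device="cuda"))
```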

Maintenance & Community

No specific details regarding maintainers, community channels (e.g., Discord, Slack), or a public roadmap are present in the provided text.

Licensing & Compatibility

The license type is not specified in the provided README snippet, making it difficult to assess suitability for commercial use or closed-source integration.

Limitations & Caveats

The provided README snippet lacks explicit details on project limitations, alpha status, or known bugs. The absence of clear installation instructions, dependency lists, and licensing information is an immediate blocker for rapid adoption and for assessing commercial viability.

Health Check

  • Last Commit: 1 week ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 2
  • Star History: 19 stars in the last 30 days

Explore Similar Projects

Starred by Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI).

dots.llm1 by rednote-hilab
  • 0.2% · 468 stars
  • MoE model for research
  • Created 6 months ago; updated 3 months ago
Starred by Wing Lian (Founder of Axolotl AI) and Stas Bekman (Author of "Machine Learning Engineering Open Book"; Research Engineer at Snowflake).

fms-fsdp by foundation-model-stack
  • 0% · 271 stars
  • Efficiently train foundation models with PyTorch
  • Created 1 year ago; updated 6 days ago
Starred by Shizhe Diao (Author of LMFlow; Research Scientist at NVIDIA), Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI), and 8 more.

EAGLE by SafeAILab
  • 0.9% · 2k stars
  • Speculative decoding research paper for faster LLM inference
  • Created 2 years ago; updated 1 week ago