PyTorch module for memory-efficient Transformers, based on the Titans paper
This repository provides an unofficial PyTorch implementation of Titans, a state-of-the-art memory mechanism for transformers designed to enhance their ability to learn and adapt at test time. It targets researchers and engineers working on advanced transformer architectures who seek improved performance and adaptability in sequence-modeling tasks.
How It Works
The core of the implementation is the `NeuralMemory` module, which acts as an external memory for transformers. It uses a multi-layer perceptron (MLP) as its neural memory component, allowing transformers to store and retrieve information efficiently. The `MemoryAsContextTransformer` class integrates this memory directly into the transformer architecture, enabling the model to condition its output on learned memory states and thereby improving context retention and learning during inference.
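As a rough sketch of how this looks in code (adapted from the repository's README; exact argument names such as `dim` and `chunk_size` may differ across versions), the memory module is applied to a sequence and returns a retrieved sequence of the same shape along with an updated memory state:

```python
import torch
from titans_pytorch import NeuralMemory

# instantiate the MLP-backed neural memory
# (dim / chunk_size follow the repo's README example and
#  may vary across library versions)
mem = NeuralMemory(
    dim = 384,
    chunk_size = 64
).cuda()

seq = torch.randn(2, 1024, 384).cuda()

# store the sequence into memory and retrieve from it;
# the retrieved tensor has the same shape as the input
retrieved, mem_state = mem(seq)

assert seq.shape == retrieved.shape
```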
Quick Start & Requirements
```bash
pip install titans-pytorch
```
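A minimal language-model sketch follows, again adapted from the repository's README; the hyperparameter names (`segment_len`, `num_persist_mem_tokens`, `num_longterm_mem_tokens`) are taken from one version of the library and may change:

```python
import torch
from titans_pytorch import MemoryAsContextTransformer

# a small memory-as-context (MAC) transformer; argument names
# follow the repo README and may differ across versions
transformer = MemoryAsContextTransformer(
    num_tokens = 256,                # vocabulary size
    dim = 256,
    depth = 2,
    segment_len = 128,               # local attention window
    num_persist_mem_tokens = 4,
    num_longterm_mem_tokens = 16,
)

token_ids = torch.randint(0, 256, (1, 1023))

# autoregressive training loss over the token sequence
loss = transformer(token_ids, return_loss = True)
loss.backward()
```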
Requires an NVIDIA GPU with a working CUDA setup (the examples use `.cuda()` calls).
Highlighted Details
Provides the `MemoryAsContextTransformer` class for direct integration into transformer models.
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
This is an unofficial implementation: it may not perfectly mirror the original Titans paper and does not receive official support. The use of `.cuda()` throughout the examples implies a dependency on NVIDIA GPUs and CUDA, and no CPU fallback is evident.
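If you want to experiment without a GPU, one hedged workaround (an untested assumption, since the documented examples are CUDA-only) is to select the device explicitly rather than calling `.cuda()`:

```python
import torch
from titans_pytorch import NeuralMemory

# pick CUDA when available, otherwise fall back to CPU
# (untested assumption: parts of the library may still expect CUDA)
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

mem = NeuralMemory(dim = 384, chunk_size = 64).to(device)
seq = torch.randn(2, 1024, 384, device = device)

retrieved, mem_state = mem(seq)
```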