xlstm-resources by AI-Guru

xLSTM resource list

created 1 year ago
318 stars

Top 86.3% on sourcepulse

Project Summary

This repository is a curated collection of resources for xLSTM, an extended Long Short-Term Memory architecture proposed by Sepp Hochreiter and his team at NXAI. It targets researchers and developers exploring alternatives to Transformer models for sequence processing, offering the foundational paper, related research, video explanations, and multiple community-driven implementations.

How It Works

xLSTM extends the traditional LSTM with exponential gating and revised memory structures: the sLSTM cell, with scalar memory and a new form of memory mixing, and the fully parallelizable mLSTM cell, with a matrix memory and covariance update rule. The repository links to the core paper detailing these innovations, which aim to overcome limitations of existing recurrent and attention-based models. The appeal lies in the potential for competitive or superior performance with lower computational complexity than Transformers on certain sequence modeling tasks.
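The exponential gating at the heart of the sLSTM cell can be sketched in a few lines. This is a simplified scalar illustration of the recurrence described in the paper (the function name and argument layout here are illustrative, not from any linked implementation); real implementations are vectorized and fused:

```python
import math

def slstm_cell_step(c, n, m, z, i_tilde, f_tilde, o):
    """One stabilized sLSTM-style update for a single scalar memory.

    Simplified sketch of the exponential gating in the xLSTM paper
    (arXiv:2405.04517): c is the cell state, n a normalizer state,
    m a stabilizer state, z the cell input, i_tilde/f_tilde the
    input/forget gate pre-activations, and o the output gate.
    """
    # The stabilizer state keeps the exp() arguments bounded,
    # so huge pre-activations do not overflow.
    m_new = max(f_tilde + m, i_tilde)
    i_gate = math.exp(i_tilde - m_new)      # stabilized input gate
    f_gate = math.exp(f_tilde + m - m_new)  # stabilized forget gate
    c_new = f_gate * c + i_gate * z         # cell state update
    n_new = f_gate * n + i_gate             # normalizer update
    h = o * (c_new / n_new)                 # normalized hidden output
    return c_new, n_new, m_new, h
```

Without the stabilizer `m`, `exp()` of a large gate pre-activation would overflow; with it, the ratio `c/n` stays well defined even for extreme inputs.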

Quick Start & Requirements

The repository itself does not provide a direct installation or run command. Instead, it links to external GitHub repositories for implementations, such as AI-Guru/helibrunna (Hugging Face compatible) and the official NX-AI/xlstm. Users will need to refer to the respective repositories for installation instructions and specific dependencies, which typically include Python and deep learning frameworks like PyTorch or JAX.
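Getting started therefore means installing one of the linked implementations. A typical path, assuming the official NX-AI/xlstm repository supports an editable pip install (check its README for the authoritative steps and dependencies):

```shell
# Fetch the official implementation and install it into the
# current Python environment (PyTorch is a typical prerequisite).
git clone https://github.com/NX-AI/xlstm.git
cd xlstm
pip install -e .
```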

Highlighted Details

  • Links to the primary xLSTM paper: "xLSTM: Extended Long Short-Term Memory" (arXiv:2405.04517).
  • Curated list of diverse research papers applying xLSTM to vision, audio, robotics, and time series forecasting.
  • Compilation of video explanations and discussions from prominent AI channels and podcasts.
  • References to multiple community-developed implementations across different frameworks (PyTorch, JAX, MLX).

Maintenance & Community

The repository is maintained by AI-Guru and references work by Sepp Hochreiter and NXAI. It highlights community contributions through various GitHub repositories implementing xLSTM. Links to relevant LinkedIn profiles and discussions are provided for community engagement.

Licensing & Compatibility

The licensing of the xLSTM architecture itself is not specified within this resource repository. Users must consult the licenses of the individual implementation repositories (e.g., AI-Guru/helibrunna, NX-AI/xlstm) for details on usage rights, particularly for commercial applications.

Limitations & Caveats

This repository is a collection of links and does not contain runnable code itself. Users must navigate to external repositories for implementations, which may have varying levels of maturity, documentation, and support. The xLSTM architecture is relatively new, and its long-term viability and widespread adoption are still under evaluation.

Health Check

  • Last commit: 8 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 9 stars in the last 90 days

Starred by Stas Bekman (Author of Machine Learning Engineering Open Book; Research Engineer at Snowflake), Abhishek Thakur (World's First 4x Kaggle GrandMaster), and 5 more.

Explore Similar Projects

xlnet by zihangdai
Language model research paper using generalized autoregressive pretraining
6k stars · created 6 years ago · updated 2 years ago