mamba by state-spaces

Mamba SSM architecture for sequence modeling

created 1 year ago
15,522 stars

Top 3.2% on sourcepulse

View on GitHub

Project Summary

Mamba is a novel state space model (SSM) architecture designed for efficient sequence modeling, particularly on information-dense data where Transformers become computationally prohibitive. It targets researchers and engineers building large language models or other sequence-aware AI systems, offering linear-time scaling in sequence length as an alternative to the quadratic cost of Transformer attention.

How It Works

Mamba leverages a selective state space model (SSM): the parameters that govern how state is propagated and read out are themselves functions of the input, so the model can decide, token by token, what to remember and what to forget. The recurrence is computed with a hardware-aware implementation inspired by FlashAttention, which keeps the sequential SSM update efficient on modern GPUs. This selective mechanism is the core innovation, letting Mamba focus on relevant information and discard irrelevant context in long, complex sequences.
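
To make the selection mechanism concrete, here is a simplified, sequential reference of a selective SSM update. It is an illustrative sketch only (the library itself uses a fused, hardware-aware CUDA scan), and proj_dt, proj_B, and proj_C are hypothetical input-dependent projections, not names from the package's API.

```python
import torch

def selective_ssm_reference(x, A, D, proj_dt, proj_B, proj_C):
    """Illustrative sequential selective-SSM recurrence (not the fused kernel).

    x: (batch, length, d) input; A: (d, n) learned state matrix (diagonal, negative real);
    D: (d,) skip connection; proj_dt maps d -> d, proj_B/proj_C map d -> n (input-dependent).
    """
    batch, length, d = x.shape
    n = A.shape[-1]
    h = torch.zeros(batch, d, n, device=x.device, dtype=x.dtype)      # hidden SSM state
    ys = []
    for t in range(length):
        xt = x[:, t]                                                  # (batch, d)
        dt = torch.nn.functional.softplus(proj_dt(xt))                # input-dependent step size
        B, C = proj_B(xt), proj_C(xt)                                 # input-dependent B_t, C_t
        A_bar = torch.exp(dt.unsqueeze(-1) * A)                       # discretized state transition
        h = A_bar * h + (dt * xt).unsqueeze(-1) * B.unsqueeze(1)      # selective state update
        ys.append((h * C.unsqueeze(1)).sum(-1) + D * xt)              # readout plus skip connection
    return torch.stack(ys, dim=1)                                     # (batch, length, d)
```

Because dt, B, and C all depend on the current input, the model can gate what enters and persists in the state; this input dependence is what "selective" refers to.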

Quick Start & Requirements

  • Install via pip: pip install mamba-ssm, or pip install mamba-ssm[causal-conv1d] to also pull in the optional causal-conv1d kernels (see the usage sketch after this list).
  • Requirements: Linux, NVIDIA GPU, PyTorch 1.12+, CUDA 11.6+.
  • ROCm support for AMD GPUs is available with specific patching for ROCm 6.0.
  • Official documentation and examples are available within the repository.
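
A minimal usage sketch of the standalone block interface, closely following the example in the repository README; the argument values are illustrative, and Mamba-2 is exposed analogously as Mamba2.

```python
import torch
from mamba_ssm import Mamba

batch, length, dim = 2, 64, 16
x = torch.randn(batch, length, dim).to("cuda")

block = Mamba(
    d_model=dim,  # model (channel) dimension
    d_state=16,   # SSM state expansion factor
    d_conv=4,     # local convolution width
    expand=2,     # block expansion factor
).to("cuda")

y = block(x)               # maps (batch, length, dim) -> (batch, length, dim)
assert y.shape == x.shape
```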

Highlighted Details

  • Implements both Mamba and Mamba-2 architectures.
  • Provides pretrained language models on Hugging Face at sizes up to 2.8B parameters (see the loading sketch after this list).
  • Includes scripts for zero-shot evaluation using lm-evaluation-harness.
  • Offers generation benchmarking for latency and throughput.
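
For the pretrained checkpoints, a hedged loading-and-generation sketch is shown below; the import path, from_pretrained signature, and generate keywords follow the repository's generation utilities but should be verified against the installed release. The checkpoints use the GPT-NeoX tokenizer.

```python
import torch
from transformers import AutoTokenizer
from mamba_ssm.models.mixer_seq_simple import MambaLMHeadModel

# Tokenizer used by the released checkpoints.
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
model = MambaLMHeadModel.from_pretrained("state-spaces/mamba-2.8b", device="cuda", dtype=torch.float16)

input_ids = tokenizer("Selective state space models", return_tensors="pt").input_ids.to("cuda")
out = model.generate(
    input_ids=input_ids,
    max_length=64,
    temperature=0.7,
    top_p=0.9,
    return_dict_in_generate=True,
)
print(tokenizer.decode(out.sequences[0]))
```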

Maintenance & Community

  • Developed by Albert Gu and Tri Dao, authors of the Mamba papers.
  • Pretrained models are hosted on Hugging Face.
  • Citation details for the Mamba and Mamba-2 papers are provided.

Licensing & Compatibility

  • The repository does not explicitly state a license in the README. Users should verify licensing terms.

Limitations & Caveats

  • Models were trained with PyTorch AMP, which stores parameters in fp32 and autocasts during compute; frameworks that keep parameters in half precision may need precision adjustments for stability (see the sketch after this list).
  • Some parameters use custom initializations; porting to another framework may require adjustments so these are not unintentionally reinitialized or zeroed out.
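
As a starting point for the precision caveat, here is a minimal AMP-style sketch, assuming the Mamba block interface shown earlier: parameters stay in fp32 (matching how the checkpoints were trained) while activations are autocast to bf16. This mirrors the general AMP pattern rather than a recipe prescribed by the project.

```python
import torch
from mamba_ssm import Mamba

# Parameters remain in float32, as with AMP training; only activations are autocast.
block = Mamba(d_model=16, d_state=16, d_conv=4, expand=2).to("cuda")

x = torch.randn(2, 64, 16, device="cuda")
with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
    y = block(x)  # forward pass runs in bf16 while weights stay fp32
```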

Health Check

  • Last commit: 2 weeks ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 4
  • Issues (30d): 9

Star History

  • 861 stars in the last 90 days

Explore Similar Projects

Starred by Stas Bekman (Author of Machine Learning Engineering Open Book; Research Engineer at Snowflake).

HALOs by ContextualAI

  • Top 0.2% · 873 stars
  • Library for aligning LLMs using human-aware loss functions
  • Created 1 year ago; updated 2 weeks ago
Starred by Omar Sanseviero (DevRel at Google DeepMind), Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), and 3 more.

Medusa by FasterDecoding

  • Top 0.2% · 3k stars
  • Framework for accelerating LLM generation using multiple decoding heads
  • Created 1 year ago; updated 1 year ago