mamba by state-spaces

Mamba SSM architecture for sequence modeling

Created 1 year ago
15,881 stars

Top 3.0% on SourcePulse

View on GitHub
Project Summary

Mamba is a novel state space model (SSM) architecture designed for efficient sequence modeling, particularly on information-dense data where traditional Transformers can be computationally prohibitive. It targets researchers and engineers building large language models or other sequence-aware AI systems, offering linear-time scaling in sequence length as an alternative to the quadratic cost of Transformer attention.

How It Works

Mamba is built around a selective state space model (SSM): the SSM parameters are functions of the input, so the model can dynamically decide what to retain and what to ignore at each step. Because this input dependence breaks the fixed-convolution shortcut that earlier SSMs rely on, Mamba pairs it with a hardware-aware scan, inspired by FlashAttention, that fuses the recurrence computation into efficient GPU kernels. The result is a model that focuses on relevant information and filters out irrelevant context while scaling linearly with sequence length.
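
To make the selection mechanism concrete, the sketch below is a deliberately simplified, sequential PyTorch reference of a selective scan: the step size delta and the projections B and C vary per token, so the state update itself depends on the input. This is an illustrative reconstruction from the paper's description (the skip term and the fused, hardware-aware CUDA kernel are omitted), not the library's optimized implementation.

```python
# Simplified, sequential reference of a selective SSM scan (illustration only).
import torch

def selective_scan_ref(x, delta, A, B, C):
    """
    x:     (batch, length, d)  input sequence
    delta: (batch, length, d)  input-dependent step sizes
    A:     (d, n)              learned state matrix
    B, C:  (batch, length, n)  input-dependent projections
    returns y: (batch, length, d)
    """
    b, l, d = x.shape
    n = A.shape[1]
    h = torch.zeros(b, d, n, device=x.device, dtype=x.dtype)
    ys = []
    for t in range(l):
        dA = torch.exp(delta[:, t].unsqueeze(-1) * A)                # discretized A, (b, d, n)
        dBx = delta[:, t].unsqueeze(-1) * B[:, t].unsqueeze(1) * x[:, t].unsqueeze(-1)
        h = dA * h + dBx                                             # selective state update
        ys.append(torch.einsum("bdn,bn->bd", h, C[:, t]))            # readout with input-dependent C
    return torch.stack(ys, dim=1)
```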

Quick Start & Requirements

  • Install via pip: pip install mamba-ssm or pip install mamba-ssm[causal-conv1d].
  • Requirements: Linux, NVIDIA GPU, PyTorch 1.12+, CUDA 11.6+.
  • ROCm support for AMD GPUs is available with specific patching for ROCm 6.0.
  • Official documentation and examples are available within the repository; a minimal usage sketch follows below.
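
A minimal usage sketch for the standalone Mamba block, following the interface shown in the repository README (requires a CUDA GPU with mamba-ssm installed; argument names should be verified against the current README):

```python
import torch
from mamba_ssm import Mamba

batch, length, dim = 2, 64, 16
x = torch.randn(batch, length, dim).to("cuda")

model = Mamba(
    d_model=dim,  # model dimension
    d_state=16,   # SSM state expansion factor
    d_conv=4,     # local convolution width
    expand=2,     # block expansion factor
).to("cuda")

y = model(x)  # (batch, length, dim) in, same shape out
assert y.shape == x.shape
```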

Highlighted Details

  • Implements both Mamba and Mamba-2 architectures.
  • Provides pretrained models on Hugging Face up to 2.8B parameters (see the loading sketch after this list).
  • Includes scripts for zero-shot evaluation using lm-evaluation-harness.
  • Offers generation benchmarking for latency and throughput.
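
As a hedged sketch of loading one of the Hugging Face checkpoints for generation, mirroring the repository's generation benchmark script (the class path, tokenizer, and generate arguments below are taken from that script and should be checked against the current code):

```python
import torch
from transformers import AutoTokenizer
from mamba_ssm.models.mixer_seq_simple import MambaLMHeadModel

# The released checkpoints reuse the GPT-NeoX tokenizer.
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
model = MambaLMHeadModel.from_pretrained("state-spaces/mamba-2.8b", device="cuda", dtype=torch.float16)

input_ids = tokenizer("Mamba is a state space model that", return_tensors="pt").input_ids.to("cuda")
out = model.generate(
    input_ids=input_ids,
    max_length=64,
    temperature=0.7,
    top_p=0.9,
    return_dict_in_generate=True,
)
print(tokenizer.batch_decode(out.sequences.tolist())[0])
```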

Maintenance & Community

  • Developed by Albert Gu and Tri Dao, authors of the Mamba papers.
  • Pretrained models are hosted on Hugging Face.
  • Citation details for the Mamba and Mamba-2 papers are provided.

Licensing & Compatibility

  • The repository does not explicitly state a license in the README. Users should verify licensing terms.

Limitations & Caveats

  • Models are trained with PyTorch AMP; users may need to ensure compatible precision settings for stability (see the sketch after this list).
  • Initialization details might require framework-specific adjustments to avoid unintended parameter resets.
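
As an illustrative sketch of keeping parameters in float32 while running compute in lower precision, in line with the AMP caveat above (this shows standard PyTorch autocast usage, not the authors' exact training recipe):

```python
import torch
from mamba_ssm import Mamba

# Parameters stay in float32; autocast runs eligible ops (e.g. the linear
# projections) in bfloat16, mirroring a typical AMP setup.
model = Mamba(d_model=16, d_state=16, d_conv=4, expand=2).to("cuda")
x = torch.randn(2, 64, 16, device="cuda")

with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
    y = model(x)

print(next(model.parameters()).dtype)  # torch.float32
print(y.dtype)                         # typically torch.bfloat16 under autocast
```
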
Health Check

  • Last Commit: 2 weeks ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 9
  • Issues (30d): 13

Star History

  • 274 stars in the last 30 days

Explore Similar Projects

Starred by Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI).

dots.llm1 by rednote-hilab

0.2%
462
MoE model for research
Created 4 months ago
Updated 4 weeks ago
Starred by Ying Sheng (Coauthor of SGLang) and Stas Bekman (Author of "Machine Learning Engineering Open Book"; Research Engineer at Snowflake).

llm-analysis by cli99

0.4%
455
CLI tool for LLM latency/memory analysis during training/inference
Created 2 years ago
Updated 5 months ago
Starred by Wing Lian (Founder of Axolotl AI) and Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems").

airllm by lyogavin

0.1%
6k
Inference optimization for LLMs on low-resource hardware
Created 2 years ago
Updated 2 weeks ago
Starred by Lianmin Zheng (Coauthor of SGLang, vLLM), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 1 more.

MiniCPM by OpenBMB

0.4%
8k
Ultra-efficient LLMs for end devices, achieving 5x+ speedup
Created 1 year ago
Updated 1 week ago
Starred by Tobi Lutke (Cofounder of Shopify), Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), and 36 more.

unsloth by unslothai

0.6%
46k
Finetuning tool for LLMs, targeting speed and memory efficiency
Created 1 year ago
Updated 12 hours ago