gemma-2B-10M by mustafaaljadery

Gemma 2B with 10M context length using Infini-attention

created 1 year ago
946 stars

Top 39.6% on sourcepulse

Project Summary

This repository provides a Gemma 2B language model fine-tuned to achieve a 10 million token context length using an Infini-attention mechanism. It targets researchers and developers who need to process extremely long sequences on limited hardware, offering significant memory savings over standard attention mechanisms.

How It Works

The core innovation is Infini-attention, which tackles the memory bottleneck of standard multi-head attention, where the KV cache grows with sequence length and the attention matrix grows quadratically. By splitting attention into local blocks and carrying a recurrent state across those blocks, it reaches linear memory complexity (O(N)) while still providing global attention over a 10M token context. This approach draws inspiration from Transformer-XL.
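The sketch below illustrates the general recurrence in PyTorch: causal attention inside each local block, plus a compressive memory carried across blocks, in the style described by the Infini-attention paper. It is a minimal single-head illustration rather than this repository's implementation; the segment_len and gate values, the ELU+1 feature map, and the tensor shapes are assumptions made for clarity.

  # Minimal single-head sketch of block-local attention plus a compressive
  # memory carried across blocks (the Infini-attention recurrence).
  # segment_len, gate, and the ELU+1 feature map are illustrative choices.
  import torch
  import torch.nn.functional as F

  def infini_attention(q, k, v, segment_len=2048, gate=0.5):
      """q, k, v: (seq_len, d). Memory use is bounded by the block size
      instead of growing with the full sequence length."""
      seq_len, d = q.shape
      M = torch.zeros(d, d)   # compressive memory: running sum of sigma(k)^T v
      z = torch.zeros(d)      # normalization term: running sum of sigma(k)
      outputs = []
      for start in range(0, seq_len, segment_len):
          qs, ks, vs = (t[start:start + segment_len] for t in (q, k, v))
          sq, sk = F.elu(qs) + 1, F.elu(ks) + 1   # positive feature map

          # 1) retrieve context from memory built over all earlier blocks
          mem_out = (sq @ M) / (sq @ z + 1e-6).unsqueeze(-1)

          # 2) causal dot-product attention within the current block only
          scores = (qs @ ks.T) / d ** 0.5
          mask = torch.triu(torch.ones_like(scores, dtype=torch.bool), diagonal=1)
          local_out = F.softmax(scores.masked_fill(mask, float("-inf")), dim=-1) @ vs

          # 3) blend global (memory) and local results, then update memory
          outputs.append(gate * mem_out + (1 - gate) * local_out)
          M = M + sk.T @ vs
          z = z + sk.sum(dim=0)
      return torch.cat(outputs, dim=0)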

Quick Start & Requirements

  • Install requirements: pip install -r requirements.txt
  • Run inference: python main.py
  • Requires Python and PyTorch.
  • Model weights are available on Hugging Face (a loading and inference sketch follows this list).
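
The following sketch shows what loading and running the checkpoint could look like, assuming the published weights expose the standard Hugging Face transformers interface; the repository's main.py may instead wrap its own model class, and the repo id, prompt, and generation settings here are illustrative.

  # Sketch of loading the published weights via Hugging Face transformers.
  # Run `pip install -r requirements.txt` first; `python main.py` is the
  # repo's own entry point. Repo id and generation arguments are assumptions.
  import torch
  from transformers import AutoModelForCausalLM, AutoTokenizer

  model_id = "mustafaaljadery/gemma-2B-10M"  # assumed Hugging Face repo id
  tokenizer = AutoTokenizer.from_pretrained(model_id)
  model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

  prompt = "Summarize the following document:"
  inputs = tokenizer(prompt, return_tensors="pt")
  output = model.generate(**inputs, max_new_tokens=128)
  print(tokenizer.decode(output[0], skip_special_tokens=True))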

Highlighted Details

  • Achieves 10M sequence length on Gemma 2B.
  • Operates with less than 32GB of memory.
  • Features native inference optimized for CUDA.

Maintenance & Community

  • Developed by Mustafa Aljadery, Siddharth Sharma, and Aksh Garg.
  • Further training is planned.
  • Technical overview available on Medium.

Licensing & Compatibility

  • License not specified in the README.
  • Compatibility for commercial or closed-source use is undetermined.

Limitations & Caveats

This is a very early checkpoint trained for only 200 steps, so model performance and robustness may be limited. Long-term maintenance and community support are not yet established.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 3 stars in the last 90 days

Explore Similar Projects

Starred by Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla, OpenAI; author of CS 231n), George Hotz (author of tinygrad; founder of the tiny corp, comma.ai), and 10 more.

TinyLlama by jzhang38

  • Top 0.3% on sourcepulse, 9k stars
  • Tiny pretraining project for a 1.1B Llama model
  • Created 1 year ago, updated 1 year ago