matmulfreellm  by ridgerchu

MatMul-free language models

created 1 year ago
3,020 stars

Top 16.2% on sourcepulse

Project Summary

This repository implements MatMul-Free LM, a novel language model architecture that replaces traditional matrix multiplication with more efficient operations. Targeting researchers and developers seeking to optimize LLM inference and training, it offers compatibility with the Hugging Face Transformers library and provides pre-trained models up to 2.7B parameters.

How It Works

The core change is replacing dense matrix multiplications with a custom architecture, HGRNBit, built on fused operations and ternary weights (values restricted to -1, 0, and +1). With ternary weights, a matrix product reduces to additions and subtractions, which lowers computational cost and memory bandwidth requirements and makes scaling and inference cheaper. The architecture uses specialized projection layers (FusedBitLinear) and SiLU activations within its attention and MLP blocks.
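To make the idea concrete, here is a minimal sketch of a ternary linear layer in plain Python. It is not the repository's FusedBitLinear, which relies on fused Triton kernels; the function names `absmean_ternarize` and `ternary_linear` are hypothetical, and the absmean scaling rule is an assumption borrowed from BitNet-style quantization. The point is only to show that once weights are in {-1, 0, +1}, each output element needs additions and subtractions plus a single scale factor.

```python
def absmean_ternarize(weights):
    """Quantize a weight matrix (list of rows) to {-1, 0, +1}.

    Assumption: one scale per matrix, equal to the mean absolute weight.
    """
    flat = [abs(w) for row in weights for w in row]
    scale = sum(flat) / len(flat) or 1.0  # fall back to 1.0 if all weights are zero
    ternary = [[max(-1, min(1, round(w / scale))) for w in row] for row in weights]
    return ternary, scale


def ternary_linear(x, ternary, scale):
    """Compute scale * (W_ternary @ x) without multiplying by any weight."""
    out = []
    for row in ternary:
        acc = 0.0
        for t, xi in zip(row, x):
            if t == 1:
                acc += xi      # weight +1: add the input
            elif t == -1:
                acc -= xi      # weight -1: subtract the input
            # weight 0: the input is skipped entirely
        out.append(acc * scale)  # one rescale per output element
    return out


weights = [[0.9, -0.05, -1.2], [0.1, 0.8, 0.0]]
ternary, scale = absmean_ternarize(weights)
y = ternary_linear([2.0, 3.0, 5.0], ternary, scale)
```

A real implementation would also quantize activations and pack the ternary weights compactly; both are left out here to keep the sketch short.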

Quick Start & Requirements

  • Install via pip: pip install -U git+https://github.com/ridgerchu/matmulfreellm
  • Requirements: PyTorch >= 2.0, Triton >= 2.2, einops.
  • Pre-trained models are available on Hugging Face: 370M, 1.3B, 2.7B.
  • Usage examples for model initialization and text generation are provided in the README.
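As a rough illustration of the Hugging Face workflow described above, the following sketch loads a checkpoint and samples a continuation. The helper name `generate_text` is hypothetical, and the model ID `ridger/MMfreeLM-370M`, the sampling values, and the assumption that importing `mmfreelm` registers the model classes with Transformers should all be checked against the project README. It needs a CUDA GPU and the packages listed under Requirements.

```python
import os


def generate_text(prompt, model_name="ridger/MMfreeLM-370M", max_length=32):
    """Load a MatMul-free checkpoint and sample a continuation (requires CUDA)."""
    # Heavy imports stay inside the function so this module can be imported
    # on machines that do not have the GPU stack installed.
    os.environ["TOKENIZERS_PARALLELISM"] = "false"
    import mmfreelm  # noqa: F401  (assumed to register the HGRNBit model classes)
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name).cuda().half()
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids.cuda()
    outputs = model.generate(
        input_ids,
        max_length=max_length,
        do_sample=True,   # sample instead of greedy decoding
        top_p=0.4,
        temperature=0.6,
    )
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
```

Calling `generate_text("In a shocking finding, ")` on a GPU machine would download the checkpoint on first use and return the prompt followed by the sampled continuation.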

Highlighted Details

  • Implements MatMul-Free LM architecture compatible with Hugging Face Transformers.
  • Offers pre-trained models ranging from 370M to 2.7B parameters.
  • Scaling law analysis shows a steeper loss descent than Transformer++, suggesting the architecture makes more efficient use of additional compute.
  • Uses custom fused operations and ternary weights for optimization.

Maintenance & Community

  • The project is associated with an arXiv preprint: 2406.02528.
  • Primary contributor appears to be ridgerchu.

Licensing & Compatibility

  • The repository does not explicitly state a license in the README.

Limitations & Caveats

The project is presented as an implementation of research findings, and its stability, long-term maintenance, and production readiness are not yet established. The absence of a specified license may pose compatibility issues for commercial or closed-source applications.

Health Check

  • Last commit: 1 week ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 39 stars in the last 90 days
