safari by HazyResearch

Research paper implementations for sequence modeling with convolutions

created 2 years ago
895 stars

Top 41.3% on sourcepulse

Project Summary

This repository provides implementations and experimental results for advanced convolutional sequence modeling techniques, including Hyena, H3, and Long Convs. It targets researchers and practitioners in deep learning, particularly those working with long sequences in natural language processing and computer vision, offering efficient and scalable alternatives to traditional attention mechanisms.

How It Works

The project implements convolutional architectures designed to model long-range dependencies in sequential data efficiently. The methods rely on long convolutions whose kernels are parameterized either implicitly (generated by a small network, as in Hyena) or explicitly (learned directly, as in Long Convs). They aim to match or exceed the quality of attention while scaling subquadratically in sequence length, where attention scales quadratically. The savings in compute and memory matter most for very long sequences.
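The complexity advantage described above typically comes from evaluating a sequence-length convolution with FFTs instead of a direct sum. The following is a minimal NumPy sketch of that idea, not the repository's implementation: the function names and the decaying_kernel filter are hypothetical and exist only for illustration, and the repository itself targets PyTorch.

```python
import numpy as np


def causal_conv_direct(u, k):
    """Reference O(L^2) causal convolution: y[t] = sum over s <= t of k[s] * u[t - s]."""
    L = len(u)
    y = np.zeros(L)
    for t in range(L):
        for s in range(t + 1):
            y[t] += k[s] * u[t - s]
    return y


def causal_conv_fft(u, k):
    """Same result in O(L log L) using zero-padded FFTs."""
    L = len(u)
    n = 2 * L  # pad to 2L so the circular convolution cannot wrap around
    U = np.fft.rfft(u, n=n)
    K = np.fft.rfft(k, n=n)
    y = np.fft.irfft(U * K, n=n)
    return y[:L]  # keep the first L outputs, which are the causal ones


def decaying_kernel(L, rate=0.05):
    """Hypothetical stand-in for a learned long filter (assumption, not from the repo)."""
    t = np.arange(L)
    return np.exp(-rate * t) * np.cos(0.3 * t)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    u = rng.standard_normal(256)   # input sequence
    k = decaying_kernel(256)       # filter as long as the sequence
    # Both paths should agree up to floating-point error.
    print(np.allclose(causal_conv_direct(u, k), causal_conv_fft(u, k)))
```

The filter here is as long as the input, which is what distinguishes a long convolution from the short, fixed-width kernels of a standard CNN. Padding to twice the sequence length is the design choice that keeps the FFT product equal to a linear (causal) convolution rather than a circular one.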

Quick Start & Requirements

  • Install via pip install -e . (after cloning).
  • Requires Python 3.8+ and PyTorch 1.10+.
  • Run python standalone_cifar.py from the repository root for a basic CIFAR-10 example.
  • See experiments page for LRA, H3, and Hyena experiments.

Highlighted Details

  • Implements Hyena, H3 (Hungry Hungry Hippos), and Long Convs papers.
  • Focuses on hardware-efficient, long-range convolutional sequence modeling.
  • Offers implementations for models ranging from 150M to 2.7B parameters.
  • Includes code from Albert Gu's state spaces repo and FlashAttention training scripts.

Maintenance & Community

The project is associated with HazyResearch and includes contributions from authors of the cited ICML and ICLR papers. Links to external reimplementations and explainer posts are provided.

Licensing & Compatibility

The README does not explicitly state a license. Because the repository comes from academic research and bundles code from other repositories, verify the licensing terms before any commercial or closed-source use.

Limitations & Caveats

The README indicates that weights for larger models need to be downloaded separately. The project is presented as a collection of experimental implementations rather than a production-ready library, and specific compatibility or performance guarantees for all use cases are not detailed.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 19 stars in the last 90 days
