transfomers-silicon-research by aliemo

Hardware research and materials for Transformer model implementations

Created 2 years ago
281 stars

Top 92.7% on SourcePulse

Project Summary

This repository serves as a curated collection of research papers and materials focused on the hardware implementation of Transformer models, particularly BERT. It targets researchers and engineers interested in optimizing Transformer architectures for efficient execution on specialized hardware like FPGAs and ASICs. The primary benefit is a comprehensive overview of the evolving landscape of hardware-algorithm co-design for Transformer acceleration.

How It Works

The repository organizes research papers chronologically and by topic, highlighting advancements in areas such as model compression (quantization, pruning), novel accelerator architectures (FPGA, ReRAM, PIM), and algorithm-hardware co-optimization. It showcases how researchers are tackling the computational and memory demands of Transformers to enable their deployment on resource-constrained edge devices or to improve performance on larger systems.
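
The techniques named above are easiest to grasp with a small example. The following is a minimal, self-contained NumPy sketch of symmetric per-tensor int8 post-training quantization; it is illustrative only and is not taken from any paper in the list.

```python
import numpy as np

def quantize_int8(w: np.ndarray) -> tuple[np.ndarray, float]:
    """Symmetric per-tensor int8 quantization of a weight matrix."""
    # Map the largest weight magnitude to 127; guard against all-zero tensors.
    scale = max(float(np.max(np.abs(w))) / 127.0, 1e-12)
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximate float32 weight matrix."""
    return q.astype(np.float32) * scale

# Hypothetical example: one 768x768 projection matrix, as in BERT-base.
w = np.random.randn(768, 768).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
print("max abs reconstruction error:", np.max(np.abs(w - w_hat)))
```

Storing int8 weights plus a single float scale cuts weight memory roughly 4x versus float32, which is the kind of saving the surveyed accelerator papers exploit in hardware.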

Quick Start & Requirements

  • Contribute: Add new papers via pull requests to data/papers.yaml; a hypothetical entry format is sketched after this list.
  • Requirements: Access to the research papers, which may require academic subscriptions or pre-print server access. No software installation is required to browse the curated list.
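
Since the schema of data/papers.yaml is not documented in this summary, the snippet below is only a hypothetical sketch of adding an entry programmatically; the field names (title, url, year, tags) are assumptions, so mirror the existing entries in the file before opening a pull request.

```python
# Hypothetical sketch of appending an entry to data/papers.yaml.
# The field names below are assumptions, not the repository's
# documented schema -- check the existing entries instead.
import yaml  # pip install pyyaml

entry = {
    "title": "Example: An FPGA-Based Transformer Accelerator",  # placeholder
    "url": "https://example.org/paper",
    "year": 2024,
    "tags": ["fpga", "quantization"],
}

with open("data/papers.yaml") as f:
    papers = yaml.safe_load(f) or []

papers.append(entry)

with open("data/papers.yaml", "w") as f:
    yaml.safe_dump(papers, f, sort_keys=False)
```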

Highlighted Details

  • Extensive list of research papers from 2018 to 2025 covering various aspects of Transformer hardware acceleration.
  • Focus on techniques like quantization, pruning, sparsity, and specialized architectures (FPGA, PIM, ReRAM); a minimal pruning sketch follows this list.
  • Includes foundational papers like "Attention Is All You Need" and "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding."
  • Covers a wide range of applications and model variants, from BERT and its derivatives to Vision Transformers (ViT).
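
Magnitude pruning, one of the sparsity techniques highlighted above, can likewise be sketched in a few lines. This is a generic illustration of unstructured magnitude pruning, not a method from any specific listed paper.

```python
import numpy as np

def magnitude_prune(w: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero the smallest-magnitude weights to reach a target sparsity."""
    k = int(w.size * sparsity)  # number of weights to zero out
    if k == 0:
        return w.copy()
    # The k-th smallest absolute value becomes the pruning threshold.
    threshold = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    return np.where(np.abs(w) <= threshold, np.float32(0.0), w)

# Hypothetical example: a 768x3072 feed-forward matrix, as in BERT-base.
w = np.random.randn(768, 3072).astype(np.float32)
w_sparse = magnitude_prune(w, sparsity=0.9)
print("fraction of zero weights:", float(np.mean(w_sparse == 0)))
```

Hardware papers in the list typically go further, constraining where zeros may appear (structured or block sparsity) so that FPGA or ASIC datapaths can skip them efficiently.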

Maintenance & Community

  • Maintained by aliemo.
  • Contribution is encouraged via pull requests for adding new papers.

Licensing & Compatibility

  • The repository itself does not specify a license. The linked research papers are subject to their respective publication licenses and copyright.

Limitations & Caveats

This repository is a collection of research pointers and does not provide executable code or implementations. The "quick start" is for contributing to the list, not for running any accelerated models. Users must independently access and evaluate the referenced research papers.

Health Check

  • Last Commit: 6 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 3 stars in the last 30 days

Starred by Luca Soldaini (Research Scientist at Ai2), Edward Sun (Research Scientist at Meta Superintelligence Lab), and 4 more.

Explore Similar Projects

parallelformers by tunib-ai

  • 790 stars
  • Toolkit for easy model parallelization
  • Created 4 years ago, updated 2 years ago
  • Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems") and Jeremy Howard (Cofounder of fast.ai).

GPTFast by MDK8888

  • 687 stars
  • HF Transformers accelerator for faster inference
  • Created 1 year ago, updated 1 year ago
  • Starred by Nat Friedman (Former CEO of GitHub), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 15 more.

FasterTransformer by NVIDIA

  • 6k stars
  • Optimized transformer library for inference
  • Created 4 years ago, updated 1 year ago