Awesome-Token-Compress  by daixiangzi

Paper list for token compression methods in Vision Transformers (ViT) and Vision Language Models (VLM)

created 11 months ago
579 stars

Top 56.7% on sourcepulse

Project Summary

This repository is a curated list of recent research papers on token compression techniques for Vision Transformers (ViTs) and Vision-Language Models (VLMs). It is a resource for researchers and engineers who want to make these models faster and more efficient, particularly for tasks involving long sequences or high-resolution inputs.

How It Works

The project collects papers that propose methods for reducing the number of tokens processed by ViTs and VLMs. These methods typically prune, cluster, merge, or dynamically select tokens, aiming to maintain performance while decreasing computational cost and memory usage. The goal is to accelerate inference and training without substantial accuracy degradation.
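As a rough illustration of the pruning idea, the sketch below keeps only the highest-scoring tokens and estimates the resulting saving in self-attention cost, which grows with the square of the token count in standard attention. It is not taken from any paper in the list: the names `prune_tokens`, `attention_cost_ratio`, and `keep_ratio` are hypothetical, and the importance scores are assumed to be supplied by the caller (for example, derived from attention weights).

```python
from typing import List, Sequence

Token = List[float]  # one token embedding, kept as a plain list for illustration


def prune_tokens(
    tokens: Sequence[Token], scores: Sequence[float], keep_ratio: float
) -> List[Token]:
    """Keep the highest-scoring fraction of tokens, preserving original order.

    Hypothetical helper: `scores` is assumed to hold one importance value
    per token, computed elsewhere.
    """
    if len(tokens) != len(scores):
        raise ValueError("tokens and scores must have the same length")
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    n_keep = max(1, round(len(tokens) * keep_ratio))
    # Rank token indices by score, highest first, and keep the top n_keep.
    top = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)[:n_keep]
    # Re-sort the surviving indices so the kept tokens stay in sequence order.
    return [tokens[i] for i in sorted(top)]


def attention_cost_ratio(n_before: int, n_after: int) -> float:
    """Rough ratio of pairwise self-attention cost after vs. before pruning."""
    return (n_after / n_before) ** 2


tokens = [[1.0], [2.0], [3.0], [4.0]]
scores = [0.1, 0.9, 0.4, 0.8]
kept = prune_tokens(tokens, scores, keep_ratio=0.5)
ratio = attention_cost_ratio(len(tokens), len(kept))
```

Here half of the tokens are dropped, so the pairwise attention term shrinks to about a quarter of its original size. Merging and clustering methods differ in that they combine the discarded tokens into the kept ones instead of removing them outright.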

Highlighted Details

  • Extensive coverage of techniques for both image and video understanding tasks.
  • Includes papers from major conferences like CVPR, ICLR, ECCV, NeurIPS, and AAAI.
  • Features methods applicable to various model architectures and multimodal tasks.
  • Highlights projects with associated GitHub repositories for practical implementation.

Maintenance & Community

This is a list of papers rather than a software project; updates reflect recent publications in the field of efficient VLMs. No direct community channels or active development are mentioned for the list itself, but many linked papers have their own active communities and repositories.

Licensing & Compatibility

The repository itself is a list and does not contain code that would typically have licensing restrictions. However, users should refer to the individual licenses of the linked papers and their associated code repositories for usage terms.

Limitations & Caveats

This resource is a bibliography and does not provide implementations, benchmarks, or direct comparisons of the listed techniques. Users must consult the individual papers for details on performance, limitations, and implementation requirements.

Health Check

  • Last commit: 3 days ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 2
  • Issues (30d): 1

Star History

  • 143 stars in the last 90 days
