Awesome-VQVAE by rese1f

Resource list for Vector Quantized Variational Autoencoders (VQ-VAE)

Created 2 years ago
310 stars

Top 86.7% on SourcePulse

Project Summary

This repository is a curated collection of research papers, blog posts, and other resources on Vector Quantized Variational Autoencoders (VQ-VAEs) and their applications. It targets researchers and practitioners in machine learning, computer vision, and generative modeling who want to understand and implement VQ-VAE architectures. Its primary benefit is a centralized overview of the VQ-VAE landscape that makes it easier to explore state-of-the-art techniques.

How It Works

The repository organizes papers and resources by application domain, including image generation, video synthesis, 3D shape modeling, and speech/audio processing. It highlights key papers that introduce novel VQ-VAE variants, improvements, or significant applications, often linking to official publications, code repositories, or explanatory blog posts. This structured approach allows users to quickly identify relevant advancements and foundational work within specific subfields.

Quick Start & Requirements

This repository is a collection of links and does not require installation or execution. All listed papers and resources are external.

Highlighted Details

  • Broad coverage of VQ-VAE advancements, from foundational papers (e.g., Neural Discrete Representation Learning, which introduced VQ-VAE, and VQ-VAE-2) to recent applications in text-to-image, text-to-video, and multimodal generation.
  • Includes resources on models that build on discrete, quantized representations, such as VQGAN, BEiT, and SoundStream, showing their impact across image, vision pre-training, and audio domains.
  • Features links to explanatory blog posts that demystify complex concepts like DALL-E's VQ-VAE components.
  • Organizes content by application area, providing a clear path for users interested in specific VQ-VAE use cases.

Maintenance & Community

The repository is maintained by rese1f. The most recent blog post listed appears to date from around February 2021, while the listed papers extend to CVPR 2023 and arXiv 2024. No community links (e.g., Discord, Slack) are provided.

Licensing & Compatibility

The repository itself does not specify a license. The linked papers are subject to their respective publication and copyright terms. Whether a linked resource can be used commercially depends on the license of that individual project or paper.

Limitations & Caveats

This is a curated list of resources, not an executable library. It does not provide code implementations directly, so users must find the repositories associated with each paper for practical use. The listed blog posts may lag behind the latest research papers.

Health Check
Last Commit: 7 months ago
Responsiveness: Inactive
Pull Requests (30d): 0
Issues (30d): 0
Star History
8 stars in the last 30 days

Explore Similar Projects

Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Luca Antiga (CTO of Lightning AI), and 2 more.

mmagic by open-mmlab

0.1%
7k
AIGC toolbox for image/video editing and generation
Created 6 years ago
Updated 1 year ago
Starred by Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), Jiayi Pan (Author of SWE-Gym; MTS at xAI), and 15 more.

taming-transformers by CompVis

0.1%
6k
Image synthesis research paper using transformers
Created 4 years ago
Updated 1 year ago