awesome-audio-visual by krantiparida

Curated list of audio-visual papers and datasets

Created 7 years ago

775 stars

Top 44.3% on SourcePulse

Project Summary

This repository is a curated list of papers and datasets on audio-visual processing, modeled on the "awesome-computer-vision" format. It serves machine learning and computer vision researchers and practitioners whose tasks draw on both the audio and the visual streams of video. Its main value is a single, organized entry point to current work in this interdisciplinary field.

How It Works

The repository categorizes research papers and datasets across a wide spectrum of audio-visual tasks. These include localization, separation, representation learning, action recognition, deepfakes, navigation, speech processing, question answering, stylization, and generation. Each entry typically links to the paper, and often to associated code, project pages, or datasets, providing a comprehensive overview of the research landscape.
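Because the list is organized as categories whose entries carry links, it can be indexed with a few lines of code. The sketch below is illustrative only: it assumes the README uses standard Markdown headings for categories and `[label](url)` links for entries, which is common for awesome lists but not confirmed here. The function name `group_links_by_section` and the sample text are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: categories are Markdown headings ("## Name") and each entry
# carries links in the standard "[label](url)" form.
HEADING = re.compile(r"^#{2,6}\s+(.*\S)\s*$")
LINK = re.compile(r"\[([^\]]+)\]\((\S+?)\)")


def group_links_by_section(markdown_text):
    """Return {section title: [(label, url), ...]} in document order."""
    sections = defaultdict(list)
    current = None
    for line in markdown_text.splitlines():
        heading = HEADING.match(line)
        if heading:
            current = heading.group(1)
            continue
        if current is None:
            continue  # skip links that appear before the first category
        for label, url in LINK.findall(line):
            sections[current].append((label, url))
    return dict(sections)


# Made-up sample text; these are not real entries from the list.
SAMPLE = """\
## Localization
- Example Paper A. [paper](https://example.org/a.pdf) [code](https://example.org/a)
## Separation
- Example Paper B. [paper](https://example.org/b.pdf)
"""

if __name__ == "__main__":
    for section, links in group_links_by_section(SAMPLE).items():
        print(section, links)
```

An index like this makes it easy to count entries per category or to pull out only the entries that ship code, provided the real README matches the assumed layout.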

Quick Start & Requirements

This is a curated list, not a software package. No installation or execution is required. Users access the information via the README.

Highlighted Details

Comprehensive coverage of 20+ distinct audio-visual research areas.
Extensive listing of papers from top-tier conferences (CVPR, ECCV, NeurIPS, ICCV, ICLR, AAAI).
Includes links to numerous datasets specifically designed for audio-visual tasks.
Many entries provide direct links to code repositories for practical implementation.

Maintenance & Community

The repository is maintained by Kranti Kumar Parida, with an open invitation for pull requests and contributions to add or correct links.

Licensing & Compatibility

The content is released under Creative Commons CC0 (Public Domain Dedication), so it may be used for any purpose without restriction.

Limitations & Caveats

As a curated list, the repository stays current only as long as the maintainer and community keep adding the latest research. Links may become outdated over time.
