awesome-audiovisual-learning by GeWu-Lab

Curated list of audio-visual learning methods and datasets

created 3 years ago
264 stars

Top 97.5% on sourcepulse

Project Summary

This repository is a curated list of audio-visual learning methods and datasets, intended as a reference for researchers and practitioners in the field. It catalogs and organizes the rapidly evolving set of techniques that use both audio and visual information for tasks such as recognition, generation, and localization.

How It Works

The repository groups audio-visual learning research into distinct areas, including boosting recognition, cross-modal perception and generation, transfer learning, collaboration, localization, and question answering. Within each category, it lists relevant papers, often with author affiliations and publication venues, giving a structured overview of the state of the art. It also includes a table of popular audio-visual datasets that details their size, source, and primary tasks.

Quick Start & Requirements

This repository is a curated list and does not require installation or execution. It serves as a reference guide.

Highlighted Details

  • Extensive coverage across numerous audio-visual learning sub-fields, from speech recognition to embodied navigation.
  • Includes a detailed table of datasets with metadata like length, source, and applicable tasks.
  • Features a broad range of research papers, spanning from 2016 to the latest publications.
  • Organized by task, allowing users to quickly find relevant work.

Maintenance & Community

The list is actively maintained, and community members can nominate new works via Pull Requests. Links to the survey website and arXiv are provided for further reading.

Licensing & Compatibility

The repository itself is a list of links and information, not software, and is therefore not subject to software licensing restrictions.

Limitations & Caveats

As a curated list, its comprehensiveness depends on community contributions and the maintainers' efforts. It does not provide code or implementations, so users must consult the individual research papers for practical application.

Health Check

  • Last commit: 8 months ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

7 stars in the last 90 days
