microsoft: Standardized library for audio distance metrics
Top 96.4% on SourcePulse
Summary
FADtk is a standardized Python library for calculating Fréchet Audio Distance (FAD), a metric for evaluating audio generation models, particularly in music. Aimed at researchers and engineers, it offers efficient FAD computation, outlier detection, and support for pre-computed reference statistics, which simplifies the assessment of generative audio models.
How It Works
The toolkit computes audio embeddings using a diverse selection of pre-trained models (e.g., CLAP, Encodec, Wav2vec 2.0) and then calculates FAD scores between reference and generated audio datasets. This approach allows for quantitative comparison of audio quality and diversity, with support for both aggregate (FAD∞) and per-sample FAD scores to identify outliers.
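The scoring step described above can be sketched in a few lines of NumPy. This is an illustrative sketch, not FADtk's own code: the function names `gaussian_stats`, `frechet_distance`, and `fad_inf` are hypothetical, the embeddings are assumed to be already computed as `(n_samples, dim)` arrays with `dim >= 2`, and the FAD∞ part assumes the usual approach of fitting a line to scores measured at several sample sizes and reading off the intercept as the sample size grows without bound.

```python
import numpy as np


def gaussian_stats(emb):
    """Mean vector and covariance matrix of an (n_samples, dim) embedding array."""
    emb = np.asarray(emb, dtype=np.float64)
    return emb.mean(axis=0), np.cov(emb, rowvar=False)


def frechet_distance(mu_a, cov_a, mu_b, cov_b):
    """Frechet distance between two Gaussians N(mu_a, cov_a) and N(mu_b, cov_b)."""
    diff = mu_a - mu_b
    # Tr(sqrt(cov_a @ cov_b)) equals the sum of the square roots of the
    # eigenvalues of the product, which avoids an explicit matrix square root.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * tr_sqrt)


def fad_inf(ref, gen, sizes, seed=0):
    """Assumed FAD-infinity sketch: regress FAD on 1/N and return the intercept."""
    gen = np.asarray(gen, dtype=np.float64)
    rng = np.random.default_rng(seed)
    mu_r, cov_r = gaussian_stats(ref)
    inv_n, scores = [], []
    for n in sizes:
        # Score a random subset of n generated embeddings against the reference.
        idx = rng.choice(len(gen), size=n, replace=False)
        mu_g, cov_g = gaussian_stats(gen[idx])
        inv_n.append(1.0 / n)
        scores.append(frechet_distance(mu_r, cov_r, mu_g, cov_g))
    _slope, intercept = np.polyfit(inv_n, scores, 1)
    return float(intercept)
```

A per-sample score could follow the same pattern by comparing the embeddings of a single generated clip against the reference statistics, so that clips with unusually high scores stand out as outliers.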
Quick Start & Requirements
Installation is straightforward via pip: pip install fadtk. A PyTorch installation is a prerequisite. The library is tested on Python 3.12 and supports versions greater than 3.10 on Linux, Windows, and macOS. Optional dependencies for specific embedding models like CDPAM and DAC can be installed separately. A comprehensive test suite is available via python -m fadtk.test. Further details and a demo are available at https://fadtk.hydev.org/.
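Before installing, it can help to confirm that the prerequisites are in place. The snippet below is a minimal sketch using only the standard library; `check_environment` is a hypothetical helper, not part of FADtk, and it assumes that Python 3.10 itself counts as supported, which the wording above leaves ambiguous.

```python
import importlib.util
import sys


def check_environment(min_python=(3, 10), packages=("torch", "fadtk")):
    """Report whether the interpreter and the named packages are available."""
    # Assumption: min_python is inclusive; adjust if 3.10 turns out to be excluded.
    report = {"python_ok": sys.version_info[:2] >= min_python}
    for name in packages:
        # find_spec returns None when a top-level package cannot be located.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for key, ok in check_environment().items():
        print(f"{key}: {'ok' if ok else 'missing'}")
```

If `torch` is reported missing, install PyTorch first and then run pip install fadtk.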
Maintenance & Community
The project is associated with authors from Microsoft and academic institutions, as indicated by the paper citation. The README does not specify dedicated community channels (e.g., Discord, Slack) or a public roadmap.
Licensing & Compatibility
FADtk is released under the permissive MIT License, which permits broad usage, including integration into commercial and closed-source projects, provided the copyright and license notice are retained.
Limitations & Caveats
Some embedding models, such as CDPAM and DAC, require separate installation beyond the default pip install fadtk command. FAD scores are only as meaningful as the chosen reference dataset and embedding model, so both should be selected carefully, as detailed in the project's best practices.