musicfm  by minzwon

Enables comprehensive music analysis and representation via a foundation model

Created 2 years ago
253 stars

Top 99.3% on SourcePulse

Summary

MusicFM is a foundation model for music informatics that provides general-purpose audio representations for a range of downstream tasks. It targets researchers and engineers in music AI and serves as a base for tasks such as beat tracking, chord recognition, and music tagging.

How It Works

MusicFM is trained with masked token modeling inspired by BEST-RQ: segments of the input audio are masked, and the model is trained to reconstruct the representations of the masked segments. It uses a Conformer architecture, which is reported to outperform BERT-style encoders on music tasks. The model supports mixed precision and Flash attention for memory efficiency. It can output frame-level embeddings via adaptive average pooling and sequence-level embeddings via global average pooling.
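
As a rough illustration of the masked token modeling idea, the sketch below builds BEST-RQ-style training targets: a frozen random projection and a frozen random codebook turn each frame into a discrete token, and the encoder would then be trained to predict the tokens at masked positions. This is a sketch under stated assumptions, not the MusicFM implementation: the function names, dimensions, masking probability, and span length are hypothetical, and plain Python lists stand in for tensors.

```python
import math
import random


def normalize(vec):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def make_quantizer(feat_dim, code_dim, codebook_size, seed=0):
    """Build a frozen random projection and a frozen random codebook."""
    rng = random.Random(seed)
    projection = [[rng.gauss(0, 1) for _ in range(feat_dim)] for _ in range(code_dim)]
    codebook = [normalize([rng.gauss(0, 1) for _ in range(code_dim)])
                for _ in range(codebook_size)]
    return projection, codebook


def tokenize(frames, projection, codebook):
    """Map each frame to the index of its nearest codebook entry."""
    tokens = []
    for frame in frames:
        projected = normalize([sum(w * x for w, x in zip(row, frame)) for row in projection])
        # On unit vectors, the nearest neighbour is the entry with the largest dot product.
        scores = [sum(p * c for p, c in zip(projected, code)) for code in codebook]
        tokens.append(max(range(len(codebook)), key=scores.__getitem__))
    return tokens


def mask_frames(frames, mask_prob, span, seed=0):
    """Replace random spans with noise; return the masked frames and masked positions."""
    rng = random.Random(seed)
    masked = [list(f) for f in frames]
    positions = set()
    for start in range(len(frames)):
        if rng.random() < mask_prob:
            for t in range(start, min(start + span, len(frames))):
                masked[t] = [rng.gauss(0, 0.1) for _ in frames[t]]
                positions.add(t)
    return masked, sorted(positions)


# Hypothetical toy input: 20 frames of 8-dimensional features.
data_rng = random.Random(1)
frames = [[data_rng.gauss(0, 1) for _ in range(8)] for _ in range(20)]

projection, codebook = make_quantizer(feat_dim=8, code_dim=4, codebook_size=16)
targets = tokenize(frames, projection, codebook)  # computed from the unmasked audio
masked_frames, masked_positions = mask_frames(frames, mask_prob=0.3, span=3)

# An encoder would receive masked_frames and be trained (e.g. with cross-entropy)
# to predict targets[t] for every t in masked_positions.
training_pairs = [(t, targets[t]) for t in masked_positions]
```

Because the projection and codebook are never trained, the targets stay fixed throughout pretraining, which is the main appeal of this family of methods.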

Quick Start & Requirements

  • Installation: Clone the repository and configure the HOME_PATH environment variable.
  • Prerequisites: Python, PyTorch. A GPU is recommended for optimal performance.
  • Models: Download pretrained checkpoints using wget commands provided in the README:
    • FMA version: fma_stats.json, pretrained_fma.pt
    • MSD version (recommended for better performance): msd_stats.json, pretrained_msd.pt
  • Note: Model checkpoints downloaded before February 13, 2024 were incorrect and must be downloaded again.
  • Links: Paper.

Highlighted Details

  • Offers two pretrained versions: FMA (using Creative Commons licensed audio) and MSD (Million Song Dataset), with the MSD version showing superior performance.
  • Supports mixed precision and Flash attention for reduced memory footprint during inference.
  • Embeddings can be adapted for frame-level or sequence-level analysis using pooling strategies.
  • The Conformer architecture proves more effective than BERT for music informatics tasks.
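
The two pooling strategies mentioned above can be sketched in a few lines. This is an illustrative pure-Python version, not code from the repository: `adaptive_avg_pool` and `global_avg_pool` are hypothetical helper names, and the input is assumed to be a list of per-frame embedding vectors. The bin boundaries follow the rule commonly used by adaptive average pooling implementations.

```python
def adaptive_avg_pool(frames, out_len):
    """Average-pool a (time, dim) sequence down to out_len frames."""
    n = len(frames)
    pooled = []
    for i in range(out_len):
        # Bin i covers [floor(i * n / out_len), ceil((i + 1) * n / out_len)).
        start = (i * n) // out_len
        end = -((-(i + 1) * n) // out_len)
        window = frames[start:end]
        pooled.append([sum(col) / len(window) for col in zip(*window)])
    return pooled


def global_avg_pool(frames):
    """Collapse the whole sequence into a single embedding vector."""
    return adaptive_avg_pool(frames, 1)[0]


# Hypothetical frame-level embeddings: 4 frames, 2 dimensions each.
frames = [[0.0, 10.0], [2.0, 10.0], [4.0, 10.0], [6.0, 10.0]]

half_rate = adaptive_avg_pool(frames, 2)  # [[1.0, 10.0], [5.0, 10.0]]
clip_level = global_avg_pool(frames)      # [3.0, 10.0]
```

Frame-level tasks such as beat tracking would keep the time axis (possibly resampled to the label frame rate), while clip-level tasks such as tagging would use the single pooled vector.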

Maintenance & Community

  • Contributors: Core development by Minz Won, Yun-Ning Hung, and Duc Le. Ju-Chiang Wang contributed to data refinement and evaluation code.
  • Updates: Model checkpoints were updated on February 13, 2024, due to an identified error.
  • Community: No community channels (e.g., Discord, Slack) or roadmap are mentioned.

Licensing & Compatibility

  • License: The repository's specific software license is not stated. The FMA dataset used for one model is Creative Commons licensed.
  • Compatibility: No explicit notes regarding commercial use or linking restrictions are provided. The absence of a clear software license is a significant adoption blocker.

Limitations & Caveats

Self-supervised foundation models, including MusicFM, perform poorly at musical key detection out of the box and need fine-tuning to improve. The released model uses the FMA dataset to mitigate licensing issues; larger datasets could yield better results but are not released. Fine-tuned models for downstream tasks are not publicly available, and the downstream evaluation pipeline is not included in the repository. Fine-tuning requires careful learning rate management to avoid catastrophic forgetting and overfitting, both of which were observed in the tagging task.
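
One common way to manage the learning rate during fine-tuning is to warm it up gradually and to give the pretrained encoder a smaller rate than the newly added task head. The sketch below illustrates that general recipe only; it is not the authors' procedure, and every value in it (peak rate, encoder scale, warmup and total steps) is a hypothetical placeholder.

```python
import math


def lr_at(step, peak_lr, warmup_steps, total_steps):
    """Linear warmup to peak_lr, then cosine decay to zero."""
    if step < warmup_steps:
        return peak_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * min(1.0, progress)))


def group_lrs(step, head_lr=1e-3, encoder_scale=0.1, warmup_steps=100, total_steps=1000):
    """Return per-group rates, with the pretrained encoder updated more gently."""
    head = lr_at(step, head_lr, warmup_steps, total_steps)
    return {"encoder": head * encoder_scale, "head": head}


# Hypothetical schedule sampled at a few steps of a 1000-step run.
schedule = {step: group_lrs(step) for step in (0, 50, 100, 500, 1000)}
```

Keeping the encoder rate low limits how far the pretrained weights drift early in training, which is the usual motivation for this pattern when catastrophic forgetting is a concern.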

Health Check

  • Last Commit: 2 years ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 2 stars in the last 30 days
