sheetsage by chrisdonahue

CLI tool for music lead sheet transcription

Created 3 years ago

423 stars

Top 69.7% on SourcePulse

1 Expert Loves This Project

Robert Stojnic

Cocreator of Papers with Code

Project Summary

Sheet Sage transcribes music into lead sheets, providing melody and chord information for Western pop music. It is designed for musicians, researchers, and developers interested in automated music transcription and analysis. The tool aims to simplify the process of creating playable sheet music from audio sources.

How It Works

Sheet Sage uses a deep learning model trained on a custom dataset derived from Hooktheory's TheoryTab DB. It relies on beat and downbeat detection algorithms, and can optionally use features from OpenAI's Jukebox to improve transcription quality. The system outputs lead sheets in PDF format, along with LilyPond files and MIDI representations of the melody and harmony.

Quick Start & Requirements

Installation: Requires Linux and Docker. Run ./prepare.sh for initial setup (downloads ~4GB Docker image and ~100MB data).
Transcription: Execute ./sheetsage.sh <youtube_url_or_local_file>.
Jukebox Integration: Requires a GPU with at least 12GB memory and CUDA installed. Run ./prepare.sh -j to download ~10GB of Jukebox model files.
Documentation: https://github.com/chrisdonahue/sheetsage
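
The steps above can be scripted. The following is a minimal sketch, not part of Sheet Sage itself: the helper names (missing_requirements, build_commands, run_all) are hypothetical, and only the commands quoted above (./prepare.sh, ./prepare.sh -j, ./sheetsage.sh) are taken from the project summary. It assumes the scripts are run from the root of a local checkout, and that the -j download is run in addition to the plain setup step. How Jukebox is enabled at transcription time is not stated here, so the sketch leaves that out.

```python
import platform
import shutil
import subprocess
from typing import List


def missing_requirements() -> List[str]:
    """Return the documented prerequisites that appear to be missing."""
    missing = []
    if platform.system() != "Linux":
        missing.append("Linux")
    if shutil.which("docker") is None:
        missing.append("Docker")
    return missing


def build_commands(source: str, with_jukebox: bool = False) -> List[List[str]]:
    """Build the setup and transcription commands, in order.

    `source` is a YouTube URL or a path to a local audio file.
    Running `./prepare.sh -j` after `./prepare.sh` is an assumption.
    """
    commands = [["./prepare.sh"]]  # ~4GB Docker image and ~100MB data
    if with_jukebox:
        commands.append(["./prepare.sh", "-j"])  # ~10GB of Jukebox model files
    commands.append(["./sheetsage.sh", source])
    return commands


def run_all(commands: List[List[str]], repo_dir: str = ".") -> None:
    """Run each command from the repository root, stopping on the first failure."""
    for cmd in commands:
        subprocess.run(cmd, cwd=repo_dir, check=True)
```

A typical use would be to call missing_requirements() first, and only pass build_commands("song.mp3") to run_all() when the returned list is empty. Keeping command construction separate from execution makes it possible to print or review the commands before any large download starts.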

Highlighted Details

Transcribes melody and chords from audio.
Supports YouTube URLs and local audio files.
Offers parameters to fine-tune tempo, downbeat detection, and segment selection.
Jukebox integration for improved transcription quality (GPU required).
Accompanied by a 50-hour aligned melody and harmony dataset, released under CC BY-NC-SA 3.0.

Maintenance & Community

The project is maintained by Chris Donahue. The README gives no further details on community channels.

Licensing & Compatibility

Code License: MIT.
Model/Dataset License: CC BY-NC-SA 3.0.
Dependencies: Jukebox, madmom, and Melisma have additional licensing terms that may affect commercial use. Users must ensure compliance with all terms.

Limitations & Caveats

The system is optimized for Western pop music and may give weaker results for other genres. Downbeat detection can be brittle and may require manual adjustment. Commercial use is restricted by the CC BY-NC-SA 3.0 license on the model and dataset, and by the additional licensing terms of the Jukebox, madmom, and Melisma dependencies.

Health Check

Last Commit: 9 months ago
Responsiveness: 1+ week

Star History

7 stars in the last 30 days