jukebox by openai

Code for the research paper "Jukebox: A Generative Model for Music"

Created 5 years ago

8,043 stars

Top 6.4% on SourcePulse

13 Experts Love This Project

Aravind Srinivas

Cofounder of Perplexity

Dan Abramov

Core Contributor to React; Coauthor of Redux, Create React App

Lewis Tunstall

Research Engineer at Hugging Face

Johannes Schickling

Cofounder of Prisma

and 9 more!

Project Summary

Jukebox is a publicly released OpenAI project providing code and pre-trained weights for a generative music model. The model produces novel music in a range of styles, including sung vocals conditioned on lyrics. It is aimed at researchers and developers interested in AI music generation.

How It Works

Jukebox combines a hierarchical VQ-VAE with autoregressive transformers. The VQ-VAE first compresses raw audio into discrete codes at several levels of temporal resolution. A top-level transformer prior then models the coarsest codes, upsampler priors fill in the finer levels, and the VQ-VAE decoder converts the finest codes back to audio. Because the transformers operate on compressed codes rather than raw samples, the model can capture long-range dependencies in music and generate coherent, stylistically diverse pieces.
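
To make the "compress audio into discrete codes" step concrete, here is a minimal sketch of vector quantization, the lookup at the core of any VQ-VAE. It is a toy, not Jukebox's implementation: the codebook values, the two-dimensional frames, and the function names are all invented for illustration, and the learned encoder and decoder networks that surround this step in a real VQ-VAE are omitted.

```python
# Toy vector quantization: the discretization step at the heart of a VQ-VAE.
# The codebook, the frames, and all function names are hypothetical and are
# not taken from the Jukebox code base.

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(frames, codebook):
    """Map each frame to the index of its nearest codebook vector."""
    codes = []
    for frame in frames:
        nearest = min(
            range(len(codebook)),
            key=lambda k: squared_distance(frame, codebook[k]),
        )
        codes.append(nearest)
    return codes


def dequantize(codes, codebook):
    """Replace each discrete code with its codebook vector."""
    return [codebook[k] for k in codes]


# Three codebook entries stand in for a learned codebook.
codebook = [(0.0, 0.0), (1.0, 1.0), (-1.0, -1.0)]
# Four "latent frames" stand in for encoder outputs.
frames = [(0.1, -0.1), (0.9, 1.2), (-0.8, -1.1), (0.2, 0.1)]

codes = quantize(frames, codebook)
reconstruction = dequantize(codes, codebook)
print(codes)  # [0, 1, 2, 0]
```

The sequence of integers in codes is the kind of object a transformer prior would be trained to model. Generation runs the other way: sample a code sequence, look up the vectors, and decode them to audio.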

Quick Start & Requirements

Install: Create and activate a Conda environment (conda create --name jukebox python=3.7.5, then conda activate jukebox). Install the dependencies mpi4py, pytorch, torchvision, cudatoolkit=10.0, and av. Then git clone the repository, cd jukebox, and run pip install -r requirements.txt followed by pip install -e . (the trailing dot refers to the current directory).
Prerequisites: Python 3.7.5, CUDA 10.0, and a GPU with substantial memory (16 GB or more recommended, for example a V100). Training additionally requires av=7.0.01, and Apex can optionally be installed to speed up training.
Resources: Sampling 20 seconds of music on a V100 takes ~3 hours, roughly 540 times slower than real time. Model sizes range from 3.8 GB (1b_lyrics) to 11.5 GB (5b_lyrics).
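
Because the setup depends on pinned, older versions, it can help to confirm the environment before starting a multi-hour sampling run. The following is a minimal sketch, not part of the Jukebox repository: the function name check_environment and the strict match on Python 3.7.5 are assumptions made here, and the module list simply mirrors the packages named above (pytorch is imported as torch). It does not verify the CUDA toolkit version or GPU memory.

```python
import importlib.util
import sys

# Values taken from the requirements quoted in this summary.
EXPECTED_PYTHON = (3, 7, 5)
REQUIRED_MODULES = ["mpi4py", "torch", "torchvision", "av"]


def check_environment(python_version=None, modules=REQUIRED_MODULES):
    """Return a list of problems found; an empty list means none were found.

    Hypothetical helper for illustration; it only checks the interpreter
    version and whether each module can be located, nothing more.
    """
    if python_version is None:
        python_version = tuple(sys.version_info[:3])

    problems = []
    if tuple(python_version) != EXPECTED_PYTHON:
        found = ".".join(str(part) for part in python_version)
        wanted = ".".join(str(part) for part in EXPECTED_PYTHON)
        problems.append(f"Python {found} found, {wanted} expected")

    for name in modules:
        # find_spec returns None when a top-level module cannot be located.
        if importlib.util.find_spec(name) is None:
            problems.append(f"module {name!r} is not importable")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

Running the script inside the activated jukebox environment prints one warning per mismatch and nothing if the checks pass.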

Highlighted Details

Supports sampling from scratch, continuing existing samples, upsampling, and priming with custom audio.
Offers training pipelines for VQ-VAE, priors, and upsamplers, with options for label conditioning (artist, genre, timing) and lyric conditioning.
Pre-trained models are available, and the framework allows for fine-tuning on new datasets or styles.
Detailed instructions are provided for configuring and training models with various conditioning methods.

Maintenance & Community

The repository's status is "Archive": the code is provided as-is and no updates are expected. It is an OpenAI project.

Licensing & Compatibility

The project is released under a Noncommercial Use License, which restricts commercial use of both the code and released weights.

Limitations & Caveats

The project is archived and no longer maintained. Training the largest 5B model requires GPipe, which is not supported in this release. The setup process involves specific older versions of dependencies (Python 3.7.5, CUDA 10.0), which may pose compatibility challenges with modern systems.

Health Check

Last Commit

1 year ago

Responsiveness

Inactive

Star History

12 stars in the last 30 days