YuE by multimodal-art-projection

Open-source tool for generating full songs from lyrics

Created 1 year ago

6,319 stars

Top 8.0% on SourcePulse

3 Experts Love This Project

Jiayi Pan

Author of SWE-Gym; MTS at xAI

Project Summary

YuE is an open-source foundation model series for generating full-length songs from lyrics, offering both vocal and accompaniment tracks. It supports diverse genres, languages, and vocal techniques, aiming to democratize AI music creation for artists and researchers.

How It Works

YuE employs a multi-stage generation process that turns lyrics into a complete song, including vocals and instrumental backing. It offers two generation modes: standard "Chain-of-Thought" (CoT) generation, and "In-Context Learning" (ICL), which conditions on reference audio for style transfer or voice cloning. CoT suits open-ended creation, while ICL keeps the output close to the style of the reference.
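The difference between the two modes comes down to whether reference audio is supplied. The sketch below illustrates that choice only; `SongRequest`, its fields, and the mode strings are hypothetical names invented for this example and are not part of the YuE codebase or its command-line interface.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class SongRequest:
    """Hypothetical request object; not part of the YuE codebase."""

    lyrics: str
    reference_audio: Optional[Path] = None  # clip to imitate, if any

    @property
    def mode(self) -> str:
        # No reference audio: plain lyrics-to-song generation (CoT).
        # Reference audio present: condition on the clip (ICL).
        return "icl" if self.reference_audio is not None else "cot"


plain = SongRequest(lyrics="City lights are calling me home")
styled = SongRequest(
    lyrics="City lights are calling me home",
    reference_audio=Path("reference.mp3"),  # placeholder path
)
print(plain.mode, styled.mode)
```

Keeping the mode derived from the inputs, rather than stored separately, avoids requests that claim ICL without providing a reference clip.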

Quick Start & Requirements

Install: Using conda for environment setup is recommended. Install dependencies with pip install -r requirements.txt. FlashAttention 2 is required, as it reduces memory usage.
Prerequisites: Python >= 3.8, CUDA >= 11.8, PyTorch with matching cudatoolkit, git-lfs.
Models: Download model weights from Hugging Face (e.g., m-a-p/YuE-s1-7B-anneal-en-cot).
Resources: 24GB of GPU memory is recommended for basic use and 80GB or more for extensive generation; lower-VRAM options are listed under Highlighted Details.
Demo/UI: Gradio interfaces (YuE-UI, YuE-exllamav2-UI, YuEGP) and a Windows installer (Pinokio) are available.
Docs: Prompt engineering guide available.
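Before installing, it can help to confirm that the machine meets the prerequisites listed above. The following is a minimal sketch, not an official YuE script: the function names and tier labels are made up for illustration, the thresholds are simply the figures quoted in this section, and the GPU check only runs if PyTorch is already installed.

```python
import shutil
import sys

MIN_PYTHON = (3, 8)   # "Python >= 3.8" from the prerequisites
BASIC_VRAM_GB = 24    # "basic use" figure quoted above
FULL_VRAM_GB = 80     # "extensive generation" figure quoted above


def python_ok(version=None) -> bool:
    """Return True if the (major, minor) version meets the minimum."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= MIN_PYTHON


def git_lfs_installed() -> bool:
    """Return True if a git-lfs executable is found on PATH."""
    return shutil.which("git-lfs") is not None


def vram_tier(total_gb: float) -> str:
    """Map GPU memory to a hypothetical tier label."""
    if total_gb >= FULL_VRAM_GB:
        return "extensive"
    if total_gb >= BASIC_VRAM_GB:
        return "basic"
    return "low"  # quantized models or community UIs may still work


def detect_vram_gb():
    """Return memory of GPU 0 in decimal GB, or None if unavailable."""
    try:
        import torch  # optional; only needed for the GPU check
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_properties(0).total_memory / 1e9


if __name__ == "__main__":
    print("Python OK:", python_ok())
    print("git-lfs found:", git_lfs_installed())
    gb = detect_vram_gb()
    print("GPU tier:", "unknown" if gb is None else vram_tier(gb))
```

The check does not verify the CUDA version or FlashAttention 2; those depend on the local driver and build setup.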

Highlighted Details

Supports incremental song generation and music continuation.
Offers dual-track ICL for advanced style transfer and voice cloning.
Optimized for lower VRAM GPUs (e.g., 8GB) via quantized models and community UIs.
Generates multi-minute songs with distinct vocal and accompaniment tracks.

Maintenance & Community

The latest updates added incremental generation and ICL modes; see Health Check for commit recency.
Community support via Discord.
Project co-led by HKUST and M-A-P, with support from industry partners.

Licensing & Compatibility

Licensed under Apache License 2.0.
Encourages commercial use and monetization of generated outputs with attribution to "YuE by HKUST/M-A-P".

Limitations & Caveats

Full-song generation requires substantial GPU resources. Community optimizations for lower-VRAM GPUs exist, but they may reduce musical quality. The "intro" section label is noted as less stable than other labels.
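A lyrics file can be screened for the less stable label before a long generation run. This sketch assumes lyrics are written with one bracketed section label per line, such as [verse] or [intro]; that layout, the regular expression, and the function names are assumptions for illustration rather than a format confirmed by this summary.

```python
import re

# Assumed label syntax: a single word in square brackets on its own line.
LABEL_RE = re.compile(r"^\[(\w+)\]\s*$")
UNSTABLE_LABELS = {"intro"}  # label noted above as less stable


def split_sections(lyrics: str):
    """Split lyrics into (label, text) pairs, in order of appearance.

    Lines that appear before the first label are ignored.
    """
    sections = []
    for raw in lyrics.splitlines():
        line = raw.strip()
        match = LABEL_RE.match(line)
        if match:
            sections.append((match.group(1).lower(), []))
        elif line and sections:
            sections[-1][1].append(line)
    return [(label, "\n".join(lines)) for label, lines in sections]


def unstable_sections(lyrics: str):
    """Return the labels in the lyrics that are flagged as less stable."""
    return [label for label, _ in split_sections(lyrics)
            if label in UNSTABLE_LABELS]


example = "[intro]\nla la\n\n[verse]\nCity lights\ncalling me home\n"
print(split_sections(example))
print(unstable_sections(example))
```

A caller could use the result to warn the user or to rename the section before generation.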

Health Check

Last Commit: 1 year ago
Responsiveness: 1 day

Star History: 54 stars in the last 30 days