ACE-Step-1.5  by ace-step

Advanced open-source music generation model

Created 7 months ago
8,392 stars

Top 6.1% on SourcePulse

Project Summary

ACE-Step 1.5 is an open-source music generation model designed to deliver commercial-grade audio quality on consumer hardware. It targets music artists, producers, and content creators who want fast generation that runs locally. It also offers fine-grained stylistic control and personalization through lightweight LoRA training.

How It Works

The project uses a hybrid architecture in which a Language Model (LM) acts as a planner. The LM expands a user query into a detailed song blueprint, synthesizing metadata and lyrics via Chain-of-Thought, and that blueprint guides a Diffusion Transformer (DiT) that generates the audio. The alignment mechanism uses intrinsic reinforcement learning based on internal model states rather than external reward models or human preference data, which the project says avoids the biases those sources introduce. Together, these enable precise stylistic control and versatile editing.
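The planner-then-renderer data flow described above can be pictured with a toy sketch. Everything in it is hypothetical: `SongBlueprint`, `plan_song`, and `render_audio` are illustrative names, not part of the ACE-Step code base, and both stages are stubs that stand in for the real LM and DiT.

```python
from dataclasses import dataclass, field


@dataclass
class SongBlueprint:
    """Hypothetical container for what the planner hands to the renderer."""
    query: str
    metadata: dict = field(default_factory=dict)
    lyrics: str = ""


def plan_song(query: str) -> SongBlueprint:
    """Stand-in for the LM planner: expands a short query into a blueprint.

    A real planner would generate these fields with a language model;
    here they are fixed placeholder values.
    """
    return SongBlueprint(
        query=query.strip(),
        metadata={"bpm": 120, "duration_s": 30},  # placeholder values
        lyrics="[verse]\n(placeholder lyrics)",
    )


def render_audio(blueprint: SongBlueprint, sample_rate: int = 8000) -> list[float]:
    """Stand-in for the DiT: returns silence of the planned duration."""
    n_samples = int(blueprint.metadata["duration_s"] * sample_rate)
    return [0.0] * n_samples


# Stage 1 plans, stage 2 renders from the plan rather than from the raw query.
blueprint = plan_song("  upbeat synth-pop with female vocals ")
audio = render_audio(blueprint)
print(blueprint.query, len(audio))
```

The point of the split is that the renderer only ever sees the structured blueprint, so editing a field such as the lyrics or duration changes the output without rewording the original query.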

Quick Start & Requirements

  • Primary Install/Run:
    • Windows Portable Package (Recommended): Download and extract ACE-Step-1.5.7z. Launch the Gradio Web UI via start_gradio_ui.bat or the REST API Server via start_api_server.bat.
    • Standard Installation: Install the uv package manager (via curl/PowerShell script). Clone the repository (git clone https://github.com/ACE-Step/ACE-Step-1.5.git), navigate into the directory, and run uv sync. Launch via uv run acestep (Gradio UI) or uv run acestep-api (REST API).
  • Prerequisites: Python 3.11. CUDA GPU is recommended for performance; CPU/MPS are supported but slower. The portable package specifies CUDA 12.8.
  • Resource Footprint: Generation runs locally with <4GB VRAM. LoRA training needs ~12GB VRAM and takes about 1 hour for 8 songs on an RTX 3090.
  • Documentation/Demos: The repository links to Hugging Face, ModelScope, a Space demo, Discord, and the Technical Report.
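The standard installation steps above can be scripted. The following is a minimal sketch, assuming `git` and `uv` are already installed and on the PATH; the helper names `run_steps` and `setup_and_launch` are made up for illustration, and only the commands themselves come from the steps listed above. It defaults to a dry run that prints the commands without executing them.

```python
import subprocess

REPO_URL = "https://github.com/ACE-Step/ACE-Step-1.5.git"
REPO_DIR = "ACE-Step-1.5"  # default directory name that git clone creates

# (command, working directory) pairs; None means the current directory.
INSTALL_STEPS = [
    (["git", "clone", REPO_URL], None),
    (["uv", "sync"], REPO_DIR),
]

LAUNCH_COMMANDS = {
    "gradio": ["uv", "run", "acestep"],   # Gradio Web UI
    "api": ["uv", "run", "acestep-api"],  # REST API server
}


def run_steps(steps, dry_run=True):
    """Print each step, and execute it unless dry_run is set."""
    for command, cwd in steps:
        print(f"[{cwd or '.'}] {' '.join(command)}")
        if not dry_run:
            subprocess.run(command, cwd=cwd, check=True)


def setup_and_launch(mode="gradio", dry_run=True):
    """Clone, sync dependencies, then launch the chosen front end."""
    steps = INSTALL_STEPS + [(LAUNCH_COMMANDS[mode], REPO_DIR)]
    run_steps(steps, dry_run=dry_run)
    return steps


steps = setup_and_launch("gradio", dry_run=True)
```

Calling `setup_and_launch("api", dry_run=False)` would run the same install steps and then start the REST API server instead of the Gradio UI.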

Highlighted Details

  • Performance: Generation takes under 10 seconds on an RTX 3090 and runs locally with <4GB VRAM.
  • Quality: Delivers commercial-grade output that the project describes as outperforming many commercial alternatives.
  • Versatility: Supports flexible durations (10s to 10min), prompts in 50+ languages, and advanced editing features such as cover generation, repainting, and vocal-to-BGM conversion.
  • Personalization: Enables lightweight LoRA training from just a few songs to capture a user's own style.
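As a rough planning aid, the VRAM figures quoted in this summary can be turned into a simple check. This is a sketch under stated assumptions: the function name `gpu_workflows` is hypothetical, and treating "<4GB" and "~12GB" as hard 4 GB and 12 GB cut-offs is a simplification, since real requirements depend on song length and settings.

```python
# Assumed cut-offs, derived from the approximate figures quoted in this summary.
GENERATION_MIN_VRAM_GB = 4.0      # "<4GB VRAM" for local generation
LORA_TRAINING_MIN_VRAM_GB = 12.0  # "~12GB VRAM" for LoRA training


def gpu_workflows(vram_gb: float) -> list[str]:
    """Return the workflows a GPU with vram_gb of memory should fit."""
    workflows = []
    if vram_gb >= GENERATION_MIN_VRAM_GB:
        workflows.append("generation")
    if vram_gb >= LORA_TRAINING_MIN_VRAM_GB:
        workflows.append("lora_training")
    return workflows


for vram in (2, 8, 24):
    print(f"{vram} GB: {gpu_workflows(vram) or 'below the quoted GPU figures'}")
```

An empty result only means the GPU falls below the quoted figures; the model is still reported to run on CPU/MPS, just more slowly.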

Maintenance & Community

The project is co-led by ACE Studio and StepFun. A Discord server is available for community interaction.

Licensing & Compatibility

ACE-Step 1.5 is released under the MIT license, which permits commercial use and integration into closed-source projects, provided the copyright and license notice are retained.

Limitations & Caveats

The model runs on CPU/MPS, but performance is significantly reduced. Intel GPU support is experimental: longer audio may generate slowly, and some acceleration features are unavailable. The project also warns against fake domains and directs users exclusively to its official GitHub Pages site.

Health Check

  • Last Commit: 13 hours ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 134
  • Issues (30d): 140
  • Star History: 1,609 stars in the last 30 days
