Stability-AI
Audio generation platform for music and sound effects
Top 81.2% on SourcePulse
Stable Audio 3 is an open platform for fast, high-quality audio and music generation, offering streamlined inference and fine-tuning. It targets researchers and power users who need efficient tools for creating and editing audio content, and it provides state-of-the-art models and flexible hardware support.
How It Works
This project leverages a new Semantic-Acoustic Music Encoder (SAME) autoencoder, supporting stereo, 44.1 kHz audio. It provides three core inference modes: text-to-audio, audio-to-audio editing, and inpainting/continuation. This design enables variable-length generation, efficient VRAM utilization, and personalization through stackable LoRA fine-tuning, optimizing both generative tractability and reconstruction quality.
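To make the stereo, 44.1 kHz, variable-length framing concrete, here is a minimal sketch of the arithmetic behind a clip length and an inpainting region. The helper names (frames_for, total_samples, inpaint_region) are hypothetical illustrations, not part of the project's API, and the actual model may represent these regions differently.

```python
# Illustrative arithmetic only; these helpers are hypothetical and are
# not taken from the Stable Audio codebase.
SAMPLE_RATE = 44_100  # Hz, as stated in the description
CHANNELS = 2          # stereo


def frames_for(seconds: float) -> int:
    """Number of sample frames (one frame = one sample per channel)."""
    if seconds < 0:
        raise ValueError("duration must be non-negative")
    return round(seconds * SAMPLE_RATE)


def total_samples(seconds: float) -> int:
    """Total individual samples across all channels."""
    return frames_for(seconds) * CHANNELS


def inpaint_region(total_s: float, start_s: float, end_s: float) -> tuple[int, int]:
    """Frame range [start, end) to regenerate inside a clip of total_s seconds."""
    total, start, end = frames_for(total_s), frames_for(start_s), frames_for(end_s)
    if not 0 <= start <= end <= total:
        raise ValueError("region must lie inside the clip")
    return start, end


if __name__ == "__main__":
    print(frames_for(2.5))             # 110250
    print(total_samples(1.0))          # 88200
    print(inpaint_region(10, 2, 4))    # (88200, 176400)
```

Under this framing, continuation can be read as the case where the region to generate starts at the end of the existing audio, though how the project actually encodes that is not stated here.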
Quick Start & Requirements
Install dependencies with uv sync.
Run the Gradio UI with uv run python run_gradio.py --model medium.
Prerequisites: the uv package manager. CUDA 12.6+ is the default for PyTorch; specific versions can be pinned. Flash Attention 2 is required for the medium model.
Highlighted Details
Maintenance & Community
The project is associated with the Harmonai Discord server, which hosts discussions and weekly office hours on AI audio and music. The underfit tool by Dadabots is mentioned as an experimental option for advanced LoRA training.
Licensing & Compatibility
The project is released under the Stability AI Community License. Specific compatibility notes for commercial use or closed-source linking are not detailed in the README.
Limitations & Caveats
The 'Large' model is exclusively available via API and not supported by this repository. Stable Audio 3 Medium requires Flash Attention 2, and installation issues can lead to static glitch sounds. Troubleshooting Flash Attention installation is critical for the medium model's functionality.
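Since a broken Flash Attention install can surface as audio glitches rather than an obvious error, a quick pre-flight check may save time. The sketch below only tests whether a flash_attn module can be found in the current environment; it assumes the project imports Flash Attention under that module name (the name used by the flash-attn package), and a successful lookup does not guarantee the build matches your CUDA or PyTorch version.

```python
import importlib.util


def flash_attention_available(module_name: str = "flash_attn") -> bool:
    """Return True if the named module can be located without importing it.

    Assumption: the medium model relies on the `flash_attn` module. This
    checks presence only, not that the compiled kernels actually work.
    """
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if flash_attention_available():
        print("flash_attn found")
    else:
        print("flash_attn missing: the medium model is unlikely to work correctly")
```

Run it inside the project environment (for example via uv run python followed by the script name) so the check sees the same packages the model will.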