fspecii/HeartMuLa: AI music studio for professional audio creation
Top 64.2% on SourcePulse
HeartMuLa Studio is a professional, Suno-like AI music generation platform designed for creators seeking advanced features like reference audio style transfer and LLM-powered lyric generation. It targets engineers, researchers, and power users, offering a powerful toolset for producing complete songs with vocals, instrumentals, and customizable styles, while optimizing for performance and VRAM usage.
How It Works
The studio leverages the HeartLib AI engine (MuQ, MuLan, HeartCodec) for core music generation, enabling full song creation up to four minutes, instrumental tracks, and style definition via tags. A key differentiator is its experimental reference audio style transfer, allowing users to upload any audio file to influence the generated music, with adjustable intensity and precise region selection via a waveform visualizer. AI-powered lyrics are generated using LLMs, supporting both local Ollama and cloud-based OpenRouter, with features for topic-based generation, style suggestions, and prompt enhancement. The architecture combines a React/TypeScript frontend with a FastAPI backend.
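The LLM-powered lyric flow described above can be sketched against a local Ollama server. The endpoint and payload follow Ollama's public REST API; the model name, style tags, and prompt wording are illustrative assumptions, not HeartMuLa Studio's actual implementation.

```python
import json
import urllib.request

# Ollama's default local endpoint (cloud OpenRouter would swap in a different URL and auth header).
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_lyrics_prompt(topic: str, style_tags: list[str]) -> str:
    """Compose a topic-based lyric prompt. Wording is illustrative, not HeartMuLa's."""
    tags = ", ".join(style_tags)
    return (
        f"Write song lyrics about {topic}. "
        f"Match these style tags: {tags}. "
        "Return verses and a chorus, no commentary."
    )

def generate_lyrics(topic: str, style_tags: list[str], model: str = "llama3") -> str:
    """POST a non-streaming generation request to a local Ollama instance."""
    payload = json.dumps({
        "model": model,
        "prompt": build_lyrics_prompt(topic, style_tags),
        "stream": False,
    }).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

if __name__ == "__main__":
    # Prompt construction works offline; generate_lyrics() needs a running Ollama server.
    print(build_lyrics_prompt("a long drive at night", ["synthwave", "melancholic"]))
```

In the real studio this call would sit behind the FastAPI backend, with the React/TypeScript frontend supplying the topic and tags.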
Quick Start & Requirements
Installation is streamlined via a ./start.sh script or a recommended Docker setup.
Triton (pip install triton, or triton-windows on Windows) is needed for torch.compile.
Highlighted Details
Supports torch.compile for up to 2x faster inference.
Maintenance & Community
The project is actively developed in the fspecii/HeartMuLa repository. No specific details regarding core maintainers, sponsorships, or dedicated community channels (like Discord or Slack) are provided in the README.
Licensing & Compatibility
The project is released under the permissive MIT License, which generally allows for commercial use and integration into closed-source projects without significant restrictions.
Limitations & Caveats
HeartMuLa Studio is not supported on systems with less than 10GB of VRAM; systems with 10-14GB VRAM require model swapping, impacting generation speed. The reference audio style transfer feature is marked as experimental. Initial model downloads and torch.compile can lead to slower first-run performance. Flash Attention is disabled on older NVIDIA GPUs (SM 6.x and older) and AMD GPUs, with compatibility varying for the latter.
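The VRAM thresholds above can be summarized as a small helper. The tier names and function are illustrative; only the 10 GB minimum and the 10-14 GB swapping range come from the summary (behavior exactly at 14 GB is an assumption).

```python
def vram_tier(vram_gb: float) -> str:
    """Map available VRAM to the behavior described in the caveats above."""
    if vram_gb < 10:
        return "unsupported"      # below the stated 10 GB minimum
    if vram_gb < 14:
        return "model-swapping"   # 10-14 GB: models are swapped in and out, slowing generation
    return "resident"             # enough VRAM to keep all models loaded

if __name__ == "__main__":
    for gb in (8, 12, 16):
        print(f"{gb} GB -> {vram_tier(gb)}")
```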