AI agent for audio creation and editing
Top 60.2% on SourcePulse
WavCraft is an AI agent designed for audio creation and editing, targeting researchers and content creators. It simplifies complex audio tasks like text-guided generation, editing, and scriptwriting by leveraging large language models to orchestrate various audio expert models and digital signal processing functions.
How It Works
WavCraft functions as an LLM-driven agent, connecting diverse audio models and DSP functions. This approach allows users to interact with audio using natural language prompts for tasks such as editing existing audio clips based on text descriptions or generating new audio from scratch. The agent's architecture integrates multiple specialized audio models, enabling a unified interface for various audio manipulation needs.
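The orchestration idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not WavCraft's actual code. The tool names, the `PLAN` structure, and the stub "expert model" are all hypothetical. In the real system an LLM would produce the plan from the user's prompt, and the tools would be real audio models and DSP routines.

```python
from typing import Callable, Dict, List

Audio = List[float]  # mono samples in [-1, 1]; a stand-in for real waveforms


def generate_stub(prompt: str, n: int = 4) -> Audio:
    """Hypothetical stand-in for a text-to-audio expert model."""
    return [0.5] * n


def mix(a: Audio, b: Audio) -> Audio:
    """DSP step: sum two signals sample by sample, clipping to [-1, 1]."""
    n = max(len(a), len(b))
    a = a + [0.0] * (n - len(a))  # zero-pad the shorter signal
    b = b + [0.0] * (n - len(b))
    return [max(-1.0, min(1.0, x + y)) for x, y in zip(a, b)]


# Registry of callable tools (names are assumptions for this sketch).
TOOLS: Dict[str, Callable[..., Audio]] = {"generate": generate_stub, "mix": mix}

# A plan an LLM might emit for "Add dog barking." (hard-coded here).
PLAN = [
    {"tool": "generate", "inputs": [], "params": {"prompt": "dog barking"}, "out": "dog"},
    {"tool": "mix", "inputs": ["input", "dog"], "params": {}, "out": "result"},
]


def run_plan(plan: List[dict], input_audio: Audio) -> Audio:
    """Execute each step in order, passing named intermediate results along."""
    env: Dict[str, Audio] = {"input": input_audio}
    for step in plan:
        audio_args = [env[name] for name in step["inputs"]]
        env[step["out"]] = TOOLS[step["tool"]](*audio_args, **step["params"])
    return env[plan[-1]["out"]]


if __name__ == "__main__":
    print(run_plan(PLAN, [0.25, -0.25]))
```

Keeping the tools in a plain registry means a new model or DSP function can be added without changing the executor; only the planner needs to know the new name.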
Quick Start & Requirements
Install: bash scripts/setup_envs.sh
Prerequisites: set the OPENAI_KEY and HF_KEY environment variables.
Start services: bash scripts/start_services.sh
Edit audio from the command line: python3 WavCraft.py basic -f --input-wav assets/duck_quacking_in_water.wav --input-text "Add dog barking."
Chat mode: python3 WavCraft-chat.py basic -f -c
Check the watermark: python3 check_watermark.py --wav-path /path/to/audio/file
A --model argument is also available.
Highlighted Details
Maintenance & Community
The project acknowledges contributions from WavJourney, AudioCraft, AudioSep, AudioSR, AudioLDM, and WavMark. The primary author is Jinhua Liang.
Licensing & Compatibility
The repository is for research purposes only. Users must not disable watermarking techniques.
Limitations & Caveats
This repository is intended for research purposes only, and the developers are not responsible for the semantics of generated or edited audio. Users are explicitly prohibited from disabling the watermarking features.
7 months ago
Inactive