WavCraft by JinhuaLiang

AI agent for audio creation and editing

Created 1 year ago

525 stars

Top 60.0% on SourcePulse

Project Summary

WavCraft is an AI agent designed for audio creation and editing, targeting researchers and content creators. It simplifies complex audio tasks like text-guided generation, editing, and scriptwriting by leveraging large language models to orchestrate various audio expert models and digital signal processing functions.

How It Works

WavCraft is an LLM-driven agent that connects diverse audio expert models and DSP functions behind a single natural-language interface. Users describe a task in a text prompt, such as editing an existing audio clip or generating new audio from scratch, and the agent orchestrates the specialized models needed to carry it out.
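The orchestration idea described above can be illustrated with a minimal sketch. Everything in it is hypothetical and for illustration only: the function names (mix, gain, fake_generate, plan_from_prompt, run_plan), the plan format, and the keyword-based planner that stands in for the LLM are assumptions, not WavCraft's actual code or API. Audio is modeled as a plain list of samples so the example runs without any audio library.

```python
from typing import Dict, List

Audio = List[float]  # toy stand-in for a waveform: a list of samples


def mix(a: Audio, b: Audio) -> Audio:
    """Overlay two clips sample by sample, padding the shorter one with silence."""
    n = max(len(a), len(b))
    a = a + [0.0] * (n - len(a))
    b = b + [0.0] * (n - len(b))
    return [x + y for x, y in zip(a, b)]


def gain(a: Audio, factor: float) -> Audio:
    """Scale every sample by a constant factor (a simple DSP function)."""
    return [x * factor for x in a]


def fake_generate(description: str) -> Audio:
    """Placeholder for a text-to-audio expert model (hypothetical)."""
    return [0.5, 0.5, 0.5]


def plan_from_prompt(prompt: str) -> List[Dict]:
    """Stand-in for the LLM: a real agent would ask a language model for the plan."""
    text = prompt.lower()
    if text.startswith("add "):
        description = prompt[4:].rstrip(".")
        return [
            {"op": "generate", "args": {"description": description}},
            {"op": "mix"},
        ]
    if "louder" in text:
        return [{"op": "gain", "args": {"factor": 2.0}}]
    return []  # unknown request: leave the audio unchanged


def run_plan(plan: List[Dict], audio: Audio) -> Audio:
    """Execute each planned step by dispatching to the matching function."""
    current = list(audio)
    generated: Audio = []
    for step in plan:
        args = step.get("args", {})
        if step["op"] == "generate":
            generated = fake_generate(args["description"])
        elif step["op"] == "mix":
            current = mix(current, generated)
        elif step["op"] == "gain":
            current = gain(current, args["factor"])
        else:
            raise ValueError(f"unknown operation: {step['op']}")
    return current


edited = run_plan(plan_from_prompt("Add dog barking."), [0.25, 0.25])
```

The point of the sketch is the separation of concerns: the planner only decides which operations to run, while each operation stays a small, independently testable function.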

Quick Start & Requirements

Install via bash scripts/setup_envs.sh.
Requires OPENAI_KEY and HF_KEY environment variables.
Launch services with bash scripts/start_services.sh.
Basic usage: python3 WavCraft.py basic -f --input-wav assets/duck_quacking_in_water.wav --input-text "Add dog barking."
Interactive chat: python3 WavCraft-chat.py basic -f -c
Watermark check: python3 check_watermark.py --wav-path /path/to/audio/file
Supports open LLMs (e.g., the MistralAI family) via the --model argument.
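The steps above can be wrapped in a small pre-flight helper. This is a sketch under stated assumptions: the helper names (missing_keys, build_edit_command) are hypothetical, the flags are copied verbatim from the commands listed above, and the value format accepted by --model is not documented in this summary, so it is passed through unchanged.

```python
import shlex
from typing import List, Mapping, Optional

# Environment variables named in the requirements above.
REQUIRED_KEYS = ("OPENAI_KEY", "HF_KEY")


def missing_keys(env: Mapping[str, str]) -> List[str]:
    """Return the required variables that are unset or empty in `env`."""
    return [key for key in REQUIRED_KEYS if not env.get(key)]


def build_edit_command(
    input_wav: str, input_text: str, model: Optional[str] = None
) -> List[str]:
    """Assemble the basic-usage command as an argument list (no shell quoting needed)."""
    cmd = [
        "python3", "WavCraft.py", "basic", "-f",
        "--input-wav", input_wav,
        "--input-text", input_text,
    ]
    if model is not None:
        # Assumption: --model takes a single value naming the open LLM.
        cmd += ["--model", model]
    return cmd


cmd = build_edit_command("assets/duck_quacking_in_water.wav", "Add dog barking.")
# shlex.join renders the list as a copy-pasteable shell command.
print(shlex.join(cmd))
```

Calling missing_keys(os.environ) before launching lets a script fail early with a clear message, and the returned argument list can be handed to subprocess.run once the services from scripts/start_services.sh are up.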

Highlighted Details

Text-guided audio editing and generation.
AI-powered audio scriptwriting.
Watermarking to detect generated or modified audio.
Support for open LLMs such as Mistral-7B-Instruct-v0.2.

Maintenance & Community

The project acknowledges contributions from WavJourney, AudioCraft, AudioSep, AudioSR, AudioLDM, and WavMark. The primary author is Jinhua Liang.

Licensing & Compatibility

The repository is for research purposes only. Users must not disable watermarking techniques.

Limitations & Caveats

This repository is intended for research purposes only, and the developers are not responsible for the semantics of generated or edited audio. Users are explicitly prohibited from disabling the watermarking features.

Explore Similar Projects

AudioStory by TencentARC

WavJourney by Audio-AGI

SonicVale by xcLee001

tango by declare-lab

openvino-plugins-ai-audacity by intel

FunMusic by FunAudioLLM

AudioLDM by haoheliu

audiolm-pytorch by lucidrains

Kimi-Audio by MoonshotAI

AudioGPT by AIGC-Audio

csm by SesameAILabs

audiocraft by facebookresearch