Zero-shot podcast generation system
MoonCast is an open-source system for high-quality, zero-shot podcast generation, aiming to advance human-like speech synthesis. It targets researchers and developers interested in voice technology, enabling the creation of natural and expressive synthetic voices.
How It Works
Script generation is a two-stage process driven by LLM prompts, using Gemini 2.0 Pro Experimental 02-05 to produce conversational dialogue with broad topic coverage. Audio generation uses publicly available podcast segments as prompts.
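A two-stage, prompt-chained script generator can be sketched roughly as follows. This is an illustration only, not the project's code: the prompt wording, the `generate_script` helper, and the split into a "brief" stage followed by a "dialogue" stage are all assumptions, since the summary does not say what each stage produces. The LLM is passed in as a plain callable, so no particular client API is implied.

```python
from typing import Callable, List, Tuple

# Hypothetical prompt templates; the project's real prompts are not shown here.
BRIEF_PROMPT = "Summarize the key points of the following text for a podcast:\n\n{source}"
SCRIPT_PROMPT = (
    "Write a conversation between two hosts, one line per turn, "
    "each line formatted as 'SPEAKER: text', covering this brief:\n\n{brief}"
)


def generate_script(source: str, llm: Callable[[str], str]) -> List[Tuple[str, str]]:
    """Run two chained LLM calls and parse the result into (speaker, text) turns."""
    # Stage 1 (assumed): condense the source material into a short brief.
    brief = llm(BRIEF_PROMPT.format(source=source))
    # Stage 2 (assumed): expand the brief into a two-speaker dialogue.
    raw = llm(SCRIPT_PROMPT.format(brief=brief))

    turns: List[Tuple[str, str]] = []
    for line in raw.splitlines():
        speaker, sep, text = line.partition(":")
        # Keep only lines that look like "SPEAKER: text"; skip blanks and stray prose.
        if sep and text.strip():
            turns.append((speaker.strip(), text.strip()))
    return turns
```

Any function that maps a prompt string to a completion string can be supplied as `llm`, which also makes the chain easy to exercise with a stub before wiring in a real model.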
Quick Start & Requirements
Create a conda environment (conda create -n mooncast -y python=3.10), activate it (conda activate mooncast), and install dependencies (pip install -r requirements.txt, pip install flash-attn --no-build-isolation, pip install huggingface_hub, pip install gradio==5.22.0).
Download the pretrained weights with python download_pretrain.py.
A CUDA-capable GPU is required (implied by flash-attn and CUDA_VISIBLE_DEVICES), along with flash-attn.
Run CUDA_VISIBLE_DEVICES=0 python app.py for a Gradio interface.
Highlighted Details
Maintenance & Community
The project welcomes contributions from anyone interested in improving code, documentation, or providing feedback. Contact information is provided for audio file usage concerns.
Licensing & Compatibility
The project is intended for research purposes only. Redistribution of original or generated audio is strictly prohibited. Users must comply with applicable laws and ethical guidelines.
Limitations & Caveats
The audio prompts are sourced from publicly available podcast segments and are for demonstration purposes only. The project is intended strictly for research, and users are responsible for using it appropriately.