MoonCast by jzq2000

Zero-shot podcast generation system

Created 11 months ago

345 stars

Top 80.6% on SourcePulse

Project Summary

MoonCast is an open-source system for high-quality, zero-shot podcast generation, aiming to advance human-like speech synthesis. It targets researchers and developers interested in voice technology, enabling the creation of natural and expressive synthetic voices.

How It Works

Script generation is a two-stage process driven by LLM prompts; the project uses Gemini 2.0 Pro Experimental 02-05 to produce conversational dialogue across a broad range of topics. Audio generation uses publicly available podcast segments as audio prompts.
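As a rough illustration of a two-stage prompt chain, the sketch below runs two LLM calls in sequence. What each stage does is an assumption here (first condense the source into a brief, then expand the brief into dialogue), and the prompt templates, generate_script, and echo_llm are hypothetical names, not the project's own prompts or API. The model is passed in as a plain callable so that no particular client library is assumed.

```python
from typing import Callable

# Hypothetical prompt templates; the project's real prompts are not reproduced here.
BRIEF_PROMPT = "Summarize the key points of the following material for a podcast:\n{source}"
SCRIPT_PROMPT = "Write a two-speaker podcast dialogue based on this brief:\n{brief}"


def generate_script(source: str, llm: Callable[[str], str]) -> str:
    """Run two LLM calls in sequence: source -> brief -> dialogue script."""
    # Stage 1 (assumed): condense the source text into a brief.
    brief = llm(BRIEF_PROMPT.format(source=source))
    # Stage 2 (assumed): expand the brief into conversational dialogue.
    return llm(SCRIPT_PROMPT.format(brief=brief))


def echo_llm(prompt: str) -> str:
    """Stand-in for a real model call; returns the last line of the prompt."""
    return prompt.splitlines()[-1]
```

Calling generate_script(text, echo_llm) exercises the chain without network access; a real deployment would replace echo_llm with a function that sends the prompt to the chosen model and returns its text.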

Quick Start & Requirements

Installation: Create a conda environment (conda create -n mooncast -y python=3.10), activate it (conda activate mooncast), and install dependencies (pip install -r requirements.txt, pip install flash-attn --no-build-isolation, pip install huggingface_hub, pip install gradio==5.22.0).
Pretrained Weights: Download via python download_pretrain.py.
Requirements: Python 3.10, CUDA (implied by flash-attn and CUDA_VISIBLE_DEVICES), flash-attn.
Demo: A HuggingFace space is available for testing audio generation.
Local UI: Run CUDA_VISIBLE_DEVICES=0 python app.py for a Gradio interface.
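Before launching the local UI, it can help to confirm that the environment matches the requirements listed above. The following is a minimal sketch, not part of the project: preflight is a hypothetical helper, and the module names flash_attn, gradio, and huggingface_hub are the import names of the packages installed in the steps above. It also flags environment variables that look like a misspelling of CUDA_VISIBLE_DEVICES, since a misspelled name has no effect on GPU selection.

```python
import importlib.util
import os
import sys

# Import names of the packages installed during setup.
REQUIRED_MODULES = ("flash_attn", "gradio", "huggingface_hub")


def preflight(env=None, version=None, modules=REQUIRED_MODULES):
    """Return a list of problems found; an empty list means none were detected."""
    env = os.environ if env is None else env
    version = tuple(sys.version_info[:2]) if version is None else tuple(version)
    problems = []
    if version != (3, 10):
        problems.append(f"Python 3.10 expected, found {version[0]}.{version[1]}")
    # Catch near-miss spellings, which are silently ignored at runtime.
    for name in env:
        if name.startswith("CUDA_VISIBLE_") and name != "CUDA_VISIBLE_DEVICES":
            problems.append(f"unrecognized variable {name}")
    for module in modules:
        if importlib.util.find_spec(module) is None:
            problems.append(f"module {module} is not installed")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("warning:", problem)
```

Running the file inside the activated mooncast environment prints one warning per detected problem and nothing if the checks pass.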

Highlighted Details

Zero-shot podcast generation.
Gemini 2.0 Pro Experimental 02-05 for script generation.
Gradio-based UI for audio generation.
Focus on natural and expressive synthetic voices.

Maintenance & Community

The project welcomes contributions from anyone interested in improving code, documentation, or providing feedback. Contact information is provided for audio file usage concerns.

Licensing & Compatibility

The project is intended for research purposes only. Redistribution of original or generated audio is strictly prohibited. Users must comply with applicable laws and ethical guidelines.

Limitations & Caveats

The audio prompts are sourced from publicly available podcast segments and are for demonstration purposes only. The project is intended strictly for research, and users are responsible for how they use it.

Explore Similar Projects

Voice-Clone-Studio by FranckyB

PodCastLM by YOYZHANG

WavJourney by Audio-AGI

ComfyUI_IndexTTS by billwuhao

SonicVale by xcLee001

MOSS-TTSD by OpenMOSS

Twocast by panyanyany

FireRedTTS2 by FireRedTeam

podcastfy by souzatharsis

Zonos by Zyphra

Qwen3-TTS by QwenLM

csm by SesameAILabs