personalized-podcast by zarazhangrui

AI podcast generator from any content

Created 3 months ago

397 stars

Top 72.3% on SourcePulse

Project Summary

This project provides a "coding agent skill" that transforms any text-based content into a personalized, two-host AI podcast. It targets users who prefer audio consumption of information, individuals seeking self-reflection through AI analysis of their personal data, and developers interested in automated podcast generation. The primary benefit is the ability to listen to newsletters, research papers, meeting notes, or even personal documents on the go, with customizable hosts, voices, and scripts, delivered directly to standard podcast apps via an RSS feed.

How It Works

A coding agent reads the input content (pasted text, files, or URLs) and writes a natural-sounding conversational script for two distinct AI hosts. Fish Audio handles text-to-speech synthesis and offers a large library of voices. The generated audio segments are then stitched together with pydub and ffmpeg, with pacing pauses and fade effects. Apart from the calls to the Fish Audio API, the pipeline runs locally, so there is no separate backend or hosted service to deploy. Customization happens in two files: PROMPT.md controls script behavior, and config.yaml sets host personalities, voices, and show parameters.

Quick Start & Requirements

Install: Clone the repository into your coding agent's skills directory with the GitHub CLI (the path shown is the Claude Code skills directory): gh repo clone zarazhangrui/personalized-podcast-skill ~/.claude/skills/personalized-podcast
Run: Execute the podcast command within your agent: /podcast <paste content, point to files, or describe a topic>
Prerequisites:
- A coding agent supporting skills (e.g., Claude Code, Gemini CLI, Copilot CLI).
- Python 3.10+.
- ffmpeg (install via brew install ffmpeg on macOS).
- A Fish Audio account (free tier available) and an API key stored in a local .env file.
Setup: The first run automatically sets up the Python environment and installs dependencies.
Resources: Voice discovery available at fish.audio/discovery.
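
For readers unfamiliar with .env files, the sketch below shows one way a script could read the API key at startup. It is an illustration only: the variable name FISH_API_KEY and both helper functions are assumptions, and the skill may use a different name or a library such as python-dotenv instead.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse simple KEY=VALUE lines into a dict, skipping blanks and comments."""
    env_path = Path(path)
    if not env_path.exists():
        return {}
    values = {}
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def get_api_key(path=".env", name="FISH_API_KEY"):
    """Prefer the process environment, then fall back to the .env file.

    FISH_API_KEY is a hypothetical variable name used for this sketch.
    """
    key = os.environ.get(name) or load_env_file(path).get(name)
    if not key:
        raise RuntimeError(f"{name} is not set in the environment or in {path}")
    return key
```

Keeping the key in .env (and out of version control) means it never has to appear in config.yaml or in the agent prompt.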

Highlighted Details

Versatile Content Ingestion: Accepts direct text input, file paths (e.g., .txt, .md, .pdf), and URLs.
Unique Self-Reflection Use Case: Enables AI hosts to analyze personal content like resumes or journal entries, offering insights into the user's thinking and communication patterns.
Highly Customizable Formats: Supports various show structures including debates, eavesdropping analyses, interviews, solo monologues, and news roundups via prompt engineering.
Extensive Voice and Tone Options: Users can select from millions of voices on Fish Audio and define the show's personality (e.g., serious, casual, confrontational) in the configuration.
Automatic RSS Feed Generation: Optionally creates a GitHub Pages-hosted RSS feed for seamless integration with popular podcast applications like Apple Podcasts and Spotify.
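
To make the RSS step concrete, here is a minimal sketch of building an RSS 2.0 feed with one audio enclosure per episode, using only the Python standard library. The function build_feed, the episode dictionary keys, and the example URLs are assumptions for illustration, not the project's actual feed generator.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_feed(show_title, site_url, description, episodes):
    """Return an RSS 2.0 document (as a string) listing the given episodes.

    Each episode is a dict with hypothetical keys:
    title, url, size_bytes, published (a timezone-aware datetime).
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = show_title
    ET.SubElement(channel, "link").text = site_url
    ET.SubElement(channel, "description").text = description
    for ep in episodes:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = ep["title"]
        ET.SubElement(item, "guid").text = ep["url"]
        # RSS expects RFC 822 style dates, which format_datetime produces.
        ET.SubElement(item, "pubDate").text = format_datetime(ep["published"])
        # The enclosure element is what podcast apps download and play.
        ET.SubElement(
            item,
            "enclosure",
            url=ep["url"],
            length=str(ep["size_bytes"]),
            type="audio/mpeg",
        )
    return ET.tostring(rss, encoding="unicode")
```

A feed file like this, committed next to the MP3 files in a GitHub Pages site, gives podcast apps a stable URL to subscribe to. Podcast directories usually expect additional tags (for example from the iTunes namespace), which this sketch leaves out.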

Maintenance & Community

The project is built by Zara Zhang. No specific details regarding additional contributors, community channels (like Discord or Slack), or a public roadmap are provided in the README.

Licensing & Compatibility

The open-source license is not explicitly stated in the README. This omission creates ambiguity regarding commercial use and derivative works. The project is designed for local execution via a coding agent.

Limitations & Caveats

The project depends on two external pieces: a compatible coding agent and the Fish Audio TTS service. If either is unavailable or changes incompatibly, the skill cannot run. The missing license is a further caveat for adoption, particularly in commercial or collaborative contexts, because the usage terms are unclear.

Health Check

Last Commit: 2 months ago

Responsiveness: Inactive

Star History

35 stars in the last 30 days