Podcast by artnoage

AI-powered system for podcast creation from academic texts

Created 1 year ago

264 stars

Top 96.8% on SourcePulse

Project Summary

This project provides an AI-powered system for creating podcasts from academic PDFs, targeting researchers and content creators. It automates the pipeline from text extraction and summarization through dialogue generation and audio synthesis, and adds a self-improvement loop that optimizes its prompts from user feedback.

How It Works

The system extracts text from PDF inputs using OCR, then passes it through AI agents for summarization, scriptwriting (a host/guest dialogue), and enhancement (banter and flow). Text-to-speech (TTS) then renders the speakers in distinct voices. A key feature is the use of TextGrad with weight clipping to optimize prompts based on user feedback; the clipping is intended to keep prompt evolution stable and meaningful. This creates a continuous improvement cycle in which each new set of prompts is versioned by timestamp. Simulation and evaluation scripts are included to assess and validate the self-improvement process.
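The timestamp-based prompt versioning described above could look roughly like the following. This is a minimal sketch, not the project's code: the function names `save_prompt_version` and `load_latest_prompts`, the JSON file layout, and the `prompts_<timestamp>.json` naming scheme are all assumptions made for illustration.

```python
from __future__ import annotations

import json
from datetime import datetime, timezone
from pathlib import Path


def save_prompt_version(prompt_dir: Path, prompts: dict[str, str],
                        now: datetime | None = None) -> Path:
    """Write a prompt set to a new file named after a UTC timestamp.

    Hypothetical helper: the file name and JSON layout are assumptions.
    """
    now = now or datetime.now(timezone.utc)
    prompt_dir.mkdir(parents=True, exist_ok=True)
    # Fixed-width timestamp, so file names sort chronologically as strings.
    stamp = now.strftime("%Y%m%dT%H%M%S%fZ")
    path = prompt_dir / f"prompts_{stamp}.json"
    path.write_text(json.dumps(prompts, indent=2), encoding="utf-8")
    return path


def load_latest_prompts(prompt_dir: Path) -> dict[str, str]:
    """Return the most recent prompt set stored in prompt_dir."""
    versions = sorted(prompt_dir.glob("prompts_*.json"))
    if not versions:
        raise FileNotFoundError(f"no prompt versions found in {prompt_dir}")
    return json.loads(versions[-1].read_text(encoding="utf-8"))
```

Because older files are never overwritten, a prompt update that makes the output worse can be rolled back by deleting or ignoring the newest file.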

Quick Start & Requirements

Install: conda create -n podcast python=3.12 -y && conda activate podcast && conda install pip -y, then pip install -r requirements.txt and pip install uvicorn. For frontend: cd frontend && npm install.
Prerequisites: Python 3.12, Rust (for jiter), Node.js/npm, OpenAI API key.
Setup: Set the OPENAI_API_KEY environment variable, either in the shell or in a .env file.
Docs: https://www.metaskepsis.com/ (demo)
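As an illustration of the setup step, the sketch below resolves OPENAI_API_KEY from the process environment first and falls back to a .env file. It uses only the standard library; the helper name `load_openai_api_key` and the simple KEY=VALUE parsing are assumptions, and the project itself may load the key differently (for example through a dotenv package).

```python
import os
from pathlib import Path


def load_openai_api_key(env_file: str = ".env") -> str:
    """Return OPENAI_API_KEY from the environment, else from a .env file.

    Hypothetical helper for illustration; not taken from the project.
    """
    key = os.environ.get("OPENAI_API_KEY")
    if key:
        return key

    path = Path(env_file)
    if path.is_file():
        for line in path.read_text(encoding="utf-8").splitlines():
            line = line.strip()
            # Skip blank lines, comments, and lines without an assignment.
            if not line or line.startswith("#") or "=" not in line:
                continue
            name, _, value = line.partition("=")
            value = value.strip().strip("'\"")
            if name.strip() == "OPENAI_API_KEY" and value:
                return value

    raise RuntimeError("OPENAI_API_KEY is not set in the environment or in .env")
```

Failing fast at startup when the key is missing avoids discovering the problem only after OCR and summarization have already been attempted.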

Highlighted Details

AI agents for summarization, scriptwriting, and dialogue enhancement.
TextGrad with weight clipping for prompt optimization and stable learning.
Timestamped version control for prompts and generated content.
Simulation and evaluation scripts for assessing self-improvement.
Integrated React frontend and FastAPI backend for a user-friendly interface.

Maintenance & Community

The project welcomes collaboration, particularly on self-improving prompts and local TTS solutions. Contributions can be made via GitHub issues and pull requests.

Licensing & Compatibility

The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The system relies on OpenAI's API, so usage may incur costs. The README mentions a "weight clipping" concept inspired by gradient clipping but does not detail its implementation or its effect on prompt quality. Allocating feedback across multiple agents is noted as a complex area that needs further refinement.
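Since the README does not say how clipping is implemented, the following is only one possible reading of the idea, not the project's method. By analogy with gradient clipping, which caps the size of a parameter update, this sketch caps how much of a prompt may change in a single update and keeps the old prompt when a proposed rewrite exceeds that cap. The function name `clip_prompt_update`, the character-level similarity measure, and the `max_change` threshold are all assumptions.

```python
from difflib import SequenceMatcher


def clip_prompt_update(old_prompt: str, proposed_prompt: str,
                       max_change: float = 0.2) -> str:
    """Accept a proposed prompt only if it stays close to the old one.

    Hypothetical interpretation of "clipping" for text: the change is
    measured as 1 - similarity ratio, and oversized updates are rejected.
    """
    similarity = SequenceMatcher(None, old_prompt, proposed_prompt,
                                 autojunk=False).ratio()
    change = 1.0 - similarity
    return proposed_prompt if change <= max_change else old_prompt
```

A small edit such as changing the final punctuation of a prompt would pass through, while a complete rewrite would be rejected and the previous prompt retained.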

Explore Similar Projects

voquill by josiahsrc

opensource_notebooklm by satvik314

smol-podcaster by FanaHOVA

PodCastLM by YOYZHANG

payload-ai by ashbuilds

paper_to_podcast by Azzedde

meetingmind by misbahsy

Local-NotebookLM by Goekdeniz-Guelmez

PDF2Audio by lamm-mit

pdf-to-podcast by NVIDIA-AI-Blueprints

pdf-to-podcast by knowsuchagency

podcastfy by souzatharsis