AI-powered system for podcast creation from academic texts
Top 97.5% on SourcePulse
This project provides an AI-powered system for creating podcasts from academic PDFs, targeting researchers and content creators. It automates the process from text extraction and summarization to dialogue generation and audio synthesis, with a unique self-improvement loop for prompt optimization.
How It Works
The system extracts text from PDF inputs using OCR, then passes it through a chain of AI agents: one summarizes the paper, one writes a host/guest dialogue script, and one enhances the script with banter and smoother flow. A text-to-speech (TTS) stage then renders the speakers in distinct voices. The distinguishing feature is prompt optimization with TextGrad, driven by user feedback and constrained by "weight clipping", which is intended to keep each prompt revision small enough that prompts evolve in a stable, meaningful way. Each new prompt is saved as a timestamped version, so the system improves continuously while keeping a history. Simulation and evaluation scripts are included to assess and validate this self-improvement process.
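The clipping and versioning ideas can be illustrated with a minimal sketch. This is not the project's implementation, which the README does not describe, and it does not use TextGrad. It assumes that "clipping" means capping how many words of a prompt a single update may change, and that versions are files named by UTC timestamp. The names clip_prompt_update, version_name, and max_change are hypothetical.

```python
import difflib
from datetime import datetime, timezone
from typing import Optional


def clip_prompt_update(old: str, proposed: str, max_change: float = 0.3) -> str:
    """Apply word-level edits from `proposed` until a change budget runs out.

    Assumption: the budget is max_change * (number of words in `old`).
    Edits that would exceed the remaining budget are skipped, so the old
    wording is kept for that span.
    """
    old_words, new_words = old.split(), proposed.split()
    budget = int(max_change * len(old_words))
    matcher = difflib.SequenceMatcher(None, old_words, new_words, autojunk=False)
    result = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        cost = max(i2 - i1, j2 - j1)  # words touched by this edit
        if tag == "equal":
            result.extend(old_words[i1:i2])
        elif cost <= budget:
            result.extend(new_words[j1:j2])  # accept the edit
            budget -= cost
        else:
            result.extend(old_words[i1:i2])  # over budget: keep old text
    return " ".join(result)


def version_name(agent: str, now: Optional[datetime] = None) -> str:
    """Build a timestamped file name for a new prompt version (hypothetical scheme)."""
    now = now or datetime.now(timezone.utc)
    return f"{agent}_{now:%Y%m%d_%H%M%S}.txt"


if __name__ == "__main__":
    old = "a b c d e f g h i j"
    proposed = "a X Y d e Z W V U j"
    # Budget is 3 words: the 2-word edit fits, the 4-word edit is skipped.
    print(clip_prompt_update(old, proposed))
    print(version_name("summarizer"))
```

Skipping over-budget edits rather than rejecting the whole proposal lets small, local improvements through while blocking wholesale rewrites, which is one plausible reading of a gradient-clipping analogy for text.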
Quick Start & Requirements
conda create -n podcast python=3.12 -y && conda activate podcast && conda install pip -y, then pip install -r requirements.txt and pip install uvicorn. For frontend: cd frontend && npm install.
Prerequisites: … (jiter), Node.js/npm, OpenAI API key.
Set the key through the OPENAI_API_KEY environment variable or via a .env file.
Highlighted Details
Maintenance & Community
The project welcomes collaboration, particularly on self-improving prompts and local TTS solutions. Contributions can be made via GitHub issues and pull requests.
Licensing & Compatibility
The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.
Limitations & Caveats
The system relies on OpenAI's API, incurring potential costs. The README mentions a "weight clipping" concept inspired by gradient clipping but doesn't detail its implementation or impact on prompt quality. Feedback allocation across multiple agents is noted as a complex area requiring further refinement.
11 months ago
Inactive