Podcast by artnoage

AI-powered system for podcast creation from academic texts

Created 11 months ago
261 stars

Top 97.5% on SourcePulse

Project Summary

This project provides an AI-powered system for creating podcasts from academic PDFs, targeting researchers and content creators. It automates the process from text extraction and summarization to dialogue generation and audio synthesis, with a unique self-improvement loop for prompt optimization.
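The stages named above can be pictured as a chain of functions. The following is only an illustrative sketch: the class name PodcastPipeline and every stage name are hypothetical and are not taken from the project's code. Each stage is passed in as a plain callable so that the order of operations is the only thing the sketch commits to.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# A script is a list of (speaker, line) pairs, e.g. ("host", "Welcome!").
Turn = Tuple[str, str]


@dataclass
class PodcastPipeline:
    """Hypothetical container for the five stages described in the summary."""

    extract_text: Callable[[str], str]            # PDF path -> raw text
    summarize: Callable[[str], str]               # raw text -> summary
    write_script: Callable[[str], List[Turn]]     # summary -> host/guest dialogue
    enhance: Callable[[List[Turn]], List[Turn]]   # dialogue -> polished dialogue
    synthesize: Callable[[List[Turn]], bytes]     # dialogue -> audio bytes

    def run(self, pdf_path: str) -> bytes:
        # Run the stages strictly in order, feeding each output to the next stage.
        text = self.extract_text(pdf_path)
        summary = self.summarize(text)
        script = self.write_script(summary)
        script = self.enhance(script)
        return self.synthesize(script)
```

In a real setup the callables would wrap OCR, LLM, and TTS calls; here they can be replaced by simple stand-ins to check the wiring.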

How It Works

The system extracts text from PDF input using OCR, then passes it through AI agents for summarization, scriptwriting (a host/guest dialogue), and enhancement (banter and flow). TTS then renders the host and guest in distinct voices. User feedback drives prompt optimization through TextGrad with weight clipping, which is intended to keep prompt evolution stable and meaningful. Each optimized prompt is saved as a new version identified by its timestamp, forming a continuous improvement cycle. Simulation and evaluation scripts assess and validate this self-improvement process.
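The README does not describe how weight clipping is implemented, so the following is a sketch of one possible reading rather than the project's method. It assumes that "clipping" means capping how many words a single optimization step may change in a prompt. The function name clip_prompt_update and the max_changed_words parameter are hypothetical; only Python's standard difflib module is used, and TextGrad itself is not called.

```python
import difflib


def clip_prompt_update(old: str, proposed: str, max_changed_words: int = 10) -> str:
    """Apply word-level edits from `proposed` to `old` within a fixed budget.

    Edits are considered left to right. An edit is accepted only if its cost
    (the larger of the words removed and the words added) fits in the
    remaining budget; otherwise the original words are kept for that span.
    """
    old_words = old.split()
    new_words = proposed.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)

    result = []
    budget = max_changed_words
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            result.extend(old_words[i1:i2])
            continue
        cost = max(i2 - i1, j2 - j1)
        if cost <= budget:
            result.extend(new_words[j1:j2])  # accept this edit
            budget -= cost
        else:
            result.extend(old_words[i1:i2])  # over budget: keep the original span
    return " ".join(result)
```

With a budget of 0 the prompt is returned unchanged (apart from whitespace normalization), and with a large budget the proposed prompt is accepted in full. Values in between let the prompt drift only gradually, which is the stabilizing effect that gradient clipping has on numeric weights.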

Quick Start & Requirements

  • Install: create and activate a conda environment with conda create -n podcast python=3.12 -y && conda activate podcast && conda install pip -y, then run pip install -r requirements.txt and pip install uvicorn. For the frontend, run cd frontend && npm install.
  • Prerequisites: Python 3.12, Rust (for jiter), Node.js/npm, OpenAI API key.
  • Setup: Set the OPENAI_API_KEY environment variable, either directly in the shell or in a .env file.
  • Demo: https://www.metaskepsis.com/
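The two setup options for OPENAI_API_KEY can be sketched as a small lookup that checks the environment first and falls back to a .env file. This is a standard-library-only illustration; the function name load_api_key is hypothetical, and the project itself may load the key differently (for example through a dotenv package).

```python
import os
from pathlib import Path


def load_api_key(env_file: str = ".env", environ=None) -> str:
    """Return OPENAI_API_KEY from the environment, else from a .env file."""
    environ = os.environ if environ is None else environ

    key = environ.get("OPENAI_API_KEY")
    if key:
        return key

    path = Path(env_file)
    if path.is_file():
        for line in path.read_text(encoding="utf-8").splitlines():
            line = line.strip()
            # Skip blank lines, comments, and lines without NAME=value form.
            if not line or line.startswith("#") or "=" not in line:
                continue
            name, _, value = line.partition("=")
            if name.strip() == "OPENAI_API_KEY":
                # Drop optional surrounding quotes around the value.
                return value.strip().strip('"').strip("'")

    raise RuntimeError("OPENAI_API_KEY is not set in the environment or in .env")
```

Failing early with a clear error is a deliberate choice here: a missing key is easier to diagnose at startup than in the middle of a summarization request.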

Highlighted Details

  • AI agents for summarization, scriptwriting, and dialogue enhancement.
  • TextGrad with weight clipping for prompt optimization and stable learning.
  • Timestamped version control for prompts and generated content.
  • Simulation and evaluation scripts for assessing self-improvement.
  • Integrated React frontend and FastAPI backend for a user-friendly interface.
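Timestamped versioning of prompts, as listed above, could look roughly like the sketch below. The file layout (one text file per version, named agent_YYYYMMDD_HHMMSS.txt) and the function names save_prompt_version and load_latest_prompt are assumptions made for illustration, not the project's actual storage scheme.

```python
from datetime import datetime, timezone
from pathlib import Path


def save_prompt_version(prompt: str, directory: str, agent: str, now=None) -> Path:
    """Write `prompt` to a new file whose name carries a UTC timestamp."""
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y%m%d_%H%M%S")  # zero-padded, so names sort chronologically
    path = Path(directory) / f"{agent}_{stamp}.txt"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(prompt, encoding="utf-8")
    return path


def load_latest_prompt(directory: str, agent: str) -> str:
    """Return the most recent prompt saved for `agent`."""
    candidates = sorted(Path(directory).glob(f"{agent}_*.txt"))
    if not candidates:
        raise FileNotFoundError(f"no saved prompts for agent {agent!r}")
    return candidates[-1].read_text(encoding="utf-8")
```

Because older files are never overwritten, a prompt that performs worse after an optimization step can be rolled back by loading an earlier version.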

Maintenance & Community

The project welcomes collaboration, particularly on self-improving prompts and local TTS solutions. Contributions can be made via GitHub issues and pull requests.

Licensing & Compatibility

The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The system relies on OpenAI's API, so usage may incur costs. The README describes "weight clipping" as a concept inspired by gradient clipping but does not detail its implementation or its impact on prompt quality. Allocating feedback across multiple agents is noted as a complex area that needs further refinement.

Health Check

  • Last commit: 11 months ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 2 stars in the last 30 days
