smol-podcaster by FanaHOVA

Podcast production agent

created 2 years ago
348 stars

Top 80.9% on sourcepulse

Project Summary

smol-podcaster automates podcast production tasks, including transcription, chapter generation, and content ideation for social media. It targets podcast creators and researchers seeking to streamline post-production workflows. The primary benefit is the automated generation of structured content from raw audio files.

How It Works

The tool uses large language models (LLMs) for transcription, speaker diarization, chapter creation, and content summarization. It processes audio files, extracts key information, and formats it into show notes, chapter lists, and promotional content. It can be run from a CLI or through a web UI; the web UI hands work to Celery as background tasks, so several episodes can be processed in parallel and jobs can be started remotely.
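To make the formatting step concrete, here is a minimal sketch of turning diarized segments into a labeled transcript. The Segment shape, the function names, and the [HH:MM:SS] output format are assumptions for illustration, not the project's actual schema or code.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One diarized chunk of speech (hypothetical shape, not the project's schema)."""
    speaker: str
    start: float  # seconds from the beginning of the audio
    text: str


def format_timestamp(seconds: float) -> str:
    """Render a position in seconds as [HH:MM:SS]."""
    hours, remainder = divmod(int(seconds), 3600)
    minutes, secs = divmod(remainder, 60)
    return f"[{hours:02d}:{minutes:02d}:{secs:02d}]"


def render_transcript(segments: list[Segment]) -> str:
    """Merge consecutive segments from the same speaker and label each turn."""
    turns: list[Segment] = []
    for seg in segments:
        if turns and turns[-1].speaker == seg.speaker:
            # Same speaker keeps talking: extend the current turn, keep its start time.
            previous = turns[-1]
            turns[-1] = Segment(previous.speaker, previous.start, f"{previous.text} {seg.text}")
        else:
            turns.append(seg)
    return "\n\n".join(
        f"{turn.speaker} {format_timestamp(turn.start)}: {turn.text}" for turn in turns
    )
```

Merging consecutive segments keeps the transcript readable when the diarizer splits one speaker's turn into many short chunks.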

Quick Start & Requirements

  • Install dependencies: pip install -r requirements.txt
  • Configure API keys in .env file.
  • Run via CLI: python smol_podcaster.py AUDIO_FILE_URL GUEST_NAME NUMBER_OF_SPEAKERS
  • Run with Web UI + Celery: Requires a message broker (e.g., RabbitMQ). Either start everything with honcho start, or start the worker (celery -A tasks worker --loglevel=INFO) and the web app (flask --app web.py --debug run) in separate terminals. Access the UI at http://localhost:5000.
  • Prerequisites: Python 3.x, a message broker for Celery, and API keys for LLM services.
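The CLI invocation above takes three positional arguments. As a rough illustration of how such an interface can be parsed, here is a sketch using argparse; the argument names and help strings are assumptions derived from the usage line, and the project's own script may parse its arguments differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented usage line (illustrative only)."""
    parser = argparse.ArgumentParser(prog="smol_podcaster.py")
    parser.add_argument("audio_file_url", help="URL of the raw audio file")
    parser.add_argument("guest_name", help="name of the guest, used in speaker labels")
    parser.add_argument(
        "number_of_speakers",
        type=int,  # rejects non-numeric input before any API call is made
        help="how many distinct speakers the diarizer should expect",
    )
    return parser


# Parse an explicit argument list here so the sketch runs without a real command line.
args = build_parser().parse_args(["https://example.com/episode.mp3", "Jane Doe", "2"])
print(args.guest_name, args.number_of_speakers)
```

Declaring the speaker count as an integer means a typo fails fast at the command line rather than partway through a paid transcription job.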

Highlighted Details

  • Generates diarized transcripts with speaker labels and timestamps.
  • Creates chapter lists with timestamps; when the audio and video versions of an episode are out of sync, chapters can be realigned by matching text via string similarity.
  • Provides title ideas and tweet suggestions based on podcast content.
  • Offers an "Edit Show Notes" feature for consolidating and refining LLM-generated content.
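The string-similarity realignment mentioned in the list above can be sketched with the standard library's difflib. This is an illustration of the general technique under stated assumptions, not the project's implementation: the function names, the (title, anchor text) chapter shape, and the 0.6 threshold are all hypothetical.

```python
from difflib import SequenceMatcher
from typing import Optional


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def resync_chapters(
    chapters: list[tuple[str, str]],
    video_segments: list[tuple[float, str]],
    threshold: float = 0.6,
) -> list[tuple[str, Optional[float]]]:
    """Map each chapter to the start time of the best-matching video segment.

    chapters: (title, anchor_text) pairs taken from the audio transcript.
    video_segments: (start_seconds, text) pairs from the video transcript.
    A chapter whose best match scores below `threshold` gets None, so a
    human can place it manually instead of trusting a weak match.
    """
    resynced: list[tuple[str, Optional[float]]] = []
    for title, anchor_text in chapters:
        best_start, best_text = max(
            video_segments, key=lambda seg: similarity(anchor_text, seg[1])
        )
        score = similarity(anchor_text, best_text)
        resynced.append((title, best_start if score >= threshold else None))
    return resynced
```

Returning None for weak matches reflects the caveat that heavy transcription errors can make the nearest match wrong; the threshold would need tuning on real transcripts.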

Maintenance & Community

The project is maintained by FanaHOVA. No specific community channels or roadmap details are provided in the README.

Licensing & Compatibility

  • License: MIT.
  • Compatibility: The permissive MIT license allows commercial use and integration into closed-source projects.

Limitations & Caveats

The project requires external LLM API keys, plus a message broker when tasks are run through the web UI with Celery. Audio/video timestamp synchronization relies on string similarity, so chapters may be misaligned when the transcript contains significant errors.

Health Check

  • Last commit: 10 months ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 6 stars in the last 90 days
