SlideBot-AI  by tonyqinatcmu

AI presentation generator from diverse inputs

Created 3 weeks ago

603 stars

Top 54.2% on SourcePulse

Project Summary

SlideBot AI generates presentation slides from text, documents, or audio. It targets users who need to produce polished slides quickly, and it uses AI to structure content, propose a design, and generate visuals. The main benefits are time saved on slide creation and higher presentation quality.

How It Works

The platform pairs a FastAPI backend (Python 3.10+) with a React frontend. Google Gemini handles text and image generation, and iFlytek optionally provides speech-to-text. The workflow runs in four stages: input processing, AI-generated outline creation (the outline is editable), design proposal, and per-slide image generation that incorporates user-uploaded assets.
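The four stages above can be sketched as plain Python functions. This is only an illustration of the data flow, not the project's code: every name here (`Slide`, `process_input`, `draft_outline`, `propose_design`, `build_image_prompts`) is hypothetical, and the AI calls are replaced with simple local stand-ins so the sketch runs without an API key.

```python
from dataclasses import dataclass, field


@dataclass
class Slide:
    """One outline entry; hypothetical structure, not the project's schema."""
    title: str
    bullets: list[str] = field(default_factory=list)
    image_prompt: str = ""


def process_input(raw_text: str) -> list[str]:
    """Stage 1: split raw text into non-empty paragraphs."""
    return [p.strip() for p in raw_text.split("\n\n") if p.strip()]


def draft_outline(paragraphs: list[str], slide_count: int) -> list[Slide]:
    """Stage 2: stand-in for the AI outline step (one slide per paragraph).

    A real implementation would call a text model here; the result is a
    plain list so a user could edit it before the next stage.
    """
    slides = []
    for para in paragraphs[:slide_count]:
        first_sentence = para.split(".")[0].strip()
        slides.append(Slide(title=first_sentence, bullets=[para]))
    return slides


def propose_design(primary_color: str = "#1F4E79", font: str = "Inter") -> dict[str, str]:
    """Stage 3: stand-in for the design proposal (assumed color and font)."""
    return {"primary_color": primary_color, "font": font}


def build_image_prompts(slides: list[Slide], design: dict[str, str]) -> list[Slide]:
    """Stage 4: attach a per-slide prompt that an image model could consume."""
    for slide in slides:
        slide.image_prompt = (
            f"Slide titled '{slide.title}' using font {design['font']} "
            f"and accent color {design['primary_color']}"
        )
    return slides
```

Keeping the outline as an ordinary list between stages 2 and 3 is what would make the "editable outline" step straightforward: the user's edits are applied to the list before any design or image work starts.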

Quick Start & Requirements

  • Install: Clone the repository, install the backend dependencies (pip install -r requirements.txt), install the frontend dependencies (npm install), build the frontend (npm run build), and start the server (python server.py).
  • Prerequisites: Python 3.10+, Node.js 18+, Google Gemini API Key. Optional: iFlytek API keys.
  • Access: Once the server is running, open http://localhost:8001.
  • Online Demos: Available at http://223.6.255.214/ (China) and http://47.77.231.44/ (Overseas), requiring an invitation code.
  • API Key Acquisition: Google AI Studio for the Gemini key; iFlytek Open Platform for the optional iFlytek keys.
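Before starting the server, the prerequisites listed above can be checked with a short script. This is a minimal sketch under stated assumptions: the environment variable name `GEMINI_API_KEY` is hypothetical (the project may read its key from a different variable or a config file), and the Node.js check only confirms that `node` is on the PATH, not that it is version 18 or later.

```python
import os
import shutil
import sys


def missing_prerequisites(env, python_version, node_found):
    """Return a list of human-readable problems; an empty list means ready.

    env            -- mapping of environment variables (e.g. os.environ)
    python_version -- (major, minor) tuple of the running interpreter
    node_found     -- True if a `node` executable was located on PATH
    """
    problems = []
    if tuple(python_version) < (3, 10):
        problems.append("Python 3.10+ is required")
    if not node_found:
        problems.append("Node.js 18+ is required (node not found on PATH)")
    # Assumed variable name; adjust to whatever the project actually reads.
    if not env.get("GEMINI_API_KEY"):
        problems.append("Google Gemini API key is not set")
    return problems


if __name__ == "__main__":
    issues = missing_prerequisites(
        os.environ,
        sys.version_info[:2],
        shutil.which("node") is not None,
    )
    for issue in issues:
        print(f"missing: {issue}")
    if not issues:
        print("all prerequisites found")
```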

Highlighted Details

  • AI-Driven Workflow: Automates outline generation, design, and image creation from text, documents (PDF, Word, Excel), or audio.
  • Interactive Customization: Real-time editing of outlines and designs, with options for custom colors, fonts, logos, and slide counts.
  • Advanced Input: Supports uploading reference documents, spreadsheets for charts, and audio for transcription.
  • Output: Generates ZIP (images) or PDF exports, with optimized image compression.
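The ZIP export mentioned above can be approximated with Python's standard `zipfile` module. This is a sketch, not the project's exporter: the function name `bundle_slides` and the `slide_01.png` naming scheme are assumptions made for illustration.

```python
import io
import zipfile


def bundle_slides(images: list[bytes], extension: str = "png") -> bytes:
    """Pack already-encoded slide images into an in-memory ZIP archive.

    ZIP_STORED is used because PNG and JPEG data is already compressed,
    so a second deflate pass would save little space.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, mode="w", compression=zipfile.ZIP_STORED) as archive:
        for index, data in enumerate(images, start=1):
            # Zero-padded names keep slides in order when sorted by filename.
            archive.writestr(f"slide_{index:02d}.{extension}", data)
    return buffer.getvalue()
```

Returning bytes rather than writing to disk would let a web backend stream the archive directly in a download response.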

Maintenance & Community

  • Maintained by Shenlin (Tony) Qin and Jie Tang. Recent updates (Jan 2025) indicate active development.
  • Community support via GitHub Issues and Discussions.

Licensing & Compatibility

  • License: MIT License.
  • Compatibility: The MIT License permits commercial use and integration into closed-source projects, provided the copyright and license notice is retained.

Limitations & Caveats

Speech generation for speaker notes and style persistence are planned but not yet available. Core AI functionality requires external API keys (a Google Gemini key at minimum), and the online demos require an invitation code.

Health Check

  • Last Commit: 3 weeks ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 1
  • Star History: 612 stars in the last 26 days
