SlideBot-AI by tonyqinatcmu

AI presentation generator from diverse inputs

Created 5 months ago

1,214 stars

Top 31.7% on SourcePulse

Project Summary

SlideBot AI automates the generation of professional presentations from text, documents, or audio. It targets users who need to produce high-quality slides quickly, and it uses AI for content structuring, design, and visuals. The main benefits are time saved on slide creation and more polished output.

How It Works

The platform pairs a FastAPI backend (Python 3.10+) with a React frontend. Google Gemini handles text and image generation, and iFlytek optionally provides speech-to-text. The workflow has four stages: input processing, AI-generated outline creation (the outline is editable), a design proposal, and per-slide image generation that incorporates user-uploaded assets. Combining these services yields an interactive, end-to-end presentation generation pipeline.
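The four stages described above can be pictured as a chain of small functions. The following is a minimal sketch, not the project's actual code: every name in it (`Slide`, `Deck`, `process_input`, `draft_outline`, `propose_design`, `build_image_prompts`) is hypothetical, and the calls to Gemini are replaced by simple stand-in logic so the example runs offline.

```python
from dataclasses import dataclass, field


@dataclass
class Slide:
    title: str
    bullets: list[str]
    image_prompt: str = ""


@dataclass
class Deck:
    topic: str
    slides: list[Slide] = field(default_factory=list)
    design: dict[str, str] = field(default_factory=dict)


def process_input(raw_text: str) -> list[str]:
    """Stage 1: split raw input into non-empty paragraphs."""
    return [p.strip() for p in raw_text.split("\n\n") if p.strip()]


def draft_outline(topic: str, paragraphs: list[str]) -> Deck:
    """Stage 2: stand-in for the model-generated outline.

    Assumption: one slide per paragraph, first line is the title,
    remaining lines are bullets. A real system would ask a model.
    """
    slides = []
    for paragraph in paragraphs:
        lines = paragraph.splitlines()
        slides.append(Slide(title=lines[0], bullets=lines[1:]))
    return Deck(topic=topic, slides=slides)


def propose_design(deck: Deck, primary_color: str = "#1F4E79", font: str = "Inter") -> Deck:
    """Stage 3: attach a design proposal (hypothetical fields)."""
    deck.design = {"primary_color": primary_color, "font": font}
    return deck


def build_image_prompts(deck: Deck) -> Deck:
    """Stage 4: prepare one image prompt per slide (no image API is called here)."""
    for slide in deck.slides:
        slide.image_prompt = (
            f"Slide titled '{slide.title}' in color {deck.design['primary_color']} "
            f"with font {deck.design['font']}"
        )
    return deck


def run_pipeline(topic: str, raw_text: str) -> Deck:
    paragraphs = process_input(raw_text)
    deck = draft_outline(topic, paragraphs)
    # The outline is editable at this point, before design and images.
    deck = propose_design(deck)
    return build_image_prompts(deck)
```

Keeping each stage as a separate function mirrors the editable checkpoints in the workflow: the outline or design can be changed between calls before any per-slide image is requested.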

Quick Start & Requirements

Install: Clone the repository, install the backend dependencies (pip install -r requirements.txt), install the frontend dependencies (npm install), build the frontend (npm run build), and start the server (python server.py).
Prerequisites: Python 3.10+, Node.js 18+, Google Gemini API Key. Optional: iFlytek API keys.
Access: Local access via http://localhost:8001.
Online Demos: Available at http://223.6.255.214/ (China) and http://47.77.231.44/ (Overseas), requiring an invitation code.
API Key Acquisition: Google AI Studio; iFlytek Open Platform (讯飞开放平台, optional).
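Before running the install steps, it can help to confirm the prerequisites listed above. The following is a rough preflight sketch under stated assumptions: the environment variable name `GEMINI_API_KEY` is a guess (the project may read its key from a different variable or a config file), and the helper names are hypothetical.

```python
import os
import shutil
import subprocess
import sys


def parse_node_major(version_output: str) -> int:
    """Extract the major version from `node --version` output such as 'v18.17.0'."""
    return int(version_output.strip().lstrip("v").split(".")[0])


def check_prerequisites(env=None, node_version=None) -> list:
    """Return a list of human-readable problems; an empty list means all checks passed.

    `env` and `node_version` can be injected for testing. When `node_version`
    is None, the function looks for `node` on PATH and asks it for its version.
    """
    env = os.environ if env is None else env
    problems = []

    if sys.version_info < (3, 10):
        problems.append("Python 3.10+ is required")

    if node_version is None:
        node_path = shutil.which("node")
        if node_path is None:
            problems.append("Node.js was not found on PATH")
        else:
            result = subprocess.run(
                [node_path, "--version"], capture_output=True, text=True, check=True
            )
            node_version = result.stdout

    if node_version is not None and parse_node_major(node_version) < 18:
        problems.append("Node.js 18+ is required")

    # Assumption: the key is supplied through this environment variable.
    if not env.get("GEMINI_API_KEY"):
        problems.append("Google Gemini API key is not set (GEMINI_API_KEY)")

    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("MISSING:", problem)
```

The iFlytek keys are optional, so the sketch does not check for them.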

Highlighted Details

AI-Driven Workflow: Automates outline generation, design, and image creation from text, documents (PDF, Word, Excel), or audio.
Interactive Customization: Real-time editing of outlines and designs, with options for custom colors, fonts, logos, and slide counts.
Advanced Input: Supports uploading reference documents, spreadsheets for charts, and audio for transcription.
Output: Generates ZIP (images) or PDF exports, with optimized image compression.
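The ZIP export mentioned above can be illustrated with the Python standard library. This is only a sketch of the general technique, not the project's exporter: the function name `export_slides_zip` and the `slide_NN.png` naming scheme are assumptions, and the image bytes are taken as already rendered and compressed.

```python
import io
import zipfile


def export_slides_zip(slide_images: list, extension: str = "png") -> bytes:
    """Bundle per-slide image bytes into an in-memory ZIP archive.

    Files are named slide_01, slide_02, ... so they sort in slide order.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, mode="w", compression=zipfile.ZIP_DEFLATED) as archive:
        for index, image_bytes in enumerate(slide_images, start=1):
            archive.writestr(f"slide_{index:02d}.{extension}", image_bytes)
    return buffer.getvalue()
```

Building the archive in memory lets a web backend return the bytes directly in a download response without writing temporary files.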

Maintenance & Community

Maintained by Shenlin (Tony) Qin and Jie Tang. Active development indicated by recent updates (Jan 2025).
Community support via GitHub Issues and Discussions.

Licensing & Compatibility

License: MIT License.
Compatibility: The MIT License permits commercial use and integration into closed-source projects.

Limitations & Caveats

Speech generation for speaker notes and style persistence are planned but not yet available. Core AI functionality requires external API keys, and the online demos require an invitation code.

Health Check

Last Commit: 5 months ago
Responsiveness: Inactive

Star History: 2 stars in the last 30 days