Whisper-TikTok by MatteoFasulo

TikTok video generator using Whisper, Edge TTS, and FFMPEG

created 2 years ago
290 stars

Top 91.7% on sourcepulse

Project Summary

This project provides an end-to-end pipeline for generating TikTok videos with AI. It targets content creators who want to automate video production: it generates natural-sounding voiceovers, transcribes the audio into subtitles, and assembles the final video with customizable subtitle styling. The main benefit is a large reduction in the manual effort needed to produce short-form videos.

How It Works

The system orchestrates a pipeline involving several AI models and tools. It starts by fetching a background video (randomly or from a specified URL) and uses Microsoft Edge Cloud TTS for natural voiceovers. OpenAI's Whisper model transcribes the generated audio into SRT subtitles, which are then embedded into the background video using FFMPEG. The process is configurable via a JSON file and offers command-line options for customization.
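The subtitle step of this pipeline can be illustrated with a small sketch. It assumes the transcription result is available as a list of (start, end, text) tuples, which is a hypothetical shape chosen for illustration and not the project's actual data structure. The function names are also hypothetical. The sketch only shows how timed segments map onto the SRT text format.

```python
from typing import Iterable, Tuple

# Hypothetical segment shape: (start_seconds, end_seconds, text).
Segment = Tuple[float, float, str]


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments: Iterable[Segment]) -> str:
    """Render timed segments as SRT cues, numbered from 1."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text.strip()}\n"
        )
    # Joining with a newline leaves one blank line between consecutive cues.
    return "\n".join(blocks)


if __name__ == "__main__":
    print(segments_to_srt([(0.0, 1.25, "Hello"), (1.25, 3.0, "world")]))
```

In the real project the segments would come from Whisper's transcription of the TTS audio; here they are hard-coded so the example runs without any model.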

Quick Start & Requirements

  • Install dependencies: pip install -U -r requirements.txt
  • Requires FFMPEG to be installed and available in the system's PATH.
  • A CUDA-capable GPU is recommended for running the Whisper model; without one, Whisper falls back to the CPU.
  • TikTok upload requires a TikTok account and a cookies.txt file generated via a provided guide.
  • Local Web-UI: streamlit run app.py --server.port=8501 --server.address=0.0.0.0
  • Command-Line: python main.py [OPTIONS]
  • Online Demo: https://huggingface.co/spaces/MatteoFasulo/Whisper-TikTok-Demo
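
Before running the tool, it can help to confirm the two environment requirements listed above. The following is a minimal preflight sketch, not part of the project: the function name preflight is hypothetical, and it assumes Whisper runs on PyTorch, so torch.cuda.is_available() is used as the GPU check.

```python
import importlib.util
import shutil


def preflight() -> dict:
    """Report whether ffmpeg is on PATH and which device Whisper could use."""
    ffmpeg_found = shutil.which("ffmpeg") is not None

    device = "cpu"
    try:
        # Only import torch if it is installed; any failure keeps the CPU default.
        if importlib.util.find_spec("torch") is not None:
            import torch

            if torch.cuda.is_available():
                device = "cuda"
    except Exception:
        pass

    return {"ffmpeg": ffmpeg_found, "device": device}


if __name__ == "__main__":
    print(preflight())
```

If "ffmpeg" is reported as False, the subtitle embedding step cannot run; if "device" is "cpu", transcription still works but will be slower.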

Highlighted Details

  • Leverages Microsoft Edge Cloud TTS for natural-sounding voiceovers.
  • Utilizes OpenAI Whisper for accurate audio transcription and subtitle generation.
  • Supports customization of subtitles (font, color, size, position) via FFMPEG.
  • Includes an optional feature to upload generated videos directly to TikTok.
  • Offers both a local Web-UI (Streamlit) and command-line interface.
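
Subtitle customization through FFMPEG can be sketched as follows. This is an illustration under assumptions, not the project's actual command: it uses FFMPEG's subtitles filter with its force_style option, and the helper name, default values, and file names are hypothetical. Colours use the ASS &HAABBGGRR notation, so &H00FFFFFF is opaque white, and Alignment=2 places text at the bottom center.

```python
from typing import List


def build_subtitle_command(
    video: str,
    srt: str,
    output: str,
    font: str = "Arial",
    size: int = 24,
    colour: str = "&H00FFFFFF",
    alignment: int = 2,
) -> List[str]:
    """Build an ffmpeg argument list that burns styled SRT subtitles into a video."""
    style = (
        f"FontName={font},FontSize={size},"
        f"PrimaryColour={colour},Alignment={alignment}"
    )
    # Simple file names only: paths containing ':' or quotes need extra escaping.
    video_filter = f"subtitles={srt}:force_style='{style}'"
    # Re-encode video with the subtitles drawn in; copy the audio stream unchanged.
    return ["ffmpeg", "-y", "-i", video, "-vf", video_filter, "-c:a", "copy", output]


if __name__ == "__main__":
    print(" ".join(build_subtitle_command("background.mp4", "subs.srt", "out.mp4")))
```

The list can be passed to subprocess.run without a shell, which avoids a second layer of quoting around the filter string.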

Maintenance & Community

  • Key dependencies include edge-tts and stable-ts.
  • Contributions are welcome; see the project's Contributing Guidelines.
  • Upcoming features include OpenAI API integration and Reddit content extraction.

Licensing & Compatibility

  • Licensed under the Apache License, Version 2.0.
  • Permissive license suitable for commercial use and integration with closed-source projects.

Limitations & Caveats

  • TikTok upload requires a manually generated cookies.txt file and may break when TikTok changes its API.
  • While GPU acceleration is supported for Whisper, CPU fallback will result in significantly slower processing.

Health Check

  • Last commit: 6 months ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 18 stars in the last 90 days
