storyteller by jaketae

Multimodal AI for animated short stories from text prompts

created 2 years ago
526 stars

Top 60.9% on sourcepulse

Project Summary

storyteller is a multimodal AI pipeline that turns a text prompt into a short animated video with narration. It targets users interested in AI-driven content creation who want to produce illustrated stories with accompanying audio.

How It Works

The system chains three AI models. A GPT model expands the prompt into a story, generating the text sentence by sentence. For each sentence, Stable Diffusion creates a corresponding image. Finally, a neural text-to-speech (TTS) model narrates the story, and the images and narration are combined into a video. The whole path from prompt to finished video runs without manual steps.
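The flow described here can be sketched as a small orchestration loop. This is a minimal illustration under stated assumptions, not the project's actual code: `Scene`, `make_story`, and the three callables (`next_sentence`, `paint`, `speak`) are hypothetical stand-ins for the GPT, Stable Diffusion, and TTS models, and the final step of combining scenes into a video is left out.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Scene:
    """One story sentence together with its generated image and narration."""
    sentence: str
    image: object
    audio: object


def make_story(
    prompt: str,
    next_sentence: Callable[[str], str],  # hypothetical stand-in for the GPT model
    paint: Callable[[str], object],       # hypothetical stand-in for Stable Diffusion
    speak: Callable[[str], object],       # hypothetical stand-in for the TTS model
    num_sentences: int = 3,
) -> List[Scene]:
    """Grow the story one sentence at a time and build a scene per sentence."""
    context = prompt
    scenes: List[Scene] = []
    for _ in range(num_sentences):
        # The text model sees everything written so far and continues it.
        sentence = next_sentence(context)
        context = f"{context} {sentence}"
        # Each sentence gets its own image and its own narration clip.
        scenes.append(Scene(sentence, paint(sentence), speak(sentence)))
    return scenes


if __name__ == "__main__":
    # Toy stand-ins so the sketch runs without any model downloads.
    demo = make_story(
        "Once upon a time,",
        next_sentence=lambda ctx: f"Sentence after {len(ctx)} characters.",
        paint=lambda s: f"<image for: {s}>",
        speak=lambda s: f"<audio for: {s}>",
        num_sentences=2,
    )
    for scene in demo:
        print(scene.sentence, scene.image, scene.audio)
```

Passing the models in as callables keeps the loop independent of any particular model library, which is one way to make the GPT, image, and TTS stages swappable.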

Quick Start & Requirements

  • Install via pip: $ pip install storyteller-core
  • Source install: $ git clone https://github.com/jaketae/storyteller.git && cd storyteller && pip install .
  • Apple Silicon users may need to install mecab via Homebrew (brew install mecab).
  • GPU acceleration (CUDA or MPS) is recommended for faster generation.
  • Official documentation and examples are available via the CLI help: $ storyteller --help.

Highlighted Details

  • Generates a complete animated video with audio and visuals from a single text prompt.
  • Supports customization of story prompts, image generation prefixes, and model choices (GPT, Stable Diffusion, TTS).
  • Offers a CLI and a Python API for integration and advanced usage.
  • Allows fine-grained control over device placement (CPU, CUDA, MPS) and data types (float32, float16) for optimization.
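The device and data-type choice in the last bullet can be illustrated with a small helper. This is a sketch under assumptions: `pick_device_and_dtype` is a hypothetical function, not part of storyteller, and the rule of using float16 only on CUDA is a common convention rather than something the README prescribes. In practice the two flags would typically come from `torch.cuda.is_available()` and `torch.backends.mps.is_available()`.

```python
from typing import Tuple


def pick_device_and_dtype(cuda_available: bool, mps_available: bool) -> Tuple[str, str]:
    """Return a (device, dtype) pair as plain strings.

    Assumed policy (not taken from the project):
    - CUDA: half precision to reduce VRAM use.
    - MPS or CPU: full precision, the more conservative choice.
    """
    if cuda_available:
        return "cuda", "float16"
    if mps_available:
        return "mps", "float32"
    return "cpu", "float32"


if __name__ == "__main__":
    # Example: a machine with an Apple Silicon GPU but no CUDA device.
    print(pick_device_and_dtype(cuda_available=False, mps_available=True))
```

Keeping the policy in one function makes it easy to apply a different device or precision per model, for example running the image model on the GPU while leaving the TTS model on the CPU.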

Maintenance & Community

The project is maintained by jaketae. The README does not list any other community channels.

Licensing & Compatibility

  • Licensed under the MIT License.
  • The permissive license allows commercial use and integration into closed-source projects.

Limitations & Caveats

PyTorch support for Apple Silicon's MPS is noted as a work in progress, with potential issues on specific PyTorch versions. The default models used (e.g., gpt2, stabilityai/stable-diffusion-2) may require significant VRAM when running on GPU.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 7 stars in the last 90 days
