Multimodal AI for animated short stories from text prompts
This project provides a multimodal AI storyteller that generates short animated videos from a text prompt. It's designed for users interested in AI-driven content creation, enabling them to produce visual narratives with accompanying audio.
How It Works
The system orchestrates a pipeline of AI models. A GPT model expands a given prompt into a story, generating text sentence by sentence. For each sentence, Stable Diffusion creates a corresponding image. Finally, a neural text-to-speech (TTS) model narrates the story, and these components are combined into a video. This approach automates the entire creative process from text to a complete audiovisual experience.
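The sketch below is not the project's code; it illustrates the same orchestration with stock Hugging Face transformers and diffusers components, with the TTS and video-muxing stages left as comments. The model choices, the naive sentence split, and the file names are assumptions for illustration.

from transformers import pipeline
from diffusers import StableDiffusionPipeline
import torch

# Writer: expands the prompt into a story, sentence by sentence.
writer = pipeline("text-generation", model="gpt2")
# Painter: renders one image per sentence.
painter = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2", torch_dtype=torch.float16
).to("cuda")

prompt = "Once upon a time, unicorns roamed the Earth."
story = writer(prompt, max_new_tokens=100)[0]["generated_text"]

# One frame per sentence, as described above (naive split on periods).
for i, sentence in enumerate(s.strip() for s in story.split(".") if s.strip()):
    painter(sentence).images[0].save(f"frame_{i}.png")
# A TTS model would then narrate each sentence, and the frames and audio
# would be combined into the final video (e.g., with ffmpeg).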
Quick Start & Requirements
Install from PyPI:

$ pip install storyteller-core

Or install the latest version from source:

$ git clone https://github.com/jaketae/storyteller.git && cd storyteller && pip install .

On macOS, mecab is also required; it can be installed via Homebrew (brew install mecab). Verify the installation with:

$ storyteller --help
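Beyond the CLI, the package can also be driven from Python. Below is a minimal sketch assuming the package exposes a StoryTeller class with from_default and generate entry points; these names are assumptions, so verify them against the upstream README.

from storyteller import StoryTeller  # assumed import path

# Assumed: loads the default writer/painter/speaker pipeline.
story_teller = StoryTeller.from_default()
# Assumed signature: expands the prompt and writes the resulting video to disk.
story_teller.generate("Once upon a time, unicorns roamed the Earth.")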
Highlighted Details
The default pipeline pairs a gpt2 writer with a stabilityai/stable-diffusion-2 painter and a neural TTS speaker, orchestrated end to end from a single text prompt.
Maintenance & Community
The project is maintained by jaketae. Further community engagement channels are not explicitly listed in the README.
Licensing & Compatibility
Limitations & Caveats
PyTorch support for Apple Silicon's MPS backend is noted as a work in progress, with potential issues on specific PyTorch versions. The default models (e.g., gpt2, stabilityai/stable-diffusion-2) may require significant VRAM when running on GPU.
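Given these caveats, a defensive device-selection pattern is common. The sketch below (not project code) prefers CUDA, falls back to MPS when available, and otherwise runs on CPU.

import torch

# Prefer CUDA for the VRAM-hungry defaults; fall back to Apple-Silicon MPS
# (work-in-progress upstream), else run on CPU.
if torch.cuda.is_available():
    device = torch.device("cuda")
elif getattr(torch.backends, "mps", None) and torch.backends.mps.is_available():
    device = torch.device("mps")
else:
    device = torch.device("cpu")
print(f"Using device: {device}")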