dream-to-video-skill  by mediastormDev

AI agent for generating cinematic videos from text descriptions

Created 1 month ago
261 stars

Top 97.1% on SourcePulse

Project Summary

This AI skill automates the creation of cinematic videos from textual dream descriptions. It targets users who want to visualize their dreams, offering a pipeline that transforms raw text into professional video prompts, leverages browser automation to interact with the Jimeng video generation platform, and applies custom post-processing effects, ultimately delivering finished video files.

How It Works

The process begins with a dream description, which an AI agent rewrites as a detailed cinematic video prompt that follows 10 specific rules (e.g., photorealistic style, fisheye lens, silent narrative). The prompt is then added to a local SQLite + JSONL task queue. A background worker uses Playwright to drive Chromium, handling login, prompt submission, reference image uploads, and progress monitoring on the Jimeng platform. When generation completes, the video is downloaded and automatically processed with an "Elliptic Shatter" edge effect, yielding both the original and the modified version.
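The queueing step can be illustrated with a minimal sketch. It assumes SQLite holds the current task state while a JSONL file serves as an append-only event log; the summary does not say how the project actually divides work between the two. The class name `TaskQueue`, the table schema, and the status values are hypothetical and not taken from the repository.

```python
import json
import sqlite3
import time
from pathlib import Path


class TaskQueue:
    """Hypothetical queue: SQLite stores task state, JSONL is an append-only log."""

    def __init__(self, db_path=":memory:", log_path=None):
        self.db = sqlite3.connect(db_path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS tasks ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "prompt TEXT NOT NULL, "
            "status TEXT NOT NULL DEFAULT 'pending', "
            "created_at REAL NOT NULL)"
        )
        self.db.commit()
        self.log_path = Path(log_path) if log_path else None

    def _log(self, event, task_id):
        # Append one JSON object per line; skipped when no log file is configured.
        if self.log_path is None:
            return
        record = {"event": event, "task_id": task_id, "ts": time.time()}
        with self.log_path.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps(record) + "\n")

    def enqueue(self, prompt):
        cur = self.db.execute(
            "INSERT INTO tasks (prompt, created_at) VALUES (?, ?)",
            (prompt, time.time()),
        )
        self.db.commit()
        self._log("enqueued", cur.lastrowid)
        return cur.lastrowid

    def claim_next(self):
        # Oldest pending task first; returns (id, prompt) or None when idle.
        row = self.db.execute(
            "SELECT id, prompt FROM tasks WHERE status = 'pending' "
            "ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        self.db.execute(
            "UPDATE tasks SET status = 'running' WHERE id = ?", (row[0],)
        )
        self.db.commit()
        self._log("claimed", row[0])
        return row

    def finish(self, task_id, ok=True):
        status = "done" if ok else "failed"
        self.db.execute(
            "UPDATE tasks SET status = ? WHERE id = ?", (status, task_id)
        )
        self.db.commit()
        self._log(status, task_id)

    def status(self, task_id):
        row = self.db.execute(
            "SELECT status FROM tasks WHERE id = ?", (task_id,)
        ).fetchone()
        return row[0] if row else None
```

In this sketch a background worker would call `claim_next()` in a loop, run the browser automation for the returned prompt, and then call `finish()`. The claim step is not safe for several concurrent workers; a single worker is assumed.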

Quick Start & Requirements

  • Primary Install: npx skills add mediastormDev/dream-to-video-skill -s dream-to-video
  • Manual Install: Clone the repository and symlink the skill directory into the agent's skill folder.
  • Prerequisites: Python >= 3.10, Chromium (installed via Playwright), and an API key from a supported provider (Claude, OpenAI, OpenRouter, Google Gemini).
  • Setup: Clone the repository, cd dream-to-video-skill/dream_to_video, run pip install -r requirements.txt and playwright install chromium, then run python main.py login (requires a QR code scan).
  • Links: GitHub Repository
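
Before running the setup steps, it may help to confirm the prerequisites listed above. The following is a small preflight sketch, not part of the project: the function name `check_prerequisites` is hypothetical, and it only checks the Python version and whether the `playwright` package can be imported. It does not verify that the Chromium binary is installed or that an API key is configured.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 10)  # minimum version stated in the prerequisites


def check_prerequisites(version_info=None, required_modules=("playwright",)):
    """Return a list of problems; an empty list means every check passed."""
    version = tuple(version_info or sys.version_info)[:2]
    problems = []
    if version < MIN_PYTHON:
        problems.append(
            f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ required, "
            f"found {version[0]}.{version[1]}"
        )
    for name in required_modules:
        # find_spec returns None when a top-level module cannot be located.
        if importlib.util.find_spec(name) is None:
            problems.append(f"Python package '{name}' is not installed")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("MISSING:", problem)
```

Running the file directly prints one line per unmet prerequisite and nothing when both checks pass.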

Highlighted Details

  • AI-driven prompt transformation with 10 strict cinematic rules.
  • Browser automation via Playwright for seamless interaction with the Jimeng platform.
  • Automatic "Elliptic Shatter" post-processing effect applied to generated videos.
  • Local task queue management using SQLite and JSONL.

Maintenance & Community

The README does not mention notable contributors, sponsorships, partnerships, or community channels such as Discord or Slack.

Licensing & Compatibility

  • License: MIT.
  • Compatibility: The MIT license permits commercial use and integration into closed-source projects, provided the copyright and license notice are retained.

Limitations & Caveats

Because the tool drives the Jimeng platform through browser automation, it is fragile: changes to the platform's web interface could break it. It also requires a valid API key for the AI model provider and a one-time QR code scan to log in to Jimeng.

Health Check

  • Last Commit: 1 month ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 64 stars in the last 30 days
