dream-to-video-skill by mediastormDev

AI agent for generating cinematic videos from text descriptions

Created 4 months ago

329 stars

Top 82.7% on SourcePulse

Project Summary

This AI skill turns text descriptions of dreams into cinematic videos. It targets users who want to visualize their dreams: the pipeline rewrites raw text into a professional video prompt, uses browser automation to drive the Jimeng video generation platform, applies a custom post-processing effect, and delivers finished video files.

How It Works

The process begins with a dream description, which an AI agent rewrites into a detailed cinematic video prompt that follows 10 specific rules (e.g., photorealistic style, fisheye lens, silent narrative). The prompt is then added to a local SQLite + JSONL task queue. A background worker uses Playwright to drive Chromium, handling login, prompt submission, reference image uploads, and progress monitoring on the Jimeng platform. When generation completes, the video is downloaded and automatically processed with an "Elliptic Shatter" edge effect, so each task yields both the original and the modified version.
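
The queueing step can be illustrated with a minimal sketch. This is not the project's actual code: the table layout, the status values, and the function names (open_queue, enqueue, claim_next, finish) are all assumptions made for illustration. It pairs a SQLite table that holds task state with an append-only JSONL file that records each state change, and it assumes a single background worker.

```python
import json
import sqlite3
from pathlib import Path


def open_queue(db_path: str) -> sqlite3.Connection:
    """Open (or create) a task table. The schema is hypothetical."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tasks ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "prompt TEXT NOT NULL, "
        "status TEXT NOT NULL DEFAULT 'pending')"
    )
    conn.commit()
    return conn


def log_event(log_path: Path, event: dict) -> None:
    """Append one JSON object per line (the JSONL convention)."""
    with log_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(event, ensure_ascii=False) + "\n")


def enqueue(conn: sqlite3.Connection, log_path: Path, prompt: str) -> int:
    """Store a new pending task and return its id."""
    cur = conn.execute("INSERT INTO tasks (prompt) VALUES (?)", (prompt,))
    conn.commit()
    log_event(log_path, {"task_id": cur.lastrowid, "event": "enqueued"})
    return cur.lastrowid


def claim_next(conn: sqlite3.Connection, log_path: Path):
    """Mark the oldest pending task as running; return (id, prompt) or None.

    Assumes one worker, so no locking beyond SQLite's own is attempted.
    """
    row = conn.execute(
        "SELECT id, prompt FROM tasks WHERE status = 'pending' "
        "ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    conn.execute("UPDATE tasks SET status = 'running' WHERE id = ?", (row[0],))
    conn.commit()
    log_event(log_path, {"task_id": row[0], "event": "claimed"})
    return row


def finish(conn: sqlite3.Connection, log_path: Path, task_id: int, ok: bool) -> None:
    """Record the final outcome of a task."""
    status = "done" if ok else "failed"
    conn.execute("UPDATE tasks SET status = ? WHERE id = ?", (status, task_id))
    conn.commit()
    log_event(log_path, {"task_id": task_id, "event": status})
```

In this sketch, a worker loop would call claim_next, run the browser automation for the returned prompt, and then call finish with the outcome. Keeping the JSONL file append-only means the task history remains readable with ordinary text tools even when the database is in use.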

Quick Start & Requirements

Primary Install: npx skills add mediastormDev/dream-to-video-skill -s dream-to-video
Manual Install: Clone the repo, then symlink the skill directory into the agent's skill folder.
Prerequisites: Python >= 3.10, Chromium (installed via Playwright), API key from a supported provider (Claude, OpenAI, OpenRouter, Google Gemini).
Setup: Clone the repo, cd dream-to-video-skill/dream_to_video, then run pip install -r requirements.txt, playwright install chromium, and python main.py login (requires a QR code scan).
Links: GitHub Repository
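
The prerequisites above can be checked before running the setup commands. The following is a small sketch, not part of the project: the function name preflight and its messages are hypothetical. It verifies the Python version and whether the playwright package is importable. It does not check that the Chromium browser binary has been downloaded or that an API key is configured.

```python
import importlib.util
import sys


def preflight(version_info=sys.version_info, find_spec=importlib.util.find_spec):
    """Return a list of unmet prerequisites (empty if none were detected).

    Both parameters exist only so the check can be exercised with
    stand-in values; callers would normally use the defaults.
    """
    problems = []
    if tuple(version_info[:2]) < (3, 10):
        problems.append("Python >= 3.10 is required")
    if find_spec("playwright") is None:
        problems.append(
            "Playwright is not installed (pip install -r requirements.txt)"
        )
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("Missing prerequisite:", problem)
```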

Highlighted Details

AI-driven prompt transformation with 10 strict cinematic rules.
Browser automation via Playwright (Chromium) covering login, prompt submission, reference image uploads, and progress monitoring on the Jimeng platform.
Automatic "Elliptic Shatter" post-processing effect applied to generated videos.
Local task queue management using SQLite and JSONL.

Maintenance & Community

No specific details regarding notable contributors, sponsorships, partnerships, or community channels (like Discord/Slack) were found in the provided README.

Licensing & Compatibility

License: MIT.
Compatibility: The MIT license permits commercial use and integration into closed-source projects, provided the copyright and license notice are retained.

Limitations & Caveats

Because the skill interacts with the Jimeng platform through browser automation, it is fragile: platform updates could break functionality. It also requires a valid API key from a supported provider for AI model interaction and a one-time QR code scan to log in to Jimeng.

Health Check

Last Commit

4 months ago

Responsiveness

Inactive

Star History

16 stars in the last 30 days