FilmAgent by HITsz-TMG

Multi-agent framework for film automation in virtual 3D spaces

created 11 months ago
1,009 stars

Top 37.8% on sourcepulse

Project Summary

FilmAgent is a multi-agent framework designed for end-to-end film automation within virtual 3D environments. It targets researchers and developers interested in AI-driven content creation, simulating key film crew roles to generate scripts, actor actions, and camera shots. The system aims to streamline the filmmaking process by integrating human-like collaborative workflows.

How It Works

FilmAgent structures the film automation process into three stages: idea development, scriptwriting, and cinematography. It employs multi-agent collaboration strategies like "Critique-Correct-Verify" and "Debate-Judge" to refine outputs. This approach allows agents representing different film roles (director, screenwriter, actor, cinematographer) to iteratively improve the script and shot composition, leading to more coherent and detailed final productions.
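The two collaboration strategies named above can be illustrated with a minimal Python sketch. This is not FilmAgent's actual implementation: the function names, the APPROVED sentinel, and the stub agents are all hypothetical, and the real system would call an LLM where the stubs return canned strings. The sketch only shows the control flow, assuming that Critique-Correct-Verify loops between a critic and an author until the critic has no objections, and that Debate-Judge collects arguments from several agents before a third agent decides.

```python
from collections import Counter
from typing import Callable, List, Tuple

APPROVED = "OK"  # hypothetical sentinel meaning "the critic has no objections"


def critique_correct_verify(
    author: Callable[[str, str], str],
    critic: Callable[[str], str],
    draft: str,
    max_rounds: int = 3,
) -> Tuple[str, int]:
    """Refine a draft until the critic approves or max_rounds is exhausted.

    Returns the final draft and the number of corrections applied.
    """
    corrections = 0
    for _ in range(max_rounds):
        feedback = critic(draft)          # critique (and verify the last fix)
        if feedback == APPROVED:
            break
        draft = author(draft, feedback)   # correct
        corrections += 1
    return draft, corrections


def debate_judge(
    debaters: List[Callable[[str, List[str]], str]],
    judge: Callable[[str, List[str]], str],
    topic: str,
    rounds: int = 2,
) -> str:
    """Let each debater speak once per round, then ask the judge to decide."""
    transcript: List[str] = []
    for _ in range(rounds):
        for debater in debaters:
            transcript.append(debater(topic, transcript))
    return judge(topic, transcript)


# --- Stub agents standing in for LLM calls (assumptions for illustration) ---

def stub_critic(draft: str) -> str:
    if "Location:" not in draft:
        return "missing location"
    if "Shot:" not in draft:
        return "missing shot"
    return APPROVED


def stub_author(draft: str, feedback: str) -> str:
    if feedback == "missing location":
        return draft + "\nLocation: kitchen"
    if feedback == "missing shot":
        return draft + "\nShot: close-up"
    return draft


def stub_majority_judge(topic: str, transcript: List[str]) -> str:
    # Picks the most frequently argued option.
    return Counter(transcript).most_common(1)[0][0]


final_script, n_corrections = critique_correct_verify(
    stub_author, stub_critic, "Alice greets Bob."
)

chosen_shot = debate_judge(
    [
        lambda topic, history: "close-up",
        # The second debater concedes once it has heard two arguments.
        lambda topic, history: "wide shot" if len(history) < 2 else "close-up",
    ],
    stub_majority_judge,
    topic="Which shot for the greeting?",
)
```

Capping the loop with max_rounds keeps a disagreeing critic from stalling the pipeline, at the cost of sometimes returning a draft that was never approved.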

Quick Start & Requirements

  • Installation: Requires Python 3.9.18 in a conda-managed environment. Install dependencies with pip install -r env.txt.
  • Prerequisites:
    • An API key for an OpenAI or DeepSeek model (GPT-4o, DeepSeek-v3, or DeepSeek-r1).
    • ChatTTS for voice acting (requires separate download and setup).
    • Unity Editor (version 2022.3.14f1c1 recommended) for final execution.
    • The Newtonsoft.Json Unity package may also be needed.
  • Setup: Involves setting up Python environments, downloading ChatTTS, and configuring Unity projects.
  • Resources: Links to project page, paper, slides, and video demonstrations are provided.
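Before installing, it can help to confirm that the active interpreter matches the required Python series and that the dependency file is present. The following preflight check is a small sketch, not part of the FilmAgent repository: the function names are hypothetical, and it assumes that any 3.9.x interpreter is close enough to the stated 3.9.18 and that env.txt sits in the current working directory.

```python
import sys
from pathlib import Path
from typing import List, Sequence, Tuple

# Major/minor series taken from the stated requirement "Python 3.9.18".
REQUIRED_PYTHON: Tuple[int, int] = (3, 9)


def python_matches(
    version_info: Sequence[int],
    required: Tuple[int, int] = REQUIRED_PYTHON,
) -> bool:
    """Return True if the major/minor version equals the required series."""
    return tuple(version_info[:2]) == required


def missing_paths(root: Path, names: Sequence[str]) -> List[str]:
    """Return the entries of `names` that do not exist under `root`."""
    return [name for name in names if not (root / name).exists()]


if __name__ == "__main__":
    if not python_matches(sys.version_info):
        found = ".".join(str(part) for part in sys.version_info[:3])
        print(f"Warning: expected Python 3.9.x, found {found}")
    for name in missing_paths(Path.cwd(), ["env.txt"]):
        print(f"Warning: {name} not found in {Path.cwd()}")
```

The script only prints warnings, so it is safe to run from any directory before creating the conda environment.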

Highlighted Details

  • Integrates the DeepSeek-v3 and DeepSeek-r1 models for enhanced decision-making.
  • Demonstrates improved plot coherence, character dialogue alignment, and camera shot diversity through multi-agent collaboration.
  • Compares favorably to models like Sora, offering stronger storytelling and consistency at the cost of requiring pre-built 3D spaces.
  • Future integration with text-to-video models like Sora and Vidu is planned.

Maintenance & Community

The project is maintained by HITsz-TMG and was recently updated (Feb 2025). The README cites recommendations from notable individuals and organizations, which suggests community interest.

Licensing & Compatibility

The README does not state a specific license. Hosting on GitHub does not by itself grant open-source terms, so anyone planning commercial use or closed-source linking should check the repository for an explicit license file first.

Limitations & Caveats

The system requires pre-built 3D virtual spaces and Unity for execution, which represents a significant setup overhead. Using DeepSeek-r1 for multi-agent processes is noted as potentially very slow. The Unity execution step may require multiple attempts for audio files to load correctly.

Health Check

  • Last commit: 4 months ago
  • Responsiveness: 1 week
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 58 stars in the last 90 days
