FilmAgent by HITsz-TMG

Multi-agent framework for film automation in virtual 3D spaces

Created 1 year ago

1,115 stars

Top 34.3% on SourcePulse

Project Summary

FilmAgent is a multi-agent framework designed for end-to-end film automation within virtual 3D environments. It targets researchers and developers interested in AI-driven content creation, simulating key film crew roles to generate scripts, actor actions, and camera shots. The system aims to streamline the filmmaking process by integrating human-like collaborative workflows.

How It Works

FilmAgent structures the film automation process into three stages: idea development, scriptwriting, and cinematography. It employs multi-agent collaboration strategies like "Critique-Correct-Verify" and "Debate-Judge" to refine outputs. This approach allows agents representing different film roles (director, screenwriter, actor, cinematographer) to iteratively improve the script and shot composition, leading to more coherent and detailed final productions.
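The two collaboration strategies named above can be pictured as small control loops. The following is a minimal sketch, not FilmAgent's actual code: every function name, signature, and the toy agents are assumptions made for illustration, and in a real system each callable would wrap an LLM call. In this sketch, Critique-Correct-Verify has one agent critique a draft, a second revise it, and the first confirm the revision before it is accepted; Debate-Judge lets several agents revise their proposals after seeing the others, then a judge picks one.

```python
from typing import Callable, List, Optional

# All names here are hypothetical; each callable stands in for an LLM-backed agent.
Critic = Callable[[str], Optional[str]]   # returns feedback, or None if satisfied
Corrector = Callable[[str, str], str]     # (draft, feedback) -> revised draft
Verifier = Callable[[str, str], bool]     # (revised draft, feedback) -> addressed?


def critique_correct_verify(draft: str, critic: Critic, corrector: Corrector,
                            verifier: Verifier, max_rounds: int = 3) -> str:
    """Refine a draft until the critic has no feedback or rounds run out."""
    for _ in range(max_rounds):
        feedback = critic(draft)
        if feedback is None:               # critic is satisfied, stop early
            break
        revised = corrector(draft, feedback)
        if verifier(revised, feedback):    # accept only verified revisions
            draft = revised
    return draft


def debate_judge(proposals: List[str],
                 rebut: Callable[[str, List[str]], str],
                 judge: Callable[[List[str]], str],
                 rounds: int = 1) -> str:
    """Let each proposer revise after seeing the others, then let a judge choose."""
    for _ in range(rounds):
        proposals = [
            rebut(own, proposals[:i] + proposals[i + 1:])
            for i, own in enumerate(proposals)
        ]
    return judge(proposals)


# Toy agents (assumptions for the demo only): the critic asks for a camera shot.
def toy_critic(script: str) -> Optional[str]:
    return None if "[shot:" in script else "add a camera shot"


def toy_corrector(script: str, feedback: str) -> str:
    return script + " [shot: close-up]"


def toy_verifier(script: str, feedback: str) -> bool:
    return "[shot:" in script


refined = critique_correct_verify("ALICE: Hello.", toy_critic,
                                  toy_corrector, toy_verifier)
chosen = debate_judge(["wide shot", "close-up"],
                      rebut=lambda own, others: own,        # nobody changes position
                      judge=lambda ps: max(ps, key=len))    # arbitrary toy criterion
print(refined)
print(chosen)
```

The `max_rounds` cap is a design choice in this sketch: it bounds the number of agent calls, which matters when each call is a slow model request.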

Quick Start & Requirements

Installation: Requires Python 3.9.18 and uses conda for environment management. Install dependencies via pip install -r env.txt.
Prerequisites:
- An API key for a supported LLM: OpenAI (GPT-4o) or DeepSeek (DeepSeek-v3, DeepSeek-r1).
- ChatTTS for voice acting (requires separate download and setup).
- Unity Editor (version 2022.3.14f1c1 recommended) for final execution.
- The Newtonsoft.Json Unity package may also be needed.
Setup: Involves setting up Python environments, downloading ChatTTS, and configuring Unity projects.
Resources: Links to project page, paper, slides, and video demonstrations are provided.

Highlighted Details

Integrates DeepSeek-v3 and r1 models for enhanced decision-making.
Demonstrates improved plot coherence, character dialogue alignment, and camera shot diversity through multi-agent collaboration.
Compares favorably to models like Sora, offering stronger storytelling and consistency at the cost of requiring pre-built 3D spaces.
Future integration with text-to-video models like Sora and Vidu is planned.

Maintenance & Community

The project is associated with HITsz-TMG, with updates as recent as February 2025. It mentions recommendations from notable individuals and organizations, which suggests community interest.

Licensing & Compatibility

The README does not specify a license. Hosting on GitHub does not by itself grant open-source rights, so commercial use or closed-source linking requires checking the repository for an explicit license file.

Limitations & Caveats

The system requires pre-built 3D virtual spaces and Unity for execution, which represents a significant setup overhead. Using DeepSeek-r1 for multi-agent processes is noted as potentially very slow. The Unity execution step may require multiple attempts for audio files to load correctly.

Health Check

Last Commit: 1 year ago

Responsiveness: Inactive

Star History: 11 stars in the last 30 days