Multimodal emotion reasoning for open-vocabulary understanding
AffectGPT addresses the limitations of traditional emotion recognition by introducing Open-Vocabulary Multimodal Emotion Recognition (OV-MER). It enables the prediction of any number and category of emotions from multimodal data, advancing emotion AI towards real-world applicability. The project provides the AffectGPT framework, datasets, and a benchmark suite for researchers and practitioners in multimodal LLMs and affective computing.
How It Works
The project defines the OV-MER task, which allows a more comprehensive description of human emotions than a fixed label set. AffectGPT is a framework designed for this task, built on multimodal large language models (MLLMs). It integrates pre-trained audio and video encoders with a large language model (Qwen2.5-7B-Instruct) to process and reason about emotional content across modalities. The result is a more flexible and nuanced treatment of emotion than fixed-label classification.
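Because an open-vocabulary prediction is a set of free-form labels and not a single class, comparing it with a reference annotation becomes a set-overlap problem. The following is a minimal sketch of that idea, assuming exact matching after lowercasing. The function names `normalize` and `set_scores` are hypothetical, and this is not the project's official benchmark metric, which may handle synonyms or label grouping differently.

```python
def normalize(labels):
    """Lowercase and strip labels, dropping empty strings (an assumed normalization)."""
    return {label.strip().lower() for label in labels if label.strip()}


def set_scores(predicted, reference):
    """Return (precision, recall, f1) between two label sets using exact matches."""
    pred, ref = normalize(predicted), normalize(reference)
    overlap = len(pred & ref)
    precision = overlap / len(pred) if pred else 0.0
    recall = overlap / len(ref) if ref else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Example: the model predicts three emotions, the annotation lists two.
scores = set_scores(["Happy", "surprised", "relieved"], ["happy", "relieved"])
```

Exact string matching treats near-synonyms such as "joyful" and "happy" as different labels, so a practical evaluation would likely need some form of label grouping on top of this.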
Quick Start & Requirements
Setup involves downloading datasets and models from provided links. Key resources include:
https://huggingface.co/datasets/MERChallenge/MER2025
https://pan.baidu.com/s/1kbfs5pG_hAri0QwvQl-Ecg?pwd=b9vn
or https://1024terabox.com/s/1AE7uAU3Ib8aRBSyF1TMpow
https://pan.baidu.com/s/1IvC4H7Xt1AzMFocGMBBbHQ?pwd=hzf9
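The first link above points to a Hugging Face dataset repository, so it can in principle be fetched programmatically instead of through a browser. The sketch below assumes the `huggingface_hub` package is installed and that your account has access to the repository (it may be gated and require logging in or accepting terms first). The helper name `download_mer2025` and the target directory are hypothetical. The Baidu and Terabox links have no equivalent here and still need a manual download.

```python
DATASET_REPO_ID = "MERChallenge/MER2025"  # taken from the Hugging Face URL above


def download_mer2025(target_dir="data/MER2025"):
    """Download the dataset repository into target_dir and return the local path.

    The import is done lazily so this module can be loaded even when
    huggingface_hub is not installed.
    """
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=DATASET_REPO_ID,
        repo_type="dataset",      # the URL is under /datasets/, not a model repo
        local_dir=str(target_dir),
    )
```

Calling `download_mer2025()` performs the actual network transfer, which may be large, so it is worth checking available disk space beforehand.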
A Python environment with the libraries needed for MLLMs and data processing is required. Specific hardware requirements (e.g., GPUs) are not detailed but are expected for MLLM operations.
Highlighted Details
Maintenance & Community
The project acknowledges numerous related works, indicating engagement with the broader research community. No specific community channels (e.g., Discord, Slack) or formal roadmap are detailed in the provided text.
Licensing & Compatibility
The project is released under the Apache 2.0 license but is stated to be strictly for non-commercial use. This restriction limits its suitability for commercial applications or integration into proprietary systems.
Limitations & Caveats
The primary limitation is the strict non-commercial use restriction. Setup requires manual downloading of large datasets and models from external cloud storage links (Baidu/Terabox), which may present accessibility or reliability challenges. The project is research-focused, providing specific frameworks and datasets rather than a general-purpose library.