ZebangCheng: Multimodal emotion recognition and reasoning via instruction tuning
Top 67.5% on SourcePulse
Emotion-LLaMA addresses the limitations of single-modality emotion recognition and the challenges in multimodal large language models (MLLMs) for integrating audio and visual cues. It enables nuanced emotion recognition and reasoning by processing audio, visual, and textual inputs, targeting researchers and developers in human-computer interaction, education, and counseling.
How It Works
Emotion-LLaMA integrates audio, visual, and textual data using specialized encoders. Features are aligned into a shared space, and a modified LLaMA model, fine-tuned with instructions, handles the multimodal understanding and reasoning. This approach aims to capture complex emotional expressions more effectively than single-modality methods.
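The alignment step described above can be sketched minimally: each modality's features are projected by a learned adapter into a shared embedding space and stacked as a token sequence for the language model. This is an illustrative sketch only; the dimensions, random projections, and the `fuse` helper are hypothetical stand-ins, not the project's actual encoders or adapter weights.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-modality feature sizes (illustrative, not the model's real dims).
AUDIO_DIM, VISUAL_DIM, TEXT_DIM = 128, 256, 512
SHARED_DIM = 64  # the shared space the modalities are projected into

# Random matrices stand in for the learned linear adapters.
W_audio = rng.standard_normal((AUDIO_DIM, SHARED_DIM))
W_visual = rng.standard_normal((VISUAL_DIM, SHARED_DIM))
W_text = rng.standard_normal((TEXT_DIM, SHARED_DIM))

def fuse(audio_feat, visual_feat, text_feat):
    """Project each modality into the shared space and stack one token per modality."""
    tokens = [audio_feat @ W_audio, visual_feat @ W_visual, text_feat @ W_text]
    return np.stack(tokens)  # shape: (3, SHARED_DIM)

seq = fuse(rng.standard_normal(AUDIO_DIM),
           rng.standard_normal(VISUAL_DIM),
           rng.standard_normal(TEXT_DIM))
print(seq.shape)  # (3, 64)
```

In the real model, the stacked multimodal tokens are interleaved with the instruction text and consumed by the fine-tuned LLaMA backbone.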
Quick Start & Requirements
Clone the repository, create the conda environment from environment.yaml, and activate it. Install the additional dependencies moviepy, soundfile, and opencv-python.
Highlighted Details
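The setup steps can be sketched as the following commands. The repository URL and environment name are assumptions inferred from the project name, not confirmed by this page; check the repository's README for the exact values.

```shell
# Assumed repository URL and environment name; verify against the project README.
git clone https://github.com/ZebangCheng/Emotion-LLaMA.git
cd Emotion-LLaMA
conda env create -f environment.yaml
conda activate emotion-llama
pip install moviepy soundfile opencv-python
```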
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats