HumanMLLM: Multimodal LLM for explainable emotion recognition using reinforcement learning
Top 38.3% on SourcePulse
R1-Omni is an open-source, omni-multimodal large language model specifically designed for emotion recognition. It leverages Reinforcement Learning with Verifiable Reward (RLVR) to enhance reasoning, understanding, and generalization capabilities, particularly in out-of-distribution scenarios. The project targets researchers and developers working on multimodal AI and affective computing.
How It Works
R1-Omni builds upon the HumanOmni-0.5B base model and integrates RLVR for improved emotion recognition. This approach allows the model to learn from feedback, optimizing its ability to interpret complex emotional cues from both visual and audio data. The RLVR methodology is key to its enhanced performance, especially in generalizing to unseen data distributions.
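The "verifiable reward" idea above can be sketched concretely: instead of a learned reward model, the training signal is computed by deterministic checks on the model's output. The sketch below assumes an R1-style output format with `<think>`/`<answer>` tags and an equal weighting of a format reward and an accuracy reward; the function names and the 0.5/0.5 split are illustrative assumptions, not taken from the R1-Omni code.

```python
import re

def format_reward(output: str) -> float:
    """1.0 if the output wraps reasoning in <think> tags followed by
    a final <answer> span, else 0.0 (a verifiable, rule-based check)."""
    pattern = r"<think>.*?</think>\s*<answer>.*?</answer>"
    return 1.0 if re.search(pattern, output, re.DOTALL) else 0.0

def accuracy_reward(output: str, gold_label: str) -> float:
    """1.0 if the text inside <answer> matches the ground-truth emotion
    label (case-insensitive), else 0.0."""
    m = re.search(r"<answer>(.*?)</answer>", output, re.DOTALL)
    if m is None:
        return 0.0
    return 1.0 if m.group(1).strip().lower() == gold_label.lower() else 0.0

def rlvr_reward(output: str, gold_label: str) -> float:
    """Combined verifiable reward; the equal weighting is illustrative."""
    return 0.5 * format_reward(output) + 0.5 * accuracy_reward(output, gold_label)

sample = "<think>The speaker's tone is low and slow.</think><answer>sad</answer>"
print(rlvr_reward(sample, "sad"))  # 1.0
```

Because both reward terms are computed by string checks rather than a neural critic, the signal cannot be gamed by reward-model exploitation, which is the core appeal of RLVR for tasks with checkable answers such as emotion labels.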
Quick Start & Requirements
Download the siglip-224, whisper-large-v3, and bert-base-uncased models and update the configuration files (config.json, inference.py) with their local paths. Then run inference:

```shell
python inference.py --modal video_audio --model_path ./R1-Omni-0.5B --video_path video.mp4 --instruct "..."
```

Highlighted Details
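If you want to drive the CLI from a script rather than the shell, a thin wrapper that assembles the same arguments can help. This is a hypothetical helper, not part of the repository; the flags mirror the command above, and the paths and prompt text are placeholders you must supply.

```python
import subprocess

def build_inference_command(model_path: str, video_path: str, instruct: str) -> list[str]:
    """Assemble the inference.py invocation with the flags from the quick start.
    Kept separate from execution so the argument list is easy to inspect/test."""
    return [
        "python", "inference.py",
        "--modal", "video_audio",
        "--model_path", model_path,
        "--video_path", video_path,
        "--instruct", instruct,
    ]

def run_inference(model_path: str, video_path: str, instruct: str) -> str:
    """Run the CLI and return its stdout; raises CalledProcessError on failure."""
    cmd = build_inference_command(model_path, video_path, instruct)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout
```

Passing the argument list (rather than a single shell string) avoids quoting issues when the instruction prompt contains spaces or special characters.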
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
Last updated 7 months ago. Status: Inactive.