MELD  by declare-lab

Dataset for emotion recognition research

created 6 years ago
932 stars

Top 40.1% on sourcepulse

View on GitHub
Project Summary

MELD is a comprehensive multimodal dataset for emotion recognition in multi-party conversations, derived from the Friends TV series. It provides text, audio, and visual modalities for over 13,000 utterances across more than 1400 dialogues, with each utterance labeled for seven emotions and three sentiment categories. This dataset is valuable for researchers and developers building advanced conversational AI systems, affective computing models, and dialogue generation tools that require nuanced understanding of emotional dynamics in group interactions.

How It Works

MELD extends the EmotionLines dataset by incorporating synchronized audio and visual data alongside text. Utterances are extracted with precise timestamps from TV show subtitles, ensuring alignment across modalities. The dataset's structure facilitates context-aware emotion recognition by capturing conversational flow and speaker interactions, aiming to improve performance over unimodal approaches.
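As a sketch of how the dialogue-structured annotations support context-aware modeling, the snippet below groups utterances by dialogue ID so each utterance can be paired with everything said before it. The column names (`Dialogue_ID`, `Utterance`, `Emotion`) follow the dataset's CSV annotation files, but the inline sample rows here are illustrative, not real data.

```python
import csv
import io
from collections import defaultdict

# Illustrative rows in the shape of MELD's CSV annotations (not real data).
sample_csv = """Dialogue_ID,Utterance_ID,Speaker,Utterance,Emotion,Sentiment
0,0,Joey,How are you?,neutral,neutral
0,1,Chandler,Never better!,joy,positive
1,0,Monica,That is awful.,sadness,negative
"""

def group_by_dialogue(fh):
    """Group utterances by dialogue so each keeps its conversational context."""
    dialogues = defaultdict(list)
    for row in csv.DictReader(fh):
        dialogues[row["Dialogue_ID"]].append((row["Utterance"], row["Emotion"]))
    return dict(dialogues)

dialogues = group_by_dialogue(io.StringIO(sample_csv))
# For the last utterance of dialogue 0, the preceding turns are its context.
context, (utterance, emotion) = dialogues["0"][:-1], dialogues["0"][-1]
print(context, utterance, emotion)
```

A context-aware classifier would then condition on `context` when predicting `emotion` for `utterance`, which is the advantage MELD's dialogue structure offers over treating utterances in isolation.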

Quick Start & Requirements

  • Data Download: wget http://web.eecs.umich.edu/~mihalcea/downloads/MELD.Raw.tar.gz
  • Prerequisites: Python; the pre-extracted features ship as pickle files, so the standard pickle module suffices for loading them. Feature extraction or model training may additionally require deep learning frameworks (TensorFlow, PyTorch) and potentially GPU acceleration.
  • Resources: The raw data is substantial. Pre-extracted features are available for download.
  • Details: MELD Dataset, Paper

Highlighted Details

  • Contains over 1400 dialogues and 13,000 utterances from the Friends TV series.
  • Each utterance is annotated with seven emotions (Anger, Disgust, Sadness, Joy, Neutral, Surprise, Fear) and sentiment (positive, negative, neutral).
  • Includes both a multi-party version and a dyadic subset for specialized research.
  • Provides pre-extracted features for text (GloVe, CNN, bcLSTM) and audio (openSMILE, SVM-based feature selection, bcLSTM).
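The pre-extracted features are distributed as Python pickle files. Below is a minimal loading sketch; the dict-of-per-utterance-vectors layout and the key format are assumptions about the feature files, so the snippet writes a tiny synthetic pickle first to stay self-contained and runnable.

```python
import pickle
import tempfile
from pathlib import Path

# Write a tiny synthetic feature file so the sketch runs end to end;
# the real MELD feature pickles are downloaded separately.
features = {"dia0_utt0": [0.1, 0.2, 0.3], "dia0_utt1": [0.4, 0.5, 0.6]}
path = Path(tempfile.mkdtemp()) / "features.pkl"
with path.open("wb") as f:
    pickle.dump(features, f)

# Loading mirrors how a pickled per-utterance feature dict would be consumed.
with path.open("rb") as f:
    loaded = pickle.load(f)

print(len(loaded), len(loaded["dia0_utt0"]))  # → 2 3
```

Only unpickle files from sources you trust (here, the official download), since `pickle.load` can execute arbitrary code.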

Maintenance & Community

The project is maintained by declare-lab; the raw data download is hosted on a University of Michigan server. Recent updates include new papers and state-of-the-art baselines (COSMIC).

Licensing & Compatibility

The dataset itself is intended for research purposes. Specific licensing details for commercial use are not explicitly stated in the README, but the data is derived from a TV series, implying potential copyright considerations.

Limitations & Caveats

The dataset is based on a specific TV show, which may limit generalizability to other conversational contexts. Some utterances might have missing start/end times due to subtitle inconsistencies, requiring manual correction or omission.
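Because some utterances may lack usable subtitle timestamps, a defensive preprocessing pass can drop (or flag for manual correction) malformed rows before audio/visual clips are cut. The `StartTime`/`EndTime` field names follow the annotation CSVs, and the SRT-style timestamp pattern is an assumption; the sample rows are made up.

```python
import re

# SRT-style timestamp, e.g. 00:03:12,847 (assumed format of MELD's time fields).
TIME = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3}$")

# Made-up annotation rows: one well-formed, one with a missing end time.
rows = [
    {"Utterance": "Hi!", "StartTime": "00:00:01,000", "EndTime": "00:00:02,000"},
    {"Utterance": "Missing end.", "StartTime": "00:00:03,000", "EndTime": ""},
]

def has_valid_times(row):
    """Keep only rows whose start and end timestamps are well-formed."""
    return bool(TIME.match(row["StartTime"])) and bool(TIME.match(row["EndTime"]))

clean = [r for r in rows if has_valid_times(r)]
dropped = len(rows) - len(clean)
print(len(clean), dropped)  # → 1 1
```

Logging the dropped rows instead of silently discarding them makes it easier to manually correct timestamps where the subtitle data is recoverable.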

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 39 stars in the last 90 days

Explore Similar Projects

Starred by Omar Sanseviero (DevRel at Google DeepMind) and Patrick von Platen (Core Contributor to Hugging Face Transformers and Diffusers).

voice_datasets by jim-schwoebel

  • Voice dataset list for voice/sound computing
  • Top 0.1%, 2k stars
  • Created 6 years ago, updated 1 year ago