MicroLens by westlake-repl

Large-scale multimodal micro-video dataset and recommendation code

Created 2 years ago
252 stars

Top 99.6% on SourcePulse

Project Summary

MicroLens is a large-scale, content-driven dataset for micro-video recommendation research. It provides raw multimodal data (text, audio, image, video) together with user interactions. It targets researchers and engineers in recommender systems and multimodal AI who want to develop and evaluate context-aware recommendation models.

How It Works

The dataset comprises multiple versions (MicroLens-50k, -100k, -1M) with rich multimodal features and user-video interaction logs. This structure facilitates training recommendation models that leverage diverse content modalities, moving beyond traditional ID-based approaches. It supports research in content-driven recommendation, multimodal understanding, and fairness.
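As an illustration of how user-video interaction logs are commonly prepared for recommendation experiments, the sketch below groups interactions into chronological per-user sequences and holds out each user's last item for evaluation. The column layout (user, item, timestamp), the sample rows, and the function names are assumptions made for illustration; they are not the documented MicroLens file format or the repository's own preprocessing.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample; the real MicroLens interaction files may use a different layout.
SAMPLE_LOG = """user,item,timestamp
u1,v10,300
u1,v11,100
u2,v10,50
u1,v12,200
u2,v13,75
"""


def build_sequences(csv_text):
    """Return {user: [items ordered by ascending timestamp]}."""
    events = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        events[row["user"]].append((int(row["timestamp"]), row["item"]))
    return {user: [item for _, item in sorted(pairs)] for user, pairs in events.items()}


def leave_one_out(sequences):
    """Split each sequence into (history, held-out last item); skip users with fewer than 2 items."""
    return {u: (seq[:-1], seq[-1]) for u, seq in sequences.items() if len(seq) >= 2}


if __name__ == "__main__":
    seqs = build_sequences(SAMPLE_LOG)
    print(leave_one_out(seqs))
```

A content-driven model would then replace each item ID in the history with features derived from that video's text, image, audio, or video data, which is the step an ID-based baseline skips.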

Quick Start & Requirements

  • Environment: Python 3.8.12, PyTorch 1.8.0, CUDA 11.1, Torchvision 0.9.0, Transformers 4.23.1.
  • Dataset Download: MicroLens-50k (https://recsys.westlake.edu.cn/MicroLens-50k-Dataset/) and MicroLens-100k (https://recsys.westlake.edu.cn/MicroLens-100k-Dataset/). MicroLens-1M is available for WWW 2025 MIRC.
  • Code: Baseline models (VideoRec, IDRec, VIDRec) are available under Code/. Training and testing scripts include run_id.py, run_text.py, run_image.py, and run_video.py.
  • Prerequisites: Image and video models require LMDB files to be prepared first; the script Data Generation/generate_cover_frames_lmdb.py is provided for this.
  • Links: Dataset download, quick_download.txt for video downloads, MMRec framework integration.
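One quick way to confirm that a local environment matches the versions listed above is to compare installed package versions against the pinned ones. The sketch below uses only the Python standard library. The PINNED mapping copies the versions from the list, the pip distribution names are assumed, and the helper names (base_version, check_environment) are hypothetical and not part of the repository. It does not check the CUDA version.

```python
import sys
from importlib import metadata

# Versions copied from the environment list above; pip distribution names are assumed.
PINNED = {"torch": "1.8.0", "torchvision": "0.9.0", "transformers": "4.23.1"}
PINNED_PYTHON = (3, 8, 12)


def base_version(version):
    """Drop local build tags such as '+cu111' so '1.8.0+cu111' compares equal to '1.8.0'."""
    return version.split("+", 1)[0]


def check_environment(pinned=PINNED, lookup=metadata.version):
    """Return {package: (expected, found)} for every mismatch; found is None if not installed."""
    mismatches = {}
    for name, expected in pinned.items():
        try:
            found = lookup(name)
        except metadata.PackageNotFoundError:
            found = None
        if found is None or base_version(found) != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    if sys.version_info[:3] != PINNED_PYTHON:
        print("Python differs from pinned 3.8.12:", sys.version.split()[0])
    for name, (expected, found) in check_environment().items():
        print(f"{name}: expected {expected}, found {found}")
```

An empty result from check_environment() means the three pinned packages match; any entry shows the expected version next to what was found.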

Highlighted Details

  • Multiple dataset sizes, including a 1M-scale version for challenges.
  • Includes raw multimodal data (text, audio, image, video) and user comments.
  • Supports multimodal recommendation tasks and fairness research.
  • Codebase includes implementations for 15 video models.
  • Integrated into the MMRec framework.

Maintenance & Community

The project actively develops and expands the dataset, releasing new versions and features. It has received attention from Google DeepMind and YouTube, evidenced by invited talks. The lab is hiring research personnel, indicating ongoing research activity. No direct community channels are listed.

Licensing & Compatibility

No explicit open-source license is stated. A "Caution" notice prohibits private modification and secondary distribution of the dataset, and encourages users to open-source their processing code or notify the authors of any alterations. This suggests a restrictive usage policy that may limit commercial or closed-source integration without explicit permission.

Limitations & Caveats

Dataset redistribution is restricted. The listed environment pins older versions (Python 3.8.12, PyTorch 1.8.0, CUDA 11.1). Image and video models need LMDB files prepared beforehand, which adds a setup step.

Health Check

  • Last Commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 7 stars in the last 30 days
