MicroLens by westlake-repl

Large-scale multimodal micro-video dataset and recommendation code

Created 2 years ago

286 stars

Top 91.5% on SourcePulse

Project Summary

A large-scale, content-driven dataset for micro-video recommendation research, MicroLens provides raw multimodal data (text, audio, image, video) and user interactions. It targets researchers and engineers in recommender systems and multimodal AI, enabling the development and evaluation of advanced, context-aware recommendation models.

How It Works

The dataset comprises multiple versions (MicroLens-50k, -100k, -1M) with rich multimodal features and user-video interaction logs. This structure facilitates training recommendation models that leverage diverse content modalities, moving beyond traditional ID-based approaches. It supports research in content-driven recommendation, multimodal understanding, and fairness.
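As a rough illustration of how user-video interaction logs like these are commonly prepared for sequence-based recommenders, the sketch below groups interaction records into per-user, time-ordered video sequences. The tab-separated "user, video, timestamp" layout and the function name build_user_sequences are assumptions made for illustration only; the actual MicroLens files may be laid out differently, so inspect the downloaded data before adapting this.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def build_user_sequences(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Group interactions into per-user video sequences ordered by time.

    Assumes each line is "user<TAB>video<TAB>timestamp" (hypothetical layout,
    not confirmed by the MicroLens documentation).
    """
    per_user = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        user, video, timestamp = line.split("\t")
        per_user[user].append((float(timestamp), video))
    # Sort each user's interactions chronologically and keep only video IDs.
    return {
        user: [video for _, video in sorted(events)]
        for user, events in per_user.items()
    }


if __name__ == "__main__":
    sample = ["u1\tv3\t20", "u1\tv1\t10", "u2\tv2\t5"]
    print(build_user_sequences(sample))
```

Sequences of this kind can feed either an ID-based model (video IDs mapped to embeddings) or a content-driven model (video IDs mapped to text, image, or video features).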

Quick Start & Requirements

Environment: Python 3.8.12, PyTorch 1.8.0, CUDA 11.1, Torchvision 0.9.0, Transformers 4.23.1.
Dataset Download: MicroLens-50k (https://recsys.westlake.edu.cn/MicroLens-50k-Dataset/) and MicroLens-100k (https://recsys.westlake.edu.cn/MicroLens-100k-Dataset/). MicroLens-1M is available for WWW 2025 MIRC.
Code: Baseline models (VideoRec, IDRec, VIDRec) are available under Code/. Training and testing scripts include run_id.py, run_text.py, run_image.py, and run_video.py.
Prerequisites: Image and video models require LMDB files to be prepared first; the script Data Generation/generate_cover_frames_lmdb.py is provided for this.
Links: Dataset download, quick_download.txt for video downloads, MMRec framework integration.
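Because the pinned versions above are fairly old, it may help to check an existing environment before running the baselines. The following is a minimal sketch, not part of the MicroLens repository: the helper name find_mismatches is hypothetical, and it checks only the three Python packages (Python and CUDA versions are not inspected).

```python
from importlib import metadata
from typing import Callable, Dict, List

# Package versions listed in the requirements above. The pip distribution
# names ("torch", "torchvision", "transformers") are assumed here.
PINNED = {
    "torch": "1.8.0",
    "torchvision": "0.9.0",
    "transformers": "4.23.1",
}


def find_mismatches(
    pinned: Dict[str, str],
    get_version: Callable[[str], str] = metadata.version,
) -> List[str]:
    """Return one message per package that is missing or at another version."""
    problems = []
    for name, wanted in pinned.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name}: not installed (want {wanted})")
            continue
        # Compare only the part before any local build tag such as "+cu111".
        if found.split("+")[0] != wanted:
            problems.append(f"{name}: found {found}, want {wanted}")
    return problems


if __name__ == "__main__":
    for problem in find_mismatches(PINNED):
        print(problem)
```

Running the script prints nothing when all three packages match the pinned versions.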

Highlighted Details

Multiple dataset sizes, including a 1M-scale version for challenges.
Includes raw multimodal data (text, audio, image, video) and user comments.
Supports multimodal recommendation tasks and fairness research.
Codebase includes implementations for 15 video models.
Integrated into the MMRec framework.

Maintenance & Community

The dataset is under active development, with new versions and features being released. The project has drawn attention from Google DeepMind and YouTube, as evidenced by invited talks. The lab is hiring research personnel, which suggests ongoing research activity. No direct community channels are listed.

Licensing & Compatibility

No explicit open-source license is stated. A "Caution" notice prohibits private modification and secondary distribution of the dataset, and encourages users to open-source their processing code or notify the authors of any alterations. This suggests a restrictive usage policy that could affect commercial or closed-source integration without explicit permission.

Limitations & Caveats

Dataset redistribution is restricted. Specific, older versions of Python, PyTorch, and CUDA are required. Preparation of LMDB files is necessary for certain model types, adding a setup step.
