LLMRec by HKUDS

Recommender system enhanced via LLM graph augmentation (WSDM'24 paper)

created 1 year ago
468 stars

Top 66.1% on sourcepulse

Project Summary

LLMRec is a recommendation framework that uses Large Language Models (LLMs) to augment user-item interaction graphs. It is aimed at researchers and practitioners who want to improve recommender performance by incorporating textual and multi-modal content. Its stated benefit is improved recommendation accuracy through LLM-driven graph enrichment.

How It Works

LLMRec applies three LLM-based graph augmentation strategies: reinforcing user-item interaction edges, enriching item node attributes with LLM-generated text, and generating user profiles from interaction history. Expressing these signals in natural language lets the model capture relationships and user preferences that the interaction data alone may miss.
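The three strategies can be sketched as operations on a small in-memory graph. This is a minimal illustration, not the repository's code: the `InteractionGraph` class, the function names, the prompt wording, and the `LLM` callable type are all hypothetical stand-ins, and the real scripts may structure prompts and outputs differently.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical stand-in for an LLM call: takes a prompt, returns text.
LLM = Callable[[str], str]


@dataclass
class InteractionGraph:
    """Toy user-item graph with optional textual side information."""
    edges: Set[Tuple[str, str]] = field(default_factory=set)     # (user, item)
    item_attrs: Dict[str, str] = field(default_factory=dict)     # item -> text
    user_profiles: Dict[str, str] = field(default_factory=dict)  # user -> text


def history(graph: InteractionGraph, user: str) -> List[str]:
    """Items the user has interacted with, in a stable order."""
    return sorted(item for u, item in graph.edges if u == user)


def augment_edges(graph: InteractionGraph, user: str,
                  candidates: List[str], llm: LLM) -> None:
    """Strategy 1: ask the LLM to pick a likely item and add it as an edge."""
    prompt = (f"The user interacted with {history(graph, user)}. "
              f"Pick exactly one item ID from {candidates} the user would like.")
    choice = llm(prompt).strip()
    if choice in candidates:  # ignore answers outside the candidate set
        graph.edges.add((user, choice))


def augment_item_attributes(graph: InteractionGraph, item: str,
                            title: str, llm: LLM) -> None:
    """Strategy 2: attach LLM-generated descriptive text to an item node."""
    graph.item_attrs[item] = llm(f"Describe the attributes of the item '{title}'.")


def build_user_profile(graph: InteractionGraph, user: str, llm: LLM) -> None:
    """Strategy 3: summarize a user's preferences from their history."""
    prompt = f"Summarize the preferences of a user with history {history(graph, user)}."
    graph.user_profiles[user] = llm(prompt)
```

Passing the LLM in as a plain callable keeps the sketch independent of any particular API client, so a canned function can replace it when no API access is available.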

Quick Start & Requirements

  • Install: pip install -r requirements.txt
  • Prerequisites: Python, PyTorch. LLM augmentation stages require API access or pre-generated data.
  • Usage:
    1. LLM Augmentation: python ./gpt_ui_aug.py, python ./gpt_user_profiling.py, python ./gpt_i_attribute_generate_aug.py
    2. Training: python ./main.py --dataset {netflix, movielens}
  • Data: Pre-processed multi-modal datasets (Netflix, MovieLens) with LLM-augmented text and embeddings are available for download.
  • Links: Netflix Dataset
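The usage steps above can be sketched as a small driver script. The script names and the `--dataset` flag come from the steps listed; the helper names (`build_commands`, `run_pipeline`), the `run_augmentation` switch, and the assumption that the augmentation scripts run in the listed order are illustrative only. Augmentation is off by default on the assumption that the pre-generated data has been downloaded.

```python
import subprocess
import sys
from typing import List

# Script names as listed in the usage steps; the order is an assumption.
AUGMENTATION_SCRIPTS = [
    "./gpt_ui_aug.py",
    "./gpt_user_profiling.py",
    "./gpt_i_attribute_generate_aug.py",
]
DATASETS = ("netflix", "movielens")


def build_commands(dataset: str, run_augmentation: bool = False) -> List[List[str]]:
    """Return the argv lists for the pipeline without executing anything."""
    if dataset not in DATASETS:
        raise ValueError(f"dataset must be one of {DATASETS}, got {dataset!r}")
    commands: List[List[str]] = []
    if run_augmentation:
        # These stages call an LLM and need API access.
        commands += [[sys.executable, script] for script in AUGMENTATION_SCRIPTS]
    commands.append([sys.executable, "./main.py", "--dataset", dataset])
    return commands


def run_pipeline(dataset: str, run_augmentation: bool = False) -> None:
    """Run each stage in order, stopping at the first failure."""
    for cmd in build_commands(dataset, run_augmentation):
        subprocess.run(cmd, check=True)
```

Separating command construction from execution makes it possible to inspect what would run before spending API budget on the augmentation stages.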

Highlighted Details

  • Implements LLM-based augmentation for user-item edges, item attributes, and user profiles.
  • Provides multi-modal datasets (text, images) for Netflix and MovieLens.
  • Utilizes CLIP-ViT and Sentence-BERT for visual and textual feature encoding.
  • Codebase is structured based on MMSSL, LATTICE, and MICRO.

Maintenance & Community

The project is associated with the University of Hong Kong and Baidu Inc. The repository was last updated in March 2024.

Licensing & Compatibility

The repository does not explicitly state a license. The provided datasets are for research purposes, with a specific request to cite the paper if the 'netflix' dataset is used.

Limitations & Caveats

Running the LLM augmentation stages directly may incur significant API costs or require substantial computational resources. The provided baseline code (LATTICE, MMSSL) needs minor modifications to its dataset paths.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 18 stars in the last 90 days
