RLMRec by HKUDS

LLM-enhanced representation learning framework for recommendation systems

Created 2 years ago

435 stars

Top 68.4% on SourcePulse

Project Summary

RLMRec offers a model-agnostic framework for enhancing recommendation systems by integrating Large Language Models (LLMs) for representation learning. It targets researchers and practitioners in recommender systems seeking to leverage rich textual data for improved user and item understanding, aiming to capture intricate semantic aspects of user behaviors and preferences beyond traditional collaborative filtering signals.

How It Works

RLMRec integrates LLMs to generate rich user and item profiles from textual descriptions. It then aligns the semantic space of these LLM-generated embeddings with the representation space derived from collaborative relational signals using a cross-view alignment framework. This approach allows for the incorporation of auxiliary textual information, creating a more comprehensive understanding of user-item interactions.
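The cross-view alignment idea can be illustrated with a generic InfoNCE-style contrastive loss. This is a minimal sketch, not the repository's code: the function names, the temperature value, and the toy vectors are all assumptions, and it presumes both embedding views have already been projected into a shared space of equal dimension.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce_alignment(collab, semantic, temperature=0.2):
    """Mean InfoNCE loss pairing collab[i] with semantic[i].

    Every other semantic[j] in the batch acts as a negative.
    The temperature default is an illustrative assumption.
    """
    losses = []
    for i, e in enumerate(collab):
        logits = [cosine(e, s) / temperature for s in semantic]
        # Log-sum-exp with max subtraction for numerical stability.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)

# Toy data (hypothetical): two nodes with 2-d embeddings.
collab = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]   # semantic view agrees with collab view
swapped = [[0.1, 0.9], [0.9, 0.1]]   # semantic view disagrees

loss_aligned = info_nce_alignment(collab, aligned)
loss_swapped = info_nce_alignment(collab, swapped)
```

The loss is lower when each collaborative embedding sits closest to its own LLM-derived embedding, which is the behavior an alignment objective of this kind is meant to reward.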

Quick Start & Requirements

Install: Clone the repository (git clone --depth 1 https://github.com/HKUDS/RLMRec.git), create a conda environment (conda create -y -n rlmrec python=3.9), activate it (conda activate rlmrec), and install dependencies including PyTorch 1.13.1 with CUDA 11.6 support, torch-scatter, and torch-sparse.
Data: Download provided datasets (Amazon-book, Yelp, Steam) via wget or Google Drive.
Prerequisites: Python 3.9, PyTorch 1.13.1+cu116, CUDA 11.6, torch-scatter, torch-sparse.
Examples: Run backbone models (python encoder/train_encoder.py --model {model_name} --dataset {dataset} --cuda 0) or RLMRec variants by appending a suffix to the model name (_plus for contrastive alignment, _gene for generative alignment).
Docs: WWW'2024 Paper
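The training commands above can be assembled programmatically when running several backbones or variants. The sketch below is a hypothetical helper, not part of the repository: it assumes the variant suffix is appended directly to the backbone's model name, and the identifiers "lightgcn" and "amazon" are assumed spellings that should be checked against the README.

```python
import shlex

# Suffixes named in the summary; "" selects the plain backbone.
VARIANT_SUFFIX = {"backbone": "", "contrastive": "_plus", "generative": "_gene"}

def build_train_command(model_name, dataset, variant="backbone", cuda=0):
    """Return the argv list for one training run (hypothetical helper)."""
    if variant not in VARIANT_SUFFIX:
        raise ValueError(f"unknown variant: {variant!r}")
    return [
        "python", "encoder/train_encoder.py",
        "--model", model_name + VARIANT_SUFFIX[variant],
        "--dataset", dataset,
        "--cuda", str(cuda),
    ]

# Assumed identifiers; verify the exact names in the repository.
cmd = build_train_command("lightgcn", "amazon", variant="contrastive")
print(shlex.join(cmd))
```

Returning a list rather than a single string lets the command be passed straight to subprocess.run without shell quoting concerns.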

Highlighted Details

Model-agnostic framework applicable to various recommender architectures (GCCF, LightGCN, SGL, SimGCL, DCCF, AutoCF).
Utilizes three public datasets (Amazon-book, Yelp, Steam) with pre-generated user/item profiles and semantic embeddings.
Supports both contrastive and generative alignment strategies for integrating LLM representations.
Provides example scripts for profile generation and semantic representation encoding using OpenAI API or other embedding models.

Maintenance & Community

The project is associated with HKUDS and the WWW'2024 conference. No specific community channels (Discord/Slack) or active maintenance signals are explicitly mentioned in the README.

Licensing & Compatibility

The repository does not explicitly state a license. The code builds on the SSLRec framework, which is believed to carry a permissive license, but this should be verified before reuse. Suitability for commercial use or closed-source linking is not specified.

Limitations & Caveats

The profile generation scripts require an OpenAI API key and direct modification of source files to insert the key, which is not ideal for secure deployment. The project relies on specific older versions of PyTorch and CUDA, potentially limiting compatibility with newer hardware or software stacks.
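One common way to avoid hard-coding the key is to read it from an environment variable. The sketch below shows that pattern under stated assumptions: the helper name is hypothetical, OPENAI_API_KEY is used as a conventional variable name, and the repository's scripts would still need a small local change to call such a helper.

```python
import os

def load_openai_api_key(env_var="OPENAI_API_KEY"):
    """Return the API key from the environment, or raise if it is unset."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable before running.")
    return key
```

With this in place the key stays out of version control, and a missing key fails early with a clear message instead of surfacing later as an authentication error.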

Health Check

Last Commit: 1 year ago

Responsiveness: Inactive

Star History

9 stars in the last 30 days