P5 by jeykigung

LLM for recommendation tasks, based on the P5 paradigm

created 3 years ago
340 stars

Top 82.4% on SourcePulse

Project Summary

This repository implements P5, a unified pre-training paradigm for recommendation systems that treats all data as natural language sequences. It aims to advance recommender systems towards a universal engine capable of zero-shot and few-shot predictions via personalized prompts, benefiting researchers and practitioners in recommendation and natural language processing.

How It Works

P5 converts diverse recommendation data (user-item interactions, metadata, reviews) into natural language sequences. It leverages a unified language modeling objective for pre-training, enabling it to serve as a foundation model for various downstream recommendation tasks. This approach facilitates multi-modal integration and instruction-based recommendation, reducing the need for extensive fine-tuning through adaptive personalized prompts.
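To make the conversion concrete, here is a minimal sketch of how a single interaction record might be flattened into an input-target text pair; the template wording and record fields are illustrative, not the exact personalized prompts from the paper:

    # Illustrative sketch: the prompt template and record fields below are
    # hypothetical, not P5's actual prompt collection.
    def to_text_pair(record):
        """Flatten one user-item interaction into (input_text, target_text)."""
        history = ", ".join(record["history"])
        input_text = (
            f"Given the purchase history of user_{record['user_id']}: "
            f"{history}, predict the next possible item for the user."
        )
        return input_text, record["next_item"]

    example = {"user_id": 15, "history": ["item_101", "item_2"], "next_item": "item_88"}
    src, tgt = to_text_pair(example)
    # src: "Given the purchase history of user_15: item_101, item_2, predict ..."
    # tgt: "item_88"

Training then reduces to one conditional language modeling objective over such pairs, which is what lets a single model cover rating, sequential recommendation, explanation, and the other task families described in the paper.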

Quick Start & Requirements

  • Install: Clone the repository and install the dependencies below with pip. No requirements.txt ships with the repository; the package list is inferred from the README.
  • Prerequisites: Python 3.9.7, PyTorch 1.10.1, transformers 4.2.1, tqdm, numpy, sentencepiece, pyyaml.
  • Data: Download preprocessed data from Google Drive and place in the data folder. Raw data can be downloaded and placed in raw_data.
  • Checkpoints: Download pretrained checkpoints into the snap folder.
  • Training: Use scripts in the scripts folder (e.g., bash scripts/pretrain_P5_base_beauty.sh 4 for 4 GPUs).
  • Evaluation: Use Jupyter notebooks in the notebooks folder.
  • Links: Paper: https://arxiv.org/pdf/2203.13366.pdf; Hugging Face: https://huggingface.co/makitanikaze/P5 (see the loading sketch after this list).
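As a quick sanity check after downloading a checkpoint, the sketch below loads it with the stock T5 classes. This assumes the Hugging Face checkpoint is in standard T5 format and that the repo id "makitanikaze/P5" resolves to a single loadable model; the released code wraps a modified T5, so consult the Hugging Face page for the actual checkpoint names:

    # A minimal sketch, assuming a stock-T5-compatible checkpoint; the repo id
    # below is taken from the README link and may point to a collection rather
    # than a single loadable model.
    from transformers import T5ForConditionalGeneration, T5Tokenizer

    model_id = "makitanikaze/P5"
    tokenizer = T5Tokenizer.from_pretrained(model_id)
    model = T5ForConditionalGeneration.from_pretrained(model_id)

    prompt = ("Given the purchase history of user_15: item_101, item_2, "
              "predict the next possible item for the user.")
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_length=16)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))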

Highlighted Details

  • Unified "Pretrain, Personalized Prompt, and Predict" (P5) paradigm for recommendation.
  • Converts all recommendation data into natural language sequences.
  • Enables zero-shot and few-shot recommendation via personalized prompts.
  • Pushes recommender systems along the shallow-model → deep-model → big-model trajectory outlined in the paper.

Maintenance & Community

The README acknowledges VL-T5, PETER, and S3-Rec. No community links (Discord, Slack) or roadmap are provided.

Licensing & Compatibility

The README does not explicitly state a license; the code is released for research purposes, and suitability for commercial use or closed-source linking is unspecified.

Limitations & Caveats

The project pins older versions of PyTorch (1.10.1) and transformers (4.2.1), which may conflict with newer libraries and CUDA toolkits. The README reports no performance benchmarks and no known limitations beyond those discussed in the research paper.
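Since no requirements.txt ships with the repository, a pinned file reconstructed from the versions listed under Quick Start might look like the following (only torch and transformers carry explicit version pins in the README; the remaining packages are left unpinned):

    torch==1.10.1
    transformers==4.2.1
    tqdm
    numpy
    sentencepiece
    pyyaml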

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 10 stars in the last 90 days
