P5 by jeykigung

LLM for recommendation tasks, based on the P5 paradigm

created 3 years ago
340 stars

Top 82.4% on SourcePulse

Project Summary

This repository implements P5, a unified pre-training paradigm for recommendation systems that treats all data as natural language sequences. It aims to advance recommender systems towards a universal engine capable of zero-shot and few-shot predictions via personalized prompts, benefiting researchers and practitioners in recommendation and natural language processing.

How It Works

P5 converts diverse recommendation data (user-item interactions, metadata, reviews) into natural language sequences. It leverages a unified language modeling objective for pre-training, enabling it to serve as a foundation model for various downstream recommendation tasks. This approach facilitates multi-modal integration and instruction-based recommendation, reducing the need for extensive fine-tuning through adaptive personalized prompts.
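To make the conversion concrete, here is a minimal sketch of how a single interaction record might be flattened into an input-target text pair; the template wording and record fields are illustrative, not the exact personalized prompts from the paper:

    # Illustrative sketch: the prompt template and record fields below are
    # hypothetical, not P5's actual prompt collection.
    def to_text_pair(record):
        """Flatten one user-item interaction into (input_text, target_text)."""
        history = ", ".join(record["history"])
        input_text = (
            f"Given the purchase history of user_{record['user_id']}: "
            f"{history}, predict the next possible item for the user."
        )
        return input_text, record["next_item"]

    example = {"user_id": 15, "history": ["item_101", "item_2"], "next_item": "item_88"}
    src, tgt = to_text_pair(example)
    # src: "Given the purchase history of user_15: item_101, item_2, predict ..."
    # tgt: "item_88"

Training then reduces to one conditional language modeling objective over such pairs, which is what lets a single model cover rating, sequential recommendation, explanation, and the other task families described in the paper.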

Quick Start & Requirements

  • Install: Clone the repository and install the dependencies below with pip. No requirements.txt ships with the repository; the package list is inferred from the README.
  • Prerequisites: Python 3.9.7, PyTorch 1.10.1, transformers 4.2.1, tqdm, numpy, sentencepiece, pyyaml.
  • Data: Download preprocessed data from Google Drive and place in the data folder. Raw data can be downloaded and placed in raw_data.
  • Checkpoints: Download pretrained checkpoints into the snap folder.
  • Training: Use scripts in the scripts folder (e.g., bash scripts/pretrain_P5_base_beauty.sh 4 for 4 GPUs).
  • Evaluation: Use Jupyter notebooks in the notebooks folder.
  • Links: Paper: https://arxiv.org/pdf/2203.13366.pdf; Hugging Face: https://huggingface.co/makitanikaze/P5 (see the loading sketch after this list).
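As a quick sanity check after downloading a checkpoint, the sketch below loads it with the stock T5 classes. This assumes the Hugging Face checkpoint is in standard T5 format and that the repo id "makitanikaze/P5" resolves to a single loadable model; the released code wraps a modified T5, so consult the Hugging Face page for the actual checkpoint names:

    # A minimal sketch, assuming a stock-T5-compatible checkpoint; the repo id
    # below is taken from the README link and may point to a collection rather
    # than a single loadable model.
    from transformers import T5ForConditionalGeneration, T5Tokenizer

    model_id = "makitanikaze/P5"
    tokenizer = T5Tokenizer.from_pretrained(model_id)
    model = T5ForConditionalGeneration.from_pretrained(model_id)

    prompt = ("Given the purchase history of user_15: item_101, item_2, "
              "predict the next possible item for the user.")
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_length=16)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))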

Highlighted Details

  • Unified "Pretrain, Personalized Prompt, and Predict" (P5) paradigm for recommendation.
  • Converts all recommendation data into natural language sequences.
  • Enables zero-shot and few-shot recommendation via personalized prompts.
  • Pushes recommender systems along the shallow-model → deep-model → big-model trajectory outlined in the paper.

Maintenance & Community

The README acknowledges VL-T5, PETER, and S3-Rec. No community links (Discord, Slack) or roadmap are provided.

Licensing & Compatibility

The README does not explicitly state a license; the code is released for research purposes, and suitability for commercial use or closed-source linking is unspecified.

Limitations & Caveats

The project pins older versions of PyTorch (1.10.1) and transformers (4.2.1), which may conflict with newer libraries and CUDA toolkits. The README reports no performance benchmarks and no known limitations beyond those discussed in the research paper.
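Since no requirements.txt ships with the repository, a pinned file reconstructed from the versions listed under Quick Start might look like the following (only torch and transformers carry explicit version pins in the README; the remaining packages are left unpinned):

    torch==1.10.1
    transformers==4.2.1
    tqdm
    numpy
    sentencepiece
    pyyaml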

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 10 stars in the last 90 days
