RLSeq2Seq  by yaserkl

Research paper code for sequence-to-sequence models using deep reinforcement learning

created 7 years ago
767 stars

Top 46.4% on sourcepulse

Project Summary

This repository provides a framework for applying Deep Reinforcement Learning (RL) techniques to sequence-to-sequence (seq2seq) models, primarily for abstractive text summarization. It addresses common seq2seq challenges like exposure bias and train/test inconsistency by integrating RL methods. The target audience includes researchers and practitioners in NLP and deep learning looking to leverage RL for improved seq2seq performance.

How It Works

The framework implements several training strategies for seq2seq models: Scheduled Sampling (with hard or soft argmax), End-to-End Backpropagation, self-critical Policy Gradient, and Actor-Critic methods using Double Deep Q-Networks (DDQN) and Dueling Networks. Rather than relying solely on maximum likelihood estimation (MLE), these techniques either expose the model to its own predictions during training or optimize it directly for task-specific metrics such as ROUGE, with the aim of mitigating exposure bias and improving generation quality.
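To make the self-critical policy-gradient idea concrete, here is a minimal sketch in plain Python. It is not taken from the repository: the function names are hypothetical, and `unigram_f1` is a toy stand-in for a real ROUGE scorer. The sketch assumes the usual self-critical formulation, in which the reward of the greedy decode serves as the baseline for the reward of a sampled decode.

```python
import math
from collections import Counter


def unigram_f1(candidate, reference):
    """Toy reward (hypothetical): clipped unigram-overlap F1, a rough stand-in for ROUGE-1."""
    if not candidate or not reference:
        return 0.0
    # Counter intersection keeps the minimum count per token (clipped overlap).
    overlap = sum((Counter(candidate) & Counter(reference)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / float(len(candidate))
    recall = overlap / float(len(reference))
    return 2.0 * precision * recall / (precision + recall)


def self_critical_loss(sampled_log_probs, sampled_tokens, greedy_tokens, reference):
    """Loss = -(r(sampled) - r(greedy)) * sum of log-probs of the sampled tokens.

    The greedy decode acts as the baseline: samples that beat it are
    reinforced, samples that score below it are discouraged.
    """
    advantage = unigram_f1(sampled_tokens, reference) - unigram_f1(greedy_tokens, reference)
    return -advantage * sum(sampled_log_probs)


reference = ["the", "cat", "sat"]
sampled = ["the", "cat", "sat"]      # sequence drawn from the model's distribution
greedy = ["the", "dog"]              # argmax decode used as the baseline
log_probs = [math.log(0.5)] * 3      # model log-probabilities of the sampled tokens
loss = self_critical_loss(log_probs, sampled, greedy, reference)
```

In this toy case the sampled sequence scores higher than the greedy one, so the loss is positive and minimizing it raises the log-probability of the sampled tokens. In a real model the log-probabilities would come from the decoder and the gradient would flow through them only, with both rewards treated as constants.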

Quick Start & Requirements

  • Install: pip install -r python_requirements.txt
  • Prerequisites: Python 2.7, TensorFlow 1.10.1, CUDA 9, cuDNN 7.1.
  • Data: Requires pre-processed CNN/Daily Mail or Newsroom datasets. Helper scripts are provided for downloading and preprocessing.
  • Documentation: arXiv paper
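Because the prerequisites are pinned to old versions, it can help to verify the interpreter and TensorFlow version before installing anything else. The helper below is a hypothetical sketch, not part of the repository; it only compares version values against the prerequisites listed above and uses no third-party imports.

```python
REQUIRED_PYTHON = (2, 7)
REQUIRED_TENSORFLOW = "1.10.1"


def check_environment(python_version, tensorflow_version):
    """Return a list of human-readable mismatches (empty if everything matches).

    python_version: a tuple such as sys.version_info
    tensorflow_version: a string such as tensorflow.__version__, or None if not installed
    """
    problems = []
    major_minor = tuple(python_version[:2])
    if major_minor != REQUIRED_PYTHON:
        problems.append(
            "Python %d.%d found, %d.%d expected" % (major_minor + REQUIRED_PYTHON)
        )
    if tensorflow_version != REQUIRED_TENSORFLOW:
        problems.append(
            "TensorFlow %s found, %s expected" % (tensorflow_version, REQUIRED_TENSORFLOW)
        )
    return problems
```

A typical call would pass `sys.version_info` and, if TensorFlow imports successfully, `tensorflow.__version__`; an empty list means the environment matches the stated prerequisites.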

Highlighted Details

  • Implements various RL strategies: Scheduled Sampling, Policy-Gradient (Self-Critic), and Actor-Critic (DDQN).
  • Supports attention mechanisms: temporal attention and intra-decoder attention.
  • Offers options for different training regimes: MLE, RL, and combined MLE+RL.
  • Includes detailed command-line examples for training and evaluation of different models.
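The combined MLE+RL regime is commonly expressed as a weighted sum of the two losses. The sketch below assumes that formulation; the names `mixed_loss`, `eta`, and `linear_eta_schedule` are illustrative and should not be read as the repository's actual flags or functions.

```python
def mixed_loss(mle_loss, rl_loss, eta):
    """Blend the two objectives: eta = 0 is pure MLE, eta = 1 is pure RL."""
    if not 0.0 <= eta <= 1.0:
        raise ValueError("eta must lie in [0, 1]")
    return eta * rl_loss + (1.0 - eta) * mle_loss


def linear_eta_schedule(step, total_steps):
    """Hypothetical schedule: ramp eta linearly from 0 to 1 over total_steps."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    return min(1.0, max(0.0, step / float(total_steps)))


# Halfway through the ramp, both losses contribute equally.
eta = linear_eta_schedule(step=5, total_steps=10)
loss = mixed_loss(mle_loss=2.0, rl_loss=4.0, eta=eta)
```

Starting near pure MLE and shifting weight toward the RL term is one common way to keep early training stable, since the RL signal is noisy until the model produces reasonable sequences.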

Maintenance & Community

The project is marked as "no longer actively maintained." Contributions are welcome via pull requests.

Licensing & Compatibility

  • License: MIT License (as indicated by the PyPI badge, though the LICENSE.txt file is not directly linked).
  • Compatibility: Requires older versions of TensorFlow (1.10.1) and Python (2.7), which may pose compatibility challenges with modern systems.

Limitations & Caveats

The project states that it is "no longer actively maintained," so bugs and compatibility problems may not be fixed upstream. Its pinned dependencies (TensorFlow 1.10.1 and Python 2.7) predate current hardware and software stacks, and Python 2.7 itself reached end of life in January 2020, so running the code will likely require a legacy or containerized environment.

Health Check

  • Last commit: 2 years ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 2 stars in the last 90 days
