neural-template-gen by harvardnlp

Research paper code for neural template learning for text generation

created 7 years ago
265 stars

Top 97.2% on sourcepulse

View on GitHub
Project Summary

This repository provides code for learning neural templates for text generation, specifically addressing data-to-text tasks. It is targeted at researchers and practitioners in Natural Language Generation (NLG) who are interested in structured prediction and template-based generation methods. Its contribution is an approach that combines neural networks with structured latent variables, yielding more controllable and interpretable generation than purely end-to-end encoder-decoder models.

How It Works

The core approach uses a neural hidden semi-Markov model (HSMM) decoder that generates text by first inducing a latent "template": a sequence of discrete states, each governing a contiguous segment of the output, and then realizing each segment conditioned on the source data. This structured latent-variable framework gives explicit control over the generation process and can produce more coherent, data-aligned outputs than purely end-to-end sequence-to-sequence models. After training, Viterbi segmentation of the training data recovers the learned segment structure, from which reusable templates are extracted.
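To make the segmentation step concrete, here is a minimal, self-contained sketch of semi-Markov Viterbi decoding: given per-segment emission scores, state-transition scores, and a maximum segment length, it recovers the best-scoring segmentation and state labeling. This is an illustrative toy (the dictionary-based score tables and function name are assumptions, not the repository's API; the real model computes these scores with neural networks):

```python
def semimarkov_viterbi(T, K, L, emit, trans, init):
    """Best segmentation/labeling of a length-T sequence.

    emit[(i, j, k)] : score of a segment covering positions i..j-1 with state k
    trans[(k1, k2)] : score of transitioning from state k1 to state k2
    init[k]         : score of starting in state k
    L               : maximum segment length
    Missing entries default to 0. Returns (score, [(start, end, state), ...]).
    """
    NEG = float("-inf")
    # best[j][k]: score of the best segmentation of positions 0..j-1
    # whose final segment is labeled k; back[j][k] stores the backpointer
    best = [[NEG] * K for _ in range(T + 1)]
    back = [[None] * K for _ in range(T + 1)]
    for j in range(1, T + 1):
        for k in range(K):
            for l in range(1, min(L, j) + 1):
                i = j - l
                seg = emit.get((i, j, k), 0.0)
                if i == 0:
                    prev, score = None, init.get(k, 0.0) + seg
                else:
                    # pick the best previous state to transition from
                    prev, score = None, NEG
                    for kp in range(K):
                        s = best[i][kp] + trans.get((kp, k), 0.0)
                        if s > score:
                            prev, score = kp, s
                    score += seg
                if score > best[j][k]:
                    best[j][k] = score
                    back[j][k] = (i, prev)
    # backtrack to recover the argmax segmentation
    k = max(range(K), key=lambda s: best[T][s])
    score, j, segs = best[T][k], T, []
    while j > 0:
        i, prev = back[j][k]
        segs.append((i, j, k))
        j, k = i, (prev if prev is not None else k)
    return score, list(reversed(segs))
```

Collecting the state sequences of such segmentations across a training corpus, and keeping the most frequent ones, is the essence of template extraction.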

Quick Start & Requirements
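The page does not surface install steps; a minimal setup sketch, assuming the Python 2.7 / PyTorch 0.3.1 pins stated in the Limitations section (the conda environment name and the use of conda are assumptions):

```shell
# Clone the research code (tested only with Python 2.7 + PyTorch 0.3.1)
git clone https://github.com/harvardnlp/neural-template-gen.git
cd neural-template-gen

# An isolated legacy environment is strongly recommended, e.g. via conda
conda create -n ntg python=2.7
conda activate ntg
pip install torch==0.3.1
```

Given the versions involved, expect to consult the repository README for dataset preparation and exact training commands.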

Highlighted Details

  • Supports both non-autoregressive and autoregressive generation modes.
  • Includes Viterbi segmentation for template extraction.
  • Offers generation capabilities using learned templates and segmentations.
  • Provides example training and generation commands for E2E NLG and WikiBio datasets.

Maintenance & Community

  • Primary contact: swiseman[at]ttic.edu.
  • No explicit community channels (Discord/Slack) or roadmap mentioned.

Licensing & Compatibility

  • License not explicitly stated in the README.
  • Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The code is explicitly tested with Python 2.7 and PyTorch 0.3.1, which are outdated versions and may present significant compatibility challenges with modern Python environments and PyTorch versions. Training is noted as sensitive to the random seed, potentially requiring multiple runs for optimal performance.

Health Check

  • Last commit: 3 years ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 2 stars in the last 90 days
