GPT2-Summary  by qingkongzhiqian

Chinese summary generation model based on GPT2

Created 5 years ago
408 stars

Top 71.5% on SourcePulse

Project Summary

This project provides a Chinese text summarization model based on the GPT-2 architecture. It targets researchers and developers working on Chinese NLP who need a pre-trained model that generates concise summaries of longer texts. Its main benefit is a readily available GPT-2 model already fine-tuned for Chinese summarization.

How It Works

The model builds on the GPT-2 architecture, fine-tuning the GPT2-Chinese and GPT2-chitchat models. Each training example is formed by concatenating a source text with its summary, and the model is then trained on the resulting sequence. This adapts GPT-2's generative capabilities to the specific task of Chinese text summarization.
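The concatenation step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual preprocessing code: the `[CLS] text [SEP] summary [SEP]` layout, the token ids `CLS_ID` and `SEP_ID`, the `max_len` default, and the function name `build_example` are all assumptions made for the example.

```python
from typing import List, Tuple

# Hypothetical special-token ids; real values depend on the vocabulary in use.
CLS_ID = 101
SEP_ID = 102


def build_example(
    text_ids: List[int], summary_ids: List[int], max_len: int = 1024
) -> Tuple[List[int], List[int]]:
    """Concatenate a tokenized article and its summary into one sequence.

    Returns (input_ids, labels) for next-token prediction, where labels[i]
    is the token that follows input_ids[i] in the concatenated sequence.
    """
    # Keep the whole summary and the three special tokens; truncate the article.
    budget = max_len - len(summary_ids) - 3
    if budget < 0:
        raise ValueError("summary does not fit within max_len")
    seq = [CLS_ID] + text_ids[:budget] + [SEP_ID] + summary_ids + [SEP_ID]
    # Shift by one position so each input token is paired with its successor.
    return seq[:-1], seq[1:]


if __name__ == "__main__":
    inputs, labels = build_example([1, 2, 3], [7, 8])
    print(inputs)
    print(labels)
```

Truncating the article rather than the summary is a design choice of this sketch: it keeps the target text intact when the pair exceeds the context length.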

Highlighted Details

  • Fine-tuned on NLPCC summarization data (news corpus).
  • Uses a "sequential" concatenation method for training data.
  • Offers two pre-trained models: GPT2-nlpcc-summary and GPT2-wiki.
  • Interactive prediction allows pasting text directly into the console.
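As an illustration of the interactive mode mentioned in the last item, the sketch below shows one way a console tool could accept pasted multi-line text. It is not taken from the project: `read_articles` is a hypothetical helper, and the convention that a blank line ends one pasted article is an assumption.

```python
import sys
from typing import Iterator, List, TextIO


def read_articles(stream: TextIO) -> Iterator[str]:
    """Yield one article per paste; a blank line marks the end of an article."""
    buffer: List[str] = []
    for line in stream:
        line = line.strip()
        if line:
            buffer.append(line)
        elif buffer:
            # Lines are joined without spaces, which suits Chinese text.
            yield "".join(buffer)
            buffer = []
    if buffer:
        # Flush the last article if the input ended without a blank line.
        yield "".join(buffer)


if __name__ == "__main__":
    for article in read_articles(sys.stdin):
        # A real tool would pass the article to the summarization model here.
        print(f"received {len(article)} characters")
```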

Maintenance & Community

The project appears to be a personal or academic effort. The README mentions no active maintenance, community channels (such as Discord or Slack), or roadmap. It credits the GPT2-Chinese and GPT2-chitchat projects.

Licensing & Compatibility

The README does not explicitly state a license. Given its reliance on GPT2-Chinese and GPT2-chitchat, users should verify the licenses of those underlying projects for compatibility, especially for commercial use.

Limitations & Caveats

The project is presented as an exploration and may not be suitable for production use without retraining both the general and the summary models. The README notes that summarization quality may be better in vertical domains than on general news. Because decoding is sampling-based, generated summaries can vary between runs on the same input.
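To show why sampling-based decoding produces varying output, here is a minimal sketch of one common sampling scheme, top-k sampling with temperature. The source does not say which sampling strategy the project uses, so the choice of top-k, the function name `sample_top_k`, and its parameters are assumptions for illustration only.

```python
import math
import random
from typing import List, Optional


def sample_top_k(
    logits: List[float],
    k: int,
    temperature: float = 1.0,
    rng: Optional[random.Random] = None,
) -> int:
    """Draw one token index from the k highest-scoring candidates."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    rng = rng or random.Random()
    k = max(1, min(k, len(logits)))
    # Indices of the k largest logits, highest first.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    scaled = [logits[i] / temperature for i in top]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(top, weights=weights, k=1)[0]


if __name__ == "__main__":
    logits = [0.1, 2.0, -1.0, 1.5]
    # Unseeded draws may differ from run to run.
    print([sample_top_k(logits, k=2) for _ in range(5)])
```

With `k=1` the function reduces to greedy decoding and always returns the highest-scoring index. Passing a seeded `random.Random` makes the draws reproducible.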

Health Check

  • Last commit: 2 years ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 0 stars in the last 30 days
