question_generation by patil-suraj

Question generation study using transformers

created 5 years ago
1,129 stars

Top 34.7% on sourcepulse

Project Summary

This repository provides a straightforward, end-to-end approach to neural question generation (QG) using pre-trained transformer models. It targets researchers and developers who want to implement or experiment with QG, offering simplified data processing, training scripts, and inference pipelines that aim to make QG more accessible than existing, more complex methods.

How It Works

The project explores three main QG strategies: answer-aware QG (where the answer is provided), answer extraction models, and end-to-end (answer-agnostic) QG. It leverages the T5 model, adapting it for these tasks through various input formatting techniques like "prepend" and "highlight" to guide the model. A key innovation is the "multitask QA-QG" approach, which fine-tunes a single T5 model to perform answer extraction, question generation, and question answering simultaneously, reducing pipeline complexity.
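To make the two answer-aware input formats concrete, here is a minimal sketch of how a T5 input might be assembled; the exact prefixes and highlight tokens ("answer:", "context:", "generate question:", "<hl>") are assumptions for illustration and should be checked against the repo's data processing scripts.

```python
# Hedged sketch of the "prepend" and "highlight" answer-aware formats.
# The prefix and sentinel strings below are assumptions, not verbatim
# from the repo's preprocessing code.

context = "42 is the answer to life, the universe and everything."
answer = "42"

# "prepend" format: the answer is prepended to the context as plain text.
prepend_input = f"answer: {answer} context: {context}"

# "highlight" format: the answer span is wrapped in highlight tokens inside
# the context, and a task prefix tells T5 what to generate.
highlighted = context.replace(answer, f"<hl> {answer} <hl>", 1)
highlight_input = f"generate question: {highlighted}"

print(prepend_input)
print(highlight_input)
```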

Quick Start & Requirements

  • Install: pip install transformers==3.0.0 nltk nlp==0.2.0 (the nlp package is only needed for fine-tuning).
  • Prerequisites: Python 3.x, NLTK data (python -m nltk.downloader punkt).
  • Usage: exposes 🤗 Transformers-style pipelines: pipeline("question-generation"), pipeline("multitask-qa-qg"), pipeline("e2e-qg"); see the sketch after this list.
  • Docs: 🤗 Transformers
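
A minimal usage sketch, assuming the repository's pipelines.py module is importable from the working directory (the pipeline() factory below is the project's own, modeled on the 🤗 Transformers API, not part of the transformers package itself):

```python
# Usage sketch under the assumption that this repo's pipelines.py is on the path.
from pipelines import pipeline

# Answer-aware QG: extracts candidate answers from the text, then generates
# a question for each one.
qg = pipeline("question-generation")
print(qg("42 is the answer to life, the universe and everything."))

# End-to-end QG: generates questions directly from the context without
# answer supervision.
e2e = pipeline("e2e-qg")
print(e2e("Python is a programming language. It was created by Guido van Rossum."))
```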

Highlighted Details

  • Implements answer-aware QG with "prepend" and "highlight" input formats.
  • Offers a multitask QA-QG model for integrated answer extraction, QG, and QA.
  • Supports end-to-end QG, generating multiple questions from context without answer supervision.
  • Provides benchmark results on SQuAD1.0 dev set using BLEU-4, METEOR, ROUGE-L, QA-EM, and QA-F1 metrics.

Maintenance & Community

  • Primarily developed by Suraj Patil.
  • Uses Weights & Biases (wandb) for experiment tracking.
  • Relevant papers are linked for further reading.

Licensing & Compatibility

  • The repository itself does not explicitly state a license in the README.
  • Dependencies like 🤗 Transformers are typically Apache 2.0 licensed, allowing commercial use.

Limitations & Caveats

  • The project pins transformers==3.0.0, an older release; running it alongside newer 🤗 Transformers versions may require careful dependency management or code updates.
  • Fine-tuning requires additional data processing steps using provided scripts.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 9 stars in the last 90 days
