question_generation by patil-suraj

Question generation study using transformers

created 5 years ago
1,129 stars

Top 34.7% on sourcepulse

Project Summary

This repository provides a straightforward, end-to-end approach to neural question generation (QG) using pre-trained transformer models. It targets researchers and developers who want to implement or experiment with QG, offering simplified data processing, training scripts, and inference pipelines that aim to make QG more accessible than existing, more complex methods.

How It Works

The project explores three main QG strategies: answer-aware QG (where the answer is provided), answer extraction models, and end-to-end (answer-agnostic) QG. It leverages the T5 model, adapting it for these tasks through various input formatting techniques like "prepend" and "highlight" to guide the model. A key innovation is the "multitask QA-QG" approach, which fine-tunes a single T5 model to perform answer extraction, question generation, and question answering simultaneously, reducing pipeline complexity.
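To make the two answer-aware input formats concrete, here is a minimal sketch of how a T5 input might be assembled; the exact prefixes and highlight tokens ("answer:", "context:", "generate question:", "<hl>") are assumptions for illustration and should be checked against the repo's data processing scripts.

```python
# Hedged sketch of the "prepend" and "highlight" answer-aware formats.
# The prefix and sentinel strings below are assumptions, not verbatim
# from the repo's preprocessing code.

context = "42 is the answer to life, the universe and everything."
answer = "42"

# "prepend" format: the answer is prepended to the context as plain text.
prepend_input = f"answer: {answer} context: {context}"

# "highlight" format: the answer span is wrapped in highlight tokens inside
# the context, and a task prefix tells T5 what to generate.
highlighted = context.replace(answer, f"<hl> {answer} <hl>", 1)
highlight_input = f"generate question: {highlighted}"

print(prepend_input)
print(highlight_input)
```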

Quick Start & Requirements

  • Install: pip install transformers==3.0.0 nltk nlp==0.2.0 (the nlp package is only needed for fine-tuning).
  • Prerequisites: Python 3.x, NLTK data (python -m nltk.downloader punkt).
  • Usage: exposes 🤗 Transformers-style pipelines: pipeline("question-generation"), pipeline("multitask-qa-qg"), pipeline("e2e-qg"); see the sketch after this list.
  • Docs: 🤗 Transformers
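
A minimal usage sketch, assuming the repository's pipelines.py module is importable from the working directory (the pipeline() factory below is the project's own, modeled on the 🤗 Transformers API, not part of the transformers package itself):

```python
# Usage sketch under the assumption that this repo's pipelines.py is on the path.
from pipelines import pipeline

# Answer-aware QG: extracts candidate answers from the text, then generates
# a question for each one.
qg = pipeline("question-generation")
print(qg("42 is the answer to life, the universe and everything."))

# End-to-end QG: generates questions directly from the context without
# answer supervision.
e2e = pipeline("e2e-qg")
print(e2e("Python is a programming language. It was created by Guido van Rossum."))
```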

Highlighted Details

  • Implements answer-aware QG with "prepend" and "highlight" input formats.
  • Offers a multitask QA-QG model for integrated answer extraction, QG, and QA.
  • Supports end-to-end QG, generating multiple questions from context without answer supervision.
  • Provides benchmark results on SQuAD1.0 dev set using BLEU-4, METEOR, ROUGE-L, QA-EM, and QA-F1 metrics.

Maintenance & Community

  • Primarily developed by Suraj Patil.
  • Uses Weights & Biases (wandb) for experiment tracking.
  • Relevant papers are linked for further reading.

Licensing & Compatibility

  • The repository itself does not explicitly state a license in the README.
  • Dependencies like 🤗 Transformers are typically Apache 2.0 licensed, allowing commercial use.

Limitations & Caveats

  • The project pins transformers==3.0.0, an older release; running it alongside newer 🤗 Transformers versions may require careful dependency management or code updates.
  • Fine-tuning requires additional data processing steps using provided scripts.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 9 stars in the last 90 days
