cdQA by cdqa-suite

End-to-end question answering system (no longer maintained)

created 6 years ago
616 stars

Top 54.3% on sourcepulse

Project Summary

This project provides an end-to-end system for closed-domain question answering, built upon HuggingFace's transformers library. It's designed for users needing to extract answers from specific document sets, offering tools for data preparation, model training, prediction, and evaluation.

How It Works

cdQA uses a two-stage pipeline. A retriever component first selects the documents most relevant to the question; a reader component (a BERT or DistilBERT model fine-tuned on SQuAD) then extracts the answer span from the retrieved context. Because the reader only processes the passages the retriever selects, the expensive model runs on a small part of a large document collection rather than on all of it.
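The retrieve-then-read flow described above can be illustrated with a minimal, standard-library-only sketch. This is not cdQA's implementation: the function names (`retrieve`, `read`, `answer`) and the sample `DOCS` are hypothetical, the retriever is a plain TF-IDF cosine ranking, and the "reader" is a toy sentence-overlap heuristic standing in for the fine-tuned BERT model.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_idf(docs):
    """Compute a smoothed inverse document frequency for every corpus term."""
    n = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(tokenize(doc)))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in doc_freq.items()}


def tfidf_vector(text, idf):
    """Sparse TF-IDF vector; terms unseen in the corpus are ignored."""
    counts = Counter(tokenize(text))
    return {t: c * idf[t] for t, c in counts.items() if t in idf}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, docs, idf, top_k=1):
    """Stage 1 (retriever): rank documents by similarity to the question."""
    q_vec = tfidf_vector(question, idf)
    ranked = sorted(docs, key=lambda d: cosine(q_vec, tfidf_vector(d, idf)),
                    reverse=True)
    return ranked[:top_k]


def read(question, context):
    """Stage 2 (toy reader): pick the sentence sharing most question tokens.

    A real reader would be a span-extraction model; this is only a stand-in.
    """
    q_tokens = set(tokenize(question))
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", context)
                 if s.strip()]
    return max(sentences, key=lambda s: len(q_tokens & set(tokenize(s))))


def answer(question, docs):
    """Run the reader only on the single best document from the retriever."""
    idf = build_idf(docs)
    best_doc = retrieve(question, docs, idf, top_k=1)[0]
    return read(question, best_doc)


# Hypothetical two-document corpus used only for illustration.
DOCS = [
    "The library opens at nine. Late returns are charged a small fee.",
    "The cafeteria serves lunch from noon. Vegetarian meals are available daily.",
]

if __name__ == "__main__":
    print(answer("When does the cafeteria serve lunch?", DOCS))
```

The point of the split is visible in `answer`: the reader never sees documents the retriever ranked low, which is what keeps the expensive stage cheap on a large corpus.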

Quick Start & Requirements

  • Install via pip: pip install cdqa
  • Hardware: Experiments conducted on CPU (AWS EC2 t2.medium) and GPU (AWS EC2 p3.2xlarge with Tesla V100).
  • Dependencies: Java OpenJDK required for PDF conversion.
  • Documentation: Notebook Examples available via Binder or Google Colab.

Highlighted Details

  • End-to-end system for closed-domain QA.
  • Includes data converters for PDF and Markdown.
  • Supports training custom readers and fine-tuning on SQuAD-like datasets.
  • REST API deployment option available.
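For readers unfamiliar with what a "SQuAD-like" dataset looks like, the sketch below builds one record in the public SQuAD v1.1 JSON layout (`data` → `paragraphs` → `qas` → `answers` with a character-level `answer_start`). The helper names `make_squad_example` and `validate` and the sample text are hypothetical, and whether cdQA expects any additional fields is not stated here.

```python
import json


def make_squad_example(title, context, question, answer_text, qid):
    """Build one SQuAD-style article with a single paragraph and question."""
    start = context.find(answer_text)
    if start == -1:
        raise ValueError("answer text must appear verbatim in the context")
    return {
        "title": title,
        "paragraphs": [{
            "context": context,
            "qas": [{
                "id": qid,
                "question": question,
                # answer_start is a character offset into the context
                "answers": [{"text": answer_text, "answer_start": start}],
            }],
        }],
    }


def validate(dataset):
    """Return True if every answer_start offset matches its answer text."""
    for article in dataset["data"]:
        for para in article["paragraphs"]:
            for qa in para["qas"]:
                for ans in qa["answers"]:
                    start = ans["answer_start"]
                    end = start + len(ans["text"])
                    if para["context"][start:end] != ans["text"]:
                        return False
    return True


# Hypothetical sample data used only for illustration.
context = "The cafeteria serves lunch from noon."
dataset = {
    "version": "1.1",
    "data": [
        make_squad_example(
            title="Cafeteria",
            context=context,
            question="When does the cafeteria serve lunch?",
            answer_text="noon",
            qid="q-0001",
        )
    ],
}

if __name__ == "__main__":
    print(json.dumps(dataset, indent=2))
    print("offsets valid:", validate(dataset))
```

Checking the offsets before training is worthwhile because a misaligned `answer_start` silently teaches a span-extraction reader the wrong target.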

Maintenance & Community

[NOT MAINTAINED] This repository is kept for educational purposes. A maintained alternative is deepset-ai/haystack.

Licensing & Compatibility

  • License: Apache-2.0.
  • Compatibility: Apache-2.0 is a permissive license that allows use in commercial and closed-source applications, provided its notice and attribution terms are followed.

Limitations & Caveats

The project is explicitly marked as not maintained, indicating a lack of ongoing development or support. Users seeking current features or bug fixes should consider the suggested alternative.

Health Check

  • Last commit: 5 years ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star history: 0 stars in the last 90 days
