ScienceQA by lupantech

Science QA dataset & code for multimodal reasoning research

created 2 years ago
683 stars


Project Summary

This repository provides the dataset and code for the NeurIPS 2022 paper "Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering." It addresses the challenge of multimodal reasoning in science education by offering a benchmark dataset and methods for generating explanatory "thought chains" to solve complex questions. The target audience includes AI researchers and developers working on multimodal understanding, explainable AI, and educational AI applications.

How It Works

The ScienceQA dataset comprises over 21,000 multimodal multiple-choice questions spanning natural, language, and social sciences, featuring rich annotations including lectures and explanations. The core approach involves using language models to generate these "thought chains" (lectures and explanations) as a form of chain-of-thought (CoT) reasoning. This mimics human multi-hop reasoning processes, enhancing question-answering performance by providing step-by-step justifications.
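As a rough illustration of the idea, the sketch below assembles a few-shot prompt in which each solved example shows the answer followed by its lecture and explanation, and the test question is left open for the model to complete. The `Problem` fields, the `build_example` and `build_prompt` helpers, the exact prompt wording, and the sample questions are all assumptions made for illustration; they are not taken from the repository's code.

```python
from dataclasses import dataclass
from typing import List

LETTERS = "ABCDE"


@dataclass
class Problem:
    # Hypothetical record layout; the real dataset schema may differ.
    question: str
    choices: List[str]
    answer: int            # index into choices
    context: str = "N/A"
    lecture: str = ""
    solution: str = ""


def format_options(choices: List[str]) -> str:
    """Render choices as '(A) ... (B) ...'."""
    return " ".join(f"({LETTERS[i]}) {c}" for i, c in enumerate(choices))


def build_example(p: Problem, with_output: bool = True) -> str:
    """Format one problem; solved examples include the thought chain."""
    text = (
        f"Question: {p.question}\n"
        f"Context: {p.context}\n"
        f"Options: {format_options(p.choices)}\n"
    )
    if with_output:
        # Answer first, then lecture and explanation as the justification.
        text += (
            f"Answer: The answer is {LETTERS[p.answer]}. "
            f"BECAUSE: {p.lecture} {p.solution}"
        )
    else:
        # Leave the answer open for the language model to complete.
        text += "Answer:"
    return text


def build_prompt(shots: List[Problem], test: Problem) -> str:
    """Concatenate solved in-context examples and the open test question."""
    parts = [build_example(s) for s in shots]
    parts.append(build_example(test, with_output=False))
    return "\n\n".join(parts)


# Made-up example data, not drawn from the dataset.
shot = Problem(
    question="Which is a solid?",
    choices=["water", "a rock"],
    answer=1,
    lecture="A solid has a shape of its own.",
    solution="A rock keeps its shape.",
)
test = Problem(question="Which is a liquid?", choices=["milk", "a brick"], answer=0)
print(build_prompt([shot], test))
```

Placing the answer before the lecture and explanation is only one possible ordering; a format that generates the reasoning first and the answer last would need the output template rearranged.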

Quick Start & Requirements

  • Install dependencies: pip install -r requirements.txt
  • Prerequisites: Python 3.8.10, PyTorch 1.12.1+cu113, CUDA 11.3.
  • Dataset download: Run tools/download.sh or download from Google Drive.
  • Run GPT-3 (CoT) model: cd models && python run_gpt3.py --label exp1 --test_split test --test_number -1 --shot_number 2 --prompt_format QCM-ALE --seed 3
  • Explore: Project page: https://scienceqa.github.io/
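The flags in the run command above suggest a seeded choice of in-context examples and an optional cap on the number of test questions. The sketch below shows one plausible reading of `--shot_number`, `--seed`, and `--test_number`. The helper names `select_shots` and `select_test`, the ID lists, and the interpretation of `-1` as "use every test question" are assumptions, not confirmed behavior of `run_gpt3.py`.

```python
import random
from typing import List, Sequence


def select_shots(train_ids: Sequence[str], shot_number: int, seed: int) -> List[str]:
    """Pick in-context example IDs reproducibly from the training split."""
    rng = random.Random(seed)  # local generator, leaves global state untouched
    return rng.sample(list(train_ids), shot_number)


def select_test(test_ids: Sequence[str], test_number: int) -> List[str]:
    """Return the first test_number IDs; a negative value is assumed to mean all."""
    ids = list(test_ids)
    return ids if test_number < 0 else ids[:test_number]


# Hypothetical problem IDs for illustration only.
train_ids = [str(i) for i in range(1, 101)]
test_ids = [str(i) for i in range(101, 121)]

shots = select_shots(train_ids, shot_number=2, seed=3)
tests = select_test(test_ids, test_number=-1)
print(len(shots), len(tests))
```

Using a dedicated `random.Random(seed)` instance keeps the selection repeatable across runs with the same seed, which is the usual reason for exposing a seed flag in few-shot experiments.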

Highlighted Details

  • Dataset includes 21,208 multimodal questions across 3 subjects, 26 topics, 127 categories, and 379 skills.
  • CoT prompting improved question-answering accuracy by 1.20 percentage points for few-shot GPT-3 and by 3.99 percentage points for fine-tuned UnifiedQA.
  • Leaderboard features evaluations of numerous models, including recent SOTA results from LLaVA (92.53%) and Chameleon (86.54%).
  • Dataset is available on HuggingFace Datasets.
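Accuracy figures like those above are percentages of correctly answered multiple-choice questions. A minimal sketch of such a computation, overall and per subject, might look as follows. The record keys (`subject`, `answer`, `prediction`) and the sample records are hypothetical and are not the repository's evaluation format.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping


def accuracy_by_subject(records: Iterable[Mapping]) -> Dict[str, float]:
    """Return accuracy in percent for each subject plus an 'overall' entry."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for r in records:
        hit = int(r["prediction"] == r["answer"])
        # Count each record once for its subject and once overall.
        for key in (r["subject"], "overall"):
            totals[key] += 1
            correct[key] += hit
    return {k: 100.0 * correct[k] / totals[k] for k in totals}


# Made-up predictions for illustration only.
records = [
    {"subject": "natural science", "answer": 0, "prediction": 0},
    {"subject": "natural science", "answer": 2, "prediction": 1},
    {"subject": "social science", "answer": 1, "prediction": 1},
    {"subject": "language science", "answer": 3, "prediction": 3},
]
print(accuracy_by_subject(records))
```

The difference between two such accuracies, for example with and without CoT prompting, is a gain in percentage points.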

Maintenance & Community

The project is described as actively maintained; its most recent updates date from late 2023, when the leaderboard featured over 100 models. Community engagement is encouraged via email, Twitter, and GitHub issues.

Licensing & Compatibility

  • Code: MIT License.
  • Dataset: Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0). This license prohibits commercial use and requires adaptations to be shared under the same terms.

Limitations & Caveats

The leaderboard data is manually collected and may contain errors or ambiguities. The dataset's CC BY-NC-SA 4.0 license restricts commercial use of the dataset itself.

Health Check

  • Last commit: 10 months ago
  • Responsiveness: 1 week
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 25 stars in the last 90 days
