ScienceQA by lupantech

Science QA dataset & code for multimodal reasoning research

Created 3 years ago

717 stars

Top 48.0% on SourcePulse

2 Experts Love This Project

Jinze Bai

Research Scientist at Alibaba Qwen

Elvis Saravia

Founder of DAIR.AI

Project Summary

This repository provides the dataset and code for the NeurIPS 2022 paper "Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering." It addresses the challenge of multimodal reasoning in science education by offering a benchmark dataset and methods for generating explanatory "thought chains" to solve complex questions. The target audience includes AI researchers and developers working on multimodal understanding, explainable AI, and educational AI applications.

How It Works

The ScienceQA dataset comprises over 21,000 multimodal multiple-choice questions spanning natural, language, and social sciences, featuring rich annotations including lectures and explanations. The core approach involves using language models to generate these "thought chains" (lectures and explanations) as a form of chain-of-thought (CoT) reasoning. This mimics human multi-hop reasoning processes, enhancing question-answering performance by providing step-by-step justifications.
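As a rough illustration of the idea, the sketch below assembles a few-shot prompt in which each worked example ends with its answer followed by a lecture and an explanation, so the model is nudged to produce the same kind of thought chain for the test question. The `Problem` class, the field names, the "BECAUSE:" wording, and the toy questions are all assumptions made up for this example; they are not taken from the repository's own prompt templates.

```python
from dataclasses import dataclass
from typing import List

LETTERS = "ABCDE"


@dataclass
class Problem:
    """Hypothetical container for one multiple-choice question."""
    question: str
    choices: List[str]
    answer: int = 0          # index into choices
    context: str = ""        # text context, if any
    lecture: str = ""        # general background knowledge
    explanation: str = ""    # question-specific justification


def format_options(choices: List[str]) -> str:
    """Render choices as '(A) ... (B) ...'."""
    return " ".join(f"({LETTERS[i]}) {c}" for i, c in enumerate(choices))


def build_example(p: Problem, include_output: bool = True) -> str:
    """Format one problem; solved examples carry answer + thought chain."""
    lines = [
        f"Question: {p.question}",
        f"Context: {p.context or 'N/A'}",
        f"Options: {format_options(p.choices)}",
    ]
    if include_output:
        chain = " ".join(s for s in (p.lecture, p.explanation) if s)
        lines.append(f"Answer: The answer is {LETTERS[p.answer]}. BECAUSE: {chain}")
    else:
        # Leave the answer open so the model completes it.
        lines.append("Answer:")
    return "\n".join(lines)


def build_prompt(shots: List[Problem], test: Problem) -> str:
    """Concatenate solved in-context examples and the unsolved test question."""
    blocks = [build_example(s) for s in shots]
    blocks.append(build_example(test, include_output=False))
    return "\n\n".join(blocks)


# Toy data, invented for illustration only.
shot = Problem(
    question="Which of these is a solid?",
    choices=["water in a glass", "a rock"],
    answer=1,
    lecture="A solid has a definite shape and a definite volume.",
    explanation="A rock keeps its own shape, so it is a solid.",
)
test = Problem(question="Which of these is a liquid?", choices=["milk", "a brick"])
prompt = build_prompt([shot], test)
print(prompt)
```

Placing the answer before the lecture and explanation is only one possible ordering; the same builder could emit the thought chain first and the answer last.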

Quick Start & Requirements

Install dependencies: pip install -r requirements.txt
Prerequisites: Python 3.8.10, PyTorch 1.12.1+cu113, CUDA 11.3.
Dataset download: Run tools/download.sh or download from Google Drive.
Run GPT-3 (CoT) model: cd models && python run_gpt3.py --label exp1 --test_split test --test_number -1 --shot_number 2 --prompt_format QCM-ALE --seed 3
Explore: Project page: https://scienceqa.github.io/
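The `--prompt_format QCM-ALE` flag in the command above appears to encode which fields go into the model input and which are expected in the output, separated by a hyphen. The sketch below decodes such a string under that assumption. The letter-to-field mapping (Q = question, C = context, M = multiple-choice options, A = answer, L = lecture, E = explanation) and the `parse_prompt_format` helper are hypothetical and may not match how `run_gpt3.py` actually interprets the flag.

```python
from typing import List, Tuple

# Assumed meaning of each letter in a prompt-format string.
FIELD_NAMES = {
    "Q": "question",
    "C": "context",
    "M": "multiple-choice options",
    "A": "answer",
    "L": "lecture",
    "E": "explanation",
}


def parse_prompt_format(fmt: str) -> Tuple[List[str], List[str]]:
    """Split 'INPUT-OUTPUT' into ordered lists of field names."""
    parts = fmt.split("-")
    if len(parts) != 2:
        raise ValueError(f"expected exactly one '-', got {fmt!r}")
    input_part, output_part = parts
    unknown = [c for c in input_part + output_part if c not in FIELD_NAMES]
    if unknown:
        raise ValueError(f"unknown field letters: {unknown}")
    return (
        [FIELD_NAMES[c] for c in input_part],
        [FIELD_NAMES[c] for c in output_part],
    )


inputs, outputs = parse_prompt_format("QCM-ALE")
print("input fields: ", inputs)
print("output fields:", outputs)
```

Read this way, `QCM-ALE` asks the model for the answer first, followed by the lecture and the explanation.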

Highlighted Details

Dataset includes 21,208 multimodal questions across 3 subjects, 26 topics, 127 categories, and 379 skills.
CoT prompting improved accuracy by 1.20 percentage points for few-shot GPT-3 and by 3.99 percentage points for fine-tuned UnifiedQA.
Leaderboard features evaluations of numerous models, including recent SOTA results from LLaVA (92.53%) and Chameleon (86.54%).
Dataset is available on HuggingFace Datasets.
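The gains and leaderboard scores above are accuracies on multiple-choice questions, so an improvement is the difference between two accuracy values. The sketch below shows that arithmetic on a tiny invented answer key. The helper names and all numbers are illustrative assumptions and do not reproduce any reported result.

```python
from typing import Sequence


def accuracy(predictions: Sequence[int], answers: Sequence[int]) -> float:
    """Percentage of questions whose predicted choice index matches the key."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == a for p, a in zip(predictions, answers))
    return 100.0 * correct / len(answers)


def gain_in_points(baseline_acc: float, new_acc: float) -> float:
    """Absolute difference between two accuracies, in percentage points."""
    return round(new_acc - baseline_acc, 2)


# Toy answer key and predictions, invented for illustration only.
answers = [0, 1, 2, 1, 0]
baseline_preds = [0, 1, 0, 0, 0]   # 3 of 5 correct
cot_preds = [0, 1, 2, 0, 0]        # 4 of 5 correct

baseline_acc = accuracy(baseline_preds, answers)
cot_acc = accuracy(cot_preds, answers)
print(baseline_acc, cot_acc, gain_in_points(baseline_acc, cot_acc))
```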

Maintenance & Community

The most recent updates date from late 2023, when the leaderboard was expanded to cover over 100 models. Community engagement is encouraged via email, Twitter, and GitHub issues.

Licensing & Compatibility

Code: MIT License.
Dataset: Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0). This license restricts commercial use.

Limitations & Caveats

Leaderboard results are collected manually and may contain errors or ambiguities. The non-commercial restriction of CC BY-NC-SA 4.0 applies to the dataset only; the code is licensed separately under MIT.

Health Check

Last Commit

1 year ago

Responsiveness

Inactive

Star History

8 stars in the last 30 days