apps by hendrycks

Dataset for measuring coding challenge competence (NeurIPS 2021)

Created 4 years ago

502 stars

Top 62.1% on SourcePulse

2 Experts Love This Project

Johannes Hagemann

Cofounder of Prime Intellect

Jinze Bai

Research Scientist at Alibaba Qwen

Project Summary

This repository provides the APPS dataset and code for measuring coding challenge competence, targeting researchers and developers in AI and natural language processing. It enables the evaluation of large language models on their ability to solve programming problems, offering a standardized benchmark for progress in automated programming.

How It Works

The APPS dataset consists of programming problems sourced from competitive programming platforms. The associated code allows for fine-tuning and evaluating transformer-based language models on their ability to generate correct code solutions. This approach standardizes the assessment of coding competence, moving beyond simple code completion tasks.
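The evaluation idea described above can be sketched as follows: run a generated program on each test input and count how many expected outputs it reproduces. This is a minimal illustration, not the repository's evaluation harness. The function names `run_candidate` and `fraction_passed`, the 4-second timeout, and the whitespace-insensitive comparison are all assumptions made for the example.

```python
import subprocess
import sys
from typing import List, Optional, Tuple


def run_candidate(source: str, stdin_text: str, timeout_s: float = 4.0) -> Optional[str]:
    """Run candidate Python source in a fresh interpreter.

    Returns the program's stdout, or None if it crashed or timed out.
    (Hypothetical helper; the timeout value is an arbitrary assumption.)
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", source],
            input=stdin_text,
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return None
    if result.returncode != 0:
        return None
    return result.stdout


def fraction_passed(source: str, tests: List[Tuple[str, str]]) -> float:
    """Fraction of (stdin, expected stdout) pairs the candidate reproduces.

    Outputs are compared token by token after splitting on whitespace,
    which is an assumption of this sketch, not a rule taken from the source.
    """
    if not tests:
        return 0.0
    passed = 0
    for stdin_text, expected in tests:
        actual = run_candidate(source, stdin_text)
        if actual is not None and actual.split() == expected.split():
            passed += 1
    return passed / len(tests)
```

A candidate such as `print(sum(map(int, input().split())))` could be scored with `fraction_passed(candidate, [("1 2\n", "3\n")])`. Model-generated code is untrusted, so a real harness would run it inside a sandbox rather than directly on the host.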

Quick Start & Requirements

Dataset download: ~1.3GB.
Fine-tuned weights for GPT-2 1.5B and GPT-Neo 2.7B are available.
Training and evaluation instructions are detailed in train/README and eval/README, respectively.
The dataset is also available via the Hugging Face datasets library.
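For readers taking the Hugging Face route, a record might be unpacked roughly as sketched here. The field names (`question`, `solutions`, `input_output`), the JSON-string encoding of the last two, and the dataset identifier in the comment are assumptions that the summary does not confirm, so check them against the dataset card before relying on them. `parse_apps_record` is a hypothetical helper.

```python
import json

# Assumed loading call (identifier not stated in the source; verify first):
#   from datasets import load_dataset
#   ds = load_dataset("codeparrot/apps", split="test")


def parse_apps_record(record: dict) -> dict:
    """Decode the assumed JSON-string fields of one APPS-style record.

    Empty or missing fields fall back to an empty list of solutions
    and an empty list of tests.
    """
    raw_solutions = record.get("solutions")
    raw_io = record.get("input_output")
    solutions = json.loads(raw_solutions) if raw_solutions else []
    io = json.loads(raw_io) if raw_io else {}
    # Pair each test input with its expected output.
    tests = list(zip(io.get("inputs", []), io.get("outputs", [])))
    return {
        "question": record["question"],
        "solutions": solutions,
        "tests": tests,
    }
```

Decoding into plain `(input, expected_output)` pairs keeps later evaluation code independent of how the records are stored.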

Highlighted Details

Benchmark for measuring coding challenge competence.
Includes training and evaluation code.
Offers fine-tuned weights for large transformer models.
Links to related datasets for math, ethics, and academic subjects.

Maintenance & Community

The accompanying paper was presented at NeurIPS 2021. The README does not describe community channels or a contribution process.

Licensing & Compatibility

The repository's license is not explicitly stated in the README. Compatibility for commercial use or closed-source linking would require clarification of the licensing terms.

Limitations & Caveats

The missing license statement may hinder commercial adoption. The top-level README does not contain setup and execution instructions; these are in separate README files under the train and eval subdirectories.

Health Check

Last Commit

1 year ago

Responsiveness

Inactive

Star History

1 star in the last 30 days