Benchmark for code intelligence tasks
CodeXGLUE is a benchmark dataset and open challenge for code intelligence, aimed at researchers and practitioners in software engineering and artificial intelligence. It provides a standardized platform for evaluating and comparing models across a wide range of code-related tasks, with the goal of improving developer productivity.
How It Works
CodeXGLUE addresses the lack of standardized evaluation for code intelligence by curating 14 datasets across 10 tasks, including code-to-code translation, defect detection, code completion, code search, and code summarization. It provides baseline models inspired by advances in NLP: CodeBERT (BERT-style) for code understanding, CodeGPT (GPT-style) for code generation, and an Encoder-Decoder framework for sequence-to-sequence tasks.
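To make the idea of standardized evaluation concrete, here is a minimal sketch of two metrics of the kind commonly used to compare models on generation-style and retrieval-style tasks: exact match and mean reciprocal rank. The function names and the toy inputs are hypothetical and are not taken from the CodeXGLUE repository; each task in the benchmark ships its own evaluator, which should be treated as authoritative.

```python
from typing import Sequence


def _normalize(code: str) -> str:
    """Collapse runs of whitespace so formatting differences are ignored."""
    return " ".join(code.split())


def exact_match(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Fraction of predictions equal to their reference after whitespace normalization."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(_normalize(p) == _normalize(r) for p, r in zip(predictions, references))
    return hits / len(references)


def mean_reciprocal_rank(rankings: Sequence[Sequence[str]], gold: Sequence[str]) -> float:
    """Average of 1/rank of the gold item per query; a missing gold item scores 0."""
    if len(rankings) != len(gold):
        raise ValueError("rankings and gold must have the same length")
    if not gold:
        return 0.0
    total = 0.0
    for ranked_ids, gold_id in zip(rankings, gold):
        if gold_id in ranked_ids:
            total += 1.0 / (list(ranked_ids).index(gold_id) + 1)
    return total / len(gold)


if __name__ == "__main__":
    # Hypothetical outputs from a generation model (e.g. code completion).
    preds = ["return a + b", "x=1"]
    refs = ["return a  +  b", "x = 1"]
    print("exact match:", exact_match(preds, refs))

    # Hypothetical ranked candidate IDs from a retrieval model (e.g. code search).
    rankings = [["a", "b", "c"], ["x", "y", "z"]]
    gold = ["b", "z"]
    print("MRR:", mean_reciprocal_rank(rankings, gold))
```

Scoring every model with the same functions on the same held-out data is what makes results comparable across submissions; the metrics themselves are deliberately simple here.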
Quick Start & Requirements
Highlighted Details
Maintenance & Community
This project is a research initiative from Microsoft Research Asia, Developer Division, and Bing. For details on participation and submission, email codexglue@microsoft.com.
Licensing & Compatibility
The code is released under the MIT License, while the datasets are governed by the Computational Use of Data Agreement (C-UDA) License.
Limitations & Caveats
The README does not specify any explicit limitations or caveats regarding model performance, dataset biases, or ongoing development status.
1 year ago
Inactive