KLUE by KLUE-benchmark

Korean NLU benchmark for advancing Korean NLP

created 4 years ago
579 stars

Top 56.7% on sourcepulse

Project Summary

KLUE is a comprehensive benchmark dataset and evaluation framework designed to advance Korean Natural Language Understanding (NLU) research. It addresses the lack of standardized evaluation datasets for Korean NLP, enabling fair comparison of models and facilitating progress in Korean language AI. The benchmark is suitable for NLP researchers and practitioners working with Korean language data.

How It Works

KLUE comprises 8 Korean NLU tasks: Topic Classification, Semantic Textual Similarity, Natural Language Inference, Named Entity Recognition, Relation Extraction, Dependency Parsing, Machine Reading Comprehension, and Dialogue State Tracking. For each task it provides a curated dataset, task-specific evaluation metrics, and fine-tuning recipes for pre-trained language models (PLMs). The project also releases its own PLMs, KLUE-BERT and KLUE-RoBERTa, trained on Korean data to serve as strong baselines.
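
The released encoders are published on the Hugging Face Hub; a minimal sketch of using KLUE-BERT as a sentence encoder, assuming the hub id `klue/bert-base` (with `klue/roberta-base` for KLUE-RoBERTa) and the standard `transformers` API — not a recipe from this repository:

```python
# Minimal sketch: encoding Korean sentences with KLUE-BERT.
# Assumes the Hugging Face Hub id "klue/bert-base" (swap in
# "klue/roberta-base" for KLUE-RoBERTa); network access required.
def encode(sentences, model_id="klue/bert-base"):
    """Return contextual token embeddings for a batch of sentences."""
    from transformers import AutoModel, AutoTokenizer  # pip install transformers torch
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    batch = tokenizer(sentences, padding=True, return_tensors="pt")
    return model(**batch).last_hidden_state  # shape: (batch, seq_len, hidden)

if __name__ == "__main__":
    embeddings = encode(["한국어 문장 인코딩 예시입니다."])
    print(embeddings.shape)
```

Task-specific heads (classification, span prediction, etc.) would sit on top of these embeddings during fine-tuning.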

Quick Start & Requirements

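The page itself lists no install command; a hedged quick-start sketch, assuming the community `klue` dataset entry on the Hugging Face Hub and its per-task config names (these names are an assumption from the hub listing, not taken from this page):

```python
# Mapping of KLUE's 8 tasks to dataset config names on the Hugging Face Hub.
# Config names are assumptions based on the hub entry "klue".
KLUE_TASK_CONFIGS = {
    "Topic Classification": "ynat",
    "Semantic Textual Similarity": "sts",
    "Natural Language Inference": "nli",
    "Named Entity Recognition": "ner",
    "Relation Extraction": "re",
    "Dependency Parsing": "dp",
    "Machine Reading Comprehension": "mrc",
    "Dialogue State Tracking": "wos",
}

def load_klue_task(task: str):
    """Download one KLUE task via the `datasets` library (network required)."""
    from datasets import load_dataset  # pip install datasets
    return load_dataset("klue", KLUE_TASK_CONFIGS[task])

if __name__ == "__main__":
    ds = load_klue_task("Topic Classification")
    print(ds["train"][0])
```

Each config returns train/validation splits; test labels are held out for the leaderboard.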
Highlighted Details

  • Covers 8 distinct Korean NLU tasks, offering broad evaluation coverage.
  • Includes custom pre-trained models (KLUE-BERT, KLUE-RoBERTa) optimized for Korean.
  • Provides baseline scores and performance comparisons against models like mBERT and XLM-R.
  • Emphasizes AI ethical considerations in dataset design.

Maintenance & Community

The project is associated with numerous researchers from various institutions and has significant industry sponsorship from companies like Upstage, NAVER, and Google. A leaderboard is available at https://klue-benchmark.com.

Licensing & Compatibility

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License (CC BY-SA 4.0). This license allows for commercial use and modification, provided attribution is given and any derivative works are shared under the same license.

Limitations & Caveats

The benchmark focuses exclusively on Korean language understanding tasks. While baseline models are provided, achieving state-of-the-art performance on all tasks may require significant computational resources and further fine-tuning.

Health Check

  • Last commit: 3 years ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 12 stars in the last 90 days
