JGLUE by yahoojapan

Benchmark for Japanese NLU

created 3 years ago
320 stars

Top 86.0% on sourcepulse

Project Summary

JGLUE is a comprehensive benchmark for evaluating Japanese Natural Language Understanding (NLU) capabilities, designed to foster NLU research in the Japanese language domain. It spans three task categories (text classification, sentence pair classification, and question answering), each with one or more datasets, making it suitable for researchers and developers working with Japanese NLP models.

How It Works

JGLUE was constructed from scratch, avoiding translation from English benchmarks to ensure linguistic authenticity. It utilizes Yahoo! Crowdsourcing for data annotation and includes datasets like MARC-ja (text classification), JSTS (semantic textual similarity), JNLI (natural language inference), JSQuAD (reading comprehension), and JCommonsenseQA (commonsense reasoning). The benchmark provides detailed dataset descriptions and baseline performance scores using various Japanese BERT and RoBERTa models.
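
To make the task formats concrete, the sketch below shows roughly what records from three of these datasets look like. The field names, label values, and example sentences are illustrative assumptions and may not match the released JSON files exactly.

```python
# Illustrative (assumed) record shapes for three JGLUE tasks; field names
# and example content are a sketch, not copied from the released data.
jsts_example = {
    "sentence1": "川の側に白い犬がいます。",   # first sentence
    "sentence2": "白い犬が川辺にいる。",       # second sentence
    "label": 4.2,                              # similarity score in [0, 5]
}
jnli_example = {
    "sentence1": "男性がギターを弾いている。",  # premise
    "sentence2": "人が楽器を演奏している。",    # hypothesis
    "label": "entailment",                     # entailment / contradiction / neutral
}
jcommonsenseqa_example = {
    "question": "会社の最高責任者を何と呼ぶ?",
    "choice0": "社長", "choice1": "部長", "choice2": "課長",
    "choice3": "新人", "choice4": "アルバイト",
    "label": 0,                                # index of the correct choice
}
```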

Quick Start & Requirements

  • Dataset Preparation: Requires downloading original datasets (e.g., MARC, MS COCO Caption Dataset, SQuAD, CommonsenseQA) and running provided Python scripts for conversion and preprocessing. Specific morphological analyzers (MeCab, Juman++) are needed for certain models.
  • Fine-tuning: The fine-tuning process uses the Hugging Face transformers library. Detailed instructions are available in fine-tuning/README.md; a minimal sketch follows this list.
  • Resources: Preprocessing MARC-ja involves Python dependencies listed in preprocess/requirements.txt. Fine-tuning requires the computational resources typical for pre-trained transformer models.
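
As referenced above, here is a minimal fine-tuning sketch using the Hugging Face transformers Trainer. The model name, file paths, field names, label strings, and hyperparameters are assumptions for illustration, not the repo's exact configuration; see fine-tuning/README.md for the official recipe.

```python
# Hedged sketch: fine-tuning a Japanese BERT on a JNLI-style sentence-pair
# task. Paths and hyperparameters are illustrative assumptions.
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

# An assumed baseline model; it requires MeCab (fugashi + ipadic) to tokenize.
MODEL = "cl-tohoku/bert-base-japanese-whole-word-masking"

tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=3)

# Assumed locations of the preprocessed JNLI JSON files.
data = load_dataset(
    "json",
    data_files={"train": "jnli/train.json", "validation": "jnli/valid.json"},
)

label2id = {"entailment": 0, "contradiction": 1, "neutral": 2}

def encode(batch):
    # JNLI is a sentence-pair task: encode both sentences together.
    enc = tokenizer(batch["sentence1"], batch["sentence2"],
                    truncation=True, max_length=128)
    enc["labels"] = [label2id[label] for label in batch["label"]]
    return enc

data = data.map(encode, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out",
                           per_device_train_batch_size=32,
                           num_train_epochs=4),
    train_dataset=data["train"],
    eval_dataset=data["validation"],
    tokenizer=tokenizer,  # enables dynamic padding via DataCollatorWithPadding
)
trainer.train()
```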

Highlighted Details

  • Native Japanese Benchmark: Built entirely from Japanese data, unlike translated benchmarks.
  • Diverse Tasks: Covers text classification, sentence pair classification, and QA with multiple datasets.
  • Comprehensive Baselines: Includes performance scores for various Japanese BERT and RoBERTa models on all tasks; a note on the reported metrics follows this list.
  • Data Quality: MARC-ja dataset quality was enhanced via crowdsourced judgments.
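
As a rough guide to how those baseline numbers are computed, the snippet below sketches the per-task metrics reported in the JGLUE paper: accuracy for MARC-ja, JNLI, and JCommonsenseQA; Pearson/Spearman correlation for JSTS; and exact match / F1 for JSQuAD. The values are dummy data, and this is not the repo's evaluation code.

```python
# Dummy-data sketch of JGLUE-style metrics (not the repo's evaluation code).
from scipy.stats import pearsonr, spearmanr

# JSTS: correlation between gold and predicted similarity scores.
gold_scores = [4.2, 1.0, 3.5, 2.8, 0.5]
pred_scores = [4.0, 1.3, 3.1, 3.0, 0.7]
print("Pearson: ", pearsonr(gold_scores, pred_scores)[0])
print("Spearman:", spearmanr(gold_scores, pred_scores)[0])

# MARC-ja / JNLI / JCommonsenseQA: plain accuracy.
gold_labels = [0, 1, 2, 1]
pred_labels = [0, 1, 1, 1]
accuracy = sum(g == p for g, p in zip(gold_labels, pred_labels)) / len(gold_labels)
print("Accuracy:", accuracy)
```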

Maintenance & Community

Developed through a joint research project between Yahoo Japan Corporation and the Kawahara Lab at Waseda University. A leaderboard was planned, but the test set has since been released publicly.

Licensing & Compatibility

  • License: Creative Commons Attribution-ShareAlike 4.0 International License.
  • Contributor License Agreement (CLA): Required for contributors; contributing via GitHub implies agreement. CC BY-SA 4.0 permits commercial use, but the ShareAlike clause requires derivative datasets to be distributed under the same license, which can constrain closed-source redistribution of derived data.

Limitations & Caveats

  • The MARC-ja dataset is no longer directly distributed due to the discontinuation of the original MARC dataset by Amazon. Users must obtain the original data and run conversion scripts.
  • XLM-RoBERTa models show poor performance on JSQuAD due to tokenization mismatches; a sketch of the failure mode follows this list.
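
To illustrate that failure mode, here is a small self-contained sketch. The subword tokens shown are made up, not actual XLM-RoBERTa output: when a SentencePiece token straddles the gold answer boundary, extractive QA cannot recover the answer as a contiguous token span.

```python
# Illustrative only: hypothetical subword tokens, not real XLM-RoBERTa output.
# If a token straddles the answer boundary, no token span reproduces the answer.
tokens = ["▁富士", "山は", "日本", "で", "一番", "高い", "山", "です", "。"]
answer = "富士山"

def recoverable(tokens, answer):
    """Check whether `answer` equals the text of some contiguous token span."""
    surface = [t.lstrip("▁") for t in tokens]
    for i in range(len(surface)):
        for j in range(i, len(surface)):
            if "".join(surface[i:j + 1]) == answer:
                return True
    return False

# "富士山" ends inside the token "山は", so no token span matches it.
print(recoverable(tokens, answer))  # -> False
```
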
Health Check

  • Last commit: 4 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star history: 4 stars in the last 90 days
