FewCLUE by CLUEbenchmark

Chinese benchmark for few-shot learning evaluation

Created 4 years ago

518 stars

Top 60.7% on SourcePulse

Project Summary

FewCLUE is a benchmark for evaluating few-shot learning on Chinese NLP tasks. It targets researchers and practitioners developing models that learn from minimal labeled data, and it provides a standardized evaluation framework together with a set of datasets.

How It Works

FewCLUE comprises nine Chinese NLP tasks, including sentiment analysis, text classification, natural language inference, and coreference resolution, which differ in number of classes and data size. Models can be evaluated with standard fine-tuning, with few-shot methods such as PET and P-tuning, or in a zero-shot setting, so that different approaches can be compared on the same tasks.
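To make the PET idea concrete, the following is a minimal sketch of its core step: a classification input is rewritten as a cloze-style prompt, and each label is mapped to a word (the "verbalizer") that a masked language model could fill in. The pattern text, the label words, and the toy_scorer function are illustrative assumptions, not taken from the FewCLUE baselines; in practice toy_scorer would be replaced by a real masked language model.

```python
from typing import Callable, Dict

MASK = "[MASK]"

# Hypothetical pattern and verbalizer for a binary sentiment task.
# Prompt reads "[MASK]满意。<text>", i.e. "<very / not> satisfied. <text>".
PATTERN = "{mask}满意。{text}"
VERBALIZER: Dict[str, str] = {"Positive": "很", "Negative": "不"}


def build_cloze(text: str) -> str:
    """Wrap the raw input in the cloze pattern with one masked position."""
    return PATTERN.format(mask=MASK, text=text)


def predict_label(text: str,
                  mask_token_scores: Callable[[str], Dict[str, float]]) -> str:
    """Pick the label whose verbalizer word scores highest at the mask.

    mask_token_scores is any function that maps a prompt to a score per
    candidate token at the masked position (an assumption of this sketch).
    """
    scores = mask_token_scores(build_cloze(text))
    return max(
        VERBALIZER,
        key=lambda label: scores.get(VERBALIZER[label], float("-inf")),
    )


def toy_scorer(prompt: str) -> Dict[str, float]:
    """Stand-in for a masked LM: favors "很" when the prompt contains "好"."""
    positive = "好" in prompt
    return {"很": 0.9 if positive else 0.1, "不": 0.1 if positive else 0.9}


if __name__ == "__main__":
    print(build_cloze("东西很好"))
    print(predict_label("东西很好", toy_scorer))
```

Because the label decision is reduced to scoring a few candidate words at one position, the same pretrained masked-language-model head can be reused with very little labeled data, which is why this style of method suits few-shot settings.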

Quick Start & Requirements

Installation: Clone the repository (git clone https://github.com/CLUEbenchmark/FewCLUE.git).
Dependencies: Python 3.x (or 2.7), TensorFlow 1.14+, Keras 2.3.1, bert4keras.
Models: Requires pre-downloaded models like chinese_roberta_wwm_ext.
Running Baselines: Scripts are provided for fine-tuning, PET, P-tuning, and zero-shot methods. Example: bash run_classifier_multi_dataset.sh for fine-tuning.
Resources: Requires significant computational resources for training and evaluation, especially for larger models and datasets.
Documentation: Detailed task descriptions, baseline implementations, and experimental results are available in the README.
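Before running a baseline script, it can help to confirm that the listed dependencies are importable. The sketch below is not part of the repository; it simply tries to import each package and reports its version. The module names tensorflow, keras, and bert4keras are assumed to match the dependency list above, and the helper name installed_versions is hypothetical.

```python
import importlib
from typing import Dict, Iterable, Optional

# Module names assumed from the dependency list (TensorFlow, Keras, bert4keras).
REQUIRED = ("tensorflow", "keras", "bert4keras")


def installed_versions(modules: Iterable[str] = REQUIRED) -> Dict[str, Optional[str]]:
    """Return {module: version}, with None for modules that cannot be imported."""
    found: Dict[str, Optional[str]] = {}
    for name in modules:
        try:
            module = importlib.import_module(name)
        except ImportError:
            found[name] = None
            continue
        # Not every module exposes __version__; fall back to "unknown".
        found[name] = getattr(module, "__version__", "unknown")
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver if ver is not None else 'not installed'}")
```

The printed versions can then be compared by hand against the requirements above (TensorFlow 1.14+, Keras 2.3.1).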

Highlighted Details

Features 9 diverse Chinese NLP tasks, including newly created ones and subsets from the CLUE benchmark.
Supports multiple few-shot learning methodologies (PET, P-tuning, Zero-shot, LM-BFF, ADAPET, EFL).
Includes detailed experimental results comparing various models and techniques against human performance.
Provides extensive learning materials, including PPTs and videos from workshops and competitions.

Maintenance & Community

The project is associated with the CLUE benchmark initiative.
Community engagement is encouraged via GitHub pull requests and email. A QQ group (836811304) is available for discussion.

Licensing & Compatibility

The repository's license is listed as "正在添加中" (being added), so no license terms are currently stated. The terms need to be clarified before commercial use or integration into closed-source projects.

Limitations & Caveats

The licensing status is unclear, posing a potential adoption blocker.
The benchmark focuses on Chinese language tasks, limiting its direct applicability to other languages.
Some datasets are very small (e.g., 32 training samples), so results can vary noticeably between runs unless evaluation is handled carefully, for example by averaging over several random seeds or data splits.
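One common way to handle the instability of very small training sets is to repeat the experiment over several seeds and report the mean and standard deviation rather than a single score. The sketch below illustrates that bookkeeping only; run_once and fake_run are hypothetical placeholders for an actual train-and-evaluate routine and do not come from the FewCLUE code.

```python
import random
import statistics
from typing import Callable, Sequence, Tuple


def evaluate_over_seeds(run_once: Callable[[int], float],
                        seeds: Sequence[int]) -> Tuple[float, float]:
    """Run one experiment per seed and return (mean, sample std) of the scores."""
    scores = [run_once(seed) for seed in seeds]
    mean = statistics.mean(scores)
    # Sample standard deviation needs at least two runs.
    std = statistics.stdev(scores) if len(scores) > 1 else 0.0
    return mean, std


def fake_run(seed: int) -> float:
    """Placeholder for 'train on a few examples, return dev accuracy'."""
    rng = random.Random(seed)
    return 0.80 + rng.uniform(-0.05, 0.05)


if __name__ == "__main__":
    mean, std = evaluate_over_seeds(fake_run, seeds=[0, 1, 2, 3, 4])
    print(f"accuracy: {mean:.3f} +/- {std:.3f}")
```

Reporting the spread alongside the mean makes it easier to tell whether a difference between two methods exceeds run-to-run noise.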

Health Check

Last Commit: 3 years ago
Responsiveness: Inactive

Star History

0 stars in the last 30 days