FewCLUE by CLUEbenchmark

Chinese benchmark for few-shot learning evaluation

Created 4 years ago
514 stars

Top 60.9% on SourcePulse

Project Summary

FewCLUE is a benchmark for evaluating few-shot learning on Chinese NLP tasks. It targets researchers and practitioners working on models that learn effectively from minimal data, and it offers a standardized evaluation framework together with a set of diverse datasets.

How It Works

FewCLUE comprises nine Chinese NLP tasks, including sentiment analysis, text classification, natural language inference, and coreference resolution, with varying numbers of classes and dataset sizes. Models can be evaluated with standard fine-tuning, with prompt-based few-shot methods such as PET and P-tuning, and in a zero-shot setting, so that different approaches can be compared on the same tasks.
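
To make the PET idea concrete, here is a minimal sketch of its core step: the input is rewritten as a cloze prompt containing a [MASK] slot, and each label is mapped to a single word (the "verbalizer") whose score at that slot decides the prediction. Everything named here is an assumption for illustration and is not taken from the FewCLUE repository: the pattern, the verbalizer words, the English example text, and the toy_mask_scores function, which stands in for a real masked language model.

```python
from typing import Callable, Dict

# Hypothetical pattern and verbalizer for a binary sentiment task.
# Neither comes from the FewCLUE code base.
PATTERN = "{text} It was [MASK]."
VERBALIZER: Dict[str, str] = {"positive": "great", "negative": "terrible"}


def pet_predict(text: str, mask_scores: Callable[[str], Dict[str, float]]) -> str:
    """Return the label whose verbalizer word scores highest at [MASK]."""
    prompt = PATTERN.format(text=text)
    scores = mask_scores(prompt)  # maps candidate word -> score at the [MASK] slot
    return max(
        VERBALIZER,
        key=lambda label: scores.get(VERBALIZER[label], float("-inf")),
    )


def toy_mask_scores(prompt: str) -> Dict[str, float]:
    """Stand-in for a masked language model (assumption, not a real model)."""
    if "love" in prompt:
        return {"great": 2.0, "terrible": -1.0}
    return {"great": -1.0, "terrible": 2.0}
```

In a real setup, mask_scores would query a pretrained masked language model for the logits at the [MASK] position, and the few labeled examples would be used to fine-tune that model on the cloze-formatted inputs.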

Quick Start & Requirements

  • Installation: Clone the repository (git clone https://github.com/CLUEbenchmark/FewCLUE.git).
  • Dependencies: Python 3.x (or 2.7), TensorFlow 1.14+, Keras 2.3.1, bert4keras.
  • Models: Requires pre-downloaded models like chinese_roberta_wwm_ext.
  • Running Baselines: Scripts are provided for fine-tuning, PET, P-tuning, and zero-shot methods. Example: bash run_classifier_multi_dataset.sh for fine-tuning.
  • Resources: Requires significant computational resources for training and evaluation, especially for larger models and datasets.
  • Documentation: Detailed task descriptions, baseline implementations, and experimental results are available in the README.
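
Because the baselines depend on specific library versions, it can help to check the environment before running any script. The helper below is a small sketch, not part of the repository: installed_versions is a hypothetical name, and it assumes each package exposes a __version__ attribute, which is common but not guaranteed.

```python
import importlib
from typing import Dict, Iterable, Optional


def installed_versions(modules: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each module name to its version, "unknown", or None if not importable."""
    versions: Dict[str, Optional[str]] = {}
    for name in modules:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None  # package is not installed
            continue
        # Fall back to "unknown" when the module has no __version__ attribute.
        versions[name] = str(getattr(module, "__version__", "unknown"))
    return versions
```

Calling installed_versions(["tensorflow", "keras", "bert4keras"]) returns a dictionary that can be compared by eye against the versions listed above; a None value means the package still needs to be installed.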

Highlighted Details

  • Features 9 diverse Chinese NLP tasks, including newly created ones and subsets from the CLUE benchmark.
  • Supports multiple few-shot learning methodologies (PET, P-tuning, Zero-shot, LM-BFF, ADAPET, EFL).
  • Includes detailed experimental results comparing various models and techniques against human performance.
  • Provides extensive learning materials, including PPTs and videos from workshops and competitions.

Maintenance & Community

  • The project is associated with the CLUE benchmark initiative.
  • Community engagement is encouraged via GitHub pull requests and email. A QQ group (836811304) is available for discussion.

Licensing & Compatibility

  • The repository's license is listed as "正在添加中" (being added), indicating it may not be finalized or clearly stated. This requires clarification for commercial use or integration into closed-source projects.

Limitations & Caveats

  • The licensing status is unclear, posing a potential adoption blocker.
  • The benchmark focuses on Chinese language tasks, limiting its direct applicability to other languages.
  • Some datasets are very small (e.g., 32 training samples), which might lead to unstable results without careful methodology.
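
One common way to handle the instability of very small training sets is to repeat each experiment several times (for example with different random seeds) and report the mean and standard deviation rather than a single score. The sketch below illustrates that aggregation; the function name and the accuracy values are made up for illustration and are not FewCLUE results.

```python
import statistics
from typing import Sequence, Tuple


def summarize_runs(accuracies: Sequence[float]) -> Tuple[float, float]:
    """Return (mean, sample standard deviation) of per-run accuracies."""
    if not accuracies:
        raise ValueError("at least one run is required")
    mean = statistics.mean(accuracies)
    # The sample standard deviation needs two or more runs; report 0.0 otherwise.
    std = statistics.stdev(accuracies) if len(accuracies) > 1 else 0.0
    return mean, std


# Illustrative accuracies from three hypothetical runs.
mean_acc, std_acc = summarize_runs([0.5, 0.6, 0.7])
print(f"accuracy: {mean_acc:.3f} +/- {std_acc:.3f}")
```

A large standard deviation relative to the gap between two methods suggests that the difference between them may not be meaningful.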

Health Check

  • Last Commit: 3 years ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 2 stars in the last 30 days
