KoELECTRA provides pretrained ELECTRA models specifically for the Korean language, offering improved performance over BERT-like models by leveraging the Replaced Token Detection pre-training task. It is designed for researchers and developers working on Korean NLP tasks, enabling more effective text understanding.
How It Works
KoELECTRA utilizes the ELECTRA architecture, which trains a discriminator model to distinguish between original and replaced tokens generated by a smaller generator model. This approach allows for learning from all input tokens, leading to greater efficiency and performance. The models are trained on a substantial Korean corpus (34GB) and are compatible with the Hugging Face Transformers library.
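To make the Replaced Token Detection objective concrete, the sketch below queries the v3 discriminator for per-token real/replaced predictions. This is a minimal illustration, not from the README: the example sentence is arbitrary, and thresholding the logits at zero is a simplifying assumption.

```python
import torch
from transformers import ElectraForPreTraining, ElectraTokenizer

model_name = "monologg/koelectra-base-v3-discriminator"
tokenizer = ElectraTokenizer.from_pretrained(model_name)
model = ElectraForPreTraining.from_pretrained(model_name)

sentence = "나는 방금 밥을 먹었다."  # "I just ate a meal." (illustrative input)
inputs = tokenizer(sentence, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # one logit per input token

# A positive logit means the discriminator judges the token "replaced".
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
predictions = (logits[0] > 0).int().tolist()
for token, pred in zip(tokens, predictions):
    print(f"{token}\t{'replaced' if pred else 'original'}")
```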
Quick Start & Requirements
- Install: Use the Hugging Face Transformers library.
- Prerequisites: Python, Transformers library. No specific OS or hardware requirements beyond standard Python environments.
- Usage: Load models directly from the Hugging Face Hub (e.g., `monologg/koelectra-base-v3-discriminator`); see the loading sketch after this list.
- Links: Hugging Face Hub, Transformers Documentation
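A minimal loading sketch under those assumptions, using the checkpoint cited above; the sample sentence is hypothetical:

```python
from transformers import ElectraModel, ElectraTokenizer

model_name = "monologg/koelectra-base-v3-discriminator"
tokenizer = ElectraTokenizer.from_pretrained(model_name)
model = ElectraModel.from_pretrained(model_name)

inputs = tokenizer("한국어 ELECTRA 모델입니다.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (1, seq_len, 768) for the Base model
```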
Highlighted Details
- Offers multiple versions (v1, v2, v3) with varying training data and vocabulary sizes.
- Provides both "Base" (768 hidden size) and "Small" (256 hidden size, 128 embedding size) variants.
- Achieves competitive results on various Korean NLP benchmarks, including NSMC, Naver NER, PAWS, KorNLI, KorSTS, Question Pair, KorQuAD, and Korean-Hate-Speech.
- Models are hosted on Hugging Face's servers (originally S3), so no manual download step is needed.
Maintenance & Community
- The project is actively maintained, with updates including new versions (v2, v3), bug fixes (PyTorch loading issues), and TensorFlow v2 model uploads.
- References to related projects and resources are provided.
Licensing & Compatibility
- The repository does not explicitly state a license. The Transformers library itself is Apache-2.0, but that license does not necessarily extend to the model weights; commercial use should be verified with the author.
Limitations & Caveats
- The specific license for the models and code is not clearly stated in the README, which may pose a concern for commercial applications.
- While TensorFlow v2 models are available, the README notes that direct loading from `tf_model.h5` was removed due to issues, reverting to `from_pt=True` loading.
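Under that caveat, a hedged sketch of the `from_pt=True` workaround; the choice of `TFElectraModel` is an assumption, and any ELECTRA class with a TF counterpart should load the same way:

```python
from transformers import TFElectraModel

# Convert the PyTorch checkpoint on the fly instead of loading tf_model.h5.
tf_model = TFElectraModel.from_pretrained(
    "monologg/koelectra-base-v3-discriminator",
    from_pt=True,
)
```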