kogpt by kakaobrain

Korean generative pre-trained transformer for classifying, searching, summarizing, or generating Korean texts

Created 4 years ago

1,015 stars

Top 36.8% on SourcePulse

1 Expert Loves This Project

Omar Sanseviero

DevRel at Google DeepMind

Project Summary

KoGPT is a large language model developed by KakaoBrain for Korean text. It offers a 6B-parameter model, KoGPT6B-ryan1.5b, which uses Rotary Position Embeddings (RoPE) for positional encoding. The model is aimed at researchers and developers working on Korean NLP tasks such as classification, summarization, and generation.

How It Works

KoGPT is a Transformer-based causal language model. It uses Rotary Position Embeddings (RoPE) for positional encoding: query and key vectors are rotated by position-dependent angles, so attention scores depend on the relative offset between tokens rather than on their absolute positions. The model is trained on a large corpus of Korean text, which makes it well suited to understanding and generating Korean.
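The relative-position property of RoPE can be illustrated with a small pure-Python sketch. This is not KoGPT's implementation: the helper names `rope` and `dot`, the interleaved pairing of dimensions, and the base of 10000 are assumptions chosen for illustration, and the real model may pair dimensions differently or rotate only part of each attention head.

```python
import math


def rope(vec, position, base=10000.0):
    """Rotate consecutive pairs (x0, x1), (x2, x3), ... by position-dependent angles.

    Illustrative only: the interleaved pairing and the base value are assumptions.
    """
    dim = len(vec)
    if dim % 2 != 0:
        raise ValueError("RoPE needs an even number of dimensions")
    out = []
    for i in range(0, dim, 2):
        # Each pair gets its own frequency; later pairs rotate more slowly.
        theta = position * base ** (-i / dim)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out


def dot(a, b):
    """Plain dot product, standing in for an unnormalized attention score."""
    return sum(x * y for x, y in zip(a, b))


# Toy query and key vectors (made-up values).
q = [0.5, -1.0, 2.0, 0.25]
k = [1.5, 0.3, -0.7, 1.0]

# The offset between query and key is 3 in both cases, at different absolute positions.
score_a = dot(rope(q, 5), rope(k, 2))
score_b = dot(rope(q, 105), rope(k, 102))
print(abs(score_a - score_b) < 1e-9)
```

Both scores agree up to floating-point error because rotating the query by angle m·θ and the key by n·θ leaves a dot product that depends only on (m − n)·θ.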

Quick Start & Requirements

Install: Use pip install transformers torch.
Requirements: NVIDIA GPU with at least 16 GB of VRAM for the float16 version or 32 GB for the float32 version. CUDA is recommended.
Usage:
- Command-line inference: python -m kogpt
- Python API: Load and run the model with Hugging Face transformers; the project repository provides a Python code snippet.
Links: Hugging Face Model
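Loading the model through the Python API might look roughly like the sketch below. Several details are assumptions not stated on this page and should be checked against the model card: the Hub id `kakaobrain/kogpt`, the float16 revision name `KoGPT6B-ryan1.5b-float16`, and the special-token strings. The helpers `pick_revision` and `generate` are hypothetical names introduced here. Heavy imports are deferred into `generate` so that the module can be imported without a GPU.

```python
MODEL_ID = "kakaobrain/kogpt"  # assumed Hugging Face Hub id
REVISIONS = {
    "float16": "KoGPT6B-ryan1.5b-float16",  # assumed revision name
    "float32": "KoGPT6B-ryan1.5b",
}


def pick_revision(precision="float16"):
    """Map a precision label to the (assumed) model revision name."""
    if precision not in REVISIONS:
        raise ValueError(f"unknown precision: {precision!r}")
    return REVISIONS[precision]


def generate(prompt, precision="float16", max_new_tokens=64, device="cuda"):
    """Download the model on first use and sample a continuation of `prompt`."""
    # Imported lazily: these are large dependencies and need a suitable GPU setup.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    revision = pick_revision(precision)
    tokenizer = AutoTokenizer.from_pretrained(
        MODEL_ID,
        revision=revision,
        # Assumed special tokens; verify against the model card.
        bos_token="[BOS]", eos_token="[EOS]", unk_token="[UNK]",
        pad_token="[PAD]", mask_token="[MASK]",
    )
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        revision=revision,
        pad_token_id=tokenizer.eos_token_id,
        torch_dtype=torch.float16 if precision == "float16" else torch.float32,
    ).to(device)
    model.eval()

    with torch.no_grad():
        input_ids = tokenizer.encode(prompt, return_tensors="pt").to(device)
        output_ids = model.generate(
            input_ids, do_sample=True, temperature=0.8, max_new_tokens=max_new_tokens
        )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (downloads several gigabytes of weights and needs a CUDA GPU):
# print(generate("인공지능은"))
```

Sampling with `do_sample=True` means repeated calls give different continuations; the temperature value here is an arbitrary illustrative choice.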

Highlighted Details

Reports strong few-shot results on Korean NLP benchmarks, outperforming the compared models on YNAT (F1) and KLUE-STS (F1).
Offers both float32 and float16 versions, so users can trade numerical precision against GPU memory requirements.
The model is trained on raw data, which may include profane, lewd, or political content.
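The gap between the float16 and float32 hardware requirements follows from simple arithmetic on parameter storage. The sketch below is a back-of-the-envelope estimate only: it assumes a round 6 billion parameters (from the "6B" label), counts weights alone, and ignores activations, the attention cache, and framework overhead. The name `weight_memory_gib` is a hypothetical helper.

```python
# Bytes needed to store one parameter in each precision.
BYTES_PER_PARAM = {"float16": 2, "float32": 4}

# Assumption: "6B" rounded to exactly six billion parameters.
NUM_PARAMS = 6_000_000_000


def weight_memory_gib(num_params, dtype):
    """Return the memory needed for the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


for dtype in ("float16", "float32"):
    gib = weight_memory_gib(NUM_PARAMS, dtype)
    print(f"{dtype}: about {gib:.1f} GiB for weights alone")
```

Under these assumptions the weights take roughly 11 GiB in float16 and 22 GiB in float32, which is consistent with the 16 GB and 32 GB VRAM figures once working memory is added on top.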

Maintenance & Community

The project is open source. For collaboration inquiries, contact contact@kakaobrain.com.
A web demo is available on Hugging Face Spaces.

Licensing & Compatibility

Source Code: Apache 2.0 License.
Pretrained Weights: CC-BY-NC-ND 4.0 License.
Restrictions: The CC-BY-NC-ND 4.0 license on the pretrained weights prohibits commercial use and the distribution of derivative works without permission. Users who violate the license terms may face legal action.

Limitations & Caveats

KoGPT is trained primarily on Korean text and may perform poorly on non-Korean input or on Korean dialects that are underrepresented in the training data. Because it was trained on raw data, it can generate socially unacceptable or offensive text, and its output is difficult to predict.

Health Check

Last Commit

1 year ago

Responsiveness

Inactive

Star History

2 stars in the last 30 days