kogpt by kakaobrain

Korean generative pre-trained transformer for classifying, searching, summarizing, or generating Korean texts

created 3 years ago
1,018 stars

Top 37.4% on sourcepulse

Project Summary

KoGPT is a large language model developed by KakaoBrain for Korean text generation. It ships a 6B-parameter model, KoGPT6B-ryan1.5b, that uses Rotary Position Embeddings (RoPE) for positional encoding. It is aimed at researchers and developers working on Korean NLP tasks such as classification, search, summarization, and generation.

How It Works

KoGPT is a Transformer-based causal language model. It uses Rotary Position Embeddings (RoPE) for positional encoding: query and key vectors are rotated by position-dependent angles, so the attention score between two tokens depends on their relative offset rather than on their absolute positions. The model is trained on a large corpus of Korean text, which is what makes it suited to understanding and generating Korean.
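
To make the relative-position property concrete, here is a minimal sketch of RoPE on plain Python lists. It assumes the interleaved pairing and the base of 10000 from the original RoPE formulation; the function names and the toy vectors are illustrative only, and KoGPT's actual implementation may lay out the rotated dimensions differently.

```python
import math


def rope(vec, position, base=10000.0):
    """Rotate each pair (vec[i], vec[i+1]) by the angle position * base**(-i/d).

    Assumption: interleaved pairing, as in the original RoPE formulation.
    """
    d = len(vec)
    if d % 2 != 0:
        raise ValueError("RoPE needs an even embedding dimension")
    out = []
    for i in range(0, d, 2):
        theta = position * base ** (-i / d)  # lower frequency for later pairs
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * c - y * s, x * s + y * c])  # 2D rotation of the pair
    return out


def dot(a, b):
    """Plain dot product, standing in for one attention score."""
    return sum(x * y for x, y in zip(a, b))


# Toy query and key vectors (hypothetical values).
q = [0.5, -1.0, 2.0, 0.25]
k = [1.5, 0.3, -0.7, 1.0]

# Same offset (2) at different absolute positions gives the same score.
score_a = dot(rope(q, 3), rope(k, 1))
score_b = dot(rope(q, 10), rope(k, 8))
print(abs(score_a - score_b) < 1e-9)
```

Because each pair is only rotated, the vector norm is unchanged, and the dot product of two rotated vectors depends only on the difference between their rotation angles, that is, on the positional offset.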

Quick Start & Requirements

  • Install: Use pip install transformers torch.
  • Requirements: NVIDIA GPU with at least 16GB of VRAM for the float16 version or 32GB for the float32 version. CUDA is recommended.
  • Usage:
    • Command-line inference: python -m kogpt
    • Python API: Load and run the model through Hugging Face transformers, as documented in the project README.
  • Links: Hugging Face Model
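
As a rough illustration of the Python API route, the sketch below loads the model through Hugging Face transformers. The repository ID `kakaobrain/kogpt`, the revision names, and the special-token strings are assumptions that should be checked against the model card; `revision_for`, `load_kogpt`, and `generate` are hypothetical helper names, not part of the project.

```python
MODEL_ID = "kakaobrain/kogpt"  # assumed Hugging Face repository ID


def revision_for(half_precision: bool) -> str:
    """Pick the assumed revision name for the float16 or float32 weights."""
    return "KoGPT6B-ryan1.5b-float16" if half_precision else "KoGPT6B-ryan1.5b"


def load_kogpt(half_precision: bool = True, device: str = "cuda"):
    """Download and load tokenizer and model (needs transformers, torch, a GPU)."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    revision = revision_for(half_precision)
    tokenizer = AutoTokenizer.from_pretrained(
        MODEL_ID,
        revision=revision,
        # Assumed special tokens; verify against the model card.
        bos_token="[BOS]", eos_token="[EOS]", unk_token="[UNK]",
        pad_token="[PAD]", mask_token="[MASK]",
    )
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        revision=revision,
        pad_token_id=tokenizer.eos_token_id,
        torch_dtype="auto",  # keep the dtype stored in the checkpoint
    ).to(device)
    model.eval()  # inference mode: disables dropout
    return tokenizer, model


def generate(tokenizer, model, prompt: str, max_length: int = 64) -> str:
    """Sample a continuation of `prompt` from the loaded model."""
    import torch

    with torch.no_grad():
        tokens = tokenizer.encode(prompt, return_tensors="pt").to(model.device)
        output = model.generate(
            tokens, do_sample=True, temperature=0.8, max_length=max_length
        )
    return tokenizer.batch_decode(output)[0]


# Example use (downloads several GB of weights, so it is left commented out):
# tokenizer, model = load_kogpt(half_precision=True)
# print(generate(tokenizer, model, "인공지능은"))
```

The heavy imports sit inside the functions so the module itself imports without a GPU; calling `load_kogpt` is what triggers the download.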

Highlighted Details

  • Reports strong few-shot results on Korean NLP benchmarks, outperforming the compared models on YNAT (F1) and KLUE-STS (F1).
  • Offers both float32 and float16 versions for flexibility in hardware requirements and performance.
  • The model is trained on raw data, which may include profane, lewd, or political content.
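
The gap between the float16 and float32 hardware requirements follows from simple arithmetic on the weight size. The sketch below is a back-of-the-envelope estimate only: it assumes a nominal 6 billion parameters (the exact count is not given here) and counts the weights alone, ignoring activations and the attention cache.

```python
# Bytes needed to store one parameter in each precision.
BYTES_PER_PARAM = {"float32": 4, "float16": 2}

# Nominal "6B" parameter count; the exact figure is an assumption.
NUM_PARAMS = 6_000_000_000


def weight_memory_gib(num_params: int, dtype: str) -> float:
    """Memory for the weights alone, in GiB (1 GiB = 1024**3 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


for dtype in ("float16", "float32"):
    gib = weight_memory_gib(NUM_PARAMS, dtype)
    print(f"{dtype}: about {gib:.1f} GiB for weights alone")
```

Under these assumptions the weights take roughly 11 GiB in float16 and 22 GiB in float32, which is consistent with the stated 16GB and 32GB VRAM minimums once working memory for inference is added.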

Maintenance & Community

  • The project is open source; cooperation inquiries can be sent to contact@kakaobrain.com.
  • A web demo is available on Hugging Face Spaces.

Licensing & Compatibility

  • Source Code: Apache 2.0 License.
  • Pretrained Weights: CC-BY-NC-ND 4.0 License.
  • Restrictions: The CC-BY-NC-ND 4.0 license prohibits commercial use and derivative works without permission. Users must comply with license terms to avoid legal action.

Limitations & Caveats

KoGPT is primarily trained on Korean text and may perform poorly on non-Korean inputs or specific Korean dialects not well-represented in the training data. Due to training on raw data, it can generate socially unacceptable or offensive text, and its output is difficult to predict.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 5 stars in the last 90 days
