kanana by kakao

Bilingual Korean/English language models, compute-efficient relative to state-of-the-art models

created 5 months ago
255 stars

Project Summary

Kanana is a series of bilingual (Korean/English) language models designed for compute efficiency, offering strong Korean language capabilities and competitive English performance. The models suit researchers and developers working on Korean NLP tasks and come in base, instruct, embedding, and RAG-optimized versions.

How It Works

Kanana models leverage several pre-training techniques for efficiency, including high-quality data filtering, staged pre-training, depth up-scaling, and pruning/distillation. Post-training involves supervised fine-tuning and preference optimization to enhance user interaction. The models are available in various sizes, from 2.1B to 32.5B parameters, with smaller versions released to foster Korean LLM research.
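
Of the techniques listed above, depth up-scaling is the easiest to show in a few lines. The sketch below is a toy illustration of the general idea as it is commonly described (stack two overlapping copies of a layer stack to get a deeper model, then continue pre-training); it is not Kanana's actual recipe. The function name `depth_upscale`, the 32-layer stack, and the overlap of 8 are all assumptions chosen for illustration, and layers are represented by plain integers rather than real modules.

```python
def depth_upscale(layers, overlap):
    """Toy depth up-scaling: join two overlapping copies of a layer stack.

    The first copy drops its last `overlap` layers, the second copy drops
    its first `overlap` layers, and the two are concatenated. In a real
    model the second copy would hold duplicated weights; here layers are
    just placeholder values.
    """
    if not 0 < overlap < len(layers):
        raise ValueError("overlap must be between 1 and len(layers) - 1")
    first_copy = layers[: len(layers) - overlap]
    second_copy = layers[overlap:]
    return first_copy + second_copy


# Hypothetical numbers: a 32-layer stack with an overlap of 8 layers.
base_layers = list(range(32))
upscaled = depth_upscale(base_layers, overlap=8)
print(len(base_layers), "->", len(upscaled))
```

The up-scaled stack has 2 × (32 − 8) = 48 entries, so depth grows without training the extra layers from scratch.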

Quick Start & Requirements

  • Install transformers>=4.45.0 via pip.
  • For best performance, run on a CUDA-enabled GPU and load the model with torch_dtype=torch.bfloat16.
  • The README provides usage examples for the base, instruct, and embedding models, along with vLLM integration.
  • Official Hugging Face models: kakaocorp/kanana-nano-2.1b-base and others.
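
The requirements above can be sketched as a short loading script. This is a minimal sketch rather than the project's official example: it assumes `torch` and `transformers` are installed, that a CUDA GPU is available, and that the base model loads through the standard `AutoModelForCausalLM` interface. The helper names `version_tuple` and `generate` are hypothetical, and the heavy imports sit inside `generate` so that nothing is downloaded until it is called.

```python
from itertools import takewhile

MODEL_ID = "kakaocorp/kanana-nano-2.1b-base"  # model name from the list above
MIN_TRANSFORMERS = (4, 45, 0)                 # transformers>=4.45.0


def version_tuple(version):
    """Parse the leading numeric parts of a version string into a 3-tuple."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = "".join(takewhile(str.isdigit, piece))
        parts.append(int(digits) if digits else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def generate(prompt, max_new_tokens=32):
    """Load the base model in bfloat16 on a CUDA GPU and complete `prompt`."""
    # Imported lazily so this module can be imported without a GPU.
    import torch
    import transformers
    from transformers import AutoModelForCausalLM, AutoTokenizer

    if version_tuple(transformers.__version__) < MIN_TRANSFORMERS:
        raise RuntimeError("transformers>=4.45.0 is required")

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16
    ).to("cuda")

    inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("Kakao is a company that")` downloads the weights on first use and returns the prompt followed by the model's continuation. The instruct, embedding, and vLLM paths differ, so the README examples remain the reference for those.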

Highlighted Details

  • Achieves compute-efficient training with competitive performance across various benchmarks (MMLU, KMMLU, HumanEval, etc.).
  • Offers specialized models for embedding, function calling, and RAG.
  • Released 2.1B-parameter models (base, instruct, embedding, function calling, RAG) to promote Korean LLM research.
  • Technical report and blog posts detail development methodologies.

Licensing & Compatibility

  • Licensed under CC-BY-NC-4.0.
  • Non-commercial use only due to the NC clause.

Limitations & Caveats

  • The CC-BY-NC-4.0 license restricts commercial use.
  • Performance benchmarks are detailed in the linked technical report, with only partial results shown in the README.

Health Check

  • Last commit: 1 week ago
  • Responsiveness: Inactive
  • Pull requests (30d): 1
  • Issues (30d): 0
  • Star history: 30 stars in the last 90 days
