kanana by kakao

Bilingual Korean/English language models, compute-efficient relative to state-of-the-art models

Created 10 months ago

276 stars

Top 93.9% on SourcePulse

Project Summary

Kanana is a series of bilingual (Korean/English) language models designed for compute efficiency, with strong Korean language capabilities and competitive English performance. It targets researchers and developers working on Korean NLP tasks and is available in base, instruct, embedding, and RAG-optimized versions.

How It Works

Kanana models leverage several pre-training techniques for efficiency, including high-quality data filtering, staged pre-training, depth up-scaling, and pruning/distillation. Post-training involves supervised fine-tuning and preference optimization to enhance user interaction. The models are available in various sizes, from 2.1B to 32.5B parameters, with smaller versions released to foster Korean LLM research.
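This summary does not say how Kanana implements depth up-scaling. As a rough illustration of the general idea only, the sketch below follows one published formulation (the layer-duplication scheme described for SOLAR 10.7B): copy the layer stack, drop the last m layers from the first copy and the first m layers from the second, and concatenate the two. The function name depth_upscale and the use of plain Python lists in place of transformer blocks are assumptions made for this example.

```python
import copy


def depth_upscale(layers, m):
    """Toy depth up-scaling: grow an n-layer stack to 2 * (n - m) layers.

    `layers` is any list standing in for transformer blocks; `m` is how many
    layers to trim at the seam between the two copies. Hypothetical helper,
    not Kanana's actual procedure.
    """
    n = len(layers)
    if not 0 <= m < n:
        raise ValueError("m must satisfy 0 <= m < len(layers)")
    first = layers[: n - m]            # first copy without its last m layers
    second = copy.deepcopy(layers[m:])  # second copy without its first m layers
    return first + second


# A 32-layer stack with m = 8 becomes a 48-layer stack.
upscaled = depth_upscale(list(range(32)), 8)
print(len(upscaled))
```

In practice the up-scaled model would then continue pre-training so the duplicated layers can diverge; that step is outside the scope of this sketch.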

Quick Start & Requirements

Install transformers>=4.45.0 via pip (pip install "transformers>=4.45.0").
Requires a CUDA-enabled GPU; loading with torch_dtype=torch.bfloat16 is recommended for optimal performance.
The README provides example usage for the base, instruct, and embedding models, along with vLLM integration.
Official Hugging Face models: kakaocorp/kanana-nano-2.1b-base and others.
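The requirements above can be sketched as follows. This is a minimal example under stated assumptions, not the project's official snippet: it checks the transformers version floor, then loads kakaocorp/kanana-nano-2.1b-base in bfloat16 with the standard AutoTokenizer and AutoModelForCausalLM classes. The helper names (parse_version, meets_minimum, generate), the CPU fallback, and the max_new_tokens default are illustrative choices. The heavy imports live inside generate so the version helpers can be used without torch installed.

```python
import re

MODEL_ID = "kakaocorp/kanana-nano-2.1b-base"  # model ID listed above
MIN_TRANSFORMERS = (4, 45, 0)                 # version floor listed above


def parse_version(version):
    """Turn '4.45.0' or '4.46.0rc1' into a tuple of three ints."""
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(version, minimum=MIN_TRANSFORMERS):
    """Return True if `version` is at least the required minimum."""
    return parse_version(version) >= minimum


def generate(prompt, max_new_tokens=32):
    """Load the base model in bfloat16 and continue `prompt`.

    Downloads the model weights on first call.
    """
    import torch
    import transformers
    from transformers import AutoModelForCausalLM, AutoTokenizer

    if not meets_minimum(transformers.__version__):
        raise RuntimeError("transformers>=4.45.0 is required")

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16
    )
    # Assumption: fall back to CPU when no GPU is present (likely slow).
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model.to(device)

    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling generate("Kakao is a company that") returns the prompt followed by the model's continuation. Because this is a base model, it completes text rather than following instructions; the instruct variants are the better fit for chat-style prompts.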

Highlighted Details

Achieves compute-efficient training with competitive performance across various benchmarks (MMLU, KMMLU, HumanEval, etc.).
Offers specialized models for embedding, function calling, and RAG.
Released 2.1B-parameter models (base, instruct, embedding, function calling, RAG) to promote Korean LLM research.
Technical report and blog posts detail development methodologies.
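To make the role of an embedding model concrete, here is a small sketch of the retrieval step that such a model typically feeds: ranking documents by cosine similarity to a query vector. The vectors below are hand-written placeholders, not outputs of any Kanana model, and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return (index, score) pairs sorted from most to least similar."""
    scored = [
        (i, cosine_similarity(query_vec, vec))
        for i, vec in enumerate(doc_vecs)
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Placeholder 2-D "embeddings"; real embeddings have far more dimensions.
query = [1.0, 0.0]
documents = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
ranking = rank_documents(query, documents)
print([index for index, _ in ranking])
```

In a RAG setup, the top-ranked documents would then be passed to a generation model as context.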

Maintenance & Community

Developed by Kakao.
Contact for technical support: kanana-llm@kakaocorp.com.
Contact for business/partnerships: alpha.k@kakaocorp.com.

Licensing & Compatibility

Licensed under CC-BY-NC-4.0.
Non-commercial use only due to the NC clause.

Limitations & Caveats

The CC-BY-NC-4.0 license restricts commercial use.
The README shows only partial benchmark results; the full results are in the linked technical report.

Health Check

Last Commit: 5 months ago

Responsiveness: Inactive

Star History

3 stars in the last 30 days