rulm by IlyaGusev

Russian language models and instruction tuning

created 6 years ago
467 stars

Top 66.0% on sourcepulse

View on GitHub
Project Summary

This repository provides implementations and comparisons of language models specifically tuned for the Russian language, targeting researchers and developers working with Russian NLP. It offers pre-trained models and datasets for instruction tuning and chat-based interactions, aiming to advance Russian language AI capabilities.

How It Works

The project leverages instruction tuning and chat-based fine-tuning techniques on top of base LLaMA models. It introduces custom datasets like RuTurboAlpaca (instruction-following) and Saiga (chat-based conversations), generated using GPT models and curated for Russian. The approach focuses on adapting large language models to the nuances of the Russian language and common interaction patterns.
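To make the chat-based fine-tuning concrete, here is a minimal sketch of how a role-tagged Saiga-style conversation record could be rendered into a single training/inference prompt. The exact template and special tokens used by the repository may differ; the role names (`system`, `user`, `bot`) and delimiters below are illustrative assumptions, not the repo's canonical format.

```python
def render_chat(messages, system="Ты — Сайга, русскоязычный ассистент."):
    """Render a list of {'role', 'content'} turns into one prompt string.

    Illustrative only: the delimiter scheme (<s>role\\n...</s>) is an
    assumption modeled on common LLaMA chat templates, not necessarily
    the exact one used in rulm.
    """
    parts = [f"<s>system\n{system}</s>"]
    for m in messages:
        parts.append(f"<s>{m['role']}\n{m['content']}</s>")
    # Leave the assistant turn open so the model continues from here.
    parts.append("<s>bot\n")
    return "".join(parts)


prompt = render_chat([{"role": "user", "content": "Привет! Кто ты?"}])
```

A dataset entry for fine-tuning would typically store the same turns as structured data (role/content pairs) and apply a template like this at training time.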

Quick Start & Requirements

  • Models are available on HuggingFace (e.g., llama_7b_ru_turbo_alpaca_lora, saiga_7b_lora).
  • Fine-tuning can be initiated via provided Colab notebooks.
  • Requires Python and standard ML libraries; a GPU is recommended for training and fine-tuning.
  • Official demos and fine-tuning resources are linked in the README.
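Since the released models are LoRA adapters, a typical way to use them is to load the base LLaMA weights and then apply the adapter with `peft`. The sketch below assumes standard `transformers`/`peft` APIs; the exact HuggingFace repository IDs (shown here as plausible defaults) should be taken from the README rather than this example.

```python
def load_saiga_lora(base_model_id="huggyllama/llama-7b",
                    adapter_id="IlyaGusev/saiga_7b_lora"):
    """Load a base LLaMA model and apply a Saiga LoRA adapter.

    The repo IDs above are assumptions for illustration; check the
    project README for the actual base model and adapter names.
    """
    # Imports are local so this sketch can be read without the
    # (heavy) dependencies installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(adapter_id)
    base = AutoModelForCausalLM.from_pretrained(base_model_id, torch_dtype="auto")
    model = PeftModel.from_pretrained(base, adapter_id)
    return tokenizer, model
```

For inference on a GPU, you would additionally move the model with `.to("cuda")` and call `model.generate(...)` on tokenized input.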

Highlighted Details

  • Offers multiple model sizes (7B, 13B, 30B, 70B) and tuning strategies (LoRA).
  • Includes comprehensive evaluation results on Russian NLP benchmarks like RussianSuperGLUE.
  • Provides datasets for instruction tuning (RuTurboAlpaca) and conversational AI (Saiga, GPT Role-play Realm).
  • Models are trained on a mix of Russian and English data.

Maintenance & Community

The project is developed and maintained by Ilya Gusev. Links to demos and evaluation code are provided in the README.

Licensing & Compatibility

Models are based on LLaMA, which has its own license. The datasets and code appear to be permissively licensed, but users should verify compatibility with LLaMA's terms for commercial use.

Limitations & Caveats

The README explicitly recommends the Saiga models over the older RuTurboAlpaca models, as the Saiga line is better supported and scores higher in side-by-side comparisons. The project is focused on Russian-language tasks, so it is of limited use for other languages.

Health Check

  • Last commit: 11 months ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 2 stars in the last 90 days
