ru_transformers by mgrankin

GPT-2 finetuning notebook for Russian language models

Created 6 years ago
767 stars

Top 45.5% on SourcePulse

Project Summary

This repository provides tools and pre-trained models for Russian GPT-2 language generation, targeting researchers and developers interested in fine-tuning or deploying large language models for Russian text. It offers a comprehensive guide for training, evaluation, and deployment, including performance benchmarks and detailed instructions for dataset preparation and model configuration.

How It Works

The project leverages the GPT-2 architecture and implements progressive layer unfreezing for efficient transfer learning. It uses a YTTM (YouTokenToMe) tokenizer, noted for its speed and smaller file sizes compared to SentencePiece. Training is optimized with mixed-precision (fp16) via NVIDIA Apex and supports both GPU and Google TPU acceleration.
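
A minimal sketch of the YouTokenToMe workflow, training a BPE vocabulary on a plain-text corpus and encoding a sentence; the file paths and vocab_size below are hypothetical, not the repository's actual settings:

    import youtokentome as yttm

    # Hypothetical paths and vocabulary size -- the repository's own corpus
    # and settings may differ.
    corpus_path = "data/corpus.txt"     # plain-text Russian corpus
    bpe_model_path = "bpe/yt.model"     # where the trained BPE model is written

    # Train a BPE vocabulary on the raw corpus.
    yttm.BPE.train(data=corpus_path, model=bpe_model_path, vocab_size=50257)

    # Load the trained model and encode a sentence into token ids.
    bpe = yttm.BPE(model=bpe_model_path)
    ids = bpe.encode(["Пример русского текста."], output_type=yttm.OutputType.ID)
    print(ids)              # list of token-id lists, one per input sentence
    print(bpe.decode(ids))  # round-trips back to the original text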

Quick Start & Requirements
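
The README's exact install and download commands are not reproduced in this summary. As a rough sketch only, assuming a fine-tuned checkpoint and YTTM vocabulary are already on disk (both paths below are hypothetical), generation through the Hugging Face transformers library could look like this:

    import torch
    import youtokentome as yttm
    from transformers import GPT2LMHeadModel

    # Hypothetical local paths -- substitute the checkpoint and BPE model
    # actually distributed with the repository.
    model = GPT2LMHeadModel.from_pretrained("checkpoints/ru_gpt2_medium")
    bpe = yttm.BPE(model="checkpoints/yt.model")
    model.eval()

    prompt = "Однажды утром"
    input_ids = torch.tensor(bpe.encode([prompt], output_type=yttm.OutputType.ID))

    with torch.no_grad():
        output = model.generate(input_ids, max_length=60, do_sample=True,
                                top_k=50, top_p=0.95)

    print(bpe.decode([output[0].tolist()])[0])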

Highlighted Details

  • Perplexity benchmarks provided for various model sizes (124M, 355M) and training configurations on different Russian datasets.
  • Supports a gradual unfreezing strategy (levels 0, 1, 2, 7, -1) for progressive training; see the unfreezing sketch after this list.
  • Includes scripts for model evaluation, text processing, and token conversion.
  • Offers a REST API deployment example using uvicorn; a hedged sketch of the same pattern follows below.
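
As a general illustration of progressive (gradual) unfreezing with a Hugging Face GPT2LMHeadModel, rather than the repository's exact implementation, the levels above can be read as "how much of the network is trainable at each stage":

    from transformers import GPT2LMHeadModel

    def unfreeze_top_layers(model, n_layers):
        """Freeze the whole model, then make the top n_layers transformer
        blocks and the LM head trainable; n_layers < 0 unfreezes everything."""
        if n_layers < 0:
            for param in model.parameters():
                param.requires_grad = True
            return
        for param in model.parameters():
            param.requires_grad = False
        if n_layers > 0:
            for block in model.transformer.h[-n_layers:]:
                for param in block.parameters():
                    param.requires_grad = True
        # Note: in GPT-2 the LM head weight is tied to the input embeddings,
        # so unfreezing the head also unfreezes the embedding matrix.
        for param in model.lm_head.parameters():
            param.requires_grad = True

    model = GPT2LMHeadModel.from_pretrained("gpt2")
    # Train in stages: head only, then progressively deeper, then everything.
    for level in (0, 1, 2, 7, -1):
        unfreeze_top_layers(model, level)
        # ... run one fine-tuning stage at this unfreeze level ...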
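
The README's REST deployment script is not reproduced here; a hedged sketch of the same pattern with FastAPI served by uvicorn (the endpoint name and request schema are assumptions) might look like this:

    from fastapi import FastAPI
    from pydantic import BaseModel

    app = FastAPI()

    class GenerateRequest(BaseModel):
        prompt: str
        max_length: int = 60

    @app.post("/generate")
    def generate(req: GenerateRequest) -> dict:
        # Placeholder: call the fine-tuned model here (see the generation
        # sketch earlier in this summary).
        completion = req.prompt + " ..."
        return {"prompt": req.prompt, "completion": completion}

    # Launch with: uvicorn app:app --host 0.0.0.0 --port 8000
    # (assuming this file is saved as app.py)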

Maintenance & Community

  • The project appears to be maintained by mgrankin.
  • Links to Telegram bots (@PorfBot, @NeuroPoetBot) for direct model interaction are provided.

Licensing & Compatibility

  • The repository does not explicitly state a license in the README.
  • Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

  • The README mentions potential issues with Apex and DataParallel (apex/issues/227), which might affect mixed-precision training on multi-GPU configurations; see the sketch after this list.
  • Instructions for SentencePiece installation are provided but noted as skippable if using YTTM.
  • The project relies on AWS S3 for model distribution.
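
For context on the fp16 caveat above, Apex-style mixed-precision setup generally follows the pattern below; the opt_level and optimizer shown are assumptions, not settings confirmed by the README, and keeping the model on a single GPU sidesteps the DataParallel interaction referenced in the issue:

    import torch
    from apex import amp
    from transformers import GPT2LMHeadModel

    model = GPT2LMHeadModel.from_pretrained("gpt2").cuda()
    optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

    # O1 patches selected ops to fp16; keep the model on one GPU to avoid
    # the DataParallel interaction referenced in apex issue 227.
    model, optimizer = amp.initialize(model, optimizer, opt_level="O1")

    # Inside the training loop, losses are scaled through amp:
    # with amp.scale_loss(loss, optimizer) as scaled_loss:
    #     scaled_loss.backward()
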
Health Check

  • Last Commit: 4 years ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 0 stars in the last 30 days

Explore Similar Projects

Starred by Luis Capelo (Cofounder of Lightning AI), Eugene Yan (AI Scientist at AWS), and 14 more.

  • text by pytorch: PyTorch library for NLP tasks. 4k stars, created 8 years ago, updated 1 week ago.