bert4torch by Tongjilibo

PyTorch library for transformer models

created 3 years ago
1,310 stars

Top 31.2% on sourcepulse

Project Summary

This repository provides an elegant PyTorch implementation of transformer models, aiming to simplify the process of loading, fine-tuning, and deploying large language models (LLMs). It is designed for researchers and developers working with NLP tasks who need a flexible and efficient framework for various transformer architectures.

How It Works

The library offers a unified interface for building and managing transformer models, abstracting away much of the complexity associated with different architectures and pre-trained weights. It supports loading models from Hugging Face or local checkpoints, handling configuration files, and integrating common training tricks like LoRA. The design emphasizes code clarity and reusability, drawing inspiration from the Keras training style.

Quick Start & Requirements

  • Install: pip install bert4torch (a minimal usage sketch follows this list)
  • Requirements: PyTorch (developed on v2.0; also compatible with v1.10) and Python; a GPU is recommended for LLM workloads.
  • Setup: Minimal for basic usage; LLM fine-tuning and deployment require significant computational resources and dataset preparation.
  • Links: Documentation, Torch4keras, Examples
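
The sketch below illustrates the quick-start flow and the unified loading interface described above: building a BERT-style model from a config and checkpoint, then encoding a sentence. It is a minimal sketch, assuming the bert4keras-style entry points build_transformer_model and Tokenizer; the file paths are placeholders and exact argument names may vary between releases.

    import torch
    from bert4torch.models import build_transformer_model
    from bert4torch.tokenizers import Tokenizer

    # Placeholder paths: point these at a downloaded Hugging Face or local checkpoint.
    config_path = "path/to/bert_config.json"
    checkpoint_path = "path/to/pytorch_model.bin"
    dict_path = "path/to/vocab.txt"

    # Build the tokenizer and the model from the config and pre-trained weights.
    tokenizer = Tokenizer(dict_path, do_lower_case=True)
    model = build_transformer_model(config_path, checkpoint_path)  # a torch.nn.Module

    # Encode a sentence and run a forward pass without gradients.
    token_ids, segment_ids = tokenizer.encode("bert4torch makes transformers simple")
    with torch.no_grad():
        outputs = model([torch.tensor([token_ids]), torch.tensor([segment_ids])])

Per the summary above, the same entry point is meant to cover the other supported architectures; only the configuration and weights change.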

Highlighted Details

  • Supports a wide range of LLMs (ChatGLM, Llama, Baichuan, Qwen, etc.) and traditional transformers (BERT, RoBERTa, T5, etc.).
  • One-click deployment for LLM services via command line (bert4torch-llm-server).
  • Integrates common training tricks and callbacks for efficient fine-tuning (see the training sketch after this list).
  • Offers a comprehensive table of supported pre-trained weights and their loading methods.
  • Code is designed for ease of understanding and customization, with a focus on code reuse.
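
As an illustration of the Keras-style training flow mentioned above, here is a hedged sketch of fine-tuning a two-class classifier on top of the encoder. It assumes the torch4keras-derived BaseModel with compile/fit; config_path, checkpoint_path, and train_dataloader are placeholders, and a base-size (768-dimensional) encoder is assumed.

    import torch.nn as nn
    import torch.optim as optim
    from bert4torch.models import build_transformer_model, BaseModel

    config_path = "path/to/bert_config.json"       # placeholder
    checkpoint_path = "path/to/pytorch_model.bin"  # placeholder
    train_dataloader = ...  # a torch DataLoader yielding ([token_ids, segment_ids], labels)

    class SentimentModel(BaseModel):
        def __init__(self):
            super().__init__()
            # with_pool=True also returns a pooled [CLS] vector alongside hidden states
            self.bert = build_transformer_model(config_path, checkpoint_path, with_pool=True)
            self.dropout = nn.Dropout(0.1)
            self.classifier = nn.Linear(768, 2)  # assumes a base-size encoder

        def forward(self, token_ids, segment_ids):
            _, pooled = self.bert([token_ids, segment_ids])
            return self.classifier(self.dropout(pooled))

    model = SentimentModel()
    model.compile(loss=nn.CrossEntropyLoss(),
                  optimizer=optim.Adam(model.parameters(), lr=2e-5))
    # Keras-style training loop; checkpointing and evaluation callbacks can be
    # passed via the callbacks argument.
    model.fit(train_dataloader, epochs=3, steps_per_epoch=None)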

Maintenance & Community

  • The project is primarily maintained by a single individual.
  • Community support is available via WeChat (contact author for group invitation).

Licensing & Compatibility

  • The repository does not explicitly state a license in the README. This requires clarification for commercial use or integration into closed-source projects.

Limitations & Caveats

  • The project is largely maintained by a single individual, which could impact long-term development velocity and support.
  • The absence of a clear license in the README is a significant caveat for adoption, especially for commercial applications.
Health Check

  • Last commit: 2 days ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 26
  • Star History: 27 stars in the last 90 days

Explore Similar Projects

Starred by Stas Bekman (author of the Machine Learning Engineering Open Book; research engineer at Snowflake) and Travis Fischer (founder of Agentic).

lingua by facebookresearch

LLM research codebase for training and inference

  • Top 0.1% on sourcepulse
  • 5k stars
  • Created 9 months ago; updated 2 weeks ago