LLMBox by RUCAIBox

A library for training and evaluating large language models (LLMs)

created 1 year ago
828 stars

Top 43.8% on sourcepulse

Project Summary

LLMBox is a comprehensive Python library for training and evaluating Large Language Models (LLMs), targeting researchers and practitioners. It offers a unified pipeline for diverse training strategies and extensive model evaluation capabilities, aiming to streamline LLM development and benchmarking.

How It Works

LLMBox provides a flexible framework supporting multiple training paradigms, including Supervised Fine-Tuning (SFT), Pre-Training (PT), PPO, and DPO. It integrates parameter-efficient fine-tuning methods such as LoRA and QLoRA, along with efficiency boosters such as FlashAttention and DeepSpeed. On the utilization side, it supports inference acceleration techniques such as KV-cache management and vLLM, and offers three evaluation methods (generation, perplexity, and probability) across 59+ datasets, with support for In-Context Learning (ICL) and Chain-of-Thought (CoT) prompting.
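To give a flavor of the perplexity-style evaluation method, here is a minimal generic sketch in plain Python (not LLMBox's actual API): multiple-choice answers are ranked by the perplexity a model assigns to each candidate continuation, with a toy `token_logprobs` scorer standing in for a real language model.

```python
import math

def token_logprobs(context: str, continuation: str) -> list[float]:
    # Toy stand-in for a real model's per-token log-probabilities:
    # characters already seen in the context get a higher probability.
    # (Illustrative heuristic only -- not how a real LM scores text.)
    return [math.log(0.5) if ch in context else math.log(0.1)
            for ch in continuation]

def perplexity(context: str, continuation: str) -> float:
    # Perplexity = exp of the negative mean log-probability.
    lps = token_logprobs(context, continuation)
    return math.exp(-sum(lps) / len(lps))

def rank_choices(question: str, choices: list[str]) -> str:
    # Lower perplexity means the model finds that answer more natural.
    return min(choices, key=lambda c: perplexity(question, c))

question = "The cat sat on the"
print(rank_choices(question, ["mat on the cat", "xylophone vqz"]))
# -> "mat on the cat" (its characters mostly appear in the context)
```

A real evaluator would sum the log-probabilities that an actual LLM assigns to each option's tokens; the ranking logic is otherwise the same.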

Quick Start & Requirements

  • Install via git clone and pip install -r requirements.txt. Minimal requirements for OpenAI model evaluation are available via requirements-openai.txt.
  • Supports Hugging Face models, OpenAI, Anthropic, Qwen, and OpenAI-compatible models.
  • Requires Python and standard ML dependencies. GPU with CUDA is recommended for training and efficient inference.
  • Training example: python train.py --model_name_or_path meta-llama/Llama-2-7b-hf ...
  • Utilization example: python inference.py -m gpt-3.5-turbo -d copa
  • Documentation: LLMBox Documentation

Highlighted Details

  • Supports tokenizer vocabulary merging for expanded token sets.
  • Integrates Self-Instruct and Evol-Instruct for automated data augmentation.
  • Achieves up to 6x speedup in local inference via KV Cache management or vLLM.
  • Reproduces results from original papers of major LLMs.
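The KV-cache speedup noted above comes from reusing attention keys and values across decoding steps instead of recomputing them for every prefix token. A toy Python sketch of the idea (a generic concept illustration, not LLMBox's implementation):

```python
class TinyAttention:
    """Toy autoregressive attention layer that caches keys/values."""

    def __init__(self):
        self.calls = 0                     # key/value projections computed
        self.k_cache, self.v_cache = [], []

    def project(self, tok):
        # Stand-in for the expensive K/V projection of one token.
        self.calls += 1
        return (hash(tok) % 97, hash(tok) % 89)

    def step(self, tok):
        # With a KV cache, only the NEW token is projected each step;
        # previously computed keys/values are reused from the cache.
        k, v = self.project(tok)
        self.k_cache.append(k)
        self.v_cache.append(v)
        return len(self.k_cache)           # attend over all cached K/V

attn = TinyAttention()
for tok in ["The", "cat", "sat"]:
    attn.step(tok)

# Cached: 3 projections for 3 tokens. Without a cache, step t would
# re-project all t prefix tokens: 1 + 2 + 3 = 6 projections.
assert attn.calls == 3
```

This linear-versus-quadratic difference in recomputation is what KV-cache management (and systems like vLLM, which manage cache memory in pages) exploits.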

Maintenance & Community

  • Developed and maintained by AI Box.
  • Open to contributions via issue tracker and pull requests.
  • The README credits contributor @xansar with help resolving issues.

Licensing & Compatibility

  • MIT License.
  • Permissive license suitable for commercial use and integration with closed-source projects.

Limitations & Caveats

The README notes that vLLM, while faster, may require parameter adjustments to match the results of the default Transformers backend. It also mentions dataset-specific evaluation methods (e.g., normalized accuracy for ARC, OpenBookQA, and RACE), so reported metrics may vary across datasets and methods.

Health Check

  • Last commit: 1 month ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 33 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen (author of AI Engineering and Designing Machine Learning Systems), Jeff Hammerbacher (cofounder of Cloudera), and 10 more.

open-r1 by huggingface

  • SDK for reproducing DeepSeek-R1
  • 25k stars (top 0.2%)
  • Created 6 months ago, updated 3 days ago