YuLan-Chat by RUC-GSAI

Open-source LLM for chat, instruction-following, and general language tasks

created 2 years ago
630 stars

Top 53.5% on sourcepulse

Project Summary

YuLan-Chat offers open-source chat-based large language models, primarily targeting Chinese and English users. Developed by the Gaoling School of Artificial Intelligence at Renmin University of China (RUC-GSAI), it aims to provide helpful, honest, and harmless AI assistants, with recent versions trained from scratch and featuring enhanced Chinese language support and longer context windows.

How It Works

YuLan-Chat models are built through large-scale pre-training on English, Chinese, and multilingual data, then fine-tuned with a curriculum learning strategy on high-quality instructions and human preference data to improve helpfulness, honesty, and harmlessness. Specific versions have expanded vocabularies and longer context windows (up to 8k for some versions) to better support Chinese inputs and outputs.
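As a rough illustration of the idea (not the project's actual pipeline), curriculum fine-tuning presents training examples in order of increasing difficulty; the per-example difficulty score below is an assumed stand-in:

```python
# Illustrative curriculum ordering, assuming each instruction example
# carries a precomputed "difficulty" score. Not the project's actual code.
def curriculum_order(examples):
    """Sort instruction examples from easy to hard for curriculum fine-tuning."""
    return sorted(examples, key=lambda ex: ex["difficulty"])

data = [
    {"instruction": "Prove that sqrt(2) is irrational.", "difficulty": 3},
    {"instruction": "Translate 'hello' into Chinese.", "difficulty": 1},
    {"instruction": "Summarize the paragraph below.", "difficulty": 2},
]

for example in curriculum_order(data):
    print(example["instruction"])  # the trainer would consume batches in this order
```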

Quick Start & Requirements

  • Installation: Create a conda environment (conda create -n yulan python=3.10 -y, then conda activate yulan), install the recommended PyTorch 1.13 and bitsandbytes 0.39.0, then run pip install -r requirements.txt.
  • Model Weights: For LLaMA-based models, apply delta weights to original LLaMA checkpoints. LLaMA-2 based models can be used directly.
  • Usage: Models can be loaded via Hugging Face Transformers; example Python code and command-line inference scripts are provided (see the loading sketch after this list). INT-8 quantization is available for single-GPU deployment.
  • Resources: INT-8 quantized 13B models require ~24 GB of VRAM (e.g., an RTX 3090); 65B models require ~80 GB (e.g., an A100).
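A minimal loading sketch under stated assumptions: the Hub model ID below is illustrative (a local checkpoint path also works), and the [|Human|]/[|AI|] prompt tags are assumed from the project's chat template; check the README for exact values.

```python
# Minimal sketch: load a YuLan-Chat model with INT-8 quantization for a
# single GPU. The model ID and prompt tags are assumptions; substitute
# the checkpoint and template documented in the README.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "yulan-team/YuLan-Chat-2-13b-fp16"  # assumed ID; a local path also works

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    load_in_8bit=True,   # INT-8 via bitsandbytes; a 13B model fits in ~24 GB VRAM
    device_map="auto",   # requires the accelerate package
)

prompt = "[|Human|]:Hello, introduce yourself.\n[|AI|]:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```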

Highlighted Details

  • Offers models ranging from 13B to 65B parameters, including a recent 12B model trained from scratch.
  • Supports up to 8k context length for some versions (e.g., YuLan-Chat-2-13B).
  • Evaluated on MMLU, C-Eval, and AGI-Eval benchmarks, showing competitive performance, particularly in Chinese language tasks.
  • Includes a lightweight 2.4B model (YuLan-Mini) trained on 1T tokens.

Maintenance & Community

The project is actively developed by researchers from Renmin University of China. Specific contributors are listed for pre-training and fine-tuning roles. No community links (Discord, Slack) are provided in the README.

Licensing & Compatibility

The project uses the MIT License. However, all data and code are restricted to academic purposes only, which may limit commercial use or integration into closed-source projects.

Limitations & Caveats

While efforts are made to mitigate harmful outputs, the models are probabilistic and may generate biased, discriminatory, or otherwise harmful content. The project disclaims responsibility for consequences arising from the dissemination of such information.

Health Check

  • Last commit: 6 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 8 stars in the last 90 days
