BayLing by ictnlp

Multilingual LLM for cross-lingual alignment and instruction following

created 2 years ago
317 stars

Top 86.7% on sourcepulse

Project Summary

BayLing is a multilingual large language model that combines cross-lingual alignment with instruction following, with a focus on English and Chinese. It targets researchers and developers working on multilingual NLP tasks, offering strong English and Chinese generation and instruction following, and it extends to over 100 languages through efficient alignment techniques.

How It Works

BayLing achieves efficient language alignment by combining instructions in high-resource languages (Chinese and English) with cross-lingual instructions covering over 100 languages during training. This approach transfers knowledge from high-resource languages to low-resource languages, improving the model's multilingual generation. The model is based on the LLaMA architecture and is available in 7B and 13B parameter sizes, plus a version built on Llama-3-8B.
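The data-mixing idea described above can be illustrated with a small sketch. Everything here is an assumption made for illustration: the record fields (`lang`, `instruction`, `output`), the example instructions, and the helper `build_training_mix` are hypothetical and are not taken from the BayLing training code, whose actual data format and mixing ratios are not described in this summary.

```python
import random
from typing import Dict, List

Example = Dict[str, str]  # hypothetical record format, not BayLing's actual schema


def build_training_mix(
    high_resource: List[Example],
    cross_lingual: List[Example],
    seed: int = 0,
) -> List[Example]:
    """Combine both instruction pools and shuffle them into one training set."""
    mix = list(high_resource) + list(cross_lingual)
    # A seeded RNG keeps the shuffle reproducible between runs.
    random.Random(seed).shuffle(mix)
    return mix


# Instructions written in the high-resource languages (English and Chinese).
high_resource = [
    {"lang": "en", "instruction": "Summarize the paragraph.", "output": "..."},
    {"lang": "zh", "instruction": "用一句话总结这段文字。", "output": "..."},
]

# Cross-lingual instructions that pair another language with English.
# Treating these as translation-style tasks is an assumption.
cross_lingual = [
    {"lang": "de-en", "instruction": "Translate from German to English.", "output": "..."},
    {"lang": "sw-en", "instruction": "Translate from Swahili to English.", "output": "..."},
]

training_mix = build_training_mix(high_resource, cross_lingual)
print(len(training_mix))
```

In this sketch the two pools are simply concatenated and shuffled. A real pipeline would likely control the ratio between them, which the source does not specify.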

Quick Start & Requirements

  • Install: Clone the repository and install requirements: pip install -r requirements.txt.
  • Prerequisites: Python 3.10, PyTorch 2.0, transformers 4.28.1, FastChat. A GPU with at least 10 GB (7B model) or 16 GB (13B model) of VRAM is recommended for command-line interaction.
  • Setup: Basic setup involves cloning the repo and installing dependencies. Applying delta weights requires downloading base LLaMA models.
  • Links: Homepage, Demo, BayLing 1 Paper, BayLing 2 Paper.
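To make the delta-weight step in the setup notes concrete, here is a minimal conceptual sketch. It assumes the delta is a plain element-wise difference between the fine-tuned and base parameters, which is a common scheme for LLaMA-derived releases but is not confirmed by this summary. The helper `apply_delta` and the parameter names are hypothetical, and plain Python lists stand in for real weight tensors. For actual use, follow the instructions in the repository.

```python
from typing import Dict, List

Tensor = List[float]  # stand-in for a real weight tensor


def apply_delta(base: Dict[str, Tensor], delta: Dict[str, Tensor]) -> Dict[str, Tensor]:
    """Return base + delta, parameter by parameter (hypothetical helper)."""
    if base.keys() != delta.keys():
        raise ValueError("base and delta must contain the same parameter names")
    merged: Dict[str, Tensor] = {}
    for name, base_values in base.items():
        delta_values = delta[name]
        if len(base_values) != len(delta_values):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Assumed scheme: fine-tuned weight = base weight + released delta.
        merged[name] = [b + d for b, d in zip(base_values, delta_values)]
    return merged


# Toy "base model" and "delta" with made-up parameter names and values.
base_weights = {"layer0.weight": [1.0, 2.0], "layer0.bias": [0.5]}
delta_weights = {"layer0.weight": [0.25, -0.5], "layer0.bias": [0.25]}

merged_weights = apply_delta(base_weights, delta_weights)
print(merged_weights)
```

Releasing only the delta means the fine-tuned weights cannot be reconstructed without the base LLaMA model, which is why the base model must be downloaded separately.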

Highlighted Details

  • Achieves 90% of ChatGPT's performance on various multilingual and general tasks.
  • Demonstrates strong performance on WMT22 multilingual translation benchmarks, competitive with specialized models and other LLMs.
  • In human evaluations, BayLing-13B was ranked first in 18% of cases for translation quality, 30% for instruction following, and 20% for multi-turn interaction, placing it second only to ChatGPT.
  • Evaluated on standardized tests like GaoKao, SAT, GRE, and GMAT, showing competitive results against models like GPT-3.5-turbo.

Maintenance & Community

The project is developed by the NLP Group of the Institute of Computing Technology, Chinese Academy of Sciences (ICT/CAS). Updates are posted in the repository, and the BayLing-2 models were recently released on Hugging Face. Contact: bayling@ict.ac.cn.

Licensing & Compatibility

The model weights (delta version) and inference code are released under the GNU General Public License v3.0 (GPLv3). The online demo is for non-commercial use only and is subject to LLaMA's Model License, OpenAI's Terms of Use, ShareGPT's Privacy Practices, and WMT22's Data License.

Limitations & Caveats

BayLing may generate inaccurate factual information, lacks proficiency in reasoning, mathematics, and coding tasks, and carries a risk of producing harmful or biased content. It cannot guarantee absolute accuracy. The project disclaims responsibility for data security, public opinion risks, or misuse of the models.

Health Check

  • Last commit: 8 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

3 stars in the last 90 days
