Multilingual LLM for cross-lingual alignment and instruction following
BayLing is a multilingual large language model designed to bridge cross-lingual alignment and instruction following, particularly for English and Chinese. It targets researchers and developers working on multilingual NLP tasks, offering strong English/Chinese generation and instruction-following performance, with capabilities extending to over 100 languages through efficient alignment techniques.
How It Works
BayLing achieves efficient language alignment by combining high-resource language instructions (Chinese and English) with cross-lingual instructions covering over 100 languages during training. This approach transfers knowledge from high-resource languages to low-resource ones, enhancing the model's multilingual generative capabilities. BayLing is based on the LLaMA architecture and is released at 7B and 13B parameter scales, with an additional version built on Llama-3-8B.
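To make the training mix concrete, the sketch below contrasts the two instruction types described above: high-resource monolingual instructions and cross-lingual (translation-style) instructions. The record schema, field names, and prompt wording are illustrative assumptions, not the project's actual data format.

```python
# Illustrative sketch of the two instruction types mixed during training.
# NOTE: the schema and prompt wording here are hypothetical, for
# explanation only; they do not reflect BayLing's real data files.

def make_monolingual_sample(instruction, response, lang):
    """A high-resource instruction sample (e.g. English or Chinese)."""
    return {"type": "monolingual", "lang": lang,
            "instruction": instruction, "response": response}

def make_cross_lingual_sample(src_text, tgt_text, src_lang, tgt_lang):
    """A cross-lingual instruction that links a high-resource language
    to another language via a translation task."""
    return {
        "type": "cross_lingual",
        "langs": (src_lang, tgt_lang),
        "instruction": f"Translate the following {src_lang} text into {tgt_lang}: {src_text}",
        "response": tgt_text,
    }

# A tiny mixed batch: knowledge learned on the monolingual side can
# transfer through the cross-lingual pairs to lower-resource languages.
training_mix = [
    make_monolingual_sample("Summarize the water cycle.", "Water evaporates ...", "en"),
    make_cross_lingual_sample("Good morning", "Guten Morgen", "English", "German"),
]

print(len(training_mix), training_mix[1]["type"])
```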
Quick Start & Requirements
Clone the repository, then install the dependencies from its root: pip install -r requirements.txt
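After installing the dependencies, inference typically requires rendering a multi-turn conversation into a single prompt string. The builder below is a minimal sketch assuming a generic USER/ASSISTANT template; the actual template used by BayLing's inference code may differ, so treat this only as an illustration of the mechanism.

```python
# Hypothetical chat-prompt builder for an instruction-tuned model.
# ASSUMPTION: the "USER:/ASSISTANT:" template is invented for this
# sketch; check the repository's inference code for the real format.

def build_prompt(history, user_message):
    """Render (user, assistant) turns plus a new user message into one
    prompt string ending at the point where the model should continue."""
    parts = []
    for user, assistant in history:
        parts.append(f"USER: {user}\nASSISTANT: {assistant}")
    parts.append(f"USER: {user_message}\nASSISTANT:")
    return "\n".join(parts)

prompt = build_prompt(
    history=[("Translate 'hello' into Chinese.", "你好")],
    user_message="Now translate it into German.",
)
print(prompt)
```

The prompt string would then be passed to the model's generation routine; keeping the template in one function makes it easy to swap in the project's real format later.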
Highlighted Details
Maintenance & Community
The project is developed by the NLP Group of the Institute of Computing Technology, Chinese Academy of Sciences (ICT/CAS). Updates are regularly posted, with recent releases of BayLing-2 models on Huggingface. Contact: bayling@ict.ac.cn.
Licensing & Compatibility
Model weights (delta version) and inference code are released under GNU General Public License v3.0 (GPLv3). The online demo is for non-commercial use only and is subject to LLaMA's Model License, OpenAI's Terms of Use, ShareGPT's Privacy Practices, and WMT22's Data License.
Limitations & Caveats
BayLing may generate inaccurate factual information, lacks proficiency in reasoning, mathematics, and coding tasks, and carries a risk of producing harmful or biased content; its outputs cannot be guaranteed accurate. The project disclaims responsibility for data security, public opinion risks, or misuse of the models.