Multilingual LLM for chat, knowledge QA, and code generation
XVERSE-13B is a multilingual large language model developed by XVERSE Technology Inc., designed for tasks requiring extensive context understanding and generation. It targets researchers and developers needing a powerful, open-source LLM with strong multilingual capabilities and a long context window, offering significant advantages in handling complex queries and extended dialogues.
How It Works
XVERSE-13B uses a standard decoder-only Transformer architecture. Its key differentiator is an 8K context length, the longest among open-source models of comparable size at release, which enables more comprehensive multi-turn conversations and analysis of longer inputs. The model is trained on a 3.2 trillion token dataset spanning more than 40 languages, with a focus on strong Chinese and English performance. A custom BPE tokenizer with a 100,534-token vocabulary supports efficient multilingual tokenization.
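As a rough illustration of the multilingual tokenizer, the sketch below loads it from the Hugging Face Hub and inspects the vocabulary; the checkpoint name xverse/XVERSE-13B and the need for trust_remote_code are assumptions about the published checkpoint, not details stated in this summary.

```python
# Minimal tokenizer sketch. Assumes the checkpoint is published as "xverse/XVERSE-13B"
# on the Hugging Face Hub and that its custom tokenizer requires trust_remote_code.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("xverse/XVERSE-13B", trust_remote_code=True)

# The BPE vocabulary is reported as 100,534 tokens.
print("vocab size:", tokenizer.vocab_size)

# Chinese and English text share the same vocabulary.
ids = tokenizer.encode("XVERSE-13B supports an 8K context. 它还支持四十多种语言。")
print("num tokens:", len(ids))
print("round trip:", tokenizer.decode(ids))
```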
Quick Start & Requirements
Dependencies are installed with pip install -r requirements.txt. Inference uses the Hugging Face transformers library, and example code is provided for loading the model and running inference, as sketched below.
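A minimal loading-and-generation sketch, assuming the xverse/XVERSE-13B checkpoint on the Hugging Face Hub, a bfloat16-capable GPU, and a plain text prompt; the repository's own example code and prompt format should be treated as authoritative.

```python
# Minimal inference sketch. Assumes the "xverse/XVERSE-13B" checkpoint, enough GPU memory
# for a 13B model in bfloat16, and that the repo's custom code is loaded via trust_remote_code.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "xverse/XVERSE-13B"  # assumed Hugging Face model id
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
model.eval()

prompt = "Explain the benefit of an 8K context window in two sentences."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```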
A chat_demo.py script is available for running a web-based demo server.
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
Like all LLMs, XVERSE-13B may produce inaccurate, biased, or offensive content. Developers must conduct safety testing and tuning for specific applications. The model's knowledge cutoff is July 2023. The repository warns against using the model for harmful purposes and disclaims liability for misuse.