LLM research release with 8k sequence length
XGen is a family of 7-billion-parameter large language models (LLMs) developed by Salesforce AI Research and designed to handle input sequences of up to 8K tokens. This research release targets developers and researchers who work with extended contexts, such as lengthy documents or long conversations.
How It Works
XGen models are trained on an 8K input sequence length, a significant increase over many contemporary LLMs. The architectural choices and training methodology behind this extended context window are detailed in the associated research paper; they allow the model to maintain coherence and understanding over much longer text spans. The models use OpenAI's tiktoken package for tokenization.
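As a rough illustration of working within the 8K window, the sketch below computes how many tokens are left for generation once a prompt has been tokenized, then loads the model through transformers. The helper names `generation_budget` and `generate` are hypothetical, and the 8192-token limit is an assumption about what "8K" means here. It should be checked against the model's own configuration.

```python
MODEL_ID = "Salesforce/xgen-7b-8k-base"
MAX_CONTEXT_TOKENS = 8192  # assumption: "8K" is read as 8192 tokens


def generation_budget(prompt_token_count: int,
                      max_context: int = MAX_CONTEXT_TOKENS) -> int:
    """Return how many new tokens fit after the prompt (hypothetical helper)."""
    if prompt_token_count < 0:
        raise ValueError("prompt_token_count must be non-negative")
    return max(0, max_context - prompt_token_count)


def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Load the model and continue the prompt (downloads weights on first call)."""
    # Heavy imports stay local so generation_budget works without them installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # trust_remote_code=True lets transformers load the tokenizer code shipped
    # with the model repository, which relies on the tiktoken package.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16
    )

    inputs = tokenizer(prompt, return_tensors="pt")
    prompt_len = inputs["input_ids"].shape[1]

    budget = generation_budget(prompt_len)
    if budget == 0:
        raise ValueError("prompt already fills the assumed context window")

    # Cap generation so prompt plus output stays inside the window.
    output = model.generate(**inputs, max_new_tokens=min(max_new_tokens, budget))
    return tokenizer.decode(output[0])
```

Calling `generate("The world is")` would fetch the checkpoint and needs enough memory for a 7B model in bfloat16, whereas `generation_budget` can be used on its own to validate prompt lengths before any model is loaded.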
Quick Start & Requirements
Install the tokenizer dependency: pip install tiktoken
Model: Salesforce/xgen-7b-8k-base. The provided Python snippet demonstrates loading and generating text using the transformers library.
torch_dtype=torch.bfloat16 is recommended for efficiency.
Highlighted Details
Maintenance & Community
This is a research release by Salesforce AI Research. Further community engagement details are not provided in the README.
Licensing & Compatibility
The models are released for research purposes only. Specific licensing terms beyond this research focus are not detailed in the README.
Limitations & Caveats
This release is for research purposes only and has not been evaluated for all downstream applications. Users are strongly advised to assess and address potential concerns regarding accuracy, safety, and fairness before deployment, especially in high-risk scenarios.