Qwen by QwenLM

Chat & pretrained LLM by Alibaba Cloud

created 2 years ago
18,895 stars

Top 2.4% on sourcepulse

Project Summary

Qwen provides a suite of large language models (LLMs) developed by Alibaba Cloud: the base models Qwen-1.8B, Qwen-7B, Qwen-14B, and Qwen-72B, together with their chat-tuned variants (Qwen-Chat). The models are designed for a wide range of natural language processing tasks, from content creation and summarization to tool usage and agentic behavior, and target researchers and developers.

How It Works

The Qwen models are pretrained on extensive multilingual datasets (up to 3 trillion tokens), focusing on Chinese and English across various domains. They employ techniques to support long context windows (up to 32K tokens) and offer various quantization methods (Int4, Int8, KV cache quantization) for improved efficiency. The chat models are further aligned with human preferences using SFT and RLHF, enabling conversational capabilities and tool integration.

Quick Start & Requirements

  • Installation: pip install -r requirements.txt
  • Prerequisites: Python 3.8+, PyTorch 1.12+, Transformers 4.32+. CUDA 11.4+ recommended for GPU usage. Optional: flash-attention for performance.
  • Usage: Examples are provided for Hugging Face Transformers and ModelScope integration; a minimal Transformers sketch follows this list. Docker images are available for simplified deployment.
  • Resources: Links to Hugging Face, ModelScope, and a technical report are provided.
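
As a rough illustration of the Hugging Face Transformers path above (a minimal sketch, assuming the Qwen/Qwen-7B-Chat checkpoint and a single-GPU or CPU setup; adjust dtype and device placement to your hardware):

    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Qwen ships custom modeling code, so trust_remote_code is required.
    model_id = "Qwen/Qwen-7B-Chat"
    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        device_map="auto",       # spread layers across available devices
        trust_remote_code=True,
    ).eval()

    # Chat-tuned checkpoints expose a chat() helper that tracks conversation history.
    response, history = model.chat(tokenizer, "Give me a short introduction to large language models.", history=None)
    print(response)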

Highlighted Details

  • Offers models ranging from 1.8B to 72B parameters, with competitive benchmark performance against models like LLaMA2 and GPT-3.5.
  • Supports advanced features like system prompts for customization, tool usage, and function calling.
  • Provides detailed finetuning guides (full-parameter, LoRA, Q-LoRA) and deployment options (vLLM, FastChat, local API).
  • Includes quantization techniques (GPTQ, KV cache quantization) and performance benchmarks for speed and memory; a sketch of loading an Int4 checkpoint follows this list.
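
For the quantized checkpoints, loading the Int4 GPTQ variant follows the same pattern (a minimal sketch, assuming the Qwen/Qwen-7B-Chat-Int4 checkpoint and that auto-gptq and optimum are installed in versions compatible with your transformers release):

    from transformers import AutoModelForCausalLM, AutoTokenizer

    # The Int4 GPTQ checkpoint reuses the standard interface; quantization
    # metadata stored in the checkpoint tells the loader how to dequantize.
    model_id = "Qwen/Qwen-7B-Chat-Int4"
    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        device_map="auto",
        trust_remote_code=True,
    ).eval()

    response, _ = model.chat(tokenizer, "Summarize the Qwen model family in one sentence.", history=None)
    print(response)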

Maintenance & Community

  • The QwenLM/Qwen repository is no longer actively maintained; newer Qwen releases are developed in separate repositories with a different codebase.
  • Community channels include Discord and WeChat. Contact email: qianwen_opensource@alibabacloud.com.

Licensing & Compatibility

  • Source code is under Apache 2.0 License.
  • Model weights for the 7B, 14B, and 72B models require an application via DashScope for commercial use. Qwen-1.8B is released under a research license agreement; commercial use requires contacting the team.

Limitations & Caveats

  • The primary repository QwenLM/Qwen is not actively maintained; users should refer to QwenLM/Qwen2.
  • Some quantization packages (e.g., auto-gptq) may have version compatibility issues with transformers and optimum.
  • KV cache quantization and flash attention cannot be used simultaneously.
  • Manual copying of certain non-Python files (.cpp, .cu) might be necessary for specific functionalities.

Health Check

  • Last commit: 1 week ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 1
  • Issues (30d): 20

Star History

  • 899 stars in the last 90 days
