ChatGLM3 is a series of open bilingual conversational large language models developed by Zhipu AI and Tsinghua University KEG Lab. It offers enhanced base models, improved functionality including tool calling and code interpretation, and extended context lengths (up to 128K tokens). The models are suitable for researchers and developers looking for powerful, deployable conversational AI with commercial use permissions.
How It Works
ChatGLM3-6B-Base utilizes a more diverse training dataset and refined training strategies, claiming superior performance among models under 10B parameters. It supports native tool calling, code execution, and agent tasks through a new prompt format. The series includes variants with extended context windows (32K and 128K) for better long-text comprehension.
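The new prompt format uses dedicated role tokens to separate turns. A minimal sketch of that layout is below; the role-token names follow the format documented in the ChatGLM3 repo, but in real use the tokenizer's own chat-building helpers should assemble the prompt, so treat this as an illustration rather than the canonical tokenization:

```python
# Sketch of ChatGLM3's multi-role prompt layout. Roles include system, user,
# assistant, and observation (for tool-call results fed back to the model).
def build_prompt(messages):
    """messages: list of {"role": ..., "content": ...} dicts."""
    parts = []
    for m in messages:
        parts.append(f"<|{m['role']}|>\n{m['content']}")
    parts.append("<|assistant|>")  # cue the model to produce the next reply
    return "\n".join(parts)

prompt = build_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is 2 + 2?"},
])
print(prompt)
```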
Quick Start & Requirements
- Install dependencies with `pip install -r requirements.txt`, and make sure the installed PyTorch version matches your hardware and CUDA setup.
- A GPU with at least 13 GB of VRAM is recommended for FP16 inference; 4-bit quantization is available for GPUs with less memory. CPU inference requires roughly 32 GB of RAM.
- Mac users can run inference on Apple silicon via the MPS backend (PyTorch nightly builds recommended).
- Official documentation: https://github.com/THUDM/ChatGLM3
- Demos: Gradio and Streamlit web demos, CLI demo, OpenAI API compatible server.
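Putting the steps above together, a typical setup flow looks like the following. The demo script paths reflect the repo layout at the time of writing and may change between versions:

```shell
# Clone the repo and install dependencies
git clone https://github.com/THUDM/ChatGLM3
cd ChatGLM3
pip install -r requirements.txt

# Command-line chat demo (downloads THUDM/chatglm3-6b from Hugging Face on first run)
python basic_demo/cli_demo.py

# Or launch the Streamlit web demo
streamlit run basic_demo/web_demo_streamlit.py
```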
Highlighted Details
- ChatGLM3-6B-Base outperforms other models under 10B parameters on benchmarks like GSM8K (72.3), MATH (25.7), and BBH (66.1).
- ChatGLM3-6B-32K shows over 50% improvement in long-text applications compared to its predecessor.
- Supports tool calling, code interpreter, and agent tasks.
- Offers an OpenAI-compatible API server for easy integration.
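Because the server mimics the OpenAI chat-completions API, existing clients can usually be pointed at it by changing the base URL. A standard-library-only sketch is below; the endpoint path follows the OpenAI convention, but the local host/port are assumptions that depend on how you launch the server:

```python
import json
import urllib.request

# Assumed address of a locally running OpenAI-compatible api_server;
# adjust host and port to match your launch configuration.
API_URL = "http://127.0.0.1:8000/v1/chat/completions"

# Request body in the OpenAI chat-completions shape.
payload = {
    "model": "chatglm3-6b",
    "messages": [{"role": "user", "content": "Hello!"}],
    "temperature": 0.8,
}

def chat(url=API_URL):
    """Send the request; requires the server to be running locally."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]
```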
Maintenance & Community
- Developed by THUDM and Zhipu AI.
- Community channels: Discord, WeChat.
- Latest model weights and updates are typically published on Hugging Face.
Licensing & Compatibility
- Weights are fully open for academic research.
- Free commercial use is permitted upon registration via a questionnaire.
- Users are urged to adhere to the open-source agreement and avoid harmful applications.
Limitations & Caveats
- The project team has not developed any official applications (web, mobile, desktop) for the open-source models.
- Due to model size and probabilistic nature, output accuracy is not guaranteed and can be influenced by user input.
- The project disclaims responsibility for data security, public opinion risks, or misuse arising from model outputs.