ChatGLM3 by zai-org

Bilingual chat LLM for complex scenarios (tool use, code execution, agents)

created 1 year ago
13,734 stars

Top 3.7% on sourcepulse

View on GitHub
Project Summary

ChatGLM3 is a series of open bilingual conversational large language models developed by Zhipu AI and Tsinghua University KEG Lab. It offers enhanced base models, improved functionality including tool calling and code interpretation, and extended context lengths (up to 128K tokens). The models are suitable for researchers and developers looking for powerful, deployable conversational AI with commercial use permissions.

How It Works

ChatGLM3-6B-Base utilizes a more diverse training dataset and refined training strategies, claiming superior performance among models under 10B parameters. It supports native tool calling, code execution, and agent tasks through a new prompt format. The series includes variants with extended context windows (32K and 128K) for better long-text comprehension.
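
The conversational interface is exposed through Hugging Face transformers. A minimal sketch of the documented chat loop, assuming a CUDA GPU with roughly 13 GB of free VRAM and access to the THUDM/chatglm3-6b checkpoint on Hugging Face:

```python
from transformers import AutoTokenizer, AutoModel

# Load the tokenizer and model; trust_remote_code pulls in the custom ChatGLM3 modeling code.
tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm3-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm3-6b", trust_remote_code=True).half().cuda()
model = model.eval()

# model.chat() returns the reply plus the updated conversation history,
# which can be passed back in for multi-turn dialogue.
response, history = model.chat(tokenizer, "Hello, what can you do?", history=[])
print(response)

response, history = model.chat(tokenizer, "Summarize that in one sentence.", history=history)
print(response)
```

Tool calling and the code interpreter use the same chat interface but rely on the new prompt format described in the repository's tool-use documentation.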

Quick Start & Requirements

  • Install dependencies via pip install -r requirements.txt, and make sure the installed PyTorch version matches your hardware and CUDA setup.
  • A GPU with at least 13 GB of VRAM is recommended for FP16 inference; 4-bit quantization lowers the requirement, and CPU inference needs roughly 32 GB of RAM (see the load-time sketch after this list).
  • Mac users can run on the MPS backend with a PyTorch nightly build.
  • Official documentation: https://github.com/THUDM/ChatGLM3
  • Demos: Gradio and Streamlit web demos, CLI demo, OpenAI API compatible server.
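
The load-time configuration determines the memory footprint. The variants below are sketched from the upstream README; exact calls and memory figures may differ between releases, so treat them as illustrative:

```python
from transformers import AutoModel

MODEL = "THUDM/chatglm3-6b"

# Default FP16 on a CUDA GPU (roughly 13 GB VRAM).
model = AutoModel.from_pretrained(MODEL, trust_remote_code=True).half().cuda()

# 4-bit quantization for smaller GPUs (quantize() comes from the model's custom code).
# model = AutoModel.from_pretrained(MODEL, trust_remote_code=True).quantize(4).cuda()

# CPU inference in FP32 (needs roughly 32 GB of RAM).
# model = AutoModel.from_pretrained(MODEL, trust_remote_code=True).float()

# Apple Silicon via the MPS backend (requires a PyTorch nightly build).
# model = AutoModel.from_pretrained(MODEL, trust_remote_code=True).half().to("mps")
```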

Highlighted Details

  • ChatGLM3-6B-Base outperforms other models under 10B parameters on benchmarks like GSM8K (72.3), MATH (25.7), and BBH (66.1).
  • ChatGLM3-6B-32K shows over 50% improvement in long-text applications compared to its predecessor.
  • Supports tool calling, code interpreter, and agent tasks.
  • Offers an OpenAI-compatible API server for easy integration (a sample client call is sketched below).
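
The repository ships an OpenAI-compatible API demo that can be queried with the standard openai Python client. The base URL, port, and model name below are assumptions for illustration; check the repository's OpenAI API demo for the exact server command and defaults:

```python
from openai import OpenAI

# Point the standard OpenAI client at the locally hosted ChatGLM3 server.
# Base URL, port, and model name are placeholders; adjust to your deployment.
client = OpenAI(base_url="http://127.0.0.1:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="chatglm3-6b",
    messages=[{"role": "user", "content": "Briefly introduce yourself."}],
    temperature=0.8,
)
print(response.choices[0].message.content)
```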

Maintenance & Community

  • Developed by THUDM and Zhipu AI.
  • Community channels: Discord, WeChat.
  • The latest model updates are often released first on Hugging Face.

Licensing & Compatibility

  • Weights are fully open for academic research.
  • Free commercial use is permitted upon registration via a questionnaire.
  • Users are urged to adhere to the open-source agreement and avoid harmful applications.

Limitations & Caveats

  • The project team has not developed any official applications (web, mobile, desktop) for the open-source models.
  • Due to the model's scale and the probabilistic nature of generation, output accuracy is not guaranteed, and outputs can be skewed by user input.
  • The project disclaims responsibility for data security, public opinion risks, or misuse arising from model outputs.
Health Check

  • Last commit: 6 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star history: 144 stars in the last 90 days
