ChatGLM3  by zai-org

Bilingual chat LLM for complex scenarios (tool use, code execution, agents)

Created 1 year ago
13,743 stars

Top 3.6% on SourcePulse

GitHubView on GitHub
Project Summary

ChatGLM3 is a series of open bilingual conversational large language models developed by Zhipu AI and Tsinghua University KEG Lab. It offers enhanced base models, improved functionality including tool calling and code interpretation, and extended context lengths (up to 128K tokens). The models are suitable for researchers and developers looking for powerful, deployable conversational AI with commercial use permissions.

How It Works

ChatGLM3-6B-Base utilizes a more diverse training dataset and refined training strategies, claiming superior performance among models under 10B parameters. It supports native tool calling, code execution, and agent tasks through a new prompt format. The series includes variants with extended context windows (32K and 128K) for better long-text comprehension.

Quick Start & Requirements

  • Install via pip install -r requirements.txt. Ensure correct PyTorch version.
  • GPU with at least 13GB VRAM recommended for FP16. 4-bit quantization is available for lower VRAM. CPU inference requires ~32GB RAM.
  • Mac users can utilize the MPS backend with PyTorch-Nightly.
  • Official documentation: https://github.com/THUDM/ChatGLM3
  • Demos: Gradio and Streamlit web demos, CLI demo, OpenAI API compatible server.

Highlighted Details

  • ChatGLM3-6B-Base outperforms other models under 10B parameters on benchmarks like GSM8K (72.3), MATH (25.7), and BBH (66.1).
  • ChatGLM3-6B-32K shows over 50% improvement in long-text applications compared to its predecessor.
  • Supports tool calling, code interpreter, and agent tasks.
  • Offers an OpenAI-compatible API server for easy integration.

Maintenance & Community

  • Developed by THUDM and Zhipu AI.
  • Community channels: Discord, WeChat.
  • Latest updates often on Huggingface.

Licensing & Compatibility

  • Weights are fully open for academic research.
  • Free commercial use is permitted upon registration via a questionnaire.
  • Users are urged to adhere to the open-source agreement and avoid harmful applications.

Limitations & Caveats

  • The project team has not developed any official applications (web, mobile, desktop) for the open-source models.
  • Due to model size and probabilistic nature, output accuracy is not guaranteed and can be influenced by user input.
  • The project disclaims responsibility for data security, public opinion risks, or misuse arising from model outputs.
Health Check
Last Commit

8 months ago

Responsiveness

Inactive

Pull Requests (30d)
0
Issues (30d)
0
Star History
32 stars in the last 30 days

Explore Similar Projects

Starred by Chip Huyen Chip Huyen(Author of "AI Engineering", "Designing Machine Learning Systems"), Simon Willison Simon Willison(Coauthor of Django), and
10 more.

Yi by 01-ai

0%
8k
Open-source bilingual LLMs trained from scratch
Created 1 year ago
Updated 9 months ago
Starred by Chip Huyen Chip Huyen(Author of "AI Engineering", "Designing Machine Learning Systems"), Jeff Hammerbacher Jeff Hammerbacher(Cofounder of Cloudera), and
4 more.

ChatGLM-6B by zai-org

0.0%
41k
Bilingual dialogue language model for research
Created 2 years ago
Updated 1 year ago
Feedback? Help us improve.