ChatGLM-6B by zai-org

Bilingual dialogue language model for research

created 2 years ago
41,112 stars

Top 0.7% on sourcepulse

Project Summary

ChatGLM-6B is an open-source, bilingual (Chinese/English) dialogue language model based on the General Language Model (GLM) architecture. It offers a 6.2 billion parameter model optimized for Chinese question answering and dialogue, with the ability to run on consumer-grade GPUs with as little as 6GB VRAM through quantization.

How It Works

The model is trained on approximately 1T tokens of Chinese and English data, then refined with supervised fine-tuning, feedback bootstrapping, and Reinforcement Learning from Human Feedback (RLHF) to align responses with human preferences. For customization, it supports parameter-efficient fine-tuning via P-Tuning v2, which needs as little as 7GB of VRAM at INT4 quantization.
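As a rough illustration of the P-Tuning v2 workflow, the sketch below shows how a trained prefix checkpoint is typically loaded on top of the base weights for inference, following the pattern described in the repo's ptuning examples. The checkpoint path and PRE_SEQ_LEN are placeholders and must match whatever was used during fine-tuning.

```python
import os
import torch
from transformers import AutoConfig, AutoModel, AutoTokenizer

# Placeholders: point at your own P-Tuning v2 output; pre_seq_len must match training.
CHECKPOINT_PATH = "output/my-ptuning-checkpoint"   # hypothetical path
PRE_SEQ_LEN = 128

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
config = AutoConfig.from_pretrained(
    "THUDM/chatglm-6b", trust_remote_code=True, pre_seq_len=PRE_SEQ_LEN
)
model = AutoModel.from_pretrained("THUDM/chatglm-6b", config=config, trust_remote_code=True)

# Keep only the trained prefix-encoder weights from the fine-tuning checkpoint.
prefix_state_dict = torch.load(os.path.join(CHECKPOINT_PATH, "pytorch_model.bin"))
prefix_weights = {
    k[len("transformer.prefix_encoder."):]: v
    for k, v in prefix_state_dict.items()
    if k.startswith("transformer.prefix_encoder.")
}
model.transformer.prefix_encoder.load_state_dict(prefix_weights)

model = model.half().cuda().eval()
response, history = model.chat(tokenizer, "你好", history=[])
print(response)
```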

Quick Start & Requirements

  • Install dependencies: pip install -r requirements.txt (transformers version >= 4.23.1, recommended 4.27.1).
  • For CPU inference with quantized models, install GCC and OpenMP.
  • GPU VRAM requirements: 6GB (INT4 inference), 8GB (INT8 inference), 13GB (FP16 inference); see the loading sketch after this list.
  • Fine-tuning requires slightly more VRAM (7GB for INT4).
  • Official demos: web_demo.py (web UI) and cli_demo.py (command line).
  • API Deployment: api.py
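A minimal inference sketch in the style of the repo's README usage is shown below; the chat and quantize helpers come from the repository's remote code loaded via trust_remote_code=True, and the commented line shows the INT4 path for ~6GB GPUs.

```python
from transformers import AutoTokenizer, AutoModel

# Load the tokenizer and FP16 weights (roughly 13GB of VRAM).
tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half().cuda()

# INT4 alternative for ~6GB GPUs: quantize before moving the model to the GPU.
# model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).quantize(4).half().cuda()

model = model.eval()
response, history = model.chat(tokenizer, "你好", history=[])
print(response)
```

Once api.py is running, it can be queried over HTTP. The example below assumes the server's default local port (8000) and the prompt/history request fields used in the repo; adjust if your deployment differs.

```python
import requests

# Assumes api.py is serving on its default local port.
resp = requests.post(
    "http://127.0.0.1:8000",
    json={"prompt": "你好", "history": []},
)
print(resp.json()["response"])
```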

Highlighted Details

  • Supports INT4 quantization for low VRAM usage (6GB).
  • Efficient parameter fine-tuning (P-Tuning v2) available.
  • Model weights are open for academic research and free for commercial use upon registration.
  • Recent updates include ChatGLM2-6B with improved performance and 32K context length, and VisualGLM-6B for multimodal capabilities.

Maintenance & Community

  • Active development with releases like CodeGeeX2, ChatGLM2-6B, and VisualGLM-6B.
  • Community links: Discord and WeChat (details in README).
  • Related open-source projects and acceleration efforts are linked from the README.

Licensing & Compatibility

  • Code licensed under Apache-2.0.
  • Model weights follow a separate Model License, allowing academic research and free commercial use after registration.

Limitations & Caveats

The model's 6B parameter size limits its factual recall, logical reasoning, and performance on complex tasks. It may generate biased or harmful content and has weaker English language capabilities compared to Chinese. The model is also susceptible to misdirection and has limited conversational robustness.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 1
  • Star History: 276 stars in the last 90 days
