Open multilingual multimodal chat LMs for dialogue, reasoning, and rumination
The GLM-4 series offers open-source, multilingual, multimodal chat Large Language Models (LLMs) designed for dialogue, reasoning, and agent tasks. Targeting researchers and developers, these models provide competitive performance against leading proprietary models, with a focus on user-friendly local deployment and extended context capabilities.
How It Works
The GLM-4 models are built on a foundation of extensive pre-training (up to 15T tokens) incorporating reasoning-focused synthetic data. Post-training employs human preference alignment for dialogue, alongside techniques like rejection sampling and reinforcement learning to enhance instruction following, code generation, and function calling. Specialized variants like GLM-Z1-Rumination-32B-0414 utilize scaled end-to-end reinforcement learning with rubric-graded responses and tool usage for complex, open-ended problem-solving.
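As a toy illustration of the rejection-sampling step mentioned above (a sketch of the general technique, not GLM-4's actual pipeline; `generate` and `reward` are hypothetical stand-ins for the policy model and the preference/reward model):

```python
import random

def rejection_sample(prompt, generate, reward, k=8):
    """Draw k candidate responses and keep the highest-reward one.

    Toy sketch of rejection sampling as used in post-training data
    curation: 'generate' stands in for the policy model and 'reward'
    for a reward model; both are hypothetical placeholders here.
    """
    candidates = [generate(prompt) for _ in range(k)]
    return max(candidates, key=reward)

# Toy stand-ins: responses are the prompt plus a random suffix,
# and the "reward" simply prefers longer answers.
random.seed(0)
gen = lambda p: p + "!" * random.randint(1, 5)
best = rejection_sample("hi", gen, reward=len, k=8)
```

The selected high-reward responses would then feed back into supervised fine-tuning; in practice the reward comes from a learned preference model rather than a hand-written heuristic.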
Quick Start & Requirements
Inference requires the transformers and torch Python packages. Example fine-tuning command:

cd finetune && pip install -r ../inference/requirements.txt && pip install -r requirements.txt && python finetune.py data/AdvertiseGen/ THUDM/GLM-4-9B-0414 configs/lora.yaml
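For a first local test, a minimal inference sketch with transformers might look like the following (the model name is taken from the fine-tuning command above; the generation arguments and chat-template usage are assumptions, not taken from the repository's documentation):

```python
def chat(prompt: str, model_name: str = "THUDM/GLM-4-9B-0414") -> str:
    """Generate one reply from a local GLM-4 checkpoint.

    Downloads the weights on first use; transformers and torch are
    imported lazily so the sketch can be read without them installed.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(
        model_name, device_map="auto", torch_dtype="auto"
    )
    # Format the single-turn conversation with the model's chat template.
    inputs = tokenizer.apply_chat_template(
        [{"role": "user", "content": prompt}],
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)
    outputs = model.generate(inputs, max_new_tokens=256)
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True)
```

A 9B-parameter model in 16-bit precision needs roughly 18 GB of accelerator memory; `device_map="auto"` lets transformers spread the weights across available devices.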
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats