GLM-4.5 by zai-org

Foundation models for intelligent agents

Created 2 months ago
2,649 stars

Top 17.8% on SourcePulse

View on GitHub
Project Summary

GLM-4.5 is an open-source series of large language models designed for intelligent agents, offering both a 355B parameter (GLM-4.5) and a more efficient 106B parameter (GLM-4.5-Air) variant. These models unify reasoning, coding, and agent capabilities, featuring a hybrid reasoning approach with distinct "thinking" and "non-thinking" modes for complex tasks and immediate responses, respectively.

How It Works

GLM-4.5 models are hybrid reasoning systems that combine a large base model with specialized reasoning and tool-use capabilities. They employ a dual-mode architecture: "thinking mode" for intricate problem-solving and tool integration, and "non-thinking mode" for faster, direct responses. This approach aims to balance computational depth with response latency, making them suitable for diverse agentic applications.

Quick Start & Requirements

  • Installation: pip install -r requirements.txt (pulls in transformers and the other dependencies)
  • Inference Frameworks: Supports transformers, vLLM, and SGLang.
  • Hardware:
    • BF16: GLM-4.5 requires 16x H100 or 8x H200 GPUs; GLM-4.5-Air requires 4x H100 or 2x H200 GPUs.
    • FP8: GLM-4.5 requires 8x H100 or 4x H200 GPUs; GLM-4.5-Air requires 2x H100 or 1x H200 GPUs.
    • Context Length: Full 128K context requires double the GPU counts listed above.
    • Memory: Server memory must exceed 1TB for normal operation.
  • Fine-tuning: Supports LoRA and SFT/RL via Llama Factory and Swift.
  • Links: Hugging Face, ModelScope, Technical Blog
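The GPU counts above line up with a quick weight-memory estimate. The sketch below counts weights only, assuming 80 GiB per H100; it ignores KV cache, activations, and framework overhead, which is also why a full 128K context doubles the listed counts.

```python
# Back-of-envelope check of the stated GPU requirements
# (weights only; real deployments need headroom for KV cache,
# activations, and framework overhead).

GiB = 1024**3

def weight_gib(params_b: float, bytes_per_param: int) -> float:
    """Approximate weight footprint in GiB for a model of
    `params_b` billion parameters at the given precision."""
    return params_b * 1e9 * bytes_per_param / GiB

# GLM-4.5 (355B) in BF16 (2 bytes/param): ~661 GiB of weights,
# versus 16x H100 = 1280 GiB of aggregate HBM.
glm45_bf16 = weight_gib(355, 2)
print(f"GLM-4.5 BF16 weights: {glm45_bf16:.0f} GiB vs 16x80 = {16*80} GiB")

# GLM-4.5-Air (106B) in FP8 (1 byte/param): ~99 GiB,
# hence 2x H100 (160 GiB) suffice.
air_fp8 = weight_gib(106, 1)
print(f"GLM-4.5-Air FP8 weights: {air_fp8:.0f} GiB vs 2x80 = {2*80} GiB")
```

The spare capacity in each configuration (roughly half the aggregate HBM) is what absorbs the KV cache at long context lengths.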

Highlighted Details

  • Achieves an average score of 63.2 across 12 industry benchmarks, ranking 3rd among all evaluated models.
  • GLM-4.5-Air scores a competitive 59.8 on the same benchmarks with markedly better efficiency.
  • Open-sourced base, hybrid reasoning, and FP8 versions.
  • Supports 128K context length.

Maintenance & Community

  • Community channels: WeChat, Discord.
  • API services available on Z.ai API Platform and Zhipu AI Open Platform.

Licensing & Compatibility

  • MIT License. Permissive for commercial use and secondary development.

Limitations & Caveats

  • Inference requires substantial high-end GPU resources (e.g., 8x H100 for FP8 GLM-4.5).
  • FP8 inference requires hardware natively supporting FP8.
  • Known flashinfer issues may require specific environment-variable workarounds.
Health Check

  • Last Commit: 3 days ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 1
  • Issues (30d): 21
  • Star History: 517 stars in the last 30 days

Explore Similar Projects

Starred by Georgi Gerganov (author of llama.cpp, whisper.cpp), Alex Yu (research scientist at OpenAI; former cofounder of Luma AI), and 13 more.

  • Qwen3 by QwenLM (0.4%, 25k stars) — large language model series by the Qwen team, Alibaba Cloud. Created 1 year ago; updated 2 weeks ago.