HelloGML by Hello-Application-XH

Cloudflare Worker API layer for Zhipu Qingyan (智谱清言) models

Created 2 months ago

327 stars

Top 83.1% on SourcePulse

Project Summary

This project provides a Cloudflare Worker that acts as an API gateway for Zhipu Qingyan's (chatglm.cn) private models. It translates Zhipu's proprietary API into the standard OpenAI, Claude, and Gemini protocols, so clients built for any of those three can use it without modification. The service targets developers and users who want to integrate GLM models into existing LLM applications. It supports streaming chat, AI image and video generation, and tool calling, with upstream access managed through a pool of refresh tokens.

How It Works

The core architecture utilizes a Cloudflare Worker to intercept client requests. It validates API keys against a Cloudflare KV store, then selects an available Zhipu refresh_token from a dynamically managed pool using a round-robin strategy. The Worker obtains an access_token, constructs a signed request to the chatglm.cn private API, and streams the response back. Crucially, it converts Zhipu's SSE stream into the requested protocol format (OpenAI, Claude, or Gemini), ensuring seamless integration with various clients. This approach decouples authentication (API keys) from resource access (refresh_tokens), allowing for centralized management and efficient utilization of Zhipu's model resources.
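The round-robin selection described above can be sketched as follows. This is a minimal illustration, not the project's code: the class name `RefreshTokenPool` and its methods are hypothetical, and the sketch keeps its cursor in memory, whereas a deployed Worker would presumably need shared storage (the summary does not say how the project handles this).

```typescript
// Illustrative sketch only: names are assumptions, not taken from the project.
class RefreshTokenPool {
  private readonly tokens: string[];
  private readonly disabled = new Set<string>();
  private cursor = 0;

  constructor(tokens: string[]) {
    this.tokens = [...tokens];
  }

  /** Return the next usable token in rotation, or undefined if none is usable. */
  next(): string | undefined {
    // Visit each slot at most once so a fully disabled pool terminates.
    for (let i = 0; i < this.tokens.length; i++) {
      const token = this.tokens[this.cursor];
      this.cursor = (this.cursor + 1) % this.tokens.length;
      if (!this.disabled.has(token)) return token;
    }
    return undefined;
  }

  /** Take a token out of rotation, e.g. after an upstream auth failure. */
  disable(token: string): void {
    this.disabled.add(token);
  }
}
```

Skipping disabled entries inside `next()` gives a simple form of failover: a token that stops working is passed over on later requests without changing the rotation order of the rest.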

Quick Start & Requirements

Primary Install/Run: Local development involves cd cf-worker, npm install, and npx wrangler dev --local. Deployment requires npx wrangler deploy after configuring Cloudflare KV and wrangler.toml.
Prerequisites: Node.js 18+, a Cloudflare account, and a Zhipu Qingyan account with a chatglm_refresh_token.
Setup: Local setup is quick via npm. Deployment requires Cloudflare setup and obtaining a refresh_token from chatglm.cn via browser developer tools.
Links: The project mentions compatibility with NextChat, LobeChat, and Dify.
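Once a Worker is running, an OpenAI-style client would talk to it roughly as sketched below. The helper `buildChatRequest` is hypothetical, and the sketch assumes the Worker accepts its API key as a `Bearer` token in the `Authorization` header, as OpenAI-compatible endpoints usually do. The base URL and model name are left as parameters because the summary does not specify them.

```typescript
// Hypothetical helper: builds an OpenAI-format streaming chat request.
interface ChatRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildChatRequest(
  baseUrl: string, // e.g. your Worker's custom domain (assumption)
  apiKey: string,  // an API key registered with the Worker
  model: string,   // a model name the Worker accepts (not specified here)
  prompt: string,
): ChatRequest {
  return {
    // Trim trailing slashes so the path is not doubled up.
    url: `${baseUrl.replace(/\/+$/, "")}/v1/chat/completions`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({
      model,
      stream: true,
      messages: [{ role: "user", content: prompt }],
    }),
  };
}
```

The result can be passed to `fetch` by splitting off the URL, for example `const { url, ...init } = buildChatRequest(...)` followed by `fetch(url, init)`.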

Highlighted Details

Multi-Protocol Support: Simultaneously compatible with OpenAI (/v1/chat/completions), Claude (/v1/messages), and Gemini (/v1beta/models/...) request formats.
Advanced AI Features: Supports AI image generation (text-to-image, image-to-image) and video generation (text-to-video, image-to-video) with style and atmosphere controls.
Function Calling: Full implementation of Function Calling compatible with OpenAI and Claude formats, enabling integration with clients like claude-code and open-code.
Dynamic Token Management: API keys authenticate users, while a shared pool of refresh_tokens manages access to Zhipu resources, allowing for efficient, centralized control and automatic failover.
Network Search & Reasoning: Models can trigger web searches, with results and thought processes (reasoning_content) returned in the stream.
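The multi-protocol support above implies two steps: working out which protocol a request uses from its path, and wrapping each piece of generated text in that protocol's streaming frame. The sketch below illustrates both under stated assumptions. The function names are hypothetical, the upstream Zhipu stream is assumed to have already been reduced to plain text deltas, and fields such as `id` and `created` in the OpenAI chunk are omitted for brevity.

```typescript
// Illustrative sketch only: not the project's implementation.
type Protocol = "openai" | "claude" | "gemini";

/** Map a request path to the protocol it belongs to. */
function detectProtocol(pathname: string): Protocol | undefined {
  if (pathname === "/v1/chat/completions") return "openai";
  if (pathname === "/v1/messages") return "claude";
  if (pathname.startsWith("/v1beta/models/")) return "gemini";
  return undefined;
}

/** Wrap one text delta in the SSE frame the given protocol expects. */
function formatTextDelta(protocol: Protocol, text: string, model: string): string {
  switch (protocol) {
    case "openai":
      return "data: " + JSON.stringify({
        object: "chat.completion.chunk",
        model,
        choices: [{ index: 0, delta: { content: text }, finish_reason: null }],
      }) + "\n\n";
    case "claude":
      // Claude streams use named events alongside the data line.
      return "event: content_block_delta\ndata: " + JSON.stringify({
        type: "content_block_delta",
        index: 0,
        delta: { type: "text_delta", text },
      }) + "\n\n";
    case "gemini":
      return "data: " + JSON.stringify({
        candidates: [{ content: { role: "model", parts: [{ text }] } }],
      }) + "\n\n";
  }
}
```

A complete stream also needs each protocol's start and end frames (for example the `data: [DONE]` terminator in the OpenAI format), which this sketch leaves out.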

Maintenance & Community

The project is currently in a state of flux, with an unstable "auto" branch offering experimental features like automatic token acquisition. The next planned update is May 19th, after which bug fixes and updates will resume. No specific community links (Discord, Slack) are provided, beyond a mention of the Linux.do community.

Licensing & Compatibility

The repository README does not specify a software license. This absence makes it impossible to definitively assess compatibility for commercial use or closed-source linking without further clarification.

Limitations & Caveats

Tool call reliability depends on the model's adherence to prompt instructions; complex parameters or vague tool descriptions can cause calls to fail. The auto branch is explicitly marked as unstable. The .workers.dev domain may be inaccessible from mainland China, so binding a custom domain is recommended. No license is specified, which hinders any assessment for commercial adoption.
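Because tool calls can come back malformed, a client may want to check the arguments before acting on them. The sketch below is one possible defensive check, not something the project provides. The helper name `parseToolArguments` is hypothetical, and it assumes the OpenAI convention in which a tool call's arguments arrive as a JSON-encoded string.

```typescript
// Hypothetical client-side guard for tool-call arguments.
type ParsedArgs =
  | { ok: true; args: Record<string, unknown> }
  | { ok: false; reason: string };

function parseToolArguments(raw: string, required: string[]): ParsedArgs {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return { ok: false, reason: "arguments are not valid JSON" };
  }
  // Tool arguments are expected to be a JSON object, not an array or scalar.
  if (typeof value !== "object" || value === null || Array.isArray(value)) {
    return { ok: false, reason: "arguments must be a JSON object" };
  }
  const args = value as Record<string, unknown>;
  const missing = required.filter((key) => !(key in args));
  if (missing.length > 0) {
    return { ok: false, reason: `missing required keys: ${missing.join(", ")}` };
  }
  return { ok: true, args };
}
```

On a failed check, a client could retry the request or return the reason to the model as a tool error, rather than executing a tool with incomplete input.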

Health Check

Last Commit

2 months ago

Responsiveness

Inactive

Pull Requests (30d)

Issues (30d)

Star History

13 stars in the last 30 days