CodeGeeX4 by zai-org

Code generation model for versatile AI software development

created 1 year ago
2,062 stars

Top 22.0% on sourcepulse

Project Summary

CodeGeeX4-ALL-9B is an open-source, multilingual code generation model intended to cover a broad range of software development tasks in a single model. It targets developers who want a capable yet efficient coding assistant, with features ranging from code completion and a code interpreter to repository-level Q&A and function calling.

How It Works

The model is built on GLM-4-9B and further trained on a large-scale, multilingual code dataset. It supports a 128K-token sequence length, which lets it take in large amounts of code at once for tasks such as repository-wide Q&A and "needle in a haystack" retrieval. The project also reports that it is the only code model with function calling support, and that its function calls execute successfully more often than GPT-4's.
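The long-context retrieval test mentioned above can be illustrated with a minimal sketch. It assumes a simplified setup: filler functions form the "haystack", one distinctive function (the "needle") is placed at a chosen relative depth, and a model's answer is checked for the expected string. The helper names (`make_filler`, `build_haystack`, `contains_answer`), the filler code, and the needle value are hypothetical and are not taken from the CodeGeeX4 evaluation harness.

```python
def make_filler(i: int) -> str:
    """Return a small, unremarkable function used as haystack padding."""
    return f"def helper_{i}(x):\n    return x + {i}\n"


def build_haystack(needle: str, n_fillers: int, depth: float) -> str:
    """Place `needle` among `n_fillers` filler functions.

    depth=0.0 puts the needle first, depth=1.0 puts it last.
    """
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0.0 and 1.0")
    fillers = [make_filler(i) for i in range(n_fillers)]
    position = round(depth * n_fillers)
    return "\n".join(fillers[:position] + [needle] + fillers[position:])


def contains_answer(model_output: str, expected: str) -> bool:
    """Score one trial: did the model reproduce the hidden value?"""
    return expected in model_output


# Hypothetical needle and question for a single trial.
NEEDLE = "def secret_token():\n    return 'kiwi-4271'\n"
QUESTION = "What string does secret_token() return?"

context = build_haystack(NEEDLE, n_fillers=200, depth=0.5)
prompt = f"{context}\n\n{QUESTION}"
```

A full evaluation would sweep both the context length (`n_fillers`) and `depth`, send each `prompt` to the model, and report the fraction of trials for which `contains_answer` is true.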

Quick Start & Requirements

  • Ollama: ollama run codegeex4 (requires Ollama 0.2+)
  • Hugging Face Transformers: transformers>=4.39.0,<4.41.0
  • vLLM: vllm==0.5.1
  • Hardware: GPU recommended for optimal performance. CUDA 12 is supported.
  • Resources: Requires significant VRAM for the 9B model, especially with the 128K context.
  • Docs: Homepage, VS Code Extension, JetBrains Extension, HF Demo
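
Once `ollama run codegeex4` has pulled the model, it can also be queried from code. The sketch below assumes Ollama's local REST API is listening on its default address (`http://localhost:11434`) and uses the `/api/generate` endpoint with streaming disabled. The helper names and the timeout are illustrative choices, and only the Python standard library is used.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # assumed default local address


def build_request(prompt: str, model: str = "codegeex4") -> urllib.request.Request:
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = "codegeex4", timeout: float = 120.0) -> str:
    """Send the prompt and return the generated text (needs a running server)."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    return body["response"]
```

With the server running, a call such as `generate("Write a Python function that reverses a linked list.")` returns the completion as a string. Nothing is sent over the network until `generate` is called.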

Highlighted Details

  • Achieves state-of-the-art performance for models under 10B parameters on benchmarks like BigCodeBench and NaturalCodeBench.
  • Supports a 128K context window, demonstrating 100% retrieval accuracy in "Code Needle In A Haystack" evaluations.
  • Unique function calling capability with higher execution success rates than GPT-4.
  • Offers extensions for VS Code and JetBrains, and supports local deployment via Ollama and vLLM.
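
To make "execution success rate" concrete, here is a minimal sketch of how such a metric could be computed. It assumes the model emits each call as a JSON object with `name` and `arguments` fields. That wire format, the tool registry, and the sample outputs are all hypothetical. This is not the project's evaluation code, nor CodeGeeX4's actual function-call format.

```python
import json
from typing import Any, Callable

# Hypothetical tools the model is allowed to call.
TOOLS: dict[str, Callable[..., Any]] = {
    "add": lambda a, b: a + b,
    "word_count": lambda text: len(text.split()),
}


def execute_call(raw: str) -> tuple[bool, Any]:
    """Parse one model-emitted call and run it; return (succeeded, result)."""
    try:
        call = json.loads(raw)
        func = TOOLS[call["name"]]
        return True, func(**call["arguments"])
    except (json.JSONDecodeError, KeyError, TypeError) as exc:
        # Malformed JSON, unknown tool, or wrong arguments count as failures.
        return False, exc


def execution_success_rate(raw_calls: list[str]) -> float:
    """Fraction of calls that parse and execute without error."""
    if not raw_calls:
        return 0.0
    return sum(execute_call(raw)[0] for raw in raw_calls) / len(raw_calls)


# Hypothetical model outputs: two valid calls and two that fail.
sample_outputs = [
    '{"name": "add", "arguments": {"a": 2, "b": 3}}',
    '{"name": "word_count", "arguments": {"text": "hello big world"}}',
    '{"name": "add", "arguments": {"a": 2}}',   # missing argument
    '{"name": "multiply", "arguments": {}}',    # unknown tool
]
rate = execution_success_rate(sample_outputs)  # 2 of 4 succeed
```

A metric of this kind only checks that a call runs. Whether the call was the right one for the user's request is a separate question that needs reference answers.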

Maintenance & Community

The project is associated with THUDM (Tsinghua University). Community interaction channels are not explicitly listed in the README.

Licensing & Compatibility

  • Code: Apache-2.0 license.
  • Model Weights: Custom "Model License". Academic research is permitted. Commercial use requires registration via a provided form.

Limitations & Caveats

Commercial use of model weights is restricted and requires explicit registration. The README does not detail specific limitations regarding unsupported platforms or known bugs.

Health Check

  • Last commit: 11 months ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 5

Star History

  • 152 stars in the last 90 days
