CodeGeeX4 by zai-org

Code generation model for versatile AI software development

Created 1 year ago
2,180 stars

Top 20.7% on SourcePulse

Project Summary

CodeGeeX4-ALL-9B is an open-source, multilingual code generation model for AI-assisted software development. It is aimed at developers who want a general-purpose coding assistant, and its capabilities range from code completion and interpretation to repository-level Q&A and function calling.

How It Works

This model is a fine-tuned version of GLM-4-9B, trained on a large-scale, multilingual code dataset. It supports a 128K sequence length, which lets it process large code contexts for tasks such as repository-wide Q&A and "needle in a haystack" retrieval. It also supports function calling, where the project reports a higher execution success rate than GPT-4.
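To make the "needle in a haystack" idea concrete, here is a minimal sketch of how such a retrieval prompt could be assembled: one distinctive function (the needle) is placed at a chosen relative depth inside a long run of filler code, followed by a question about it. The helper names, the filler generator, and the substring-based scoring are all hypothetical illustrations, not the project's actual evaluation harness.

```python
import random


def make_filler(count, seed=0):
    """Generate `count` small, unremarkable functions to act as the haystack."""
    rng = random.Random(seed)
    return [
        f"def helper_{i}(x):\n    return x + {rng.randint(1, 99)}"
        for i in range(count)
    ]


def build_haystack_prompt(filler_snippets, needle, depth, question):
    """Insert `needle` at relative `depth` (0.0 = start, 1.0 = end) and append the question."""
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0.0 and 1.0")
    position = round(depth * len(filler_snippets))
    parts = filler_snippets[:position] + [needle] + filler_snippets[position:]
    return "\n\n".join(parts) + "\n\n" + question


def is_retrieved(model_answer, expected):
    """Hypothetical scoring rule: the answer counts if it contains the expected string."""
    return expected in model_answer


filler = make_filler(200)
needle = 'def secret_token():\n    return "pineapple-42"'
prompt = build_haystack_prompt(
    filler, needle, depth=0.5,
    question="What string does secret_token() return?",
)
```

Sweeping `depth` and the number of filler snippets, then averaging `is_retrieved` over the model's answers, gives a retrieval-accuracy grid of the kind such evaluations typically report.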

Quick Start & Requirements

  • Ollama: ollama run codegeex4 (requires Ollama 0.2+)
  • Hugging Face Transformers: transformers>=4.39.0,<4.41.0
  • vLLM: vllm==0.5.1
  • Hardware: GPU recommended for optimal performance. CUDA 12 is supported.
  • Resources: The 9B model needs substantial VRAM (roughly 18 GB for the weights alone at 16-bit precision, i.e. 9B parameters × 2 bytes), and more when using the full 128K context.
  • Docs: Homepage, VS Code Extension, JetBrains Extension, HF Demo
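Once the model has been pulled with `ollama run codegeex4`, it can also be queried programmatically. The sketch below uses only the Python standard library and Ollama's `/api/generate` REST endpoint; it assumes a default local Ollama install listening on port 11434, and the function names `build_generate_request` and `generate` are illustrative, not part of any CodeGeeX4 tooling.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(prompt, model="codegeex4"):
    """Build a non-streaming generate request for the local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, model="codegeex4", timeout=120):
    """Send the prompt and return the generated text from the JSON reply."""
    request = build_generate_request(prompt, model)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["response"]


# Example call (requires a running Ollama server with the model pulled):
# print(generate("Write a quicksort function in Python."))
```

Keeping request construction separate from the network call makes the payload easy to inspect or unit-test without a running server.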

Highlighted Details

  • Achieves state-of-the-art performance for models under 10B parameters on benchmarks like BigCodeBench and NaturalCodeBench.
  • Supports a 128K context window, demonstrating 100% retrieval accuracy in "Code Needle In A Haystack" evaluations.
  • Supports function calling, with a reported execution success rate higher than GPT-4's.
  • Offers extensions for VS Code and JetBrains, and supports local deployment via Ollama and vLLM.
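As an illustration of what function calling and an "execution success rate" involve on the application side, the sketch below registers Python functions as tools, dispatches a model-produced call to the matching function, and counts how many calls execute without error. It assumes the model emits a JSON object of the form `{"name": ..., "arguments": {...}}`; that wire format, the `tool` decorator, and the success-rate definition are hypothetical and may differ from what CodeGeeX4 or its benchmark actually use.

```python
import json

TOOLS = {}  # registry mapping tool names to Python callables


def tool(fn):
    """Register a function so the model can call it by name."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
def add(a: int, b: int) -> int:
    return a + b


def dispatch(model_output: str):
    """Parse an assumed JSON tool call and run the matching registered function."""
    call = json.loads(model_output)
    name = call["name"]
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**call.get("arguments", {}))


def execution_success_rate(model_outputs):
    """Fraction of outputs that parse and execute without raising (hypothetical metric)."""
    successes = 0
    for output in model_outputs:
        try:
            dispatch(output)
            successes += 1
        except (json.JSONDecodeError, KeyError, TypeError):
            pass
    return successes / len(model_outputs) if model_outputs else 0.0


result = dispatch('{"name": "add", "arguments": {"a": 2, "b": 3}}')
```

A malformed payload, an unknown tool name, or mismatched arguments each count as a failed execution under this simplified definition.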

Maintenance & Community

The project is associated with THUDM (Tsinghua University). Community interaction channels are not explicitly listed in the README.

Licensing & Compatibility

  • Code: Apache-2.0 license.
  • Model Weights: Custom "Model License". Academic research is permitted. Commercial use requires registration via a provided form.

Limitations & Caveats

Commercial use of model weights is restricted and requires explicit registration. The README does not detail specific limitations regarding unsupported platforms or known bugs.

Health Check

  • Last Commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 1
  • Issues (30d): 5
  • Star History: 71 stars in the last 30 days
