CodeGeeX by zai-org

Code generation model for multilingual programming

created 2 years ago
8,580 stars

Top 6.0% on sourcepulse

View on GitHub
1 Expert Loves This Project
Project Summary

CodeGeeX is a 13-billion parameter, open-source, multilingual code generation model designed for tasks like code completion, translation, and summarization. It targets developers and researchers seeking to leverage large language models for programming assistance and evaluation across multiple languages.

How It Works

CodeGeeX is a transformer-based, decoder-only model trained on a corpus of over 158.7 billion tokens spanning 23 programming languages. It uses a 50,400-token vocabulary in which whitespace is encoded as separate tokens. The architecture comprises 40 transformer layers with a hidden size of 5,120 and an expanded feed-forward layer size of 20,480, supporting a maximum sequence length of 2,048 tokens.
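The stated configuration can be sanity-checked with a back-of-envelope parameter count. The sketch below is illustrative arithmetic for a generic decoder-only transformer (ignoring biases, layer norms, and positional embeddings), not CodeGeeX's exact accounting:

```python
def transformer_param_estimate(layers: int, hidden: int, ffn: int, vocab: int) -> int:
    """Rough parameter count for a decoder-only transformer."""
    attention = 4 * hidden * hidden   # Q, K, V, and output projection matrices
    feed_forward = 2 * hidden * ffn   # up- and down-projection matrices
    embeddings = vocab * hidden       # token embedding table
    return layers * (attention + feed_forward) + embeddings

total = transformer_param_estimate(layers=40, hidden=5120, ffn=20480, vocab=50400)
print(f"~{total / 1e9:.1f}B parameters")  # ~12.8B, consistent with the ~13B model size
```

The dominant term is the per-layer feed-forward block (2 × 5,120 × 20,480 ≈ 210M parameters per layer), which is why the 40 layers account for nearly all of the ~13B total.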

Quick Start & Requirements

  • Installation: pip install -e . or use the provided Docker image (docker pull codegeex/codegeex:latest).
  • Prerequisites: Python 3.7+, CUDA 11+, PyTorch 1.10+, DeepSpeed 0.6+.
  • Model Weights: Requires application and download (~26GB).
  • Inference: Supports single-GPU inference (27GB+ GPU memory), quantized inference (15GB+ GPU memory), and multi-GPU model-parallel inference (<6GB per GPU).
  • Resources: Official VS Code and JetBrains extensions are available.
  • Links: Homepage, DEMO, Model Weights, Paper, HumanEval-X.
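The memory figures above follow from simple weight-size arithmetic: 13B parameters at FP16 (2 bytes each) occupy ~26GB, INT8 quantization halves that, and model parallelism splits the weights across GPUs. The sketch below is a rough estimate that ignores activation and framework overhead, so real usage runs somewhat higher (hence the 27GB+/15GB+ figures):

```python
PARAMS = 13e9  # approximate parameter count of CodeGeeX

def weight_memory_gb(bytes_per_param: float, num_gpus: int = 1) -> float:
    """Approximate per-GPU memory footprint of the model weights alone."""
    return PARAMS * bytes_per_param / num_gpus / 1e9

print(f"FP16, single GPU:  {weight_memory_gb(2):.1f} GB")        # ~26 GB
print(f"INT8, single GPU:  {weight_memory_gb(1):.1f} GB")        # ~13 GB
print(f"INT8, 4-way split: {weight_memory_gb(1, 4):.2f} GB/GPU") # ~3.3 GB
```

Combining INT8 quantization with model parallelism is what brings the per-GPU weight footprint under the <6GB figure quoted above.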

Highlighted Details

  • Achieves state-of-the-art average performance on the HumanEval-X benchmark for multilingual code generation.
  • Supports cross-lingual code translation between 5 languages (Python, C++, Java, JavaScript, Go).
  • Offers IDE extensions for VS Code and JetBrains for integrated coding assistance.
  • Quantized and model-parallel inference options reduce GPU memory requirements.

Maintenance & Community

  • Active development with releases like CodeGeeX2 and CodeGeeX4.
  • Community engagement via Discord, Slack, and Telegram.
  • Supported by Tsinghua University (KEG, IIIS), Peng Cheng Laboratory, and Zhipu.AI.

Licensing & Compatibility

  • Code is licensed under Apache-2.0.
  • Model weights have a separate, unspecified license. Commercial use may require clarification.

Limitations & Caveats

The license for the model weights is not explicitly stated, which may affect commercial use. While competitive overall, translation performance varies across language pairs.

Health Check
Last commit

11 months ago

Responsiveness

Inactive

Pull Requests (30d)
0
Issues (30d)
0
Star History
154 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen (author of AI Engineering and Designing Machine Learning Systems), Travis Fischer (founder of Agentic), and 6 more.

codellama by meta-llama

Top 0.1%
16k
Inference code for CodeLlama models
created 1 year ago
updated 11 months ago