CodeGen2 by salesforce

Program synthesis research release (ICLR 2023)

created 2 years ago
272 stars

Top 95.5% on sourcepulse

View on GitHub
Project Summary

CodeGen2 provides official research releases of large language models (LLMs) for program synthesis, specifically addressing the challenges of training LLMs on both programming and natural languages. It targets researchers and developers working on code generation, autocompletion, and other program synthesis tasks, offering models ranging from 1 billion to 16 billion parameters.

How It Works

CodeGen2 uses auto-regressive sampling for program synthesis. The models are trained on a mixed corpus of natural language and programming languages, so they can continue a natural-language description or code prefix (causal sampling) and fill in the middle of existing code (infill sampling). This dual-language training is a key differentiator, allowing for more versatile and context-aware code generation.
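
As a concrete illustration, here is a minimal causal-sampling sketch via Hugging Face Transformers. The Hub model ID Salesforce/codegen2-1B is an assumption taken from the Hub listing; verify it on the model card.

```python
# Causal sampling: the model continues a code (or natural-language) prompt token by token.
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Salesforce/codegen2-1B")  # assumed Hub ID
model = AutoModelForCausalLM.from_pretrained(
    "Salesforce/codegen2-1B",
    trust_remote_code=True,  # CodeGen2 checkpoints ship custom model code on the Hub
)

inputs = tokenizer("def hello_world():", return_tensors="pt")
sample = model.generate(**inputs, max_length=128)
print(tokenizer.decode(sample[0], skip_special_tokens=True))
```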

Quick Start & Requirements

  • Install/Run: Use the Hugging Face Transformers library (see the sketch after this list).
  • Prerequisites: PyTorch, Hugging Face Transformers.
  • Resources: Larger models require significant GPU memory; the 16B-parameter checkpoint needs roughly 32 GB for the weights alone in float16.
  • Docs: Hugging Face Hub for model cards and usage examples.
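
A minimal quick-start sketch under these prerequisites. The Hub model ID and the float16 load are illustrative assumptions, not pinned by the README; confirm both on the model card.

```python
# pip install torch transformers
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Assumed Hub ID; the larger variants follow the same loading pattern.
model_id = "Salesforce/codegen2-1B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # halves weight memory vs. float32
    trust_remote_code=True,     # CodeGen2 checkpoints ship custom model code
).to("cuda" if torch.cuda.is_available() else "cpu")
```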

Highlighted Details

  • Offers four model sizes: 1B, 3.7B, 7B, and 16B parameters.
  • Supports both causal and infill sampling for program synthesis (see the infill sketch after this list).
  • Models are available on Hugging Face Hub for easy integration.
  • Research presented at ICLR 2023.
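
For infill sampling, the sketch below assumes the <mask_1>/<sep> sentinel format described on the CodeGen2 model cards; verify the exact prompt layout there before use.

```python
# Infill sampling: the model generates the span marked by <mask_1>.
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Salesforce/codegen2-1B")  # assumed Hub ID
model = AutoModelForCausalLM.from_pretrained(
    "Salesforce/codegen2-1B", trust_remote_code=True
)

def infill_prompt(prefix: str, suffix: str) -> str:
    # Assumed sentinel layout: prefix <mask_1> suffix <|endoftext|> <sep> <mask_1>
    return prefix + "<mask_1>" + suffix + "<|endoftext|>" + "<sep>" + "<mask_1>"

prefix = "def hello_world():\n    "
suffix = "    return name\n"
text = infill_prompt(prefix, suffix)
inputs = tokenizer(text, return_tensors="pt")
sample = model.generate(**inputs, max_length=128)
# Keep only the newly generated infill that follows the prompt text.
print(tokenizer.decode(sample[0], skip_special_tokens=False)[len(text):])
```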

Maintenance & Community

  • Developed by Salesforce Research.
  • No explicit community links (Discord, Slack) or roadmap provided in the README.

Licensing & Compatibility

  • The README does not explicitly state a license. Models released by Salesforce Research on Hugging Face are typically under a permissive license (e.g., Apache 2.0 or MIT), but this should be verified on the specific model card before use.

Limitations & Caveats

The README does not detail specific limitations, performance benchmarks, or known issues. The significant hardware requirements for larger models may be a barrier to entry for some users.

Health Check

  • Last commit: 2 years ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 0 stars in the last 90 days

Explore Similar Projects

  • DeepSeek-Coder-V2 by deepseek-ai: open-source code language model comparable to GPT-4 Turbo. Top 0.4% on sourcepulse; 6k stars; created 1 year ago; updated 10 months ago. Starred by Chip Huyen (author of AI Engineering and Designing Machine Learning Systems) and Jiayi Pan (author of SWE-Gym; AI researcher at UC Berkeley).