CodeGen2 by salesforce

Program synthesis research release (ICLR 2023)

Created 2 years ago
271 stars

Top 95.0% on SourcePulse

View on GitHub
Project Summary

CodeGen2 provides official research releases of large language models (LLMs) for program synthesis, specifically addressing the challenges of training LLMs on both programming and natural languages. It targets researchers and developers working on code generation, autocompletion, and other program synthesis tasks, offering models ranging from 1 billion to 16 billion parameters.

How It Works

CodeGen2 generates programs by auto-regressive sampling: the model emits code one token at a time, conditioned on everything before it. The models are trained on a dataset spanning both natural language and programming languages, so they can generate code from natural-language descriptions or complete partial code snippets; they are also trained to infill missing spans inside existing code. This mixed natural/programming-language training is a key differentiator, allowing more versatile and context-aware code generation. A minimal causal-sampling sketch follows.
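The sketch below shows causal (left-to-right) sampling via Hugging Face Transformers. The model ID "Salesforce/codegen2-1B" is an assumption (checkpoint names on the Hub may carry a suffix), so verify it on the model card before running:

```python
# Minimal causal-sampling sketch with Hugging Face Transformers.
# The model ID below is an assumption -- check the exact checkpoint
# name on the Hugging Face Hub model card before relying on it.
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "Salesforce/codegen2-1B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

# The model continues the prompt left-to-right, one token at a time.
prompt = "def fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```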

Quick Start & Requirements

  • Install/Run: Load the models with the Hugging Face Transformers library (see the loading sketch after this list).
  • Prerequisites: PyTorch, Hugging Face Transformers.
  • Resources: Significant GPU memory is required for the larger models; the 16B-parameter model needs roughly 32 GB for its weights alone in 16-bit precision.
  • Docs: Hugging Face Hub for model cards and usage examples.
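A hedged quick-start sketch for memory-conscious loading; the model ID and half-precision settings are assumptions to be adapted per the model card:

```python
# Quick-start sketch. Install dependencies first:
#   pip install torch transformers
# Loading in float16 halves weight memory vs. float32; the 16B model
# still needs a large GPU (or multi-GPU sharding, e.g., via accelerate).
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "Salesforce/codegen2-1B"  # assumed ID; verify on the Hub
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,   # halve the memory footprint of the weights
    trust_remote_code=True,      # CodeGen2 ships custom modeling code
)
model.to("cuda" if torch.cuda.is_available() else "cpu")
```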

Highlighted Details

  • Offers four model sizes: 1B, 3.7B, 7B, and 16B parameters.
  • Supports both causal (left-to-right) and infill sampling for program synthesis (see the infill sketch after this list).
  • Models are available on Hugging Face Hub for easy integration.
  • Research presented at ICLR 2023.
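Infill sampling marks a missing span with sentinel tokens and asks the model to generate its contents. The format below follows the CodeGen2 model card; the exact sentinel strings ("<mask_1>", "<sep>", "<eom>") should be confirmed there, and the model ID is again an assumption:

```python
# Infill-sampling sketch: the model fills the span marked <mask_1>.
# Sentinel format follows the CodeGen2 model card; confirm the exact
# special tokens on the Hub -- this is a sketch, not a guaranteed API.
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "Salesforce/codegen2-1B"  # assumed ID; verify on the Hub
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

def infill_prompt(prefix: str, suffix: str) -> str:
    # Layout: prefix <mask_1> suffix <|endoftext|> <sep> <mask_1>
    # The model then emits the masked span, terminated by <eom>.
    return prefix + "<mask_1>" + suffix + "<|endoftext|>" + "<sep>" + "<mask_1>"

prompt = infill_prompt("def count_words(text):\n    ", "\n    return n")
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
completion = tokenizer.decode(outputs[0], skip_special_tokens=False)
# Keep only the generated infill, which ends at the <eom> sentinel.
print(completion.split("<mask_1>")[-1].split("<eom>")[0])
```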

Maintenance & Community

  • Developed by Salesforce Research.
  • No explicit community links (Discord, Slack) or roadmap provided in the README.

Licensing & Compatibility

  • The README does not explicitly state a license. Models released by Salesforce Research on Hugging Face are typically under a permissive license (e.g., Apache 2.0 or MIT), but this should be verified on the repository and the specific model card.

Limitations & Caveats

The README does not detail specific limitations, performance benchmarks, or known issues. The significant hardware requirements for larger models may be a barrier to entry for some users.

Health Check

  • Last Commit: 2 years ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 0 stars in the last 30 days

Starred by Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI), Stas Bekman (Author of "Machine Learning Engineering Open Book"; Research Engineer at Snowflake), and 3 more.

Explore Similar Projects

prompt-lookup-decoding by apoorvumang

566 stars · Top 0.2% on SourcePulse
Decoding method for faster LLM generation
Created 1 year ago · Updated 1 year ago
Starred by Shizhe Diao (Author of LMFlow; Research Scientist at NVIDIA), Tri Dao (Chief Scientist at Together AI), and 1 more.

hnet by goombalab

722 stars · Top 1.5% on SourcePulse
Hierarchical sequence modeling with dynamic chunking
Created 2 months ago · Updated 1 month ago
Starred by Didier Lopes (Founder of OpenBB), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 3 more.

DeepSeek-Coder-V2 by deepseek-ai

6k stars · Top 0.3% on SourcePulse
Open-source code language model comparable to GPT4-Turbo
Created 1 year ago · Updated 11 months ago