CodeT5 by salesforce

Code LLMs for code understanding and generation research

created 4 years ago
3,034 stars

Top 16.1% on sourcepulse

Project Summary

CodeT5 and CodeT5+ are open-source large language models from Salesforce Research designed for code understanding and generation. They aim to boost developer productivity by providing capabilities like text-to-code generation, code autocompletion, and code summarization, functioning as an AI-powered coding assistant.

How It Works

CodeT5 utilizes a unified encoder-decoder architecture, pre-trained on a massive corpus of code and natural language. Its identifier-aware approach enhances understanding of code structure and semantics. CodeT5+ builds upon this foundation with further architectural improvements and training strategies, as detailed in its associated research papers.
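To make the identifier-aware idea concrete, here is a toy sketch of masking identifiers in a code snippet, with every occurrence of the same identifier sharing one sentinel token. It is an illustration only, not the project's preprocessing: the function name mask_identifiers, the regex-based identifier detection, and the <extra_id_N> sentinel format (borrowed from the T5 tokenizer convention) are all assumptions made for this example. The regex would also match words inside strings and comments, which a real pipeline would need to handle with a proper parser.

```python
import keyword
import re
from typing import Dict, Tuple

# Rough pattern for Python-style identifiers (assumption: good enough for a toy demo).
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def mask_identifiers(source: str) -> Tuple[str, Dict[str, str]]:
    """Replace each distinct identifier with its own sentinel token.

    Returns the masked source and the identifier -> sentinel mapping.
    Hypothetical helper for illustration; not part of CodeT5.
    """
    sentinels: Dict[str, str] = {}

    def replace(match):
        name = match.group(0)
        # Language keywords are kept as-is; only identifiers are masked.
        if keyword.iskeyword(name):
            return name
        # Reuse the same sentinel for repeated occurrences of one identifier.
        if name not in sentinels:
            sentinels[name] = "<extra_id_{}>".format(len(sentinels))
        return sentinels[name]

    return IDENTIFIER.sub(replace, source), sentinels


masked, mapping = mask_identifiers("def add(a, b):\n    return a + b")
print(masked)
print(mapping)
```

A model trained on such inputs would be asked to recover the original names from the mapping, which pushes it to track how each identifier is used across the snippet.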

Quick Start & Requirements

Models are available on the Hugging Face Hub. Installation typically involves the transformers library. Specific requirements depend on the model size and task, and often include Python and PyTorch.
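As a minimal sketch of loading a model through transformers, the function below summarizes a code snippet. It assumes transformers and PyTorch are installed, and it assumes the checkpoint name Salesforce/codet5-base-multi-sum (a multilingual summarization checkpoint published on the Hugging Face Hub); other checkpoints may need a different tokenizer or model class. The helper name summarize_code and the max_length default are choices made for this example.

```python
# Assumed checkpoint name; swap in whichever CodeT5 checkpoint fits the task.
DEFAULT_CHECKPOINT = "Salesforce/codet5-base-multi-sum"


def summarize_code(code, checkpoint=DEFAULT_CHECKPOINT, max_length=32):
    """Generate a short natural-language summary for a code snippet.

    Hypothetical helper for illustration. The first call downloads the
    checkpoint, so it needs network access and some disk space.
    """
    # Imported lazily so that defining the helper does not require
    # the heavy dependencies to be installed.
    from transformers import RobertaTokenizer, T5ForConditionalGeneration

    tokenizer = RobertaTokenizer.from_pretrained(checkpoint)
    model = T5ForConditionalGeneration.from_pretrained(checkpoint)

    # Encode the snippet, generate a summary, and decode it back to text.
    input_ids = tokenizer(code, return_tensors="pt").input_ids
    generated_ids = model.generate(input_ids, max_length=max_length)
    return tokenizer.decode(generated_ids[0], skip_special_tokens=True)
```

Calling summarize_code("def add(a, b):\n    return a + b") should return a one-line description of the function; the exact wording depends on the checkpoint.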

Highlighted Details

  • Supports text-to-code generation, code autocompletion, and code summarization.
  • CodeT5+ models released in May 2023.
  • CodeRL paper and associated checkpoints released in July 2022.
  • Models are available for various tasks, including multilingual code summarization.

Maintenance & Community

The project is actively maintained by Salesforce Research. Users can get involved by creating GitHub issues or submitting pull requests. Users who build applications with the models are encouraged to share them with the maintainers by email.

Licensing & Compatibility

Released under the BSD-3 License. However, the models must not be used to promote or profit from harmful activities. Commercial use is permitted, but users are encouraged to document high-stakes applications.

Limitations & Caveats

While powerful, the models are research releases. Specific performance figures and limitations are detailed in the associated academic papers. Users are encouraged to report their applications to the maintainers and to provide appropriate documentation when using the models in high-stakes scenarios.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 69 stars in the last 90 days
