DeepSeek-Coder by deepseek-ai

Code LLM for code completion and generation

Created 2 years ago

22,617 stars

Top 1.9% on SourcePulse

Project Summary

DeepSeek Coder is a suite of open-source code language models trained from scratch on 2 trillion tokens, of which 87% is code and 13% is natural language. The models range from 1.3B to 33B parameters and are designed for project-level code completion and infilling with a 16K context window. They report state-of-the-art results among open-source code models on several coding benchmarks, which makes them a fit for developers and researchers who need strong code generation.

How It Works

The models are pre-trained on a massive dataset of code and natural language, with a focus on project-level context. This is achieved through a 16K context window and a fill-in-the-blank task, enabling the models to understand and generate code across entire projects. Instruction-tuned variants are also available for conversational coding assistance.
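
To make the fill-in-the-blank task concrete, here is a minimal sketch of how an infilling prompt might be assembled for a base model. The three sentinel strings are an assumption recalled from the upstream README, and `build_fim_prompt` is a hypothetical helper, so check both against the tokenizer's special tokens before relying on them.

```python
# Sentinel tokens for infilling.
# Assumption: recalled from the upstream README; verify against the tokenizer.
FIM_BEGIN = "<｜fim▁begin｜>"
FIM_HOLE = "<｜fim▁hole｜>"
FIM_END = "<｜fim▁end｜>"


def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Wrap the code before and after the gap in the infilling sentinels.

    Hypothetical helper: the model is expected to generate the text that
    belongs at the position of FIM_HOLE.
    """
    return f"{FIM_BEGIN}{prefix}{FIM_HOLE}{suffix}{FIM_END}"


# The body of `add` is the gap the model would be asked to fill.
prefix = "def add(a, b):\n    "
suffix = "\n\nprint(add(1, 2))\n"
prompt = build_fim_prompt(prefix, suffix)
```

The resulting string would be tokenized and passed to the model like any other prompt. The generated continuation is the candidate text for the gap.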

Quick Start & Requirements

Install dependencies: pip install -r requirements.txt
Requires PyTorch and Hugging Face Transformers.
Demo available on Hugging Face Spaces.
Official models can be downloaded from Hugging Face Hub.
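
Once the dependencies are installed, plain code completion with a base model could look roughly like the sketch below. The model id, the dtype, and the `complete` helper are assumptions for illustration, not the project's prescribed setup. Confirm the exact model name on Hugging Face Hub.

```python
# Assumed Hub id for one of the base models; confirm it on Hugging Face Hub.
DEFAULT_MODEL_ID = "deepseek-ai/deepseek-coder-6.7b-base"


def complete(prompt: str, model_id: str = DEFAULT_MODEL_ID, max_new_tokens: int = 128) -> str:
    """Load the model and return the prompt followed by its greedy continuation.

    Hypothetical helper. The model is reloaded on every call, so cache the
    tokenizer and model in real use.
    """
    # Imported lazily so the heavy dependencies are only needed when called.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.bfloat16, trust_remote_code=True
    )
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model.to(device)

    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    with torch.no_grad():
        outputs = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)


# Example call (downloads the model weights on first use):
#   print(complete("# write a quick sort algorithm\n"))
```

Smaller variants follow the same pattern with a different `model_id`, which may be more practical on machines without a large GPU.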

Highlighted Details

Outperforms existing open-source code LLMs on HumanEval, MBPP, and DS-1000 benchmarks.
7B model achieves performance comparable to CodeLlama-34B.
33B instruct model rivals GPT-3.5-turbo on HumanEval.
Supports 87 programming languages.

Maintenance & Community

Active development and community support.
Links to Discord and WeChat for community interaction.
Resources available via awesome-deepseek-coder.

Licensing & Compatibility

Code repository licensed under MIT.
Model usage subject to a separate Model License.
Supports commercial use.

Limitations & Caveats

Quantizing to GGUF (llama.cpp) or GPTQ (exllamav2) requires extra setup steps and may depend on upstream pull requests being merged.
Instruct models are tuned for conversation, but can still perform plain code completion if eos_token_id is overridden at generation time.
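
As a rough illustration of the eos_token_id adjustment, the sketch below builds the keyword arguments one might pass to `model.generate` when using an instruct model for completion. The two token ids are assumptions recalled from the upstream README, and `completion_generate_kwargs` is a hypothetical helper. Confirm the values against the model's configuration before use.

```python
# Assumption: both ids are recalled from the upstream README, not verified here.
INSTRUCT_DEFAULT_EOS_ID = 32021  # default eos_token_id in the instruct configuration
COMPLETION_EOS_ID = 32014        # eos_token_id to use for plain code completion


def completion_generate_kwargs(max_new_tokens: int = 128) -> dict:
    """Keyword arguments for `model.generate` when an instruct model completes code."""
    return {
        "max_new_tokens": max_new_tokens,
        "do_sample": False,                 # greedy decoding for reproducible output
        "eos_token_id": COMPLETION_EOS_ID,  # override the instruct default
    }


# Usage with an already loaded instruct model and tokenizer (not executed here):
#   inputs = tokenizer(code_prefix, return_tensors="pt").to(model.device)
#   outputs = model.generate(**inputs, **completion_generate_kwargs())
```

Passing the id at call time leaves the stored model configuration untouched, so chat-style generation with the same model keeps its default stopping behavior.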

Health Check

Last Commit: 2 months ago
Responsiveness: Inactive

Star History

170 stars in the last 30 days