Code LLM for code generation, completion, and question answering
CodeShell is a 7B parameter, multilingual code large language model developed by PKU-KCL and Sichuan Tianfu Bank AI Team. It offers strong performance on code generation and understanding tasks, targeting developers seeking efficient coding assistance. The project provides a full-stack solution including models, IDE plugins, and deployment options, aiming to enhance software development workflows.
How It Works
CodeShell is built on a GPT-2-style architecture, augmented with Grouped-Query Attention and RoPE (rotary) positional embeddings. It was trained on roughly 500 billion tokens drawn from GitHub, Stack, and StarCoder data, with rigorous deduplication and filtering. An optimized tokenizer improves compression of Chinese text, and the model supports an 8192-token context window.
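To make the architectural terms concrete, here is a minimal numpy sketch of grouped-query attention with rotary embeddings. This is an illustration of the general technique, not CodeShell's actual implementation; the head counts and the split-half RoPE variant are assumptions chosen for brevity.

```python
import numpy as np

def rope(x, base=10000.0):
    # Rotary position embeddings (split-half variant): rotate channel
    # pairs by a position-dependent angle. x: [seq, heads, head_dim].
    seq, heads, dim = x.shape
    half = dim // 2
    freqs = base ** (-np.arange(half) / half)            # [half]
    angles = np.arange(seq)[:, None] * freqs[None, :]    # [seq, half]
    cos = np.cos(angles)[:, None, :]                     # broadcast over heads
    sin = np.sin(angles)[:, None, :]
    x1, x2 = x[..., :half], x[..., half:]
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=-1)

def grouped_query_attention(q, k, v, n_kv_heads):
    # q: [seq, n_q_heads, d]; k, v: [seq, n_kv_heads, d].
    # Each group of n_q_heads // n_kv_heads query heads shares one KV head,
    # shrinking the KV cache relative to full multi-head attention.
    seq, n_q, d = q.shape
    group = n_q // n_kv_heads
    k = np.repeat(k, group, axis=1)   # expand KV heads across their group
    v = np.repeat(v, group, axis=1)
    q, k = rope(q), rope(k)
    scores = np.einsum("qhd,khd->hqk", q, k) / np.sqrt(d)
    mask = np.triu(np.full((seq, seq), -np.inf), k=1)    # causal mask
    w = np.exp(scores + mask)
    w /= w.sum(-1, keepdims=True)
    return np.einsum("hqk,khd->qhd", w, v)               # [seq, n_q, d]
```

With 8 query heads sharing 2 KV heads, the KV cache is a quarter of the multi-head size while the query side keeps its full resolution.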
Quick Start & Requirements
```shell
pip install -r requirements.txt
```
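Once dependencies are installed, the model can be loaded through the Hugging Face `transformers` API. The sketch below assumes the published checkpoint id `WisdomShell/CodeShell-7B` and illustrative generation parameters; adjust both to your setup.

```python
def complete(prompt: str,
             model_id: str = "WisdomShell/CodeShell-7B",
             max_new_tokens: int = 128) -> str:
    """Generate a code completion for `prompt` (illustrative sketch)."""
    # Lazy import so the helper can be defined without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # trust_remote_code is needed because the checkpoint ships custom model code.
    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, trust_remote_code=True, device_map="auto"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)


if __name__ == "__main__":
    print(complete("def fibonacci(n):"))
```

Note that the 7B checkpoint requires a GPU with sufficient memory (or quantization) for practical inference.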
Highlighted Details
Maintenance & Community
The project is actively developed by PKU-KCL. Community discussions and support are available via GitHub issues for the main repository and associated plugins.
Licensing & Compatibility
The models are released under a custom license that permits commercial use under specific conditions: daily active users must not exceed 1 million, the entity cannot be a software or cloud service provider, and re-licensing is prohibited without permission. An application process is required for commercial use. The project also references the Apache 2.0 license.
Limitations & Caveats
Commercial use requires explicit permission via an email application process, which may introduce delays or restrictions. While performance is strong on benchmarks, real-world effectiveness may vary. The project mentions a multi-task evaluation system is "coming soon."