OpenCodeInterpreter by OpenCodeInterpreter

Open-source code generation system for bridging LLMs and code interpreters

created 1 year ago
1,670 stars

Top 25.9% on sourcepulse

View on GitHub
Project Summary

OpenCodeInterpreter provides a suite of open-source code generation models that integrate execution and iterative refinement, aiming to rival proprietary systems like GPT-4 Code Interpreter. It's designed for developers and researchers seeking enhanced code generation capabilities with built-in error correction and performance improvement through execution feedback.

How It Works

The system leverages large language models trained on extensive code datasets and incorporates a feedback loop where generated code is executed. Errors or suboptimal outputs from execution are fed back to the model, enabling it to refine the code iteratively. This approach, particularly the integration of execution feedback, demonstrably improves performance on benchmarks like HumanEval and MBPP.
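
The loop below is a minimal sketch of this idea rather than the project's own code: it assumes a generate(prompt) callable that wraps the model, runs each candidate program in a subprocess, and feeds any error output back into the next prompt.

```python
import subprocess
import tempfile


def run_code(code: str) -> tuple[bool, str]:
    """Execute generated Python code and return (success, stderr)."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    result = subprocess.run(
        ["python3", path], capture_output=True, text=True, timeout=30
    )
    return result.returncode == 0, result.stderr


def refine(generate, task: str, max_iters: int = 3) -> str:
    """Iteratively regenerate code until it executes cleanly.

    `generate` is a hypothetical stand-in for a call into the model
    (e.g. whatever chatbot.py uses internally); it maps a prompt to code.
    """
    prompt = task
    code = generate(prompt)
    for _ in range(max_iters):
        ok, stderr = run_code(code)
        if ok:
            break
        # Execution feedback: append the error to the prompt and retry.
        prompt = (
            f"{task}\n\nThe previous attempt failed with:\n{stderr}\n"
            "Please fix the code."
        )
        code = generate(prompt)
    return code
```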

Quick Start & Requirements

  • Install: Clone the repository, change into the demo directory, create and activate a conda environment (conda create -n demo python=3.10, then conda activate demo), and install the dependencies (pip install -r requirements.txt).
  • Prerequisites: Python 3.10, Conda, and a Hugging Face token with write permission exported as the HF_TOKEN environment variable.
  • Run: python3 chatbot.py --path "model_name" (e.g., m-a-p/OpenCodeInterpreter-DS-6.7B); a programmatic loading sketch follows this list.
  • Resources: Requires local execution environment and Hugging Face Hub access.
  • Demo: OpenCodeInterpreter Demo README
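
The documented entry point is chatbot.py, as noted above. For programmatic use, the published checkpoints can presumably also be loaded directly with Hugging Face transformers; the sketch below assumes that, reuses the model name from the run example, and omits the chat prompt formatting that chatbot.py applies.

```python
# Hedged sketch: loading an OpenCodeInterpreter checkpoint with transformers.
# This is not project-provided code; verify prompt formatting against chatbot.py.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "m-a-p/OpenCodeInterpreter-DS-6.7B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = "Write a Python function that checks whether a string is a palindrome."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```

Note that the 6.7B checkpoint needs a GPU with enough memory for bfloat16 weights (on the order of 14 GB for the weights alone); smaller variants may be preferable on constrained hardware.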

Highlighted Details

  • The 33B model ranks among the top entries on the BigCode leaderboard.
  • Performance gains are quantified on HumanEval and MBPP benchmarks, showing significant uplift with execution feedback.
  • Supports multiple model families (DS, CL, GM, SC2) with varying parameter counts.
  • Utilizes a 68K multi-turn interaction dataset (Code-Feedback) for training and refinement (a loading sketch follows this list).
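
If the training data is of interest, it can likely be pulled from the Hugging Face Hub with the datasets library. The dataset ID below is an assumption based on the model namespace (m-a-p); confirm the exact name in the project README before relying on it.

```python
# Hedged sketch: inspecting the Code-Feedback multi-turn dataset.
# "m-a-p/Code-Feedback" is an assumed dataset ID, not confirmed by this page.
from datasets import load_dataset

ds = load_dataset("m-a-p/Code-Feedback", split="train")
print(ds)      # number of rows and column names
print(ds[0])   # one multi-turn interaction record
```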

Maintenance & Community

Repository activity has been quiet: the last commit was about a year ago, with no pull requests or issues opened in the past 30 days, though maintainer responsiveness has historically been around one day (see Health Check below).

Licensing & Compatibility

  • Models are open-sourced on Hugging Face.
  • The README does not state specific license terms for the code or models, so compatibility with commercial use or closed-source linking should be verified before adoption.

Limitations & Caveats

Reported performance gains are based on a single round of execution feedback; allowing unrestricted iterations might yield different results. As noted above, license terms for the codebase and models should be verified before commercial use.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 33 stars in the last 90 days
