OpenCodeInterpreter by OpenCodeInterpreter

Open-source code generation system for bridging LLMs and code interpreters

created 1 year ago
1,670 stars

Top 25.9% on sourcepulse

View on GitHub
Project Summary

OpenCodeInterpreter provides a suite of open-source code generation models that integrate execution and iterative refinement, aiming to rival proprietary systems like GPT-4 Code Interpreter. It's designed for developers and researchers seeking enhanced code generation capabilities with built-in error correction and performance improvement through execution feedback.

How It Works

The system leverages large language models trained on extensive code datasets and incorporates a feedback loop where generated code is executed. Errors or suboptimal outputs from execution are fed back to the model, enabling it to refine the code iteratively. This approach, particularly the integration of execution feedback, demonstrably improves performance on benchmarks like HumanEval and MBPP.
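
The loop below is a minimal sketch of this idea rather than the project's own code: it assumes a generate(prompt) callable that wraps the model, runs each candidate program in a subprocess, and feeds any error output back into the next prompt.

```python
import subprocess
import tempfile


def run_code(code: str) -> tuple[bool, str]:
    """Execute generated Python code and return (success, stderr)."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    result = subprocess.run(
        ["python3", path], capture_output=True, text=True, timeout=30
    )
    return result.returncode == 0, result.stderr


def refine(generate, task: str, max_iters: int = 3) -> str:
    """Iteratively regenerate code until it executes cleanly.

    `generate` is a hypothetical stand-in for a call into the model
    (e.g. whatever chatbot.py uses internally); it maps a prompt to code.
    """
    prompt = task
    code = generate(prompt)
    for _ in range(max_iters):
        ok, stderr = run_code(code)
        if ok:
            break
        # Execution feedback: append the error to the prompt and retry.
        prompt = (
            f"{task}\n\nThe previous attempt failed with:\n{stderr}\n"
            "Please fix the code."
        )
        code = generate(prompt)
    return code
```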

Quick Start & Requirements

  • Install: Clone the repository, change into the demo directory, create and activate a conda environment (conda create -n demo python=3.10, then conda activate demo), and install the dependencies (pip install -r requirements.txt).
  • Prerequisites: Python 3.10, Conda, and a Hugging Face token with write permission exported as the HF_TOKEN environment variable.
  • Run: python3 chatbot.py --path "model_name" (e.g., m-a-p/OpenCodeInterpreter-DS-6.7B); a programmatic loading sketch follows this list.
  • Resources: Requires local execution environment and Hugging Face Hub access.
  • Demo: OpenCodeInterpreter Demo README
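
The documented entry point is chatbot.py, as noted above. For programmatic use, the published checkpoints can presumably also be loaded directly with Hugging Face transformers; the sketch below assumes that, reuses the model name from the run example, and omits the chat prompt formatting that chatbot.py applies.

```python
# Hedged sketch: loading an OpenCodeInterpreter checkpoint with transformers.
# This is not project-provided code; verify prompt formatting against chatbot.py.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "m-a-p/OpenCodeInterpreter-DS-6.7B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = "Write a Python function that checks whether a string is a palindrome."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```

Note that the 6.7B checkpoint needs a GPU with enough memory for bfloat16 weights (on the order of 14 GB for the weights alone); smaller variants may be preferable on constrained hardware.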

Highlighted Details

  • The 33B model ranks among the top entries on the BigCode leaderboard.
  • Performance gains are quantified on HumanEval and MBPP benchmarks, showing significant uplift with execution feedback.
  • Supports multiple model families (DS, CL, GM, SC2) with varying parameter counts.
  • Utilizes a 68K multi-turn interaction dataset (Code-Feedback) for training and refinement (a loading sketch follows this list).
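
If the training data is of interest, it can likely be pulled from the Hugging Face Hub with the datasets library. The dataset ID below is an assumption based on the model namespace (m-a-p); confirm the exact name in the project README before relying on it.

```python
# Hedged sketch: inspecting the Code-Feedback multi-turn dataset.
# "m-a-p/Code-Feedback" is an assumed dataset ID, not confirmed by this page.
from datasets import load_dataset

ds = load_dataset("m-a-p/Code-Feedback", split="train")
print(ds)      # number of rows and column names
print(ds[0])   # one multi-turn interaction record
```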

Maintenance & Community

Repository activity has been quiet: the last commit was about a year ago, with no pull requests or issues opened in the past 30 days, though maintainer responsiveness has historically been around one day (see Health Check below).

Licensing & Compatibility

  • Models are open-sourced on Hugging Face.
  • The README does not state specific license terms for the code or models, so compatibility with commercial use or closed-source linking should be verified before adoption.

Limitations & Caveats

Reported performance gains are based on a single round of execution feedback; allowing unrestricted iterations might yield different results. As noted above, license terms for the codebase and models should be verified before commercial use.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 33 stars in the last 90 days
