SandboxFusion by bytedance

Secure code sandbox for LLM-generated code execution and evaluation

Created 1 year ago

930 stars

Top 39.1% on SourcePulse

Project Summary

SandboxFusion provides a secure, containerized environment for executing and evaluating code generated by Large Language Models (LLMs). It is designed for researchers and developers working with LLM-based code generation, offering support for numerous programming languages and popular code evaluation benchmarks.

How It Works

The system runs each piece of code inside a Docker container, which isolates it from the host and makes runs reproducible. Supported languages and runtimes include Python, C++, Java, Go, Node.js, and CUDA for GPU workloads. SandboxFusion also integrates with code evaluation datasets such as HumanEval, MultiPL-E, and MBPP, so LLM-generated code can be benchmarked against them.
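To make the container-isolation idea concrete, here is a minimal sketch of how a throwaway, locked-down container invocation could be assembled. It is a generic illustration, not SandboxFusion's actual implementation: the helper name `build_docker_command`, the `python:3.12-slim` image, and the specific resource limits are all assumptions chosen for the example. Only standard `docker run` flags are used.

```python
import shlex


def build_docker_command(image, run_argv, memory="256m", cpus="1.0", pids_limit=64):
    """Return an argv list that runs `run_argv` inside a disposable container.

    Hypothetical helper for illustration; SandboxFusion's own server may
    configure its containers differently.
    """
    return [
        "docker", "run",
        "--rm",                           # remove the container when it exits
        "--network", "none",              # untrusted code gets no network access
        "--memory", memory,               # cap memory usage
        "--cpus", cpus,                   # cap CPU share
        "--pids-limit", str(pids_limit),  # limit process count (fork bombs)
        "--read-only",                    # mount the root filesystem read-only
        image,
        *run_argv,
    ]


# Build (but do not execute) a command that would run a one-line Python program.
cmd = build_docker_command("python:3.12-slim", ["python", "-c", "print(42)"])
print(shlex.join(cmd))
```

Passing the resulting list to `subprocess.run` would execute it on a machine with Docker installed. Keeping the command as a list rather than a shell string avoids quoting problems when the code under test contains spaces or quotes.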

Quick Start & Requirements

Installation: Via Docker, or manual setup with conda and poetry.
Prerequisites: Docker, or a Python 3.12 environment with conda and poetry.
Docker images: A base image is provided, with instructions for customizing the server image.
Documentation: https://bytedance.github.io/SandboxFusion/
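Once a server is running, code is typically submitted to it over HTTP. The sketch below shows what a small client could look like. Everything specific in it is an assumption that this summary does not confirm: the `http://localhost:8080` address, the `/run_code` path, and the `code` / `language` field names. Check them against the documentation linked above before relying on them.

```python
import json
import urllib.request

# Assumed server address; adjust to wherever the sandbox server listens.
BASE_URL = "http://localhost:8080"


def build_run_request(code, language="python", base_url=BASE_URL):
    """Build a POST request carrying the code to execute.

    The endpoint path and JSON field names are assumptions for illustration.
    """
    payload = json.dumps({"code": code, "language": language}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/run_code",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_code(code, language="python", timeout=30.0):
    """Send the request and return the decoded JSON response."""
    req = build_run_request(code, language)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Construct a request without sending it, so the payload can be inspected.
req = build_run_request('print("Hello, world!")')
print(req.get_method(), req.full_url)
```

Calling `run_code('print("Hello, world!")')` would send the request. The build step is kept separate so the payload can be checked without a live server.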

Highlighted Details

Supports 20+ programming languages, including Python and CUDA with GPU acceleration.
Includes implementations for numerous LLM code evaluation benchmarks (HumanEval, MBPP, etc.).
Offers both a code runner and an online judge for evaluation and RL datasets.
Provides comprehensive testing utilities for development and validation.
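The online-judge idea mentioned above can be sketched as follows: run a candidate program once per test case, feed it the input on stdin, and compare its stdout with the expected output. This is a simplified illustration with a hypothetical `judge` helper, not SandboxFusion's judge. In particular, it uses a plain local subprocess as a stand-in for the sandbox and therefore provides no isolation.

```python
import subprocess
import sys


def judge(source, test_cases, timeout=5.0):
    """Run `source` once per (stdin, expected_stdout) pair; return pass/fail flags.

    WARNING: a bare subprocess is NOT a sandbox. It stands in for the
    isolated runner only to keep this example self-contained.
    """
    results = []
    for stdin_text, expected in test_cases:
        try:
            proc = subprocess.run(
                [sys.executable, "-c", source],
                input=stdin_text,
                capture_output=True,
                text=True,
                timeout=timeout,
            )
            # Pass only on a clean exit with matching (whitespace-trimmed) output.
            passed = proc.returncode == 0 and proc.stdout.strip() == expected.strip()
        except subprocess.TimeoutExpired:
            passed = False  # treat a timeout as a failed test case
        results.append(passed)
    return results


# A candidate solution that reads two integers and prints their sum.
candidate = "a, b = map(int, input().split())\nprint(a + b)"
cases = [("1 2\n", "3"), ("10 -4\n", "6")]
results = judge(candidate, cases)
print(f"{sum(results)}/{len(results)} test cases passed")
```

Benchmark-style evaluation follows the same shape: each problem supplies its own test cases, and the per-problem pass/fail flags are aggregated into a score.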

Maintenance & Community

The project lists several contributors from ByteDance. The README does not specify community channels such as Discord or Slack.

Licensing & Compatibility

Licensed under the Apache License, Version 2.0. This license is permissive and generally compatible with commercial use and closed-source linking.

Limitations & Caveats

The README does not document specific limitations, unsupported platforms, or known issues. Setting up individual language runtimes requires running the provided shell scripts manually.
