evmbench by paradigmxyz

LLM-powered benchmark for smart contract vulnerability detection

Created 4 months ago

434 stars

Top 67.9% on SourcePulse

1 Expert Loves This Project

Georgios Konstantopoulos

CTO, General Partner at Paradigm

Summary

paradigmxyz/evmbench provides an automated benchmark and harness for discovering and exploiting smart contract vulnerabilities. LLM-driven agents analyze Solidity code, and a web UI lets users upload contracts, select an analysis model, and receive detailed vulnerability reports. The tool is aimed at security researchers and developers who want automated, AI-assisted auditing of smart contracts.

How It Works

The system uses a microservices architecture. A Next.js frontend talks to a FastAPI backend, which tracks job state in PostgreSQL, handles secrets through a dedicated service, and queues tasks on RabbitMQ. An "Instancer" service consumes the RabbitMQ messages and launches worker containers (Docker locally, Kubernetes optionally). Each worker runs an LLM agent (e.g., Codex) in an isolated environment and analyzes the uploaded smart contracts using predefined prompts and tools such as Foundry and Slither. Results are then processed and presented in the frontend. Decoupling the API from the workers through a queue lets analysis scale by adding workers.
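The flow described above (submit, queue, launch worker, store result) can be sketched in a few lines of Python. This is an illustrative in-memory model only, not evmbench's code: `Backend`, `Job`, `JobState`, `run_worker`, and `instancer_drain` are hypothetical names, a `dict` stands in for PostgreSQL, `queue.Queue` stands in for RabbitMQ, and the "agent" is a toy string match rather than an LLM run.

```python
import queue
from dataclasses import dataclass, field
from enum import Enum


class JobState(Enum):
    QUEUED = "queued"
    RUNNING = "running"
    DONE = "done"


@dataclass
class Job:
    job_id: int
    contract_source: str
    model: str
    state: JobState = JobState.QUEUED
    findings: list = field(default_factory=list)


class Backend:
    """Hypothetical stand-in for the API layer: records jobs and enqueues work."""

    def __init__(self):
        self.jobs = {}              # stands in for the PostgreSQL job table
        self.tasks = queue.Queue()  # stands in for the RabbitMQ task queue

    def submit(self, contract_source, model):
        job = Job(len(self.jobs) + 1, contract_source, model)
        self.jobs[job.job_id] = job
        self.tasks.put(job.job_id)  # only the job id travels through the queue
        return job.job_id


def run_worker(job):
    """Placeholder for the isolated worker; a toy substring check, not a real audit."""
    findings = []
    if "tx.origin" in job.contract_source:
        findings.append("contract references tx.origin")
    return findings


def instancer_drain(backend):
    """Consume queued job ids and run one 'worker' per job until the queue is empty."""
    while True:
        try:
            job_id = backend.tasks.get_nowait()
        except queue.Empty:
            return
        job = backend.jobs[job_id]
        job.state = JobState.RUNNING
        job.findings = run_worker(job)
        job.state = JobState.DONE


if __name__ == "__main__":
    backend = Backend()
    job_id = backend.submit("require(tx.origin == owner);", "example-model")
    instancer_drain(backend)
    print(backend.jobs[job_id].state.value, backend.jobs[job_id].findings)
```

The point of the sketch is the separation of concerns: the submitting side never runs analysis itself, so workers can be added or replaced without touching the API layer.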

Quick Start & Requirements

Prerequisites: Docker, Bun.

Installation:

Build base and worker Docker images:

```
cd backend
docker build -t evmbench/base:latest -f docker/base/Dockerfile .
docker build -t evmbench/worker:latest -f docker/worker/Dockerfile .
```

Start backend stack (API, DB, MQ):

```
cp .env.example .env
docker compose up -d --build
```

Start frontend development server:
```
cd frontend
bun install
bun dev
```

Access: Frontend at http://127.0.0.1:3000; backend configuration at http://127.0.0.1:1337/v1/integration/frontend.
Dependencies: PostgreSQL and RabbitMQ are managed via Docker Compose. An OpenAI API key is required for LLM access; an optional proxy mode limits the key's exposure inside worker containers.
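After starting both stacks, it can be useful to confirm that the two URLs listed under Access respond. The following is a minimal sketch using only the Python standard library; the `probe` helper is a hypothetical name, and the script assumes nothing about the response body, only that the endpoint answers over HTTP.

```python
import urllib.error
import urllib.request

FRONTEND_URL = "http://127.0.0.1:3000"
BACKEND_CONFIG_URL = "http://127.0.0.1:1337/v1/integration/frontend"

# Bypass any system proxy settings, since both targets are on localhost.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def probe(url, timeout=3.0):
    """Return (ok, detail) for a single GET request; never raises on network errors."""
    try:
        with _opener.open(url, timeout=timeout) as resp:
            return True, f"HTTP {resp.status}"
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status.
        return False, f"HTTP {exc.code}"
    except OSError as exc:
        # Covers URLError, refused connections, and timeouts.
        return False, f"unreachable ({exc})"


if __name__ == "__main__":
    for name, url in [("frontend", FRONTEND_URL), ("backend config", BACKEND_CONFIG_URL)]:
        ok, detail = probe(url)
        print(f"{name}: {'ok' if ok else 'FAILED'} - {detail}")
```

If the backend line fails while the frontend succeeds, `docker compose ps` in the `backend` directory shows whether the API container is running.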

Highlighted Details

Utilizes LLM agents for automated smart contract vulnerability detection.
Worker containers are pre-configured with essential security analysis tools (Foundry, Slither).
Supports flexible LLM model selection and integration via a model_map.json.
Offers an optional OpenAI proxy mode to manage API key exposure within worker environments.
Configurable maximum audit runtime via EVM_BENCH_CODEX_TIMEOUT_SECONDS (default: 10800 s, i.e., 3 hours).
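To make the last two configuration points concrete, here is a hedged Python sketch of how a service might read the timeout variable and resolve a model through a map file. Only the variable name `EVM_BENCH_CODEX_TIMEOUT_SECONDS`, its 10800-second default, and the file name `model_map.json` come from the summary above. The helper names are hypothetical, and the flat label-to-model-id JSON layout is an assumption, since the real schema of `model_map.json` is not described here.

```python
import json
import os

DEFAULT_TIMEOUT_SECONDS = 10800  # default stated in the summary (3 hours)


def audit_timeout_seconds(env=os.environ):
    """Read the maximum audit runtime, falling back to the documented default."""
    raw = env.get("EVM_BENCH_CODEX_TIMEOUT_SECONDS")
    if raw is None or raw.strip() == "":
        return DEFAULT_TIMEOUT_SECONDS
    value = int(raw)  # raises ValueError on non-numeric input
    if value <= 0:
        raise ValueError("timeout must be a positive number of seconds")
    return value


def resolve_model(model_map_text, requested):
    """Look up a model id in a JSON map.

    ASSUMPTION: the map is a flat object such as {"label": "provider-model-id"};
    the actual structure of model_map.json may differ.
    """
    model_map = json.loads(model_map_text)
    if requested not in model_map:
        raise KeyError(f"unknown model label: {requested!r}")
    return model_map[requested]


if __name__ == "__main__":
    print(audit_timeout_seconds({}))
    print(audit_timeout_seconds({"EVM_BENCH_CODEX_TIMEOUT_SECONDS": "3600"}))
    print(resolve_model('{"fast": "example-model-a"}', "fast"))
```

Validating the timeout at startup, rather than when a job runs, surfaces a misconfigured value before any audit is queued.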

Maintenance & Community

Development contributions are credited to members of the OtterSec team, including es3n1n, jktrn, TrixterTheTux, and sahuang. The README provides no community channels (e.g., Discord, Slack) or roadmap links.

Licensing & Compatibility

The repository includes a LICENSE file, indicating an open-source release. Specific license terms and compatibility for commercial use or integration with closed-source projects are not detailed in the provided README.

Limitations & Caveats

Workers process untrusted uploaded code, so the worker runtime must be treated as a security risk. The Kubernetes backend is optional rather than the default deployment target. The project focuses on a "detect-only" mode for LLM agents, and its maturity (e.g., alpha, beta) is not explicitly stated.

Health Check

Last Commit

2 weeks ago

Responsiveness

Inactive

Star History

9 stars in the last 30 days