Evaluation harness for LLMs on Verilog code generation and spec-to-RTL tasks
This repository provides an evaluation harness for benchmarking Large Language Models (LLMs) on Verilog hardware description language (HDL) code generation tasks. It targets researchers and engineers evaluating LLMs for hardware design automation, offering improved prompts, support for specification-to-RTL tasks, and detailed error analysis.
How It Works
The harness uses a Makefile to orchestrate the evaluation workflow and supports two primary tasks: `code-complete-iccad2023` and `spec-to-rtl`. Datasets are managed as plain text files, and LLM parameters such as model choice, number of in-context learning examples (0-4 shots), number of samples, temperature, and top-p can all be configured. The evaluation process generates Verilog code from LLM prompts and then verifies its correctness using `iverilog` and `verilator`.
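As an illustration, a run might be launched as shown below; the variable names (`TASK`, `MODEL`, `SHOTS`, `SAMPLES`, `TEMPERATURE`, `TOP_P`) are assumptions for this sketch rather than the harness's documented interface, so consult the repository's Makefile for the actual targets and variables.

```sh
# Hypothetical invocation; variable names are assumptions, not the documented interface.
make TASK=spec-to-rtl \
     MODEL=gpt-4 \
     SHOTS=1 \
     SAMPLES=20 \
     TEMPERATURE=0.8 \
     TOP_P=0.95
```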
Quick Start & Requirements
- `make`
- `iverilog` (v12; v13 is not supported)
- `verilator`
- `python3` (v3.11.0 recommended, e.g., via `conda create -n codex python=3.11`)
- Python packages: `langchain`, `langchain-openai`, `langchain-nvidia-ai-endpoints`
- Build `iverilog` from source (v12 branch); a setup sketch follows this list.
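The following is a minimal setup sketch under stated assumptions: the conda command and package names come from the requirements above, while the `iverilog` repository URL and branch name are assumptions that should be checked against the upstream iverilog documentation.

```sh
# Python environment and LLM client libraries (from the requirements list).
conda create -n codex python=3.11
conda activate codex
pip install langchain langchain-openai langchain-nvidia-ai-endpoints

# Build iverilog v12 from source (repository URL and branch name are assumptions).
git clone https://github.com/steveicarus/iverilog.git
cd iverilog
git checkout v12-branch
sh autoconf.sh
./configure
make
sudo make install
```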
Highlighted Details
The harness provides detailed error analysis, including reporting of `iverilog` compilation errors.
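To illustrate the kind of check underlying that analysis, here is a minimal sketch that compiles a generated sample with `iverilog` and summarizes compiler messages; the file names and the error pattern are assumptions for illustration, not the harness's actual scripts.

```sh
# Hypothetical check; file names and the error pattern are assumptions.
iverilog -g2012 -o /dev/null candidate.sv testbench.sv 2> compile.log
if [ -s compile.log ]; then
    echo "Compilation failed; most frequent error lines:"
    grep -i "error" compile.log | sort | uniq -c | sort -rn | head
fi
```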
Maintenance & Community
The project is associated with NVlabs and has published research papers detailing its methodology and findings. Links to the relevant papers are provided for citation.
Licensing & Compatibility
The repository does not explicitly state a license in the README. Compatibility for commercial use or closed-source linking is not specified.
Limitations & Caveats
The project is currently Linux-only and requires manual compilation of a specific `iverilog` version (v12). `MachineEval` is not supported, and the original Pass@10 metric is no longer reported. A Dockerfile and prebuilt JSONL support are planned but not yet available.