verilog-eval by NVlabs

Evaluation harness for LLMs on Verilog code generation and spec-to-RTL tasks

created 1 year ago
295 stars

Project Summary

This repository provides an evaluation harness for benchmarking Large Language Models (LLMs) on Verilog hardware description language (HDL) code generation tasks. It targets researchers and engineers evaluating LLMs for hardware design automation, offering improved prompts, support for specification-to-RTL tasks, and detailed error analysis.

How It Works

The harness utilizes a Makefile to orchestrate the evaluation workflow, supporting two primary tasks: code-complete-iccad2023 and spec-to-rtl. It manages datasets as plain text files and allows flexible configuration of LLM parameters such as model choice, in-context learning examples (0-4 shots), number of samples, temperature, and top-p. The evaluation process involves generating Verilog code from LLM prompts and then verifying its correctness using iverilog and verilator.
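The verification step can be pictured with a minimal Python sketch. This is not the harness's own code: the function names, the file names, and the choice of the `-g2012` flag are assumptions made for illustration, and the decision about whether a simulation "passes" is left to the caller because it depends on what the testbench prints.

```python
import shutil
import subprocess


def build_iverilog_cmd(sources, out_path):
    """Return an iverilog command line that compiles `sources` into `out_path`.

    The -g2012 flag (SystemVerilog 2012 support) is an illustrative choice,
    not necessarily what the harness passes.
    """
    return ["iverilog", "-g2012", "-o", str(out_path), *map(str, sources)]


def compile_and_simulate(sources, out_path="sim.vvp", timeout_s=30):
    """Compile with iverilog, then run the result with vvp.

    Returns (compiled, log). `compiled` is False when iverilog reports errors,
    in which case `log` holds the compiler's stderr for later error analysis.
    Otherwise `log` holds the simulator's stdout for the caller to inspect.
    """
    if shutil.which("iverilog") is None or shutil.which("vvp") is None:
        raise RuntimeError("iverilog and vvp must be on PATH")

    compile_proc = subprocess.run(
        build_iverilog_cmd(sources, out_path),
        capture_output=True,
        text=True,
    )
    if compile_proc.returncode != 0:
        return False, compile_proc.stderr

    sim_proc = subprocess.run(
        ["vvp", str(out_path)],
        capture_output=True,
        text=True,
        timeout=timeout_s,
    )
    return True, sim_proc.stdout
```

A caller would pass the generated module together with a testbench, for example `compile_and_simulate(["sample.sv", "tb.sv"])` (hypothetical file names), and then scan the returned log for the testbench's pass/fail message.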

Quick Start & Requirements

  • Install: make
  • Prerequisites:
    • iverilog v12 (v13 is not supported)
    • verilator
    • python3 (v3.11.0 recommended, e.g., via conda create -n codex python=3.11)
    • Python packages: langchain, langchain-openai, langchain-nvidia-ai-endpoints
  • Setup: Requires manual installation of iverilog from source (v12 branch).
  • Docs: VerilogEval Overview

Highlighted Details

  • Supports specification-to-RTL tasks in addition to code completion.
  • Includes in-context learning examples and reframed prompts for improved LLM performance.
  • Analyzes and categorizes common iverilog compilation errors.
  • Reports Pass@1 metrics for both low (temp=0) and high (temp=0.85) temperature settings.
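For readers unfamiliar with the metric, Pass@k is commonly computed with the unbiased estimator popularized by the HumanEval benchmark: given n samples per problem of which c pass, pass@k = 1 - C(n-c, k) / C(n, k). The sketch below assumes that estimator; the source does not state which formula this harness uses, and the function name is hypothetical.

```python
from math import comb


def pass_at_k(n, c, k):
    """Unbiased pass@k estimate for one problem.

    n: number of samples generated
    c: number of samples that passed
    k: number of attempts the metric allows
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # If fewer than k samples failed, every size-k subset contains a pass.
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)
```

For k = 1 the estimate reduces to c / n, the fraction of samples that pass, so a problem with 5 passing samples out of 20 contributes 0.25. The benchmark score is then the mean of this value over all problems.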

Maintenance & Community

The project is associated with NVlabs and has published research papers detailing its methodology and findings. Links to relevant papers are provided for citation.

Licensing & Compatibility

The repository does not explicitly state a license in the README. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The project is currently Linux-only and requires manual compilation of a specific iverilog version (v12). MachineEval is not supported, and the original Pass@10 metric is no longer reported. A Dockerfile and prebuilt JSONL support are planned but not yet available.

Health Check

  • Last commit: 2 weeks ago
  • Responsiveness: 1 week
  • Pull requests (30d): 0
  • Issues (30d): 4
  • Star history: 43 stars in the last 90 days
