fuzz4all by fuzz4all

Fuzzer using LLMs for universal input generation

created 1 year ago
275 stars

Top 94.9% on sourcepulse

Project Summary

Fuzz4All is a universal fuzzing framework that leverages Large Language Models (LLMs) to generate diverse and realistic inputs for various programming languages. It is designed for researchers and developers seeking to improve software robustness by exploring a wide range of input possibilities, particularly for languages where traditional fuzzing techniques may be less effective.

How It Works

Fuzz4All uses LLMs as its core input-generation and mutation engine. It employs a novel autoprompting technique that condenses user-provided information about the target (such as documentation or example code) into LLM prompts tailored for fuzzing. A key component is its LLM-powered fuzzing loop, which iteratively updates these prompts using previously generated inputs as feedback, enabling the generation of novel and effective test cases across arbitrary input languages.
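A minimal sketch of such an autoprompt-plus-feedback loop is shown below in Python. It is not Fuzz4All's actual API: the llm callable, the crash oracle, and the prompt wording are illustrative assumptions.

    import random
    import subprocess
    import tempfile

    def autoprompt(llm, target_docs: str) -> str:
        """Ask the LLM to condense target documentation into a fuzzing prompt."""
        return llm(
            "Summarize the following documentation into a short prompt for "
            "generating unusual but valid programs:\n" + target_docs
        )

    def run_target(binary: str, program: str) -> int:
        """Run one generated program against the target and return its exit code."""
        with tempfile.NamedTemporaryFile("w", suffix=".src", delete=False) as f:
            f.write(program)
            path = f.name
        return subprocess.run([binary, path], capture_output=True).returncode

    def fuzz_loop(llm, binary: str, target_docs: str, iterations: int = 100):
        prompt = autoprompt(llm, target_docs)
        interesting = []  # inputs that crashed the target
        for _ in range(iterations):
            candidate = llm(prompt)                  # generation step
            if run_target(binary, candidate) != 0:   # crude crash oracle
                interesting.append(candidate)
            if interesting:
                # Feedback step: fold a previously interesting input back into
                # the prompt so the next generation mutates or extends it.
                example = random.choice(interesting)
                prompt = (autoprompt(llm, target_docs)
                          + "\nHere is a previously interesting program:\n"
                          + example
                          + "\nWrite a mutated variant of it.")
        return interesting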

Quick Start & Requirements

  • Installation: The Docker image is the recommended route (https://doi.org/10.5281/zenodo.10456883). Alternatively, create the environment with conda create -n fuzz4all python=3.10 and conda activate fuzz4all, then run pip install -r requirements.txt and pip install -e .
  • Prerequisites: Python 3.10, CUDA for GPU acceleration, and an OpenAI API key for GPT-4 autoprompting. The bigcode/starcoderbase and starcoderbase-1b models are supported.
  • Configuration: Set environment variables like FUZZING_BATCH_SIZE, FUZZING_MODEL, and FUZZING_DEVICE. Fuzzing targets are configured via YAML files in the configs/ directory.
  • Execution: Run with python Fuzz4All/fuzz.py --config {config_file.yaml} main_with_config --folder outputs/fuzzing_outputs --batch_size {batch_size} --model_name {model_name} --target {target_name} (a programmatic wrapper is sketched after this list).
  • Resources: Requires user-provided target binaries.
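As a convenience, the documented command line can also be driven from a short Python wrapper. The sketch below simply sets the environment variables listed above and calls the documented entry point; the specific values, config filename, and target path are placeholders, not settings prescribed by the project.

    import os
    import subprocess

    # Illustrative launcher: env var values, config path, and target name are
    # placeholders rather than values prescribed by the project.
    env = os.environ.copy()
    env.update({
        "FUZZING_BATCH_SIZE": "30",
        "FUZZING_MODEL": "bigcode/starcoderbase",
        "FUZZING_DEVICE": "gpu",
    })

    subprocess.run(
        [
            "python", "Fuzz4All/fuzz.py",
            "--config", "configs/example_target.yaml",  # any YAML under configs/
            "main_with_config",
            "--folder", "outputs/fuzzing_outputs",
            "--batch_size", "30",
            "--model_name", "bigcode/starcoderbase",
            "--target", "/path/to/target_binary",       # user-provided target binary
        ],
        env=env,
        check=True,
    )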

Highlighted Details

  • First fuzzer to universally target many input languages using LLMs.
  • Novel autoprompting technique for generating effective fuzzing prompts.
  • Iterative LLM-powered fuzzing loop for prompt refinement.
  • Supports targeted fuzzing by pointing to specific API/library documentation.

Maintenance & Community

The project is associated with the ICSE'24 paper "Fuzz4All: Universal Fuzzing with Large Language Models." Further details and artifact access are available via a Zenodo link.

Licensing & Compatibility

The repository does not explicitly state a license in the provided README. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The project is presented as research code accompanying an ICSE'24 paper, so it should be treated as experimental. Because LLMs can emit arbitrary and potentially harmful code, generated inputs should be executed cautiously, ideally in a sandboxed environment. Model support is currently limited to specific StarCoder variants, though extensibility is mentioned.

Health Check

  • Last commit: 9 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 1
  • Issues (30d): 0
  • Star History: 33 stars in the last 90 days

Explore Similar Projects

Starred by Elie Bursztein (Cybersecurity Lead at Google DeepMind), Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), and 1 more.

oss-fuzz-gen by google

0.3% · 1k stars
LLM-powered fuzz target generator for C/C++/Java/Python projects, benchmarked via OSS-Fuzz
created 1 year ago
updated 5 days ago
Starred by Boris Cherny (Creator of Claude Code; MTS at Anthropic), Hiroshi Shibata (Core Contributor to Ruby), and 4 more.

oss-fuzz by google

0.2% · 11k stars
Continuous fuzzing for open source software
created 9 years ago
updated 1 day ago