gpt-wpre by moyix

Binary reverse engineering via LLM

Created 2 years ago

380 stars

Top 74.8% on SourcePulse

1 Expert Loves This Project

Simon Willison

Coauthor of Django

Project Summary

This project is a prototype for whole-program reverse engineering with GPT-3. It summarizes a compiled binary by recursively summarizing each function together with the functions it depends on. It is aimed at reverse engineers and security researchers who need to understand binaries whose decompiled code does not fit in the context window of a large language model. The main benefit is a natural language summary of what the program does, even for large or complex software.

How It Works

GPT-WPRE uses Ghidra for decompilation and call graph extraction. It then summarizes functions recursively: a topological sort of the call graph orders the functions so that callees are summarized before their callers, and each callee's summary is provided as context when its caller is summarized. For a function that exceeds the LLM's context limit, it attempts to summarize sequential chunks of the code, recursively reducing the chunk size if necessary.
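The callee-first ordering described above can be sketched with Python's standard `graphlib` module. This is a minimal illustration, not the project's own code: `summarize_in_callee_order`, `placeholder_summarize`, and the example call graph are hypothetical names, and the placeholder stands in for a real LLM call.

```python
from graphlib import TopologicalSorter


def summarize_in_callee_order(call_graph, summarize):
    """Summarize every function after all of the functions it calls.

    call_graph maps a function name to the set of functions it calls.
    summarize is any callable taking (function_name, callee_summaries).
    """
    summaries = {}
    # TopologicalSorter treats each mapped value as a set of predecessors,
    # so static_order() yields callees before the functions that call them.
    for func in TopologicalSorter(call_graph).static_order():
        callee_summaries = {
            callee: summaries[callee]
            for callee in sorted(call_graph.get(func, ()))
        }
        summaries[func] = summarize(func, callee_summaries)
    return summaries


def placeholder_summarize(func, callee_summaries):
    # Hypothetical stand-in for an LLM request: it only lists the callees.
    called = ", ".join(callee_summaries) or "nothing"
    return f"{func} calls {called}"


# Hypothetical call graph, used only to exercise the sketch.
example_graph = {
    "main": {"parse", "render"},
    "parse": {"read_chunk"},
    "render": set(),
    "read_chunk": set(),
}
example_summaries = summarize_in_callee_order(example_graph, placeholder_summarize)
```

Because each summary is stored before any caller is visited, a caller's prompt can include short summaries of its callees instead of their full decompiled bodies, which is what keeps the prompt within the context limit.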

Quick Start & Requirements

Install Python dependencies: pip install -r requirements.txt
Run Ghidra with ghidra_bridge enabled.
Obtain an OpenAI API key.
Extract call graph and decompilations: python extract_ghidra_decomp.py
Summarize functions: python recursive_summarize.py -f <function_name> <program_directory>
See samples/libpng16.so.16.38.0_stripped for example output.

Highlighted Details

Uses OpenAI's text-davinci-003 model.
Handles functions exceeding context limits by chunking and summarizing parts.
Includes a --dry-run flag to estimate API costs.
Offers a debugging script (extras/debug_summaries.py) for side-by-side comparison of source, decompiled code, and summaries.
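The chunking fallback for oversized functions might look roughly like the sketch below. It is an assumption-laden illustration rather than the tool's implementation: `summarize_in_chunks`, `fits`, and `summarize` are hypothetical names, and `fits` stands in for whatever token-count check the real script performs.

```python
def summarize_in_chunks(lines, summarize, fits, chunk_size=None):
    """Summarize sequential chunks of a function, halving the chunk size on failure.

    lines is the decompiled function split into lines.
    fits is a predicate that reports whether a chunk is within the context limit.
    summarize turns one chunk into a summary string.
    """
    if not lines:
        return []
    if chunk_size is None:
        chunk_size = len(lines)  # first attempt: the whole function as one chunk
    if chunk_size < 1:
        raise ValueError("a single line exceeds the context limit")

    chunks = [lines[i:i + chunk_size] for i in range(0, len(lines), chunk_size)]
    if all(fits(chunk) for chunk in chunks):
        return [summarize(chunk) for chunk in chunks]
    # At least one chunk is too large: retry with chunks half the size.
    return summarize_in_chunks(lines, summarize, fits, chunk_size // 2)


# Hypothetical stand-ins: at most 3 lines "fit", and a summary is just a line count.
example_lines = [f"stmt_{i};" for i in range(10)]
example_parts = summarize_in_chunks(
    example_lines,
    summarize=lambda chunk: f"{len(chunk)} lines",
    fits=lambda chunk: len(chunk) <= 3,
)
```

The per-chunk summaries would still need to be combined into one summary for the function. That step is left out here because the source does not describe how it is done.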

Maintenance & Community

The project appears to be a personal prototype by "moyix" with no explicit mention of community channels, ongoing development, or sponsorships.

Licensing & Compatibility

The README does not specify a license.

Limitations & Caveats

The tool is a "toy prototype" tested on only one program (libpng) and may not generalize well. It does not handle mutual recursion or cycles in the call graph, which will cause exceptions. The summarization prompts are basic and could likely be improved with prompt engineering. API costs can be significant for full program analysis.
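Since cycles in the call graph are reported to raise exceptions, one option is to check for them before spending any API budget. The sketch below is not part of the project: `find_call_cycle` is a hypothetical helper that relies on the standard library's `graphlib.CycleError`.

```python
from graphlib import CycleError, TopologicalSorter


def find_call_cycle(call_graph):
    """Return one cycle as a list of function names, or None if the graph is acyclic.

    call_graph maps a function name to the set of functions it calls.
    """
    try:
        # prepare() raises CycleError if the graph cannot be topologically sorted.
        TopologicalSorter(call_graph).prepare()
    except CycleError as err:
        # The second element of args holds the detected cycle; its first and
        # last entries are the same node.
        return err.args[1]
    return None


# Hypothetical graphs: two mutually recursive functions, and a simple chain.
mutual_cycle = find_call_cycle({"is_even": {"is_odd"}, "is_odd": {"is_even"}})
no_cycle = find_call_cycle({"main": {"helper"}, "helper": set()})
```

How to summarize the functions inside a detected cycle is a separate question that the source leaves open.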

Health Check

Last Commit

2 years ago

Responsiveness

Inactive

Star History

0 stars in the last 30 days