RepoAudit by PurCL

Autonomous LLM agent for repository-level code auditing

Created 7 months ago
261 stars

Top 97.4% on SourcePulse

Project Summary


RepoAudit is an autonomous LLM agent for large-scale, repository-level code auditing. It detects general bugs such as Null Pointer Dereference, Memory Leak, and Use After Free across multiple programming languages. Aimed at engineers and researchers, it offers a compilation-free, multi-lingual, and customizable approach to automated code analysis that aims to mimic manual auditing.

How It Works

RepoAudit is a multi-agent LLM framework that uses LLMSCAN to parse code. The MetaScanAgent uses tree-sitter for foundational syntactic analysis, and the DFBScanAgent performs inter-procedural data-flow analysis to identify complex bugs. Because the analysis works on parsed source, it needs no compilation step, supports diverse programming languages, and can detect various bug types, which makes it an alternative to traditional static analysis tools.
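To make the idea of inter-procedural data-flow analysis concrete, here is a minimal Python sketch. It is not RepoAudit's implementation, which drives the analysis with LLM agents. The toy program representation, the `find_npd_paths` function, and every name in it are hypothetical. The sketch follows a possibly-null value across function boundaries until it reaches a dereference.

```python
from collections import deque

# Hypothetical toy program: each node is a "function:variable" slot, and each
# edge says a value may flow from one slot to another (assignment, argument
# passing, or return). None of this mirrors RepoAudit's internal data model.
FLOWS = {
    "find_user:result": ["main:user"],   # return value flows to the caller
    "main:user": ["print_name:u"],       # passed on as an argument
    "print_name:u": [],
    "main:config": ["load:cfg"],
    "load:cfg": [],
}
NULL_SOURCES = {"find_user:result"}         # slots that may hold a null value
DEREF_SINKS = {"print_name:u", "load:cfg"}  # slots that get dereferenced


def find_npd_paths(flows, sources, sinks):
    """Return one shortest flow path from each source to each reachable sink."""
    paths = []
    for source in sorted(sources):
        queue = deque([[source]])
        seen = {source}
        while queue:
            path = queue.popleft()
            node = path[-1]
            if node in sinks:
                paths.append(path)
            for nxt in flows.get(node, []):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(path + [nxt])
    return paths


if __name__ == "__main__":
    for p in find_npd_paths(FLOWS, NULL_SOURCES, DEREF_SINKS):
        print(" -> ".join(p))
```

In this toy graph only `print_name:u` is reported, because `load:cfg` is dereferenced but never receives a value from a null source. A real auditor has to derive such flow edges from source code, which is the hard part this sketch leaves out.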

Quick Start & Requirements

Setup involves creating a Python 3.13 conda environment (conda create -n repoaudit python=3.13, conda activate repoaudit), installing dependencies (pip install -r requirements.txt), and building the Tree-sitter library (cd lib; python build.py). Users must configure OpenAI and Anthropic API keys via environment variables. A helper script src/run_repoaudit.sh facilitates scanning, accepting an optional project path and bug type (MLK, NPD, UAF).
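The setup requirements above can be checked before a scan with a small preflight script. This is a sketch, not part of RepoAudit: `check_prerequisites` is a hypothetical helper, and the variable names `OPENAI_API_KEY` and `ANTHROPIC_API_KEY` are assumed from the providers' usual conventions, since the summary only says the keys are set via environment variables.

```python
import os
import sys

# Assumed names: the summary does not spell out which environment variables
# are read, so these follow the providers' common conventions.
REQUIRED_ENV_VARS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY")
REQUIRED_PYTHON = (3, 13)


def check_prerequisites(env, version_info=sys.version_info):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    found = tuple(version_info[:2])
    if found != REQUIRED_PYTHON:
        problems.append(
            f"Python {REQUIRED_PYTHON[0]}.{REQUIRED_PYTHON[1]} required, "
            f"found {found[0]}.{found[1]}"
        )
    for name in REQUIRED_ENV_VARS:
        if not env.get(name):
            problems.append(f"environment variable {name} is not set")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites(os.environ):
        print("setup problem:", problem)
```

Passing the environment and version in as arguments keeps the check easy to test without touching the real process environment.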

  • Project Repo: https://github.com/PurCL/RepoAudit
  • User Guide, Project Architecture, Extension Guide, DeepWiki: linked from the README.

Highlighted Details

  • Supports C/C++, Java, Python, and Go.
  • Detects Memory Leak (MLK), Null Pointer Dereference (NPD), and Use After Free (UAF).
  • Features compilation-free analysis and multi-lingual capabilities.
  • Includes parallel auditing support via --max-neural-workers and --max-symbolic-workers options.
  • Claims to have identified over 100 bugs in open-source projects.
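The two worker options listed above suggest separate concurrency limits for LLM calls and for parsing-based analysis. The Python sketch below shows one way such limits could be applied. It is an assumption about what the options bound, not RepoAudit's code; `run_audit`, `parse`, and `ask_llm` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def run_audit(functions, parse, ask_llm,
              max_symbolic_workers=4, max_neural_workers=2):
    """Run parsing in one bounded pool, then model queries in another.

    `parse` and `ask_llm` are caller-supplied callables standing in for
    the symbolic and neural stages; results keep the input order.
    """
    with ThreadPoolExecutor(max_workers=max_symbolic_workers) as symbolic_pool:
        parsed = list(symbolic_pool.map(parse, functions))
    with ThreadPoolExecutor(max_workers=max_neural_workers) as neural_pool:
        return list(neural_pool.map(ask_llm, parsed))


if __name__ == "__main__":
    # Stand-in stages: strip whitespace, then pretend to ask a model.
    print(run_audit([" foo() ", "bar() "], str.strip, str.upper))
```

Keeping the neural limit lower than the symbolic one is a common choice when model calls are rate-limited or billed per request, though the right values depend on the provider and the project size.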

Maintenance & Community

The RepoAudit paper has been accepted at ICML 2025, and a preprint is available for related work on network protocol bug detection. Development is ongoing, with plans to open-source additional agents. Community engagement happens through GitHub issues and pull requests, and direct contact information for the maintainers is provided.

Licensing & Compatibility

The project is released under the Purdue license. This custom license requires careful review by users, particularly for commercial applications, as its terms may differ from standard open-source licenses.

Limitations & Caveats

Running RepoAudit requires API keys for OpenAI and Anthropic, which may incur usage costs. Python 3.13 is required, and the Tree-sitter library must be built as a separate setup step. The Purdue license should be examined thoroughly before commercial use. Furthermore, the statement "Other agents in RepoAudit will be released soon" suggests that the currently open-sourced components may not represent the full intended functionality of the framework.

Health Check

  • Last Commit: 1 week ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 1
  • Issues (30d): 0
  • Star History: 29 stars in the last 30 days
