FM-Agent  by haoran-ding

LLM-powered agent for scaling formal methods in large codebases

Created 2 weeks ago

New!

374 stars

Top 75.7% on SourcePulse

GitHubView on GitHub
Project Summary

Summary

FM-Agent addresses the challenge of scaling formal methods to large-scale software systems. It provides an LLM-based framework for automated compositional reasoning, enabling developers to detect bugs in complex codebases like large compilers. The primary benefit is automating rigorous verification processes that were previously impractical for extensive projects.

How It Works

The framework employs LLM-based Hoare-style reasoning, guided by markdown workflows. It analyzes code, generates specifications, and identifies discrepancies between intended and actual behavior. This approach scales formal methods by leveraging LLMs' pattern recognition and reasoning capabilities, automating the generation of bug reports and trigger conditions.

Quick Start & Requirements

  • Installation: Requires Ubuntu (24.04 LTS tested), Python 3.12, pip >= 23, and specific Python packages (openai, OpenCode). An install.sh script handles most dependencies after setting the OPENROUTER_API_KEY environment variable.
  • Prerequisites: An OpenRouter API key is mandatory for LLM interaction. The execution environment for generated test cases must be prepared. The oh-my-opencode plugin's comment-checker hook should be disabled.
  • Links: Paper available at arXiv:2604.11556.

Highlighted Details

  • Successfully applied to large systems (e.g., 143K LoC C compiler).
  • Supports automated compositional reasoning across multiple languages: Rust, C, C++, Python, Java, Go, CUDA, JavaScript, TypeScript, ArkTS.
  • Generates detailed bug reports including specification claims, actual behavior, code evidence, trigger conditions, and reproduction steps.
  • Outputs machine-readable bug summaries and probe scripts for verification.

Maintenance & Community

No specific details on maintainers, community channels (e.g., Discord, Slack), or roadmap are provided in the README. Contact is via GitHub issues or email.

Licensing & Compatibility

  • License: The repository's license is not specified in the README.
  • Compatibility: No explicit notes regarding commercial use or linking with closed-source projects.

Limitations & Caveats

Effectiveness is highly dependent on the underlying LLM's reasoning capabilities; weaker models may produce hallucinations. Claude Opus/Sonnet 4.6+ are recommended. The framework is currently restricted to using the OpenRouter API and requires a pre-configured environment for executing generated test cases. There is also a potential for name conflicts with the generated fm_agent/ directory.

Health Check
Last Commit

5 days ago

Responsiveness

Inactive

Pull Requests (30d)
1
Issues (30d)
6
Star History
375 stars in the last 17 days

Explore Similar Projects

Feedback? Help us improve.