FM-Agent by fmagent-project

LLM-powered agent for scaling formal methods in large codebases

Created 3 months ago

443 stars

Top 66.9% on SourcePulse

Project Summary

Summary

FM-Agent addresses the challenge of scaling formal methods to large-scale software systems. It provides an LLM-based framework for automated compositional reasoning, enabling developers to detect bugs in complex codebases like large compilers. The primary benefit is automating rigorous verification processes that were previously impractical for extensive projects.

How It Works

The framework employs LLM-based Hoare-style reasoning, guided by markdown workflows. It analyzes code, generates specifications, and identifies discrepancies between intended and actual behavior. This approach scales formal methods by leveraging LLMs' pattern recognition and reasoning capabilities, automating the generation of bug reports and trigger conditions.

Quick Start & Requirements

Installation: Requires Ubuntu (24.04 LTS tested), Python 3.12, pip >= 23, and specific Python packages (openai, OpenCode). An install.sh script handles most dependencies after setting the OPENROUTER_API_KEY environment variable.
Prerequisites: An OpenRouter API key is mandatory for LLM interaction. The execution environment for generated test cases must be prepared. The oh-my-opencode plugin's comment-checker hook should be disabled.
Links: Paper available at arXiv:2604.11556.

Highlighted Details

Successfully applied to large systems (e.g., 143K LoC C compiler).
Supports automated compositional reasoning across multiple languages: Rust, C, C++, Python, Java, Go, CUDA, JavaScript, TypeScript, ArkTS.
Generates detailed bug reports including specification claims, actual behavior, code evidence, trigger conditions, and reproduction steps.
Outputs machine-readable bug summaries and probe scripts for verification.

Maintenance & Community

No specific details on maintainers, community channels (e.g., Discord, Slack), or roadmap are provided in the README. Contact is via GitHub issues or email.

Licensing & Compatibility

License: The repository's license is not specified in the README.
Compatibility: No explicit notes regarding commercial use or linking with closed-source projects.

Limitations & Caveats

Effectiveness is highly dependent on the underlying LLM's reasoning capabilities; weaker models may produce hallucinations. Claude Opus/Sonnet 4.6+ are recommended. The framework is currently restricted to using the OpenRouter API and requires a pre-configured environment for executing generated test cases. There is also a potential for name conflicts with the generated fm_agent/ directory.

FM-Agent by fmagent-project

Explore Similar Projects

slopc by shorwood

claude-skills by honnibal

llm-debugger-vscode-extension by mohsen1

vera by aallan

llm-verified-with-monte-carlo-tree-search by namin

devlooper by modal-labs

parsel by ezelikman

truecourse by truecourse-ai

web-codegen-scorer by angular

cpg by Fraunhofer-AISEC

Awesome-Code-LLM by codefuse-ai

qlty by qltysh