Research paper code/data for monitor-guided code LM decoding via static analysis
This repository provides the code and data for "Monitor-Guided Decoding of Code LMs with Static Analysis of Repository Context," a NeurIPS 2023 paper. It introduces Monitor-Guided Decoding (MGD) to improve code generation by using static analysis monitors to guide Large Language Models (LLMs). The project is relevant for researchers and engineers working on code generation, LLM evaluation, and static analysis integration.
How It Works
The core innovation is Monitor-Guided Decoding (MGD), which leverages static analysis to constrain LLM output during generation. A "monitor" component, built using the multilspy library, queries language servers for static analysis results (e.g., type information, method signatures). These results are then used to guide the LLM's decoding process, preventing common errors like "symbol not found" and improving code correctness. This approach enhances compilation rates and ground-truth matching without requiring model retraining.
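The guidance step described above can be illustrated with a toy sketch. This is not the repository's implementation: the vocabulary, the logits, and the set of allowed identifiers are all made up for illustration, and the helper names mask_logits and greedy_pick are hypothetical. The sketch assumes the monitor has already returned the identifiers that are valid at the current position, and it only shows how such a set could be used to mask next-token scores before a greedy choice. It ignores tokens that would complete an identifier and continue past it.

```python
import math


def mask_logits(logits, vocab, allowed_identifiers, partial=""):
    """Keep the score of each token that can extend `partial` toward some
    allowed identifier; set every other token's score to -inf."""
    masked = []
    for token, logit in zip(vocab, logits):
        candidate = partial + token
        consistent = any(name.startswith(candidate) for name in allowed_identifiers)
        masked.append(logit if consistent else -math.inf)
    return masked


def greedy_pick(logits, vocab):
    """Return the token with the highest score."""
    best = max(range(len(vocab)), key=lambda i: logits[i])
    return vocab[best]


# Hypothetical monitor output: member names valid after an object dereference.
allowed = {"start_server", "stop_server", "request_definition"}

# Hypothetical vocabulary and model scores for the next token.
vocab = ["start", "run", "stop", "_server", "request", "()"]
logits = [1.0, 2.5, 0.3, 0.1, 0.2, 0.9]

# Without guidance the model would pick a symbol that does not exist.
unguided = greedy_pick(logits, vocab)

# With guidance, only tokens consistent with the allowed identifiers remain.
guided = greedy_pick(mask_logits(logits, vocab, allowed), vocab)

# The same mask applies to later tokens of a multi-token identifier.
next_guided = greedy_pick(mask_logits(logits, vocab, allowed, partial=guided), vocab)

print(unguided, guided + next_guided)
```

Because the mask only changes scores at decoding time, the model weights stay untouched, which matches the summary's point that no retraining is required.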
Quick Start & Requirements
Install dependencies with pip install -r requirements.txt.
Run python3 eval_results.py <inference_results_csv> <pragmatic_code_filecontents_json> <output_directory> to reproduce paper results. A sample evaluation can be run with python3 evaluation_scripts/eval_results.py inference_results/dotprompts_results_sample.csv datasets/PragmaticCode/fileContentsByRepo.json results_sample/.
Usage examples for multilspy are in its repository tests.
Highlighted Details
The multilspy library provides a unified interface to language servers for static analysis.
Maintenance & Community
The multilspy library has been migrated to its own repository (microsoft/multilspy). Contributions are welcome via pull requests, subject to a Contributor License Agreement (CLA). The project follows the Microsoft Open Source Code of Conduct.
Licensing & Compatibility
The repository's license is not explicitly stated in the README. The multilspy library is available via a zip archive from GitHub, which suggests but does not confirm a permissive license; the specific terms should be verified.
Limitations & Caveats
The README mentions a RuntimeError related to asyncio event loops when running tests and recommends Python >= 3.10. The primary datasets and inference results are large and require Git LFS. The specific license for the monitors4codegen repository itself is not clearly defined.
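The Python version recommendation above can be checked before running the tests. The following is a minimal sketch, assuming the 3.10 minimum reported in this summary; the helper name python_is_supported and the warning text are hypothetical and not part of the repository.

```python
import sys

# Minimum version taken from the README recommendation summarized above.
MIN_PYTHON = (3, 10)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True when the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


if not python_is_supported():
    # Warn instead of exiting so the check stays non-intrusive.
    print(
        "Warning: Python >= 3.10 is recommended; older versions may raise "
        "a RuntimeError related to asyncio event loops when running tests.",
        file=sys.stderr,
    )
```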