monitors4codegen by microsoft

Research paper code/data for monitor-guided code LM decoding via static analysis

created 1 year ago
268 stars

Top 96.5% on sourcepulse

Project Summary

This repository provides the code and data for "Monitor-Guided Decoding of Code LMs with Static Analysis of Repository Context," a NeurIPS 2023 paper. It introduces Monitor-Guided Decoding (MGD) to improve code generation by using static analysis monitors to guide Large Language Models (LLMs). The project is relevant for researchers and engineers working on code generation, LLM evaluation, and static analysis integration.

How It Works

The core innovation is Monitor-Guided Decoding (MGD), which leverages static analysis to constrain LLM output during generation. A "monitor" component, built using the multilspy library, queries language servers for static analysis results (e.g., type information, method signatures). These results are then used to guide the LLM's decoding process, preventing common errors like "symbol not found" and improving code correctness. This approach enhances compilation rates and ground-truth matching without requiring model retraining.
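The masking idea described above can be illustrated with a toy sketch. This is not the repository's implementation: the `MEMBERS` table stands in for a language server query, and `monitor_allowed`, `mask_logits`, `guided_decode`, `unguided_decode`, the vocabulary, and the logit values are all hypothetical names and numbers chosen for illustration.

```python
import math

# Hypothetical type table; a real monitor would ask a language server
# which members exist on the receiver's type.
MEMBERS = {"Stack": ["push", "pop", "peek"]}

# Hypothetical sub-word vocabulary and per-step model logits.
VOCAB = ["ad", "d", "pu", "sh", "po", "p"]
STEP_LOGITS = [
    [5.0, 0.1, 3.0, 0.2, 1.0, 0.5],  # step 1: the model prefers "ad"
    [0.3, 4.0, 0.1, 2.0, 0.2, 0.6],  # step 2: the model prefers "d"
]


def monitor_allowed(receiver_type, partial):
    """Member names of receiver_type that start with the text generated so far."""
    return [m for m in MEMBERS.get(receiver_type, []) if m.startswith(partial)]


def mask_logits(logits, vocab, allowed, partial):
    """Set the logit to -inf for every token that cannot lead to an allowed name."""
    masked = []
    for token, logit in zip(vocab, logits):
        candidate = partial + token
        ok = any(name.startswith(candidate) for name in allowed)
        masked.append(logit if ok else -math.inf)
    return masked


def guided_decode(receiver_type, vocab, step_logits):
    """Greedy decoding of a member name with the monitor's mask applied each step."""
    partial = ""
    for logits in step_logits:
        allowed = monitor_allowed(receiver_type, partial)
        if partial in allowed:
            break  # a complete, valid member name has been produced
        masked = mask_logits(logits, vocab, allowed, partial)
        best = max(range(len(vocab)), key=lambda i: masked[i])
        if masked[best] == -math.inf:
            break  # no token is consistent with the monitor
        partial += vocab[best]
    return partial


def unguided_decode(vocab, step_logits):
    """Plain greedy decoding, for comparison."""
    return "".join(vocab[max(range(len(vocab)), key=lambda i: l[i])] for l in step_logits)


guided = guided_decode("Stack", VOCAB, STEP_LOGITS)
unguided = unguided_decode(VOCAB, STEP_LOGITS)
```

In this toy setup the unguided decoder produces a member that does not exist on `Stack`, while the guided decoder is restricted to names the type table contains. The model weights are untouched; only the logits at decoding time are altered.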

Quick Start & Requirements

  • Installation: Create a Python virtual environment (venv or conda) with Python 3.10+. Install dependencies via pip install -r requirements.txt.
  • Prerequisites: Python 3.10+, Git LFS for large data files.
  • Evaluation: To reproduce the paper's results, run python3 evaluation_scripts/eval_results.py <inference_results_csv> <pragmatic_code_filecontents_json> <output_directory>. A sample evaluation can be run with python3 evaluation_scripts/eval_results.py inference_results/dotprompts_results_sample.csv datasets/PragmaticCode/fileContentsByRepo.json results_sample/.
  • Datasets: PragmaticCode (Java projects) and DotPrompts (method completion examples) are available at Zenodo.
  • Documentation: Usage examples for multilspy are in its repository tests.

Highlighted Details

  • MGD improves compilation rates by 19-25% and boosts ground-truth match across granularities.
  • Supports multiple monitors for joint property enforcement.
  • multilspy library provides a unified interface to language servers for static analysis.
  • Includes datasets (PragmaticCode, DotPrompts) and inference results for various LLMs.
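One plausible way to read "joint property enforcement" is that a token is allowed only if every active monitor allows it. The sketch below assumes that interpretation; the summary does not describe how the repository actually combines monitors. The two monitors, the `KNOWN_MEMBERS` set, and the vocabulary are hypothetical.

```python
from functools import reduce

# Hypothetical data: members the type defines, including one private helper.
KNOWN_MEMBERS = {"push", "pop", "_grow"}
VOCAB = ["push", "pop", "_grow", "size"]


def member_monitor(prefix, vocab):
    """Allow only tokens that name a member defined on the type."""
    return {t for t in vocab if t in KNOWN_MEMBERS}


def visibility_monitor(prefix, vocab):
    """Allow only tokens that do not look private (no leading underscore)."""
    return {t for t in vocab if not t.startswith("_")}


def combine_monitors(monitors, prefix, vocab):
    """Intersect the allowed sets so every monitor's property holds at once."""
    allowed_sets = [m(prefix, vocab) for m in monitors]
    return reduce(set.intersection, allowed_sets, set(vocab))


joint = combine_monitors([member_monitor, visibility_monitor], "stack.", VOCAB)
```

Here "size" is rejected by the first monitor and "_grow" by the second, so only tokens that satisfy both properties remain available to the decoder.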

Maintenance & Community

The multilspy library has been migrated to its own repository (microsoft/multilspy). Contributions are welcome via pull requests, subject to a Contributor License Agreement (CLA). The project follows the Microsoft Open Source Code of Conduct.

Licensing & Compatibility

The README does not state the repository's license explicitly. The multilspy library can be obtained as a zip archive from GitHub, but that distribution method says nothing about its license terms. Check the license file in each repository before reusing the code or data.

Limitations & Caveats

The README notes that running the tests can raise a RuntimeError related to asyncio event loops and recommends Python >= 3.10 to avoid it. The primary datasets and inference results are large and require Git LFS. The license for the monitors4codegen repository itself is not clearly stated.

Health Check

  • Last commit: 11 months ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 16 stars in the last 90 days
