Lingma-SWE-GPT by LingmaTongyi

Specialized LLM for software engineering automation

Created 1 year ago
252 stars

Top 99.6% on SourcePulse

Project Summary

Lingma SWE-GPT is an open-source large language model tailored for software engineering tasks, built upon Qwen base models and fine-tuned with development process data. It aims to provide intelligent assistance for software improvement and complex engineering challenges. The project includes SWESynInfer, a three-stage workflow extending AutoCodeRover, designed to simulate expert developer cognitive processes for enhanced accuracy in code synthesis and inference.

How It Works

Lingma SWE-GPT builds on the Qwen architecture, further trained on specialized software engineering process data. Its core inference workflow, SWESynInfer, is a three-stage process that extends the AutoCodeRover framework, refining it to more closely mirror how an expert developer reasons about an issue and thereby improving the accuracy of code synthesis and inference.
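
As a rough illustration, the sketch below models such a three-stage pipeline in Python. The stage names (repository understanding, fault localization, patch generation) and all function signatures are assumptions chosen to reflect how AutoCodeRover-style workflows are typically structured; none of this is the project's actual API.

    # Hypothetical sketch of a three-stage SWESynInfer-style pipeline.
    # Stage names, signatures, and placeholder bodies are illustrative
    # assumptions, not the project's actual API.

    def understand_repository(issue: str, repo_path: str) -> list[str]:
        """Stage 1: collect files/classes/functions plausibly related to the issue."""
        return []  # placeholder: e.g., keyword or AST search over the repository

    def localize_fault(issue: str, candidates: list[str]) -> list[str]:
        """Stage 2: narrow the candidates to specific suspect code locations."""
        return candidates[:1]  # placeholder: e.g., model-ranked top location

    def generate_patch(issue: str, locations: list[str]) -> str:
        """Stage 3: draft a patch for the suspect locations, returned as a diff."""
        return ""  # placeholder: e.g., a model-generated unified diff

    def swesyn_infer(issue: str, repo_path: str) -> str:
        """Chain the three stages: understand -> localize -> patch."""
        candidates = understand_repository(issue, repo_path)
        locations = localize_fault(issue, candidates)
        return generate_patch(issue, locations)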

Quick Start & Requirements

  1. Clone Repository: git clone https://github.com/LingmaTongyi/Lingma-SWE-GPT.git
  2. Environment Setup: Create the environment with conda env create -f environment.yml (or the Mamba equivalent), then activate it with conda activate swesyninfer.
  3. Path Configuration: Update SWESynInfer/SWE-bench/setup_result/setup_map.json with your local repository path using python scripts/1_change_testbed_path.py YOUR_ABSOLUTE_PATH/Lingma-SWE-GPT/SWE-bench/repos/testbed.
  4. Git Configuration: Set global user name and email: git config --global user.name "Your Name" and git config --global user.email "your.email@example.com".
  5. Model Deployment (vLLM): the api_server entrypoint below exposes an OpenAI-compatible endpoint (a sample client request is sketched after this list).
    • 7B Model: Requires 4 GPUs. Use python -m vllm.entrypoints.openai.api_server --gpu-memory-utilization 0.95 --served-model-name Lingma-SWE-GPT --model Lingma/Lingma-SWE-GPT-7B --tensor-parallel-size 4 --max-model-len 131072 --trust-remote-code --rope-scaling '{"type": "yarn", "factor": 4.0, "original_max_position_embeddings": 32768}'.
    • 72B Model: Requires a minimum of 4 GPUs. Use the same command as above, but replace --model Lingma/Lingma-SWE-GPT-7B with --model Lingma/Lingma-SWE-GPT-72B.
  6. Run SWE-GPT: python scripts/run.py conf/vanilla-lite-swebench.conf.
  7. Model Checkpoints: Available at ModelScope 7B and ModelScope 72B.
  8. Prerequisites: Python, Conda/Mamba, Git, vLLM, CUDA-enabled GPUs (minimum 4 for 72B model).
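
Once the server from step 5 is running, any OpenAI-compatible client can issue requests against it. A minimal sketch, assuming vLLM's default port 8000 and the served model name from step 5; the prompt text is illustrative:

    # Minimal client sketch against the vLLM OpenAI-compatible server (step 5).
    # Assumes vLLM's default port 8000; vLLM does not check the API key, so a
    # placeholder such as "EMPTY" is conventional.
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

    response = client.chat.completions.create(
        model="Lingma-SWE-GPT",  # must match --served-model-name from step 5
        messages=[{"role": "user",
                   "content": "Which file most likely contains this bug?"}],
        temperature=0.0,
    )
    print(response.choices[0].message.content)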

Highlighted Details

  • Achieved solution rates of 30.20% (72B) and 18.20% (7B) on the SWE-bench Verified leaderboard.
  • Demonstrated a 51.16% fault location success rate on SWE-bench Verified.
  • Outperforms other open-source models of similar scale, showing a 22.76% increase compared to Llama 3.1 405B on software engineering tasks.

Maintenance & Community

The project acknowledges foundational work from the Qwen, SWE-bench, AutoCodeRover, and Agentless teams. Specific community channels (e.g., Discord, Slack) and active-maintainer information are not detailed in the README. The README's "TODO" items point to planned work, though the repository has seen no commits in about a year (see Health Check below).

Licensing & Compatibility

The README does not specify a software license, so usage restrictions are unknown; seek clarification before any commercial use or integration into closed-source projects.

Limitations & Caveats

  • Multilingual support (Java, JavaScript, TypeScript, Rust) is listed as a future development item ("TODO").
  • The 72B model requires a minimum of 4 GPUs for deployment.
  • Setup involves manual path configuration within setup_map.json.
  • Evaluation is recommended via the SWE-bench Docker image (an example harness invocation is sketched after this list).
  • License information is not provided.
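
For reference, recent releases of the SWE-bench package ship a dockerized evaluation harness invoked roughly as below. The module path and flags follow SWE-bench's documented usage at the time of writing and may differ across versions; predictions.jsonl and the run id are placeholders:

    # Hypothetical SWE-bench dockerized evaluation run; verify the flags
    # against the SWE-bench version you have installed.
    python -m swebench.harness.run_evaluation \
        --dataset_name princeton-nlp/SWE-bench_Verified \
        --predictions_path predictions.jsonl \
        --max_workers 4 \
        --run_id lingma-swe-gpt-eval
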
Health Check

  • Last Commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Stars gained (30d): 0
