ml4se by saltudelft

Curated list of ML for software engineering research

Created 5 years ago
719 stars

Top 47.9% on SourcePulse

Project Summary

This repository is a curated collection of resources for researchers and practitioners in Machine Learning for Software Engineering (ML4SE). It aggregates papers, theses, datasets, and tools, organized by sub-field (for example type inference, code completion, and vulnerability detection), so that current research and tooling can be found in one place.

How It Works

The repository functions as a bibliography of academic publications, research projects, and related software tools. Resources are grouped by ML4SE task so that readers can go directly to the work relevant to a given problem. Curation favors recent and widely recognized contributions.

Quick Start & Requirements

This repository is a curated list and does not require installation or execution. It serves as a reference guide.

Highlighted Details

  • Extensive coverage of ML applications across numerous software engineering tasks, including type inference, code completion, generation, summarization, vulnerability detection, and program repair.
  • Detailed listings include paper titles, venues, publication years, and direct links to PDFs and associated code repositories where available.
  • Includes dedicated sections for PhD theses, talks, datasets, and tools, giving a broader view of the ML4SE ecosystem beyond papers.
  • Resources reflect research trends and tools as of the latest update; the repository's last commit was about a year ago.

Maintenance & Community

The repository is maintained by the Software Engineering Research Group at Delft University of Technology, with contributions encouraged via pull requests. It links to relevant research groups and academic venues.

Licensing & Compatibility

The repository's own license is not explicitly stated; a permissive license such as MIT is likely but unconfirmed. Individual resources (papers, code) are subject to their respective licenses.

Limitations & Caveats

As a curated list, the repository's depth and breadth depend on community contributions and maintainer effort, so it may not capture every relevant publication or tool.

Health Check
Last Commit: 1 year ago
Responsiveness: Inactive
Pull Requests (30d): 0
Issues (30d): 0
Star History
4 stars in the last 30 days

Explore Similar Projects

Starred by Shizhe Diao (Author of LMFlow; Research Scientist at NVIDIA), Omar Khattab (Coauthor of DSPy, ColBERT; Professor at MIT), and 5 more.

CodeXGLUE by microsoft

0.3%
2k
Benchmark for code intelligence tasks
Created 5 years ago
Updated 1 year ago
Starred by Lewis Tunstall (Research Engineer at Hugging Face), Eric Zhu (Coauthor of AutoGen; Research Scientist at Microsoft Research), and 6 more.

awesome-machine-learning-on-source-code by src-d

0.1%
6k
Curated list of ML applied to source code (MLonCode)
Created 8 years ago
Updated 4 years ago