selfcodealign by bigcode-project

Research code and paper for self-alignment in code generation

created 1 year ago
309 stars

Top 88.0% on sourcepulse

View on GitHub
Project Summary

SelfCodeAlign provides a fully open and transparent pipeline for instruction-tuning code generation models without human annotations or data distilled from proprietary models. The pipeline produces self-aligned models such as StarCoder2-Instruct, which achieves state-of-the-art performance on coding benchmarks. The project targets researchers and developers who want to build and improve code LLMs with reproducible, accessible methods.
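
For a sense of how the resulting model is consumed, the sketch below prompts it through the Hugging Face Transformers API. The model ID bigcode/starcoder2-15b-instruct-v0.1 is an assumption based on the Hub listing rather than something stated here, and the completion-style prompt is illustrative only.

    # Hedged usage sketch: loads the self-aligned model for inference.
    # The model ID below is assumed from the Hugging Face Hub; a ~15B
    # model in bfloat16 needs roughly 30 GB of GPU memory.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "bigcode/starcoder2-15b-instruct-v0.1"  # assumed Hub ID
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.bfloat16, device_map="auto"
    )

    prompt = "Write a Python function that checks whether a string is a palindrome."
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=200, do_sample=False)
    # Decode only the newly generated tokens, not the echoed prompt.
    print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))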

How It Works

The SelfCodeAlign pipeline is a self-improvement loop. An existing code model (StarCoder2-15B) generates instruction-response pairs from seed code snippets, the generated responses are validated by executing them (the exec-filter dataset referenced below is the output of this step), and the surviving pairs are used to fine-tune the same model. This aligns the model with desired coding behaviors without external human feedback or data distilled from larger, closed models, which keeps the alignment process transparent and reproducible.
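
A minimal sketch of that loop appears below. It is not the repository's actual pipeline code: the prompt templates, sampling settings, and the seed_snippet input are illustrative assumptions, and the real pipeline performs execution-based filtering between generation and fine-tuning.

    # Illustrative self-alignment loop, NOT the repository's pipeline code.
    # Prompts, settings, and seed data are placeholder assumptions.
    from transformers import pipeline

    generator = pipeline(
        "text-generation", model="bigcode/starcoder2-15b", device_map="auto"
    )

    def generate_pair(seed_snippet: str) -> dict:
        """Turn a seed code snippet into a synthetic instruction-response pair."""
        # Step 1: the base model invents a coding task inspired by the seed.
        instruction = generator(
            f"Write a programming task inspired by this code:\n{seed_snippet}\nTask:",
            max_new_tokens=128,
            return_full_text=False,
        )[0]["generated_text"]
        # Step 2: the same model answers its own task.
        response = generator(
            f"Instruction: {instruction}\nResponse:",
            max_new_tokens=512,
            return_full_text=False,
        )[0]["generated_text"]
        return {"instruction": instruction, "response": response}

    # Pairs that survive execution-based validation are then used to
    # fine-tune the same base model (e.g., with trl's SFTTrainer).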

Quick Start & Requirements

  • Install: pip install -e . (from the cloned repository)
  • Prerequisites: Python 3.10+, PyTorch, Transformers, Accelerate, bitsandbytes, datasets, peft, trl, scikit-learn, pandas, numpy, tqdm, wandb.
  • Model: Requires access to bigcode/starcoder2-15b.
  • Dataset: Utilizes bigcode/self-oss-instruct-sc2-exec-filter-50k.
  • Resources: Fine-tuning requires significant GPU resources (e.g., multiple A100s).
  • Details: Refer to README-SC2INST.md for detailed instructions; a minimal post-install smoke test is sketched below.
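
The following smoke test assumes the datasets and transformers packages from the prerequisite list and, if the model or dataset is gated for your account, authentication with the Hugging Face Hub; exact column names are documented on the dataset card.

    # Post-install smoke test: load the seed dataset and base tokenizer.
    from datasets import load_dataset
    from transformers import AutoTokenizer

    ds = load_dataset("bigcode/self-oss-instruct-sc2-exec-filter-50k", split="train")
    print(ds)  # prints the row count (~50k) and column names

    tok = AutoTokenizer.from_pretrained("bigcode/starcoder2-15b")
    print(tok("def hello():").input_ids[:8])  # quick tokenizer sanity check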

Highlighted Details

  • First fully open and transparent pipeline for self-alignment of code LLMs.
  • Creates StarCoder2-Instruct, a self-aligned model achieving state-of-the-art performance.
  • Eliminates reliance on human annotations or proprietary distilled data.
  • Permissively licensed output model and pipeline.

Maintenance & Community

The project is part of the BigCode community, an open scientific collaboration focused on the responsible development of large language models for code. Key contributors include researchers from institutions such as Hugging Face and ServiceNow.

Licensing & Compatibility

The project and its output model (StarCoder2-Instruct) are released under permissive licenses, facilitating commercial use and integration into closed-source projects. Specific license details are available with the model and code.

Limitations & Caveats

The fine-tuning process is computationally intensive, requiring substantial GPU resources. While the pipeline is designed for transparency, replicating the exact results may depend on specific hardware configurations and hyperparameter tuning.

Health Check

  • Last commit: 5 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 3 stars in the last 90 days
