AutoIF by QwenLM

Research paper for improving LLM instruction-following via self-play with execution feedback

created 1 year ago
298 stars

Top 90.1% on sourcepulse

Project Summary

This repository provides AutoIF, a method for automatically generating and verifying instruction-following data for large language models using code execution feedback. It is designed for researchers and developers aiming to improve LLM instruction-following capabilities through scalable, self-play data synthesis.

How It Works

AutoIF synthesizes data in stages, starting with seed instructions and progressing through verification function generation, quality cross-validation, and back-translation. It then augments queries, verifies responses against generated functions, and filters for high-quality instruction-response pairs. This approach leverages code execution to provide objective feedback, ensuring the generated data is reliable and effective for training.
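The verification step described above can be sketched in a few lines of Python. This is a hypothetical illustration of execution-feedback filtering, not code from the AutoIF repository; the example instruction ("answer in exactly three words"), the checker source, and the candidate responses are all made up for the sketch:

```python
# Minimal sketch of execution-feedback verification (hypothetical example,
# not code from the AutoIF repository).

# A verification function an LLM might generate for the instruction
# "answer in exactly three words".
verify_fn_source = """
def evaluate(response):
    return len(response.split()) == 3
"""

def passes_verification(response, fn_source):
    """Execute a generated verification function and apply it to a response."""
    namespace = {}
    try:
        exec(fn_source, namespace)           # compile the generated checker
        return bool(namespace["evaluate"](response))
    except Exception:
        return False                         # broken checkers reject the pair

candidates = ["I like cats", "I really like cats a lot"]
kept = [r for r in candidates if passes_verification(r, verify_fn_source)]
# Only responses that satisfy the checker survive filtering.
```

Because the checker is ordinary code, pass/fail is objective and cheap to compute at scale, which is what makes the self-play data loop practical.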

Quick Start & Requirements

  • Install: pip install -r requirements.txt within the ./AutoIF/ directory.
  • Prerequisites: Python 3.9, PyTorch 2.1.2+cu121, Transformers 4.41.2.
  • Setup: Requires running a series of Python scripts for data synthesis and verification.
  • Docs: https://github.com/QwenLM/AutoIF

Highlighted Details

  • Automates instruction-following data generation and quality verification.
  • Utilizes code execution feedback for reliable data quality assessment.
  • Supports both Strong-to-Weak Distillation and Self-Alignment training setups.
  • Integrates with LLaMA-Factory for SFT and DPO training.
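The quality cross-validation highlighted above can be illustrated with a hedged sketch: candidate verification functions are scored against labeled test cases, and only functions that classify every case correctly are kept. The test cases, candidate functions, and the exact acceptance threshold here are illustrative assumptions, not the repository's actual implementation:

```python
# Hypothetical sketch of cross-validating generated verification functions
# against labeled test cases; names and thresholds are illustrative only.

# Labeled cases for the instruction "answer in exactly three words".
test_cases = [
    ("one two three", True),
    ("one two", False),
]

candidate_fns = [
    "def evaluate(r):\n    return len(r.split()) == 3",  # correct checker
    "def evaluate(r):\n    return 'one' in r",           # spurious checker
]

def accuracy(fn_source, cases):
    """Fraction of labeled cases a candidate checker classifies correctly."""
    namespace = {}
    exec(fn_source, namespace)
    hits = sum(namespace["evaluate"](text) == label for text, label in cases)
    return hits / len(cases)

# Keep only functions that agree with every labeled case.
kept_fns = [f for f in candidate_fns if accuracy(f, test_cases) == 1.0]
```

The spurious checker matches a word that also appears in the negative case, so it scores 0.5 and is discarded, while the length-based checker survives.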

Maintenance & Community

The project is associated with Qwen, Alibaba Inc. Further community or roadmap details are not explicitly provided in the README.

Licensing & Compatibility

The repository does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The README notes that Transformers versions below 4.41.2 are unlikely to work. Implementation details for training the 7B and 70B models are deferred to the associated paper.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0
Star History
20 stars in the last 90 days

Explore Similar Projects

Starred by Ross Taylor (Cofounder of General Reasoning; Creator of Papers with Code), Daniel Han (Cofounder of Unsloth), and 4 more.

open-instruct by allenai

Training codebase for instruction-following language models
0.2% · 3k stars · created 2 years ago · updated 17 hours ago