AutoIF by QwenLM

Research paper for improving LLM instruction-following via self-play with execution feedback

Created 1 year ago · 320 stars · Top 84.9% on SourcePulse

Project Summary

This repository provides AutoIF, a method for automatically generating and verifying instruction-following data for large language models using code execution feedback. It is designed for researchers and developers aiming to improve LLM instruction-following capabilities through scalable, self-play data synthesis.

How It Works

AutoIF synthesizes data in stages: starting from a small set of seed instructions, it generates verification functions, cross-validates their quality, and back-translates functions into new instructions. It then augments queries, verifies responses by executing the generated functions, and keeps only high-quality instruction-response pairs. Because code execution provides objective pass/fail feedback, the resulting data is reliable enough to train on.
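
The core mechanism is executable verification: an LLM writes a checker function for each instruction, the checker is itself validated against test cases, and only responses the checker accepts survive. The sketch below is a minimal illustration of that loop; build_verifier, cross_validate, filter_pairs, and the evaluate(response) convention are hypothetical stand-ins, not the repository's actual API:

    # Minimal sketch of execution-feedback filtering; names are hypothetical.
    def build_verifier(func_source: str):
        """Compile LLM-generated source defining evaluate(response) -> bool."""
        namespace = {}
        exec(func_source, namespace)  # the real pipeline would sandbox this with a timeout
        return namespace["evaluate"]

    def cross_validate(verifier, test_cases, min_accuracy=0.8):
        """Keep a verifier only if it agrees with labeled test cases."""
        hits = sum(1 for text, label in test_cases if bool(verifier(text)) == label)
        return hits / len(test_cases) >= min_accuracy

    def filter_pairs(instruction, responses, verifiers):
        """Keep responses that pass every surviving verification function."""
        return [(instruction, r) for r in responses if all(v(r) for v in verifiers)]

    # Example: a checker for "reply in all lowercase"
    src = "def evaluate(response):\n    return response == response.lower()"
    verifier = build_verifier(src)
    if cross_validate(verifier, [("hello world", True), ("Hello World", False)]):
        print(filter_pairs("Reply in all lowercase.", ["sure thing", "OK!"], [verifier]))
        # -> [('Reply in all lowercase.', 'sure thing')]

In the actual pipeline these stages run over LLM-generated instructions and responses at scale; the point here is only the shape of the execution-feedback loop.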

Quick Start & Requirements

  • Install: pip install -r requirements.txt within the ./AutoIF/ directory.
  • Prerequisites: Python 3.9, PyTorch 2.1.2+cu121, Transformers 4.41.2 (a version check sketch follows this list).
  • Setup: Requires running a series of Python scripts for data synthesis and verification.
  • Docs: https://github.com/QwenLM/AutoIF
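
Before running the synthesis scripts, it can be worth confirming the environment matches the versions above; the check below is our own suggestion, not part of the repository:

    # Hypothetical pre-flight check; not part of AutoIF.
    import sys
    import torch
    import transformers
    from packaging import version

    print("python:", sys.version.split()[0])          # README lists 3.9
    print("torch:", torch.__version__)                # README lists 2.1.2+cu121
    print("transformers:", transformers.__version__)  # README lists 4.41.2

    # Per the README, Transformers below 4.41.2 is unlikely to work.
    assert version.parse(transformers.__version__) >= version.parse("4.41.2")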

Highlighted Details

  • Automates instruction-following data generation and quality verification.
  • Utilizes code execution feedback for reliable data quality assessment.
  • Supports both Strong-to-Weak Distillation and Self-Alignment training setups.
  • Integrates with LLaMA-Factory for SFT and DPO training (see the data-formatting sketch after this list).
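
As a rough picture of how execution feedback can feed both training setups, the sketch below turns verified responses into SFT records and pass/fail contrasts into DPO preference pairs. The field names follow LLaMA-Factory's alpaca-style formats, but both the keys and the simple pairing rule are our assumptions, not the repository's exact data format:

    # Sketch: converting verification outcomes into training records.
    # Field names assume LLaMA-Factory's alpaca-style SFT/preference formats.
    import json

    def to_sft_and_dpo(instruction, scored_responses):
        """scored_responses: list of (response, passed_verification: bool)."""
        passed = [r for r, ok in scored_responses if ok]
        failed = [r for r, ok in scored_responses if not ok]
        sft = [{"instruction": instruction, "input": "", "output": r} for r in passed]
        # Pair each passing response with a failing one as chosen/rejected.
        dpo = [{"instruction": instruction, "input": "", "chosen": c, "rejected": r}
               for c, r in zip(passed, failed)]
        return sft, dpo

    sft, dpo = to_sft_and_dpo("Reply in all lowercase.",
                              [("sure thing", True), ("OK!", False)])
    print(json.dumps(dpo, indent=2))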

Maintenance & Community

The project is associated with the Qwen team at Alibaba Inc. Further community or roadmap details are not provided in the README.

Licensing & Compatibility

The repository does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The README indicates that Transformers versions lower than 4.41.2 are unlikely to work. Specific implementation details for training the 7B and 70B models are deferred to the associated paper.

Health Check

  • Last Commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 3 stars in the last 30 days

Explore Similar Projects

Starred by Jeff Hammerbacher (Cofounder of Cloudera), Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI), and 13 more.

open-instruct by allenai

Training codebase for instruction-following language models
Top 0.5% on SourcePulse · 4k stars · Created 2 years ago · Updated 1 day ago
Starred by Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla, OpenAI; author of CS 231n), John Yang (coauthor of SWE-bench, SWE-agent), and 28 more.

stanford_alpaca by tatsu-lab

Instruction-following LLaMA model training and data generation
Top 0.0% on SourcePulse · 30k stars · Created 2 years ago · Updated 1 year ago