es-fine-tuning-paper by VsonicV

Evolution Strategies for LLM Fine-Tuning

Created 2 months ago
263 stars

Top 97.2% on SourcePulse

Project Summary

This repository provides the source code for the paper "Evolution Strategies at Scale: LLM Fine-Tuning Beyond Reinforcement Learning." It addresses the challenge of fine-tuning large language models (LLMs) by employing Evolution Strategies (ES) to directly optimize billions of parameters, offering a novel alternative to traditional reinforcement learning (RL) methods. The project targets researchers and engineers seeking to scale LLM optimization efficiently.

How It Works

The core innovation lies in applying Evolution Strategies (ES) to direct LLM parameter optimization. Instead of backpropagating gradients or building a full RL pipeline, ES treats fine-tuning as a black-box search problem: it iteratively generates populations of perturbed model variants, evaluates their performance on the target task, and uses the resulting scores to steer the parameters toward better-performing regions. The approach is designed to scale to models with billions of parameters, potentially offering a more direct and computationally tractable path to fine-tuning than complex RL setups.
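As a rough sketch of this recipe (not the repository's actual implementation; es_finetune, reward_fn, and the hyperparameter defaults below are illustrative placeholders), one ES update over a flat parameter vector can be written as:

```python
import numpy as np

def es_finetune(theta, reward_fn, iterations=100, population=32,
                sigma=0.01, lr=0.02, seed=0):
    """Toy Evolution Strategies loop over a flat parameter vector.

    theta     -- 1-D numpy array of model parameters
    reward_fn -- callable mapping a parameter vector to a scalar task score
    """
    rng = np.random.default_rng(seed)
    for _ in range(iterations):
        # 1. Generate a population of perturbed model variants.
        eps = rng.standard_normal((population, theta.size))
        candidates = theta + sigma * eps

        # 2. Evaluate every variant on the target task.
        rewards = np.array([reward_fn(c) for c in candidates])

        # 3. Normalize the scores and estimate the search gradient.
        rewards = (rewards - rewards.mean()) / (rewards.std() + 1e-8)
        grad = eps.T @ rewards / (population * sigma)

        # 4. Move the parameters toward better-scoring variants.
        theta = theta + lr * grad
    return theta
```

At LLM scale the perturbation matrix is typically never materialized: workers regenerate their noise from shared random seeds and exchange only scalar rewards, which is what makes this kind of search feasible for billions of parameters.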

Quick Start & Requirements

  • Installation: Set up a Python environment (version >= 3.10) and activate it. From the repository root, install dependencies using pip install -r requirement.txt.
  • Accelerated Version Dependencies: For the faster implementation, additionally install vllm==0.11.0 and tensorboard.
  • Prerequisites: Python 3.10+, the accelerate library, and GPU hardware are necessary for running the fine-tuning scripts.
  • Documentation: The associated research paper is available at https://arxiv.org/abs/2509.24372.

Highlighted Details

  • An accelerated version of the code achieves over 10x speed-up in running time, making large-scale ES fine-tuning more practical.
  • The project enables direct optimization of billions of LLM parameters using ES, presenting a distinct paradigm from RL-based fine-tuning.
  • It offers separate scripts for fine-tuning with partially correlated noise and with fully i.i.d. noise, allowing experimentation with different ES noise strategies (see the sketch below).
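This summary does not describe the correlation structure, so the following is purely an illustration of the two regimes rather than the repository's scheme; partially_correlated_noise and its block-reuse trick are assumptions made for the example:

```python
import numpy as np

def iid_noise(n_params, rng):
    # Fully i.i.d.: an independent Gaussian draw for every parameter.
    return rng.standard_normal(n_params)

def partially_correlated_noise(n_params, rng, block=1024):
    # Hypothetical block-reuse scheme: a small buffer of Gaussian draws is
    # tiled across the parameter vector, so corresponding entries in
    # different tiles are identical (correlated) while entries within a
    # tile remain independent. The repository's actual scheme may differ.
    base = rng.standard_normal(block)
    repeats = -(-n_params // block)          # ceil(n_params / block)
    return np.tile(base, repeats)[:n_params]

rng = np.random.default_rng(0)
eps_iid = iid_noise(10_000, rng)                     # 10,000 independent draws
eps_corr = partially_correlated_noise(10_000, rng)   # only 1,024 unique draws
```

Partially correlated noise of this kind would trade some exploration diversity for lower sampling and storage cost, which is one reason a repository might offer both variants.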

Maintenance & Community

The repository is under active development, with ongoing additions of experimental code, and users should anticipate potential breaking changes. A community forum for ES fine-tuning is available in the Discussions section.

Licensing & Compatibility

The provided README does not specify a software license. Consequently, the terms for commercial use, redistribution, or integration into closed-source projects remain undefined.

Limitations & Caveats

The accelerated implementations are subject to breaking changes as development continues, and experimental code is still being added, so the API and feature set should be treated as unstable.

Health Check

  • Last Commit: 1 week ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 6
  • Issues (30d): 2
  • Star History: 30 stars in the last 30 days

Explore Similar Projects

Starred by Eric Zhang (Founding Engineer at Modal), Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI), and 3 more.

tunix by google

0.9% · 2k stars
JAX-native library for efficient LLM post-training
Created 8 months ago · Updated 3 days ago
Starred by Casper Hansen (Author of AutoAWQ), Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI), and 5 more.

xtuner by InternLM

0.2% · 5k stars
LLM fine-tuning toolkit for research
Created 2 years ago · Updated 2 days ago