Recipes to scale inference-time compute of open models
This project provides recipes and scripts to scale inference-time compute for open-source Large Language Models (LLMs), enabling them to tackle complex problems by "thinking longer." It targets researchers and developers interested in improving LLM performance beyond traditional parameter scaling, offering a practical approach to test-time compute optimization.
How It Works
The core approach augments LLM inference with search algorithms that guide the model's reasoning process. "Verifier" or reward models score intermediate steps, allowing the LLM to explore multiple solution paths and keep the most promising ones. Supported techniques include Best-of-N sampling, beam search, and Diverse Verifier Tree Search (DVTS), each configured via YAML recipe files. The approach aims to replicate the gains from increased inference-time compute seen in proprietary models such as OpenAI's o1, but with open-source models.
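As a concrete illustration of the simplest of these techniques, Best-of-N sampling with a verifier, here is a minimal sketch built on the Hugging Face transformers API. It is not the project's own code: the checkpoint names, prompt, and sampling parameters below are placeholder assumptions, and any causal LM plus any scalar-output reward model could be substituted.

# Minimal Best-of-N sketch: sample N candidate solutions, score each with a
# reward model, and return the highest-scoring one.
# NOTE: the checkpoint names below are placeholders, not the project's recipes.
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoModelForSequenceClassification,
    AutoTokenizer,
)

N = 8
prompt = "Problem: What is 17 * 24? Show your reasoning step by step.\nAnswer:"

# Generator: any instruction-tuned causal LM (placeholder name).
gen_name = "Qwen/Qwen2.5-0.5B-Instruct"
gen_tok = AutoTokenizer.from_pretrained(gen_name)
gen_model = AutoModelForCausalLM.from_pretrained(gen_name)

# Verifier: any reward model that maps (prompt, completion) to a scalar score
# (placeholder name).
rm_name = "OpenAssistant/reward-model-deberta-v3-large-v2"
rm_tok = AutoTokenizer.from_pretrained(rm_name)
rm_model = AutoModelForSequenceClassification.from_pretrained(rm_name)

# 1. Sample N diverse completions for the same prompt.
inputs = gen_tok(prompt, return_tensors="pt")
with torch.no_grad():
    outputs = gen_model.generate(
        **inputs,
        do_sample=True,
        temperature=0.8,
        top_p=0.95,
        max_new_tokens=256,
        num_return_sequences=N,
        pad_token_id=gen_tok.eos_token_id,
    )
prompt_len = inputs["input_ids"].shape[1]
completions = [
    gen_tok.decode(out[prompt_len:], skip_special_tokens=True) for out in outputs
]

# 2. Score every candidate with the verifier and keep the best one.
with torch.no_grad():
    scores = [
        rm_model(**rm_tok(prompt, c, return_tensors="pt", truncation=True))
        .logits[0, 0]
        .item()
        for c in completions
    ]

best = completions[max(range(N), key=lambda i: scores[i])]
print(f"Best of {N} (score {max(scores):.2f}):\n{best}")

Beam search and DVTS build on the same idea but score partial solutions step by step with a process reward model, pruning weak branches before a full answer is generated.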
Quick Start & Requirements
Create a Conda environment with Python 3.11, then install the package in editable mode:
pip install -e .[dev]
Log in to the Hugging Face Hub:
huggingface-cli login
Install Git LFS:
sudo apt-get install git-lfs
Highlighted Details
Maintenance & Community
The project is an initial release from Hugging Face (Edward Beeching, Lewis Tunstall, Sasha Rush). Further community engagement and development are expected.
Licensing & Compatibility
The repository appears to be licensed under the Apache 2.0 license, which is permissive for commercial use and closed-source linking.
Limitations & Caveats
The project is an initial release, focusing on specific techniques for verifiable problems. The effectiveness and scalability for broader problem domains or different model architectures may require further investigation.