atropos by NousResearch

RL environment framework for LLM trajectory collection/evaluation

Created 8 months ago

811 stars

Top 43.6% on SourcePulse

View on GitHub

6 Experts Love This Project

Yaowei Zheng

Author of LLaMA-Factory

Georgios Konstantopoulos

CTO, General Partner at Paradigm

Thomas Wolf

Cofounder of Hugging Face

Pawel Garbacki

Cofounder of Fireworks AI

and 2 more!

Project Summary

Atropos is a framework for Reinforcement Learning (RL) environments designed for Large Language Models (LLMs). It enables the collection and evaluation of LLM trajectories across diverse environments, including static datasets, interactive games, and human feedback loops, aiming to accelerate LLM-based RL research.

How It Works

Atropos employs a multi-turn and asynchronous RL approach, decoupling environment steps from policy updates for efficiency. It is inference-agnostic, supporting various LLM providers (OpenAI, vLLM, SGLang) and trainer-independent, allowing experimentation with different RL algorithms. The framework is designed for scalability and decentralization, allowing multiple environment instances to contribute rollouts to a central service.

Quick Start & Requirements

Install: pip install atroposlib
Prerequisites: Python 3.10+, inference server (e.g., vLLM, SGLang) for running environments.
Setup: Basic installation is quick. Running environments requires setting up an inference server and configuring environment files.
Documentation: Base Environment Class, Environments Overview, Example Trainer.

Highlighted Details

Achieved up to 4.6x improvement on tool calling tasks and 2.5x on financial fundamentals prediction using Atropos RL.
Supports RLAIF for personality modification, with released model artifacts like DeepHermes Egregore.
Offers tools for local debugging and offline data generation for SFT/DPO.
Natively supports any model provider adhering to the OpenAI API standard.

Maintenance & Community

Developed by Nous Research and the open-source AI community.
Hackathon announced for May 18th, 2025.
Follow on Twitter: @NousResearch.
Contributing guide available.

Licensing & Compatibility

License: MIT.
Compatible with commercial use and closed-source linking.

Limitations & Caveats

The project is version 0.1, indicating potential for ongoing development and API changes. While it supports various inference backends, users must configure and run these separately.

Health Check

Last Commit

1 day ago

Responsiveness

Inactive

Pull Requests (30d)

Issues (30d)

Star History

50 stars in the last 30 days