atropos by NousResearch

RL environment framework for LLM trajectory collection/evaluation

Created 8 months ago
811 stars

Top 43.6% on SourcePulse

View on GitHub
Project Summary

Atropos is a framework for Reinforcement Learning (RL) environments designed for Large Language Models (LLMs). It enables the collection and evaluation of LLM trajectories across diverse environments, including static datasets, interactive games, and human feedback loops, aiming to accelerate LLM-based RL research.

How It Works

Atropos employs a multi-turn and asynchronous RL approach, decoupling environment steps from policy updates for efficiency. It is inference-agnostic, supporting various LLM providers (OpenAI, vLLM, SGLang) and trainer-independent, allowing experimentation with different RL algorithms. The framework is designed for scalability and decentralization, allowing multiple environment instances to contribute rollouts to a central service.
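The decoupled, asynchronous design described above can be sketched with a small illustrative program (all names here are hypothetical, not Atropos APIs): environment workers push scored trajectories into a central queue while a trainer consumes batches at its own pace, so environment stepping never blocks on policy updates.

```python
import asyncio
import random

async def env_worker(env_id: int, queue: asyncio.Queue, n_rollouts: int) -> None:
    """Simulate one environment instance producing scored trajectories."""
    for step in range(n_rollouts):
        await asyncio.sleep(random.uniform(0.001, 0.005))  # stand-in for LLM inference
        trajectory = {"env": env_id, "step": step, "reward": random.random()}
        await queue.put(trajectory)  # contribute the rollout to the central service

async def trainer(queue: asyncio.Queue, batch_size: int, total: int) -> int:
    """Consume rollouts in batches, independently of environment stepping."""
    consumed = 0
    while consumed < total:
        batch = [await queue.get() for _ in range(batch_size)]
        consumed += len(batch)
        # a real trainer would compute a policy update from `batch` here
    return consumed

async def main() -> int:
    queue: asyncio.Queue = asyncio.Queue()
    # Three environment instances feed one trainer through a shared queue.
    workers = [env_worker(i, queue, n_rollouts=4) for i in range(3)]
    trainer_task = asyncio.create_task(trainer(queue, batch_size=4, total=12))
    await asyncio.gather(*workers)
    return await trainer_task

if __name__ == "__main__":
    print(asyncio.run(main()))  # 12 rollouts consumed
```

Because producers and the consumer share only the queue, more environment instances can be added (or run on other machines, with the queue replaced by a network service) without touching the trainer loop.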

Quick Start & Requirements

  • Install: pip install atroposlib
  • Prerequisites: Python 3.10+, inference server (e.g., vLLM, SGLang) for running environments.
  • Setup: Basic installation is quick. Running environments requires setting up an inference server and configuring environment files.
  • Documentation: Base Environment Class, Environments Overview, Example Trainer.

Highlighted Details

  • Reported up to a 4.6x improvement on tool-calling tasks and 2.5x on financial fundamentals prediction using RL with Atropos.
  • Supports RLAIF for personality modification, with released model artifacts like DeepHermes Egregore.
  • Offers tools for local debugging and offline data generation for SFT/DPO.
  • Natively supports any model provider adhering to the OpenAI API standard.
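Because support hinges only on the OpenAI API standard, any compliant backend is reachable through the same chat-completions request shape. A minimal stdlib sketch of such a payload (the endpoint URL and model name are placeholder assumptions, and the request is constructed but not sent):

```python
import json

# Placeholder base URL for a local OpenAI-compatible server (e.g. vLLM, SGLang).
BASE_URL = "http://localhost:8000/v1"

def build_chat_request(model: str, prompt: str) -> dict:
    """Build a standard chat-completions payload accepted by any
    OpenAI-compatible backend."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 1.0,   # typical for diverse RL rollout sampling
        "max_tokens": 256,
    }

payload = build_chat_request("my-local-model", "Solve: 2 + 2 = ?")
print("POST", f"{BASE_URL}/chat/completions")
print(json.dumps(payload, indent=2))
```

Swapping providers then amounts to changing the base URL and credentials, which is what makes the framework inference-agnostic.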

Maintenance & Community

  • Developed by Nous Research and the open-source AI community.
  • Hackathon announced for May 18th, 2025.
  • Follow on Twitter: @NousResearch.
  • Contributing guide available.

Licensing & Compatibility

  • License: MIT.
  • Compatible with commercial use and closed-source linking.

Limitations & Caveats

The project is at version 0.1, so ongoing development and breaking API changes should be expected. While it supports various inference backends, users must configure and run these separately.

Health Check

  • Last Commit: 1 day ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 28
  • Issues (30d): 10
  • Star History: 50 stars in the last 30 days

Explore Similar Projects

Starred by Will Brown (Research Lead at Prime Intellect) and Eric Zhu (Coauthor of AutoGen; Research Scientist at Microsoft Research).

hud-python by hud-evals
AI agent development and evaluation toolkit
3.3% · 257 stars · Created 10 months ago · Updated 15 hours ago
Starred by Bryan Helmig (Cofounder of Zapier), Will Brown (Research Lead at Prime Intellect), and 1 more.

ReCall by Agent-RL
RL framework for LLM tool use
0.5% · 1k stars · Created 10 months ago · Updated 8 months ago
Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Wing Lian (Founder of Axolotl AI), and 3 more.

ROLL by alibaba
RL library for large language models
2.3% · 3k stars · Created 7 months ago · Updated 16 hours ago
Starred by Evan Hubinger (Head of Alignment Stress-Testing at Anthropic), Jiayi Pan (Author of SWE-Gym; MTS at xAI), and 1 more.

rl by pytorch
PyTorch library for reinforcement learning research
0.3% · 3k stars · Created 4 years ago · Updated 9 hours ago
Starred by Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), Will Brown (Research Lead at Prime Intellect), and 14 more.

verifiers by PrimeIntellect-ai
RL for LLMs in verifiable environments
1.0% · 4k stars · Created 11 months ago · Updated 13 hours ago