atropos by NousResearch

RL environment framework for LLM trajectory collection/evaluation

Created 4 months ago
691 stars

Top 49.2% on SourcePulse

Project Summary

Atropos is a framework for Reinforcement Learning (RL) environments designed for Large Language Models (LLMs). It enables the collection and evaluation of LLM trajectories across diverse environments, including static datasets, interactive games, and human feedback loops, aiming to accelerate LLM-based RL research.

How It Works

Atropos takes a multi-turn, asynchronous approach to RL, decoupling environment steps from policy updates for efficiency. It is inference-agnostic, supporting multiple LLM providers (OpenAI, vLLM, SGLang), and trainer-independent, allowing experimentation with different RL algorithms. The framework is designed for scalability and decentralization: many environment instances can contribute rollouts concurrently to a central service.
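The decoupling described above can be sketched with a plain asyncio producer/consumer pattern. This is an illustrative sketch only, not the actual atroposlib API: the worker, queue, and trainer names are hypothetical, and the LLM inference call is stubbed out with a sleep.

```python
import asyncio
import random

async def env_worker(worker_id: int, rollouts: asyncio.Queue) -> None:
    """Simulate one environment instance collecting trajectories
    and submitting them to a shared rollout queue."""
    for episode in range(3):
        await asyncio.sleep(0)  # stand-in for LLM inference / env stepping
        trajectory = {"worker": worker_id, "episode": episode,
                      "reward": random.random()}
        await rollouts.put(trajectory)

async def trainer(rollouts: asyncio.Queue, total: int, batch_size: int) -> int:
    """Consume rollouts in batches, independent of how fast any
    single environment produces them."""
    consumed = 0
    while consumed < total:
        batch = [await rollouts.get()
                 for _ in range(min(batch_size, total - consumed))]
        consumed += len(batch)
        # a real trainer would compute advantages and update the policy here
    return consumed

async def main() -> int:
    queue: asyncio.Queue = asyncio.Queue()
    workers = [env_worker(i, queue) for i in range(4)]
    results = await asyncio.gather(trainer(queue, total=12, batch_size=4),
                                   *workers)
    return results[0]

if __name__ == "__main__":
    print(asyncio.run(main()))  # 12 rollouts consumed
```

Because the trainer only sees the queue, environment steps never block policy updates, which is the property the framework's asynchronous design is built around.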

Quick Start & Requirements

  • Install: pip install atroposlib
  • Prerequisites: Python 3.10+, inference server (e.g., vLLM, SGLang) for running environments.
  • Setup: Basic installation is quick. Running environments requires setting up an inference server and configuring environment files.
  • Documentation: Base Environment Class, Environments Overview, Example Trainer.
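Since environments talk to the inference server over an OpenAI-compatible HTTP API, the request shape can be sketched with the standard library alone. The base URL and model name below are placeholders, not Atropos defaults, and the actual network call is left commented out so the sketch runs without a live server.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/v1"  # assumed local vLLM/SGLang endpoint

def chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style chat completion request for a local server."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json",
                 "Authorization": "Bearer EMPTY"},
    )

req = chat_request("my-model", "Hello")
print(req.full_url)  # http://localhost:8000/v1/chat/completions
# with urllib.request.urlopen(req) as resp:  # requires a running server
#     print(json.load(resp))
```

Pointing BASE_URL at a hosted provider instead of localhost is all it takes to swap backends, which is why any OpenAI-compatible provider works.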

Highlighted Details

  • Achieved up to a 4.6x improvement on tool-calling tasks and 2.5x on financial fundamentals prediction using Atropos RL.
  • Supports RLAIF for personality modification, with released model artifacts like DeepHermes Egregore.
  • Offers tools for local debugging and offline data generation for SFT/DPO.
  • Natively supports any model provider adhering to the OpenAI API standard.

Maintenance & Community

  • Developed by Nous Research and the open-source AI community.
  • Hackathon announced for May 18th, 2025.
  • Follow on Twitter: @NousResearch.
  • Contributing guide available.

Licensing & Compatibility

  • License: MIT.
  • Compatible with commercial use and closed-source linking.

Limitations & Caveats

The project is at version 0.1, so ongoing development and breaking API changes should be expected. While it supports various inference backends, users must configure and run these separately.

Health Check

  • Last Commit: 20 hours ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 13
  • Issues (30d): 1
  • Star History: 106 stars in the last 30 days

Starred by Edward Sun (Research Scientist at Meta Superintelligence Lab), Shizhe Diao (Author of LMFlow; Research Scientist at NVIDIA), and 1 more.

Explore Similar Projects

swe-rl by facebookresearch

  • RL for software evolution
  • Top 0.2% · 596 stars
  • Created 6 months ago · Updated 6 months ago
  • Starred by Bryan Helmig (Cofounder of Zapier), Will Brown (Research Lead at Prime Intellect), and 1 more.

ReCall by Agent-RL

  • RL framework for LLM tool use
  • Top 1.2% · 1k stars
  • Created 6 months ago · Updated 4 months ago
  • Starred by Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla, OpenAI; author of CS 231n), Sebastian Raschka (Author of "Build a Large Language Model (From Scratch)"), and 14 more.

verifiers by willccbb

  • RL for LLMs in verifiable environments
  • Top 3.1% · 3k stars
  • Created 7 months ago · Updated 22 hours ago
  • Starred by Evan Hubinger (Head of Alignment Stress-Testing at Anthropic), Jiayi Pan (Author of SWE-Gym; MTS at xAI), and 1 more.

rl by pytorch

  • PyTorch library for reinforcement learning research
  • Top 0.4% · 3k stars
  • Created 3 years ago · Updated 2 days ago