atropos  by NousResearch

RL environment framework for LLM trajectory collection/evaluation

created 3 months ago
570 stars

Top 57.4% on sourcepulse

GitHubView on GitHub
Project Summary

Atropos is a framework for Reinforcement Learning (RL) environments designed for Large Language Models (LLMs). It enables the collection and evaluation of LLM trajectories across diverse environments, including static datasets, interactive games, and human feedback loops, aiming to accelerate LLM-based RL research.

How It Works

Atropos employs a multi-turn and asynchronous RL approach, decoupling environment steps from policy updates for efficiency. It is inference-agnostic, supporting various LLM providers (OpenAI, vLLM, SGLang) and trainer-independent, allowing experimentation with different RL algorithms. The framework is designed for scalability and decentralization, allowing multiple environment instances to contribute rollouts to a central service.

Quick Start & Requirements

  • Install: pip install atroposlib
  • Prerequisites: Python 3.10+, inference server (e.g., vLLM, SGLang) for running environments.
  • Setup: Basic installation is quick. Running environments requires setting up an inference server and configuring environment files.
  • Documentation: Base Environment Class, Environments Overview, Example Trainer.

Highlighted Details

  • Achieved up to 4.6x improvement on tool calling tasks and 2.5x on financial fundamentals prediction using Atropos RL.
  • Supports RLAIF for personality modification, with released model artifacts like DeepHermes Egregore.
  • Offers tools for local debugging and offline data generation for SFT/DPO.
  • Natively supports any model provider adhering to the OpenAI API standard.

Maintenance & Community

  • Developed by Nous Research and the open-source AI community.
  • Hackathon announced for May 18th, 2025.
  • Follow on Twitter: @NousResearch.
  • Contributing guide available.

Licensing & Compatibility

  • License: MIT.
  • Compatible with commercial use and closed-source linking.

Limitations & Caveats

The project is version 0.1, indicating potential for ongoing development and API changes. While it supports various inference backends, users must configure and run these separately.

Health Check
Last commit

2 days ago

Responsiveness

Inactive

Pull Requests (30d)
28
Issues (30d)
1
Star History
346 stars in the last 90 days

Explore Similar Projects

Starred by Thomas Wolf Thomas Wolf(Cofounder of Hugging Face), Chip Huyen Chip Huyen(Author of AI Engineering, Designing Machine Learning Systems), and
2 more.

Gymnasium by Farama-Foundation

0.5%
10k
Python API standard for single-agent reinforcement learning environments
created 2 years ago
updated 1 week ago
Feedback? Help us improve.