MLGym by facebookresearch

Gym environment for ML research agents

Created 7 months ago
557 stars

Top 57.5% on SourcePulse

Project Summary

MLGym is an experimental framework and benchmark designed for advancing AI research agents, particularly LLM agents. It provides a diverse set of 13 open-ended AI research tasks across computer vision, NLP, RL, and game theory, requiring real-world AI research skills for problem-solving. The primary goal is to benchmark LLM agents and facilitate RL-based training in a research environment.

How It Works

MLGym operates by running AI research tasks within isolated containers (Docker or Podman), ensuring reproducible environments. It supports GPU acceleration for computationally intensive tasks. The framework orchestrates agent interactions with these tasks, allowing for the evaluation and training of agents using various ML algorithms, with a focus on reinforcement learning.
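Because tasks run in either Docker or Podman, it can help to confirm which runtime is available before launching anything. The following is a minimal sketch, not MLGym's own detection logic: the function name detect_container_runtime and the choice to prefer Docker over Podman are assumptions made for illustration.

```python
import shutil
from typing import Callable, Optional


def detect_container_runtime(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return "docker" or "podman" if that executable is on PATH, else None.

    The lookup function is injectable so the check can be exercised
    without a real container runtime installed.
    """
    # Assumption: prefer Docker when both runtimes are installed.
    for runtime in ("docker", "podman"):
        if which(runtime) is not None:
            return runtime
    return None
```

Calling detect_container_runtime() with no arguments checks the current PATH; the returned name could then be used as the container type when starting a task.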

Quick Start & Requirements

  • Installation: Clone the repository, create a Python 3.11 conda environment, and run pip install -e . from the repository root to install the package in editable mode.
  • Prerequisites: Docker or Podman is required. GPU support on Linux needs nvidia-container-toolkit. macOS users need to set up a Podman machine and export DOCKER_HOST. API keys for services such as OpenAI and Anthropic can be configured in a .env file.
  • Running Tasks: Use python run.py with arguments specifying container type, task configuration, model, and resource limits. Example: python run.py --container_type docker --task_config_path tasks/battleOfSexes.yaml --model litellm:claude-3-5-sonnet-20240620 --gpus 0.
  • Documentation: Detailed documentation is under construction. A trajectory visualizer is available via streamlit run demo/trajectory_visualizer.py.
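The run command in the list above can also be assembled programmatically, which is convenient when sweeping over tasks or models. This is a hedged sketch: build_run_command is a hypothetical helper, and it uses only the four flags that appear in the example above (--container_type, --task_config_path, --model, --gpus). Other options accepted by run.py are not covered here.

```python
import shlex
import sys
from typing import List


def build_run_command(
    container_type: str,
    task_config_path: str,
    model: str,
    gpus: int,
    python: str = sys.executable,
) -> List[str]:
    """Build the argument list for run.py (hypothetical helper).

    Returns a list rather than a string so it can be passed to
    subprocess.run without shell quoting issues.
    """
    return [
        python, "run.py",
        "--container_type", container_type,
        "--task_config_path", task_config_path,
        "--model", model,
        "--gpus", str(gpus),
    ]


# Reproduce the example command from the list above.
command = build_run_command(
    container_type="docker",
    task_config_path="tasks/battleOfSexes.yaml",
    model="litellm:claude-3-5-sonnet-20240620",
    gpus=0,
    python="python",
)
print(shlex.join(command))
```

To launch the task, the list could be passed to subprocess.run(command, check=True) from the repository root, assuming the prerequisites above are in place.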

Highlighted Details

  • Benchmarks agents on 13 diverse AI research tasks.
  • Supports LLM agent training and evaluation.
  • Offers a trajectory visualizer for inspecting agent behavior.
  • Designed for reproducible research environments using containers.

Maintenance & Community

Maintained by GenAI at Meta and UCSB NLP. Contribution guidelines and a maintenance plan are available.

Licensing & Compatibility

The majority of the code is licensed under CC-BY-NC 4.0 (Attribution-NonCommercial 4.0 International). SWE-Agent and Modded-NanoGPT are MIT licensed; Gymnax and Gymnax-blines are Apache 2.0 licensed. The non-commercial clause restricts use in proprietary or commercial applications.

Limitations & Caveats

MLGym is an experimental framework under heavy development, with potential for major design changes. The non-commercial license may limit adoption for commercial use cases.

Health Check

  • Last commit: 1 month ago
  • Responsiveness: 1+ week
  • Pull requests (30d): 0
  • Issues (30d): 1
  • Star history: 11 stars in the last 30 days
