Experiment orchestration framework for AI research
AI2 Tango is an open-source Python library designed to streamline machine learning research by organizing experiments into discrete, cacheable, and reusable steps. It targets researchers and engineers working on complex, iterative projects, offering a structured alternative to ad-hoc file management and version tracking.
How It Works
Tango structures research workflows as Directed Acyclic Graphs (DAGs) of "steps." Each step is a Python function decorated with @step() or a class that subclasses Step. The library caches step outputs under a unique ID derived from the step's inputs and metadata (fully qualified name, version). When inputs haven't changed, the cached result is reused instead of being recomputed, which speeds up iterative development. Unlike some other workflow engines, Tango intentionally excludes source code hashes from cache keys: code can be modified without invalidating the cache, and cached results are discarded only when the step's VERSION class variable is manually updated. This keeps invalidation explicit and under the researcher's control.
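The caching rule described above can be illustrated with a small, self-contained sketch. This is not Tango's implementation or API: the names cache_key, cached_step, and _CACHE are hypothetical, and the in-memory dictionary stands in for a real step cache. It only shows the idea of deriving a key from a step's qualified name, version, and inputs while leaving the source code out of the hash.

```python
import functools
import hashlib
import json
from typing import Any, Callable, Dict

# Hypothetical in-memory stand-in for a persistent step cache.
_CACHE: Dict[str, Any] = {}


def cache_key(func: Callable, version: str, kwargs: Dict[str, Any]) -> str:
    """Hash the qualified name, version, and inputs (never the source code)."""
    payload = {
        "name": f"{func.__module__}.{func.__qualname__}",
        "version": version,
        "inputs": kwargs,
    }
    blob = json.dumps(payload, sort_keys=True, default=repr)
    return hashlib.sha256(blob.encode("utf-8")).hexdigest()


def cached_step(version: str = "001"):
    """Hypothetical decorator: run the function only when its key is new."""
    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(**kwargs: Any) -> Any:
            key = cache_key(func, version, kwargs)
            if key not in _CACHE:
                _CACHE[key] = func(**kwargs)  # cache miss: compute and store
            return _CACHE[key]
        return wrapper
    return decorator


calls = []  # records every real execution of the step body


@cached_step(version="001")
def add(a: int, b: int) -> int:
    calls.append((a, b))
    return a + b


first = add(a=1, b=2)   # computed
second = add(a=1, b=2)  # same name, version, inputs: served from the cache
third = add(a=1, b=3)   # different inputs: computed again
```

In this sketch, editing the body of add would not change its key, so stale results would keep being returned until the version string is bumped. That is the trade-off the paragraph above describes.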
Quick Start & Requirements
Install with pip: pip install ai2-tango, or pip install 'ai2-tango[all]' for all integrations.
Install with conda: conda install tango -c conda-forge, or conda install tango-all -c conda-forge.
Highlighted Details
Maintenance & Community
Developed and maintained by the AllenNLP team at the Allen Institute for Artificial Intelligence (AI2).
Licensing & Compatibility
Licensed under Apache 2.0, permitting commercial use and integration with closed-source projects.
Limitations & Caveats
Tango is designed for research; the README notes that tools like Metaflow, Airflow, or Redun may be better suited for production workflows. Because source code is not part of the cache key, a code change invalidates cached results only when the step's VERSION is updated manually.