Python library for data transformation DAGs
Apache Hamilton provides a Python library for defining, visualizing, and executing data transformation Directed Acyclic Graphs (DAGs). It targets data scientists and engineers seeking to improve the modularity, testability, and maintainability of their data pipelines, from ETL to ML workflows and LLM applications. Hamilton's core benefit is enabling portable, expressive, and self-documenting dataflows that integrate seamlessly across various Python environments.
How It Works
Hamilton models data transformations as Python functions, where function parameters define dependencies. The library automatically constructs the DAG from these functions, promoting readable, modular code. Its function modifiers keep code DRY and reduce complexity in large DAGs, while built-in data and schema validation (@check_output, SchemaValidator) improves robustness. Because the DAG definition is separate from its execution, teams can collaborate on the same transformation code and move it from development to production with fewer changes.
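The idea that parameter names define dependencies can be sketched in plain Python. The snippet below is a toy illustration only: it does not use Hamilton's API, and the `resolve` helper, the example functions, and the input names are all hypothetical. It does no cycle detection, which a real DAG library would need.

```python
import inspect
from typing import Any, Callable, Dict


# Each function is a node named after the function itself.
# Each parameter name refers to another node or to a user-supplied input.
def spend_per_signup(spend: float, signups: int) -> float:
    return spend / signups


def spend_per_signup_rounded(spend_per_signup: float) -> float:
    return round(spend_per_signup, 2)


def resolve(
    name: str,
    functions: Dict[str, Callable[..., Any]],
    inputs: Dict[str, Any],
    cache: Dict[str, Any],
) -> Any:
    """Compute node `name` by recursively computing the nodes its parameters name."""
    if name in cache:
        return cache[name]
    if name in inputs:
        return inputs[name]
    if name not in functions:
        raise KeyError(f"no function or input named {name!r}")
    fn = functions[name]
    # The parameter names of `fn` are the upstream dependencies.
    kwargs = {
        param: resolve(param, functions, inputs, cache)
        for param in inspect.signature(fn).parameters
    }
    cache[name] = fn(**kwargs)
    return cache[name]


functions = {f.__name__: f for f in (spend_per_signup, spend_per_signup_rounded)}
result = resolve(
    "spend_per_signup_rounded", functions, {"spend": 100.0, "signups": 3}, {}
)
print(result)
```

Only the requested node and its upstream dependencies are computed, which is why the transformation functions can be written and unit-tested without reference to how they are executed.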
Quick Start & Requirements
pip install "sf-hamilton[visualization]"
pip install "sf-hamilton[ui,sdk]"
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
Hamilton is not an orchestrator or a feature store, but rather a framework for defining data transformation logic. For complex control flow like loops or conditional logic (e.g., for LLM agents), the sister library Burr is recommended.