syftr by datarobot

Agentic workflow optimizer

created 2 months ago
293 stars

Top 91.3% on SourcePulse

Project Summary

syftr is an agent optimizer designed to discover the most cost-effective agentic workflows for users. It targets researchers and developers building complex AI agent systems, enabling them to efficiently navigate trade-offs between accuracy, cost, latency, and throughput. The primary benefit is the automated identification of Pareto-optimal configurations for generative AI workflows.

How It Works

syftr employs multi-objective Bayesian Optimization, enhanced by a novel "Pareto Pruner," to efficiently search vast configuration spaces. It leverages Ray for distributed computation across CPUs and GPUs, and Optuna for its flexible define-by-run optimization framework. LlamaIndex is integrated for constructing RAG workflows, allowing optimization of both agentic and non-agentic components, including prompt tuning via the Trace library.
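To make "Pareto-optimal" concrete, here is a minimal sketch of Pareto filtering over two objectives (accuracy to maximize, cost to minimize). It is not syftr's code and does not reproduce its Pareto Pruner. The `Result` type, the workflow names, and all numbers are hypothetical, chosen only for illustration.

```python
from typing import NamedTuple


class Result(NamedTuple):
    name: str        # hypothetical workflow label
    accuracy: float  # higher is better
    cost: float      # lower is better


def dominates(a: Result, b: Result) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a.accuracy >= b.accuracy and a.cost <= b.cost
    strictly_better = a.accuracy > b.accuracy or a.cost < b.cost
    return no_worse and strictly_better


def pareto_front(results: list[Result]) -> list[Result]:
    """Keep only the results that no other result dominates."""
    return [r for r in results if not any(dominates(o, r) for o in results)]


# Made-up measurements for illustration only.
candidates = [
    Result("small-llm-rag", 0.71, 0.002),
    Result("large-llm-rag", 0.83, 0.020),
    Result("agentic-rag", 0.80, 0.035),
    Result("no-retrieval", 0.55, 0.001),
]

front = pareto_front(candidates)
for r in sorted(front, key=lambda r: r.cost):
    print(f"{r.name}: accuracy={r.accuracy:.2f}, cost={r.cost:.3f}")
```

In this toy data, "agentic-rag" drops out because "large-llm-rag" is both more accurate and cheaper. The other three remain because each represents a different trade-off that no other candidate beats on both objectives at once.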

Quick Start & Requirements

  • Installation: Clone the repository and use uv for environment management: run uv sync --extra dev, then uv pip install -e . (the trailing dot is part of the command). Alternatively, install from PyPI with pip install syftr.
  • Configuration: Requires a config.yaml file in ~/.syftr/ or the current directory, with sample configurations available.
  • Credentials: Azure OpenAI API key and endpoint URL are required for example studies. PostgreSQL DSN is recommended for distributed workloads, with SQLite as a fallback.
  • Dependencies: Python 3.12.7, Ray, Optuna, LlamaIndex, HuggingFace Datasets, Trace.
  • Resources: Run the example notebooks or studies to validate the setup.
  • Docs: Blogpost, Technical Paper

Highlighted Details

  • Optimizes agentic and non-agentic RAG workflows.
  • Supports multi-objective Bayesian Optimization with a custom "Pareto Pruner."
  • Integrates with Ray for distributed execution and Optuna for flexible optimization.
  • Configurable LLM and embedding model endpoints.
  • Provides CLI and API for study execution and results retrieval.

Maintenance & Community

The project is developed by DataRobot. Contribution guidelines and a code of conduct are provided.

Licensing & Compatibility

The README does not state a license, so terms for commercial use or closed-source linking are unspecified.

Limitations & Caveats

The missing license may hinder commercial adoption. Initial setup requires an Azure OpenAI API key and endpoint URL for the example studies, plus a PostgreSQL DSN when running distributed workloads (SQLite is the fallback).

Health Check

  • Last commit: 21 hours ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 8
  • Issues (30d): 4
  • Star History: 295 stars in the last 90 days
