ai-data-science-team  by business-science

AI-powered data science team of agents

created 7 months ago
2,335 stars

Top 19.9% on sourcepulse

Project Summary

This Python library provides an AI-powered team of specialized agents designed to automate and accelerate common data science tasks, such as data cleaning, feature engineering, machine learning modeling, and exploratory data analysis. It targets data scientists and analysts looking to increase efficiency and explore business problems like churn modeling and risk assessment.

How It Works

The project uses a modular, agent-based architecture in which each agent specializes in one data science function. The agents are built with LangChain and can call tools and libraries such as Pandas, SQL databases, and H2O.ai for machine learning. Because the agents are independent, they can be combined or extended to build more complex workflows.
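
The agent-per-task design described above can be illustrated with a minimal, library-independent sketch. Every name here (`Agent`, `CleaningAgent`, `FeatureAgent`, `run_pipeline`) is hypothetical, and plain Python records stand in for Pandas DataFrames and LLM calls. It is not the library's actual API, only one way specialized steps might be chained.

```python
from typing import Protocol

Row = dict[str, object]


class Agent(Protocol):
    """Hypothetical interface: each agent transforms a list of records."""

    name: str

    def run(self, rows: list[Row]) -> list[Row]: ...


class CleaningAgent:
    name = "cleaning"

    def run(self, rows: list[Row]) -> list[Row]:
        # Drop records that contain any missing (None) value.
        return [r for r in rows if all(v is not None for v in r.values())]


class FeatureAgent:
    name = "feature_engineering"

    def run(self, rows: list[Row]) -> list[Row]:
        # Add a derived column; the input records are left unmodified.
        return [{**r, "charge_per_month": r["total"] / r["tenure"]} for r in rows]


def run_pipeline(agents: list[Agent], rows: list[Row]) -> list[Row]:
    # Feed each agent's output into the next one, in order.
    for agent in agents:
        rows = agent.run(rows)
    return rows


# Illustrative data only (not taken from the project).
raw = [
    {"tenure": 12, "total": 600.0},
    {"tenure": None, "total": 80.0},
    {"tenure": 4, "total": 100.0},
]
result = run_pipeline([CleaningAgent(), FeatureAgent()], raw)
```

Keeping each step behind the same small interface is what makes it possible to reorder, replace, or extend steps without touching the others.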

Quick Start & Requirements

  • Install: pip install ai-data-science-team
  • Prerequisites: Python, OpenAI API Key.
  • Example Usage: The README provides a detailed example for the H2O Machine Learning Agent, covering data loading, LLM initialization, agent invocation with user instructions, and leaderboard retrieval. Further examples are available via the project's GitHub page.
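
Before running any example, it can help to confirm the prerequisites listed above. The sketch below is a hypothetical helper, not part of the library. It assumes the key is supplied through the `OPENAI_API_KEY` environment variable (the variable the OpenAI Python client reads by default) and assumes a minimum Python version of 3.9, which the summary does not state.

```python
import os
import sys
from typing import Mapping


def missing_prerequisites(
    env: Mapping[str, str],
    min_python: tuple[int, int] = (3, 9),  # assumed minimum, not from the README
) -> list[str]:
    """Return a human-readable list of unmet prerequisites (empty if none)."""
    problems = []
    if sys.version_info[:2] < min_python:
        problems.append(f"Python {min_python[0]}.{min_python[1]}+ is required")
    # An empty string counts as "not set".
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites(os.environ):
        print(problem)
```

Passing the environment in as an argument, rather than reading `os.environ` inside the function, keeps the check easy to exercise with a plain dictionary.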

Highlighted Details

  • Offers specialized agents for data wrangling, visualization, cleaning, feature engineering, SQL interaction, and machine learning (H2O, MLflow).
  • Includes "Data Science Apps" like a Pandas AI Data Analyst and an Exploratory Data Copilot for automated EDA.
  • Supports multi-agent workflows for more complex tasks.
  • Actively under development with new agents and features being released.

Maintenance & Community

This is a beta version under active development. The primary contributor is Business-Science.io. Further community engagement details are not explicitly provided in the README.

Licensing & Compatibility

  • License: MIT License.
  • Compatibility: The permissive MIT license allows commercial use and integration into closed-source projects.

Limitations & Caveats

The library is in beta, with potential for breaking changes until version 0.1.0. It is intended for educational purposes and explicitly states it is not a replacement for a company's data science team, with no warranties provided.

Health Check

  • Last commit: 4 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 449 stars in the last 90 days
