agentic-data-scientist by K-Dense-AI

Automated multi-agent data science CLI

Created 4 months ago

578 stars

Top 56.1% on SourcePulse

Project Summary

This framework provides an end-to-end, adaptive multi-agent system for complex data science tasks. It targets engineers and researchers who want a self-correcting workflow that separates planning from execution and validates progress continuously. The intended benefit is an automated data science process that adapts as it runs and reduces errors and rework.

How It Works

The core approach is an adaptive multi-agent workflow built on Google's Agent Development Kit (ADK) and the Claude Agent SDK. It separates planning from execution: before any work begins, it produces an analysis plan with explicit success criteria. During execution, a validation loop checks progress against those criteria, so the system can correct itself and revise the plan as it learns more about the data. The aim is to catch issues early and to deliver a result that reflects what the analysis actually found rather than the initial assumptions.
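The plan, execute, validate, and revise cycle described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's implementation: Stage, run_workflow, execute, and replan are hypothetical names, and the real system delegates each phase to LLM-backed agents rather than plain functions.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Stage:
    """One planned step plus the success criterion used to validate it (hypothetical type)."""
    name: str
    criterion: Callable[[dict], bool]  # inspects shared state after the step runs
    done: bool = False


def run_workflow(
    plan: list[Stage],
    execute: Callable[[Stage, dict], None],
    replan: Callable[[list[Stage], dict], list[Stage]],
    max_rounds: int = 10,
) -> dict:
    """Run stages in order, validate each one, and revise the plan when validation fails."""
    state: dict = {"log": []}
    rounds = 0
    while any(not s.done for s in plan) and rounds < max_rounds:
        rounds += 1
        stage = next(s for s in plan if not s.done)
        execute(stage, state)                  # execution phase
        if stage.criterion(state):             # validation against the success criterion
            stage.done = True
            state["log"].append(f"ok:{stage.name}")
        else:                                  # reflection: adapt the plan, then retry
            state["log"].append(f"retry:{stage.name}")
            plan = replan(plan, state)
    return state


def demo() -> dict:
    """Toy run: the 'model' stage fails once, the plan is adjusted, and the retry passes."""
    def execute(stage: Stage, state: dict) -> None:
        if stage.name == "load":
            state["rows"] = 100
        elif stage.name == "clean":
            state["rows_clean"] = state["rows"] - 5
        elif stage.name == "model":
            state["score"] = 0.9 if state.get("tuned") else 0.5

    def replan(plan: list[Stage], state: dict) -> list[Stage]:
        state["tuned"] = True  # stands in for a revised analysis strategy
        return plan

    plan = [
        Stage("load", lambda s: s.get("rows", 0) > 0),
        Stage("clean", lambda s: s.get("rows_clean", 0) > 0),
        Stage("model", lambda s: s.get("score", 0) >= 0.8),
    ]
    return run_workflow(plan, execute, replan)
```

The max_rounds cap is a design choice of this sketch: a self-correcting loop needs some bound so that a criterion that can never be met does not cause endless retries.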

Quick Start & Requirements

Installation: Install via PyPI using uv tool install agentic-data-scientist or use directly with uvx agentic-data-scientist --mode simple.
Prerequisites:
- Python 3.12+
- Node.js (for Claude Code CLI)
- Claude Code CLI: npm install -g @anthropic-ai/claude-code
- API Keys: OPENROUTER_API_KEY (for planning/review agents) and ANTHROPIC_API_KEY (for coding agent).
Documentation: Getting Started Guide, API Reference, Tools Configuration, Extending, and Examples.
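Before a first run, it can help to confirm that the prerequisites listed above are in place. The following is a small preflight sketch, not part of the project: missing_prerequisites is a hypothetical helper, and the executable names node and claude are assumptions about how Node.js and the Claude Code CLI appear on PATH. The two environment variable names come from the prerequisites list.

```python
import os
import shutil
from typing import Callable, Mapping, Optional

# Environment variables named in the prerequisites list.
REQUIRED_ENV = ("OPENROUTER_API_KEY", "ANTHROPIC_API_KEY")
# Assumed executable names for Node.js and the Claude Code CLI.
REQUIRED_BINARIES = ("node", "claude")


def missing_prerequisites(
    env: Mapping[str, str] = os.environ,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return a list of missing items, e.g. 'env:ANTHROPIC_API_KEY' or 'bin:claude'."""
    problems = [f"env:{name}" for name in REQUIRED_ENV if not env.get(name)]
    problems += [f"bin:{name}" for name in REQUIRED_BINARIES if which(name) is None]
    return problems
```

Calling missing_prerequisites() with no arguments checks the current process environment and PATH; an empty list means nothing in this checklist is missing. The env and which parameters exist only so the check can be exercised without touching the real environment.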

Highlighted Details

Adaptive Multi-Agent Workflow: Features iterative planning, execution, validation, and reflection for dynamic problem-solving.
Intelligent Planning & Continuous Validation: Generates detailed analysis plans upfront and tracks progress against success criteria at every step.
Claude Scientific Skills Integration: Provides access to over 120 scientific skills, databases (UniProt, PubChem, KEGG), and packages (BioPython, RDKit) via the Claude Code SDK.
Context Window Management: Employs aggressive event compression and LLM-based summarization to manage context window usage during long-running analyses, preventing token overflow.
MCP Integration: Supports tool access via Model Context Protocol servers.
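The context window management described above (event compression plus LLM-based summarization) follows a common pattern: once the event history grows past a threshold, older events are folded into a single summary and only the most recent ones are kept verbatim. The sketch below illustrates that general pattern only. The names compress_events and stub_summarize, the thresholds, and the use of plain strings as events are all assumptions, and the stub stands in for a real LLM call.

```python
from typing import Callable


def compress_events(
    events: list[str],
    summarize: Callable[[list[str]], str],
    max_events: int = 6,
    keep_recent: int = 3,
) -> list[str]:
    """Fold older events into one summary once the history exceeds max_events."""
    if keep_recent < 1:
        raise ValueError("keep_recent must be at least 1")
    if len(events) <= max_events:
        return events  # still within budget, nothing to compress
    older, recent = events[:-keep_recent], events[-keep_recent:]
    return [summarize(older)] + recent


def stub_summarize(chunk: list[str]) -> str:
    # Placeholder for an LLM summarization call; it only reports the chunk size.
    return f"[summary of {len(chunk)} events]"
```

For example, compressing eight events with the defaults yields one summary entry followed by the last three events, so the history shrinks from eight items to four while the most recent context stays intact.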

Maintenance & Community

The project is developed by K-Dense Inc. and welcomes contributions. Community support is available via the K-Dense Community Slack channel. Development practices include using uv for dependency management and ruff for linting/formatting, with releases following conventional commits.

Licensing & Compatibility

The project is released under the MIT License, permitting commercial use and integration with closed-source projects.

Limitations & Caveats

The framework requires API keys for both OpenRouter and Anthropic, so it depends on those external services. Context window management is built in, but users who need substantially more powerful capabilities might consider K-Dense Web.

Health Check

Last Commit

3 months ago

Responsiveness

Inactive

Star History

56 stars in the last 30 days