AutoKaggle by multimodal-art-projection

Autonomous framework for data science competitions

Created 1 year ago
302 stars

Top 88.0% on SourcePulse

Project Summary

A multi-agent framework designed to automate data science pipelines for Kaggle competitions. AutoKaggle assists data scientists by combining iterative development, comprehensive testing, and an ML tools library within a collaborative multi-agent system, aiming to automate complex workflows while maintaining high customizability.

How It Works

AutoKaggle employs a multi-agent collaboration model featuring five specialized agents (Reader, Planner, Developer, Reviewer, Summarizer) that work through six key competition phases. The framework emphasizes iterative development and unit testing for robust code verification, supported by a validated ML tools library for data cleaning, feature engineering, and modeling. This approach aims to streamline and automate the end-to-end process of participating in data science competitions.
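The iterative develop-then-test loop described above can be illustrated with a minimal sketch. This is not AutoKaggle's actual code: the phase names are placeholders (the summary does not name the six phases), and `run_phase`, `run_pipeline`, `develop`, `unit_test`, and `max_attempts` are hypothetical names chosen for illustration. It only shows the general idea of retrying a phase until its unit tests pass before moving on.

```python
from dataclasses import dataclass, field
from typing import Any, Callable

# Agent roles named in the summary; listed for reference only in this sketch.
AGENT_ROLES = ("Reader", "Planner", "Developer", "Reviewer", "Summarizer")

# Placeholder phase names (assumption): the summary mentions six phases
# but does not name them.
PHASES = tuple(f"phase_{i}" for i in range(1, 7))


@dataclass
class PhaseResult:
    phase: str
    attempts: int
    passed: bool
    log: list[str] = field(default_factory=list)


def run_phase(
    phase: str,
    develop: Callable[[str, int], Any],
    unit_test: Callable[[Any], bool],
    max_attempts: int = 3,
) -> PhaseResult:
    """Re-run the (hypothetical) development step until its unit tests pass."""
    log: list[str] = []
    for attempt in range(1, max_attempts + 1):
        artifact = develop(phase, attempt)
        if unit_test(artifact):
            log.append(f"attempt {attempt}: tests passed")
            return PhaseResult(phase, attempt, True, log)
        log.append(f"attempt {attempt}: tests failed")
    return PhaseResult(phase, max_attempts, False, log)


def run_pipeline(
    develop: Callable[[str, int], Any],
    unit_test: Callable[[Any], bool],
    max_attempts: int = 3,
) -> list[PhaseResult]:
    """Run the phases in order and stop at the first phase that never passes."""
    results: list[PhaseResult] = []
    for phase in PHASES:
        result = run_phase(phase, develop, unit_test, max_attempts)
        results.append(result)
        if not result.passed:
            break
    return results


if __name__ == "__main__":
    # Stub developer whose output only passes from the second attempt onward.
    def stub_develop(phase: str, attempt: int) -> dict:
        return {"phase": phase, "ok": attempt >= 2}

    for res in run_pipeline(stub_develop, lambda artifact: artifact["ok"]):
        print(res.phase, res.attempts, res.passed)
```

In a real system the `develop` callable would wrap an LLM-backed agent and `unit_test` would execute generated tests; both are stubbed here so the control flow can be read in isolation.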

Quick Start & Requirements

  • Primary install: Clone the repository, create and activate a conda environment (conda create -n AutoKaggle python=3.11, conda activate AutoKaggle), then install dependencies (pip install -r requirements.txt).
  • Prerequisites: Python 3.11, Conda, and an OpenAI API key stored in api_key.txt.
  • Data Preparation: Tabular competition data (train.csv, test.csv, sample_submission.csv, overview.txt) must be placed in ./multi_agents/competition/.
  • Run Command: bash run_multi_agent.sh.
  • Relevant Pages: GitHub Repository
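Before launching the run command, it can help to confirm that the inputs listed above are in place. The following is a small pre-flight sketch, not part of AutoKaggle itself: `missing_inputs` is a hypothetical helper, and it assumes api_key.txt sits at the repository root, which the summary does not specify.

```python
from pathlib import Path

# File names taken from the data-preparation step above.
REQUIRED_DATA_FILES = ("train.csv", "test.csv", "sample_submission.csv", "overview.txt")


def missing_inputs(
    repo_root: str = ".",
    data_dir: str = "multi_agents/competition",
    api_key_file: str = "api_key.txt",  # assumed to live at the repo root
) -> list[str]:
    """Return the expected input files (relative to repo_root) that do not exist."""
    root = Path(repo_root)
    expected = [root / data_dir / name for name in REQUIRED_DATA_FILES]
    expected.append(root / api_key_file)
    return [p.relative_to(root).as_posix() for p in expected if not p.is_file()]


if __name__ == "__main__":
    missing = missing_inputs()
    if missing:
        print("Missing inputs:")
        for path in missing:
            print(f"  {path}")
    else:
        print("All expected inputs found.")
```

Running this from the cloned repository root lists any file that still needs to be added before starting the pipeline.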

Highlighted Details

  • Achieved an 85% validation submission rate across 8 diverse Kaggle competitions.
  • Attained a comprehensive score of 0.82 in the same evaluation.

Maintenance & Community

No specific details regarding maintainers, community channels (like Discord/Slack), or roadmaps are provided in the README.

Licensing & Compatibility

Licensed under the Apache 2.0 License, which is generally permissive for commercial use and integration into closed-source projects. The project states that it is not affiliated with Kaggle and uses the name only to indicate compatibility.

Limitations & Caveats

The project is not officially associated with Kaggle or Google and is in the process of rebranding. The use of the "Kaggle" name is solely for indicating compatibility with Kaggle competitions.

Health Check

  • Last Commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 10 stars in the last 30 days
